High memory usage (memory leak?) with 1 node cluster and small DB (but a lot of updates) #18382

Open
4 tasks done
vitalif opened this issue Jul 30, 2024 · 11 comments
Labels
area/performance priority/awaiting-more-evidence Lowest priority. Possibly useful, but not yet enough support to actually get it done. type/bug

Comments

@vitalif
Contributor

vitalif commented Jul 30, 2024

Bug report criteria

What happened?

My etcd 3.5.8 ate 12 GB of RAM...
The database size is < 2 MB (etcdctl get --write-out=json --prefix / gives me 1966139 byte dump).
The database takes 1.5 GB on disk (/var/lib/etcd0.etcd).
There are 6 active streams.
Client traffic shows 2.2 MB/s in and 2.5 MB/s out in monitoring dashboard.
There are 128+5 keys which are bumped continuously through leases.
There are other 139 keys which are updated every 30 seconds.

[screenshot: etcd memory usage]

Since I've already experienced such memory usage with this etcd instance, it was started with --enable-pprof=true; pprof shows the following:

# go tool pprof http://192.168.1.144:2379/debug/pprof/heap
Fetching profile over HTTP from http://192.168.1.144:2379/debug/pprof/heap
Saved profile in /root/pprof/pprof.etcd.alloc_objects.alloc_space.inuse_objects.inuse_space.012.pb.gz
File: etcd
Type: inuse_space
Time: Jul 31, 2024 at 1:45am (MSK)
Entering interactive mode (type "help" for commands, "o" for options)
(pprof) top
Showing nodes accounting for 10.88GB, 99.59% of 10.92GB total
Dropped 179 nodes (cum <= 0.05GB)
Showing top 10 nodes out of 17
      flat  flat%   sum%        cum   cum%
   10.88GB 99.59% 99.59%    10.88GB 99.59%  go.etcd.io/etcd/api/v3/etcdserverpb.(*InternalRaftRequest).Marshal
         0     0% 99.59%    10.88GB 99.59%  github.com/grpc-ecosystem/go-grpc-middleware.ChainUnaryServer.func1
         0     0% 99.59%    10.88GB 99.59%  github.com/grpc-ecosystem/go-grpc-middleware.ChainUnaryServer.func1.1.1
         0     0% 99.59%    10.88GB 99.59%  github.com/grpc-ecosystem/go-grpc-prometheus.(*ServerMetrics).UnaryServerInterceptor.func1
         0     0% 99.59%    10.88GB 99.59%  go.etcd.io/etcd/api/v3/etcdserverpb._KV_Txn_Handler
         0     0% 99.59%    10.88GB 99.59%  go.etcd.io/etcd/api/v3/etcdserverpb._KV_Txn_Handler.func1
         0     0% 99.59%    10.88GB 99.59%  go.etcd.io/etcd/server/v3/etcdserver.(*EtcdServer).Txn
         0     0% 99.59%    10.88GB 99.59%  go.etcd.io/etcd/server/v3/etcdserver.(*EtcdServer).processInternalRaftRequestOnce
         0     0% 99.59%    10.88GB 99.59%  go.etcd.io/etcd/server/v3/etcdserver.(*EtcdServer).raftRequest (inline)
         0     0% 99.59%    10.88GB 99.59%  go.etcd.io/etcd/server/v3/etcdserver.(*EtcdServer).raftRequestOnce

Is it a memory leak or what?...

What did you expect to happen?

I expected etcd to consume at most 1 GB of memory with a 2 MB database.

How can we reproduce it (as minimally and precisely as possible)?

Install Vitastor on 1 node. :-) But I'm not sure it will be easy to reproduce.

Anything else we need to know?

It's still running, so I can collect other information or connect with a debugger if you tell me what I should collect. :-)

Etcd version (please run commands below)

$ etcd --version
etcd Version: 3.5.8
Git SHA: 217d183e5
Go Version: go1.19.5
Go OS/Arch: linux/amd64

$ etcdctl version
etcdctl version: 3.5.8
API version: 3.5

Etcd configuration (command line flags or environment variables)

Startup options:

etcd -name etcd0 --data-dir /var/lib/etcd0.etcd --enable-pprof=true --snapshot-count=1000 --advertise-client-urls http://192.168.1.144:2379 --listen-client-urls http://192.168.1.144:2379 --initial-advertise-peer-urls http://192.168.1.144:2380 --listen-peer-urls http://192.168.1.144:2380 --initial-cluster-token vitastor-etcd-1 --initial-cluster etcd0=http://192.168.1.144:2380 --initial-cluster-state new --max-txn-ops=100000 --max-request-bytes=104857600 --auto-compaction-retention=10 --auto-compaction-mode=revision

Etcd debug information (please run commands below, feel free to obfuscate the IP address or FQDN in the output)

$ etcdctl member list -w table
+------------------+---------+-------+---------------------------+---------------------------+------------+
|        ID        | STATUS  | NAME  |        PEER ADDRS         |       CLIENT ADDRS        | IS LEARNER |
+------------------+---------+-------+---------------------------+---------------------------+------------+
| 720141fbb7123c6f | started | etcd0 | http://192.168.1.144:2380 | http://192.168.1.144:2379 |      false |
+------------------+---------+-------+---------------------------+---------------------------+------------+

$ etcdctl --endpoints=<member list> endpoint status -w table
+--------------------+------------------+---------+---------+-----------+------------+-----------+------------+--------------------+--------+
|      ENDPOINT      |        ID        | VERSION | DB SIZE | IS LEADER | IS LEARNER | RAFT TERM | RAFT INDEX | RAFT APPLIED INDEX | ERRORS |
+--------------------+------------------+---------+---------+-----------+------------+-----------+------------+--------------------+--------+
| 192.168.1.144:2379 | 720141fbb7123c6f |   3.5.8 |  1.2 GB |      true |      false |        51 |   16960478 |           16960478 |        |
+--------------------+------------------+---------+---------+-----------+------------+-----------+------------+--------------------+--------+

Relevant log output

No response

@serathius
Member

Looks like #17098 to me, but to confirm, can you provide:

  • percentage of PUT requests vs all other requests.
  • average size of payload in PUT request
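If it helps, both numbers can be derived from etcd's Prometheus /metrics endpoint, which exposes per-method `grpc_server_handled_total` counters via go-grpc-prometheus. A minimal sketch (the sample counter values below are made up for illustration; in practice, scrape `http://<host>:2379/metrics`):

```python
# Hypothetical sketch: compute the share of PUT requests from
# grpc_server_handled_total counters. The SAMPLE text is fabricated;
# real metrics come from etcd's /metrics endpoint.
SAMPLE = """\
grpc_server_handled_total{grpc_method="Put",grpc_service="etcdserverpb.KV",grpc_type="unary",grpc_code="OK"} 950
grpc_server_handled_total{grpc_method="Range",grpc_service="etcdserverpb.KV",grpc_type="unary",grpc_code="OK"} 40
grpc_server_handled_total{grpc_method="Txn",grpc_service="etcdserverpb.KV",grpc_type="unary",grpc_code="OK"} 10
"""

def put_share(metrics_text: str) -> float:
    """Return PUT requests as a fraction of all handled gRPC requests."""
    total = put = 0.0
    for line in metrics_text.splitlines():
        if line.startswith("grpc_server_handled_total"):
            value = float(line.rsplit(" ", 1)[1])
            total += value
            if 'grpc_method="Put"' in line:
                put += value
    return put / total

print(f"{put_share(SAMPLE):.0%}")  # 95%
```

Average PUT payload size is harder to get from metrics alone; `etcd_network_client_grpc_received_bytes_total` divided by the PUT counter gives a rough upper bound.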

@ahrtr
Member

ahrtr commented Jul 31, 2024

Could you also attach the generated pprof.etcd.alloc_objects.alloc_space.inuse_objects.inuse_space.*.pb.gz file?

@jmhbnz jmhbnz added area/performance priority/awaiting-more-evidence Lowest priority. Possibly useful, but not yet enough support to actually get it done. labels Aug 1, 2024
@ahrtr
Member

ahrtr commented Aug 5, 2024

The database size is < 2 MB (etcdctl get --write-out=json --prefix / gives me 1966139 byte dump).
The database takes 1.5 GB on disk (/var/lib/etcd0.etcd).

Please also perform a compaction + defragmentation and retry, and let us know the result. Thanks.

@socketpair

socketpair commented Aug 5, 2024

@ahrtr The topic starter's concern is RAM consumption, not disk usage. If they are connected somehow, then yes, it might help, but disk usage itself is not the issue. Note that RAM usage is about 10 times bigger than the disk usage.

@ahrtr
Member

ahrtr commented Aug 5, 2024

If they are connected somehow

They are related, bbolt mmaps the file into memory directly.
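To illustrate the distinction (a minimal sketch in Python, not etcd/bbolt code): a file-backed mapping is served from the page cache, and the kernel can evict those pages under memory pressure and re-read them from disk, whereas anonymous heap allocations always count fully against RSS.

```python
import mmap
import os
import tempfile

# Write a small file, then map it read-only, roughly the way bbolt maps
# its database file. Reads through `mm` touch file-backed pages, which
# the kernel can drop at any time; this memory is not a leak.
fd, path = tempfile.mkstemp()
os.write(fd, b"bolt" * 1024)  # 4 KiB file
os.close(fd)

with open(path, "rb") as f:
    mm = mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ)
    first = mm[:4]  # read via the mapping, no explicit read() call
    mm.close()
os.unlink(path)
print(first)  # b'bolt'
```

The 10.88 GB in the heap profile above, by contrast, is ordinary Go heap (anonymous memory), so the mmap only explains the portion of RSS that tracks the on-disk file size.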

@vitalif
Contributor Author

vitalif commented Aug 5, 2024

Here's the profile pprof.etcd.alloc_objects.alloc_space.inuse_objects.inuse_space.012.pb.gz

The percentage of PUT requests was, I think, close to 100% )). There were puts plus 6 watchers, that's all. The RPC rate graph shows 0.85 rps and 2.5 MByte/s of traffic.

Memory usage actually dropped the next day, along with update traffic, when I restarted my software and it stopped updating... something. I think it was old statistics, but I'm not sure. 2.5 MB/s is quite a lot given that the total amount of data in etcd is 2 MB :-). My software updates statistics every 5 seconds in that installation, so 2.5 MB/s is 12.5 MB every 5 seconds... still more than 2 MB. So I'm not sure what it was. Maybe these were some unapplied compare failures and retries... I'll monitor this instance for some more time.

Here's the screenshot from grafana:
[grafana screenshot]

@socketpair

socketpair commented Aug 5, 2024

@ahrtr okay, but

  1. Does it map the file multiple times to different places?
  2. If yes, it does not increase memory usage (RSS), because the mappings share memory.
  3. If no, why do the sizes differ so much? It seems etcd allocates huge amounts of anonymous memory (copies?).

@socketpair

socketpair commented Aug 5, 2024

I'm not an expert in Go. Is there any memory-mapping tool? I mean a tool that shows what each memory block is used for.

@clement2026
Contributor

clement2026 commented Sep 6, 2024

@socketpair I get that you’re concerned about memory size. You might want to try setting the --experimental-snapshot-catchup-entries flag. The default value is 5000, but you can lower it to, say, 500, which should bring your memory usage down to around 3GB.

The downside of setting a low value for --experimental-snapshot-catchup-entries is that slow followers might need to catch up with the leader using a snapshot, which is less efficient. But if you’re running a single etcd instance, you can even set it to 1 and keep your memory usage under 2GB.

Based on your posts, it seems the average size of a put request is around 2MB.

Your etcd instance’s heap size is roughly between (--experimental-snapshot-catchup-entries) * 2MB and (--experimental-snapshot-catchup-entries + --snapshot-count) * 2MB. In your case, that’s about 10GB to 12GB.
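The bounds above can be sketched numerically. All figures are assumptions taken from this thread: ~2 MB average PUT payload, --snapshot-count=1000, and --experimental-snapshot-catchup-entries=5000.

```python
# Back-of-the-envelope estimate, not etcd code: raft keeps roughly
# between `catchup_entries` and `catchup_entries + snapshot_count`
# marshalled entries in memory before a snapshot lets it truncate the log.
def estimated_heap_bytes(catchup_entries: int, snapshot_count: int,
                         avg_put_bytes: int) -> tuple[int, int]:
    low = catchup_entries * avg_put_bytes
    high = (catchup_entries + snapshot_count) * avg_put_bytes
    return low, high

low, high = estimated_heap_bytes(5000, 1000, 2_000_000)
print(f"{low / 1e9:.0f} GB to {high / 1e9:.0f} GB")  # 10 GB to 12 GB
```

With --experimental-snapshot-catchup-entries lowered to 500, the same arithmetic gives roughly 1 GB to 3 GB, which matches the tests below.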

I’m sharing some tests here for your reference.

Test 1

--experimental-snapshot-catchup-entries 5000

Setup:

rm -rf etcd0.etcd; etcd -name etcd0 \
--data-dir etcd0.etcd \
--enable-pprof=true \
--snapshot-count=1000 \
--advertise-client-urls http://localhost:2379 \
--listen-client-urls http://localhost:2379 \
--initial-advertise-peer-urls http://localhost:2380 \
--listen-peer-urls http://localhost:2380 \
--initial-cluster-token vitastor-etcd-1 \
--initial-cluster etcd0=http://localhost:2380 \
--initial-cluster-state new \
--max-txn-ops=100000 \
--max-request-bytes=104857600 \
--auto-compaction-mode=periodic \
--auto-compaction-retention=5s \
--experimental-snapshot-catchup-entries 5000

Benchmark script:

./bin/tools/benchmark txn-mixed --total=50000000 --val-size=2000000 --key-space-size=1 --rw-ratio=1
[screenshot: test 1 memory usage]

Test 2

The setup and benchmark script are the same as in Test 1, except --experimental-snapshot-catchup-entries 500.

[screenshot: test 2 memory usage]

Test 3

--experimental-snapshot-catchup-entries 1

[screenshot: test 3 memory usage]

Hope it helps.

@serathius
Member

Awesome analysis @clement2026; it makes me very excited about your work in #17098. It would be awesome to paste those findings there too.

@clement2026
Contributor

I’m glad you found the analysis helpful. I’ll include the findings in #17098.
