chore: add dynamic stash buckets #5258

Open, wants to merge 2 commits into base: main


romange (Collaborator) commented Jun 9, 2025

Before: our Dash segments contained both regular and stash buckets, and each segment was allocated in one shot.
The segment parameters were carefully chosen to be "allocator friendly", so that the segment size was very close to the size of the block that mimalloc actually allocates for it. This indeed gives good memory locality.

However, when a segment is split, the occupancy rate of the two new segments drops to 47-50%, which causes memory spikes for small-value workloads.
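A quick back-of-the-envelope check of that range. This is only a sketch: it assumes a split distributes items evenly between the two new segments, and that a segment is between 94% and 100% full when it splits. Both are assumptions for illustration, not values taken from the Dash code, and `occupancy_after_split` is a hypothetical helper name.

```python
def occupancy_after_split(fill_before: float) -> float:
    """Occupancy of each new segment when one segment filled to
    `fill_before` is split evenly into two segments of the same capacity."""
    return fill_before / 2.0


# Assumed fill levels at the moment of the split (illustrative only).
for fill in (0.94, 1.00):
    after = occupancy_after_split(fill)
    print(f"{fill:.0%} full before split -> {after:.0%} full after split")
```

Because both new segments keep the full capacity of the old layout (regular plus stash buckets), the occupancy can never exceed 50% right after a split.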

This change alters the structure of the segment. Its stash buckets are now allocated separately and lazily, only when an item cannot be inserted into its home buckets. When a segment is split, its stash buckets will most likely be deallocated. For example, a 56+4 bucket segment will be split into two 56-bucket segments, so the occupancy rate of each of them will be ~53.5%. Moreover, we will be able to use more stash buckets in the future to reach a near-100% occupancy rate, because the stash bucket count can now be chosen separately without affecting the block size of the "stripped" segment.
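The lazy-stash idea can be illustrated with a toy model. This is not the Dragonfly implementation: the `Segment` class, the one-home-bucket placement rule, the even split, and `SLOTS_PER_BUCKET = 14` are all assumptions made for the sketch (the slot count cancels out of the occupancy ratio). Only the 56+4 bucket layout comes from the description above.

```python
REGULAR_BUCKETS = 56   # from the 56+4 example above
STASH_BUCKETS = 4      # from the 56+4 example above
SLOTS_PER_BUCKET = 14  # hypothetical; it cancels out of the occupancy ratio


class Segment:
    """Toy segment: per-bucket item counters instead of real slots."""

    def __init__(self):
        self.regular = [0] * REGULAR_BUCKETS
        self.stash = None  # separate allocation, created on first overflow

    def insert(self, home):
        """Add one item whose home bucket is `home`; False means 'must split'."""
        if self.regular[home] < SLOTS_PER_BUCKET:
            self.regular[home] += 1
            return True
        if self.stash is None:
            self.stash = [0] * STASH_BUCKETS  # lazy allocation of the stash
        for i in range(STASH_BUCKETS):
            if self.stash[i] < SLOTS_PER_BUCKET:
                self.stash[i] += 1
                return True
        return False

    def size(self):
        return sum(self.regular) + sum(self.stash or [])

    def capacity(self):
        # The stash only counts toward capacity once it has been allocated.
        buckets = REGULAR_BUCKETS + (STASH_BUCKETS if self.stash is not None else 0)
        return buckets * SLOTS_PER_BUCKET

    def occupancy(self):
        return self.size() / self.capacity()


def fill_segment():
    """Insert round-robin over home buckets until the segment refuses an item."""
    seg, n = Segment(), 0
    while seg.insert(n % REGULAR_BUCKETS):
        n += 1
    return seg


def split(seg):
    """Spread the items evenly over two fresh segments that have no stash yet."""
    halves = (Segment(), Segment())
    for n in range(seg.size()):
        inserted = halves[n % 2].insert((n // 2) % REGULAR_BUCKETS)
        assert inserted
    return halves


full = fill_segment()
left, right = split(full)
print(f"before split: {full.size()}/{full.capacity()} = {full.occupancy():.2%}")
print(f"after split:  {left.size()}/{left.capacity()} = {left.occupancy():.2%}")
```

In this model a completely full 60-bucket segment splits into two stash-less segments holding 420 of 784 slots each, which matches the ~53.5% figure; with the old fixed 60-bucket layout the same split would leave each half at 420 of 840 slots, i.e. 50%.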

romange force-pushed the Pr5 branch 9 times, most recently from d75f1be to f1e47d4 (June 12, 2025 05:33)
romange added this to the v1.32 milestone Jun 12, 2025
romange requested a review from adiholden June 21, 2025 00:12
romange (Collaborator, Author) commented Jun 23, 2025

Hmm, this absolutely does not help. I loaded a dataset and its capacity was actually (negligibly) larger with this PR, while I expected to see a reduction in capacity.

Before

used_memory:68234918128
used_memory_human:63.55GiB
used_memory_peak:68234918128
used_memory_peak_human:63.55GiB
fibers_stack_vms:7471072
fibers_count:115
used_memory_rss:68757360640
used_memory_rss_human:64.04GiB
used_memory_peak_rss:68757360640
maxmemory:104517999001
maxmemory_human:97.34GiB
used_memory_lua:0
object_used_memory:33866070000
type_used_memory_string:33866070000
table_used_memory:33873575808
prime_capacity:880803840
expire_capacity:13440
num_entries:624649320
inline_keys:102878963

After

used_memory:68408793328
used_memory_human:63.71GiB
used_memory_peak:68408793328
used_memory_peak_human:63.71GiB
fibers_stack_vms:7471072
fibers_count:115
used_memory_rss:68939550720
used_memory_rss_human:64.20GiB
used_memory_peak_rss:68939550720
maxmemory:104786893209
maxmemory_human:97.59GiB
used_memory_lua:0
object_used_memory:33866070000
type_used_memory_string:33866070000
table_used_memory:33881964544
prime_capacity:884607360
expire_capacity:13440
num_entries:624649320
inline_keys:102878963
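To make the comparison easier to read, here is a small sketch that diffs the fields that changed between the two dumps. The values are copied from the dumps above; the `load_factor` helper and its reading of `prime_capacity` as the number of slots in the main table are assumptions for illustration.

```python
# Values copied from the "Before" and "After" dumps above.
before = {
    "used_memory": 68234918128,
    "used_memory_rss": 68757360640,
    "table_used_memory": 33873575808,
    "prime_capacity": 880803840,
    "num_entries": 624649320,
}
after = {
    "used_memory": 68408793328,
    "used_memory_rss": 68939550720,
    "table_used_memory": 33881964544,
    "prime_capacity": 884607360,
    "num_entries": 624649320,
}


def relative_change(key):
    """Fractional change of one field between the two runs."""
    return (after[key] - before[key]) / before[key]


def load_factor(stats):
    """Entries per slot, assuming prime_capacity counts main-table slots."""
    return stats["num_entries"] / stats["prime_capacity"]


for key in before:
    delta = after[key] - before[key]
    print(f"{key}: {delta:+d} ({relative_change(key):+.2%})")

print(f"load factor before: {load_factor(before):.4f}")
print(f"load factor after:  {load_factor(after):.4f}")
```

With the same number of entries in both runs, a larger `prime_capacity` means a slightly lower load factor after the change, which is the opposite of the intended effect.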

Signed-off-by: Roman Gershman <roman@dragonflydb.io>