serialize segment directory during full sync / dfs snapshotting #5355

Open
@romange

Description

Background

Currently we cannot preallocate DashTable segments during snapshot load or full sync load on the replica side, because the segment directory in DashTable is in fact a binary tree of depth k encoded into an array of size 2^k.
See here how segments are split:
https://docs.google.com/presentation/d/1w6KyMmPVfgEN7kYiCf7wOw2U_XfsSdad1bVq0RKWsBE/edit?usp=sharing

Consider the following case: a table whose directory has 8 entries overall, pointing to 4 distinct segments.
S1, S1, S1, S1, S2, S3, S4, S4
This corresponds to the tree:

            R
          /  \
        S1   /\
            /\ S4
           S2 S3

Many other tree shapes are possible, of course. Therefore the number of elements alone does not allow us to preallocate the original tree on the destination server.
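To make the directory layout concrete, here is a minimal sketch that expands per-segment local depths into the flat array. `BuildDirectory` is a hypothetical helper written for illustration, not an existing DashTable function, and segments are represented by their position in the input list rather than by real pointers.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper (not part of DashTable): expands local depths, listed in
// directory order, into a flat directory of segment indices.
std::vector<int> BuildDirectory(const std::vector<unsigned>& local_depths) {
  // The global depth k is the largest local depth.
  unsigned global_depth = 0;
  for (unsigned d : local_depths)
    if (d > global_depth) global_depth = d;

  std::vector<int> dir;
  dir.reserve(std::size_t{1} << global_depth);
  for (std::size_t i = 0; i < local_depths.size(); ++i) {
    // A segment with local depth d owns 2^(k - d) consecutive directory slots.
    std::size_t span = std::size_t{1} << (global_depth - local_depths[i]);
    dir.insert(dir.end(), span, static_cast<int>(i));
  }
  // A consistent set of depths fills the 2^k array exactly.
  assert(dir.size() == (std::size_t{1} << global_depth));
  return dir;
}
```

With depths `{1, 3, 3, 2}` this yields the 8-entry layout from the example, while `{2, 2, 2, 2}` (also 4 distinct segments) yields a 4-entry directory, which is why a count by itself cannot identify the tree.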

Goal

Encode the segment tree information for DFS snapshots (not rdb) and for replication, so that the loader is able to recreate the tree.

  1. DashTable::IterateDistinct (bad name - should be renamed to IterateDistinctSegments) iterates over all the unique segments. Segment::local_depth() and Segment::segment_id() are enough to serialize the tree.
    So in the example above: (S1, 1), (S2, 3), (S3, 3), (S4, 2) are enough to encode the tree topology.
  2. We should introduce an additional RDB_ opcode to pass all the unique segments and their local depths on the save side.
  3. We should support this on the loader side.
  4. DashTable currently lacks the ability to grow itself based on a series of segments and their local depths. We should add this API and allow the loader to preallocate the segments.
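The save and load sides of the steps above can be sketched roughly as follows. Everything here is an illustrative model: `SegmentInfo`, `EncodeTopology` and `IsValidTopology` are hypothetical names, the directory holds integer labels instead of segment pointers, and `segment_id` is assumed to be the index of the first directory slot that points to the segment, which may differ from what `Segment::segment_id()` actually returns.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical record for one distinct segment.
struct SegmentInfo {
  std::size_t segment_id;  // assumed: first directory index of the segment
  unsigned local_depth;
};

// Save side: walk the directory and emit one record per distinct segment.
std::vector<SegmentInfo> EncodeTopology(const std::vector<int>& dir,
                                        unsigned global_depth) {
  assert(dir.size() == (std::size_t{1} << global_depth));
  std::vector<SegmentInfo> out;
  std::size_t i = 0;
  while (i < dir.size()) {
    std::size_t run = 1;
    while (i + run < dir.size() && dir[i + run] == dir[i]) ++run;
    // run == 2^(global_depth - local_depth), so subtract log2(run).
    unsigned local_depth = global_depth;
    for (std::size_t r = run; r > 1; r >>= 1) --local_depth;
    out.push_back({i, local_depth});
    i += run;
  }
  return out;
}

// Load side: check that the records describe a complete binary tree before
// preallocating anything.
bool IsValidTopology(const std::vector<SegmentInfo>& segs) {
  unsigned global_depth = 0;
  for (const auto& s : segs)
    if (s.local_depth > global_depth) global_depth = s.local_depth;

  std::size_t next = 0;
  for (const auto& s : segs) {
    std::size_t span = std::size_t{1} << (global_depth - s.local_depth);
    // Segments must be contiguous and aligned to their own span.
    if (s.segment_id != next || s.segment_id % span != 0) return false;
    next += span;
  }
  return next == (std::size_t{1} << global_depth);
}
```

For the 8-entry example directory, `EncodeTopology` produces records with depths 1, 3, 3, 2 at slots 0, 4, 5, 6. A loader could run a check like `IsValidTopology` on the received records and fall back to the current incremental growth if it fails.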

Notes

  • (2) should be behind a flag that is disabled by default, as we do not want to break snapshot compatibility with older versions.
  • For replication we manage compatibility automatically using the DflyVersion that is communicated by the replica.

While this improvement helps load performance in general, it is especially important for tiering algorithms, which need to decide which items to offload during the snapshot load and therefore benefit from a preallocated DashTable.
