
Node-local core assignment: core count decrease #20312

Merged
ztlpn merged 16 commits into redpanda-data:dev from flex-assignment-decrease-core-count
Jun 28, 2024

Conversation

@ztlpn (Contributor) commented Jun 27, 2024

Implement copying partition data from extra kvstore shards (i.e. kvstore shards with ids >= current shard count) and use it to allow decreasing core count.
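
To make the mechanism concrete before the details below, here is a minimal, self-contained C++ sketch of the idea (hypothetical names and a naive round-robin placement policy, not the actual Redpanda implementation): shard ids at or above the new shard count are treated as extra, their kvstore entries are copied to surviving shards, and an extra shard is cleared only once its copy has succeeded.

```cpp
// Hypothetical illustration only: std::map stands in for the persistent
// per-shard kvstore, and the placement policy is deliberately simplistic.
#include <cstdint>
#include <map>
#include <string>
#include <vector>

using shard_id = uint32_t;
using kvstore_data = std::map<std::string, std::string>;

// Drain kvstore shards with ids >= new_shard_count into the surviving shards.
void drain_extra_shards(std::vector<kvstore_data>& shards, shard_id new_shard_count) {
    for (shard_id src = new_shard_count; src < shards.size(); ++src) {
        // naive placement for the sketch: round-robin over surviving shards
        shard_id dst = src % new_shard_count;
        for (const auto& [key, value] : shards[src]) {
            shards[dst].insert_or_assign(key, value);
        }
        // remove only after the copy for this shard is fully successful
        shards[src].clear();
    }
    shards.resize(new_shard_count);
}
```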

Backports Required

  • none - not a bug fix
  • none - this is a backport
  • none - issue does not exist in previous branches
  • none - papercut/not impactful enough to backport
  • v24.1.x
  • v23.3.x
  • v23.2.x

Release Notes

Features

  • Allow decreasing core count if node-local core assignment is enabled.

@ztlpn added this to the v24.2.1-rc1 milestone Jun 28, 2024
@ztlpn force-pushed the flex-assignment-decrease-core-count branch from 4a37528 to 7c48853 June 28, 2024 12:19
@ztlpn requested a review from mmaslankaprv June 28, 2024 12:21
ztlpn added 16 commits June 28, 2024 14:50
If the number of cores was reduced, we need some way to access the kvstores for the extra cores. Allow constructing a kvstore for shard ids greater than or equal to the number of cores to achieve that.
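
As a rough sketch of what constructing a kvstore for an out-of-range shard id can look like (the directory layout here is an assumption for illustration, not Redpanda's actual on-disk format), the key property is that nothing in the construction depends on the number of running cores:

```cpp
// Assumed layout for illustration: each kvstore shard lives in a directory
// named after its shard id, so a kvstore for an id beyond the current core
// count can still be located and opened when its data needs to be drained.
#include <cstdint>
#include <filesystem>
#include <string>

std::filesystem::path kvstore_dir_for_shard(
  const std::filesystem::path& data_dir, uint32_t shard_id) {
    // valid for any shard id, including ids >= the number of running cores
    return data_dir / "kvstore" / std::to_string(shard_id);
}
```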
Since kvstore operations can in theory fail, copying everything and then removing (after the copy is fully successful) is better than moving pieces of kvstore state one by one (in practice a move is still a piecewise copy-then-remove).

Second reason: we need separate remove helpers to clean up garbage and obsolete kvstore data.
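
A minimal sketch of that failure argument (an in-memory stand-in with a put callback modelling a kvstore write that may fail; names are assumptions): because removal happens only after every copy has succeeded, a mid-way failure leaves the source intact and the whole operation can simply be retried.

```cpp
#include <functional>
#include <map>
#include <string>

using kvstore_data = std::map<std::string, std::string>;

// Returns true if the source was fully copied and then removed.
bool copy_then_remove(
  kvstore_data& src,
  kvstore_data& dst,
  const std::function<bool(kvstore_data&, const std::string&, const std::string&)>& put) {
    for (const auto& [key, value] : src) {
        if (!put(dst, key, value)) {
            return false; // copy failed: source left untouched, safe to retry
        }
    }
    src.clear(); // remove only once the copy is fully successful
    return true;
}
```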
Sometimes a partition should still exist on this node, but its kvstore
state is no longer relevant (e.g. it was transferred to a different
shard but hadn't been deleted yet). Handle this case in
shard_placement_table and controller_backend.
…d transfers

Previously, if a cross-shard transfer failed, we couldn't really tell on the source shard whether we should retry or not (we may have failed to remove obsolete state after a successful transfer, in which case retrying is dangerous). Mark the state on the source shard obsolete immediately after a successful transfer to fix that.

Also introduce more detailed failure conditions in prepare_transfer(): are we waiting for the source or the destination shard? This will come in handy when we implement moving data from extra shards, because we'll have to clean up the destination ourselves.
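
A hedged sketch of those more detailed failure conditions (names and parameters are assumptions; the real prepare_transfer() logic in shard_placement_table is richer): the caller learns whether it is blocked on the source shard or on the destination shard, which matters once destinations holding stale data from extra shards must be cleaned up explicitly.

```cpp
enum class transfer_block {
    none,                    // both sides ready, transfer can proceed
    waiting_for_source,      // source shard has not produced its state yet
    waiting_for_destination, // destination shard still holds stale data
};

// Illustrative decision only; parameter names are invented for the sketch.
transfer_block prepare_transfer_sketch(bool source_ready, bool destination_clean) {
    if (!source_ready) {
        return transfer_block::waiting_for_source;
    }
    if (!destination_clean) {
        // data left over on the destination must be removed before retrying
        return transfer_block::waiting_for_destination;
    }
    return transfer_block::none;
}
```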
Pass the current number of kvstore shards to the start method and, where possible, move existing partitions on extra shards to one of the current shards.
Calculate the maximum allowed number of partition replicas with the new core count and reject the core count decrease if the total number of partition replicas is greater.
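
The check itself is simple arithmetic; a sketch under assumed names (the per-core replica limit is taken as a parameter here rather than the real configuration property):

```cpp
#include <cstdint>

// Reject the decrease if the replicas already hosted on the node would exceed
// the capacity implied by the new, smaller core count.
bool core_count_decrease_allowed(
  uint32_t new_core_count,
  uint64_t partition_replicas_on_node,
  uint64_t max_replicas_per_core) {
    const uint64_t max_allowed = max_replicas_per_core * uint64_t(new_core_count);
    return partition_replicas_on_node <= max_allowed;
}
```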
Now that shard_balancer will copy partition data from extra kvstore
shards, we can relax the check in validate_configuration_invariants.
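
A before/after sketch of the relaxed invariant (function and parameter names are assumptions): a reduced core count used to be rejected outright, and is now accepted when node-local core assignment is enabled, since the shard balancer will drain the extra kvstore shards.

```cpp
#include <cstdint>

// Hypothetical shape of the relaxed check, not the actual implementation.
bool core_count_change_valid(
  uint32_t old_core_count,
  uint32_t new_core_count,
  bool node_local_core_assignment_enabled) {
    if (new_core_count >= old_core_count) {
        return true; // growing or keeping the core count was always allowed
    }
    // a decrease is acceptable only if the shard balancer can move the data
    return node_local_core_assignment_enabled;
}
```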
@ztlpn force-pushed the flex-assignment-decrease-core-count branch from 7c48853 to a2a27c6 June 28, 2024 13:04
@@ -365,6 +370,9 @@ ss::future<> shard_placement_table::initialize_from_kvstore(
       [&ntp2init_data](shard_placement_table& spt) {
           return spt.scatter_init_data(ntp2init_data);
       });
+    for (auto& spt : extra_spts) {
+        co_await spt->scatter_init_data(ntp2init_data);
+    }
Contributor

any reason why we process existing shard data concurrently, but extra shards one by one?

Contributor Author (ztlpn)

No particular reason, but scatter_init_data is CPU-bound, so no benefit in doing it concurrently either.

@ztlpn merged commit 27c5cee into redpanda-data:dev Jun 28, 2024
19 checks passed
@ztlpn deleted the flex-assignment-decrease-core-count branch June 28, 2024 22:33