Do not wait for the reallocation batch to finish #9992

Merged (5 commits) on Apr 14, 2023

Member

@mmaslankaprv mmaslankaprv commented Apr 12, 2023

Changed the logic in `cluster::members_backend` to calculate new
reallocations whenever the pending reallocation count is smaller than the
maximum reallocation batch size, instead of waiting for the current batch
to finish. This way the available learner recovery bandwidth may be fully
utilised and node operations may finish faster.

Backports Required

  • none - not a bug fix
  • none - this is a backport
  • none - issue does not exist in previous branches
  • none - papercut/not impactful enough to backport
  • v23.1.x
  • v22.3.x
  • v22.2.x

Release Notes

Improvements

  • Faster rebalancing after a node is added or decommissioned.

Signed-off-by: Michal Maslanka <michal@redpanda.com>
Signed-off-by: Michal Maslanka <michal@redpanda.com>
@mmaslankaprv force-pushed the queue-based-balancing branch 2 times, most recently from b3cf6d8 to c9b8238, on April 12, 2023 13:56
Replaced a vector of reallocations with a map where `ntp` is the key. This
allows for fast lookups when checking whether a partition was already
scheduled for reallocation.

Signed-off-by: Michal Maslanka <michal@redpanda.com>
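
To illustrate the lookup benefit described in the commit message above, here is a minimal, self-contained sketch. It is not the code from `members_backend.cc`: the `ntp` and `reallocation` structs, the `reallocation_map` alias, and the `schedule_once` helper are hypothetical stand-ins, and `std::map` is only an assumption about the container (the PR says "a map" without naming the type).

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <string>
#include <tuple>
#include <utility>
#include <vector>

// Hypothetical stand-in for a namespace/topic/partition identifier.
struct ntp {
    std::string ns;
    std::string topic;
    int32_t partition;

    // Ordering required by std::map; compares fields lexicographically.
    bool operator<(const ntp& other) const {
        return std::tie(ns, topic, partition)
               < std::tie(other.ns, other.topic, other.partition);
    }
};

// Hypothetical reallocation record; the real one likely carries more state.
struct reallocation {
    std::vector<int32_t> target_replicas;
};

// Assumed container: an ordered map keyed by ntp.
using reallocation_map = std::map<ntp, reallocation>;

// Schedules a reallocation unless the partition is already present.
// Returns true if a new entry was inserted, false if one already existed.
bool schedule_once(reallocation_map& pending, const ntp& p, reallocation r) {
    // try_emplace performs a single O(log n) lookup and leaves an
    // existing entry untouched, so no linear scan of a vector is needed.
    return pending.try_emplace(p, std::move(r)).second;
}
```

With a vector, the "already scheduled?" check would need a linear scan per partition; keying by `ntp` turns it into a single map lookup.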
Renamed `reallocate_replica_set` method to
`reconcile_reallocation_state` to express the complexity of state
changes that the method is responsible for.

Signed-off-by: Michal Maslanka <michal@redpanda.com>
Changed the logic in `cluster::members_backend` to calculate
reallocations when the pending reallocation count is smaller than the max
reallocation batch size. This way the available learner recovery
bandwidth may be fully utilized and node operations may finish faster.

Signed-off-by: Michal Maslanka <michal@redpanda.com>
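
The scheduling change described in the commit message above can be sketched roughly as follows. This is a simplified illustration under stated assumptions, not the actual `cluster::members_backend` logic: `balancer_state`, `top_up`, and `finish` are hypothetical names, and partitions are reduced to plain integer ids.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <deque>
#include <vector>

// Hypothetical state: partitions still waiting to move and moves in flight.
struct balancer_state {
    std::deque<int> waiting;
    std::vector<int> in_flight;
};

// Starts new moves whenever fewer than max_batch_size are in flight,
// instead of waiting for the whole previous batch to drain.
// Returns the number of moves started by this call.
std::size_t top_up(balancer_state& s, std::size_t max_batch_size) {
    std::size_t started = 0;
    while (s.in_flight.size() < max_batch_size && !s.waiting.empty()) {
        s.in_flight.push_back(s.waiting.front());
        s.waiting.pop_front();
        ++started;
    }
    return started;
}

// Marks one move as finished, which frees a slot for the next top_up call.
void finish(balancer_state& s, int id) {
    auto& v = s.in_flight;
    v.erase(std::remove(v.begin(), v.end(), id), v.end());
}
```

In this sketch, as soon as one move finishes, the next `top_up` call fills the freed slot, so the number of in-flight moves stays close to the batch limit rather than dropping to zero between batches.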
@mmaslankaprv mmaslankaprv merged commit 8c91f1c into redpanda-data:dev Apr 14, 2023
@mmaslankaprv mmaslankaprv deleted the queue-based-balancing branch April 14, 2023 12:45
@piyushredpanda
Contributor

Let's not forget to backport, @mmaslankaprv

@mmaslankaprv
Member Author

/backport v23.1.x

@vbotbuildovich
Collaborator

Failed to run cherry-pick command. I executed the below command:

git cherry-pick -x 9436afc2824a81ef785b6f3de63b7c5dff07ed4c 09c5528b4b5becc82def4bf3d8c65f76ea3b4f40 efa7edb542576373082519458ccc0496aae6b39d c70a8615fdbc98223c829e4f1e8cbda39dc09c67 939bf114d9e166d75962b676599864f529526e56

Workflow run logs.

@mmaslankaprv mmaslankaprv mentioned this pull request Apr 14, 2023
7 tasks
mmaslankaprv added a commit that referenced this pull request Apr 17, 2023
vshtokman added a commit that referenced this pull request Apr 27, 2023
4 participants