Fix a use-after-move in partition_balancer
#7406
great catch
great find 🤯
bb547a2 to d1a16bc
force-push: lint failure
@dlex - does this need backport?
Only to 22.3
CI failure due to #6903, retrying
/backport v22.3.x
The pull request is not merged yet. Cancelling backport...
CI failure due to #7418, retrying
           allocation_domain = get_allocation_domain(
             del_cmd.key)](std::error_code ec) {
              if (ec == errc::success) {
                  vassert(
                    topic_assignments.has_value(),
                    "Topic had to exist before successful delete");
                  deallocate_topic(
                    *topic_assignments, in_progress, allocation_domain);
      .then(
        [this,
         tp_ns = std::move(del_cmd.key),
         topic_assignments = std::move(topic_assignments),
         in_progress = std::move(in_progress)](std::error_code ec) {
            if (ec == errc::success) {
nice catch.
prior to this change get_allocation_domain ran before the continuation (when the capture list was created). but now it runs when the continuation is scheduled. is that something to be concerned with? in this case it might make sense to remove std::move from tp_ns in the capture list and pay a small string copy cost to keep things simple and preserve the semantics.
No concern here. tp_ns is not changed (the capture is not mutable), the lambda is not a coro, and get_allocation_domain() is a pure function.
awesome. thanks for the explanation
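The two options discussed in this thread can be sketched side by side. This is only an illustration under stated assumptions: `domain_of`, `eager_with_copy`, and `lazy_with_move` are hypothetical names, a plain `std::string` stands in for the topic key, and the domain values 1 and 0 are made up for the example. It shows that when the function is pure and the capture is never modified, computing the domain at capture time (with a copy) and computing it when the lambda runs (with a move) return the same result.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Hypothetical pure function standing in for get_allocation_domain().
// The mapping to 1 and 0 is an assumption made for this sketch.
int domain_of(const std::string& topic) {
    return topic == "__consumer_offsets" ? 1 : 0;
}

// Reviewer's alternative: copy the key instead of moving it, so reading
// `topic` again in the capture list is safe and the domain is computed
// eagerly, when the capture list is built.
auto eager_with_copy(std::string topic) {
    return [t = topic, domain = domain_of(topic)] {
        return std::pair<std::string, int>{t, domain};
    };
}

// Shape described in the reply: move the key into the closure and compute
// the domain when the lambda runs. The lambda is not `mutable`, so `t`
// cannot be modified between capture and invocation.
auto lazy_with_move(std::string topic) {
    return [t = std::move(topic)] {
        return std::pair<std::string, int>{t, domain_of(t)};
    };
}
```

Calling `eager_with_copy("__consumer_offsets")()` and `lazy_with_move("__consumer_offsets")()` yields equal pairs; the lazy form just avoids the extra string copy.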
PR looks good--some minor feedback but no blockers from me.
/backport v22.3.x
fyi i updated the release notes section with text from what @dlex suggested
A use-after-move in a lambda capture expression may lead to incorrect initialization of the allocation domain to 0, which would affect deallocation of __consumer_offsets partitions, causing a vassert in allocation_node.
Related to #7343 (fixes one of the underlying issues).
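A minimal sketch of the bug pattern described above, not the Redpanda code: `topic_key`, `domain_for`, `make_buggy`, and `make_fixed` are hypothetical names, and the domain values are assumptions chosen for illustration. In the buggy shape the key is moved into one capture and then read again by a later capture in the same list. A moved-from `std::string` is left in a valid but unspecified state (commonly empty), so the lookup may silently fall through to the default domain.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Hypothetical stand-in for the topic key; not the Redpanda type.
struct topic_key {
    std::string topic;
};

// Hypothetical pure lookup: 1 for the offsets topic, 0 for everything else.
int domain_for(const topic_key& k) {
    return k.topic == "__consumer_offsets" ? 1 : 0;
}

// Buggy shape: `key` is moved into `k`, then read again for `domain`.
// If the move has already happened, `key.topic` is in an unspecified
// (typically empty) state and `domain` may come out as 0.
auto make_buggy(topic_key key) {
    return [k = std::move(key), domain = domain_for(key)] {
        (void)k; // captured only to mirror the original capture list
        return domain;
    };
}

// Fixed shape: capture the key once and derive the domain from the
// captured value when the lambda runs.
auto make_fixed(topic_key key) {
    return [k = std::move(key)] { return domain_for(k); };
}
```

With these assumptions, `make_fixed(topic_key{"__consumer_offsets"})()` returns 1, while the result of `make_buggy` for the same input depends on the moved-from state and should not be relied on.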
Backports Required
UX Changes
Release Notes
Bug Fixes
__consumer_offsets
partitions