[v23.1.x] cloud_storage: Prevent segment reuploads from adjacent segment merger #9845
Backport #9657
This PR fixes #9651
Currently, the `segment_collector` is initialised using the desired size of the segment reupload. The adjacent segment merger scans the manifest and finds the right upload candidate (a series of small segments). Then it calculates the candidate's size and uses it to create the `segment_collector`. This is not correct: if the last segment is not perfectly aligned, the collector returns a result without that last segment, because the segment would overflow the size target and is therefore discarded.
With some configurations (close min and target values for the adjacent segment merger), this can lead to a situation where a segment gets reuploaded multiple times.
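The failure mode described above can be illustrated with a small model. This is only a sketch, not the Redpanda implementation: `Segment`, `Run`, and `collect_by_size` are hypothetical names, and the misalignment is assumed to come from a last segment that extends past the end of the run the merger measured.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Segment:
    base_offset: int       # first offset in the segment
    committed_offset: int  # last offset in the segment (inclusive)
    size_bytes: int


@dataclass(frozen=True)
class Run:
    begin_offset: int
    end_offset: int  # inclusive
    size_bytes: int


def collect_by_size(segments: List[Segment], max_size: int) -> Optional[Run]:
    """Hypothetical size-driven collector: take whole segments while they fit."""
    begin = None
    end = None
    size = 0
    for seg in segments:
        if size + seg.size_bytes > max_size:
            # The segment would overflow the size target, so it is dropped
            # and scanning stops here.
            break
        if begin is None:
            begin = seg.base_offset
        end = seg.committed_offset
        size += seg.size_bytes
    if begin is None:
        return None
    return Run(begin, end, size)


# Assumed scenario: the merger measured a run of 250 bytes covering offsets
# 0..249, but the segments the collector scans are 100 bytes each, so the
# third one is only partially inside the run.
segments = [
    Segment(0, 99, 100),
    Segment(100, 199, 100),
    Segment(200, 299, 100),
]
run = collect_by_size(segments, max_size=250)
```

In this model the collector stops after the second segment, so the result covers fewer offsets and fewer bytes than the run the merger measured, and the tail of the run is left for a later pass.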
To avoid this, several things were added:
- `adjacent_segment_merger` checks the size of the segment run that it gets out of the `segment_collector` before uploading. If the size is smaller, it discards the result.
- `segment_collector` is updated to accept an extra parameter, which is used as a target offset for the end of the offset range. If this parameter is set, the `segment_collector` continues scanning until it reaches the segment that contains the target offset. It then adjusts the upload so that this last segment is not fully uploaded; instead, it is uploaded only up to the target offset.
Backports Required
Release Notes
Bug Fixes