tests: add topic_delete_unavailable_test, tweak tiered storage topic deletion order #7460
Conversation
Force-pushed from 9f309c3 to ebbc22f
Looks good to me. A couple very minor nits and a question.
objects_before = set(
    self.redpanda.s3_client.list_objects(
        self.si_settings.cloud_storage_bucket, topic=self.topic))
assert (len(objects_before) > 0)
nit: Extra set of parens around logical condition
cleaned this up while rebasing
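For reference, a minimal sketch of what the de-parenthesized assertion could look like (the set literal is a hypothetical stand-in for the real bucket listing, not the test's actual data):

```python
# Placeholder stand-in for the objects returned by list_objects().
objects_before = {"segment-0", "segment-1"}

# assert takes a plain expression, so the extra parentheses are unnecessary.
assert len(objects_before) > 0
```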
after_keys = set([
    o.Key for o in self.redpanda.s3_client.list_objects(
        self.si_settings.cloud_storage_bucket, topic=next_topic)
])
nit: You can pass an iterator into the set constructor, but it really doesn't matter here.
cleaned this up while rebasing
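A minimal illustration of that nit (the namedtuple is a hypothetical stand-in for the objects returned by the S3 listing):

```python
from collections import namedtuple

# Hypothetical stand-in for the objects returned by list_objects().
S3Object = namedtuple("S3Object", ["Key"])
listing = [S3Object(Key="topic/0-1.log"), S3Object(Key="topic/1-2.log")]

# The generator can be passed straight to set(); no intermediate list is needed.
after_keys = set(o.Key for o in listing)

# Equivalently, a set comprehension:
after_keys = {o.Key for o in listing}
```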
self._populate_topic(next_topic)
after_keys = set([
    o.Key for o in self.redpanda.s3_client.list_objects(
        self.si_settings.cloud_storage_bucket, topic=next_topic)
])
I'm a bit surprised you don't have to wait here. Does an upload happen quickly enough?
populate_topic includes a wait for local segments to be trimmed, so we are guaranteed to already have objects in the bucket.
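For context, a speculative sketch of why that guarantee holds (the helper body below is guessed from the diff and this discussion, not taken from the actual test source; `self._produce` is an invented name):

```python
def _populate_topic(self, topic_name):
    # Produce enough data to roll several segments per partition
    # (sizes here are illustrative, not the test's real values).
    self._produce(topic_name, size_bytes=10 * 1024 * 1024)

    # Local segments are only trimmed after they have been uploaded to
    # cloud storage, so once this wait completes the bucket is guaranteed
    # to already contain objects for the topic.
    for partition in range(self.partition_count):
        wait_for_segments_removal(self.redpanda, topic_name, partition, 5)
```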
/ci-repeat 1 skip-unit dt-repeat=20 tests/rptest/tests/topic_delete_test.py::TopicDeleteCloudStorageTest
Force-pushed from ebbc22f to 5f95949
Unavailability of the S3 backend should not prevent cleanup of local objects.
Convenient when a test wants the objects for a particular topic.
This tests deleting a tiered storage topic while the S3 backend is unavailable. Related: redpanda-data#7433
This affected the new topic_delete_unavailable_test but also existing tests. The test intended to write 10MB of data per partition (i.e. ten 1MB segments), but was actually writing 10MB total across 3 partitions. wait_for_segments_removal(5) was proceeding immediately because partitions only had ~3 segments to begin with.
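A rough back-of-the-envelope illustration of that sizing, assuming 1 MiB segments and 3 partitions as described above (numbers taken from this description, not from the test source):

```python
SEGMENT_SIZE = 1024 * 1024   # 1 MiB segments
PARTITION_COUNT = 3

# Intended: 10 MiB per partition, i.e. ~10 segments in each partition.
intended_total = 10 * SEGMENT_SIZE * PARTITION_COUNT

# The bug: only 10 MiB total was produced, leaving each partition with
# roughly 10 / 3 ≈ 3 segments, so waiting for the segment count to drop
# to 5 succeeded immediately without any eviction having happened.
buggy_total = 10 * SEGMENT_SIZE
segments_per_partition = buggy_total // PARTITION_COUNT // SEGMENT_SIZE
assert segments_per_partition < 5
```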
Force-pushed from 5f95949 to e2d5652
Rebased for conflict.
@VladLazar this should be good for a re-review after the holidays.
  # Wait for segments evicted from local storage
- for i in range(0, 3):
+ for i in range(0, self.partition_count):
      wait_for_segments_removal(self.redpanda, topic_name, i, 5)
Maybe wait_for_segments_removal should fail if the initial number of segments is less than the final desired count.
Hmm, it would be nice, but I think it's going to be racy unless the test is very confident that the segment count cannot end up lower than the target. I have limited trust in most of the tests that assert segment counts as a proxy for data retention.
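To make the concern concrete, a hypothetical version of the suggested guard might look like the sketch below (the signature and `count_segments` are illustrative, not the real ducktape helpers); the race is called out in the comment:

```python
from ducktape.utils.util import wait_until

def wait_for_segments_removal(redpanda, topic, partition, count):
    # Suggested guard: fail fast if there are already fewer segments than
    # the target instead of returning instantly.
    initial = count_segments(redpanda, topic, partition)  # count_segments is illustrative
    if initial < count:
        # Racy: retention may legitimately have trimmed below the target
        # before this check runs, so the assertion could fire spuriously.
        raise AssertionError(
            f"only {initial} segments present, expected at least {count}")

    wait_until(lambda: count_segments(redpanda, topic, partition) <= count,
               timeout_sec=120,
               backoff_sec=2,
               err_msg="segments not removed in time")
```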
Test failure was:
Rebase + merge after #7547
A new test covers deleting a tiered storage topic while the S3 backend is unavailable.
The controller behavior in this situation is to bypass the topic, not retry applying the controller log message, so in order to avoid holding up local deletion we must do the local deletion first. The resulting behavior is that objects in S3 may be orphaned (this PR doesn't cause that, it just avoids the local data being left behind as well).
Related: #7433
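A rough conceptual sketch of that ordering (plain Python stand-ins, not the actual controller code; the helper names and exception are invented for illustration):

```python
class CloudStorageUnavailable(Exception):
    pass

def delete_local_partitions(topic):
    print(f"removed local data for {topic}")

def delete_remote_objects(topic):
    # Simulate an unreachable S3 backend.
    raise CloudStorageUnavailable(topic)

def delete_topic(topic):
    # Local deletion runs first, so an unavailable backend cannot block it.
    delete_local_partitions(topic)
    try:
        # Best effort: if this fails, the objects are orphaned in S3 rather
        # than the deletion being retried and holding up local cleanup.
        delete_remote_objects(topic)
    except CloudStorageUnavailable:
        pass

delete_topic("panda-topic")
```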
Backports Required
UX Changes
None
Release Notes