-
Notifications
You must be signed in to change notification settings - Fork 580
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
cloud_storage: cleanup empty directories on startup #6533
Conversation
b07a874
to
3f02d5f
Compare
This passed tests here: (the release-clang-amd64 github step looks hung because the job name changed in buildkite) |
|
||
auto entry_path = std::filesystem::path(target) | ||
/ std::filesystem::path(entry.name); | ||
files_in_dir++; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: I think files_in_dir
could just be a bool, and toggled with if unlikely(!files_in_dir)
, although admittedly the memory and cpu savings would be tiny.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Code looks fine to me, but I'm not familiar with how the system reacts to these things being removed, so I'll defer approval to someone who does. :)
struct walk_result { | ||
uint64_t cache_size{0}; | ||
std::vector<file_list_item> deletion_candidates; | ||
std::vector<ss::sstring> empty_dirs; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: I was a bit confused by the naming initially as empty_dirs
are also "candidates for deletion", but I guess the file_list_item
type indicates the fact that deletion_candidates
are files, which justifies the different treatment.
struct walk_result { | ||
uint64_t cache_size{0}; | ||
std::vector<file_list_item> deletion_candidates; | ||
}; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍
Related to this follow-up - https://github.com/redpanda-data/core-internal/issues/114 |
Also, maybe slightly related to this one - https://github.com/redpanda-data/core-internal/issues/111 |
Structs with named members are preferred to pairs.
...including the new empty dir deletion.
No need to actually count files, when we are only interested in whether a directory is empty.
The "candidates for deletion" are only such if the caller uses them that way. From the point of view of the walker, it's just a list of all regular files.
Walking a tree doesn't imply sorting by atime: that sort belongs in the LRU cache trimming function.
976a6d4
to
8ab8bf1
Compare
This is a defensive measure. Use of a bare lambda is currently safe because of how list_directory is implemented: the lambda gets moved into a stream object internally, which keeps the lambda and its captures alive as long as it will be used. However, we generally avoid passing reference captures into lambdas without some mitigation, and it isn't strictly guaranteed that this directory listing function will behave this way, so we add a ss::coroutine::lambda wrapper to make it more obviously correct.
8ab8bf1
to
68c16ee
Compare
Added a commit to make the coroutine lambda in There are also commits to address style notes from the earlier version, taking the opportunity to clean this up a bit while touching it. |
std::set<std::string> result_paths(cloud_storage::walk_result& r) { | ||
std::set<std::string> paths; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: could this be unordered_set
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lgtm
/backport v22.2.x |
On reflection this seems worth backporting so that we have a simple option for systems in the field pre-22.3 |
Cover letter
This is to cover us against upgraded systems from earlier versions that didn't properly remove empty directories, and also to cover the case where redpanda stops in between removing a file and checking for an empty directory to delete.
Backport Required
UX changes
None
Release notes
Improvements