-
Notifications
You must be signed in to change notification settings - Fork 543
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[cleaner] Separate process of preparing obfuscations #3262
[cleaner] Separate process of preparing obfuscations #3262
Conversation
Congratulations! One of the builds has completed. 🍾 You can install the built RPMs by following these steps:
Please note that the RPMs should be used only in a testing environment. |
Up until now, the archive abstractions have defined what files `sos clean` will use to prepare the mappings for obfuscation before entering the normal obfuscation loop over every file in every archive. While this is straight forward enough, it is not particularly flexible, and prevents us from easily using other approaches for preparing the mappings beyond what is directly obtained via the parsers (which in some cases need special handling to be prepared at all). Change this by introducing `SoSPrepper`s which will be used to determine now only what files to pass to which parsers on an archive-by-archive basis, but will also allow for manually retrieving items from disaparate sources within the archive(s) and handing those directly to the mappings, without the need for those items to first pass the parser check. Related: RH: SUPDEV-135 Signed-off-by: Jake Hunsaker <jhunsake@redhat.com>
Adds a new prepper for IP network address sourcing. Signed-off-by: Jake Hunsaker <jhunsake@redhat.com>
Adds a new Prepper for handling hostname determination for preparing the mapping and parser. As part of this new prepper, pass the CLI options to each prepper for use. Signed-off-by: Jake Hunsaker <jhunsake@redhat.com>
Adds a new Prepper to handle feeding relevant files to the mac parser for initial preparation of the mappings. Signed-off-by: Jake Hunsaker <jhunsake@redhat.com>
Adds a new Prepper for usernames, and removes the bits from the parser and mapping that otherwise handled the initial preparation of the mapping. The prepper will source from the same initial files, as well as from the `--usernames` command line option. Signed-off-by: Jake Hunsaker <jhunsake@redhat.com>
Adds a new Prepper to handle keyword preparation. This is slightly inefficient since we will only realistically need this once, but baking it into the archive loop does not pose any other problems, and it would be more fragile to break out a special flow just for keywords. Signed-off-by: Jake Hunsaker <jhunsake@redhat.com>
1c6c59a
to
359aaa4
Compare
Preppers may specify their own priority in order to influence the order in | ||
which mappings are prepped. Further, Preppers have two ways to prepare | ||
the maps - either by generating a list of filenames or via directly pulling | ||
content out of select files without the assistance of a parser. A lower | ||
priority value means the prepper should run sooner than those with higher | ||
values. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nitpick comment: swap 2nd and 3rd sentence (as priority levels follows the 1st sentence, while "two ways to prepare the maps" is followe by the next paragraph.
Nice separation (and also unification of "prepping" methods like |
Another similar nitpick: reviewing the patch set and reading comments of methods here and there, I spot two typos in https://github.com/sosreport/sos/blob/main/sos/cleaner/__init__.py#L710-L713 - could they be added to the patchset or shall I fix them directly? ("Obfuscate and individual file" -> "Obfuscate an individual file", "written to a temp file without our own tmpdir" -> ".. within our own tmpdir") |
def test_mapping_method_translation(self): | ||
self.assertEqual([], self.prepper.get_items_for_map('foobar', None)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same comment like to test_parser_method_translation
.
Just the few minor issues / nitpicks found. But some testing is blocked by #3271 (well, until one uninstalls |
Since that is from an earlier commit, let's tackle that later. I have a strong feeling that any work that follows this on the path to multi-process handling of files will re-write that method anyways, so we can address it then. |
This patchset is the first that should set us on a path to providing multi-process
--clean
support in a way that is not fragile or particularly disadvantageous for single-archive cleaning.In order to facilitate multi-process down the road, we need to separate some tight bindings between parsers and mappings that exist today. The first of which, is how the mappings get prepared from archives before we do the actual obfuscation work.
This set introduces the use of
SoSPrepper
s, which will be used to pull select data out of archives and use it to prepare the parsers and mappings. This was previously done viaprep_files
dicts for archive abstractions and via additional methods in parsers and/or mappings to handle specific data from specific prep_file contents.By separating this out, we gain flexibility to allow the individual Prepper entities to grok data as needed, rather than always relying on the existing regex parsing or special-case handling in mappings. This also allows us to provide a better framework for specific use cases in the future - e.g. we could create cluster-specific Preppers that ensure that we obfuscate cluster hosts together, so that their obfuscation numbering is sequential.
The next steps will include moving compiled regex handling directly to parsers, and not binding the mappings directly to the parsers.
Related: RH: SUPDEV-135
Please place an 'X' inside each '[]' to confirm you adhere to our Contributor Guidelines