-
Notifications
You must be signed in to change notification settings - Fork 3
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Destroy MetaFile classes with fire #23
Conversation
Kills classes MetaFile, RootFile, TargetsFile, MirrorsFile, SnapshotFile, and TimestampFile. They each had an unused from_ method and a used make_ method. They were all additional, unnecessary representations of the same metadata, and it is very important that metadata formats be defined once in the reference implementation, in the schemas that are already used more broadly, in foramts.py. Replaces the classes, their methods, and some associated variables with a single short function called build_dict_conforming_to_schema that takes keyword arguments and builds a dictionary, then checks to make sure that the result conforms to the given schema. This commit shifts repository_lib from use of the old classes to the new function. In later commits, we should use this function more broadly, since it can be of use in all schema construction. Signed-off-by: Sebastien Awwad <sebastien.awwad@gmail.com>
It pertains to now-deleted metadata classes. Signed-off-by: Sebastien Awwad <sebastien.awwad@gmail.com>
Testing will now use (and test) build_dict_conforming_to_schema. Signed-off-by: Sebastien Awwad <sebastien.awwad@gmail.com>
All local upTUF tests expected to succeed do; Uptane tests continue to succeed when using this PR for their TUF dependency; the Uptane demo continues to run correctly. I have to get back to the third timeserver rotation PR -- which is pressing -- using this, so while this should be reviewed, it can be done post-merge. The sister PR in the proper TUF repository is here, needs a little bit more prodding, and will not be merged until it is finished and reviewed. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM! See inline comments for a few suggestions/questions. I should mention that, while I thoroughly reviewed tuf/formats.py
and tuf/repository_lib.py
, I only skimmed the tests.
# There are very few things that really need to be done differently. | ||
return tuf.formats.build_dict_conforming_to_schema( | ||
tuf.formats.ROOT_SCHEMA, | ||
_type='Root', # TODO: Does this have to be capitalized? -.- |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The spec suggests lower-case (see e.g. 4.3. File formats: root.json), but the reference implementation mandates a capitalized _type
(see e.g. ROOT_SCHEMA
in formats.py
).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just a thought, would it make sense to make _type
optional in build_dict_conforming_to_schema
and use the one defined in the schema object, if it isn't passed?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
On casing: the casing is at odds with the same content in theupdateframework/tuf. The modern reference implementation mandates lower-cased _type. I'm grumbling about that.
On _type being optional: I thought about something like that, but I'd like build_dict_conforming_to_schema
to remain general enough to be used for every schema construction, and code handling _type
specially feels like cheating. If the long list of function call arguments becomes an issue, then it might be wise to do this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Re casing: I did have the feeling that your # TODO
-question was rhetorical. :)
Re _type: You're right, I didn't think of other schemas. Though it might be okay to restrict this (or an additional wrapper) function to schemas that have a _type
field (i.e. {ROOT, TARGETS, SNAPSHOT, TIMESTAMP}_SCHEMA
). What do you think?
Note that the function as it currently is is already restricted to a subset of schemas, i.e. Object
-type schemas...
>>> from securesystemslib.formats import HEX_SCHEMA
>>>
>>> HEX_SCHEMA.matches("beef")
True
>>>
>>> build_dict_conforming_to_schema(HEX_SCHEMA, "beef")
# [...]
TypeError: build_dict_conforming_to_schema() takes 1 positional argument but 2 were given
>>>
>>> build_dict_conforming_to_schema(HEX_SCHEMA, dead="beef")
# [...]
securesystemslib.exceptions.FormatError: {'dead': 'beef'} did not match 'pattern /[a-fA-F0-9]+$/'
Obviously not a blocker ... especially since the PR is already merged :) ... just some thoughts...
def from_metadata(object): | ||
raise NotImplementedError | ||
for key, value in kwargs.items(): | ||
d[key] = value |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What's the reason of copying kwargs
? Why not just:
schema.check_match(kwargs)
return kwargs
If you want to cut ties to any passed references the loop won't do, as it only creates a shallow copy. So if creating a full is a requirement, you should maybe use copy.deepcopy
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hm. That is quite pretty, you're right. I think I'll go with return deepcopy(kwargs).
# if not isinstance(schema, schema.Schema): | ||
# raise ValueError( | ||
# 'The first argument must be a schema.Schema object, but is not. ' | ||
# 'Given schema: ' + repr(schema)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why do you favor the duck type check over the strict type check?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just trying to be a good Python boy, I guess. /: All I really require is that whatever the object is supports check_match, and so that is all I'm asking for. I included the alternative code because it's a principle I'm less than sure about. :) What do you think?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
AFAIK the good Python boy "doesn't ask for permission but for forgiveness":
try:
schema.check_match(kwargs)
except AttributeError:
raise ValueError("...")
@@ -684,161 +673,41 @@ def make_metadata(version, expiration_date, filedict): | |||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Any reason you did not nuke TimestampFile
and its from_metadata
and make_metadata
methods?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In the PR description you say you did.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks -- missed this. Adding in another PR.
|
||
|
||
Checks the result to make sure that it conforms to the given schema, raising | ||
an error if not. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This does not really conform with our requirements for (API) function docstrings. However, I think it is informative enough.
# not to include the delegations argument based on whether or not it is | ||
# None, consider instead adding a check in | ||
# build_dict_conforming_to_schema that skips a keyword if that keyword | ||
# is optional in the schema and the value passed in is set to None.... |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here's another alternative:
...
extra = {}
if delegations is not None:
extra = {"delegations": delegations}
return tuf.formats.build_dict_conforming_to_schema(
tuf.formats.TARGETS_SCHEMA,
_type='Targets',
version=version,
expires=expiration_date,
targets=filedict,
**extra)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hm. That's interesting. That might be a little too magic for this reference implementation. I do like it, though. Let's see if the need comes up more often.
|
||
# TODO: Later on, write a test looper that takes pairs of key-value args | ||
# to substitute in on each run to shorten this.... There's a lot of | ||
# test code that looks like this, and it'd be easier to use a looper. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agreed! :)
I recently started to do a lot more table-driven testing (see in-toto/test_gpg.py for an example).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's good stuff. I have some stuff like that in Uptane, in test_secondary.py for example. Can't remember where else.
Kills classes MetaFile, RootFile, TargetsFile, MirrorsFile, SnapshotFile, and TimestampFile.
They each had an unused from_ method and a used make_ method. They were all additional, unnecessary representations of the same metadata, and it is very important that metadata formats be defined once in the reference implementation, in the schemas that are already used more broadly, in formats.py.
This PR replaces the classes, their methods, and some associated variables with a single short function called build_dict_conforming_to_schema that takes keyword arguments and builds a dictionary, then checks to make sure that the result conforms to the given schema.
This commit shifts repository_lib from use of the old classes to the new function.
In later PRs, we should use this function more broadly, since it can be of use in all schema construction.