Destroy MetaFile classes with fire #23

Merged
awwad merged 3 commits into develop on Mar 18, 2019

Conversation

@awwad (Owner) commented Mar 15, 2019

Kills classes MetaFile, RootFile, TargetsFile, MirrorsFile, SnapshotFile, and TimestampFile.

They each had an unused from_ method and a make_ method that was used. They were all additional, unnecessary representations of the same metadata, and it is very important that metadata formats be defined exactly once in the reference implementation: in the schemas in formats.py that are already used more broadly.

This PR replaces the classes, their methods, and some associated variables with a single short function called build_dict_conforming_to_schema that takes keyword arguments and builds a dictionary, then checks to make sure that the result conforms to the given schema.
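
In outline, the new helper behaves roughly like this minimal sketch (the actual code in formats.py includes a docstring and may differ in detail; check_match is the securesystemslib schema method, which raises securesystemslib.exceptions.FormatError if the result does not conform):

def build_dict_conforming_to_schema(schema, **kwargs):
  # Build a new dictionary from the keyword arguments (a shallow copy of them).
  d = {}
  for key, value in kwargs.items():
    d[key] = value

  # Validate the assembled dictionary against the given schema; this raises
  # a FormatError if the dictionary does not conform.
  schema.check_match(d)

  return d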

This PR also shifts repository_lib from use of the old classes to the new function.

In later PRs, we should use this function more broadly, since it can be of use in all schema construction.

Kills classes MetaFile, RootFile, TargetsFile, MirrorsFile,
SnapshotFile, and TimestampFile.  They each had an unused
from_ method and a used make_ method.  They were all additional,
unnecessary representations of the same metadata, and it is
very important that metadata formats be defined once in the
reference implementation, in the schemas that are already used
more broadly, in formats.py.

Replaces the classes, their methods, and some associated variables
with a single short function called build_dict_conforming_to_schema
that takes keyword arguments and builds a dictionary, then checks
to make sure that the result conforms to the given schema.

This commit shifts repository_lib from use of the old classes to
the new function.

In later commits, we should use this function more broadly, since it
can be of use in all schema construction.

Signed-off-by: Sebastien Awwad <sebastien.awwad@gmail.com>
It pertains to now-deleted metadata classes.

Signed-off-by: Sebastien Awwad <sebastien.awwad@gmail.com>
Testing will now use (and test) build_dict_conforming_to_schema.

Signed-off-by: Sebastien Awwad <sebastien.awwad@gmail.com>
@awwad requested a review from lukpueh on Mar 18, 2019, 18:01

@awwad (Owner, Author) commented Mar 18, 2019

All local upTUF tests that are expected to succeed do succeed; the Uptane tests continue to pass when this PR is used for their TUF dependency; and the Uptane demo continues to run correctly.

I have to get back to the third timeserver rotation PR, which is pressing and which uses this, so while this should still be reviewed, the review can happen post-merge.

The sister PR in the proper TUF repository is here; it needs a bit more prodding and will not be merged until it is finished and reviewed.

@awwad merged commit 90ee24a into develop on Mar 18, 2019

@lukpueh (Collaborator) left a comment


LGTM! See inline comments for a few suggestions/questions. I should mention that, while I thoroughly reviewed tuf/formats.py and tuf/repository_lib.py, I only skimmed the tests.

# There are very few things that really need to be done differently.
return tuf.formats.build_dict_conforming_to_schema(
    tuf.formats.ROOT_SCHEMA,
    _type='Root', # TODO: Does this have to be capitalized? -.-

@lukpueh (Collaborator) commented:

The spec suggests lower-case (see e.g. 4.3. File formats: root.json), but the reference implementation mandates a capitalized _type (see e.g. ROOT_SCHEMA in formats.py).

@lukpueh (Collaborator) commented:

Just a thought, would it make sense to make _type optional in build_dict_conforming_to_schema and use the one defined in the schema object, if it isn't passed?

@awwad (Owner, Author) commented:

On casing: the casing is at odds with the same content in theupdateframework/tuf. The modern reference implementation mandates lower-cased _type. I'm grumbling about that.

On _type being optional: I thought about something like that, but I'd like build_dict_conforming_to_schema to remain general enough to be used for every schema construction, and code handling _type specially feels like cheating. If the long list of function call arguments becomes an issue, then it might be wise to do this.

@lukpueh (Collaborator) commented:

Re casing: I did have the feeling that your # TODO-question was rhetorical. :)

Re _type: You're right, I didn't think of other schemas. Though it might be okay to restrict this (or an additional wrapper) function to schemas that have a _type field (i.e. {ROOT, TARGETS, SNAPSHOT, TIMESTAMP}_SCHEMA). What do you think?

Note that the function, as it currently stands, is already restricted to a subset of schemas, i.e. Object-type schemas...

>>> from securesystemslib.formats import HEX_SCHEMA
>>> 
>>> HEX_SCHEMA.matches("beef")
True
>>>
>>> build_dict_conforming_to_schema(HEX_SCHEMA, "beef")
# [...]
TypeError: build_dict_conforming_to_schema() takes 1 positional argument but 2 were given
>>>
>>> build_dict_conforming_to_schema(HEX_SCHEMA, dead="beef")
# [...]
securesystemslib.exceptions.FormatError: {'dead': 'beef'} did not match 'pattern /[a-fA-F0-9]+$/'

Obviously not a blocker ... especially since the PR is already merged :) ... just some thoughts...
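
(For concreteness, a wrapper restricted to those four top-level schemas could look something like the sketch below. This is purely illustrative; the names _TYPE_FOR_SCHEMA and build_top_level_dict are invented here and nothing of the sort was added in this PR.)

import tuf.formats

# Map each top-level schema to the _type string this implementation expects.
_TYPE_FOR_SCHEMA = {
    id(tuf.formats.ROOT_SCHEMA): 'Root',
    id(tuf.formats.TARGETS_SCHEMA): 'Targets',
    id(tuf.formats.SNAPSHOT_SCHEMA): 'Snapshot',
    id(tuf.formats.TIMESTAMP_SCHEMA): 'Timestamp'}

def build_top_level_dict(schema, **kwargs):
  # Fill in _type from the mapping if the caller did not pass it, then defer
  # to the general-purpose helper for construction and validation.
  kwargs.setdefault('_type', _TYPE_FOR_SCHEMA[id(schema)])
  return tuf.formats.build_dict_conforming_to_schema(schema, **kwargs)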

def from_metadata(object):
  raise NotImplementedError

for key, value in kwargs.items():
  d[key] = value

@lukpueh (Collaborator) commented:

What's the reason for copying kwargs? Why not just:

schema.check_match(kwargs)
return kwargs

If you want to cut ties to any passed references, the loop won't do, as it only creates a shallow copy. So if creating a full copy is a requirement, you should maybe use copy.deepcopy?

@awwad (Owner, Author) commented:

Hm. That is quite pretty, you're right. I think I'll go with return deepcopy(kwargs).
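
(That is, the body would become roughly the following sketch:)

import copy

def build_dict_conforming_to_schema(schema, **kwargs):
  # Validate the keyword arguments as given, then hand back a deep copy so
  # the returned dict shares no references with the caller's arguments.
  schema.check_match(kwargs)
  return copy.deepcopy(kwargs)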

# if not isinstance(schema, schema.Schema):
#   raise ValueError(
#       'The first argument must be a schema.Schema object, but is not. '
#       'Given schema: ' + repr(schema))

@lukpueh (Collaborator) commented:

Why do you favor the duck type check over the strict type check?

@awwad (Owner, Author) commented:

Just trying to be a good Python boy, I guess. /: All I really require is that the object, whatever it is, supports check_match, so that is all I'm asking for. I included the alternative code because it's a principle I'm less than sure about. :) What do you think?

@lukpueh (Collaborator) commented:

AFAIK the good Python boy "doesn't ask for permission but for forgiveness":

try:
  schema.check_match(kwargs)

except AttributeError:
  raise ValueError("...")

@@ -684,161 +673,41 @@ def make_metadata(version, expiration_date, filedict):

@lukpueh (Collaborator) commented:

Any reason you did not nuke TimestampFile and its from_metadata and make_metadata methods?

@lukpueh (Collaborator) commented:

In the PR description you say you did.

@awwad (Owner, Author) commented:

Thanks -- missed this. Adding in another PR.



Checks the result to make sure that it conforms to the given schema, raising
an error if not.

@lukpueh (Collaborator) commented:

This does not really conform with our requirements for (API) function docstrings. However, I think it is informative enough.

# not to include the delegations argument based on whether or not it is
# None, consider instead adding a check in
# build_dict_conforming_to_schema that skips a keyword if that keyword
# is optional in the schema and the value passed in is set to None....

@lukpueh (Collaborator) commented:

Here's another alternative:

...
extra = {}
if delegations is not None:
  extra = {"delegations": delegations}

return tuf.formats.build_dict_conforming_to_schema(
    tuf.formats.TARGETS_SCHEMA,
    _type='Targets',
    version=version,
    expires=expiration_date,
    targets=filedict,
    **extra)

@awwad (Owner, Author) commented:

Hm. That's interesting. That might be a little too magic for this reference implementation. I do like it, though. Let's see if the need comes up more often.
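
(A simplified sketch of the None-skipping idea from the TODO above: drop any keyword whose value is None before validating. The TODO's extra condition, that the keyword also be optional in the schema, would require schema introspection and is left out here.)

def build_dict_conforming_to_schema(schema, **kwargs):
  # Drop keywords whose value is None so that optional fields such as
  # delegations can be passed through unconditionally by callers.
  d = {key: value for key, value in kwargs.items() if value is not None}
  schema.check_match(d)
  return d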


# TODO: Later on, write a test looper that takes pairs of key-value args
# to substitute in on each run to shorten this.... There's a lot of
# test code that looks like this, and it'd be easier to use a looper.

@lukpueh (Collaborator) commented:

Agreed! :)
I recently started to do a lot more table-driven testing (see in-toto/test_gpg.py for an example).

@awwad (Owner, Author) commented:

It's good stuff. I have some stuff like that in Uptane, in test_secondary.py for example. Can't remember where else.
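
(A rough illustration of the table-driven pattern being discussed, using hypothetical test rows rather than anything taken from the actual test suites:)

import unittest

import securesystemslib.exceptions
import tuf.formats

class TestBuildDictConformingToSchema(unittest.TestCase):

  def test_rejects_nonconforming_arguments(self):
    # Each row pairs a schema with keyword arguments that should fail it.
    bad_cases = [
        (tuf.formats.ROOT_SCHEMA, {'_type': 5}),
        (tuf.formats.TIMESTAMP_SCHEMA, {'version': 'not-an-integer'})]

    for schema, kwargs in bad_cases:
      with self.assertRaises(securesystemslib.exceptions.FormatError):
        tuf.formats.build_dict_conforming_to_schema(schema, **kwargs)

if __name__ == '__main__':
  unittest.main()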

awwad added a commit that referenced this pull request Mar 28, 2019
missed from the prior PR (github.com//pull/23)

Thanks for spotting this, @lukpueh

Signed-off-by: Sebastien Awwad <sebastien.awwad@gmail.com>