Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Destroy unnecessary MetaFile classes in formats.py #836

Merged
merged 15 commits into from
Apr 3, 2019

Conversation

awwad
Copy link
Contributor

@awwad awwad commented Mar 15, 2019

This PR is related to two efforts:

The primary effect is to replace six classes and their associated methods and some related external functions with one, simpler, general purpose function in formats.py, build_dict_conforming_to_schema. These functions are involved in the production of metadata.

This PR also:

  • corrects a mistake in TIMESTAMP_SCHEMA, the exclusion of spec_version (since writing timestamp metadata previously did not properly respect this schema, written metadata still included spec_version)
  • Adds a notion of the version of the TUF specification that this reference implementation intends and is expected to support -- tuf.SPECIFICATION_VERSION

Details

Kills classes MetaFile, RootFile, TargetsFile, MirrorsFile, SnapshotFile, and TimestampFile. They each had an unused from_ method and a used make_ method. They were all additional, unnecessary representations of the same metadata, and it is very important that metadata formats be defined once in the reference implementation, in the schemas that are already used more broadly, in formats.py.

Replaces the classes, their methods, some related external functions, and some associated variables with a single short function called build_dict_conforming_to_schema that takes keyword arguments and builds a dictionary, then checks to make sure that the result conforms to the given schema.

This commit shifts repository_lib from use of the old classes to the new function.

In later commits, we should use this function more broadly, since it can be of use in all schema construction.

There are several TODOs added to the code, mostly for post-#660 tasks.

  • The code follows the Code Style Guidelines
  • Tests have been added for the bug fix or new feature
  • Docs have been added for the bug fix or new feature

@awwad awwad force-pushed the destroy_unnecessary_classes_in_formats branch 3 times, most recently from 8c0338a to 1b994fd Compare March 28, 2019 23:14
Kills classes MetaFile, RootFile, TargetsFile, MirrorsFile,
SnapshotFile, and TimestampFile.  They each had an unused
from_ method and a used make_ method.  They were all additional,
unnecessary representations of the same metadata, and it is
very important that metadata formats be defined once in the
reference implementation, in the schemas that are already used
more broadly, in foramts.py.

Replaces the classes, their methods, and some associated variables
with a single short function called build_dict_conforming_to_schema
that takes keyword arguments and builds a dictionary, then checks
to make sure that the result conforms to the given schema.

This commit shifts repository_lib from use of the old classes to
the new function.

In later commits, we should use this function more broadly, since it
can be of use in all schema construction.

There are several TODOs added to the code, mostly for post-#660
tasks.

Signed-off-by: Sebastien Awwad <sebastien.awwad@gmail.com>
Add a tuf-level variable in tuf/__init__.py indicating the version
of the TUF specification that the code in this repository is
intended and expected to conform to.

This will be used when writing metadata.

Signed-off-by: Sebastien Awwad <sebastien.awwad@gmail.com>
It was excluded from the Timestamp schema definition in error.

In the process of making metadata writing use the Timestamp schema
strictly, this bug was discovered.  Metadata previously written
included specification version, but the schema check did not.

Signed-off-by: Sebastien Awwad <sebastien.awwad@gmail.com>
It pertains to now-deleted metadata classes.

Signed-off-by: Sebastien Awwad <sebastien.awwad@gmail.com>
Testing will now use (and test) build_dict_conforming_to_schema.

Signed-off-by: Sebastien Awwad <sebastien.awwad@gmail.com>
@awwad awwad force-pushed the destroy_unnecessary_classes_in_formats branch from 1b994fd to e7c8229 Compare March 29, 2019 15:39
@awwad
Copy link
Contributor Author

awwad commented Mar 29, 2019

Rebased after #844

@awwad awwad requested a review from lukpueh March 29, 2019 16:55
@awwad
Copy link
Contributor Author

awwad commented Mar 29, 2019

See sister PR and review comments here in Uptane's TUF fork.

tuf.formats.make_role_metadata concerned itself with exclusivity checks
for paths and path_hash_prefixes, but no code actually used it for
relevant data.  It's yet another custom metadata writer replaced by
build_dict_conforming_to_schema.

Signed-off-by: Sebastien Awwad <sebastien.awwad@gmail.com>
Make a copy of the provided fields so that the caller's provided values
do not change when the returned values are changed.

Signed-off-by: Sebastien Awwad <sebastien.awwad@gmail.com>
in tuf.formats.build_dict_conforming_to_schema

Populate _type with the expected value for the given schema, and
populate spec_version with tuf.SPECIFICATION_VERSION.  Do this only
when the values are not provided, and support overriding them.

Also adds testing for the above and takes advantage of the above
in repository_lib's _generate metadata functions.

Signed-off-by: Sebastien Awwad <sebastien.awwad@gmail.com>
@awwad awwad force-pushed the destroy_unnecessary_classes_in_formats branch from bf84ad5 to 7ecf522 Compare March 29, 2019 19:23
since make_role_metadata is being replaced by
build_dict_conforming_to_schema

Signed-off-by: Sebastien Awwad <sebastien.awwad@gmail.com>
@awwad
Copy link
Contributor Author

awwad commented Mar 29, 2019

I think I've addressed your comments on the sister (upTUF) PR, @lukpueh. I've also expanded this PR just a little bit. Should be a quick re-review, I hope. :)

Signed-off-by: Sebastien Awwad <sebastien.awwad@gmail.com>
tuf/formats.py Outdated
@@ -24,8 +24,7 @@
module should be read and understood before tackling this module.

'formats.py' can be broken down into three sections. (1) Schemas and object
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

s/three/two/

tuf/formats.py Outdated
tuf.formats.TARGETS_SCHEMA.check_match(result)
else:
# If the field has not been provided and is required, check to see if
# the field is one of the one of the fields we automatically fill.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

s/one of the one of the/one of the/

tuf/formats.py Outdated
#
for schema_element in schema._required: #pylint: disable=protected-access
key = schema_element[0]
element_type = schema_element[1]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why check above if the schema has a check_match method and here just assume that it has a _required property, which itself and whose elements are subscriptable?

I recommend to get rid of above hasattr-check and instead use the commented out isinstance-check but with a schema.Object which should be guaranteed to have a _required-list and, as a child of schema.Schema, also a check_match-method.

Also, consider my comment in awwad#23, about how this function only makes sense for Object-type schemas.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Btw. you can unpack the _required-tuples in the loop definition, which imho looks nicer:

for key, element_type in schema._required:
  ...

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  1. Re. the additional typing requirements: I forgot to edit the type check after also requiring _required. You're right: I should switch to a type check now.

  2. I'll add a comment noting that this is intended for Object-type schemas (which are for the most part the ones for which it's of interest).

  3. On the 2-tuples: quite right, of course -- dunno why I didn't unpack that way. Fixing.

elif (key == 'spec_version' and
element_type == SPECIFICATION_VERSION_SCHEMA):
# If not provided, use the specification version in tuf/__init__.py
dictionary[key] = tuf.SPECIFICATION_VERSION
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I know, I suggested this, but now that I look at it I think it's pretty specific, at least for the function name build_dict_conforming_to_schema. Do you think there is a better name? If not, disregard this comment.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The function is still intended to be a general purpose (Object-type) schema builder, so I would rather remove this convenience functionality than change that scope. There may be a better name, but I don't want to narrow its purpose to Role metadata construction, if that's what you mean.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's fine as it is.

tuf.formats.TARGETS_SCHEMA,
version=version,
expires=expiration_date,
targets=filedict)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So I couldn't sell the kwargs alternative to you? :)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Whatcha mean? The extras bit (e.g. delegations) in the calls below this one?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The function uses kwargs and returns deepcopy(kwargs) (after some modifications).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Gonna merge and use this in some other work, now that the rest has been addressed, but if I missed something, please LMK. I'll keep the extras-style optional arguments construction in mind if there are other places calls like this start to need to happen.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh yes, I was referring to passing something like an extra_kwargs that may or may not contain the delegations (sorry for being unclear).

@@ -356,6 +348,7 @@
TIMESTAMP_SCHEMA = SCHEMA.Object(
object_name = 'TIMESTAMP_SCHEMA',
_type = SCHEMA.String('timestamp'),
spec_version = SPECIFICATION_VERSION_SCHEMA,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this require new tests?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This in itself, no. The spec version was simply missing in the schema for Timestamp. It was already being written in metadata despite the fact that it was missing in the schema, because this schema was not really used to build the metadata before. Now the schema is used to build metadata.

I want to add more validation that uses these schemas in updater, and to add some additional testing for formats.py related to the schemas more generally, but that's not really related to this change, and it'll have to wait.

Signed-off-by: Sebastien Awwad <sebastien.awwad@gmail.com>
Edits prompted by Lukas's review.

Signed-off-by: Sebastien Awwad <sebastien.awwad@gmail.com>
- Expand on the arguments/kwargs
- Note that the function is particular to schema.Object objects.

Signed-off-by: Sebastien Awwad <sebastien.awwad@gmail.com>
May the Python gods forgive me for having three different
names with the same names and different casing on one line....

Perhaps I should address the import module as....

Signed-off-by: Sebastien Awwad <sebastien.awwad@gmail.com>
@awwad
Copy link
Contributor Author

awwad commented Apr 2, 2019

Thanks for the review, @lukpueh!

Copy link
Member

@lukpueh lukpueh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for addressing my comments. I think you want to do an isinstance-check on SCHEMA.Object in build_dict_conforming_to_schema. Otherwise it looks good.

tuf/formats.py Outdated
filedict = targets_metadata.get('targets')
delegations = targets_metadata.get('delegations')
# Check the schema argument type (must provide check_match and _required).
if not isinstance(schema, SCHEMA.Schema):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

s/Schema/Object/

tuf/formats.py Outdated
# Check the schema argument type (must provide check_match and _required).
if not isinstance(schema, SCHEMA.Schema):
raise ValueError(
'The first argument must be a schema.Schema object, but is not. '
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

s/Schema/Object/

elif (key == 'spec_version' and
element_type == SPECIFICATION_VERSION_SCHEMA):
# If not provided, use the specification version in tuf/__init__.py
dictionary[key] = tuf.SPECIFICATION_VERSION
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's fine as it is.

tuf.formats.TARGETS_SCHEMA,
version=version,
expires=expiration_date,
targets=filedict)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh yes, I was referring to passing something like an extra_kwargs that may or may not contain the delegations (sorry for being unclear).

Raise an error if it's not a schema.Object instance (not just
if it's not a schema.Schema instance).

Also adds a test for this.

Signed-off-by: Sebastien Awwad <sebastien.awwad@gmail.com>
@awwad awwad merged commit 254e0c4 into develop Apr 3, 2019
@awwad awwad deleted the destroy_unnecessary_classes_in_formats branch May 2, 2019 16:52
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants