
ngclient: top-level-roles update tests #1636

Merged
merged 5 commits into from
Nov 10, 2021

Conversation

sechkova
Contributor

Addresses: #1606

Description of the changes being introduced by the pull request:
Tries to address the problem of writing meaningful tests for ngclient/updater

  • Adds tests for top-level metadata update following the Detailed client workflow
  • Adds one experimental test case for loading metadata.

Trying to answer some pending questions:

  • I like keeping test cases simple and short, even though a sequence of refresh -> modify repo -> new refresh can be tempting to combine into one test. This is sort of a personal preference; I'm glad to hear other opinions.
  • For this subset of tests I think viewing the updater as a black box and checking whether files were written to disk is enough. Checking the contents of the files seems redundant: we serve exactly the same metadata we've generated from the simulator, so checking whether we've downloaded the correct file version should be enough.
  • Testing whether metadata was loaded from cache requires inspecting the internals of Updater or alternatively "spying" the repository's fetch calls.
  • Testing if expired metadata from the local cache is loaded and used to verify the new version requires peeking as deep as
    Updater._trusted_metadata_set["timestamp"] is not None. I don't find that a good fit for these tests; it seems like the TrustedMetadataSet tests should accommodate it.

Missing test cases from the spec:

  • Fast forward attack recovery
  • Tests with consistent snapshot disabled

Please verify and check that the pull request fulfills the following requirements:

  • The code follows the Code Style Guidelines
  • Tests have been added for the bug fix or new feature
  • Docs have been added for the bug fix or new feature

@coveralls

coveralls commented Oct 26, 2021

Pull Request Test Coverage Report for Build 1439879957

Warning: This coverage report may be inaccurate.

This pull request's base commit is no longer the HEAD commit of its target branch. This means it includes changes from outside the original pull request, including, potentially, unrelated coverage changes.

Details

  • 0 of 0 changed or added relevant lines in 0 files are covered.
  • No unchanged relevant lines lost coverage.
  • Overall coverage increased (+1.1%) to 98.571%

Totals Coverage Status
Change from base Build 1426058326: 1.1%
Covered Lines: 3116
Relevant Lines: 3129

💛 - Coveralls

Member

@jku jku left a comment

I really like how much cleaner most of this is than either the legacy tests or even some of the comparable trustedmetadataset tests: e.g. test_new_snapshot_version_rollback() is directly readable -- something I can't say of any rollback tests elsewhere.

The two main issues I have are:

  • overall niggles about download count test
  • most tests only check raised errors: file existence on disk is not checked, and the versions or contents of the files are not checked. I'd like to see at least explanations of why these are not needed -- especially for the error cases I think we really should be checking that local files are what we expect

Both of these could be solved as followup issues: the tests are certainly already useful.

Comment on lines 205 to 207
self.sim.root.expires = datetime.utcnow().replace(
microsecond=0
) + timedelta(days=5)
Member

I added RepositorySimulator.safe_expiry for this purpose -- it might be in the wrong place but maybe worth using... you could define a similar one for the expired case to avoid the ugly date math in multiple tests
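A pair of helpers along these lines could hide the date math. This is only a sketch: `safe_expiry` mirrors the existing RepositorySimulator attribute, while the expired counterpart and its name are hypothetical.

```python
from datetime import datetime, timedelta

def safe_expiry() -> datetime:
    # An expiry comfortably in the future, in the spirit of
    # RepositorySimulator.safe_expiry.
    return datetime.utcnow().replace(microsecond=0) + timedelta(days=30)

def past_expiry() -> datetime:
    # Hypothetical counterpart for the expired-metadata test cases,
    # replacing the "utcnow() - timedelta(days=5)" math in each test.
    return datetime.utcnow().replace(microsecond=0) - timedelta(days=5)
```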

Comment on lines 85 to 90
metadata_files_after_refresh = os.listdir(self.metadata_dir)
metadata_files_after_refresh.sort()
self.assertListEqual(
metadata_files_after_refresh,
["root.json", "snapshot.json", "targets.json", "timestamp.json"],
)
Member

these checks are currently done in this one test only but I wonder if it's worth making this file list check more ergonomic and a one-liner (see _assert_files() in test_updater_ng.py, although yours is better in that it uses assertListEqual) and then do this check in more tests -- I'd really like to be sure our local metadata cache contains what we expect in all situations, especially the failure situations...
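One possible shape for the one-liner (helper name and signature are a sketch, not the eventual implementation):

```python
import os
import unittest

class ExampleTest(unittest.TestCase):
    metadata_dir: str  # set up in setUp() in the real test class

    def _assert_files_exist(self, roles: list) -> None:
        # The local metadata cache should contain exactly "<role>.json"
        # for each expected role and nothing else.
        expected = sorted(f"{role}.json" for role in roles)
        self.assertListEqual(sorted(os.listdir(self.metadata_dir)), expected)
```

The check above then collapses to `self._assert_files_exist(["root", "snapshot", "targets", "timestamp"])`.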

Contributor Author

I added some convenience methods and a lot of additional checks with 5394c47.

args, kwargs = wrapped_download_metadata.call_args_list[1]
self.assertIn("timestamp", args)

def test_trusted_root_os_error(self):
Member

maybe

Suggested change
def test_trusted_root_os_error(self):
def test_trusted_root_missing(self):

Comment on lines 123 to 130
root_path = os.path.join(self.metadata_dir, "root.json")
md_root = Metadata.from_file(root_path)
md_root.signed.expires = datetime.utcnow().replace(
microsecond=0
) - timedelta(days=5)
for signer in self.sim.signers["root"]:
md_root.sign(signer)
md_root.to_file(root_path)
Member

I dislike the pattern where local metadata is modified (unless that's the only way to test something, like test_trusted_root_unsigned()). I think in this case you could instead

  • use simulator to create an expired root,
  • refresh (expecting failure)
  • check that current local root is now the new expired root
  • initialize a new updater: this should succeed since local root is allowed to be expired

Comment on lines 100 to 114
with mock.patch.object(
updater, "_download_metadata", wraps=updater._download_metadata
) as wrapped_download_metadata:
updater.refresh()

self.assertEqual(wrapped_download_metadata.call_count, 2)
for call in wrapped_download_metadata.call_args_list:
args, kwargs = call
self.assertNotIn("snapshot", args)
self.assertNotIn("targets", args)

args, kwargs = wrapped_download_metadata.call_args_list[0]
self.assertIn("root", args)
args, kwargs = wrapped_download_metadata.call_args_list[1]
self.assertIn("timestamp", args)
Member

not a huge fan of hooking into updater internal methods... I think implementing download counters in RepositorySimulator would be more reliable and readable.

I'd also like to see this done for more than one single call to refresh() -- it seems like this only tests a single special case when metadata has not been updated in remote: why not verify that the first refresh downloads what you expect as well?

We could leave this test out of this PR, and file a new issue for download/file open tests. If you'd like to include this already, I'd at least ask you to have another look at the call_args_list parsing: it feels vastly more complex than it should be -- I think you should be able to verify the exact argument for two calls instead of doing four separate assert*In() calls.
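A download counter inside the simulator could look roughly like this. Sketch only: the class below is a minimal stand-in for the counting concern, not the real RepositorySimulator, and the attribute name `fetch_tracker` is an assumption.

```python
from collections import Counter

class CountingSimulator:
    """Minimal stand-in: the real RepositorySimulator would increment
    the counter inside its actual metadata-serving method."""

    def __init__(self):
        self.fetch_tracker = Counter()
        self._metadata = {"root": b"...", "timestamp": b"..."}

    def fetch_metadata(self, role: str, version=None) -> bytes:
        # Count every metadata request the client makes, per role.
        self.fetch_tracker[role] += 1
        return self._metadata[role]
```

A test could then assert on `sim.fetch_tracker` after each refresh() call, without hooking into Updater internals at all.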

Contributor Author

I removed the entire commit. I'll open a follow up issue about testing whether cached metadata is loaded and verified as expected.

I did the weird parsing since the full call list was something like call("root", 123242, 2) but I agree it is ugly.

# intermediate files were downloaded.

# Create some big number of root files in the repository
highest_repo_root_version = UpdaterConfig.max_root_rotations + 10
Member

@jku jku Oct 26, 2021

haha I love this -- why not create 42 different root versions if it means you don't have to modify updater config from default!

Contributor Author

oh well ... in order to test the default config :))
I updated the test with randomly chosen smaller values.

Comment on lines 180 to 191
root_signers = self.sim.signers["root"]
self.sim.signers["root"].clear()
Member

I think this does not do what you expect: root_signers is also empty after this

Contributor Author

oh 🤦

with self.assertRaises(ExpiredMetadataError):
self._run_refresh()

def test_new_targets_hash_mismatch(self):
Member

this test is the only one I don't quite understand... maybe we can catch up on this in chat

Contributor Author

as we already discussed, I've assumed incorrectly what compute_metafile_hashes_length does and in the same time its implementation is buggy. I opened #1651 and updated the tests.

def tearDown(self):
self.temp_dir.cleanup()

def _run_refresh(self) -> Updater:
Collaborator

Can you add doc to this function? That was a pylint warning I had to address for all functions besides the ones with names such as test_*, setUp* and tearDown*.
For this function in test_updater_with_simulator.py I had added """Creates a new updater and runs refresh.""". You decide if you like it or not.

updater.refresh()
return updater

def _init_updater(self) -> Updater:
Collaborator

Can you add doc to this function? That was a pylint warning I had to address for all functions besides the ones with names such as test_*, setUp* and tearDown*.

@jku
Member

jku commented Oct 28, 2021

Testing whether metadata was loaded from cache requires inspecting the internals of Updater or alternatively "spying" the repository's fetch calls.

I think we agree on the possible checks but I'll just reiterate so there's no confusion:

  1. checking that expected metadata is read from cache
    • can be done by hooking into updater internals...
    • but I think wrapping open() would work just as well
  2. checking that the expected files are downloaded from remote
    • can be done by hooking into updater internals...
    • but adding some download counters into RepositorySimulator could be cleaner and more maintainable
  3. checking if cached metadata is used to verify new metadata downloaded from remote
    • trickier but I suppose we can rely on negative tests here: tests that are expected to fail on that verification
  4. checking that correct metadata is in cache after successful or failed refresh()
    • this seems essential to me
    • in practice could mean just checking the version number but comparing the bytes might be as easy
  5. checking refresh() return value

There's no need to do all of these for all tests : 1 and 2 could be checked by just some tests, 3 might not need explicit checks but I think 4 is something we should check almost as extensively as 5
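For point 1, the open() wrapping could be as simple as the sketch below. It assumes the code under test reads its cache with the builtin open(); the helper name is hypothetical.

```python
import os
from unittest import mock

def spy_json_reads(func):
    """Run func() while recording which *.json files it opens.
    Sketch: relies on the code under test using the builtin open()."""
    real_open = open  # capture before patching
    opened = []

    def spying_open(file, *args, **kwargs):
        if str(file).endswith(".json"):
            opened.append(os.path.basename(str(file)))
        return real_open(file, *args, **kwargs)

    with mock.patch("builtins.open", side_effect=spying_open):
        func()
    return opened
```

A test could then assert that, say, only root.json and timestamp.json were read from cache during a given refresh.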

@sechkova
Contributor Author

After iterating over the tests again, I added much more checks related to expected contents on disk after refresh.
I will open follow up issues about expected cached/downloaded metadata, etc.

@sechkova sechkova requested a review from jku October 29, 2021 14:29

def _assert_content_equals(self, role: str, version: Optional[int]=None) -> None:
"""Assert that local file content is the expected"""
expected_content = self.sim._fetch_metadata(role, version)
Contributor Author

I've abused using the internal _fetch_metadata. Should we make it public? Much more convenient when you want to get the repository content in bytes than download_bytes(url, size ...).

Member

@jku jku Nov 2, 2021

Agreed we should somehow make that ergonomic... Your proposal sounds ok to me.

The only alternative I can think of is adding a RepositorySimulator.assert_metadata(directory: str, roles: List[str]) which asserts that directory contains metadata matching current repository state for roles -- possibly there's a good default value for roles as well.

You probably would not need _assert_files_exist() anymore as the new function could do that at the same time.

But I guess then you'd need to handle the failing cases somehow differently... Your call, could leave this as enhancement.

Member

@jku jku left a comment

Looks quite nice to me. Left a few smaller comments.

I think the assert methods could still be improved... but maybe as a follow up? Just to document what it looks like to me, it seems that there are three separate assert types:

  • _assert_files_exist
  • _assert_content_equals
  • _assert_version_equals (this function does not exist but same four lines are repeated a lot). This is mostly used in error cases where client version should not match repository version

The issue I have with this is that it's hard to easily see if a specific test is using the correct asserts or not... Anyway i can see it's not trivial to solve so we can leave as follow up task
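The third helper could fold the repeated four lines into something like the following. Sketch only: the real tests would use Metadata.from_file, but plain json keeps the example self-contained, and the function name is hypothetical.

```python
import json
import os

def assert_version_equals(metadata_dir: str, role: str, expected: int) -> None:
    # Load the cached metadata file and compare its signed version
    # against what the repository is expected to have served.
    with open(os.path.join(metadata_dir, f"{role}.json")) as f:
        version = json.load(f)["signed"]["version"]
    assert version == expected, f"{role} version {version} != {expected}"
```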

Comment on lines 54 to 60
updater = Updater(
self.metadata_dir,
"https://example.com/metadata/",
"https://example.com/targets/",
fetcher=self.sim,
)
Member

I should have made more noise about the recent constructor API change (sorry): please add the target dir argument to this call

Comment on lines 126 to 127
# The expiration of the trusted root metadata file does not lead
# to failure and Updater in successfully initialized.
Member

something wrong with sentence end here but maybe this could be shorter as well:

Suggested change
# The expiration of the trusted root metadata file does not lead
# to failure and Updater in successfully initialized.
# Local root metadata can be loaded even if expired

in general I really appreciate it if we manage to keep comments in tests to single lines -- makes it easier to read

md_root = Metadata.from_file(root_path)
initial_root_version = md_root.signed.version

updater.refresh()
Member

is there really no error here? I would expect an error. can you check legacy updater and possibly file an issue for ngclient?

Contributor Author

Surprisingly (for me too), this is the correct behaviour. The spec treats this case the same way as if the next root version was not found. The legacy code does the same thing.

If this file is not available, or we have downloaded more than Y number of root metadata files (because the exact number is as yet unknown), then go to step 5.3.10.

Member

that seems like a very weird choice... thanks for checking

Comment on lines +336 to +337
# Hash mismatch error
with self.assertRaises(RepositoryError):
self._run_refresh()
Member

I assume the real error is something more specific?

Contributor Author

For reasons that I cannot remember now, LengthOrHashMismatchError does not inherit from RepositoryError but the latter is raised from it:

except exceptions.LengthOrHashMismatchError as e:

As a result LengthOrHashMismatchError is not raised here. Should we reconsider this usage?

Member

Thanks. Hmm, maybe we should reconsider that, but this PR seems fine
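For reference, the pattern under discussion is plain exception chaining. The exception classes below are reduced stand-ins for the real tuf/securesystemslib types, and the function is a sketch of the shape of the check, not the actual ngclient code.

```python
class LengthOrHashMismatchError(Exception):
    """Stand-in for the securesystemslib exception."""

class RepositoryError(Exception):
    """Stand-in for the ngclient RepositoryError."""

def check_length(data: bytes, expected_length: int) -> None:
    # The low-level check raises LengthOrHashMismatchError; the client
    # catches it and re-raises RepositoryError chained with "from",
    # which is why tests observe RepositoryError rather than the inner
    # type, even though the inner error is preserved as __cause__.
    try:
        if len(data) != expected_length:
            raise LengthOrHashMismatchError(
                f"expected {expected_length} bytes, got {len(data)}"
            )
    except LengthOrHashMismatchError as e:
        raise RepositoryError("length verification failed") from e
```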

Comment on lines 358 to 361
# TODO: RepositorySimulator works always with consistent snapshot
# enabled which forces the client to look for the snapshot version
# written in timestamp (which leads to "Unknown snapshot version").
# This fails the test for a snapshot version mismatch.
Member

An option is to change the simulator so it does always return metadata, even if the requested version does not match the actual version. I don't think it would break anything... Feel free to just file an issue for this TODO though

Contributor Author

Did it in e9e5965. This means that consistent_snapshot with non-root metadata means nothing for the simulator, though.

Member

well, it still supports consistent_snapshot... but only the latest/current consistent snapshot: trying to download using previous snapshot versions will fail in interesting ways (because the metadata will be incorrect). I think this is fine?

@jku jku mentioned this pull request Nov 2, 2021
Add ngclient/updater tests following the top-level-roles metadata
update from the specification (Detailed client workflow)
using RepositorySimulator.

Signed-off-by: Teodora Sechkova <tsechkova@vmware.com>
Extend the TestRefresh cases with additional checks
for expected metadata files and their content written
on the file system.

Signed-off-by: Teodora Sechkova <tsechkova@vmware.com>
Fix formatting and some potential linter and typing
errors.

Signed-off-by: Teodora Sechkova <tsechkova@vmware.com>
Define _assert_version_equals for checking if the
local metadata file's version is as expected.

Signed-off-by: Teodora Sechkova <tsechkova@vmware.com>
@sechkova
Contributor Author

sechkova commented Nov 9, 2021

I did some improvement of the asserts as you suggested and opened #1669 to figure out a smarter way for asserting the local metadata state.

@sechkova sechkova requested a review from jku November 9, 2021 12:47
Member

@jku jku left a comment

Yeah looks nice: the assertion changes really improve readability, I can see what's being tested.

No need to block merging for this... but are you able to test how much runtime these tests add? It is a big chunk of tests that creates and verifies a lot of metadata... I think there's a lot of optimization we could do in RepositorySimulator and how we use it but I'm not interested in doing that unless it's a real bottleneck

Comment on lines +413 to +421
def test_new_targets_version_mismatch(self):
# Check against snapshot role’s targets version

# Increase targets version without updating snapshot
self.sim.targets.version += 1
with self.assertRaises(BadVersionNumberError):
self._run_refresh()
Member

no _assert_files_exist() here?

Contributor Author

Oh yeah, I amended the last commit, thanks.

Except for 'root' role, RepositorySimulator does not
keep previous metadata versions, it always serves the latest
one. The metadata version check during fetch serves mostly
for informative purposes and removing it allows generating test
metadata with mismatching version.

Signed-off-by: Teodora Sechkova <tsechkova@vmware.com>
@sechkova
Contributor Author

sechkova commented Nov 9, 2021

No need to block merging for this... but are you able to test how much runtime these tests add? It is a big chunk of tests that creates and verifies a lot of metadata... I think there's a lot of optimization we could do in RepositorySimulator and how we use it but I'm not interested in doing that unless it's a real bottleneck

Well, testing time is relative, but RepositorySimulator seems to be performing well enough for our criteria.
Some super quick results:

> time python test_updater_top_level_update.py
........................
----------------------------------------------------------------------
Ran 24 tests in 0.091s

OK

real	0m0.271s
user	0m0.217s
sys	0m0.051s

In comparison with test_updater_ng.py and test_api.py which use the pre-generated data on disk:

> time python test_updater_ng.py 
.........
----------------------------------------------------------------------
Ran 9 tests in 0.248s

OK

real	0m0.432s
user	0m0.350s
sys	0m0.076s

> time python test_api.py 
..................
----------------------------------------------------------------------
Ran 18 tests in 0.208s

OK

real	0m0.336s
user	0m0.302s
sys	0m0.029s

and key rotation tests:

> time python test_updater_key_rotations.py
.
----------------------------------------------------------------------
Ran 1 test in 0.100s

OK

real	0m0.272s
user	0m0.231s
sys	0m0.040s

In total the whole test suite now takes ~25 seconds to complete in my environment.

@jku jku merged commit 0088ebd into theupdateframework:develop Nov 10, 2021
@sechkova sechkova deleted the ng-tests-metadata-update branch November 10, 2021 12:57
@lukpueh lukpueh mentioned this pull request Dec 13, 2021