Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

RepositorySimulator: add non-consistent snapshot support #1666

Merged
merged 3 commits into from
Nov 18, 2021

Conversation

sechkova
Copy link
Contributor

@sechkova sechkova commented Nov 8, 2021

Extend filename partitioning to support serving non-versioned
metadata and non-prefixed target files.

Fixes #1637, fixes #1667, fixes #1573

Please verify and check that the pull request fulfills the following
requirements
:

  • The code follows the Code Style Guidelines
  • Tests have been added for the bug fix or new feature
  • Docs have been added for the bug fix or new feature

@sechkova
Copy link
Contributor Author

sechkova commented Nov 8, 2021

I didn't add tests in this PR, I opened this one instead: #1667

@coveralls
Copy link

coveralls commented Nov 8, 2021

Pull Request Test Coverage Report for Build 1476016133

  • 0 of 0 changed or added relevant lines in 0 files are covered.
  • No unchanged relevant lines lost coverage.
  • Overall coverage remained the same at 97.463%

Totals Coverage Status
Change from base Build 1470751886: 0.0%
Covered Lines: 3948
Relevant Lines: 4032

💛 - Coveralls

Copy link
Member

@jku jku left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Did you somehow test this privately?

The core difficulty with this issue is the corner we painted ourselves in by telling warehouse they can be consistent_snapshot and offer non-consistent targets for download at the same time: so we can't 100% tell if a targetname is hash prefixed or not when a client requests it ...

This code seems to assume that if target name contains a "." then it contains a prefix. I realize there probably aren't perfect solutions here but this seems non-optimal: it should be possible to request non-prefixed targets with dots in their names

role = "timestamp"
else:
version, _, role = ver_and_name.partition(".")
role, version = self._partition_filename(ver_and_name)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hmm, it seems to do the right thing, but your function is called "partition filename" and then you need to feed it something that is explicitly just a portion of a filename ...

@sechkova
Copy link
Contributor Author

sechkova commented Nov 9, 2021

This code seems to assume that if target name contains a "." then it contains a prefix. I realize there probably aren't perfect solutions here but this seems non-optimal: it should be possible to request non-prefixed targets with dots in their names

Oh yes, you are right. I guess I will rework it in parallel with tests to cover all the weird cases.

Copy link
Member

@jku jku left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

marking "request changes" so it's visible in issue list

@sechkova
Copy link
Contributor Author

I reworked the implementation and now RepositorySimulator makes the decision whether the client requests versioned/prefixed files or not based on: self.root.consistent_snapshot and self.prefix_targets_with_hash. While the specification describes how repository and client sync on consistent_snapshot, I presumed that they both need to know in advance whether they will work with hash-prefixed targets and I added the same prefix_targets_with_hash option to the repository.

I also added one new test file for consistent snapshot. I couldn't decide if the class TestConsistentSnapshot needs to be in a separate test file or integrated in another. Let me know your opinion on this as well as the general idea/readability of the tests.
Since we want to test the client and not that much the RepositorySimulator and it is not possible to verify that internally Updater builds the correct prefix+name, I "wrapped" RepositorySimulator._fetch_* to check the calls made by Updater.

@sechkova sechkova requested a review from jku November 15, 2021 17:29
role = "timestamp"
else:
# root is always version-prefixed while timestamp is always NOT
if ver_and_name.endswith("root") or (
Copy link
Contributor Author

@sechkova sechkova Nov 15, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This will fail if a delegated role name ends with "root" (and consistent_snapshot is not enabled). But otherwise it is hard to generalize and I guess we don't want to support that weird cases in RepositorySimulator?

Copy link
Member

@jku jku Nov 16, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah no need to spend a lot of time polishing a test utility to perfection...

that said you could move version, _, role = ver_and_name.partition(".") to before this line so you could compare role == "root" (but would keep rest of the code as is -- I admit the else clause is unintuitive if this is done)?

up to you

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tried to do it in a readable way ... at least now it is correct for the cases I can think of.

Copy link
Member

@jku jku left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think your reasoning about prefix_targets_with_hash option makes sense.

I also think counting the fetches at repository side is fine and actually better than poking into updater internals.

The patching in the tests is not pretty but it is much improved in this version. I think the asserts you use do not check whether other fetches were also made though, which I think we want (so we know it doesn't download things twice or something)... I think we should be comparing our expected calls list directly to the actually made calls list. WDYT?

...which leads me to: Making the call counters a RepositorySimulator feature might make the tests more readable and probably wouldn't be a lot of code: sim.call_stats.metadata could contain basically the same list of tuples that your testdata contains -- but this could also be a followup idea if you'd rather get these tests in ASAP.

role = "timestamp"
else:
# root is always version-prefixed while timestamp is always NOT
if ver_and_name.endswith("root") or (
Copy link
Member

@jku jku Nov 16, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah no need to spend a lot of time polishing a test utility to perfection...

that said you could move version, _, role = ver_and_name.partition(".") to before this line so you could compare role == "root" (but would keep rest of the code as is -- I admit the else clause is unintuitive if this is done)?

up to you

for targetpath in targetpaths:
sim.targets.version += 1
sim.add_target("targets", b"content", targetpath)
sim.update_snapshot()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's not important but there should be no need to update snapshot more than once

# Copyright 2021, New York University and the TUF contributors
# SPDX-License-Identifier: MIT OR Apache-2.0

"""Test ngclient Updater consistent snapshot"""
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this could be better (we're testing consistent snapshot in all client tests, hopefully). This file seems to test

  • that the client works with all combinations of consistent_snapshot and prefix_targets_with_hash
  • that expected URLs are being fetched with all of these combinations

@sechkova
Copy link
Contributor Author

sechkova commented Nov 17, 2021

...which leads me to: Making the call counters a RepositorySimulator feature might make the tests more readable and probably wouldn't be a lot of code: sim.call_stats.metadata could contain basically the same list of tuples that your testdata contains -- but this could also be a followup idea if you'd rather get these tests in ASAP.

I think I will open a separate issue. It may double some of the work but it would be easier at least for me to do it as a follow up.
There it is: #1682

@sechkova
Copy link
Contributor Author

sechkova commented Nov 17, 2021

Just noting that I'd like to squash the last two commits before merging :)

Copy link
Member

@jku jku left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

Extend URL partitioning to support serving non-versioned
metadata and non-prefixed target files.

Signed-off-by: Teodora Sechkova <tsechkova@vmware.com>
Add tests for ngclient.Updater toggling
'consitent_snapshot' and 'prefix_targets_with_hash'.

Signed-off-by: Teodora Sechkova <tsechkova@vmware.com>
@sechkova
Copy link
Contributor Author

Squashed the last two commits, should be ready now.

@sechkova
Copy link
Contributor Author

sechkova commented Nov 18, 2021

I now saw this issue: #1573
I think test_refresh_on_consistent_targets() can be replaced by the new test file (test_updater_consistent_snapshot.py) and we can close it. I can do it in this PR if you agree?

@jku
Copy link
Member

jku commented Nov 18, 2021

Sounds good to me!

Consistent snapshot and consistent targets are now
extensively tested in test_updater_consistent_snapshot.py.

Signed-off-by: Teodora Sechkova <tsechkova@vmware.com>
@jku jku merged commit 1b5df4c into theupdateframework:develop Nov 18, 2021
@jku
Copy link
Member

jku commented Nov 18, 2021

Thanks

kairoaraujo pushed a commit to kairoaraujo/python-tuf that referenced this pull request Dec 1, 2021
This change is to implement a feature to track the fetch calls to
the metadata and targets. This feature was mentioned in PR theupdateframework#1666 that
generated issue theupdateframework#1682.
It is the ``RepositorySimulator.fetch_tracker``. It also changes the
``test_updater_consistent_snapshot`` to use the fetch_tracker
instead of mocking it.

It implements a ``dataclass`` that stores the calls to fetch metadata
(``_fetch_metadata``) and targets (``_fetch_targets``).
This dataclass has a method to clean (``clear``) the current metadata
and targets tracker.

The fetch calls for metadata are stored as a list of tuples that
contains metadata role name and version, and fetch calls for targets
are stored as a list of target paths.

Signed-off-by: Kairo de Araujo <kdearaujo@vmware.com>
kairoaraujo pushed a commit to kairoaraujo/python-tuf that referenced this pull request Dec 2, 2021
This commit implements a feature in Repository Simulator to
track the fetch calls to the metadata and targets. This feature was
mentioned in PR theupdateframework#1666 that generated issue theupdateframework#1682.
This commit adds ``RepositorySimulator.fetch_tracker``. It also changes
the ``tests/test_updater_consistent_snapshot.py`` to use the
``fetch_tracker`` instead of using mock.

It implements a ``dataclass`` that stores the calls to fetch metadata
(``_fetch_metadata``) in ``fetch_tracker.metadata`` and targets
(``_fetch_targets``) in ``fetch_tracker.targets``.

The fetch calls for metadata, and targets are stored as lists.

Signed-off-by: Kairo de Araujo <kdearaujo@vmware.com>
kairoaraujo pushed a commit to kairoaraujo/python-tuf that referenced this pull request Dec 3, 2021
This commit implements a feature in Repository Simulator to
track the fetch calls to the metadata and targets. This feature was
mentioned in PR theupdateframework#1666 that generated issue theupdateframework#1682.
This commit adds RepositorySimulator.fetch_tracker. It also changes
the tests/test_updater_consistent_snapshot.py to use the
fetch_tracker instead of using mock.

It implements a dataclass that stores the calls to fetch metadata
(_fetch_metadata) in fetch_tracker.metadata and targets
(_fetch_targets) in fetch_tracker.targets.

The fetch calls for metadata, and targets are stored as lists.

Signed-off-by: Kairo de Araujo <kdearaujo@vmware.com>
kairoaraujo pushed a commit to kairoaraujo/python-tuf that referenced this pull request Dec 3, 2021
This commit implements a feature in Repository Simulator to
track the fetch calls to the metadata and targets. This feature was
mentioned in PR theupdateframework#1666 that generated issue theupdateframework#1682.
This commit adds RepositorySimulator.fetch_tracker. It also changes
the tests/test_updater_consistent_snapshot.py to use the
fetch_tracker instead of using mock.

It implements a dataclass that stores the calls to fetch metadata
(_fetch_metadata) in fetch_tracker.metadata and targets
(_fetch_targets) in fetch_tracker.targets.

The fetch calls for metadata, and targets are stored as lists.

Signed-off-by: Kairo de Araujo <kdearaujo@vmware.com>
kairoaraujo pushed a commit to kairoaraujo/python-tuf that referenced this pull request Dec 7, 2021
This commit implements a feature in Repository Simulator to
track the fetch calls to the metadata and targets. This feature was
mentioned in PR theupdateframework#1666 that generated issue theupdateframework#1682.
This commit adds RepositorySimulator.fetch_tracker. It also changes
the tests/test_updater_consistent_snapshot.py to use the
fetch_tracker instead of using mock.

It implements a dataclass that stores the calls to fetch metadata
(_fetch_metadata) in fetch_tracker.metadata and targets
(_fetch_targets) in fetch_tracker.targets.

The fetch calls for metadata, and targets are stored as lists.

Signed-off-by: Kairo de Araujo <kdearaujo@vmware.com>
kairoaraujo pushed a commit to kairoaraujo/python-tuf that referenced this pull request Dec 8, 2021
This commit implements a feature in Repository Simulator to
track the fetch calls to the metadata and targets. This feature was
mentioned in PR theupdateframework#1666 that generated issue theupdateframework#1682.
This commit adds RepositorySimulator.fetch_tracker. It also changes
the tests/test_updater_consistent_snapshot.py to use the
fetch_tracker instead of using mock.

It implements a dataclass that stores the calls to fetch metadata
(_fetch_metadata) in fetch_tracker.metadata and targets
(_fetch_targets) in fetch_tracker.targets.

The fetch calls for metadata, and targets are stored as lists.

Signed-off-by: Kairo de Araujo <kdearaujo@vmware.com>
@lukpueh lukpueh mentioned this pull request Dec 13, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
3 participants