Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix TUTORIAL.md and add regression testing for future issues #775

Merged
merged 33 commits into from
Dec 16, 2019

Conversation

awwad
Copy link
Contributor

@awwad awwad commented Aug 20, 2018

Add a regression test for the tutorial instructions. Running test_tutorial.py attempts the commands replicated from TUTORIAL.md. This should help us avoid breaking the tutorial with future changes without noticing by having automated testing run the tutorial and produce helpful output.

It would be wise to add a similar test for the client use instructions.

  • The code follows the Code Style Guidelines
  • Tests have been added for the bug fix or new feature
  • Docs have been added for the bug fix or new feature

@awwad awwad changed the title (WIP) Add a regression test for the tutorial instructions (WIP) Fix TUTORIAL.md and add regression testing for future issues Aug 20, 2018
@awwad awwad force-pushed the add_tutorial_regression_test branch from e5709b2 to f7f003d Compare August 21, 2018 18:45
@awwad awwad requested a review from lukpueh November 5, 2018 22:00
Copy link
Member

@lukpueh lukpueh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

While going through your test_tutorial.py and comparing it line by line with the code snippets in TUTORIAL.md, to see if the instructions are properly replicated (which they seem to be), I had a crazy idea:
Why not run Python's doctest over the TUTORIAL.md? That would be an actual regression test, i.e. if the tutorial is changed, doctest would immediately detect regressions without manually syncing the test code.

From the doctest docs, one of it's use cases is:

To write tutorial documentation for a package, liberally illustrated with input-output examples. Depending on whether the examples or the expository text are emphasized, this has the flavor of “literate testing” or “executable documentation”.

The function doctest.testfile should mostly work right away, with a few minor tweaks to the tutorial snippets:

  • Add additional function calls and their expected output here and there to assert the proper functioning,
  • Replace non-deterministic output, such as keyids (you already do this in test_tutorial.py)
  • Replace shell code with Python code (you already do this in test_tutorial.py)
  • Deal with user input (you already do this in test_tutorial.py)
    IMHO the easiest way would be to just monkeypatch the corresponding functions generate_and_write_*_keypair, import_*_privatekey_from_file. You can use doctest's globs variable to pass the patched function names.

What do you think? If you want I can help you. Also let me know if you prefer the setup as it is now. Then I'll resume my comparing lines. :)

as those created by this test...).
"""
for directory in ['repository', 'my_repo', 'client',
'repository/targets/my_project']:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

repository/targets/my_project is already deleted in the first iteration of this loop, i.e. shutil.rmtree('repository')

for fname in [
os.path.join('repository', 'targets', 'file1.txt'),
os.path.join('repository', 'targets', 'file2.txt'),
os.path.join('repository', 'targets', 'file3.txt'),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You're already deleting the repository dir above.

'unclaimed_key.pub']:
if os.path.exists(fname):
os.remove(fname)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On a more general note, would it make sense to create and change into a temporary test directory in TestTutorial.setUp and change back and remove the temporary directory in TestTutorial.tearDown?

@awwad
Copy link
Contributor Author

awwad commented Nov 27, 2018

It's a good thought. I'll have to try it out to see how practical it proves to be. If it makes the tutorial harder to read, that's a dealbreaker.

@awwad awwad mentioned this pull request Nov 27, 2018
lukpueh added a commit to lukpueh/tuf that referenced this pull request Nov 11, 2019
lukpueh added a commit to lukpueh/tuf that referenced this pull request Nov 13, 2019
TODO:
- Rephrase two comments in TUTORIAL.md snippets
- See if there are any issues in "Dump Metadata", "Delegations", "Revoke
Delegated Role", "Wrap up" and fix or ticketize
- Create regression test for client (or ticketize)
- Create commits
- Evaluate theupdateframework#808 (some if it is already resolved or will be resolved
with theupdateframework#775, split out rest into new tickets, e.g. doctest style
tutorial tests)
lukpueh added a commit to lukpueh/tuf that referenced this pull request Nov 18, 2019
    TODO:
    - Rephrase comments in TUTORIAL.md snippets
      (where code changed)
    - See if there are any issues in "Revoke
    Delegated Role", consistent snapshot, delegate to hashed bins
    - Finish regression test for client
    - Create commits
    - Evaluate theupdateframework#808 (some if it is already resolved or will be resolved
    with theupdateframework#775, split out rest into new tickets, e.g. doctest style
    tutorial tests)
lukpueh added a commit to lukpueh/tuf that referenced this pull request Nov 19, 2019
TODO:
- Create commits for TUTORIAL.md and test_tutorial.py
- Update PR theupdateframework#775
- Evaluate and update issue theupdateframework#808
  (some if it is already resolved or will be resolved
  with theupdateframework#775, split out rest into new tickets, e.g. doctest style
  tutorial tests)
@lukpueh lukpueh requested a review from mnm678 November 20, 2019 13:47
@lukpueh lukpueh changed the title (WIP) Fix TUTORIAL.md and add regression testing for future issues Fix TUTORIAL.md and add regression testing for future issues Nov 20, 2019
@lukpueh
Copy link
Member

lukpueh commented Nov 20, 2019

Taking over @awwad's work, I addressed my own inline review comments from above and most of the suggestions I made in #808 (comment).

The tutorial is now fully copy-pastable (runnable) and should generally not leave the repository in an inconsistent state.

Remaining issues as described in #808 should be fixed in separate PRs.

I suggest reviewers to:

Please read commit messages for details!

Copy link
Member

@joshuagl joshuagl left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks like a really solid set of improvements to the tutorial, great work @lukpueh

docs/TUTORIAL.md Outdated Show resolved Hide resolved
docs/TUTORIAL.md Outdated Show resolved Hide resolved
docs/TUTORIAL.md Outdated Show resolved Hide resolved
docs/TUTORIAL.md Outdated Show resolved Hide resolved
docs/TUTORIAL.md Outdated
>>> repository.dirty_roles()
Dirty roles: ['timestamp', 'snapshot', 'targets']
Dirty roles: ['root', 'snapshot', 'targets', 'timestamp']
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

At this point, I only have:

Dirty roles: ['targets']

I believe I ran through the tutorial verbatim, apart from that I didn't exit ipython so I didn't have to re-import and reload the keys.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, the tuf repository code has a bit of a problem with state-keeping. dirty_roles() shows roles that have been changed in memory, e.g. by loading a private key for a role (load_signing_key()), which is necessary after loading an existing repository (load_repository()). This does not mean, however, that the according metadata on disk actually needs to be updated, although this is exactly what's done with dirty roles when calling writeall().

This clearly needs a revision. But probably not as part of the current PR.

A related issue, that is writeall() ignoring roles that were not explicitly changed and thus not marked as dirty, even though they should be updated in dependency of other roles is documented in #958.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the explanation @lukpueh. I wasn't sure why the output I saw differed from that shown in the tutorial, so wanted to point it out. I'm happy with this not being addressed in the current PR, should we capture the issue around state-keeping so we don't forget?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if the best quick fix for the tutorial would be to just remove the calls to dirty_roles() and instead add a call to mark_dirty() passing all the roles that need to be written in a subsequent call to writeall(). This guarantees that the repository on disk is always in a consistent state, even if the interpreter was exited.

What do you think, @joshuagl?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I really like this idea. I think it demonstrates how things fit together and helps make some of the workflow clearer.

# ----- Tutorial Section: Delegations
generate_and_write_rsa_keypair(
'unclaimed_key', bits=2048, password='password')
public_unclaimed_key = import_rsa_publickey_from_file('unclaimed_key.pub')
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A very minor nit, but the tutorial isn't consistent in its use of ' vs " for strings. I suspect it highlights the new content Lukas added.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, you are absolutely right! I pointed this out in #808 (comment) (3rd item), which I suggested in #775 (comment) should be fixed along with other remaining issues in separate PRs.

Apologies for the many sources of truth here. I can see how it is hard for a reviewer to keep track of all the things that need to be fixed. Thanks for bearing with me! :)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Apologies, I read through #775 but forgot to read through #808 before reviewing. Glad we have this tracked, let's ignore for this PR.

'unclaimed_key', bits=2048, password='password')
public_unclaimed_key = import_rsa_publickey_from_file('unclaimed_key.pub')
repository.targets.delegate(
'unclaimed', [public_unclaimed_key], ['myproject/*.txt'])
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This emits:

'myproject/*.txt' is not located in the repository's targets directory: '/home/joshuagl/tuf-tut/repository/targets'

I think this is the first point in the tutorial we haven't warned the user to expect output.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I briefly thought about it but left it to be fixed along with other issues in the tutorial.

Moreover, I have the suspicion that this is a bogus warning. It seems to literally check for a file called "myproject/*.txt" ignoring the fact that this is a path pattern.

What to you think about creating an issue, and adding a comment in the tutorial snippet that asks the reader to ignore that output, referring to that issue?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That sounds reasonable. Let's be sure to have the issue mention that the tutorial needs updating once the issue is resolved.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My suspicion was wrong. The warning does not stem from an invalid file check but rather an unexpected file path prefix check. I documented the issue and a proposed fix in #963.

os.path.join('repository', 'targets', 'myproject'), recursive_walk=True)

repository.targets('unclaimed').delegate_hashed_bins(
targets, [public_unclaimed_key], 32)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we make a remark about the output here? I get:

Creating hashed bin delegations.
1 total targets.
32 hashed bins.
256 total hash prefixes.
Each bin ranges over 8 hash prefixes.
Adding a verification key that has already been used.
Adding a verification key that has already been used.
Adding a verification key that has already been used.
Adding a verification key that has already been used.
Adding a verification key that has already been used.
Adding a verification key that has already been used.
Adding a verification key that has already been used.
Adding a verification key that has already been used.
Adding a verification key that has already been used.
Adding a verification key that has already been used.
Adding a verification key that has already been used.
Adding a verification key that has already been used.
Adding a verification key that has already been used.
Adding a verification key that has already been used.
Adding a verification key that has already been used.
Adding a verification key that has already been used.
Adding a verification key that has already been used.
Adding a verification key that has already been used.
Adding a verification key that has already been used.
Adding a verification key that has already been used.
Adding a verification key that has already been used.
Adding a verification key that has already been used.
Adding a verification key that has already been used.
Adding a verification key that has already been used.
Adding a verification key that has already been used.
Adding a verification key that has already been used.
Adding a verification key that has already been used.
Adding a verification key that has already been used.
Adding a verification key that has already been used.
Adding a verification key that has already been used.
Adding a verification key that has already been used.
Adding a verification key that has already been used.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes! It is another instance of me thinking this should be addressed. More specifically with the 2nd to last item in #808 (comment).

But adding a comment there that briefly explains the output does not hurt.

docs/TUTORIAL.md Outdated

>>> repository.dirty_roles()
Dirty roles: ['00-07', '08-0f', '10-17', '18-1f', '20-27', '28-2f', '30-37', '38-3f', '40-47', '48-4f', '50-57', '58-5f', '60-67', '68-6f', '70-77', '78-7f', '80-87', '88-8f', '90-7', '98-9f', 'a0-a7', 'a8-af', 'b0-b7', 'b8-bf', 'c0-c7', 'c8-cf', 'd0-d7', 'd8-df', 'e0-e7', 'e8-ef', 'f0-f7', 'f8-ff', 'unclaimed']

Copy link
Member

@joshuagl joshuagl Nov 28, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I get the same roles but sorted differently, which I'd expect knowing that various Python data structures have non-deterministic ordering. Does it make sense to make these deterministic (i.e. sorted() this list)?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Patch to sort the list before printing in dirty_roles() submitted in #961

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

#961 was merged 🎉

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is odd. I had already added this change in ac01033 specifically for that reason, albeit one level deeper. Are you sure you got a different sorting?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh no, I was running the steps from the develop branch, not a branch with the changes from this PR 🤦‍♂

We should revert/undo ff21b36. Apologies @lukpueh!

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No worries at all! :) What do you think about cherry-picking 282c9aa into this PR and reverting #961?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds good to me

JustinCappos added a commit that referenced this pull request Nov 29, 2019
Minor tutorial related tweaks made after reviewing PR #775
@lukpueh
Copy link
Member

lukpueh commented Nov 29, 2019

Thank you very much for the review, @joshuagl. Let me address/reply to your excellent observations!

@@ -21,7 +21,7 @@ init:

install:
- set PATH=%PYTHON%;%PYTHON%\\Scripts;%PATH%
- python -m pip install -U pip
- python -m pip install -U pip setuptools
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we need to explicitly install mock for Python 2.7 on appveyor?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

True. Let me fix this.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@joshuagl, do you happen to know how to install the dependency only for Python 2.7 builds on appveyor?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not off the top of my head, but I think we should be able to use for.matrix.only/.except:
https://www.appveyor.com/docs/build-configuration/#specializing-matrix-job-configuration

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks. Maybe 932c025 works too.

lukpueh added a commit that referenced this pull request Nov 29, 2019
- line-wraps are integrated with
190a736 in #775
- sorting the value returned by `get_dirty_roles()` in
`dirty_roles()` is not necessary as `get_dirty_roles()` already
returns a sorted list per ac01033
in #775.

Signed-off-by: Lukas Puehringer <lukas.puehringer@nyu.edu>
lukpueh added a commit that referenced this pull request Nov 29, 2019
…utorial

Revert "Minor tutorial related tweaks made after reviewing PR #775"
@lukpueh lukpueh force-pushed the add_tutorial_regression_test branch from 2f0cf72 to 21db98f Compare December 2, 2019 12:34
awwad and others added 25 commits December 16, 2019 15:16
Signed-off-by: Sebastien Awwad <sebastien.awwad@gmail.com>
(that is, despite currently existing issue to be remedied in #774)

Currently, repository_tool.get_filepaths_in_directory yields
relative paths, not the absolute paths it promises in its docstring.
This test will now function despite this and continue to function
after #774 is merged.

Signed-off-by: Sebastien Awwad <sebastien.awwad@gmail.com>
Note that target filepaths specified in the repo use '/' even on
Windows.

(That property is important to make sure that metadata is platform-
independent.)

Signed-off-by: Sebastien Awwad <sebastien.awwad@gmail.com>
Reasons are:

- The prompt says 2.7.3  Sep 26 2013, which makes the tutorial look
  outdated
- There is another section that explains how snippets should be
  executed in a Python interactive interpreter
- The only activity in the snippet is importing tuf functions and
  creating a repo, both of which is done in another snippet below.
  And the here created repo is not re-used
- The tutorial is long enough

Signed-off-by: Lukas Puehringer <lukas.puehringer@nyu.edu>
Signed-off-by: Lukas Puehringer <lukas.puehringer@nyu.edu>
- Fix expected output
- Update comments
- Also see #955

Signed-off-by: Lukas Puehringer <lukas.puehringer@nyu.edu>
- Correctly show that repo.get_filepaths_in_directory() returns
  absolute and not relative paths
- Pass absolute path to repo.targets.add_target() to fix exception
- Also see #957

Signed-off-by: Lukas Puehringer <lukas.puehringer@nyu.edu>
The section does not fit in the continuity of the tutorial and
misses commands to make it fully copy-pastable.

This commit marks the section as "Excursion" and updates the
snippets to make the commands fully copy-pastable, using files
created in prior tutorial sections.

Signed-off-by: Lukas Puehringer <lukas.puehringer@nyu.edu>
In "Targets" section:
- Remove `repository.targets('<delegated rolename>').add_target(...)`
  command, because it is not copy-pastable and delegations
  have not yet been covered at that point.
- Update the "remove targets" snippet to remove the previously added
  "myproject/file4.txt" instead of "file3.txt", because we will add
  "myproject/file4.txt" to the delegated "unclaimed" role in the
  Delegation section.

In "Delegation" section:
- Change "unclaimed" delegation pattern from 'foo*.tgz', for which
  no file exists in the tutorial, to 'myproject/*.txt'.
- Add "myproject/file4.txt" to the delegated unclaimed targets role
- Remove the command that updates the version of the "unclaimed"
  role, because this should not be done manually, and the add_target
  call shows just as well how to access delegated roles.
- Comment out the revoke delgated role section, leaving a TODO note
  for required updates (should be ticketized).

In "Delegate to Hashed Bins":
- Add call to remove target "myproject/file4.txt" from "unclaimed",
  because it is further delgated to hashed bins
- Add dirty_roles() call to show all the newly created bins
- Add mark_dirty() and writeall() calls to create a consistent
  state of the repo

Signed-off-by: Lukas Puehringer <lukas.puehringer@nyu.edu>
- Fix expected output
- Update comments
- Add a few additional calls, to help the reader understand the
  repo state
- Also see #958

Signed-off-by: Lukas Puehringer <lukas.puehringer@nyu.edu>
The text above the snippet explains the basic idea of "consistent
snapshots" and how to generate them with `write` and `writeall`.

The commands in the snippet just leave the repo in an inconsistent
state (see comment).

Signed-off-by: Lukas Puehringer <lukas.puehringer@nyu.edu>
- Change directory names to what cli tool repo.py expects
- Remove unrelated "tufenv" note
- Mention that `tuf` must be installed

Signed-off-by: Lukas Puehringer <lukas.puehringer@nyu.edu>
Instead of keeping track of files created during the tutorial and
removing them afterwards, this commit updates the test case to
create and change into a temporary directory in setUp and
change back and remove the tempdir in tearDown.

Signed-off-by: Lukas Puehringer <lukas.puehringer@nyu.edu>
roledb.get_dirty_roles(repo_name) returns the list representation
of the global _dirty_roles[repo_name] set. To make the return value
deterministic this commit sorts the list before returning it.

The commit also removes calls to sorted on the return value of
get_dirty_roles in test_roledb.py and test_repository_tool.py.

Signed-off-by: Lukas Puehringer <lukas.puehringer@nyu.edu>
Signed-off-by: Lukas Puehringer <lukas.puehringer@nyu.edu>
Add, remove and update function calls to match code snippets in
tutorial.

This commit also adds tests for outputs of `repo.status()` and
`repo.dirty_roles()` functions.
Note that the compare-to strings need to be constructed
programatically, akin to how they are constructed in the relevant
functions, in order to avoid issues with unicode prefixes in
Python2/3, e.g.
    "Dirty roles: ['root']"
vs  "Dirty roles: [u'root']"

Signed-off-by: Lukas Puehringer <lukas.puehringer@nyu.edu>
This should fix the following build error:

error in tuf setup command: 'tests_require' must be a string or
list of strings containing valid project/version requirement
specifiers; Expected version spec in mock; python_version < "3.3"
at ; python_version < "3.3"

Signed-off-by: Lukas Puehringer <lukas.puehringer@nyu.edu>
Co-Authored-By: Joshua Lock <jlock@vmware.com>
Signed-off-by: Lukas Puehringer <lukas.puehringer@nyu.edu>
- Ask the reader to ignore a misleading warning about the location of
  a delegation path pattern.
  The comment may be removed when fixing the warning in
  #963.

- Comment out text that has become obsolete when commenting out
  the "Revoke Delegated Role" section (in an earlier commit).

Signed-off-by: Lukas Puehringer <lukas.puehringer@nyu.edu>
Explain and show output of delegate_hashed_bins() function call in
tutorial snippet.

Also update the subsequent comment for better continuity.

Signed-off-by: Lukas Puehringer <lukas.puehringer@nyu.edu>
Signed-off-by: Lukas Puehringer <lukas.puehringer@nyu.edu>
TUF does not reliably mark roles as dirty whose metadata needs
to be re-generated.

Only roles that have changed are marked as dirty, but sometimes
roles metadata needs to be updated, although the role wasn't
changed directly (see #958).

Furthermore, the tutorial assumes at one point that the reader
leaves and re-enter the interpreter session, being forced to reload the
signing keys, roles that later need to be re-written, are marked as
dirty. If the reader does not leave the interpreter, the roles are
not marked as dirty (see #964).

To not confuse the reader with flawed state-keeping, and to never
write an inconsistent repository to disk, the tutorial lets the
reader explicitly mark all roles that need to be re-written as
"dirty".

This can be changed once above issues are fixed.

Signed-off-by: Lukas Puehringer <lukas.puehringer@nyu.edu>
Signed-off-by: Lukas Puehringer <lukas.puehringer@nyu.edu>
Signed-off-by: Lukas Puehringer <lukas.puehringer@nyu.edu>
- Add missing remove_target call from "Delegate to Hashed Bins"
  section
- Add comments to dirty_roles output assertion

Signed-off-by: Lukas Puehringer <lukas.puehringer@nyu.edu>
@lukpueh
Copy link
Member

lukpueh commented Dec 16, 2019

I just rebased on top of develop, which does not test on 3.4 anymore, but instead tests on 3.7 and 3.8. Travis builds pass. So I guess it's time to merge.

Thanks for the initial work, @awwad, and for the great reviews, @joshuagl and @mnm678!

Remaining issues shall be fixed with #808.

@lukpueh lukpueh merged commit 530d4a8 into develop Dec 16, 2019
@trishankatdatadog trishankatdatadog deleted the add_tutorial_regression_test branch October 13, 2021 15:02
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants