Do not immediately verify latest root when rotating #885

trishankatdatadog
Member

Fixes issue #:

N/A

Description of the changes being introduced by the pull request:

When rotating the root (say from v2 to v3, then v3 to v4), the reference implementation currently verifies signatures on the latest root (say, v4) immediately. This fails when v4 is signed with v3's keys but not v2's. This PR changes the implementation so that signatures on the latest root are not verified immediately.
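
To illustrate the failure mode described above, here is a minimal sketch using a toy model in which a root is just a set of key IDs plus the set of key IDs that signed it. The key names (`A`, `B`, `C`), the `ROOTS` table, and `is_signed_by_trusted` are hypothetical and stand in for real signature verification with a threshold of one.

```python
# Toy model (an assumption for illustration, not TUF's real data format):
# each root version lists the keys it declares and the keys that signed it.
ROOTS = {
    2: {"keys": {"A"}, "signed_by": {"A"}},
    3: {"keys": {"B"}, "signed_by": {"A", "B"}},  # rotation A -> B
    4: {"keys": {"C"}, "signed_by": {"B", "C"}},  # rotation B -> C
}


def is_signed_by_trusted(new_root, trusted_root):
    """Return True if at least one trusted key signed the new root."""
    return bool(new_root["signed_by"] & trusted_root["keys"])


# A client that trusts v2 and checks the latest root (v4) directly gets False,
# because v4 carries no signature from v2's key "A".
direct_check = is_signed_by_trusted(ROOTS[4], ROOTS[2])

# Checking each link in the chain (v2 -> v3 -> v4) succeeds.
chained_check = (is_signed_by_trusted(ROOTS[3], ROOTS[2])
                 and is_signed_by_trusted(ROOTS[4], ROOTS[3]))

print(direct_check, chained_check)
```

In this model the direct check fails while the link-by-link check passes, which is the situation the PR description refers to.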

Please verify and check that the pull request fulfills the following requirements:

  • The code follows the Code Style Guidelines
  • Tests have been added for the bug fix or new feature
  • Docs have been added for the bug fix or new feature

@JustinCappos JustinCappos requested a review from lukpueh June 4, 2019 21:03
@JustinCappos
Member

@tanishqjasoria Can you also take a look?

@trishankatdatadog
Member Author

Yeah, I should say that someone should triple-check this to make sure we are not skipping some important security check. But at that point, all we are trying to do is get the latest version number of the root.

Cc @lukpueh

@tanishqjasoria

I guess there might be a problem. Because of the switch from _get_metadata_file() to _get_file(), the root.json we get is not verified.
Then, we check whether the root metadata has changed by comparing next_version and latest_version, and latest_version is extracted from the unverified root.json file.
So if the unverified root.json file claims that latest_version is the same as current_version, the root metadata would not be updated even if an update is available, and we would continue using the old root metadata.

@trishankatdatadog
Member Author

trishankatdatadog commented Jun 5, 2019

@tanishqjasoria Good observation, but this would be true even with a signed and verified "latest" root metadata file: attackers who control the repository can always perform a freeze attack (limited by the expiration of the previous root, which could be as long as years).

@tanishqjasoria

@trishankatdatadog I missed that!

Member

@lukpueh lukpueh left a comment

Thanks for the patch, @trishankatdatadog! I am not sure I understand the full consequences of your changes. Could you provide answers to my questions below?

tuf/client/updater.py Outdated Show resolved Hide resolved
# should refresh them here as a protective measure. See Issue #736.
self._rebuild_key_and_role_db()
self.consistent_snapshot = \
self.metadata['current']['root']['consistent_snapshot']
Member

Comment in L1152-1154 claims that self.consistent_snapshot is set with _update_metadata().

https://github.com/theupdateframework/tuf/blob/c75a2b7fe1eaabc88f82a5f0bdfc8bd25cab7ab1/tuf/client/updater.py#L1152-L1154

Why set it here explicitly? Is the comment above inaccurate?

Member Author

IDK, it's possible.

Member

@lukpueh lukpueh left a comment

I somehow missed the call to _update_metadata in the first line of the for-loop below. It downloads all the root files again using _get_metadata_file, exactly as latest_root_metadata_file used to be fetched before your patch. That means all the checks that I listed in my last review are performed eventually.

So I'd say, LGTM! :)

Do you have the capacity to add a test that has at least one intermediate root and where the latest root is signed with keys that are not available in the beginning?

On a side-note, I re-read the relevant part of the spec and it seems a bit different from what we do here. As per the spec the client downloads root files with incrementing version numbers until the server 404s. So in the spec there is no initial download of the latest root.
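
The spec behaviour described above can be sketched roughly as follows. This is a simplified illustration, not the code in tuf/client/updater.py: `update_root`, `fetch_root`, and the dict-based root format are hypothetical, a missing version (the server's 404) is modelled as `None`, and signature checking is reduced to a set intersection with a threshold of one. A real client performs further checks (thresholds, expiration, and so on) that are omitted here.

```python
def update_root(trusted_root, fetch_root):
    """Walk the root chain one version at a time until the next one is missing.

    `fetch_root(version)` is a hypothetical callable that returns the root
    for `version`, or None when the server would answer 404.
    """
    while True:
        next_version = trusted_root["version"] + 1
        candidate = fetch_root(next_version)
        if candidate is None:
            # No newer root available: the trusted root is the latest one.
            return trusted_root

        # Each new root must be signed by a key of the root trusted so far.
        if not candidate["signed_by"] & trusted_root["keys"]:
            raise ValueError("root v%d is not signed by trusted keys" % next_version)
        if candidate["version"] != next_version:
            raise ValueError("unexpected root version %r" % candidate["version"])

        trusted_root = candidate


# Toy repository: v2 -> v3 -> v4, each signed by its predecessor's key.
REPOSITORY = {
    3: {"version": 3, "keys": {"B"}, "signed_by": {"A", "B"}},
    4: {"version": 4, "keys": {"C"}, "signed_by": {"B", "C"}},
}
trusted_v2 = {"version": 2, "keys": {"A"}, "signed_by": {"A"}}

latest = update_root(trusted_v2, REPOSITORY.get)
print(latest["version"])
```

Because the loop never asks for "the latest root" up front, every root it accepts has been checked against the keys of the version immediately before it.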

And on another side-note, did you see @vladimir-v-diaz's TODO comment in the refresh, where _update_root_metadata is called?

https://github.com/theupdateframework/tuf/blob/cf6fdec5101fec5e2402dc8d073fa17ace1fc25f/tuf/client/updater.py#L1079-L1082

@trishankatdatadog
Member Author

@lukpueh Following our discussion, I have adjusted the code to follow the spec. Please review.

To answer your questions:

  1. Could someone else please help add the tests? Maybe an intern...
  2. That TODO requires a wider discussion on our mailing list. @JustinCappos

@trishankatdatadog
Member Author

trishankatdatadog commented Jun 13, 2019

Our spec should also be updated to say: once you have turned on consistent snapshots, you should always write versioned root metadata files alongside the unversioned ones, so that outdated clients can update to your latest root. @JustinCappos

Update: theupdateframework/specification#52
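
As a rough sketch of that suggestion, a repository could write every new root twice: once under a version-prefixed name and once under the plain name. The helper `write_root`, the minimal root dict, and the temporary directory are hypothetical; the `<version>.root.json` naming is assumed to match the spec's convention for versioned root files.

```python
import json
import os
import tempfile


def write_root(metadata_dir, root):
    """Write `root` as both <version>.root.json and root.json (hypothetical helper)."""
    filenames = ["%d.root.json" % root["version"], "root.json"]
    for filename in filenames:
        with open(os.path.join(metadata_dir, filename), "w") as fp:
            json.dump(root, fp)
    return filenames


metadata_dir = tempfile.mkdtemp()
for version in (1, 2, 3):
    write_root(metadata_dir, {"version": version})

# Old versions stay available, so a client at v1 can still fetch v2, then v3.
print(sorted(os.listdir(metadata_dir)))
```

Keeping every versioned file around is what lets an outdated client walk the chain one version at a time.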

Member

@lukpueh lukpueh left a comment

Thanks for your update, @trishankatdatadog. It does look more like the spec now. However, the code does not work yet. @JustinCappos, do we have an intern who can take over, i.e. fix the things I noted below and add some tests?

tuf/client/updater.py Outdated Show resolved Hide resolved
@trishankatdatadog
Member Author

@lukpueh Sorry, I wrote on one machine, and tested on another, so there were some mistakes in translation.

Does this work better for you?

Member

@lukpueh lukpueh left a comment

Thanks for the updates! I'm unsure whether you need requests.exceptions. Otherwise it looks good.

tuf/client/updater.py Outdated Show resolved Hide resolved
Signed-off-by: Trishank K Kuppusamy <trishank.kuppusamy@datadoghq.com>
@trishankatdatadog trishankatdatadog force-pushed the trishankatdatadog/correctly-rotate-root branch from fbff603 to 5fc3d90 Compare July 1, 2019 17:21
@lukpueh
Member

lukpueh commented Jul 2, 2019

Thanks for the update, @trishankatdatadog! This looks sound now. I'd still like to see tests before we merge. @JustinCappos, do we have a student you might want to volunteer? If not, I'll add some tests.

Signed-off-by: Trishank K Kuppusamy <trishank.kuppusamy@datadoghq.com>
@trishankatdatadog trishankatdatadog force-pushed the trishankatdatadog/correctly-rotate-root branch from 54a101b to 3bb4f73 Compare July 2, 2019 14:50
@trishankatdatadog
Member Author

This is very important to Datadog for our next update. Could we get this fixed ASAP? Thanks.

@trishankatdatadog trishankatdatadog added this to the 0.12.0 milestone Oct 2, 2019
@lukpueh lukpueh assigned lukpueh and unassigned tanishqjasoria Oct 3, 2019
@lukpueh
Member

lukpueh commented Oct 3, 2019

FYI. I am working on tests...

@lukpueh lukpueh mentioned this pull request Oct 3, 2019
@trishankatdatadog
Member Author

Thanks for helping to write the tests, @lukpueh...

@lukpueh
Member

lukpueh commented Oct 4, 2019

And here they are: trishankatdatadog/tuf@b170601...lukpueh:correctly-rotate-root-test

@trishankatdatadog, could you please sign off your most recent commit? And would you mind adding my three test-adding commits linked above on top of your branch? Then we can review the tests here on the same PR.

Signed-off-by: Trishank K Kuppusamy <trishank.kuppusamy@datadoghq.com>
…trishankatdatadog/tuf into trishankatdatadog/correctly-rotate-root

Signed-off-by: Trishank K Kuppusamy <trishank.kuppusamy@datadoghq.com>
@trishankatdatadog trishankatdatadog force-pushed the trishankatdatadog/correctly-rotate-root branch from 7f32a1e to cd2a361 Compare October 4, 2019 21:42
@trishankatdatadog
Member Author

@lukpueh I had to manually set DCO to pass, should be OK (only one commit done directly from GitHub didn't have sign off). Great test, BTW. Let me know if you need anything else!

lukpueh added a commit that referenced this pull request Oct 7, 2019
…ectly-rotate-root

Signed-off-by: Lukas Puehringer <lukas.puehringer@nyu.edu>
@lukpueh
Member

lukpueh commented Oct 7, 2019

Merged manually via 3d342e6. Tests will be added with #930.

@lukpueh lukpueh closed this Oct 7, 2019
@lukpueh
Member

lukpueh commented Oct 7, 2019

Thanks for the fix, @trishankatdatadog!! 💯

@lukpueh lukpueh mentioned this pull request Oct 14, 2019
3 tasks
lukpueh added a commit to lukpueh/tuf that referenced this pull request Feb 24, 2020
Since theupdateframework#885 the tests in TestUpdater and TestKeyRevocation fail on
Appveyor Python 2.7 builds. After some live debugging, it turns out
that the tests fail due to the additional HTTP requests to
the simple HTTP server (see tests/simple_server.py) that were
added in theupdateframework#885.

The simple server runs in a subprocess and is re-used for the
entire TestCase. After a certain number of requests it becomes
unresponsive. Note that the subprocess does not exit (ps -W), nor
does the port get closed (netstat -a). It just doesn't serve the
request, making it time out and fail the test.

The following script can be used to reproduce the issue (run in
tests directory):

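```python
import random
import subprocess

import requests

# Start the test server on a random port. Capturing stderr in a pipe that is
# never read is what triggers the hang.
port = random.randint(30000, 45000)
command = ['python', 'simple_server.py', str(port)]
server_process = subprocess.Popen(command, stderr=subprocess.PIPE)
url = 'http://localhost:' + str(port) + '/'

session = requests.Session()
counter = 0  # Number of requests served successfully so far.

try:
  # Keep requesting until the server stops responding and the timeout raises.
  while True:
    session.get(url, timeout=3)
    counter += 1

finally:
  # Report how many requests succeeded, then stop the server.
  print(counter)
  server_process.kill()
```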

It fails repeatedly on the 69th request, but only if
`stderr=subprocess.PIPE` is passed to Popen. Given that for each
request the simple server writes about 60 characters to stderr,
e.g.

```
127.0.0.1 - - [24/Feb/2020 12:01:23] "GET / HTTP/1.1" 200 -
```
... it looks a lot like a full pipe buffer of size 4096. Note that the
`bufsize` argument to Popen does not change anything.
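
The arithmetic behind that guess can be checked quickly. The sketch below assumes every log line is identical to the sample above plus a single newline, and that the pipe holds exactly 4096 bytes; both are assumptions, not measurements.

```python
# Sample access-log line from above, plus the trailing newline (assumed "\n").
LOG_LINE = '127.0.0.1 - - [24/Feb/2020 12:01:23] "GET / HTTP/1.1" 200 -\n'
PIPE_BUFFER_SIZE = 4096  # assumed capacity of the unread stderr pipe

bytes_per_request = len(LOG_LINE)
# Requests whose log lines fit completely into the buffer.
requests_that_fit = PIPE_BUFFER_SIZE // bytes_per_request
# The next request is the first whose log write cannot complete.
first_blocking_request = requests_that_fit + 1

print(bytes_per_request, requests_that_fit, first_blocking_request)
```

Under these assumptions the 69th request is the first one whose log line no longer fits, which matches the observation.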

As a simple workaround we silence the test server on
Windows/Python2 so that the buffer does not fill up.
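
One generic way to keep a child process from blocking on an unread stderr pipe is to send its stderr to the null device instead. The snippet below is a sketch of that idea only and is not necessarily how the commit silences the test server; the inline child script is a stand-in for `simple_server.py`, and `subprocess.DEVNULL` requires Python 3.3 or later.

```python
import subprocess
import sys

# Stand-in child that writes far more than 4096 bytes to stderr.
child_code = "import sys; sys.stderr.write('x' * 100000)"

# With stderr discarded there is no pipe to fill, so the child cannot block.
process = subprocess.Popen([sys.executable, "-c", child_code],
                           stderr=subprocess.DEVNULL)
return_code = process.wait(timeout=30)

print(return_code)
```

Passing `stderr=subprocess.PIPE` instead is only safe when something actually reads from the pipe while the child runs.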
@lukpueh lukpueh mentioned this pull request Feb 24, 2020
3 tasks
@trishankatdatadog trishankatdatadog deleted the trishankatdatadog/correctly-rotate-root branch July 13, 2020 17:08