PYTHON-1350 Store IV along with encrypted text when using column-level encryption #1160

absurdfarce · 2023-05-20T06:36:40Z

No description provided.

…aching of default args

absurdfarce · 2023-05-20T06:39:02Z

tests/integration/standard/test_policies.py

@@ -161,3 +165,42 @@ def test_end_to_end_simple(self):
        (encrypted,unencrypted) = session.execute(prepared, [expected]).one()
        self.assertEquals(expected, encrypted)
        self.assertEquals(expected, unencrypted)
+
+    def test_end_to_end_different_cle_contexts(self):


When run manually this test fails (with the expected errors) without the rest of the changes in this PR and passes with them.

absurdfarce · 2023-05-20T06:45:24Z

cassandra/policies.py

-        self.mode = mode
+    # Fix block cipher mode for now.  IV size is a function of block cipher used
+    # so fixing this avoids (possibly unnecessary) validation logic here.
+    mode = modes.CBC


A change from the prior implementation: we fix the block cipher mode now. Doing so allows us to make clear assumptions about the expected size of the IV. We certainly can get around this other ways, but honestly the way we were previously exposing the block cipher mode wasn't great; it leaked the implementation badly since users who wanted to supply a non-default value had to do so via types from the cryptography module.

We can re-introduce the idea of making this configurable if this becomes an issue, but for an initial pass it seemed reasonably sane to constrain this in order to support ease of use. I'm certainly open to counter-arguments, though.

absurdfarce · 2023-05-21T04:18:34Z

cassandra/policies.py

-        cipher = self._get_cipher(coldesc)
+        iv = bytes[:AES256_BLOCK_SIZE_BYTES]
+        encrypted_bytes = bytes[AES256_BLOCK_SIZE_BYTES:]
+        cipher = self._get_cipher(coldesc, iv=iv)


Worth being explicit about this: the IV supplied via the policy constructor is never used for decrypted read values. We only use that when writing new data. Seems obvious given how the code is organized but may not be obvious to users... perhaps we should more explicitly call this out?

I think it should be fine.

JeremiahDJordan · 2023-06-01T16:14:37Z

cassandra/policies.py

-        cipher = self._get_cipher(coldesc)
+        iv = bytes[:AES256_BLOCK_SIZE_BYTES]
+        encrypted_bytes = bytes[AES256_BLOCK_SIZE_BYTES:]
+        cipher = self._get_cipher(coldesc, iv=iv)


I think it should be fine.

JeremiahDJordan · 2023-06-01T16:17:03Z

cassandra/policies.py

@@ -1286,11 +1293,13 @@ def encrypt(self, coldesc, obj_bytes):

        cipher = self._get_cipher(coldesc)
        encryptor = cipher.encryptor()
-        return encryptor.update(padded_bytes) + encryptor.finalize()
+        return self.iv + encryptor.update(padded_bytes) + encryptor.finalize()


I wonder if it should include the size of the IV as a prefix to the IV?

It's an excellent idea. We don't need it for this impl given that we're only supporting AES-256 and the IV for that cipher is a fixed size... but yeah, other impls will have different requirements. I'll add that in... nice catch!

After looking at this again I'm very much on the fence about whether we need to include the IV size here.

The IV size is a function of the block chaining mode and all modes supported by the current version of cryptography (with one exception) use an IV size that's equal to the block size of the cipher. We're in the impl for AES-256 so we know the block size will be 16 bytes. We might wind up with other policy impls based on other ciphers but with respect to the internals of this policy (which is the scope we're dealing with here) 16 bytes seems like a reasonable assumption. Only way that breaks down would seem to be one of the following:

We somehow make the mode configurable again and don't exclude GCM

We somehow make the mode configurable again and cryptography subsequently releases new versions which add some number of new modes which make use of IVs of different sizes and we don't exclude the new modes

These scenarios are not implausible necessarily, but I guess the question is whether we want to add a byte or two [1] to every encrypted value to future-proof them against changes like these. On the one hand we just added 16 bytes to each of these values to preserve the IV so a few more bytes doesn't seem like a big jump. On the other hand we could add flexibility while still guarding against the failure cases above by being more cautious if and when we make the mode configurable. For example, just white-listing the modes which use block-size IVs. We could expand that to, say, all modes that support an IV of 256 bytes or less if we want to add an additional byte for IV length but that set doesn't differ much from the set of modes that support an IV equal to the block size.

Thoughts?

[1] We'd actually need a lot more than this if we wanted full coverage; GCM can support IVs up to 2^64 - 1 bits

We're in the impl for AES-256 so we know the block size will be 16 bytes.

The only reason I can think of to do it is to fail fast if it's wrong, which means someone encrypted the data using a different class than AES-256. So maybe it is not worth adding bytes only to detect misconfigurations. I mean the users is going to get garbage out, or a decryption error out, if they did use some other function to encrypt the data. So maybe that is enough to clue them in their configuration is wrong.

I'm inclined to agree, I don't think using length as an indicator of whether we used the same impl to encrypt the data works. AES is a great example. AES-256 uses a 16 byte block size, but so does AES-128. A 1 byte IV size value would thus be reasonable for both (as discussed above) so it wouldn't give us any indicator as to whether we should use one or the other for a given blob.

There's perhaps a separate question there about whether we should provide some additional flags there, maybe a magic number of some kind to indicate which algorithm a given blob was encoded with. But since we have exactly one column encryption policy impl at the moment I'm not sure we know enough to pick the right design.

So I think we're saying go with what's in the PR now (IV in the blob, no leading IV size byte). Sound right to you @JeremiahDJordan ?

…sync_with_upstream_3.29.1 version 3.28.0 * tag '3.28.0' of https://github.com/datastax/python-driver: Release 3.28.0: changelog & version PYTHON-1352 Add vector type, codec + support for parsing CQL type (datastax#1161) Update docs.yaml to point to most recent 3.27.0 docs changes CONN-38 Notes for 3.27.0 on PYTHON-1350 (datastax#1166) PYTHON-1356 Create session-specific protocol handlers to contain session-specific CLE policies (datastax#1165) PYTHON-1350 Store IV along with encrypted text when using column-level encryption (datastax#1160) PYTHON-1351 Convert cryptography to an optional dependency (datastax#1164) Jenkinsfile cleanup (datastax#1163) PYTHON-1343 Use Cython for smoke builds (datastax#1162) Don't fail when inserting UDTs with prepared queries with some missing fields (datastax#1151) Revert "remove unnecessary import __future__ (datastax#1156)" docs: convert print statement to function in docs (datastax#1157) remove unnecessary import __future__ (datastax#1156) Update docs.yaml to include recent fixes to CLE docs Fix for rendering of code blocks in CLE documentation (datastax#1159) DOC-3278 Update comment for retry policy (datastax#1158) DOC-2813 (datastax#1145) Remove different build matrix selection for develop branches (datastax#1138)

…l encryption (datastax#1160)

absurdfarce added 5 commits May 19, 2023 20:36

Expand integration test to simulate the problem

d800a5d

Re-work default val for policy IV in order to avoid any issues with c…

10fb359

…aching of default args

Fix block cipher mode, add some tests for IV length, various other fixes

d378264

Explicitly make caching key a function of key + iv only

47dcda5

Store IV along with encrypted bytes and extract them when decrypting

f4add8b

absurdfarce marked this pull request as draft May 20, 2023 06:37

absurdfarce commented May 20, 2023

View reviewed changes

absurdfarce commented May 21, 2023

View reviewed changes

absurdfarce changed the title ~~PYTHON-1350: Store IV along with encrypted text when using column-level encryption~~ PYTHON-1350 Store IV along with encrypted text when using column-level encryption May 23, 2023

absurdfarce marked this pull request as ready for review May 23, 2023 20:40

absurdfarce requested a review from JeremiahDJordan May 31, 2023 20:26

JeremiahDJordan approved these changes Jun 1, 2023

View reviewed changes

Merge branch 'master' into python-1350

f13bba7

absurdfarce merged commit 531a6c7 into master Jun 2, 2023

absurdfarce deleted the python-1350 branch June 2, 2023 15:52

dkropachev pushed a commit to dkropachev/python-driver that referenced this pull request Aug 9, 2024

PYTHON-1350 Store IV along with encrypted text when using column-leve…

5fc0f8c

…l encryption (datastax#1160)

dkropachev pushed a commit to dkropachev/python-driver that referenced this pull request Aug 9, 2024

PYTHON-1350 Store IV along with encrypted text when using column-leve…

b2f4a5d

…l encryption (datastax#1160)

dkropachev pushed a commit to dkropachev/python-driver that referenced this pull request Aug 9, 2024

PYTHON-1350 Store IV along with encrypted text when using column-leve…

6a515d8

…l encryption (datastax#1160)

dkropachev pushed a commit to dkropachev/python-driver that referenced this pull request Aug 9, 2024

PYTHON-1350 Store IV along with encrypted text when using column-leve…

0127505

…l encryption (datastax#1160)

dkropachev pushed a commit to dkropachev/python-driver that referenced this pull request Aug 9, 2024

PYTHON-1350 Store IV along with encrypted text when using column-leve…

ec07845

…l encryption (datastax#1160)

dkropachev pushed a commit to dkropachev/python-driver that referenced this pull request Aug 9, 2024

PYTHON-1350 Store IV along with encrypted text when using column-leve…

16887c0

…l encryption (datastax#1160)

dkropachev pushed a commit to dkropachev/python-driver that referenced this pull request Aug 9, 2024

PYTHON-1350 Store IV along with encrypted text when using column-leve…

b23c3ef

…l encryption (datastax#1160)

dkropachev pushed a commit to dkropachev/python-driver that referenced this pull request Aug 9, 2024

PYTHON-1350 Store IV along with encrypted text when using column-leve…

4124125

…l encryption (datastax#1160)

dkropachev pushed a commit to dkropachev/python-driver that referenced this pull request Aug 15, 2024

PYTHON-1350 Store IV along with encrypted text when using column-leve…

030466b

…l encryption (datastax#1160)

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

PYTHON-1350 Store IV along with encrypted text when using column-level encryption #1160

PYTHON-1350 Store IV along with encrypted text when using column-level encryption #1160

absurdfarce commented May 20, 2023

absurdfarce May 20, 2023

absurdfarce May 20, 2023

absurdfarce May 21, 2023

JeremiahDJordan Jun 1, 2023

JeremiahDJordan Jun 1, 2023

JeremiahDJordan Jun 1, 2023

absurdfarce Jun 1, 2023

absurdfarce Jun 1, 2023

JeremiahDJordan Jun 1, 2023

absurdfarce Jun 2, 2023

JeremiahDJordan Jun 2, 2023

PYTHON-1350 Store IV along with encrypted text when using column-level encryption #1160

PYTHON-1350 Store IV along with encrypted text when using column-level encryption #1160

Conversation

absurdfarce commented May 20, 2023

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment