Add encryption infrastructure#1365
Conversation
|
May need changes to support BYOK. |
|
Needs a design doc. |
0b8a08f to
07e0e00
Compare
There was a problem hiding this comment.
Pull request overview
Adds foundational encryption plumbing to the server so partitions can persist a payload codec configuration (including a wrapped DEK) and later materialize an encrypt/decrypt codec via a KMS client abstraction.
Changes:
- Introduces a
KMSCryptoClientabstraction plus a KMS-envelope codec loader to unwrap DEKs and construct codecs. - Adds an AES-GCM payload codec and extends payload codec config models to support AES-GCM + base64url-encoded binary fields.
- Adds
cryptographyas a dependency and includes unit tests for AES-GCM, config round-tripping, and loader behavior.
Reviewed changes
Copilot reviewed 10 out of 11 changed files in this pull request and generated 4 comments.
Show a summary per file
| File | Description |
|---|---|
| uv.lock | Locks cryptography dependency needed for AES-GCM implementation. |
| packages/server/pyproject.toml | Adds cryptography to server runtime dependencies. |
| packages/server/src/memmachine_server/common/payload_codec/payload_codec_config.py | Adds KMS-envelope/AES-GCM codec config models + base64url (de)serialization. |
| packages/server/src/memmachine_server/common/payload_codec/kms_envelope_payload_codec_loader.py | Adds loader that unwraps DEKs via KMS and materializes codecs. |
| packages/server/src/memmachine_server/common/payload_codec/aes_gcm_payload_codec.py | Implements AES-GCM encode/decode codec using cryptography. |
| packages/server/src/memmachine_server/common/payload_codec/init.py | Exports new loader/config types. |
| packages/server/src/memmachine_server/common/kms/kms_crypto_client.py | Adds abstract KMS crypto client interface (encrypt/decrypt). |
| packages/server/src/memmachine_server/common/kms/init.py | Exposes KMS abstractions via package exports. |
| packages/server/server_tests/memmachine_server/common/payload_codec/test_payload_codec.py | Adds AES-GCM round-trip + wrong-AAD failure test. |
| packages/server/server_tests/memmachine_server/common/payload_codec/test_payload_codec_config.py | Adds AES-GCM config JSON round-trip and base64url assertions. |
| packages/server/server_tests/memmachine_server/common/payload_codec/test_kms_envelope_payload_codec_loader.py | Adds loader tests (success + unsupported config). |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
sscargal
left a comment
There was a problem hiding this comment.
Summary
Nice, well-scoped infrastructure addition — clean ABCs, sensible AAD propagation, and the loader's match pattern leaves room for future codec types. CI is green across all platforms and the happy-path tests are correct.
Requesting changes for a few items not already covered by @copilot's notes. Copilot has already flagged the nonce_size validation gap (twice) and the permissive urlsafe_b64decode behavior; those still apply and aren't repeated here.
Blocking
- GCM auth tag length check missing in
decode— see inline onaes_gcm_payload_codec.py:39. Currently conflates truncated buffers with active tampering. NotImplementedErroris the wrong exception type for an unsupported runtime config — see inline onkms_envelope_payload_codec_loader.py:45.binascii.Errorfromurlsafe_b64decodeis not wrapped withfrom err— see inlines onpayload_codec_config.py:36and:51.AGENTS.mdL94 requiresraise ... from errchaining.
Suggestions
KMSEnvelopePayloadCodecConfigis meant as an abstract base but is instantiable and has notypediscriminator; the test attest_kms_envelope_payload_codec_loader.py:80shows this leaks. See inline onpayload_codec_config.py:23.- Test gaps for defensive branches the author explicitly added (short-payload guard, tampered-ciphertext distinct from AD mismatch,
associated_data=Nonepath). See inlines on the test files.
Tests
Existing tests cover round-trip + AD-mismatch + unsupported-config rejection — the right baseline shape. The PR checklist has I have added unit tests unticked despite tests existing — likely just an oversight.
Local checks
Skipped local ruff / ty re-run. CI ran both across Python 3.12/3.13/3.14 on Linux/macOS/Windows and all 40+ checks passed.
| if len(value) < self._nonce_size: | ||
| raise ValueError("Encrypted payload is too short") |
There was a problem hiding this comment.
The length check guards against a missing nonce, but AES-GCM also requires a 16-byte auth tag. A 12–27 byte buffer passes this check and then surfaces as an opaque InvalidTag from AESGCM.decrypt, which conflates truncated ciphertext with active tampering in logs/Sentry.
Suggest:
if len(value) < self._nonce_size + 16:
raise ValueError("Encrypted payload is too short (missing nonce or auth tag)")The GCM tag is fixed at 16 bytes for AES-128/192/256.
| raise NotImplementedError( | ||
| f"Unsupported KMS envelope payload codec config: " | ||
| f"{type(config).__name__}" |
There was a problem hiding this comment.
NotImplementedError is conventionally for unimplemented abstract methods / unfinished features. An unsupported runtime config is a caller/data error — TypeError (matches Python's own behavior for unmatched match over types) or ValueError is more appropriate, so callers can distinguish the loader is incomplete from you passed a bogus config.
The test at test_kms_envelope_payload_codec_loader.py:88-89 would need a matching update.
| if isinstance(value, bytes): | ||
| return value | ||
| if isinstance(value, str): | ||
| return urlsafe_b64decode(value.encode("ascii")) |
There was a problem hiding this comment.
Complements the Copilot note above: when urlsafe_b64decode rejects malformed input it raises binascii.Error, which is currently re-raised unwrapped. AGENTS.md L94 requires raise ... from err chaining so the original cause is preserved. Suggest:
import binascii
if isinstance(value, str):
try:
return urlsafe_b64decode(value.encode("ascii"))
except (binascii.Error, ValueError) as e:
raise ValueError("wrapped_dek is not valid base64url") from eSame pattern needed for associated_data at line 51.
|
|
||
|
|
||
| PayloadCodecConfigUnion = PlaintextPayloadCodecConfig | ||
| class KMSEnvelopePayloadCodecConfig(BaseModel): |
There was a problem hiding this comment.
KMSEnvelopePayloadCodecConfig is intended as an abstract base (no type discriminator) but is freely instantiable — test_kms_envelope_payload_codec_loader.py:80-96 confirms this by constructing an UnknownKMSEnvelopePayloadCodecConfig directly.
A couple of options:
- If genuinely abstract: make it
ABC(or atyping.Protocol) and move the base64 validators into a shared mixin. Bare instantiation should fail. - If purely structural: consider composition — give each concrete config an
envelope: KMSEnvelopeParamsfield instead of inheritance. That flattens the discriminated union and removes the abstract base in a union hazard.
Either way, consider model_config = ConfigDict(frozen=True) — config objects holding (wrapped) key material should be immutable.
There was a problem hiding this comment.
Composition is probably better than inheritance here from a design pattern perspective, but it will make config processing more complicated.
| b"0" * 32, | ||
| associated_data=b"partition:block", | ||
| ) | ||
| with pytest.raises(InvalidTag): |
There was a problem hiding this comment.
Three defensive branches the codec adds are currently untested — easy wins:
nonce_size <= 0constructor guard (codec line 24-25)- short-payload
ValueErrorbranch (codec line 39-40) - tampered ciphertext distinct from AD mismatch — the existing
wrong_codectest only varies the AAD, so a regression that silently disabled GCM auth-tag verification while leaving AAD intact would still pass. Flipping a byte in the ciphertext (buf = bytearray(encoded); buf[20] ^= 1) and assertingInvalidTagwould catch it.
| key_ref="partition_key", | ||
| wrapped_dek=b"wrapped-dek-bytes", | ||
| nonce_size=12, | ||
| associated_data=b"partition:context", |
There was a problem hiding this comment.
associated_data=None is never exercised in the codec, config, or loader tests. Both _decode_associated_data / _serialize_associated_data None branches and AESGCM.encrypt(...) with aad=None are unverified. A second loader test with associated_data=None would close this — it's the path most production callers will hit first if they forget to pass AAD.
Signed-off-by: Edwin Yu <edwinyyyu@gmail.com>
Signed-off-by: Edwin Yu <edwinyyyu@gmail.com>
Signed-off-by: Edwin Yu <edwinyyyu@gmail.com>
Signed-off-by: Edwin Yu <edwinyyyu@gmail.com>
Purpose of the change
For supporting sensitive data encryption.
The approach to encrypting in SQL will be to store codec config (including wrapped DEK) in database.
Description
Based on #1375 for easier rebasing, but really only creates stuff for encryption.
DOES NOT HANDLE key creation/rotation.
Alternative is encryption at rest but database sees the plaintext data.
KMSCryptoClient may get implementations with the following backends:
Type of change
How Has This Been Tested?
Checklist
Maintainer Checklist