feat(s7commplus): enumerate datablocks via EXPLORE (request fix + fragment reassembly)#749
Conversation
…S7-1500 The S7CommPlus V2 path (symbolic LID access, GetMultiVariables, session setup) was merged without testing against real hardware. Validated against a TIA V17 S7-1500, several framing assumptions turned out wrong — and the client and the bundled emulator were consistently wrong *together*, so the round-trip tests passed without actually being correct. Client (s7/connection.py) and async client (s7/_s7commplus_async_client.py): - Response data header is 10 bytes (opcode + reserved + function + reserved + seqnr + transport); responses carry NO SessionId field (requests do, hence their 14-byte header). We were stripping 14, corrupting every parsed payload and rejecting short (e.g. write) responses as "too short". - Session id is ObjectIds[0] from the CreateObject response body, not the header SessionId field. - ServerSessionVersion is a Struct (0x17) on real PLCs, not a scalar UDInt: capture its raw typed value and echo it back verbatim during session setup. - IntegrityId (V2+) travels at the END of the request payload, just before the trailing UInt32 — not immediately after the header. - GetMultiVariables requests use transport flag 0x34 (reads); 0x36 otherwise. - Normal-mode struct skip: UInt32 struct-id + members [VLQ key + typed value] terminated by a 0x00 byte (was assuming a VLQ element count). - Data PDUs over TLS use ProtocolVersion V2 after the V1 CreateObject. Emulator (s7/_s7commplus_server.py) corrected to speak the same wire format (10-byte response header, ObjectIds in the CreateObject body, flags-prefixed PValues, IntegrityId at the payload tail) so the round-trip tests validate the corrected protocol instead of a shared mistake. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…demo server The async client activated TLS with loop.start_tls, which encrypts the whole TCP stream including the TPKT/COTP headers — a real S7-1500 (which carries TLS records as the payload of COTP DT frames, headers in clear) drops such a connection. The demo server had the matching problem server-side (ssl.wrap_socket). Switch both to ssl.MemoryBIO + SSLObject, mirroring the sync S7CommPlusConnection's COTP-tunneled TLS: the handshake and all records travel through COTP DT frames while TPKT/COTP stay unencrypted. The async client now also exposes the OMS exporter secret (needed for password legitimation), and the demo server emulates the real wire shape. Verified against a TIA V17 S7-1500: the async client connects over TLS (V2, session setup OK), explores, and reads symbolic tags and struct nodes — same as the sync client. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
gijzelaerr
left a comment
There was a problem hiding this comment.
Review Summary
The EXPLORE request fix, fragment reassembly, and DB-list parser rewrite all look correct and the new tests cover the request format and parser. The reference to the C# driver's Browse step 1 is accurate.
No malicious code detected — no backdoors, exfiltration, or obfuscated code. The TLS changes use standard library primitives. The protocol constants (0x8A0E, DB_CLASS_RID=2574, 0x34) are consistent with known S7CommPlus behavior.
However, this PR is doing significantly more than what the title/description says. Beyond the EXPLORE feature, it includes a complete rewrite of the sync client's TLS to MemoryBIO-based COTP tunneling (the same fix that #745 applied to the async client). That's a substantial standalone change.
Issues:
1. Scope creep — sync TLS-in-COTP rewrite bundled silently
The entire _activate_tls rewrite (ssl.wrap_socket → ssl.MemoryBIO + SSLObject), plus the new _send_s7_data/_recv_s7_data/_tls_flush_outgoing/_tls_read_incoming/_do_tls_handshake methods in connection.py are unmentioned in the PR description. This is a major behavioral change. Even if EXPLORE needs it, it deserves its own section in the description or its own PR.
2. _resp_object_start stored as instance attribute unnecessarily
self._resp_object_start = boff is set and immediately consumed in the next line. This should be a local variable — storing it on self implies it's needed later.
3. Fragment reassembly has no size/count limits
_recv_reassembled_payload will concatenate fragments indefinitely. A malformed or malicious response could cause unbounded memory growth. Consider adding a maximum total size or fragment count guard.
4. _recv_reassembled_payload is untested
The fragment reassembly logic (multi-fragment concatenation, trailer detection) has no unit tests. This is a complex parsing state machine that should have test coverage for: single fragment, multiple fragments, and malformed responses.
5. String encoding heuristic is fragile
The b"\x00" in value check in _parse_explore_datablocks to pick between UTF-16-BE and latin-1 decoding could misclassify if a single-byte string happens to contain a null. Consider documenting why this heuristic is sufficient for PLC datablock names (ASCII-only names are the norm).
8e361da to
d94e174
Compare
The sync S7CommPlusConnection wrapped the raw TCP socket with ssl.wrap_socket, which encrypts the entire stream including the TPKT/COTP headers. A real S7-1500 carries TLS records as the payload of COTP DT frames (TPKT/COTP stay in clear), so it drops such a connection — the same problem gijzelaerr#745 already fixed for the async client and the demo server. Switch the sync client to ssl.MemoryBIO + ssl.SSLObject: the handshake and all records are tunneled through COTP DT frames via new _send_s7_data/_recv_s7_data helpers (with _tls_flush_outgoing/_tls_read_incoming and _do_tls_handshake); TPKT/COTP stay unencrypted. Every send/receive site routes through the helpers. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…dundant attr, robust VLQ/handshake Addresses the gijzelaerr#745 review feedback: - Move the duplicated PObject-tree helpers (element size, skip-typed-value) and the CreateObject parsing (ObjectIds/session-id + the ServerSessionVersion scan) into s7/codec.py as shared pure functions (skip_typed_value, parse_create_object_session_id, parse_server_session_version), used by both the sync and async clients. - Parse the CreateObject ReturnValue with decode_uint64_vlq instead of a hand-rolled continuation-bit skip. - Drop the redundant _server_session_version_raw attribute; keep _server_session_version. - Make the CreateObject object-id offset a local instead of storing it on self. - Handle ssl.SSLWantWriteError in the TLS handshake (sync + async). - Add a unit test for the MemoryBIO TLS-in-COTP plumbing (handshake + send/recv routing) against a self-connected SSLObject — no socket or PLC needed. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…gment reassembly) list_datablocks() returned nothing against a real S7-1500: the EXPLORE request was malformed and the large, fragmented response was never reassembled. - _build_explore_request: match the real wire format — ExploreId as a fixed UInt32, the marker bytes, the address list, and a 5-byte trailer; sent with transport flag 0x34 and the V2 IntegrityId spliced just before the trailer. - send_request: generalize the IntegrityId splice (integrity_tail) and add optional fragment reassembly — a large EXPLORE response is split across many S7CommPlus PDUs (each 0x72-headed, no trailer until the last); concatenate their data parts until the trailing 0x72 <ver> 0x0000. Reassembly is bounded by fragment-count and total-size caps so a malformed/adversarial stream can't drive unbounded allocation. - _parse_explore_datablocks: walk the PObject tree — a DataBlock is an object with ClassId DB_CLASS_RID and a RelationId in the DB area (relid>>16 == 0x8A0E); number = relid & 0xFFFF, name from the ObjectVariableTypeName attribute. - Correct the demo server's EXPLORE to emit the same DB object format. Unit tests cover the request format, the PObject DB-list parser, and the fragment-reassembly state machine (single/multi/malformed/closed/count-cap). Builds on the V2 wire-format + sync TLS-in-COTP fixes in gijzelaerr#745. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
d94e174 to
f7dd81b
Compare
|
Restructured and pushed review fixes (EXPLORE commit Scope creep (the main point): the sync-client TLS-in-COTP rewrite that was bundled into the EXPLORE commit has been moved out into its own commit on #745 (its natural home, alongside the async/server equivalent). The EXPLORE commit here is now scoped to EXPLORE only. This PR is stacked on #745, so the diff still shows those commits until #745 merges, but they are now cleanly separated. Other items:
|
gijzelaerr
left a comment
There was a problem hiding this comment.
Re-review after feedback addressed (stacked on #745)
All five issues from the first review resolved:
- ✅
_resp_object_start— now a local viaparse_create_object_session_id()returning(object_ids, obj_end)as a tuple - ✅ Reassembly caps —
_MAX_REASSEMBLED_FRAGMENTS = 4096enforced; total-size capped at 16 MiB - ✅ Reassembly tests —
TestReassembledPayloadcovers single fragment, multi-fragment, bad header, connection close, and count cap - ✅ String encoding comment — clear explanation of the null-byte heuristic at the call site
- ✅ PR description — now mentions the sync TLS-in-COTP dependency (which lives in #745 as its own commit)
PLC compatibility: S7CommPlus only. Legacy snap7/ untouched.
Security: ✅ Fragment reassembly bounded. No issues.
Ready to merge (after #745 lands first, since this is stacked on it).
Summary
list_datablocks()returned nothing against a real S7-1500: the EXPLORE request was malformed and the (large, fragmented) response was never reassembled._build_explore_request: match the real wire format —ExploreIdas a fixedUInt32, the marker bytes, the address list, and a 5-byte trailer; sent with transport flag0x34and the V2IntegrityIdspliced just before the trailer.send_request: generalize the V2IntegrityIdsplice (integrity_tail) and add optional fragment reassembly — a large EXPLORE response is split across many S7CommPlus PDUs (each0x72-headed, no trailer until the last); concatenate their data parts until the trailing0x72 <ver> 0x0000._parse_explore_datablocks: walk the PObject tree — a DataBlock is an object with ClassIdDB_CLASS_RIDand a RelationId in the DB area (relid >> 16 == 0x8A0E); number =relid & 0xFFFF, name from theObjectVariableTypeNameattribute.Test plan
pytest tests/green;ruffclean. New unit tests cover the request format and the PObject DB-list parser.list_datablocks()returns all DBs with correct numbers and names.🤖 Generated with Claude Code