Skip to content

feat(s7commplus): enumerate datablocks via EXPLORE (request fix + fragment reassembly)#749

Merged
gijzelaerr merged 5 commits into
gijzelaerr:masterfrom
gridsociety:s7commplus-explore-datablocks
Jun 18, 2026
Merged

feat(s7commplus): enumerate datablocks via EXPLORE (request fix + fragment reassembly)#749
gijzelaerr merged 5 commits into
gijzelaerr:masterfrom
gridsociety:s7commplus-explore-datablocks

Conversation

@ale-rinaldi

@ale-rinaldi ale-rinaldi commented Jun 18, 2026

Copy link
Copy Markdown
Contributor

Summary

list_datablocks() returned nothing against a real S7-1500: the EXPLORE request was malformed and the (large, fragmented) response was never reassembled.

  • _build_explore_request: match the real wire format — ExploreId as a fixed UInt32, the marker bytes, the address list, and a 5-byte trailer; sent with transport flag 0x34 and the V2 IntegrityId spliced just before the trailer.
  • send_request: generalize the V2 IntegrityId splice (integrity_tail) and add optional fragment reassembly — a large EXPLORE response is split across many S7CommPlus PDUs (each 0x72-headed, no trailer until the last); concatenate their data parts until the trailing 0x72 <ver> 0x0000.
  • _parse_explore_datablocks: walk the PObject tree — a DataBlock is an object with ClassId DB_CLASS_RID and a RelationId in the DB area (relid >> 16 == 0x8A0E); number = relid & 0xFFFF, name from the ObjectVariableTypeName attribute.
  • Correct the demo server's EXPLORE to emit the same DB object format.

Test plan

  • pytest tests/ green; ruff clean. New unit tests cover the request format and the PObject DB-list parser.
  • Live, against a real S7-1500: list_datablocks() returns all DBs with correct numbers and names.

Stacked on #745 — the EXPLORE response parsing relies on the corrected 10-byte response framing and the sync TLS-in-COTP transport (now a dedicated commit on #745). Full per-tag browse() (parsing the type-info tree) is a planned follow-up.

🤖 Generated with Claude Code

claude added 2 commits June 18, 2026 01:00
…S7-1500

The S7CommPlus V2 path (symbolic LID access, GetMultiVariables, session
setup) was merged without testing against real hardware. Validated against
a TIA V17 S7-1500, several framing assumptions turned out wrong — and the
client and the bundled emulator were consistently wrong *together*, so the
round-trip tests passed without actually being correct.

Client (s7/connection.py) and async client (s7/_s7commplus_async_client.py):
- Response data header is 10 bytes (opcode + reserved + function + reserved
  + seqnr + transport); responses carry NO SessionId field (requests do,
  hence their 14-byte header). We were stripping 14, corrupting every parsed
  payload and rejecting short (e.g. write) responses as "too short".
- Session id is ObjectIds[0] from the CreateObject response body, not the
  header SessionId field.
- ServerSessionVersion is a Struct (0x17) on real PLCs, not a scalar UDInt:
  capture its raw typed value and echo it back verbatim during session setup.
- IntegrityId (V2+) travels at the END of the request payload, just before
  the trailing UInt32 — not immediately after the header.
- GetMultiVariables requests use transport flag 0x34 (reads); 0x36 otherwise.
- Normal-mode struct skip: UInt32 struct-id + members [VLQ key + typed value]
  terminated by a 0x00 byte (was assuming a VLQ element count).
- Data PDUs over TLS use ProtocolVersion V2 after the V1 CreateObject.

Emulator (s7/_s7commplus_server.py) corrected to speak the same wire format
(10-byte response header, ObjectIds in the CreateObject body, flags-prefixed
PValues, IntegrityId at the payload tail) so the round-trip tests validate
the corrected protocol instead of a shared mistake.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…demo server

The async client activated TLS with loop.start_tls, which encrypts the whole TCP
stream including the TPKT/COTP headers — a real S7-1500 (which carries TLS records
as the payload of COTP DT frames, headers in clear) drops such a connection. The
demo server had the matching problem server-side (ssl.wrap_socket).

Switch both to ssl.MemoryBIO + SSLObject, mirroring the sync S7CommPlusConnection's
COTP-tunneled TLS: the handshake and all records travel through COTP DT frames while
TPKT/COTP stay unencrypted. The async client now also exposes the OMS exporter secret
(needed for password legitimation), and the demo server emulates the real wire shape.

Verified against a TIA V17 S7-1500: the async client connects over TLS (V2, session
setup OK), explores, and reads symbolic tags and struct nodes — same as the sync client.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

@gijzelaerr gijzelaerr left a comment

Copy link
Copy Markdown
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Review Summary

The EXPLORE request fix, fragment reassembly, and DB-list parser rewrite all look correct and the new tests cover the request format and parser. The reference to the C# driver's Browse step 1 is accurate.

No malicious code detected — no backdoors, exfiltration, or obfuscated code. The TLS changes use standard library primitives. The protocol constants (0x8A0E, DB_CLASS_RID=2574, 0x34) are consistent with known S7CommPlus behavior.

However, this PR is doing significantly more than what the title/description says. Beyond the EXPLORE feature, it includes a complete rewrite of the sync client's TLS to MemoryBIO-based COTP tunneling (the same fix that #745 applied to the async client). That's a substantial standalone change.

Issues:

1. Scope creep — sync TLS-in-COTP rewrite bundled silently

The entire _activate_tls rewrite (ssl.wrap_socketssl.MemoryBIO + SSLObject), plus the new _send_s7_data/_recv_s7_data/_tls_flush_outgoing/_tls_read_incoming/_do_tls_handshake methods in connection.py are unmentioned in the PR description. This is a major behavioral change. Even if EXPLORE needs it, it deserves its own section in the description or its own PR.

2. _resp_object_start stored as instance attribute unnecessarily

self._resp_object_start = boff is set and immediately consumed in the next line. This should be a local variable — storing it on self implies it's needed later.

3. Fragment reassembly has no size/count limits

_recv_reassembled_payload will concatenate fragments indefinitely. A malformed or malicious response could cause unbounded memory growth. Consider adding a maximum total size or fragment count guard.

4. _recv_reassembled_payload is untested

The fragment reassembly logic (multi-fragment concatenation, trailer detection) has no unit tests. This is a complex parsing state machine that should have test coverage for: single fragment, multiple fragments, and malformed responses.

5. String encoding heuristic is fragile

The b"\x00" in value check in _parse_explore_datablocks to pick between UTF-16-BE and latin-1 decoding could misclassify if a single-byte string happens to contain a null. Consider documenting why this heuristic is sufficient for PLC datablock names (ASCII-only names are the norm).

Comment thread s7/connection.py
Comment thread s7/connection.py
Comment thread s7/_s7commplus_client.py
claude added 2 commits June 18, 2026 10:57
The sync S7CommPlusConnection wrapped the raw TCP socket with ssl.wrap_socket,
which encrypts the entire stream including the TPKT/COTP headers. A real S7-1500
carries TLS records as the payload of COTP DT frames (TPKT/COTP stay in clear),
so it drops such a connection — the same problem gijzelaerr#745 already fixed for the async
client and the demo server.

Switch the sync client to ssl.MemoryBIO + ssl.SSLObject: the handshake and all
records are tunneled through COTP DT frames via new _send_s7_data/_recv_s7_data
helpers (with _tls_flush_outgoing/_tls_read_incoming and _do_tls_handshake);
TPKT/COTP stay unencrypted. Every send/receive site routes through the helpers.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…dundant attr, robust VLQ/handshake

Addresses the gijzelaerr#745 review feedback:
- Move the duplicated PObject-tree helpers (element size, skip-typed-value) and the
  CreateObject parsing (ObjectIds/session-id + the ServerSessionVersion scan) into
  s7/codec.py as shared pure functions (skip_typed_value, parse_create_object_session_id,
  parse_server_session_version), used by both the sync and async clients.
- Parse the CreateObject ReturnValue with decode_uint64_vlq instead of a hand-rolled
  continuation-bit skip.
- Drop the redundant _server_session_version_raw attribute; keep _server_session_version.
- Make the CreateObject object-id offset a local instead of storing it on self.
- Handle ssl.SSLWantWriteError in the TLS handshake (sync + async).
- Add a unit test for the MemoryBIO TLS-in-COTP plumbing (handshake + send/recv routing)
  against a self-connected SSLObject — no socket or PLC needed.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…gment reassembly)

list_datablocks() returned nothing against a real S7-1500: the EXPLORE request
was malformed and the large, fragmented response was never reassembled.

- _build_explore_request: match the real wire format — ExploreId as a fixed
  UInt32, the marker bytes, the address list, and a 5-byte trailer; sent with
  transport flag 0x34 and the V2 IntegrityId spliced just before the trailer.
- send_request: generalize the IntegrityId splice (integrity_tail) and add
  optional fragment reassembly — a large EXPLORE response is split across many
  S7CommPlus PDUs (each 0x72-headed, no trailer until the last); concatenate
  their data parts until the trailing 0x72 <ver> 0x0000. Reassembly is bounded
  by fragment-count and total-size caps so a malformed/adversarial stream can't
  drive unbounded allocation.
- _parse_explore_datablocks: walk the PObject tree — a DataBlock is an object
  with ClassId DB_CLASS_RID and a RelationId in the DB area (relid>>16 == 0x8A0E);
  number = relid & 0xFFFF, name from the ObjectVariableTypeName attribute.
- Correct the demo server's EXPLORE to emit the same DB object format.

Unit tests cover the request format, the PObject DB-list parser, and the
fragment-reassembly state machine (single/multi/malformed/closed/count-cap).

Builds on the V2 wire-format + sync TLS-in-COTP fixes in gijzelaerr#745.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
@ale-rinaldi ale-rinaldi force-pushed the s7commplus-explore-datablocks branch from d94e174 to f7dd81b Compare June 18, 2026 09:23
@ale-rinaldi

Copy link
Copy Markdown
Contributor Author

Restructured and pushed review fixes (EXPLORE commit f7dd81b):

Scope creep (the main point): the sync-client TLS-in-COTP rewrite that was bundled into the EXPLORE commit has been moved out into its own commit on #745 (its natural home, alongside the async/server equivalent). The EXPLORE commit here is now scoped to EXPLORE only. This PR is stacked on #745, so the diff still shows those commits until #745 merges, but they are now cleanly separated.

Other items:

  1. _resp_object_start — now a local (the session parsing moved to codec.parse_create_object_session_id in the fix(s7commplus): correct V2 symbolic reads on a real S7-1500 (wire format + async/server TLS-in-COTP) #745 base).
  2. Fragment reassembly caps — _recv_reassembled_payload now bounds fragment count and total size.
  3. Reassembly tests — added TestReassembledPayload (single / multi-split / bad-header / closed / cap).
  4. WString decoding heuristic — kept, with a comment explaining why the null-byte test is the right discriminator for this framing (see the inline thread).

@gijzelaerr gijzelaerr left a comment

Copy link
Copy Markdown
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Re-review after feedback addressed (stacked on #745)

All five issues from the first review resolved:

  1. _resp_object_start — now a local via parse_create_object_session_id() returning (object_ids, obj_end) as a tuple
  2. Reassembly caps_MAX_REASSEMBLED_FRAGMENTS = 4096 enforced; total-size capped at 16 MiB
  3. Reassembly testsTestReassembledPayload covers single fragment, multi-fragment, bad header, connection close, and count cap
  4. String encoding comment — clear explanation of the null-byte heuristic at the call site
  5. PR description — now mentions the sync TLS-in-COTP dependency (which lives in #745 as its own commit)

PLC compatibility: S7CommPlus only. Legacy snap7/ untouched.

Security: ✅ Fragment reassembly bounded. No issues.

Ready to merge (after #745 lands first, since this is stacked on it).

@gijzelaerr gijzelaerr merged commit fbd573f into gijzelaerr:master Jun 18, 2026
1 check passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants