feat: I/Q/M symbolic tag browse for S7-1200 FW V4.5 via reconstructed preset dictionary #756

Open
tommasofaedo wants to merge 1 commit into gijzelaerr:master from tommasofaedo:feat/browse-iqm-fdict

@tommasofaedo tommasofaedo commented Jun 25, 2026

browse_tags.py

## Problem

On S7-1200 firmware V4.5 (V3 protocol), EXPLORE requests for the I/Q/M
areas (RIDs 80/81/82) return a zlib stream compressed with a Siemens preset
dictionary (header `78 7D`, FDICT flag set, dict Adler-32 `0xce9b821b`).
Python's `zlib.decompress()` fails with `zlib.error` (zlib's `Z_NEED_DICT`,
error 2) because the dictionary is embedded in TIA Portal and not published
by Siemens.
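
The header fields quoted above can be checked by hand. The following is a minimal sketch based on the zlib container format (RFC 1950); `parse_zlib_header` is a hypothetical helper, not part of this patch, and the demo dictionary is made up for illustration.

```python
import zlib


def parse_zlib_header(buf: bytes) -> dict:
    """Decode the 2-byte zlib header and, if FDICT is set, the 4-byte DICTID."""
    cmf, flg = buf[0], buf[1]
    info = {
        "method": cmf & 0x0F,                       # 8 means deflate
        "window_bits": (cmf >> 4) + 8,              # 15 means a 32 KiB window
        "header_ok": ((cmf << 8) | flg) % 31 == 0,  # FCHECK rule from RFC 1950
        "fdict": bool(flg & 0x20),                  # preset dictionary required
        "dictid": None,
    }
    if info["fdict"]:
        # DICTID is the Adler-32 of the preset dictionary, big-endian.
        info["dictid"] = int.from_bytes(buf[2:6], "big")
    return info


# Header bytes as described in this PR: 78 7D followed by the DICTID.
hdr = parse_zlib_header(bytes.fromhex("787dce9b821b"))

# A locally built stream (made-up dictionary) behaves the same way:
# without the dictionary, zlib.decompress() raises zlib.error.
demo_dict = b"hypothetical preset dictionary"
comp = zlib.compressobj(zdict=demo_dict)
demo_stream = comp.compress(b"hypothetical preset payload") + comp.flush()
try:
    zlib.decompress(demo_stream)
    demo_failed = False
except zlib.error:
    demo_failed = True
```

Here `hdr["fdict"]` is true and `hdr["dictid"]` equals `0xce9b821b`, matching the values reported above.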

As a result, symbolic tag names, data types, logical addresses, and byte
offsets are unavailable via `browse()` for I/Q/M areas on V3 PLCs
(reported as a known limitation in PR #742 / the browse PR).

## Solution — oracle technique

We reconstructed 594 of the 32768 FDICT bytes using an "oracle" approach:
inflate the same blob four times with four synthetic test dictionaries
(all-zeros, all-0xFF, `i%256`, `i>>8`). A byte that is identical in all
four outputs is a literal; a byte that differs reveals the FDICT position
it was copied from: `position = (B_output << 8) | A_output`, where
`A_output` is the byte from the `i%256` run (low byte of the index) and
`B_output` is the byte from the `i>>8` run (high byte of the index).
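
The oracle idea can be reproduced on a synthetic stream. This is a sketch under stated assumptions, not the code in this patch: `oracle_positions`, the `secret` dictionary, and the tag text are all made up for illustration, and the stream is produced locally with `zlib.compressobj(zdict=...)` rather than captured from a PLC.

```python
import zlib


def oracle_positions(raw_deflate: bytes, dict_len: int) -> list:
    """For each output byte, return the dictionary index it was copied from,
    or None if the byte did not come from the dictionary."""
    probes = [
        bytes(dict_len),                            # all zeros
        b"\xff" * dict_len,                         # all 0xFF
        bytes(i % 256 for i in range(dict_len)),    # low byte of the index
        bytes(i >> 8 for i in range(dict_len)),     # high byte of the index
    ]
    zeros, ones, low, high = (
        zlib.decompressobj(wbits=-15, zdict=d).decompress(raw_deflate)
        for d in probes
    )
    # A dictionary byte always differs between the all-zeros and all-0xFF
    # runs, so z == o is enough to detect a literal.
    return [
        None if z == o else (h << 8) | l
        for z, o, l, h in zip(zeros, ones, low, high)
    ]


# Made-up "secret" dictionary and payload, standing in for the real FDICT.
secret = b'<Tag Name="" DataType="Bool" LogicalAddress="%IW" ID="">'.ljust(512, b".")
plain = b'<Tag Name="Start" DataType="Bool" LogicalAddress="%IW4" ID="9">'

comp = zlib.compressobj(level=5, zdict=secret)
stream = comp.compress(plain) + comp.flush()

# Skip the 2-byte header and 4-byte DICTID, then probe the raw deflate data.
positions = oracle_positions(stream[6:], len(secret))

# With known plaintext, each dictionary-derived byte fixes one FDICT byte.
recovered = {pos: plain[j] for j, pos in enumerate(positions) if pos is not None}
```

Every entry in `recovered` matches the corresponding byte of `secret`. Positions the compressor never references stay unknown, which is consistent with only part of the real dictionary being recoverable.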

The **same FDICT** (Adler-32 `0xce9b821b`) is used for all three areas
(I, Q, M) — confirmed on three independent Wireshark pcapng captures.

With 594 FDICT positions known, `_extract_tags()` anchors on always-literal
ID values and recovers Name/DataType/LogicalAddress/ByteOffset from a
context window before each ID.

### Byte-type fallback (I/Q areas)

LogicalAddress reconstruction by exhaustion:
- **Bool** tags → FDICT encodes `LogicalAddress="%I43.{bit}"` (garbled area
  letter, correct bit); reconstruct as `%{area}{ByteOffset}.{bit}`.
- **Word/Int** tags → `%IW` / `%QW` are literal in the blob; append
  ByteOffset to get `%IW{N}` / `%QW{N}`.
- **Byte** tags → only remaining type; oracle confirms LogicalAddress value
  is not encoded. Reconstruct as `%IB{ByteOffset}` / `%QB{ByteOffset}`.
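
The three cases above can be sketched as a single function. The name `reconstruct_logical_address` and its signature are hypothetical; the sketch only covers the I and Q areas described in this section.

```python
def reconstruct_logical_address(area, data_type, byte_offset, bit=None):
    """Rebuild a LogicalAddress string for an I or Q tag from its DataType."""
    if area not in ("I", "Q"):
        raise ValueError(f"unsupported area: {area!r}")
    if data_type == "Bool":
        # The bit number is recovered correctly; the rest is rebuilt.
        if bit is None:
            raise ValueError("Bool tags need a bit number")
        return f"%{area}{byte_offset}.{bit}"
    if data_type in ("Word", "Int"):
        # "%IW" / "%QW" is literal in the blob; append the byte offset.
        return f"%{area}W{byte_offset}"
    if data_type == "Byte":
        # Only remaining type: the address is not encoded at all.
        return f"%{area}B{byte_offset}"
    raise ValueError(f"unsupported data type: {data_type!r}")
```

For example, `reconstruct_logical_address("Q", "Word", 4)` gives `%QW4`, and a Bool tag at byte 0, bit 3 of the input area gives `%I0.3`.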

### Structural limit — M area (confirmed by pcapng oracle)

Oracle analysis of Wireshark captures of all 15 M area tags shows the
deflate stream uses an **identical sequence** for Bool, Byte, and Word
addresses. It is not possible to distinguish `%MB` from `%MW` from the
blob alone. The 6 affected tags have correct `ByteOffset` values but
`LogicalAddress = ?`.

## Results (192.168.5.11, S7-1200 CPU 1212C DC/DC/DC, FW V4.5)

| Area | RID | Tags found | Complete | Notes |
|------|-----|-----------|----------|-------|
| I    | 80  | 13/13     | ✅ 100%  | Name, DataType, LogicalAddress, ByteOffset all correct |
| Q    | 81  | 11/11     | ✅ 100%  | Same — includes custom names (0_output, 100_output, output_0_0) |
| M    | 82  | 15 total  | 9/15     | 6 Byte/Word gap tags: ByteOffset correct, LogicalAddress unknown |

Score vs TIA Portal export: **33/40 correct, 6 gap (structural limit), 0 wrong**.
The remaining tag is the one counted as pending under "Tested on".

## Changes

### New: `browse_tags.py`

Standalone script. Contains:

- `_build_fdict()` — builds the 32768-byte dict from 594 confirmed positions
- `_fetch_area(rid, fdict)` — connects to PLC, sends EXPLORE, decompresses
- `_extract_tags(data, area_prefix)` — regex extraction anchored on literal IDs
- `main()` — CLI: `python browse_tags.py [I] [Q] [M]`

Requires Patches 1, 5, 6 (SequenceNumber, multi-frame collect, session key)
to be already applied to `s7/connection.py` and `s7/_s7commplus_client.py`.
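
As an illustration of what `_build_fdict()` has to produce, here is a minimal sketch of building a sparse dictionary from known positions. `build_sparse_fdict`, the `known` mapping, and the `0x00` filler are assumptions made for this example; the actual table of 594 positions lives in the patch.

```python
FDICT_SIZE = 32768  # full 32 KiB deflate window


def build_sparse_fdict(known, size=FDICT_SIZE, fill=0x00):
    """Return a size-byte dictionary with known positions set, filler elsewhere."""
    buf = bytearray([fill]) * size
    for pos, value in known.items():
        if not 0 <= pos < size:
            raise ValueError(f"position out of range: {pos}")
        buf[pos] = value
    return bytes(buf)


# Two made-up positions, only to show the shape of the input.
demo = build_sparse_fdict({0: 0x25, 32767: 0x49})
```

A dictionary built this way will generally not hash to `0xce9b821b`, so it cannot be passed to a normal zlib-wrapped inflate, which verifies the dictionary's Adler-32 against the DICTID. Skipping the 6-byte header and inflating as raw deflate (`wbits=-15`) avoids that check.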

### `s7/_s7commplus_client.py` — add `browse_tags()` method

```python
def browse_tags(self, areas=('I', 'Q', 'M')) -> dict[str, list[dict]]:
    """Browse symbolic tags in I/Q/M areas using oracle-reconstructed FDICT.

    Returns a dict mapping area letter to list of tag dicts.
    Each tag dict: {Name, DataType, LogicalAddress, ByteOffset, ID}.
    LogicalAddress may be '?' for M-area Byte/Word tags (structural limit).
    """
    import zlib

    from ._browse_fdict import _build_fdict, _extract_tags

    area_rids = {'I': 80, 'Q': 81, 'M': 82}
    fdict = _build_fdict()
    result = {}
    for area in areas:
        rid = area_rids[area]
        payload = _build_explore_payload_v3(rid)
        first = self._connection.send_request(FunctionCode.EXPLORE, payload)
        raw = self._connection._collect_explore_frames(first)
        # Locate the zlib header (78 7D, FDICT flag set) in the response.
        p = raw.find(b'\x78\x7d')
        if p < 0:
            result[area] = []
            continue
        try:
            # Skip the 2-byte header and 4-byte DICTID and inflate as raw
            # deflate, so zlib does not verify the Adler-32 of the partial dict.
            data = zlib.decompressobj(wbits=-15, zdict=fdict).decompress(raw[p + 6:])
        except zlib.error:
            result[area] = []
            continue
        result[area] = _extract_tags(data, area_prefix='%' + area)
    return result
```

## Tested on

- PLC: Siemens S7-1200 CPU 1212C DC/DC/DC
- Firmware: V4.5
- Protocol: V3 (no TLS, no password)
- Tag count: 40 tags in TIA Portal (13 I, 11 Q, 15 M + 1 pending)
- Verified against: TIA Portal export (Full_List_PLC_Tags.xlsx)

## Known limitation

The 6 M-area gap tags (Tag_5/11/16/18/20/22, all Byte or Word type) cannot
have their LogicalAddress recovered from the blob alone. ByteOffset is
always correct. A hardcoded lookup table or a separate READ-based DataType
probe could resolve this, but both approaches are project-specific and are
not included in this patch.

