fix(streaming): prevent glm-4.5 XML auto-detect from emitting Hermes tool_call JSON blob as function name (#9722) #9940

Open
Dennisadira wants to merge 1 commit into mudler:master from Dennisadira:fix/streaming-xml-false-positive-hermes

@Dennisadira
Contributor

Summary

Closes #9722.

When /v1/chat/completions streams a response from a Hermes/NousResearch-format model (output looks like <tool_call>\n{"name":"bash","arguments":{...}}\n</tool_call>), the Go-side XML auto-detector incorrectly picks up the glm-4.5 format and treats the entire JSON blob as the function name.

Root cause: ParseXMLIterative with format=nil tries every XML preset in order. glm-4.5 uses <tool_call> as ToolStart (not a ScopeStart). When it finds <tool_call> but no <arg_key> inside, it falls into the empty-tool-call path that extracts everything between <tool_call> and </tool_call> as the function name. For Hermes output that content is a full JSON object, so Name ends up as '{"name":"bash","arguments":{...}}'. Because the XML branch returned a non-empty result, the JSON fallback (ParseJSONIterative) — which correctly handles Hermes — was suppressed, and the client received a malformed streaming chunk.
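The false positive described above can be illustrated with a deliberately simplified sketch. It is not the code in pkg/functions: `naiveGLMParse` is a hypothetical stand-in for the glm-4.5 preset, and `FuncCallResult` is reduced to the two fields mentioned in this description, both assumed to be strings.

```go
package main

import (
	"fmt"
	"strings"
)

// FuncCallResult mirrors only the two fields named in the PR description.
// Assumption: the real type in pkg/functions may look different.
type FuncCallResult struct {
	Name      string
	Arguments string
}

// naiveGLMParse is a hypothetical, heavily simplified glm-4.5-style parser:
// <tool_call> opens a call and <arg_key> would introduce its arguments.
func naiveGLMParse(s string) (FuncCallResult, bool) {
	const (
		toolStart = "<tool_call>"
		toolEnd   = "</tool_call>"
		argKey    = "<arg_key>"
	)
	start := strings.Index(s, toolStart)
	if start < 0 {
		return FuncCallResult{}, false
	}
	body := s[start+len(toolStart):]
	end := strings.Index(body, toolEnd)
	if end < 0 {
		return FuncCallResult{}, false
	}
	body = body[:end]

	if i := strings.Index(body, argKey); i >= 0 {
		// Name is the text before the first <arg_key>.
		// Argument parsing is omitted in this sketch.
		return FuncCallResult{Name: strings.TrimSpace(body[:i]), Arguments: "{}"}, true
	}
	// "Empty tool call" path: with no <arg_key>, everything between the
	// tags is taken as the function name.
	return FuncCallResult{Name: strings.TrimSpace(body), Arguments: "{}"}, true
}

func main() {
	hermes := "<tool_call>\n{\"name\": \"bash\", \"arguments\": {\"script\": \"ls\"}}\n</tool_call>"
	res, ok := naiveGLMParse(hermes)
	fmt.Printf("matched=%v name=%q args=%s\n", ok, res.Name, res.Arguments)
}
```

Running the sketch on Hermes-style input reports a match whose name is the whole JSON object, which is the shape of result that suppressed the JSON fallback.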

Fix

Added filterMalformedXMLToolCalls() that discards any auto-detected FuncCallResult whose Name starts with {. Applied in all three auto-detect branches of ParseXMLIterative:

  1. tryParseXMLFromScopeStart loop (fast path)
  2. TryConsumeXMLToolCalls loop (fallback path)
  3. Partial-exception recovery in the same loop

When all results from a format are filtered, the loop continues to the next format. Hermes output falls through all XML formats and is picked up correctly by ParseJSONIterative.
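A minimal sketch of the filter and the fall-through behaviour, under stated assumptions: `FuncCallResult` is reduced to two string fields, `formatParser` and `autoDetectXML` are hypothetical names standing in for the per-format loop inside ParseXMLIterative, and trimming whitespace before the prefix check is a choice made here rather than something confirmed by the diff.

```go
package main

import (
	"fmt"
	"strings"
)

// FuncCallResult is a reduced stand-in for the real type (assumption).
type FuncCallResult struct {
	Name      string
	Arguments string
}

// filterMalformedXMLToolCalls drops results whose Name starts with '{',
// i.e. results where a JSON object was captured as the function name.
// Trimming leading whitespace first is an assumption of this sketch.
func filterMalformedXMLToolCalls(results []FuncCallResult) []FuncCallResult {
	kept := make([]FuncCallResult, 0, len(results))
	for _, r := range results {
		if strings.HasPrefix(strings.TrimSpace(r.Name), "{") {
			continue
		}
		kept = append(kept, r)
	}
	return kept
}

// formatParser is a hypothetical per-format parse function.
type formatParser func(input string) []FuncCallResult

// autoDetectXML tries each format in order and returns the first
// non-empty filtered result. A nil return means no XML format matched,
// so the caller is free to fall back to JSON parsing.
func autoDetectXML(input string, formats []formatParser) []FuncCallResult {
	for _, parse := range formats {
		if kept := filterMalformedXMLToolCalls(parse(input)); len(kept) > 0 {
			return kept
		}
	}
	return nil
}

func main() {
	// Simulates the glm-4.5 false positive on Hermes-style output.
	glmLike := func(string) []FuncCallResult {
		return []FuncCallResult{{Name: `{"name":"bash","arguments":{}}`, Arguments: "{}"}}
	}
	// Simulates a format that finds nothing in the input.
	otherXML := func(string) []FuncCallResult { return nil }

	got := autoDetectXML("<tool_call>...</tool_call>", []formatParser{glmLike, otherXML})
	fmt.Println("XML results after filtering:", len(got))
}
```

Filtering per format, rather than once at the end, is what lets the loop keep trying later presets instead of returning an empty result early.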

The user-specified format path (xmlFormat != nil) is intentionally left unfiltered.

Test plan

  • Added ParseXMLIterative regression tests: auto-detect must not return a JSON-blob function name for Hermes-style <tool_call>JSON</tool_call> input (both full and partial streaming).
  • Existing ParseFunctionCall Hermes tests still pass (HaveLen(1), correct name/args).
  • Full pkg/functions test suite: 175/176 specs pass (1 pre-existing pending spec).
  • go build ./core/http/... ./pkg/functions/... — clean.

🤖 Generated with Claude Code

…call JSON as function name

When ParseXMLIterative runs in auto-detect mode (xmlFormat=nil) on a
Hermes/NousResearch-style response such as:

  <tool_call>
  {"name": "bash", "arguments": {"script": "ls"}}
  </tool_call>

the glm-4.5 XML format would false-positive: it finds <tool_call>,
discovers no <arg_key> element, and falls into the "empty tool call"
path that treats everything between <tool_call> and </tool_call> as
the function name. The result is a FuncCallResult where Name holds the
entire JSON blob '{"name":"bash","arguments":{...}}' and Arguments is
"{}". During streaming this malformed chunk was emitted to the client
(and because the XML path fired, the JSON fallback path that correctly
handles Hermes format was suppressed).

Fix: add filterMalformedXMLToolCalls() which discards any auto-detected
result whose function name begins with '{'. Apply it in all three
auto-detect branches of ParseXMLIterative. When all results from a
format are malformed the loop continues to the next format, eventually
falling through so ParseJSONIterative can handle Hermes output correctly.

The user-specified format path (xmlFormat != nil) is intentionally
left unfiltered; if an operator explicitly configures glm-4.5 they own
the result.

Fixes mudler#9722

Signed-off-by: Adira Denis Muhando <dennisadira@gmail.com>
Assisted-by: Claude Sonnet 4.6
@mudler
Owner

mudler commented May 22, 2026

@Dennisadira I'm curious how you bumped into this. That code path is a fallback for when the autoparser doesn't kick in, or when backends do not support chatdeltas via the gRPC protocol.

@Dennisadira
Contributor Author

I was tracking down #9722 (the streaming path emitting the same tool_call at multiple index values). While reading through the streaming loop in chat.go to understand which parser fires when, I traced into ParseXMLIterative to map all the auto-detect branches.

To reproduce the Hermes scenario I added a unit test that feeds <tool_call>{"name":"bash","arguments":{...}}</tool_call> directly to ParseXMLIterative with format=nil. The test showed the glm-4.5 preset returning a FuncCallResult with the raw JSON blob as Name — because it matches <tool_call> as ToolStart (not ScopeStart), finds no <arg_key> inside, and falls into the empty-tool-call path.

You're right that in a typical Hermes deployment the C++ autoparser fires first and this XML branch is never reached. But it can trigger when a backend delivers tokens without gRPC chatDeltas (e.g. a custom REST backend), which is the scenario #9722 describes. Fixing it here means the fallback path is also safe.


Development

Successfully merging this pull request may close these issues.

Streaming /v1/chat/completions emits the same tool_call at multiple index values
