feat(langchain): implement native GovernanceMiddleware via AgentMiddleware #1585

Merged
imran-siddique merged 1 commit into microsoft:main from miyannishar:feat/langchain-native-middleware on Apr 30, 2026

Conversation

@miyannishar
Collaborator

Summary

Replaces the fragile proxy-based wrapping in LangChainKernel with LangChain's native AgentMiddleware system, enabling non-invasive governance via wrap_tool_call and wrap_model_call lifecycle hooks.

This mirrors the architecture established by the Google ADK refactor and OpenAI Agents SDK refactor, completing the standardization of native middleware across all three major framework integrations.

Resolves #1584


What Changed

New: GovernanceMiddleware(AgentMiddleware)

  • wrap_tool_call: Intercepts every tool execution with:

    • Tool allowlist / blocklist enforcement
    • Blocked-pattern scan on tool arguments and tool names
    • Cedar/OPA pre_execute gate
    • Blocked-pattern scan on tool output
    • Drift detection / checkpointing via base post_execute
    • Full audit trail logging
  • wrap_model_call: Intercepts every model invocation with:

    • Content-filter scan on input messages (string and list content blocks)
    • Blocked-pattern scan on model output
    • Cedar/OPA policy gates on model I/O
    • New capabilities not possible via the proxy-based wrap():
      • System prompt integrity validation
      • Prompt injection detection on input messages
      • Dynamic tool filtering before the model sees them
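
The wrap_tool_call steps listed above can be sketched as a single function. This is a minimal illustration only, not the PR's implementation: `SketchPolicy`, `first_match`, and `governed_tool_call` are hypothetical names, the Cedar/OPA gate and audit logging are omitted, and the real hook receives a LangChain request object and a handler rather than a bare tool name and arguments.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Optional


class PolicyViolationError(Exception):
    """Stand-in for the toolkit's exception of the same name."""


@dataclass
class SketchPolicy:
    # Hypothetical policy shape; the real GovernancePolicy fields may differ.
    allowed_tools: set = field(default_factory=set)
    blocked_patterns: list = field(default_factory=list)

    def first_match(self, text: str) -> Optional[str]:
        """Return the first blocked pattern found in text (case-insensitive)."""
        lowered = text.lower()
        return next((p for p in self.blocked_patterns if p.lower() in lowered), None)


def governed_tool_call(
    policy: SketchPolicy,
    tool_name: str,
    tool_args: dict,
    handler: Callable[[dict], Any],
) -> Any:
    """Run handler only if every check passes; raise otherwise (fail closed)."""
    # 1. Allowlist: an empty allowlist means "no restriction" in this sketch.
    if policy.allowed_tools and tool_name not in policy.allowed_tools:
        raise PolicyViolationError(f"Tool '{tool_name}' is not allowed")
    # 2. Blocked-pattern scan on the tool name and its arguments.
    hit = policy.first_match(f"{tool_name} {tool_args!r}")
    if hit:
        raise PolicyViolationError(f"Blocked pattern '{hit}' in tool input")
    # 3. Execute the tool, then scan its output before returning it.
    result = handler(tool_args)
    hit = policy.first_match(str(result))
    if hit:
        raise PolicyViolationError(f"Blocked pattern '{hit}' in tool output")
    return result
```

Every failed check raises rather than returning a sentinel, so a caller cannot accidentally treat a blocked call as a successful one.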

New: LangChainKernel.as_middleware(name="governance")

Factory method that returns a GovernanceMiddleware instance ready for create_agent(middleware=[...]).

Deprecated: LangChainKernel.wrap() and module-level wrap()

Both now emit DeprecationWarning pointing users to as_middleware(). Existing functionality is fully preserved for backward compatibility.

Updated: __init__.py exports

GovernanceMiddleware is now exported as LangChainGovernanceMiddleware from agent_os.integrations.


Migration Path

# Before (deprecated)
from agent_os.integrations.langchain_adapter import GovernancePolicy, wrap
governed = wrap(my_chain, policy=GovernancePolicy(max_tokens=5000))
result = governed.invoke({"input": "hello"})

# After (recommended)
from langchain.agents import create_agent  # LangChain 1.x agent factory
from agent_os.integrations.langchain_adapter import LangChainKernel, GovernancePolicy
kernel = LangChainKernel(policy=GovernancePolicy(max_tokens=5000))
agent = create_agent(
    model="gpt-4o",
    tools=[...],
    middleware=[kernel.as_middleware()],
)
result = agent.invoke({"messages": [...]})

Test Results

39/39 new tests pass + 18/18 existing LangChain tests pass (full backward compatibility).

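| Category | Tests | Status |
| --- | --- | --- |
| Middleware init / properties | 7 | Pass |
| wrap_tool_call governance | 12 | Pass |
| wrap_model_call governance | 9 | Pass |
| as_middleware() integration | 4 | Pass |
| Deprecation warnings | 2 | Pass |
| Backward compatibility | 3 | Pass |
| Health check | 2 | Pass |
| Existing tests | 18 | Pass |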

Design Decisions

  1. Fail-closed semantics: All policy evaluations (tool/model) block by raising PolicyViolationError. No silent failures.
  2. Graceful dependency handling: GovernanceMiddleware inherits from AgentMiddleware when LangChain is installed, falls back to object otherwise — the module stays importable in all environments.
  3. Output filtering in middleware: Blocked-pattern checks on tool/model output are performed within the middleware before drift detection, since BaseIntegration.post_execute() only handles drift/checkpointing.
  4. Composable: Multiple GovernanceMiddleware instances can be stacked, each sharing the kernel's audit state.
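
Design decision 2 can be illustrated with a conditional base class. This is a sketch under assumptions: the import path `langchain.agents.middleware` is the one used by LangChain 1.x, and `GovernanceMiddlewareSketch` and `LANGCHAIN_AVAILABLE` are hypothetical names that are not taken from the PR.

```python
try:
    # Optional dependency: import path assumed from LangChain 1.x.
    from langchain.agents.middleware import AgentMiddleware
    LANGCHAIN_AVAILABLE = True
except ImportError:
    # Fall back to a plain base class so this module still imports.
    AgentMiddleware = object
    LANGCHAIN_AVAILABLE = False


class GovernanceMiddlewareSketch(AgentMiddleware):
    """Hypothetical stand-in showing only the conditional base class."""

    # Hooks such as wrap_tool_call / wrap_model_call would be defined here.
    sketch_name = "governance"
```

With this shape, importing the module never fails because LangChain is missing; only passing the middleware to an agent requires the real dependency.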

…eware

Replaces fragile proxy-based wrapping with LangChain's native
AgentMiddleware system (wrap_tool_call / wrap_model_call).

Changes:
- Add GovernanceMiddleware class implementing wrap_tool_call and
  wrap_model_call lifecycle hooks for native governance gating
- Add as_middleware() factory on LangChainKernel for clean integration
- Implement blocked-pattern checks on both tool and model outputs
- Support pre_execute/post_execute Cedar/OPA gates on tool calls
- Support content filtering on model inputs and outputs
- Deprecate LangChainKernel.wrap() and module-level wrap() with
  clear migration path to as_middleware()
- Add comprehensive test suite (39 tests) covering:
  * Tool allowlist/blocklist enforcement
  * Blocked pattern detection in args, tool names, and outputs
  * Model input/output content filtering
  * Cedar evaluator passthrough
  * Shared kernel state across middleware instances
  * Deprecation warning verification
  * Full backward compatibility with existing wrap() API

Resolves microsoft#1584
@github-actions (bot) added the labels tests and size/XL (Extra large PR, 500+ lines) on Apr 29, 2026
@github-actions

🤖 AI Agent: docs-sync-checker — Docs Sync

  • GovernanceMiddleware in langchain_adapter.py -- missing docstring for wrap_tool_call and wrap_model_call.
  • README.md -- LangChain integration section needs update to reflect as_middleware() as the preferred method and deprecation of wrap().
  • CHANGELOG.md -- missing entry for the introduction of GovernanceMiddleware, as_middleware(), and deprecation of wrap().

@github-actions

🤖 AI Agent: security-scanner — View details

No security issues found.

@github-actions

🤖 AI Agent: breaking-change-detector — API Compatibility

| Severity | Change | Impact |
| --- | --- | --- |
| High | LangChainKernel.wrap() and module-level wrap() are now deprecated and emit DeprecationWarning. | Existing users relying on these methods will need to migrate to LangChainKernel.as_middleware() in the future, as the deprecated methods may be removed in subsequent releases. |
| Medium | Transition from proxy-based wrapping to GovernanceMiddleware may alter runtime behavior for advanced use cases relying on internal proxy mechanics. | Users with custom integrations or reliance on proxy-specific behaviors may need to validate compatibility with the new middleware approach. |

@github-actions

🤖 AI Agent: test-generator — `agent_os/integrations/langchain_adapter.py`

Test Coverage Analysis

agent_os/integrations/langchain_adapter.py

  • Existing coverage:

    • The new GovernanceMiddleware class is covered by tests for:
      • wrap_tool_call: Tool-level governance checks (allowlist/blocklist, blocked-pattern scan, Cedar/OPA policy gates, post-execution validation).
      • wrap_model_call: Model-level governance checks (content filtering, prompt injection detection, dynamic tool filtering, Cedar/OPA policy gates).
      • LangChainKernel.as_middleware: Factory method for creating middleware.
      • Deprecation warnings for wrap() and module-level wrap().
      • Backward compatibility of the deprecated wrap() API.
    • Tests simulate tool and model requests with mocked LangChain objects.
    • Policy violation scenarios are tested by raising PolicyViolationError.
  • Missing coverage:

    1. Edge cases for blocked patterns:
      • Overlapping patterns (e.g., "DROP" and "DROP TABLE").
      • Patterns with special characters or regex metacharacters.
    2. Concurrency:
      • Simultaneous tool and model calls with shared state in the kernel.
    3. Timeout handling:
      • Behavior when tool or model calls exceed the timeout_seconds limit.
    4. Dynamic tool filtering:
      • Scenarios where tools are dynamically removed or modified before model invocation.
    5. Malformed inputs:
      • Invalid or unexpected structures in ToolCallRequest or ModelRequest.
    6. Partial failures:
      • Tool or model calls that return incomplete or malformed responses.
    7. Audit trail validation:
      • Ensuring all governance events are correctly logged in the audit trail.
  • Suggested test cases:

    1. test_wrap_tool_call_blocked_pattern_overlap:
      • Verify that overlapping blocked patterns (e.g., "DROP" and "DROP TABLE") are correctly enforced.
    2. test_wrap_tool_call_special_characters_in_patterns:
      • Test blocked patterns containing special characters or regex metacharacters (e.g., .*DROP.*).
    3. test_concurrent_tool_and_model_calls:
      • Simulate concurrent tool and model calls to ensure no race conditions in shared kernel state.
    4. test_tool_call_timeout_handling:
      • Verify behavior when a tool call exceeds the timeout_seconds limit.
    5. test_dynamic_tool_filtering:
      • Test scenarios where tools are dynamically removed or modified before being passed to the model.
    6. test_malformed_tool_request:
      • Pass a malformed ToolCallRequest (e.g., missing name or args) and ensure graceful handling.
    7. test_partial_tool_response:
      • Simulate a tool returning an incomplete or malformed response and verify post-execution validation.
    8. test_audit_trail_logging:
      • Validate that all governance events (e.g., policy violations, tool invocations) are correctly logged in the audit trail.
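
The first two suggested cases (overlapping patterns and regex metacharacters) could look roughly like the sketch below. `find_blocked` is a hypothetical stand-in for the policy's matcher, and it assumes blocked patterns are treated as case-insensitive literals; whether the real policy treats them as literals or as regular expressions is not stated in this PR.

```python
import re


def find_blocked(text: str, patterns: list) -> list:
    """Return every pattern that occurs in text, treating patterns as literals."""
    return [
        p for p in patterns
        if re.search(re.escape(p), text, flags=re.IGNORECASE)
    ]


def test_wrap_tool_call_blocked_pattern_overlap() -> None:
    patterns = ["DROP", "DROP TABLE"]
    # Both the short and the long pattern are reported when both occur.
    assert find_blocked("please DROP TABLE users", patterns) == ["DROP", "DROP TABLE"]
    # Only the short pattern occurs here.
    assert find_blocked("drop the mic", patterns) == ["DROP"]


def test_wrap_tool_call_special_characters_in_patterns() -> None:
    # "*" is matched literally because of re.escape().
    assert find_blocked("rm -rf /tmp/*", ["/tmp/*"]) == ["/tmp/*"]
    assert find_blocked("rm -rf /tmp/", ["/tmp/*"]) == []
    # ".*DROP.*" does not behave as a wildcard under literal matching.
    assert find_blocked("DROP", [".*DROP.*"]) == []
```

If the real matcher compiles patterns as regular expressions instead, the second test documents a behavior difference worth deciding on explicitly.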

agent_os/integrations/__init__.py

  • Existing coverage:

    • The updated export of GovernanceMiddleware as LangChainGovernanceMiddleware is indirectly covered by tests that import and use the middleware.
  • Missing coverage:

    • No direct tests for the __init__.py exports.
  • Suggested test cases:

    1. test_governance_middleware_import:
      • Verify that LangChainGovernanceMiddleware can be imported from agent_os.integrations.

tests/test_langchain_middleware.py

  • Existing coverage:

    • Comprehensive tests for the new GovernanceMiddleware and its integration with LangChainKernel.
    • Covers tool-level and model-level governance, factory method, deprecation warnings, and backward compatibility.
  • Missing coverage:

    • Edge cases and advanced scenarios (e.g., concurrency, malformed inputs, partial failures, audit trail validation).
  • Suggested test cases:

    1. test_wrap_tool_call_edge_cases:
      • Combine multiple edge cases (e.g., overlapping patterns, special characters) in a single test.
    2. test_wrap_model_call_edge_cases:
      • Test model-level governance with edge cases like prompt injection and dynamic tool filtering.
    3. test_audit_trail_completeness:
      • Ensure all governance events are logged, including edge cases and failures.

Summary

The new GovernanceMiddleware implementation is well-covered by existing tests, but additional test cases are needed to address edge cases, concurrency, timeout handling, malformed inputs, partial failures, and audit trail validation. These tests will ensure robustness and reliability in real-world scenarios.

@github-actions

🤖 AI Agent: code-reviewer — Review Summary

This PR introduces a significant refactor to the LangChain integration by replacing the existing proxy-based wrap() method with a new GovernanceMiddleware class that leverages LangChain's native AgentMiddleware system. The changes aim to improve the robustness, maintainability, and extensibility of the governance logic while maintaining backward compatibility with the deprecated wrap() API.

The PR is well-documented, includes comprehensive test coverage, and adheres to the project's design principles. However, there are some areas that require attention, particularly around security, backward compatibility, and potential improvements.


Key Findings

CRITICAL: Security Issues

  1. Policy Bypass via Tool Arguments:

    • The wrap_tool_call method performs blocked-pattern checks on tool_args but does not handle cases where arguments might be nested dictionaries or lists. This could allow malicious inputs to bypass the policy checks.
    • Action: Ensure recursive scanning of nested structures in tool_args for blocked patterns.
  2. Prompt Injection Detection:

    • The wrap_model_call method mentions "prompt injection detection" but does not explicitly implement it. This is a critical security feature for LLM governance.
    • Action: Implement robust prompt injection detection, such as validating the integrity of system prompts and sanitizing user inputs.
  3. Output Filtering for Sensitive Data:

    • The blocked-pattern checks on tool and model outputs are limited to string matches. This approach may miss sensitive data encoded in non-standard formats (e.g., Base64, JSON).
    • Action: Extend output filtering to handle encoded or structured data formats.
  4. Audit Trail Completeness:

    • While the middleware logs policy violations and tool invocations, it does not log the full context of the request and response. This could hinder forensic analysis in case of a security incident.
    • Action: Enhance audit logging to include full request and response details, ensuring sensitive data is redacted.
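
For finding 4, one way to log request context without leaking secrets is to redact by key before serializing. The sketch below is illustrative only: `SENSITIVE_KEYS`, `redact`, and `audit_record` are hypothetical names, and the key list is an example rather than the toolkit's configuration.

```python
import hashlib
import json
from typing import Any

# Illustrative key names only; a real deployment would configure this.
SENSITIVE_KEYS = {"api_key", "password", "token", "secret"}


def redact(value: Any, key: str = "") -> Any:
    """Recursively replace values under sensitive keys with a short fingerprint."""
    if key.lower() in SENSITIVE_KEYS:
        digest = hashlib.sha256(str(value).encode("utf-8")).hexdigest()[:12]
        return f"<redacted sha256:{digest}>"
    if isinstance(value, dict):
        return {k: redact(v, str(k)) for k, v in value.items()}
    if isinstance(value, (list, tuple)):
        return [redact(v, key) for v in value]
    return value


def audit_record(tool_name: str, tool_args: dict, outcome: str) -> str:
    """Serialize one audit entry with sensitive argument values redacted."""
    entry = {"tool": tool_name, "args": redact(tool_args), "outcome": outcome}
    return json.dumps(entry, sort_keys=True)
```

The fingerprint lets two log entries be correlated as using the same secret without the secret itself appearing in the audit trail.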

WARNING: Backward Compatibility

  1. Deprecation of wrap():
    • The deprecation of the wrap() method and the module-level wrap() function introduces a potential breaking change for users who rely on the old API.
    • Action: Clearly communicate the deprecation timeline in the documentation and provide a migration guide. Consider extending the deprecation period to allow users more time to adapt.

SUGGESTION: Improvements

  1. Thread Safety:

    • The GovernanceMiddleware class uses a shared LangChainKernel instance, which may lead to race conditions in concurrent environments.
    • Action: Review the thread safety of shared state (e.g., _ctx, _last_error) and consider using thread-local storage if necessary.
  2. Type Annotations:

    • The new methods (wrap_tool_call, wrap_model_call, etc.) lack detailed type annotations for their parameters and return values.
    • Action: Add precise type annotations to improve code readability and prevent type-related bugs.
  3. Test Coverage:

    • While the test suite is comprehensive, it does not include edge cases for nested tool arguments, encoded outputs, or concurrent middleware execution.
    • Action: Add tests for these scenarios to ensure robustness.
  4. Performance Optimization:

    • The blocked-pattern checks and Cedar/OPA evaluations could introduce latency, especially for high-throughput applications.
    • Action: Profile the middleware's performance and consider caching policy evaluations for repeated inputs.
  5. Documentation:

    • The migration guide in the PR description is helpful but could be expanded with more examples, especially for advanced use cases like Cedar/OPA integration.
    • Action: Update the documentation to include detailed examples and best practices for using the new middleware.
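
The thread-safety suggestion can be sketched with a lock around the shared counter. `CallBudget` is a hypothetical class for illustration; the kernel's actual shared state (`_ctx`, `_last_error`) is not reproduced here.

```python
import threading


class CallBudget:
    """Call counter that stays consistent when shared across threads."""

    def __init__(self, max_calls: int) -> None:
        self._max_calls = max_calls
        self._count = 0
        self._lock = threading.Lock()

    def try_consume(self) -> bool:
        """Atomically check the budget and count one call if allowed."""
        with self._lock:
            if self._count >= self._max_calls:
                return False
            self._count += 1
            return True

    @property
    def count(self) -> int:
        with self._lock:
            return self._count
```

Doing the check and the increment under the same lock is the point: two threads can never both observe "one call left" and both proceed.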

Suggested Changes

Code Changes

  1. Recursive Blocked-Pattern Check:

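    def _scan_for_blocked_patterns(data, policy):
        """Recursively scan nested tool arguments for blocked patterns."""
        if isinstance(data, dict):
            # Scan values only; dict keys are not checked here.
            for value in data.values():
                _scan_for_blocked_patterns(value, policy)
        elif isinstance(data, (list, tuple)):
            for item in data:
                _scan_for_blocked_patterns(item, policy)
        elif isinstance(data, str):
            # matches_pattern() is assumed to return a list of matched patterns.
            matched = policy.matches_pattern(data)
            if matched:
                raise PolicyViolationError(f"Blocked pattern '{matched[0]}' detected")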

    Update wrap_tool_call to use this function:

    _scan_for_blocked_patterns(tool_args, self._kernel.policy)
  2. Prompt Injection Detection:

    • Implement a method to validate system prompts and sanitize user inputs. For example:
      def _validate_prompt_integrity(self, prompt):
          # Add logic to validate the integrity of the system prompt
          pass

      def _sanitize_input(self, input_text):
          # Add logic to sanitize user inputs
          return input_text

    Update wrap_model_call to use these methods:

    self._validate_prompt_integrity(request.system_message.content_blocks)
    input_text = self._sanitize_input(input_text)

Documentation Updates

  • Add a section to the README or a dedicated migration guide explaining the transition from wrap() to as_middleware(), including examples for common use cases.
  • Document the deprecation timeline for the wrap() API.

Conclusion

This PR is a significant improvement to the LangChain integration, addressing many of the limitations of the previous proxy-based approach. However, the identified security issues (CRITICAL) must be addressed before merging. Additionally, the backward compatibility concerns (WARNING) and suggested improvements should be considered to ensure a smooth transition for users and enhance the overall robustness of the implementation.

@github-actions

PR Review Summary

| Check | Status | Details |
| --- | --- | --- |
| 🔍 Code Review | ❌ Failed | Issues detected |
| 🛡️ Security Scan | ✅ Completed | Analysis complete |
| 🔄 Breaking Changes | ⚠️ Warning | See details |
| 📝 Docs Sync | ✅ Completed | Analysis complete |
| 🧪 Test Coverage | ❌ Failed | Issues detected |

Verdict: ❌ Changes needed

@imran-siddique (Member) left a comment

Code Review: LangChain GovernanceMiddleware

Great PR description and migration guide, @miyannishar. The middleware architecture is solid. A few issues to address:

Blocking

1. Missing async wrap_tool_call / wrap_model_call hooks
Only sync versions are implemented. LangChain agents commonly run in async contexts (ainvoke, async tool execution). The deprecated wrap() API already supports async via ainvoke with asyncio.wait_for timeout. The new middleware path loses this. Both sibling integrations (ADK, OpenAI) provide async paths. Add awrap_tool_call and awrap_model_call.
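
A rough sketch of what an async tool hook with a timeout could look like follows. The signature is simplified for illustration (the real hook receives a request object and a handler), and `awrap_tool_call_sketch`, `blocked_tools`, and the 30-second default are assumptions rather than the toolkit's API.

```python
import asyncio
from typing import Any, Awaitable, Callable


class PolicyViolationError(Exception):
    """Stand-in for the toolkit's exception of the same name."""


async def awrap_tool_call_sketch(
    tool_name: str,
    tool_args: dict,
    handler: Callable[[dict], Awaitable[Any]],
    blocked_tools: frozenset = frozenset(),
    timeout_seconds: float = 30.0,
) -> Any:
    """Gate an async tool call and enforce a timeout without blocking the loop."""
    if tool_name in blocked_tools:
        raise PolicyViolationError(f"Tool '{tool_name}' is blocked")
    try:
        # wait_for cancels the handler if it runs past the deadline.
        return await asyncio.wait_for(handler(tool_args), timeout=timeout_seconds)
    except asyncio.TimeoutError as exc:
        raise PolicyViolationError(
            f"Tool '{tool_name}' exceeded {timeout_seconds}s"
        ) from exc
```

This mirrors the asyncio.wait_for approach the deprecated wrap() path is described as using, so timeouts surface as policy violations in both the sync and async paths.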

Security Concerns

2. Tool arguments logged in plaintext at DEBUG level
tool_args may contain API keys, credentials, or PII. Log only argument keys, not values:

    logger.debug(
        "[%s] wrap_tool_call: tool=%s args_keys=%s",
        self._name,
        tool_name,
        list(tool_args.keys()) if isinstance(tool_args, dict) else "<opaque>",
    )

3. Non-string model responses bypass output filtering entirely
When response.message.content is a list (e.g., tool_use blocks), isinstance(output_text, str) fails and all output filtering is skipped. Structured content containing blocked patterns passes undetected. Coerce with str() before scanning, matching the pattern used in input scanning.
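
As an alternative to a bare str() call, the content could be flattened structurally before scanning. This is a sketch only: `content_to_text` is a hypothetical helper, and it assumes content blocks are plain dicts, lists, or strings.

```python
from typing import Any


def content_to_text(content: Any) -> str:
    """Flatten string, list, or dict message content into scannable text."""
    if content is None:
        return ""
    if isinstance(content, str):
        return content
    if isinstance(content, (list, tuple)):
        return "\n".join(content_to_text(item) for item in content)
    if isinstance(content, dict):
        # Scan every value, so text nested inside tool_use inputs is included.
        return "\n".join(content_to_text(value) for value in content.values())
    # Last resort for unknown objects: fall back to their string form.
    return str(content)
```

Either approach closes the bypass; the structural version just avoids scanning Python repr noise such as quotes and braces.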

Warnings

4. max_tool_calls now counts model calls too
Both wrap_tool_call and wrap_model_call call post_execute, incrementing ctx.call_count. This means model calls consume the tool-call budget, which is a behavioral change from wrap(). Either use separate contexts, skip post_execute in wrap_model_call, or document this explicitly.
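
One of the options above, keeping model calls out of the tool-call budget, might be sketched as per-kind counters. `CallCounters` and its `record` method are hypothetical and do not reflect how ctx.call_count is implemented.

```python
from collections import Counter


class PolicyViolationError(Exception):
    """Stand-in for the toolkit's exception of the same name."""


class CallCounters:
    """Track tool and model calls separately; only tools are budgeted."""

    def __init__(self, max_tool_calls: int) -> None:
        self.max_tool_calls = max_tool_calls
        self.counts = Counter()

    def record(self, kind: str) -> None:
        """Count one call; only kind == "tool" is checked against the budget."""
        if kind == "tool" and self.counts["tool"] >= self.max_tool_calls:
            raise PolicyViolationError("max_tool_calls exceeded")
        self.counts[kind] += 1
```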

5. Double deprecation warning from module-level wrap()
Module-level wrap() emits its own warning, then calls LangChainKernel.wrap() which emits another. Suppress the inner one.
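
One way to avoid the second warning is to route both entry points through a private helper that does not warn. The names below (`KernelSketch`, `_wrap_impl`) are hypothetical and the wrapping itself is stubbed out.

```python
import warnings


class KernelSketch:
    """Hypothetical stand-in for LangChainKernel, reduced to the warning logic."""

    def wrap(self, chain):
        warnings.warn(
            "LangChainKernel.wrap() is deprecated; use as_middleware()",
            DeprecationWarning,
            stacklevel=2,
        )
        return self._wrap_impl(chain)

    def _wrap_impl(self, chain):
        # The real method would return a governed proxy; the stub returns as-is.
        return chain


def wrap(chain):
    """Module-level entry point: warn once, then bypass the warning method."""
    warnings.warn(
        "wrap() is deprecated; use LangChainKernel.as_middleware()",
        DeprecationWarning,
        stacklevel=2,
    )
    return KernelSketch()._wrap_impl(chain)
```

Using stacklevel=2 points each warning at the caller's line rather than at the library, which is usually what a migrating user needs to see.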

The missing async hooks (#1) are the main blocker for production use.

@imran-siddique (Member) left a comment

Updated review (condensed):

TL;DR: 1 blocker (no async support), 2 security concerns. Fix #1 and this ships.

| # | Sev | Issue | Where |
| --- | --- | --- | --- |
| 1 | Block | No async wrap_tool_call/wrap_model_call -- async LangChain agents will block or fail | GovernanceMiddleware |
| 2 | Sec | Full tool_args logged at DEBUG level -- may leak secrets/PII | wrap_tool_call |
| 3 | Sec | Non-string model responses (lists, dicts) skip output filtering entirely | wrap_model_call |
| 4 | Warn | Model calls consume max_tool_calls budget (behavioral change from wrap()) | wrap_model_call |
| 5 | Warn | Double deprecation warning from module-level wrap() | wrap() |

#1: Add awrap_tool_call/awrap_model_call. The deprecated wrap() already supports async.

#2: Log list(tool_args.keys()) instead of values.

#3: Coerce with str() before scanning, matching the input-side pattern.

#4 and #5 are fine as follow-ups.

@imran-siddique (Member) left a comment

Approving native middleware migration.

@imran-siddique merged commit 257c16a into microsoft:main on Apr 30, 2026
13 of 14 checks passed
imran-siddique pushed a commit to imran-siddique/agent-governance-toolkit that referenced this pull request May 4, 2026
…eware (microsoft#1585)

Co-authored-by: Nishar <you@example.com>
MohammadHaroonAbuomar pushed a commit to MohammadHaroonAbuomar/agt-acs that referenced this pull request Jun 1, 2026
Labels

size/XL Extra large PR (500+ lines) tests

Development

Successfully merging this pull request may close these issues.

refactor(langchain): implement native AgentMiddleware for governance instead of wrap() proxy

2 participants