feat(openai): native RunHooks lifecycle + BaseIntegration inheritance#1582
Conversation
Refactor OpenAIAgentsKernel to extend BaseIntegration and implement the OpenAI Agents SDK's native RunHooks lifecycle interface, replacing the fragile proxy-based wrap()/wrap_runner() workarounds. Changes: - Implement GovernanceRunHooks(RunHooks) with all 5 lifecycle callbacks: on_agent_start, on_agent_end, on_tool_start, on_tool_end, on_handoff - Extend BaseIntegration for centralized Cedar/OPA policy evaluation via pre_execute()/post_execute() - Add as_hooks() as the primary API entry point - Thread tool_name/tool_args into Cedar context for precise policy gating - Deprecate wrap(), wrap_runner(), create_tool_guard() with warnings - Fix ctx.events -> ctx.tool_calls for ExecutionContext compatibility - Fix double-block in on_agent_start (content filter + Cedar gate) - Standardize docstrings and comments to match ADK/LangChain conventions - Update test_coverage_boost.py for new BaseIntegration API Test results: 3070 passed, 0 failures (full regression) Closes microsoft#1576
🤖 AI Agent: security-scanner — View detailsNo security issues found. |
🤖 AI Agent: docs-sync-checker — Docs SyncDocs Sync
|
🤖 AI Agent: breaking-change-detector — API CompatibilityAPI Compatibility
|
🤖 AI Agent: code-reviewer — Review SummaryReview SummaryThis pull request introduces a significant refactor to the Below is a detailed review of the changes, focusing on the specified focus areas. CRITICAL Issues
WARNING: Potential Breaking Changes
Suggestions for Improvement
Summary of Feedback
Overall, this is a well-structured and much-needed refactor that improves the maintainability and robustness of the |
🤖 AI Agent: test-generator — `agent_os/integrations/openai_agents_sdk.py`Test Coverage Analysis
|
PR Review Summary
Verdict: ❌ Changes needed |
imran-siddique
left a comment
There was a problem hiding this comment.
Code Review: OpenAI RunHooks
Thanks for the thorough work here, @miyannishar. The test coverage and backward compatibility approach are excellent. Found a few issues that need addressing before merge:
Blocking (Security)
1. on_agent_start fails-open when
equire_human_approval=False
When content matches a �locked_pattern and
equire_human_approval is False, the violation is logged but execution continues. A governance framework must always block forbidden content. The
equire_human_approval flag should control whether violations go through an approval queue, not whether they're enforced at all. Compare with on_tool_start which correctly always raises.
Fix: always raise PolicyViolationError on blocked content, regardless of
equire_human_approval.
2. on_tool_end and on_agent_end fail-open on blocked output
Blocked patterns in tool/agent output are logged as warnings but don't raise PolicyViolationError or call on_violation(). Blocked content (e.g., DROP TABLE, leaked secrets) silently passes through to the caller. Output filtering must be fail-closed.
Warnings
3. Shared mutable _tool_call_count/_handoff_count across concurrent runs
Counters are on the kernel instance, not per-run. Concurrent Runner.run() calls sharing one kernel will race on these counters. Consider moving counters to GovernanceRunHooks or keying by run-id.
4. max_handoffs not propagated to GovernancePolicy
The constructor accepts max_handoffs but doesn't pass it to the policy dataclass, unlike max_tool_calls. Policy-based init will ignore handoff limits.
5. Budget check operator inconsistency
on_tool_start uses > while BaseIntegration.pre_execute uses >= for the same semantic check. These can disagree on whether the Nth call is allowed.
Nits
- ExecutionContext removed from all without a re-export deprecation cycle
- �s_hooks() should raise ImportError when SDK isn't installed (match ADK adapter pattern)
The fail-open issues (#1, #2) are the blockers. The rest can be addressed in follow-ups if needed.
imran-siddique
left a comment
There was a problem hiding this comment.
Updated review (condensed from above):
TL;DR: 2 blockers (fail-open security gaps). Fix those and this ships.
| # | Sev | Issue | Where |
|---|---|---|---|
| 1 | Block | on_agent_start logs but doesn't raise on blocked content when require_human_approval=False |
on_agent_start |
| 2 | Block | on_tool_end/on_agent_end warn on blocked output but never raise or call on_violation() |
on_tool_end, on_agent_end |
| 3 | Warn | Shared mutable counters across concurrent Runner.run() calls |
__init__ |
| 4 | Warn | max_handoffs not passed to GovernancePolicy unlike max_tool_calls |
__init__ |
| 5 | Warn | Budget check uses > here vs >= in BaseIntegration.pre_execute |
on_tool_start |
#1: Always raise PolicyViolationError on blocked patterns. require_human_approval should gate the approval queue, not enforcement.
#2: Match on_tool_start behavior: raise + call on_violation() on blocked output.
Warnings are nice-to-haves, can be follow-up PRs.
imran-siddique
left a comment
There was a problem hiding this comment.
Approving native hooks migration.
…microsoft#1582) Refactor OpenAIAgentsKernel to extend BaseIntegration and implement the OpenAI Agents SDK's native RunHooks lifecycle interface, replacing the fragile proxy-based wrap()/wrap_runner() workarounds. Changes: - Implement GovernanceRunHooks(RunHooks) with all 5 lifecycle callbacks: on_agent_start, on_agent_end, on_tool_start, on_tool_end, on_handoff - Extend BaseIntegration for centralized Cedar/OPA policy evaluation via pre_execute()/post_execute() - Add as_hooks() as the primary API entry point - Thread tool_name/tool_args into Cedar context for precise policy gating - Deprecate wrap(), wrap_runner(), create_tool_guard() with warnings - Fix ctx.events -> ctx.tool_calls for ExecutionContext compatibility - Fix double-block in on_agent_start (content filter + Cedar gate) - Standardize docstrings and comments to match ADK/LangChain conventions - Update test_coverage_boost.py for new BaseIntegration API Test results: 3070 passed, 0 failures (full regression) Closes microsoft#1576 Co-authored-by: Nishar <you@example.com>
Summary
Refactors
OpenAIAgentsKernelto extendBaseIntegrationand implement the OpenAI Agents SDK's nativeRunHookslifecycle interface, replacing the fragile proxy-basedwrap()/wrap_runner()workarounds with first-class framework hooks.Closes #1576
Motivation
The existing OpenAI adapter relied on proxy wrapping (
wrap(),wrap_runner()) to intercept agent and tool execution. This approach:__getattr__delegation that broke with SDK updatesBasePluginintegration patternBaseIntegration's Cedar/OPA policy evaluationThis PR aligns the OpenAI adapter with the ADK adapter's architecture by using the SDK's native
RunHooksinterface.Changes
Core Adapter (
openai_agents_sdk.py)BaseIntegrationkernel.wrap(agent)+kernel.wrap_runner(Runner)Runner.run(agent, hooks=kernel.as_hooks())GovernanceRunHooks(RunHooks)with 5 callbackspre_execute()/post_execute()+ local checkstool_name/tool_argsthreaded into Cedar contextGovernanceRunHooksLifecycle Coverageon_agent_startpre_execute()on_agent_endpost_execute(), audit recordingon_tool_starton_tool_endon_handoffBackward Compatibility
All three legacy methods are preserved with
DeprecationWarning:wrap()→ useas_hooks()wrap_runner()→ useas_hooks()create_tool_guard()→ handled byon_tool_startBug Fixes
ctx.events→ctx.tool_calls: Fixedwrap_runner()to use the correctExecutionContextfieldon_agent_start: Prevented content filter and Cedar/OPA gate from both triggering on the same violationCode Quality
Args:/Returns:/Raises:, consistent section separators)Test Results
test_openai_agents_sdk_adapter.pytest_coverage_boost.py(OAI section)Usage Example