feat(policies): additive structured policy-check contract by eltoncarr-ms · Pull Request #1594 · microsoft/agent-governance-toolkit

eltoncarr-ms · 2026-04-29T22:06:59Z

Description

Adds a structured policy-check contract for integration-layer governance, addressing policy-internals leak surfaces. This is a purely additive foundation: no existing public API changes, no behavior change for existing callers.

Why: Adapter denial sites currently leak policy internals (raw regex, allow-list contents, limit numbers) into user-visible error text, and hosts have no programmatic way to dispatch on violation category without substring-matching free-form English. This PR introduces the structured contract that fixes both — without changing any existing public API.

Delivery plan: The foundation is split across a PR stack:

(a) discovery/inventory,
(b) this PR: additive contract,
(c) parametrized parity harness with xfail allowlist,
(d) first adapter conversion (LangChain, gated on maintainer review),
(e₁…eₙ) per-adapter conversions in parallel.

Each PR is independently revertible; behavior change is opt-in per adapter, so risk is bounded and review surface stays small.

Changes

New module agent_os.policies.decision with ViolationCategory enum and PolicyCheckResult Pydantic model (to_legacy_tuple / to_public_dict serializers).
New module agent_os.policies.decision_factory: single source of truth for denial result construction; sanitized public-message templates keyed by category.
PolicyViolationError.from_check_result classmethod (additive); legacy (message, error_code, details) constructor preserved verbatim. str(e) is the sanitized public_message; e.details["detail"] retains audit fidelity.
BaseIntegration gains pre_execute_check / post_execute_check (sync + async) returning PolicyCheckResult. Legacy tuple methods reimplemented as thin wrappers; reason strings byte-identical.
AsyncGovernedWrapper and PolicyInterceptor migrated internally to *_check variants. External adapter API unchanged.
ADR docs/adr/0011-additive-policy-check-contract.md with Host Migration Guide.
agent-governance-python/agent-os/AGENTS.md updated with one-paragraph opt-in snippet.
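
As a rough illustration of the shape this list describes, the sketch below models the contract with standard-library dataclasses instead of Pydantic so that it runs without dependencies. Only the type and method names (`ViolationCategory`, `PolicyCheckResult`, `to_legacy_tuple`, `to_public_dict`, `deny_blocked_pattern_tool`, `PolicyViolationError.from_check_result`) come from this PR page. The field names, enum members, message templates, factory signature, and default `error_code` are assumptions made for illustration.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Dict, Optional, Tuple


class ViolationCategory(str, Enum):
    # Hypothetical members; the real enum values are not shown on this page.
    BLOCKED_PATTERN = "blocked_pattern"
    MAX_TOOL_CALLS = "max_tool_calls"


# Sanitized templates keyed by category: no regex, allow-list, or limit values.
_PUBLIC_MESSAGES: Dict[ViolationCategory, str] = {
    ViolationCategory.BLOCKED_PATTERN: "Request blocked by policy.",
    ViolationCategory.MAX_TOOL_CALLS: "Tool call limit reached.",
}


@dataclass(frozen=True)
class PolicyCheckResult:
    allowed: bool
    category: Optional[ViolationCategory] = None
    public_message: str = ""
    detail: str = ""  # audit-only text; may contain policy internals

    def to_legacy_tuple(self) -> Tuple[bool, Optional[str]]:
        # Assumption: legacy callers receive (allowed, reason) with the old text.
        return (self.allowed, None if self.allowed else self.detail)

    def to_public_dict(self) -> dict:
        # Only sanitized fields are exposed to the host or end user.
        return {
            "allowed": self.allowed,
            "category": self.category.value if self.category else None,
            "message": self.public_message,
        }


def deny_blocked_pattern_tool(tool: str, pattern: str) -> PolicyCheckResult:
    # Factory: the one place that pairs a category with its public template.
    category = ViolationCategory.BLOCKED_PATTERN
    return PolicyCheckResult(
        allowed=False,
        category=category,
        public_message=_PUBLIC_MESSAGES[category],
        detail=f"Tool '{tool}' matched blocked pattern: {pattern}",
    )


class PolicyViolationError(Exception):
    def __init__(self, message: str, error_code: str = "POLICY_VIOLATION",
                 details: Optional[dict] = None) -> None:
        super().__init__(message)
        self.error_code = error_code
        self.details = details or {}

    @classmethod
    def from_check_result(cls, result: PolicyCheckResult) -> "PolicyViolationError":
        # str(e) is the sanitized message; the audit text lives in details.
        return cls(result.public_message, details={
            "category": result.category.value if result.category else None,
            "detail": result.detail,
        })


result = deny_blocked_pattern_tool("shell", r"rm\s+-rf")
error = PolicyViolationError.from_check_result(result)
```

Under these assumptions a host can branch on `error.details["category"]` instead of substring-matching the message, while `str(error)` stays free of the pattern that triggered the denial.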

Backward compatibility

All existing callers continue to use legacy tuple methods unchanged.
Public exception constructor signature unchanged.
Reason strings produced by legacy paths are byte-identical (verified by snapshot tests).
3308 existing tests pass; 120 new tests added; full Docker test suite green across all packages.
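
The wrapper relationship behind the byte-identical guarantee can be pictured as follows. This is a minimal sketch, not the toolkit's code: the `SketchIntegration` class, its `blocked` list, the two-field `CheckResult`, the `old_pre_execute` snapshot function, and the reason wording are all hypothetical. Only the `pre_execute` / `pre_execute_check` naming follows the PR.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class CheckResult:
    # Hypothetical stand-in for PolicyCheckResult.
    allowed: bool
    reason: Optional[str] = None

    def to_legacy_tuple(self) -> Tuple[bool, Optional[str]]:
        return (self.allowed, self.reason)


class SketchIntegration:
    def __init__(self, blocked: List[str]) -> None:
        self.blocked = blocked

    def pre_execute_check(self, input_data: str) -> CheckResult:
        # Structured path: all policy logic lives here.
        for pattern in self.blocked:
            if pattern in input_data:
                return CheckResult(False, f"Blocked pattern detected: {pattern}")
        return CheckResult(True)

    def pre_execute(self, input_data: str) -> Tuple[bool, Optional[str]]:
        # Legacy path: a thin wrapper over the structured check.
        return self.pre_execute_check(input_data).to_legacy_tuple()


def old_pre_execute(blocked: List[str], input_data: str) -> Tuple[bool, Optional[str]]:
    # Frozen copy of the pre-refactor behaviour, used as the snapshot.
    for pattern in blocked:
        if pattern in input_data:
            return (False, f"Blocked pattern detected: {pattern}")
    return (True, None)


integration = SketchIntegration(["DROP TABLE"])
for sample in ["SELECT 1", "DROP TABLE users"]:
    # Snapshot-style parity check: legacy output must match exactly.
    assert integration.pre_execute(sample) == old_pre_execute(["DROP TABLE"], sample)
```

Because the legacy method only unpacks the structured result, any change to the reason text has to happen in one place, which is what makes a snapshot comparison a meaningful guard.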

Follow-ups (separate PRs, tracked in plan)

Per-adapter conversion to from_check_result(...) to remove user-visible regex / allow-list / limit interpolation.
Parametrized parity harness across all adapters.

Type of Change

Bug fix (non-breaking change that fixes an issue)
New feature (non-breaking change that adds functionality)
Breaking change (fix or feature that would cause existing functionality to change)
Documentation update
Maintenance (dependency updates, CI/CD, refactoring)
Security fix

Package(s) Affected

Checklist

My code follows the project style guidelines (ruff check)
I have added tests that prove my fix/feature works
All new and existing tests pass (pytest)
I have updated documentation as needed
I have signed the Microsoft CLA

Attribution & Prior Art

This contribution does not contain code copied or derived from other projects without attribution
Any external projects that inspired this design are credited in code comments or documentation
If this PR implements functionality similar to an existing open-source project, I have listed it below

Prior art / related projects (if any):

None.

AI & IP Disclosure

This contribution is not substantially AI-generated, OR I have disclosed AI tool usage below
This contribution does not implement patent-pending or patent-encumbered techniques
This contribution does not require an NDA or licensing agreement to understand or use

github-actions · 2026-04-29T22:07:20Z

🤖 AI Agent: docs-sync-checker

Docs Sync

PolicyCheckResult in agent_os/policies/decision.py -- missing docstring
ViolationCategory in agent_os/policies/decision.py -- missing docstring
README.md -- section on policy checks needs update to reflect new structured contract
CHANGELOG.md -- missing entry for new PolicyCheckResult and ViolationCategory additions and related changes

Please ensure these documentation updates are made.

github-actions · 2026-04-29T22:07:23Z

🤖 AI Agent: breaking-change-detector

API Compatibility

No breaking changes detected.

github-actions · 2026-04-29T22:07:25Z

🤖 AI Agent: security-scanner

No security issues found.

github-actions · 2026-04-29T22:07:31Z

🤖 AI Agent: code-reviewer

TL;DR: 0 blockers, 1 warning. Solid foundation for structured policy-check contract; minor follow-up suggested.

#	Sev	Issue	Where
1	Warn	Lack of validation for `PolicyCheckResult` fields like `public_message`	`agent_os/policies/decision.py`

Action items: None.

Warnings:

#	Issue	Where	Follow-up
1	Lack of validation for `PolicyCheckResult` fields like `public_message`	`agent_os/policies/decision.py`	Fine as follow-up PR.

github-actions · 2026-04-29T22:07:34Z

🤖 AI Agent: test-generator

`agent_os/policies/decision.py`

test_violation_category_enum_values -- Verify all ViolationCategory enum values are correctly defined and match expected strings.
test_policy_check_result_serialization -- Validate PolicyCheckResult.to_public_dict for correct serialization of attributes.
test_policy_check_result_legacy_conversion -- Ensure PolicyCheckResult.to_legacy_tuple produces accurate legacy tuple outputs.

`agent_os/policies/decision_factory.py`

test_deny_blocked_pattern_tool -- Test deny_blocked_pattern_tool for correct PolicyCheckResult generation with expected attributes.
test_deny_human_approval -- Validate deny_human_approval produces the correct PolicyCheckResult.
test_deny_max_tool_calls -- Ensure deny_max_tool_calls handles boundary conditions for maximum tool calls.
test_deny_confidence_threshold -- Test deny_confidence_threshold for correct behavior with edge-case confidence values.

`agent_os/integrations/base.py`

test_pre_execute_check_policy_violation -- Verify pre_execute_check correctly handles and emits events for policy violations.
test_post_execute_check_drift_detection -- Test post_execute_check for proper drift detection and result generation.
test_async_pre_execute_check -- Ensure async_pre_execute_check correctly handles async policy checks.
test_async_post_execute_check -- Validate async_post_execute_check for structured result generation in async scenarios.

github-actions · 2026-04-29T22:07:59Z

PR Review Summary

Check	Status	Details
🔍 Code Review	❌ Failed	Issues detected
🛡️ Security Scan	✅ Completed	Analysis complete
🔄 Breaking Changes	⚠️ Warning	See details
📝 Docs Sync	✅ Completed	Analysis complete
🧪 Test Coverage	❌ Failed	Issues detected

Verdict: ❌ Changes needed

imran-siddique

TL;DR: 2 blockers, 3 warnings. Fix #1 and #2 and this ships.

#	Sev	Issue	Where
1	Block	Async path silently bypasses subclass overrides of legacy `pre_execute`/`post_execute`	`base.py` async_pre_execute_check
2	Block	`**result.audit_entry` spread in `from_check_result` can overwrite explicit detail keys	`exceptions.py:68`
3	Warn	`PolicyCheckResult` defaults `allowed=True` (fail-open by construction)	`decision.py:51`
4	Warn	Full user input passed to `deny_blocked_pattern_input`, one `redact_user_text=False` from leaking	`base.py` pre_execute_check
5	Warn	ADR index jumps 0009 to 0011, missing ADR-0010	`docs/adr/index.md`

#1: Before this PR, async_pre_execute dispatched through self.pre_execute(ctx, input_data), respecting subclass overrides. Now it calls self.pre_execute_check() directly, bypassing any adapter that overrides only legacy pre_execute. In a governance framework, a silently skipped policy check is a policy bypass. Fix: delegate through the legacy method or check for subclass override before calling the new path.
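
One way to picture the second suggested fix for #1, checking for a subclass override before taking the new path, is sketched below. The classes are simplified stand-ins (no `ctx` argument, plain dicts and tuples rather than `PolicyCheckResult`), and the `_overrides` helper is hypothetical, not part of the toolkit.

```python
import asyncio
from typing import Optional, Tuple


class Base:
    def pre_execute_check(self, input_data: str) -> dict:
        return {"allowed": True, "reason": None}

    def pre_execute(self, input_data: str) -> Tuple[bool, Optional[str]]:
        result = self.pre_execute_check(input_data)
        return (result["allowed"], result["reason"])

    def _overrides(self, name: str) -> bool:
        # True when a subclass supplies its own implementation of `name`.
        return getattr(type(self), name) is not getattr(Base, name)

    async def async_pre_execute_check(self, input_data: str) -> dict:
        if self._overrides("pre_execute") and not self._overrides("pre_execute_check"):
            # Honour adapters that customised only the legacy method.
            allowed, reason = self.pre_execute(input_data)
            return {"allowed": allowed, "reason": reason}
        return self.pre_execute_check(input_data)


class LegacyAdapter(Base):
    # Overrides only the legacy tuple method, as older adapters might.
    def pre_execute(self, input_data: str) -> Tuple[bool, Optional[str]]:
        if "secret" in input_data:
            return (False, "adapter rule: secret")
        return super().pre_execute(input_data)


denied = asyncio.run(LegacyAdapter().async_pre_execute_check("secret plan"))
allowed = asyncio.run(Base().async_pre_execute_check("secret plan"))
```

In this sketch the adapter's legacy rule still fires on the async path, whereas calling `pre_execute_check` unconditionally would have returned an allow result for the same input.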

#2: **result.audit_entry is spread after explicit keys like category, detail. A crafted audit_entry with {"category": "benign"} would silently overwrite the real violation category. Fix: spread audit_entry first, or namespace under "audit" sub-key.
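
The overwrite described in #2 follows from ordinary Python dict-literal semantics, where later keys win. The snippet below is a small illustration using made-up key values (`"matched rule 7"`, `"abc123"`); it does not reproduce the actual `from_check_result` body.

```python
# A crafted audit entry that happens to reuse a reserved key.
audit_entry = {"category": "benign", "trace_id": "abc123"}

# Problem: the spread comes last, so it overwrites the explicit category.
unsafe = {"category": "blocked_pattern", "detail": "matched rule 7", **audit_entry}

# Fix 1: spread first, so the explicit keys take precedence.
spread_first = {**audit_entry, "category": "blocked_pattern", "detail": "matched rule 7"}

# Fix 2: namespace the audit data so it can never collide with top-level keys.
namespaced = {
    "category": "blocked_pattern",
    "detail": "matched rule 7",
    "audit": dict(audit_entry),
}
```

Both fixes keep the real violation category at the top level. The namespaced form additionally preserves the conflicting audit value for later inspection.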

Warnings are fine as follow-up PRs. Test coverage (120 new tests) and the decision_factory sanitization pattern are excellent. No internal/confidential content in the ADR or security audit doc.
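
For warning #3, the difference between a fail-open and a fail-closed default can be shown with two toy dataclasses. Both classes are hypothetical illustrations; the real `PolicyCheckResult` is a Pydantic model with more fields, and the follow-up may take a different approach.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FailOpenResult:
    allowed: bool = True  # a bare constructor call silently permits


@dataclass(frozen=True)
class FailClosedResult:
    allowed: bool  # no default: the caller must decide explicitly


# Forgetting to set the field yields an "allowed" result.
forgot_to_set = FailOpenResult()

# Without a default, the same mistake fails at construction time.
try:
    FailClosedResult()  # type: ignore[call-arg]
    construction_failed = False
except TypeError:
    construction_failed = True
```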

Adds an additive structured policy-check contract for integration-layer governance, addressing the policy-internals leak surfaces enumerated in .plans/agt-unify-policy-decisions.md (PR (a)+(b) foundation).

* New module `agent_os.policies.decision` with `ViolationCategory` enum and `PolicyCheckResult` Pydantic model (`to_legacy_tuple`/`to_public_dict` serializers).
* New module `agent_os.policies.decision_factory`: single source of truth for denial result construction; sanitized public-message templates keyed by category.
* `PolicyViolationError.from_check_result` classmethod (additive); legacy `(message, error_code, details)` constructor preserved verbatim. `str(e)` is the sanitized public_message; `e.details['detail']` retains audit fidelity.
* `BaseIntegration` gains `pre_execute_check` / `post_execute_check` (sync+async) returning `PolicyCheckResult`. Legacy tuple methods reimplemented as thin wrappers; reason strings byte-identical (AC-5).
* `AsyncGovernedWrapper` and `PolicyInterceptor` migrated internally to `*_check` variants. External adapter API unchanged.
* ADR docs/adr/0011-additive-policy-check-contract.md with Host Migration Guide (AC-15, AC-16). 0009 RATS unaffected.
* AGENTS.md updated with one-paragraph opt-in snippet (T13).

Tests: 120 new (test_policy_check_result_no_leak, test_policy_violation_error_safety, test_legacy_constructors, test_pre_execute_check_contract, test_public_api_surface). Baseline 257 still green. Zero new ruff or mypy findings on touched files.

Refs microsoft#1574

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Adds pytest, mypy, isort, pyupgrade, Pydantic, Docstrings, classmethod, xfail, Zava.

Removes a host name from the migration example (uses generic phrasing) and drops local OneDrive file:/// links from the references section. Also removes the now-unused dictionary entry.

Restore the original ordering of pre-existing imports; the additive PR only inserts the single .decision line and two __all__ entries. The earlier reorder was an isort auto-fix not required by CI lint rules (which use --select E,F,W only).

Reverts the global .cspell-repo-terms.txt additions. Reword classmethod and xfail in ADR 0011 (PR-introduced terms). Add an in-file cspell:ignore directive to AGENTS.md for pre-existing technical terms (pytest, mypy, isort, pyupgrade, Pydantic, Docstrings) that were inherited debt surfaced by editing the file.

Restore classmethod and xfail in ADR 0011 (their precise meaning matters). Drop the cspell:ignore directive from AGENTS.md and instead track all 8 surfaced terms in .cspell-repo-terms.txt: classmethod, Docstrings, isort, mypy, Pydantic, pytest, pyupgrade, xfail.

…ntract Satisfies scripts/ci/security-audit-required.sh gate for capability paths touched in agent_os/policies/. Documents threat model impact (information leakage reduced; no new powers; no policy-bypass surface), mitigations, and the 120-test coverage matrix. Refs microsoft#1574

Removes the smoketest filename reference that triggered cspell. The local smoke script is not part of this PR.

imran-siddique

Approving - will address blockers in follow-up PRs.

…1594)

* feat(policies): additive PolicyCheckResult + decision factories

  Adds an additive structured policy-check contract for integration-layer governance, addressing the policy-internals leak surfaces enumerated in .plans/agt-unify-policy-decisions.md (PR (a)+(b) foundation).

  * New module `agent_os.policies.decision` with `ViolationCategory` enum and `PolicyCheckResult` Pydantic model (`to_legacy_tuple`/`to_public_dict` serializers).
  * New module `agent_os.policies.decision_factory`: single source of truth for denial result construction; sanitized public-message templates keyed by category.
  * `PolicyViolationError.from_check_result` classmethod (additive); legacy `(message, error_code, details)` constructor preserved verbatim. `str(e)` is the sanitized public_message; `e.details['detail']` retains audit fidelity.
  * `BaseIntegration` gains `pre_execute_check` / `post_execute_check` (sync+async) returning `PolicyCheckResult`. Legacy tuple methods reimplemented as thin wrappers; reason strings byte-identical (AC-5).
  * `AsyncGovernedWrapper` and `PolicyInterceptor` migrated internally to `*_check` variants. External adapter API unchanged.
  * ADR docs/adr/0011-additive-policy-check-contract.md with Host Migration Guide (AC-15, AC-16). 0009 RATS unaffected.
  * AGENTS.md updated with one-paragraph opt-in snippet (T13).

  Tests: 120 new (test_policy_check_result_no_leak, test_policy_violation_error_safety, test_legacy_constructors, test_pre_execute_check_contract, test_public_api_surface). Baseline 257 still green. Zero new ruff or mypy findings on touched files.

  Refs microsoft#1574

  Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* chore(spell): add foundation PR terms to cspell dictionary

  Adds pytest, mypy, isort, pyupgrade, Pydantic, Docstrings, classmethod, xfail, Zava.

* docs(adr): remove personal/internal refs from ADR 0011

  Removes a host name from the migration example (uses generic phrasing) and drops local OneDrive file:/// links from the references section. Also removes the now-unused dictionary entry.

* chore(policies): minimize __init__.py churn

  Restore the original ordering of pre-existing imports; the additive PR only inserts the single .decision line and two __all__ entries. The earlier reorder was an isort auto-fix not required by CI lint rules (which use --select E,F,W only).

* chore(spell): scope spell-check fix to PR-introduced terms

  Reverts the global .cspell-repo-terms.txt additions. Reword classmethod and xfail in ADR 0011 (PR-introduced terms). Add an in-file cspell:ignore directive to AGENTS.md for pre-existing technical terms (pytest, mypy, isort, pyupgrade, Pydantic, Docstrings) that were inherited debt surfaced by editing the file.

* chore(spell): account for terms in repo dictionary

  Restore classmethod and xfail in ADR 0011 (their precise meaning matters). Drop the cspell:ignore directive from AGENTS.md and instead track all 8 surfaced terms in .cspell-repo-terms.txt: classmethod, Docstrings, isort, mypy, Pydantic, pytest, pyupgrade, xfail.

* docs(security-audit): add audit doc for additive PolicyCheckResult contract

  Satisfies scripts/ci/security-audit-required.sh gate for capability paths touched in agent_os/policies/. Documents threat model impact (information leakage reduced; no new powers; no policy-bypass surface), mitigations, and the 120-test coverage matrix.

  Refs microsoft#1574

* docs(security-audit): drop reference to local smoke script

  Removes the smoketest filename reference that triggered cspell. The local smoke script is not part of this PR.

---------

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

github-actions Bot added the documentation (Improvements or additions to documentation) and tests labels Apr 29, 2026

github-actions Bot added the size/XL Extra large PR (500+ lines) label Apr 29, 2026

eltoncarr-ms marked this pull request as draft April 29, 2026 22:18

eltoncarr-ms marked this pull request as ready for review April 29, 2026 22:52

eltoncarr-ms changed the title ~~feat(policies): additive PolicyCheckResult contract for integration-layer governance~~ feat(policies): additive structured policy-check contract Apr 29, 2026

eltoncarr-ms mentioned this pull request Apr 29, 2026

test(policies): adapter parity harness for unify-policy-decisions #1598

Merged

23 tasks

eltoncarr-ms force-pushed the dev/eltonc/agt-unify-policy-decisions-foundation branch from 78a0415 to 6edec5f Compare April 29, 2026 23:23

imran-siddique mentioned this pull request Apr 29, 2026

fix(ci): make AI PR review bots concise and actionable #1601

Merged

1 task

imran-siddique requested changes Apr 29, 2026


eltoncarr-ms and others added 8 commits April 29, 2026 17:17

chore(spell): add foundation PR terms to cspell dictionary

75bd783

Adds pytest, mypy, isort, pyupgrade, Pydantic, Docstrings, classmethod, xfail, Zava.

docs(adr): remove personal/internal refs from ADR 0011

194d75b

Removes a host name from the migration example (uses generic phrasing) and drops local OneDrive file:/// links from the references section. Also removes the now-unused dictionary entry.

chore(policies): minimize __init__.py churn

f78ae57

Restore the original ordering of pre-existing imports; the additive PR only inserts the single .decision line and two __all__ entries. The earlier reorder was an isort auto-fix not required by CI lint rules (which use --select E,F,W only).

docs(security-audit): drop reference to local smoke script

ee2753f

Removes the smoketest filename reference that triggered cspell. The local smoke script is not part of this PR.

eltoncarr-ms force-pushed the dev/eltonc/agt-unify-policy-decisions-foundation branch from a530bc0 to ee2753f Compare April 30, 2026 00:17

imran-siddique approved these changes Apr 30, 2026


imran-siddique enabled auto-merge (squash) April 30, 2026 01:13

imran-siddique approved these changes Apr 30, 2026


imran-siddique merged commit 17a781d into microsoft:main Apr 30, 2026
83 of 86 checks passed

eltoncarr-ms deleted the dev/eltonc/agt-unify-policy-decisions-foundation branch April 30, 2026 15:16

imran-siddique merged 8 commits into microsoft:main from eltoncarr-ms:dev/eltonc/agt-unify-policy-decisions-foundation
