Skip to content

feat(policies): additive structured policy-check contract #1594

Merged
imran-siddique merged 8 commits into microsoft:main from eltoncarr-ms:dev/eltonc/agt-unify-policy-decisions-foundation on Apr 30, 2026

@eltoncarr-ms
Contributor

@eltoncarr-ms eltoncarr-ms commented Apr 29, 2026

Description

Adds a structured policy-check contract for integration-layer governance, addressing policy-internals leak surfaces. This is a purely additive foundation: no existing public API changes, no behavior change for existing callers.

Why: Adapter denial sites currently leak policy internals (raw regex, allow-list contents, limit numbers) into user-visible error text, and hosts have no programmatic way to dispatch on violation category without substring-matching free-form English. This PR introduces the structured contract that fixes both — without changing any existing public API.
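To make the dispatch problem concrete, here is a minimal sketch contrasting substring matching with dispatch on a category. The enum members, the action names, and both helper functions are hypothetical illustrations; the real `ViolationCategory` values are not shown in this thread.

```python
from enum import Enum


class ViolationCategory(str, Enum):
    # Hypothetical members; the PR's actual enum values may differ.
    BLOCKED_PATTERN = "blocked_pattern"
    MAX_TOOL_CALLS = "max_tool_calls"


def classify_by_substring(message: str) -> str:
    """Fragile host logic: breaks whenever the English wording changes."""
    lowered = message.lower()
    if "blocked pattern" in lowered:
        return "show_rephrase_hint"
    if "tool calls" in lowered:
        return "show_quota_notice"
    return "show_generic_error"


# Dispatch table keyed by category: no parsing of free-form text.
HOST_ACTIONS = {
    ViolationCategory.BLOCKED_PATTERN: "show_rephrase_hint",
    ViolationCategory.MAX_TOOL_CALLS: "show_quota_notice",
}


def classify_by_category(category: ViolationCategory) -> str:
    """Robust host logic: the category is part of the contract."""
    return HOST_ACTIONS.get(category, "show_generic_error")
```

Once denial text is sanitized, the substring version silently falls through to the generic branch, while the category version keeps working.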

Delivery plan: Foundation is split across a PR stack:

  • (a) discovery/inventory,
  • (b) this PR: additive contract,
  • (c) parametrized parity harness with xfail allowlist,
  • (d) first adapter conversion (LangChain, gated on maintainer review),
  • (e₁…eₙ) per-adapter conversions in parallel.

Each PR is independently revertible; behavior change is opt-in per adapter, so risk is bounded and review surface stays small.

Changes

  • New module agent_os.policies.decision with ViolationCategory enum and PolicyCheckResult Pydantic model (to_legacy_tuple / to_public_dict serializers).
  • New module agent_os.policies.decision_factory: single source of truth for denial result construction; sanitized public-message templates keyed by category.
  • PolicyViolationError.from_check_result classmethod (additive); legacy (message, error_code, details) constructor preserved verbatim. str(e) is the sanitized public_message; e.details["detail"] retains audit fidelity.
  • BaseIntegration gains pre_execute_check / post_execute_check (sync + async) returning PolicyCheckResult. Legacy tuple methods reimplemented as thin wrappers; reason strings byte-identical.
  • AsyncGovernedWrapper and PolicyInterceptor migrated internally to *_check variants. External adapter API unchanged.
  • ADR docs/adr/0011-additive-policy-check-contract.md with Host Migration Guide.
  • agent-governance-python/agent-os/AGENTS.md updated with one-paragraph opt-in snippet.
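The shape of the contract listed above might look roughly like the sketch below. It uses a standard-library dataclass instead of Pydantic so that it runs without dependencies. Every field name, the enum member, the default `error_code`, and the assumed `(allowed, reason)` tuple shape are illustrative assumptions; only the class names, the two serializer names, `from_check_result`, and the `(message, error_code, details)` constructor come from the PR description.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Any, Optional


class ViolationCategory(str, Enum):
    # Hypothetical member; real values live in agent_os.policies.decision.
    BLOCKED_PATTERN = "blocked_pattern"


@dataclass(frozen=True)
class PolicyCheckResult:
    """Stand-in for the Pydantic model; all field names are assumptions."""
    allowed: bool
    category: Optional[ViolationCategory] = None
    public_message: str = ""  # sanitized, safe to show to users
    detail: str = ""          # full-fidelity text, for audit logs only

    def to_legacy_tuple(self) -> tuple[bool, Optional[str]]:
        # Assumed legacy shape: (allowed, reason).
        return (self.allowed, None if self.allowed else self.detail)

    def to_public_dict(self) -> dict[str, Any]:
        # Only sanitized fields are exposed.
        return {
            "allowed": self.allowed,
            "category": self.category.value if self.category else None,
            "message": self.public_message,
        }


class PolicyViolationError(Exception):
    # Legacy (message, error_code, details) constructor; default code is assumed.
    def __init__(self, message: str, error_code: str = "POLICY_VIOLATION",
                 details: Optional[dict[str, Any]] = None) -> None:
        super().__init__(message)
        self.error_code = error_code
        self.details = details or {}

    @classmethod
    def from_check_result(cls, result: PolicyCheckResult) -> "PolicyViolationError":
        # str(e) carries the sanitized message; details keeps the audit text.
        return cls(result.public_message, details={"detail": result.detail})


result = PolicyCheckResult(
    allowed=False,
    category=ViolationCategory.BLOCKED_PATTERN,
    public_message="Input was blocked by policy.",
    detail="Blocked pattern detected: ^rm\\s+-rf",
)
err = PolicyViolationError.from_check_result(result)
```

In this sketch `str(err)` yields only the sanitized message, while `err.details["detail"]` still holds the pattern for audit purposes.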

Backward compatibility

  • All existing callers continue to use legacy tuple methods unchanged.
  • Public exception constructor signature unchanged.
  • Reason strings produced by legacy paths are byte-identical (verified by snapshot tests).
  • 3308 existing tests pass; 120 new tests added; full Docker test suite green across all packages.
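As an illustration of the byte-identical guarantee, a snapshot check could be sketched as follows. The snapshot contents, the blocked-pattern list, and the simplified single-argument signatures are hypothetical; a plain dict stands in for `PolicyCheckResult`.

```python
from typing import Optional

# Hypothetical recorded output of the legacy path before the refactor.
SNAPSHOT = {
    "rm -rf /": (False, "Blocked pattern detected: rm -rf"),
    "hello": (True, None),
}

BLOCKED = ["rm -rf"]


def pre_execute_check(text: str) -> dict:
    """New structured path (a plain dict stands in for PolicyCheckResult)."""
    for pattern in BLOCKED:
        if pattern in text:
            return {"allowed": False, "detail": f"Blocked pattern detected: {pattern}"}
    return {"allowed": True, "detail": None}


def pre_execute(text: str) -> tuple[bool, Optional[str]]:
    """Legacy tuple API, reimplemented as a thin wrapper over the new path."""
    result = pre_execute_check(text)
    return (result["allowed"], result["detail"])


def snapshot_mismatches() -> list[str]:
    # String equality is exact, so any wording, case, or whitespace drift fails.
    return [
        text for text, expected in SNAPSHOT.items()
        if pre_execute(text) != expected
    ]
```

An empty list from `snapshot_mismatches()` means the wrapper reproduces the recorded legacy output exactly.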

Follow-ups (separate PRs, tracked in plan)

  • Per-adapter conversion to from_check_result(...) to remove user-visible regex / allow-list / limit interpolation.
  • Parametrized parity harness across all adapters.
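The parity harness is not part of this PR, but its core idea can be sketched in plain Python. In pytest this would typically be expressed with `pytest.mark.parametrize` plus `pytest.mark.xfail` on allowlisted cases; the adapter names, check functions, and status strings below are all hypothetical.

```python
from typing import Callable, Optional

Check = Callable[[str], tuple[bool, Optional[str]]]


def reference_check(text: str) -> tuple[bool, Optional[str]]:
    # Expected behavior: sanitized reason, no policy internals.
    return (False, "Input was blocked by policy.") if "rm -rf" in text else (True, None)


def leaky_check(text: str) -> tuple[bool, Optional[str]]:
    # Simulates an unconverted adapter that still interpolates the pattern.
    return (False, "Blocked pattern detected: rm -rf") if "rm -rf" in text else (True, None)


# Hypothetical adapter names mapped to their check functions.
ADAPTERS: dict[str, Check] = {"converted": reference_check, "unconverted": leaky_check}

# Adapters allowed to diverge until their conversion PR lands.
XFAIL_ALLOWLIST = {"unconverted"}

CASES = ["hello", "rm -rf /"]


def run_parity() -> dict[str, str]:
    """Return a status per adapter: passed, xfailed, failed, or xpassed."""
    report = {}
    for name, check in ADAPTERS.items():
        matches = all(check(case) == reference_check(case) for case in CASES)
        expected_fail = name in XFAIL_ALLOWLIST
        if matches:
            # An allowlisted adapter that now matches should leave the allowlist.
            report[name] = "xpassed" if expected_fail else "passed"
        else:
            report[name] = "xfailed" if expected_fail else "failed"
    return report
```

The allowlist shrinks by one entry per adapter conversion, so the remaining divergence stays visible in one place.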

Type of Change

  • Bug fix (non-breaking change that fixes an issue)
  • New feature (non-breaking change that adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to change)
  • Documentation update
  • Maintenance (dependency updates, CI/CD, refactoring)
  • Security fix

Package(s) Affected

  • agent-os-kernel
  • agent-mesh
  • agent-runtime
  • agent-sre
  • agent-governance
  • docs / root

Checklist

  • My code follows the project style guidelines (ruff check)
  • I have added tests that prove my fix/feature works
  • All new and existing tests pass (pytest)
  • I have updated documentation as needed
  • I have signed the Microsoft CLA

Attribution & Prior Art

  • This contribution does not contain code copied or derived from other projects without attribution
  • Any external projects that inspired this design are credited in code comments or documentation
  • If this PR implements functionality similar to an existing open-source project, I have listed it below

Prior art / related projects (if any):

None.

AI & IP Disclosure

  • This contribution is not substantially AI-generated, OR I have disclosed AI tool usage below
  • This contribution does not implement patent-pending or patent-encumbered techniques
  • This contribution does not require an NDA or licensing agreement to understand or use

@github-actions github-actions Bot added the documentation and tests labels Apr 29, 2026
@github-actions

github-actions Bot commented Apr 29, 2026

🤖 AI Agent: docs-sync-checker — Docs Sync

  • PolicyCheckResult in agent_os/policies/decision.py -- missing docstring
  • ViolationCategory in agent_os/policies/decision.py -- missing docstring
  • README.md -- section on policy checks needs update to reflect new structured contract
  • CHANGELOG.md -- missing entry for new PolicyCheckResult and ViolationCategory additions and related changes

Please ensure these documentation updates are made.

@github-actions github-actions Bot added the size/XL label Apr 29, 2026
@github-actions

github-actions Bot commented Apr 29, 2026

🤖 AI Agent: breaking-change-detector — API Compatibility

No breaking changes detected.

@github-actions

github-actions Bot commented Apr 29, 2026

🤖 AI Agent: security-scanner — View details

No security issues found.

@github-actions

github-actions Bot commented Apr 29, 2026

🤖 AI Agent: code-reviewer — View details

TL;DR: 0 blockers, 1 warning. Solid foundation for structured policy-check contract; minor follow-up suggested.

| # | Sev | Issue | Where |
|---|-----|-------|-------|
| 1 | Warn | Lack of validation for PolicyCheckResult fields like public_message | agent_os/policies/decision.py |

Action items: None.

Warnings:

| # | Issue | Where | Follow-up |
|---|-------|-------|-----------|
| 1 | Lack of validation for PolicyCheckResult fields like public_message | agent_os/policies/decision.py | Fine as follow-up PR. |

@github-actions

github-actions Bot commented Apr 29, 2026

🤖 AI Agent: test-generator — `agent_os/policies/decision.py`

agent_os/policies/decision.py

  • test_violation_category_enum_values -- Verify all ViolationCategory enum values are correctly defined and match expected strings.
  • test_policy_check_result_serialization -- Validate PolicyCheckResult.to_public_dict for correct serialization of attributes.
  • test_policy_check_result_legacy_conversion -- Ensure PolicyCheckResult.to_legacy_tuple produces accurate legacy tuple outputs.

agent_os/policies/decision_factory.py

  • test_deny_blocked_pattern_tool -- Test deny_blocked_pattern_tool for correct PolicyCheckResult generation with expected attributes.
  • test_deny_human_approval -- Validate deny_human_approval produces the correct PolicyCheckResult.
  • test_deny_max_tool_calls -- Ensure deny_max_tool_calls handles boundary conditions for maximum tool calls.
  • test_deny_confidence_threshold -- Test deny_confidence_threshold for correct behavior with edge-case confidence values.

agent_os/integrations/base.py

  • test_pre_execute_check_policy_violation -- Verify pre_execute_check correctly handles and emits events for policy violations.
  • test_post_execute_check_drift_detection -- Test post_execute_check for proper drift detection and result generation.
  • test_async_pre_execute_check -- Ensure async_pre_execute_check correctly handles async policy checks.
  • test_async_post_execute_check -- Validate async_post_execute_check for structured result generation in async scenarios.

@github-actions

github-actions Bot commented Apr 29, 2026

PR Review Summary

| Check | Status | Details |
|-------|--------|---------|
| 🔍 Code Review | ❌ Failed | Issues detected |
| 🛡️ Security Scan | ✅ Completed | Analysis complete |
| 🔄 Breaking Changes | ⚠️ Warning | See details |
| 📝 Docs Sync | ✅ Completed | Analysis complete |
| 🧪 Test Coverage | ❌ Failed | Issues detected |

Verdict: ❌ Changes needed

@eltoncarr-ms eltoncarr-ms marked this pull request as draft April 29, 2026 22:18
@eltoncarr-ms eltoncarr-ms marked this pull request as ready for review April 29, 2026 22:52
@eltoncarr-ms eltoncarr-ms changed the title from "feat(policies): additive PolicyCheckResult contract for integration-layer governance" to "feat(policies): additive structured policy-check contract" Apr 29, 2026
@eltoncarr-ms eltoncarr-ms force-pushed the dev/eltonc/agt-unify-policy-decisions-foundation branch from 78a0415 to 6edec5f April 29, 2026 23:23
Member

@imran-siddique imran-siddique left a comment

TL;DR: 2 blockers, 3 warnings. Fix #1 and #2 and this ships.

| # | Sev | Issue | Where |
|---|-----|-------|-------|
| 1 | Block | Async path silently bypasses subclass overrides of legacy pre_execute/post_execute | base.py async_pre_execute_check |
| 2 | Block | `**result.audit_entry` spread in from_check_result can overwrite explicit detail keys | exceptions.py:68 |
| 3 | Warn | PolicyCheckResult defaults allowed=True (fail-open by construction) | decision.py:51 |
| 4 | Warn | Full user input passed to deny_blocked_pattern_input, one redact_user_text=False from leaking | base.py pre_execute_check |
| 5 | Warn | ADR index jumps 0009 to 0011, missing ADR-0010 | docs/adr/index.md |

#1: Before this PR, async_pre_execute dispatched through self.pre_execute(ctx, input_data), respecting subclass overrides. Now it calls self.pre_execute_check() directly, bypassing any adapter that overrides only legacy pre_execute. In a governance framework, a silently skipped policy check is a policy bypass. Fix: delegate through the legacy method or check for subclass override before calling the new path.
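The bypass described in #1 and one of the suggested fixes can be reproduced in a minimal sketch. The classes below are simplified stand-ins (single-argument signatures, dict results), and the `_buggy` and `_fixed` method names are hypothetical labels for the two behaviors, not names from the codebase.

```python
import asyncio
from typing import Optional


class BaseIntegration:
    """Simplified stand-in; the real methods take (ctx, input_data)."""

    def pre_execute_check(self, text: str) -> dict:
        return {"allowed": True, "reason": None}

    def pre_execute(self, text: str) -> tuple[bool, Optional[str]]:
        result = self.pre_execute_check(text)
        return (result["allowed"], result["reason"])

    async def async_pre_execute_check_buggy(self, text: str) -> dict:
        # Calls the new path directly, so a legacy-only override never runs.
        return self.pre_execute_check(text)

    async def async_pre_execute_check_fixed(self, text: str) -> dict:
        # If a subclass overrides the legacy method, route through it.
        if type(self).pre_execute is not BaseIntegration.pre_execute:
            allowed, reason = self.pre_execute(text)
            return {"allowed": allowed, "reason": reason}
        return self.pre_execute_check(text)


class LegacyAdapter(BaseIntegration):
    # Overrides only the legacy tuple method, as older adapters might.
    def pre_execute(self, text: str) -> tuple[bool, Optional[str]]:
        if "secret" in text:
            return (False, "denied by adapter")
        return super().pre_execute(text)


adapter = LegacyAdapter()
buggy = asyncio.run(adapter.async_pre_execute_check_buggy("secret plan"))
fixed = asyncio.run(adapter.async_pre_execute_check_fixed("secret plan"))
```

Here `buggy` reports the input as allowed because the adapter's denial logic is never reached, while `fixed` returns the adapter's denial.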

#2: **result.audit_entry is spread after explicit keys like category, detail. A crafted audit_entry with {"category": "benign"} would silently overwrite the real violation category. Fix: spread audit_entry first, or namespace under "audit" sub-key.
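The overwrite in #2 follows directly from Python's dict-literal semantics, where later keys win. The sketch below shows the collision and both suggested fixes; the function names and sample values are hypothetical.

```python
def build_details_unsafe(category: str, detail: str, audit_entry: dict) -> dict:
    # audit_entry is spread last, so its keys win on collision.
    return {"category": category, "detail": detail, **audit_entry}


def build_details_spread_first(category: str, detail: str, audit_entry: dict) -> dict:
    # Explicit keys come last and therefore take precedence.
    return {**audit_entry, "category": category, "detail": detail}


def build_details_namespaced(category: str, detail: str, audit_entry: dict) -> dict:
    # Collisions are impossible because audit data lives under its own key.
    return {"category": category, "detail": detail, "audit": dict(audit_entry)}


crafted = {"category": "benign", "request_id": "r-1"}
unsafe = build_details_unsafe("blocked_pattern", "matched ^rm", crafted)
safe = build_details_spread_first("blocked_pattern", "matched ^rm", crafted)
nested = build_details_namespaced("blocked_pattern", "matched ^rm", crafted)
```

Spreading first is the smaller change; namespacing additionally preserves the conflicting audit value instead of discarding it.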

Warnings are fine as follow-up PRs. Test coverage (120 new tests) and the decision_factory sanitization pattern are excellent. No internal/confidential content in the ADR or security audit doc.

eltoncarr-ms and others added 8 commits April 29, 2026 17:17
Adds an additive structured policy-check contract for integration-layer governance, addressing the policy-internals leak surfaces enumerated in .plans/agt-unify-policy-decisions.md (PR (a)+(b) foundation).

* New module `agent_os.policies.decision` with `ViolationCategory` enum and `PolicyCheckResult` Pydantic model (`to_legacy_tuple`/`to_public_dict` serializers).

* New module `agent_os.policies.decision_factory` — single source of truth for denial result construction; sanitized public-message templates keyed by category.

* `PolicyViolationError.from_check_result` classmethod (additive); legacy `(message, error_code, details)` constructor preserved verbatim. `str(e)` is the sanitized public_message; `e.details['detail']` retains audit fidelity.

* `BaseIntegration` gains `pre_execute_check` / `post_execute_check` (sync+async) returning `PolicyCheckResult`. Legacy tuple methods reimplemented as thin wrappers; reason strings byte-identical (AC-5).

* `AsyncGovernedWrapper` and `PolicyInterceptor` migrated internally to `*_check` variants. External adapter API unchanged.

* ADR docs/adr/0011-additive-policy-check-contract.md with Host Migration Guide (AC-15, AC-16). 0009 RATS unaffected.

* AGENTS.md updated with one-paragraph opt-in snippet (T13).

Tests: 120 new (test_policy_check_result_no_leak, test_policy_violation_error_safety, test_legacy_constructors, test_pre_execute_check_contract, test_public_api_surface). Baseline 257 still green. Zero new ruff or mypy findings on touched files.

Refs microsoft#1574

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Adds pytest, mypy, isort, pyupgrade, Pydantic, Docstrings, classmethod, xfail, Zava.

Removes a host name from the migration example (uses generic phrasing) and drops local OneDrive file:/// links from the references section. Also removes the now-unused dictionary entry.

Restore the original ordering of pre-existing imports; the additive PR only inserts the single .decision line and two __all__ entries. The earlier reorder was an isort auto-fix not required by CI lint rules (which use --select E,F,W only).

Reverts the global .cspell-repo-terms.txt additions. Reword classmethod and xfail in ADR 0011 (PR-introduced terms). Add an in-file cspell:ignore directive to AGENTS.md for pre-existing technical terms (pytest, mypy, isort, pyupgrade, Pydantic, Docstrings) that were inherited debt surfaced by editing the file.

Restore classmethod and xfail in ADR 0011 (their precise meaning matters). Drop the cspell:ignore directive from AGENTS.md and instead track all 8 surfaced terms in .cspell-repo-terms.txt: classmethod, Docstrings, isort, mypy, Pydantic, pytest, pyupgrade, xfail.

…ntract

Satisfies scripts/ci/security-audit-required.sh gate for capability paths
touched in agent_os/policies/. Documents threat model impact (information
leakage reduced; no new powers; no policy-bypass surface), mitigations,
and the 120-test coverage matrix.

Refs microsoft#1574

Removes the smoketest filename reference that triggered cspell. The local
smoke script is not part of this PR.
@eltoncarr-ms eltoncarr-ms force-pushed the dev/eltonc/agt-unify-policy-decisions-foundation branch from a530bc0 to ee2753f April 30, 2026 00:17
@imran-siddique imran-siddique enabled auto-merge (squash) April 30, 2026 01:13
Member

@imran-siddique imran-siddique left a comment

Approving - will address blockers in follow-up PRs.

@imran-siddique imran-siddique merged commit 17a781d into microsoft:main Apr 30, 2026
83 of 86 checks passed
@eltoncarr-ms eltoncarr-ms deleted the dev/eltonc/agt-unify-policy-decisions-foundation branch April 30, 2026 15:16
imran-siddique pushed a commit to imran-siddique/agent-governance-toolkit that referenced this pull request May 4, 2026
…1594)

* feat(policies): additive PolicyCheckResult + decision factories

Adds an additive structured policy-check contract for integration-layer governance, addressing the policy-internals leak surfaces enumerated in .plans/agt-unify-policy-decisions.md (PR (a)+(b) foundation).

* New module `agent_os.policies.decision` with `ViolationCategory` enum and `PolicyCheckResult` Pydantic model (`to_legacy_tuple`/`to_public_dict` serializers).

* New module `agent_os.policies.decision_factory` — single source of truth for denial result construction; sanitized public-message templates keyed by category.

* `PolicyViolationError.from_check_result` classmethod (additive); legacy `(message, error_code, details)` constructor preserved verbatim. `str(e)` is the sanitized public_message; `e.details['detail']` retains audit fidelity.

* `BaseIntegration` gains `pre_execute_check` / `post_execute_check` (sync+async) returning `PolicyCheckResult`. Legacy tuple methods reimplemented as thin wrappers; reason strings byte-identical (AC-5).

* `AsyncGovernedWrapper` and `PolicyInterceptor` migrated internally to `*_check` variants. External adapter API unchanged.

* ADR docs/adr/0011-additive-policy-check-contract.md with Host Migration Guide (AC-15, AC-16). 0009 RATS unaffected.

* AGENTS.md updated with one-paragraph opt-in snippet (T13).

Tests: 120 new (test_policy_check_result_no_leak, test_policy_violation_error_safety, test_legacy_constructors, test_pre_execute_check_contract, test_public_api_surface). Baseline 257 still green. Zero new ruff or mypy findings on touched files.

Refs microsoft#1574

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* chore(spell): add foundation PR terms to cspell dictionary

Adds pytest, mypy, isort, pyupgrade, Pydantic, Docstrings, classmethod, xfail, Zava.

* docs(adr): remove personal/internal refs from ADR 0011

Removes a host name from the migration example (uses generic phrasing) and drops local OneDrive file:/// links from the references section. Also removes the now-unused dictionary entry.

* chore(policies): minimize __init__.py churn

Restore the original ordering of pre-existing imports; the additive PR only inserts the single .decision line and two __all__ entries. The earlier reorder was an isort auto-fix not required by CI lint rules (which use --select E,F,W only).

* chore(spell): scope spell-check fix to PR-introduced terms

Reverts the global .cspell-repo-terms.txt additions. Reword classmethod and xfail in ADR 0011 (PR-introduced terms). Add an in-file cspell:ignore directive to AGENTS.md for pre-existing technical terms (pytest, mypy, isort, pyupgrade, Pydantic, Docstrings) that were inherited debt surfaced by editing the file.

* chore(spell): account for terms in repo dictionary

Restore classmethod and xfail in ADR 0011 (their precise meaning matters). Drop the cspell:ignore directive from AGENTS.md and instead track all 8 surfaced terms in .cspell-repo-terms.txt: classmethod, Docstrings, isort, mypy, Pydantic, pytest, pyupgrade, xfail.

* docs(security-audit): add audit doc for additive PolicyCheckResult contract

Satisfies scripts/ci/security-audit-required.sh gate for capability paths
touched in agent_os/policies/. Documents threat model impact (information
leakage reduced; no new powers; no policy-bypass surface), mitigations,
and the 120-test coverage matrix.

Refs microsoft#1574

* docs(security-audit): drop reference to local smoke script

Removes the smoketest filename reference that triggered cspell. The local
smoke script is not part of this PR.

---------

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Labels

documentation (Improvements or additions to documentation), size/XL (Extra large PR, 500+ lines), tests

Projects

None yet

2 participants