test(policies): adapter parity harness for unify-policy-decisions by eltoncarr-ms · Pull Request #1598 · microsoft/agent-governance-toolkit

eltoncarr-ms · 2026-04-29T23:06:30Z

Description

Adds the PR (c) / T8 parity harness for the Unify Policy Decisions Across Surfaces plan.

This is intentionally test-only and depends on the foundation PR: blocked by #1594.

The two new harnesses are:

tests/test_adapter_exception_identity.py: strict-xfail allowlist for adapter PolicyViolationError identity parity.
tests/test_adapter_str_no_leak.py: strict-xfail/skip allowlist for denial string sanitization and audit-fidelity parity, based on the policy leak surface inventory.

Each subsequent adapter/surface conversion PR (d/eᵢ) should convert its target to the canonical policy-decision contract, then remove the corresponding xfail row from this allowlist.

Type of Change

Bug fix (non-breaking change that fixes an issue)
New feature (non-breaking change that adds functionality)
Breaking change (fix or feature that would cause existing functionality to change)
Documentation update
Maintenance (dependency updates, CI/CD, refactoring)
Security fix

Package(s) Affected

Checklist

My code follows the project style guidelines (ruff check)
I have added tests that prove my fix/feature works
All new and existing tests pass (pytest)
I have updated documentation as needed
I have signed the Microsoft CLA

Attribution & Prior Art

This contribution does not contain code copied or derived from other projects without attribution
Any external projects that inspired this design are credited in code comments or documentation
If this PR implements functionality similar to an existing open-source project, I have listed it below

Prior art / related projects (if any):
None.

AI & IP Disclosure

This contribution is not substantially AI-generated, OR I have disclosed AI tool usage below
This contribution does not implement patent-pending or patent-encumbered techniques
This contribution does not require an NDA or licensing agreement to understand or use

AI tools used (if any):
GitHub Copilot CLI used for scaffolding; tests were manually reviewed and validated.

Related Issues

Blocked by PR #1594.

Adds an additive structured policy-check contract for integration-layer governance, addressing the policy-internals leak surfaces enumerated in .plans/agt-unify-policy-decisions.md (PR (a)+(b) foundation).

* New module `agent_os.policies.decision` with `ViolationCategory` enum and `PolicyCheckResult` Pydantic model (`to_legacy_tuple`/`to_public_dict` serializers).
* New module `agent_os.policies.decision_factory`: single source of truth for denial result construction; sanitized public-message templates keyed by category.
* `PolicyViolationError.from_check_result` classmethod (additive); legacy `(message, error_code, details)` constructor preserved verbatim. `str(e)` is the sanitized public_message; `e.details['detail']` retains audit fidelity.
* `BaseIntegration` gains `pre_execute_check` / `post_execute_check` (sync+async) returning `PolicyCheckResult`. Legacy tuple methods reimplemented as thin wrappers; reason strings byte-identical (AC-5).
* `AsyncGovernedWrapper` and `PolicyInterceptor` migrated internally to `*_check` variants. External adapter API unchanged.
* ADR docs/adr/0011-additive-policy-check-contract.md with Host Migration Guide (AC-15, AC-16). 0009 RATS unaffected.
* AGENTS.md updated with one-paragraph opt-in snippet (T13).

Tests: 120 new (test_policy_check_result_no_leak, test_policy_violation_error_safety, test_legacy_constructors, test_pre_execute_check_contract, test_public_api_surface). Baseline 257 still green. Zero new ruff or mypy findings on touched files.

Refs microsoft#1574

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
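The shape of the contract described in this commit message can be illustrated with a small stdlib-only sketch. It uses a dataclass rather than the Pydantic model the commit mentions, and the field names (`allowed`, `category`, `detail`), the enum members, the message templates, the `(allowed, reason)` tuple shape, and the `deny` helper are all assumptions made for illustration; only the names `ViolationCategory`, `PolicyCheckResult`, `to_legacy_tuple`, `to_public_dict`, `public_message`, `from_check_result`, and `details['detail']` come from the text.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Any, Dict, Optional, Tuple


class ViolationCategory(str, Enum):
    # Member names are illustrative; the real enum may differ.
    BLOCKED_TOOL = "blocked_tool"
    RATE_LIMIT = "rate_limit"


# Sanitized public-message templates keyed by category (wording is made up).
PUBLIC_MESSAGES = {
    ViolationCategory.BLOCKED_TOOL: "This tool is not permitted by policy.",
    ViolationCategory.RATE_LIMIT: "Request limit reached.",
}


@dataclass(frozen=True)
class PolicyCheckResult:
    allowed: bool
    category: Optional[ViolationCategory] = None
    public_message: str = ""
    detail: str = ""  # audit-only text; may name internal rules

    def to_legacy_tuple(self) -> Tuple[bool, str]:
        # Assumed legacy shape: (allowed, reason) carrying the original reason.
        return (self.allowed, self.detail)

    def to_public_dict(self) -> Dict[str, Any]:
        # Only sanitized fields are exposed here; `detail` is left out.
        return {
            "allowed": self.allowed,
            "category": self.category.value if self.category else None,
            "message": self.public_message,
        }


def deny(category: ViolationCategory, detail: str) -> PolicyCheckResult:
    """Hypothetical factory: one place that builds denial results."""
    return PolicyCheckResult(False, category, PUBLIC_MESSAGES[category], detail)


class PolicyViolationError(Exception):
    def __init__(self, message, error_code=None, details=None):
        # Legacy (message, error_code, details) constructor shape.
        super().__init__(message)
        self.error_code = error_code
        self.details = details or {}

    @classmethod
    def from_check_result(cls, result: PolicyCheckResult) -> "PolicyViolationError":
        # str(e) carries the sanitized message; the raw reason goes to details.
        code = result.category.value if result.category else None
        return cls(result.public_message, code, {"detail": result.detail})
```

For example, `PolicyViolationError.from_check_result(deny(ViolationCategory.BLOCKED_TOOL, "rule 'deny_shell' matched"))` yields an error whose `str()` is the template text, while `e.details["detail"]` keeps the rule-level reason for audit logs.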

Adds pytest, mypy, isort, pyupgrade, Pydantic, Docstrings, classmethod, xfail, Zava.

Removes a host name from the migration example (uses generic phrasing) and drops local OneDrive file:/// links from the references section. Also removes the now-unused dictionary entry.

Restore the original ordering of pre-existing imports; the additive PR only inserts the single .decision line and two __all__ entries. The earlier reorder was an isort auto-fix not required by CI lint rules (which use --select E,F,W only).

Reverts the global .cspell-repo-terms.txt additions. Reword classmethod and xfail in ADR 0011 (PR-introduced terms). Add an in-file cspell:ignore directive to AGENTS.md for pre-existing technical terms (pytest, mypy, isort, pyupgrade, Pydantic, Docstrings) that were inherited debt surfaced by editing the file.

Restore classmethod and xfail in ADR 0011 (their precise meaning matters). Drop the cspell:ignore directive from AGENTS.md and instead track all 8 surfaced terms in .cspell-repo-terms.txt: classmethod, Docstrings, isort, mypy, Pydantic, pytest, pyupgrade, xfail.

Implements PR (c) / T8 for the unify-policy-decisions plan as a purely additive parity harness. The tests enumerate the policy leak surface inventory and mark current adapter/surface gaps with strict xfail entries so follow-on adapter conversion PRs can remove one allowlist entry at a time. Each subsequent PR (d/eᵢ) should convert the corresponding adapter or surface to the canonical PolicyViolationError contract, then remove its matching xfail row from these harnesses. The skip rows document inventory items that need an injection seam before an in-process denial can be asserted.

Inventory source: agt-policy-leak-surface-inventory.md.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
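As a rough illustration of how xfail rows and skip rows might sit side by side in a string no-leak harness, consider the sketch below. The surface functions, the `INTERNAL_MARKERS` tuple, and the row ids are hypothetical; the real inventory and assertions live in tests/test_adapter_str_no_leak.py and are not reproduced here.

```python
import pytest

# Hypothetical substrings that would indicate leaked policy internals.
INTERNAL_MARKERS = ("rule_id=", "blocked_patterns", "max_tool_calls")


def sanitized_surface():
    # A converted surface returns only the sanitized public message.
    return "Action denied by policy."


def leaky_surface():
    # An unconverted surface still echoes rule internals to the caller.
    return "denied: rule_id=deny_shell matched blocked_patterns"


def leaked_markers(text):
    """Return the internal markers that appear in a denial string."""
    return [marker for marker in INTERNAL_MARKERS if marker in text]


SURFACES = [
    pytest.param(sanitized_surface, id="sanitized"),
    pytest.param(
        leaky_surface,
        id="leaky",
        # Known gap: removed by the PR that converts this surface.
        marks=pytest.mark.xfail(strict=True, reason="denial text names an internal rule"),
    ),
    pytest.param(
        None,
        id="out-of-process",
        # No way to trigger the denial in-process yet, so the row is documented
        # but not executed.
        marks=pytest.mark.skip(reason="needs an injection seam"),
    ),
]


@pytest.mark.parametrize("denial_text", SURFACES)
def test_denial_string_has_no_policy_internals(denial_text):
    assert leaked_markers(denial_text()) == []
```

The skip row keeps the inventory item visible in test reports without asserting anything, whereas the strict xfail row actively fails once the surface stops leaking and the row has not been removed.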

github-actions · 2026-04-29T23:06:48Z

PR Review Summary

Check	Status	Details
🔍 Code Review	⏳ Pending	Awaiting results
🛡️ Security Scan	⏳ Pending	Awaiting results
🔄 Breaking Changes	⏳ Pending	Awaiting results
📝 Docs Sync	⏳ Pending	Awaiting results
🧪 Test Coverage	⏳ Pending	Awaiting results

Verdict: ⏳ Still running

imran-siddique

Approving adapter parity test harness.

…crosoft#1598)

* feat(policies): additive PolicyCheckResult + decision factories
* chore(spell): add foundation PR terms to cspell dictionary
* docs(adr): remove personal/internal refs from ADR 0011
* chore(policies): minimize __init__.py churn
* chore(spell): scope spell-check fix to PR-introduced terms
* chore(spell): account for terms in repo dictionary
* test(policies): add adapter parity harness with xfail allowlist

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

eltoncarr-ms and others added 7 commits April 29, 2026 13:58

chore(spell): add foundation PR terms to cspell dictionary

52f674c

docs(adr): remove personal/internal refs from ADR 0011

7b3f818

chore(policies): minimize __init__.py churn

48a4ea5

github-actions Bot added the documentation, tests, and size/XL labels Apr 29, 2026

imran-siddique marked this pull request as ready for review April 30, 2026 03:47

imran-siddique approved these changes Apr 30, 2026

imran-siddique merged commit a18dba6 into microsoft:main Apr 30, 2026
84 of 86 checks passed

imran-siddique merged 7 commits into microsoft:main from eltoncarr-ms:dev/eltonc/agt-unify-policy-decisions-parity-harness
