test(policies): adapter parity harness for unify-policy-decisions #1598
Merged
imran-siddique merged 7 commits on Apr 30, 2026
Adds an additive structured policy-check contract for integration-layer governance, addressing the policy-internals leak surfaces enumerated in .plans/agt-unify-policy-decisions.md (PR (a)+(b) foundation).

* New module `agent_os.policies.decision` with `ViolationCategory` enum and `PolicyCheckResult` Pydantic model (`to_legacy_tuple`/`to_public_dict` serializers).
* New module `agent_os.policies.decision_factory`: single source of truth for denial result construction; sanitized public-message templates keyed by category.
* `PolicyViolationError.from_check_result` classmethod (additive); legacy `(message, error_code, details)` constructor preserved verbatim. `str(e)` is the sanitized public_message; `e.details['detail']` retains audit fidelity.
* `BaseIntegration` gains `pre_execute_check` / `post_execute_check` (sync+async) returning `PolicyCheckResult`. Legacy tuple methods reimplemented as thin wrappers; reason strings byte-identical (AC-5).
* `AsyncGovernedWrapper` and `PolicyInterceptor` migrated internally to `*_check` variants. External adapter API unchanged.
* ADR docs/adr/0011-additive-policy-check-contract.md with Host Migration Guide (AC-15, AC-16). 0009 RATS unaffected.
* AGENTS.md updated with one-paragraph opt-in snippet (T13).

Tests: 120 new (test_policy_check_result_no_leak, test_policy_violation_error_safety, test_legacy_constructors, test_pre_execute_check_contract, test_public_api_surface). Baseline 257 still green. Zero new ruff or mypy findings on touched files.

Refs microsoft#1574

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
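For readers who have not seen the foundation PR, the contract described in this commit message can be approximated as follows. This is a minimal sketch, not the code from `agent_os.policies`: it uses a stdlib dataclass in place of the Pydantic model, and the enum members, field names (`allowed`, `category`, `detail`), the `deny` helper, the message templates, the `(allowed, reason)` legacy tuple shape, and the use of the category value as `error_code` are all assumptions made for illustration. Only the names `ViolationCategory`, `PolicyCheckResult`, `to_legacy_tuple`, `to_public_dict`, `PolicyViolationError.from_check_result`, `public_message`, and `details['detail']` come from the text above.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class ViolationCategory(str, Enum):
    # Hypothetical members; the real enum values are not listed in this PR.
    TOOL_NOT_ALLOWED = "tool_not_allowed"
    CONTENT_BLOCKED = "content_blocked"


# Sanitized public-message templates keyed by category (illustrative wording).
PUBLIC_MESSAGES = {
    ViolationCategory.TOOL_NOT_ALLOWED: "This action is not permitted by policy.",
    ViolationCategory.CONTENT_BLOCKED: "This content was blocked by policy.",
}


@dataclass(frozen=True)
class PolicyCheckResult:
    allowed: bool
    category: Optional[ViolationCategory] = None
    public_message: str = ""
    detail: str = ""  # internal reason, kept for audit only

    def to_legacy_tuple(self) -> tuple:
        # Assumed legacy shape: (allowed, reason).
        return (self.allowed, self.detail or None)

    def to_public_dict(self) -> dict:
        # Deliberately omits `detail` so policy internals do not leak.
        return {
            "allowed": self.allowed,
            "category": self.category.value if self.category else None,
            "message": self.public_message,
        }


def deny(category: ViolationCategory, detail: str) -> PolicyCheckResult:
    # Hypothetical factory: one place that builds every denial result.
    return PolicyCheckResult(False, category, PUBLIC_MESSAGES[category], detail)


class PolicyViolationError(Exception):
    def __init__(self, message, error_code=None, details=None):
        # Legacy (message, error_code, details) constructor.
        super().__init__(message)
        self.message = message
        self.error_code = error_code
        self.details = details or {}

    @classmethod
    def from_check_result(cls, result: PolicyCheckResult) -> "PolicyViolationError":
        # str(e) carries only the sanitized message; the raw reason goes to details.
        code = result.category.value if result.category else None
        return cls(result.public_message, error_code=code,
                   details={"detail": result.detail})
```

Under these assumptions, `str(PolicyViolationError.from_check_result(deny(...)))` yields only the template text, while `e.details['detail']` still holds the original reason for audit logs.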
Adds pytest, mypy, isort, pyupgrade, Pydantic, Docstrings, classmethod, xfail, Zava.
Removes a host name from the migration example (uses generic phrasing) and drops local OneDrive file:/// links from the references section. Also removes the now-unused dictionary entry.
Restore the original ordering of pre-existing imports; the additive PR only inserts the single .decision line and two __all__ entries. The earlier reorder was an isort auto-fix not required by CI lint rules (which use --select E,F,W only).
Reverts the global .cspell-repo-terms.txt additions. Reword classmethod and xfail in ADR 0011 (PR-introduced terms). Add an in-file cspell:ignore directive to AGENTS.md for pre-existing technical terms (pytest, mypy, isort, pyupgrade, Pydantic, Docstrings) that were inherited debt surfaced by editing the file.
Restore classmethod and xfail in ADR 0011 (their precise meaning matters). Drop the cspell:ignore directive from AGENTS.md and instead track all 8 surfaced terms in .cspell-repo-terms.txt: classmethod, Docstrings, isort, mypy, Pydantic, pytest, pyupgrade, xfail.
Implements PR (c) / T8 for the unify-policy-decisions plan as a purely additive parity harness. The tests enumerate the policy leak surface inventory and mark current adapter/surface gaps with strict xfail entries so follow-on adapter conversion PRs can remove one allowlist entry at a time.

Each subsequent PR (d/eᵢ) should convert the corresponding adapter or surface to the canonical PolicyViolationError contract, then remove its matching xfail row from these harnesses. The skip rows document inventory items that need an injection seam before an in-process denial can be asserted.

Inventory source: agt-policy-leak-surface-inventory.md.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
PR Review Summary
Verdict: ⏳ Still running
imran-siddique approved these changes Apr 30, 2026
imran-siddique (Member) left a comment
Approving adapter parity test harness.
imran-siddique pushed a commit to imran-siddique/agent-governance-toolkit that referenced this pull request May 4, 2026
…crosoft#1598)
Squashed commits:
* feat(policies): additive PolicyCheckResult + decision factories
* chore(spell): add foundation PR terms to cspell dictionary
* docs(adr): remove personal/internal refs from ADR 0011
* chore(policies): minimize __init__.py churn
* chore(spell): scope spell-check fix to PR-introduced terms
* chore(spell): account for terms in repo dictionary
* test(policies): add adapter parity harness with xfail allowlist
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Description
Adds the PR (c) / T8 parity harness for the Unify Policy Decisions Across Surfaces plan.
This is intentionally test-only and depends on the foundation PR: blocked by #1594.
The two new harnesses are:
* `tests/test_adapter_exception_identity.py`: strict-xfail allowlist for adapter `PolicyViolationError` identity parity.
* `tests/test_adapter_str_no_leak.py`: strict-xfail/skip allowlist for denial string sanitization and audit-fidelity parity, based on the policy leak surface inventory.

Each subsequent adapter/surface conversion PR (d/eᵢ) should convert its target to the canonical policy-decision contract, then remove the corresponding xfail row from this allowlist.
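A rough illustration of what a sanitization and audit-fidelity row might assert is shown here. It is a sketch under stated assumptions, not the contents of tests/test_adapter_str_no_leak.py: the surface functions, the `INTERNAL_DETAIL` string, and the `check_no_leak_and_audit` helper are hypothetical. The only contract points taken from this PR are that `str(e)` should be the sanitized message and `e.details['detail']` should keep the original reason.

```python
import pytest


class PolicyViolationError(Exception):
    """Stand-in with the legacy (message, error_code, details) constructor."""

    def __init__(self, message, error_code=None, details=None):
        super().__init__(message)
        self.error_code = error_code
        self.details = details or {}


# Hypothetical internal reason that must never reach the caller-facing string.
INTERNAL_DETAIL = "rule 'no-shell' matched pattern 'rm -rf'"


def sanitized_surface():
    raise PolicyViolationError(
        "This action is not permitted by policy.",
        details={"detail": INTERNAL_DETAIL},
    )


def leaky_surface():
    # Gap: the raw rule text is used as the public message.
    raise PolicyViolationError(INTERNAL_DETAIL, details={"detail": INTERNAL_DETAIL})


def check_no_leak_and_audit(deny):
    with pytest.raises(PolicyViolationError) as excinfo:
        deny()
    err = excinfo.value
    assert INTERNAL_DETAIL not in str(err)             # sanitization
    assert err.details.get("detail") == INTERNAL_DETAIL  # audit fidelity


@pytest.mark.parametrize(
    "deny",
    [
        pytest.param(sanitized_surface, id="sanitized"),
        pytest.param(
            leaky_surface,
            id="leaky",
            marks=pytest.mark.xfail(strict=True, reason="str() leaks rule text"),
        ),
        pytest.param(
            None,
            id="out-of-process",
            marks=pytest.mark.skip(reason="needs an injection seam"),
        ),
    ],
)
def test_denial_string_is_sanitized(deny):
    check_no_leak_and_audit(deny)
```

The skip row stands for a surface where no denial can be triggered in-process yet; it records the inventory item without asserting anything about it.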
Attribution & Prior Art
Prior art / related projects (if any):
None.
AI & IP Disclosure
AI tools used (if any):
GitHub Copilot CLI used for scaffolding; tests were manually reviewed and validated.
Related Issues
Blocked by PR #1594.