test(policies): adapter parity harness for unify-policy-decisions #1598

Merged
imran-siddique merged 7 commits into microsoft:main from eltoncarr-ms:dev/eltonc/agt-unify-policy-decisions-parity-harness on Apr 30, 2026
@eltoncarr-ms
Contributor

Description

Adds the PR (c) / T8 parity harness for the Unify Policy Decisions Across Surfaces plan.

This PR is intentionally test-only. It depends on the foundation PR and is blocked by #1594.

The two new harnesses are:

  • tests/test_adapter_exception_identity.py: strict-xfail allowlist for adapter PolicyViolationError identity parity.
  • tests/test_adapter_str_no_leak.py: strict-xfail/skip allowlist for denial string sanitization and audit-fidelity parity, based on the policy leak surface inventory.

Each subsequent adapter/surface conversion PR (d/eᵢ) should convert its target to the canonical policy-decision contract, then remove the corresponding xfail row from this allowlist.

Type of Change

  • Bug fix (non-breaking change that fixes an issue)
  • New feature (non-breaking change that adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to change)
  • Documentation update
  • Maintenance (dependency updates, CI/CD, refactoring)
  • Security fix

Package(s) Affected

  • agent-os-kernel
  • agent-mesh
  • agent-runtime
  • agent-sre
  • agent-governance
  • docs / root

Checklist

  • My code follows the project style guidelines (ruff check)
  • I have added tests that prove my fix/feature works
  • All new and existing tests pass (pytest)
  • I have updated documentation as needed
  • I have signed the Microsoft CLA

Attribution & Prior Art

  • This contribution does not contain code copied or derived from other projects without attribution
  • Any external projects that inspired this design are credited in code comments or documentation
  • If this PR implements functionality similar to an existing open-source project, I have listed it below

Prior art / related projects (if any):
None.

AI & IP Disclosure

  • This contribution is not substantially AI-generated, OR I have disclosed AI tool usage below
  • This contribution does not implement patent-pending or patent-encumbered techniques
  • This contribution does not require an NDA or licensing agreement to understand or use

AI tools used (if any):
GitHub Copilot CLI used for scaffolding; tests were manually reviewed and validated.

Related Issues

Blocked by PR #1594.

eltoncarr-ms and others added 7 commits April 29, 2026 13:58

feat(policies): additive PolicyCheckResult + decision factories

Adds an additive structured policy-check contract for integration-layer governance, addressing the policy-internals leak surfaces enumerated in .plans/agt-unify-policy-decisions.md (PR (a)+(b) foundation).

* New module `agent_os.policies.decision` with `ViolationCategory` enum and `PolicyCheckResult` Pydantic model (`to_legacy_tuple`/`to_public_dict` serializers).

* New module `agent_os.policies.decision_factory` — single source of truth for denial result construction; sanitized public-message templates keyed by category.

* `PolicyViolationError.from_check_result` classmethod (additive); legacy `(message, error_code, details)` constructor preserved verbatim. `str(e)` is the sanitized public_message; `e.details['detail']` retains audit fidelity.

* `BaseIntegration` gains `pre_execute_check` / `post_execute_check` (sync+async) returning `PolicyCheckResult`. Legacy tuple methods reimplemented as thin wrappers; reason strings byte-identical (AC-5).

* `AsyncGovernedWrapper` and `PolicyInterceptor` migrated internally to `*_check` variants. External adapter API unchanged.

* ADR docs/adr/0011-additive-policy-check-contract.md with Host Migration Guide (AC-15, AC-16). 0009 RATS unaffected.

* AGENTS.md updated with one-paragraph opt-in snippet (T13).
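
The contract in the bullets above can be pictured with a dependency-free sketch. It is not the code in `agent_os.policies`: it uses a dataclass where the commit describes a Pydantic model, the enum members and message templates are invented, the field names are guesses, and the legacy tuple is assumed to be `(allowed, reason)`.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class ViolationCategory(str, Enum):
    # Hypothetical members; the commit names the enum but not its values.
    TOOL_BLOCKED = "tool_blocked"
    PATTERN_BLOCKED = "pattern_blocked"


# Sanitized public-message templates keyed by category (illustrative wording).
_PUBLIC_MESSAGES = {
    ViolationCategory.TOOL_BLOCKED: "This tool is not permitted by policy.",
    ViolationCategory.PATTERN_BLOCKED: "The request was blocked by policy.",
}


@dataclass(frozen=True)
class PolicyCheckResult:
    allowed: bool
    category: Optional[ViolationCategory] = None
    public_message: str = ""
    detail: str = ""  # audit-only text; may name rules or patterns

    def to_legacy_tuple(self):
        # Assumed legacy shape: (allowed, reason), reason unchanged.
        return (self.allowed, self.detail)

    def to_public_dict(self):
        # Only sanitized fields leave the process boundary.
        category = self.category.value if self.category else None
        return {"allowed": self.allowed, "category": category,
                "message": self.public_message}


def deny(category, detail):
    """Factory sketch: the single place where denial results are built."""
    return PolicyCheckResult(False, category, _PUBLIC_MESSAGES[category], detail)


class PolicyViolationError(Exception):
    def __init__(self, message, error_code=None, details=None):
        # Legacy (message, error_code, details) constructor kept as-is.
        super().__init__(message)
        self.error_code = error_code
        self.details = details or {}

    @classmethod
    def from_check_result(cls, result):
        # str(e) becomes the sanitized message; audit text goes to details.
        return cls(result.public_message,
                   error_code=result.category.value,
                   details={"detail": result.detail})
```

With this shape, `str(PolicyViolationError.from_check_result(deny(...)))` yields only the template text, while `e.details['detail']` still carries the original reason for audit logs.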

Tests: 120 new (test_policy_check_result_no_leak, test_policy_violation_error_safety, test_legacy_constructors, test_pre_execute_check_contract, test_public_api_surface). Baseline 257 still green. Zero new ruff or mypy findings on touched files.

Refs microsoft#1574

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

chore(spell): add foundation PR terms to cspell dictionary

Adds pytest, mypy, isort, pyupgrade, Pydantic, Docstrings, classmethod, xfail, Zava.

docs(adr): remove personal/internal refs from ADR 0011

Removes a host name from the migration example (uses generic phrasing) and drops local OneDrive file:/// links from the references section. Also removes the now-unused dictionary entry.

chore(policies): minimize __init__.py churn

Restore the original ordering of pre-existing imports; the additive PR only inserts the single .decision line and two __all__ entries. The earlier reorder was an isort auto-fix not required by CI lint rules (which use --select E,F,W only).

chore(spell): scope spell-check fix to PR-introduced terms

Reverts the global .cspell-repo-terms.txt additions. Reword classmethod and xfail in ADR 0011 (PR-introduced terms). Add an in-file cspell:ignore directive to AGENTS.md for pre-existing technical terms (pytest, mypy, isort, pyupgrade, Pydantic, Docstrings) that were inherited debt surfaced by editing the file.

chore(spell): account for terms in repo dictionary

Restore classmethod and xfail in ADR 0011 (their precise meaning matters). Drop the cspell:ignore directive from AGENTS.md and instead track all 8 surfaced terms in .cspell-repo-terms.txt: classmethod, Docstrings, isort, mypy, Pydantic, pytest, pyupgrade, xfail.

test(policies): add adapter parity harness with xfail allowlist

Implements PR (c) / T8 for the unify-policy-decisions plan as a purely additive parity harness. The tests enumerate the policy leak surface inventory and mark current adapter/surface gaps with strict xfail entries so follow-on adapter conversion PRs can remove one allowlist entry at a time.

Each subsequent PR (d/eᵢ) should convert the corresponding adapter or surface to the canonical PolicyViolationError contract, then remove its matching xfail row from these harnesses. The skip rows document inventory items that need an injection seam before an in-process denial can be asserted.

Inventory source: agt-policy-leak-surface-inventory.md.
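
A rough sketch of how the no-leak check and the skip rows might fit together, assuming each inventory row maps to a callable that triggers a denial. The surfaces, the audit string, and the simplified `PolicyViolationError` stand-in are hypothetical and are not taken from the inventory.

```python
import pytest

# Hypothetical internal reason that must never appear in str(e).
INTERNAL_DETAIL = "matched blocked_patterns[2]: 'rm -rf'"


class PolicyViolationError(Exception):
    """Simplified stand-in for the canonical error (assumption)."""

    def __init__(self, message, details=None):
        super().__init__(message)
        self.details = details or {}


def sanitized_surface():
    # Converted surface: public message only, audit text kept in details.
    raise PolicyViolationError("The request was blocked by policy.",
                               {"detail": INTERNAL_DETAIL})


def leaky_surface():
    # Unconverted surface: the internal reason is the exception message.
    raise PolicyViolationError(INTERNAL_DETAIL)


ROWS = [
    pytest.param(sanitized_surface, id="sanitized"),
    pytest.param(leaky_surface, id="leaky",
                 marks=pytest.mark.xfail(strict=True,
                                         reason="str(e) leaks rule text")),
    # Skip row: documented in the inventory, but nothing can be asserted
    # in-process until the surface exposes an injection seam.
    pytest.param(None, id="no-seam",
                 marks=pytest.mark.skip(reason="needs an injection seam")),
]


@pytest.mark.parametrize("trigger", ROWS)
def test_denial_string_does_not_leak(trigger):
    with pytest.raises(PolicyViolationError) as excinfo:
        trigger()
    err = excinfo.value
    assert INTERNAL_DETAIL not in str(err)                # sanitization
    assert err.details.get("detail") == INTERNAL_DETAIL   # audit fidelity
```

In this sketch a skip row documents a gap without asserting anything, while an xfail row asserts a known failure that a later conversion PR is expected to fix.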

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@github-actions

PR Review Summary

| Check | Status | Details |
|---|---|---|
| 🔍 Code Review | ⏳ Pending | Awaiting results |
| 🛡️ Security Scan | ⏳ Pending | Awaiting results |
| 🔄 Breaking Changes | ⏳ Pending | Awaiting results |
| 📝 Docs Sync | ⏳ Pending | Awaiting results |
| 🧪 Test Coverage | ⏳ Pending | Awaiting results |

Verdict: ⏳ Still running

@github-actions (bot) added the documentation (Improvements or additions to documentation), tests, and size/XL (Extra large PR, 500+ lines) labels on Apr 29, 2026
@imran-siddique imran-siddique marked this pull request as ready for review April 30, 2026 03:47
Member

@imran-siddique imran-siddique left a comment


Approving adapter parity test harness.

@imran-siddique imran-siddique merged commit a18dba6 into microsoft:main Apr 30, 2026
84 of 86 checks passed
imran-siddique pushed a commit to imran-siddique/agent-governance-toolkit that referenced this pull request May 4, 2026
…crosoft#1598)
