
Engine API: conformance test suite #2792

Description

@Ricky-G

Tracker: #2729
Epic: 0 (Engine API contract + reference adapter + gap fixes)
ADR reference: docs/adr/0028-agt-studio-unified-ui.md; spec from ADR 0029 (batch #2 (filed as #2787)); capability mechanism from batch #3 (filed as #2788)
Template: Feature request

Filing metadata — Title: Engine API: conformance test suite. Milestone: AGT Studio. Labels: enhancement, agent-mesh, architecture, Priority: HIGH, tests.

Summary

Build the Engine API conformance test suite. Any engine implementation — the reference FastAPI adapter from batch #4 (filed as #2791), a custom in-house adapter, or a third-party reimplementation — must pass this suite to be considered a conformant Engine API target for AGT Studio.

The suite includes capability-metadata checks (every route declares flags, no runtime_mutating: true and read_only_surface: true collision, every MVP route from ADR 0029 is present with the right flags), schema conformance (requests and responses match the OpenAPI spec), and the read-only invariant (the set of routes tagged read_only_surface: true exactly matches the Studio-facing surface defined in ADR 0029).
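
As a rough illustration of the capability-metadata checks, the sketch below validates a route table held as a plain dict. It assumes the flags have already been extracted into a mapping keyed by (method, path); how the flags are actually exposed is defined by ADR 0029 and batch #3, not here. The function name check_capability_flags, the GET route, and the flag values in the sample are hypothetical.

```python
from typing import Dict, List, Tuple

Route = Tuple[str, str]  # (HTTP method, path)
REQUIRED_FLAGS = ("runtime_mutating", "user_intent_required", "read_only_surface")


def check_capability_flags(routes: Dict[Route, dict]) -> List[str]:
    """Return human-readable violations; an empty list means the checks pass."""
    violations = []
    for (method, path), flags in sorted(routes.items()):
        name = f"{method} {path}"
        missing = [f for f in REQUIRED_FLAGS if f not in flags]
        if missing:
            violations.append(f"{name}: missing flags {missing}")
        # A route may not be both mutating and part of the read-only surface.
        if flags.get("runtime_mutating") is True and flags.get("read_only_surface") is True:
            violations.append(f"{name}: runtime_mutating and read_only_surface are both true")
    return violations


# Illustrative data only; the real route list and flag values come from ADR 0029.
sample = {
    ("POST", "/api/v1/policy/reload"): {
        "runtime_mutating": True,
        "user_intent_required": True,  # assumed value for illustration
        "read_only_surface": False,
    },
    ("GET", "/api/v1/example"): {  # hypothetical route, deliberately broken
        "runtime_mutating": True,
        "read_only_surface": True,
    },
}

for line in check_capability_flags(sample):
    print(line)
```

Returning a list of violations rather than raising on the first one lets a single test run report every problem in an adapter's surface at once.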

Scope

In scope

  • A pytest-based suite, runnable via pytest --pyargs agentmesh.server.engine_api.tests.conformance (path TBD; pytest needs --pyargs to resolve a dotted module path), parameterized over an "engine under test" URL.
  • A pytest fixture that starts the reference adapter (from batch #4, filed as #2791) in-process and points the suite at it. The same suite is runnable against an external URL via an environment variable for third-party adapters.
  • Capability-metadata checks:
    • Every MVP route from ADR 0029 is present.
    • Every route declares all three flags (runtime_mutating, user_intent_required, read_only_surface).
    • No route has both runtime_mutating: true and read_only_surface: true.
    • The set of routes with read_only_surface: true exactly matches the ADR 0029 Studio-facing surface (no missing, no extras).
    • POST /api/v1/policy/reload (and any other documented mutating route) carries runtime_mutating: true.
  • Schema conformance checks: every documented endpoint returns a response that matches its OpenAPI schema. Use schemathesis or equivalent if it integrates cleanly; otherwise a lightweight roll-your-own JSON-Schema validator against the dereferenced OpenAPI document.
  • Versioning checks: GET /version or equivalent returns the version contract from ADR 0029; mismatch responses are well-formed.
  • A CI workflow that runs the suite against the reference adapter on every PR that touches the engine_api/ module or the OpenAPI doc.
  • Docs: docs/studio/engine-api-conformance.md explaining how to run the suite against a custom adapter and what each check covers.
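
For the schema conformance item, the "roll-your-own" fallback might look something like the sketch below. It is not a full JSON Schema implementation: it assumes the OpenAPI document has already been dereferenced (no $ref), that each "type" is a single string, and it covers only type, enum, required, properties, and items. The version_schema shape is a hypothetical stand-in; the real version contract is in ADR 0029.

```python
from typing import Any, List

_TYPES = {
    "object": dict, "array": list, "string": str, "boolean": bool,
    "integer": int, "number": (int, float), "null": type(None),
}


def validate(instance: Any, schema: dict, path: str = "$") -> List[str]:
    """Check a small JSON Schema subset and return a list of error strings."""
    errors = []
    expected = schema.get("type")
    if expected is not None:
        # bool is a subclass of int in Python, so reject it for numeric types.
        bool_as_number = isinstance(instance, bool) and expected in ("integer", "number")
        if not isinstance(instance, _TYPES[expected]) or bool_as_number:
            return [f"{path}: expected {expected}, got {type(instance).__name__}"]
    if "enum" in schema and instance not in schema["enum"]:
        errors.append(f"{path}: {instance!r} not in enum")
    if isinstance(instance, dict):
        for key in schema.get("required", []):
            if key not in instance:
                errors.append(f"{path}: missing required property '{key}'")
        for key, sub in schema.get("properties", {}).items():
            if key in instance:
                errors.extend(validate(instance[key], sub, f"{path}.{key}"))
    if isinstance(instance, list) and "items" in schema:
        for i, item in enumerate(instance):
            errors.extend(validate(item, schema["items"], f"{path}[{i}]"))
    return errors


# Hypothetical response schema, for illustration only.
version_schema = {
    "type": "object",
    "required": ["version"],
    "properties": {"version": {"type": "string"}},
}

print(validate({"version": "1.0"}, version_schema))
print(validate({"version": 1}, version_schema))
print(validate({}, version_schema))
```

Anything beyond this subset (oneOf, format, additionalProperties, and so on) is where schemathesis or a complete JSON Schema library would be the safer choice.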

Out of scope

  • The CI test inside Studio that asserts the generated client references zero runtime_mutating: true routes → Epic 1d. (Different layer: Studio's invariant is about its own generated client; conformance suite is about the engine's surface.)
  • Performance, load, or stress testing.
  • Security testing beyond the read-only invariant (no auth fuzz, no injection fuzz; those are separate issues if needed).
  • WebSocket / Epic 7a conformance — out of scope until Epic 7a's spec lands.

Background / codebase grounding

Dependencies

Deliverables

  • agent-governance-python/agent-mesh/tests/engine_api/conformance/ (path per ADR 0029): pytest suite, fixtures, helpers.
  • agent-governance-python/agent-mesh/tests/engine_api/conformance/conftest.py: in-process adapter fixture, URL-override fixture, parametrization.
  • CI workflow update (.github/workflows/<existing-mesh-workflow>.yml): conformance job runs on every PR touching engine_api/ or the OpenAPI doc. Per repo AGENTS.md, this should be added with proper permissions: contents: read and pinned actions.
  • docs/studio/engine-api-conformance.md.
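
One possible shape for the URL-override logic in conftest.py is sketched below. The environment variable name ENGINE_API_URL and the start_in_process callable are assumptions; the issue does not fix either, and the real in-process starter would come from the reference adapter. The helper is written as a plain context manager so it can be wrapped by a session-scoped pytest fixture.

```python
import os
from contextlib import contextmanager
from typing import Callable, Iterator, Mapping, Tuple

ENV_VAR = "ENGINE_API_URL"  # hypothetical name; the issue does not specify one


@contextmanager
def engine_under_test(
    start_in_process: Callable[[], Tuple[str, Callable[[], None]]],
    environ: Mapping[str, str] = os.environ,
) -> Iterator[str]:
    """Yield the base URL of the engine the suite should target."""
    external = environ.get(ENV_VAR, "").strip()
    if external:
        # Third-party adapter: the suite does not own its lifecycle.
        yield external.rstrip("/")
        return
    # No override: start the reference adapter and stop it afterwards.
    url, stop = start_in_process()
    try:
        yield url.rstrip("/")
    finally:
        stop()  # shut the in-process adapter down even if tests fail


# Usage with a fake starter standing in for the reference adapter.
stopped = []


def fake_start() -> Tuple[str, Callable[[], None]]:
    return "http://127.0.0.1:8000/", lambda: stopped.append(True)


with engine_under_test(fake_start, environ={}) as base_url:
    print("in-process target:", base_url)

with engine_under_test(fake_start, environ={ENV_VAR: "https://engine.example/"}) as base_url:
    print("external target:", base_url)
```

Passing environ explicitly keeps the helper testable without mutating the real process environment.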

Acceptance criteria

  • Suite runs green against the reference adapter from batch #4 (filed as #2791).
  • Suite fails (intentionally) when run against a deliberately broken adapter: missing route → fails; missing capability flag → fails; runtime_mutating: true + read_only_surface: true → fails; schema mismatch → fails. Tests for these failure modes are themselves in the suite (test_meta_* or similar) so we know enforcement works.
  • Suite is parameterizable over a remote URL (env var) for non-in-process targets.
  • CI workflow added and green on a sample PR.
  • Docs page describes how to run, what each check covers, and how to onboard a custom adapter.
  • Coverage: conformance suite itself ≥ 95% (the suite is small and high-leverage; coverage gaps are likely bugs).
  • Working agreements satisfied: linked from umbrella, milestone, labels, --body-file, rendered-description verification.
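
The failure-mode criterion above could be exercised along the following lines. This is only a sketch: it breaks a plain dict standing in for an adapter's declared surface, whereas the real suite would apply equivalent breakages to an actual adapter started by a fixture. The check function is a stand-in for the real conformance checks, the mutator names are hypothetical, and the schema-mismatch case is omitted for brevity.

```python
import copy
from typing import Callable, Dict, List

FLAGS = ("runtime_mutating", "user_intent_required", "read_only_surface")
Surface = Dict[str, dict]  # "METHOD /path" -> declared capability flags


def check(surface: Surface, expected_routes: List[str]) -> List[str]:
    """Stand-in for the real conformance checks; returns a list of violations."""
    found = [f"missing route {r}" for r in expected_routes if r not in surface]
    for route, flags in surface.items():
        found += [f"{route}: missing flag {f}" for f in FLAGS if f not in flags]
        if flags.get("runtime_mutating") and flags.get("read_only_surface"):
            found.append(f"{route}: mutating route tagged read-only")
    return found


def drop_route(surface: Surface, route: str) -> None:
    del surface[route]


def drop_flag(surface: Surface, route: str) -> None:
    del surface[route]["user_intent_required"]


def collide_flags(surface: Surface, route: str) -> None:
    surface[route].update(runtime_mutating=True, read_only_surface=True)


BREAKAGES: Dict[str, Callable[[Surface, str], None]] = {
    "missing_route": drop_route,
    "missing_flag": drop_flag,
    "flag_collision": collide_flags,
}


def run_meta_checks(good: Surface, expected_routes: List[str]) -> Dict[str, List[str]]:
    """Break a known-good surface one way at a time and record what check() reports."""
    assert check(good, expected_routes) == [], "baseline must be conformant"
    results = {}
    for name, mutate in BREAKAGES.items():
        broken = copy.deepcopy(good)  # never mutate the baseline
        mutate(broken, expected_routes[0])
        results[name] = check(broken, expected_routes)
    return results


# Flag values are assumed for illustration.
GOOD = {
    "POST /api/v1/policy/reload": {
        "runtime_mutating": True, "user_intent_required": True, "read_only_surface": False,
    }
}
results = run_meta_checks(GOOD, list(GOOD))
for name, violations in results.items():
    assert violations, f"suite did not catch {name}"
```

Each breakage is applied to a fresh deep copy, so one failure mode cannot mask another.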

Notes for picking this up

  • The suite is the place to put the read-only invariant in its strongest form. Make the assertion an exact set equality on the Studio-facing surface, not "no runtime_mutating: true in the read-only set". An exact-set check catches both directions of drift (a new mutating route accidentally tagged read-only and a new read-only route being missed by the Studio surface declaration).
  • Keep the "meta tests" (tests that prove the suite itself catches breakage) honest by using fixtures that produce a deliberately broken adapter at test time rather than mocking. Mocks here defeat the purpose.
  • If schemathesis is too heavy or too slow, a thin custom validator against the dereferenced OpenAPI schema is fine for MVP. Document the choice in the conformance docs page.
  • The CI workflow change is the most security-sensitive part of this issue per repo AGENTS.md. Pin actions by SHA, scope permissions to least privilege, and declare contents: read at the top level, with write access scoped to a job only where actually needed (this job needs no writes).
  • Working agreements for child issues apply.
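
To make the exact-set recommendation in the first note concrete, here is a minimal sketch of the comparison. The route names are hypothetical; the real Studio-facing surface is the one declared in ADR 0029, and read_only_drift is an illustrative helper name.

```python
from typing import Dict, Set


def read_only_drift(declared: Set[str], tagged: Set[str]) -> Dict[str, Set[str]]:
    """Compare the Studio-facing surface with routes tagged read_only_surface: true."""
    return {
        "missing": declared - tagged,  # in the Studio surface but not tagged read-only
        "extra": tagged - declared,    # tagged read-only but not in the Studio surface
    }


def assert_read_only_invariant(declared: Set[str], tagged: Set[str]) -> None:
    """Exact set equality: fails on drift in either direction."""
    drift = read_only_drift(declared, tagged)
    assert declared == tagged, f"read-only surface drift: {drift}"


# Hypothetical routes, for illustration only.
declared = {"GET /api/v1/a", "GET /api/v1/b"}
tagged = {"GET /api/v1/b", "GET /api/v1/c"}

drift = read_only_drift(declared, tagged)
print("missing:", sorted(drift["missing"]))
print("extra:", sorted(drift["extra"]))
```

A check that only looks for runtime_mutating: true inside the tagged set would pass on this sample as long as the extra route is non-mutating, which is why the set equality is the stronger assertion.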
