feat(ASI): add behavioral trust evidence type specification by 0xbrainkid · Pull Request #819 · OWASP/www-project-top-10-for-large-language-model-applications

0xbrainkid · 2026-04-07T14:55:50Z

Summary

Adds a formal behavioral trust evidence type specification to the ASI agentic top 10 documentation. This addresses the gap identified in issue #802 around formalizing behavioral trust as a control class for runtime enforcement.

Per @desiorac's invitation in issue #802.

What it adds

A standardized evidence type for expressing agent behavioral trustworthiness as an input to admissibility predicates at mutation boundaries.

- Evidence structure: trust_score, derivation, drift_status, verification
- Trust scoring formula: success_rate × confidence(volume), scoped per task class; deterministic, auditable, no ML required
- Three enforceability tiers:
  - Strong (on-chain atomic, ~400ms)
  - Bounded (version-anchored with configurable TTL)
  - Detectable-only (self-declared or signed without anchor)
- Integration points: OWASP admissibility predicates, mutation boundary gate (from the issue #802 discussion), W3C TrustProvider interface, LangGraph multi-provider trust
- Security considerations: Sybil resistance via confidence function, temporal decay, cold-start delegation chain
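
As a rough illustration of the scoring formula listed above, the sketch below computes success_rate × confidence(volume) for a single task class. The summary does not define confidence(volume), so the saturating form `volume / (volume + k)`, the constant `k = 50`, and the `TaskClassHistory` record are assumptions made only for this example; they show why a short history, such as a freshly created identity, scores low even with a perfect success rate.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TaskClassHistory:
    """Outcome counts for one task class (hypothetical record shape)."""
    task_class: str
    successes: int
    failures: int


def confidence(volume: int, k: int = 50) -> float:
    """Assumed saturating confidence: 0 at zero volume, approaching 1 as volume grows."""
    if volume < 0 or k <= 0:
        raise ValueError("volume must be >= 0 and k must be > 0")
    return volume / (volume + k)


def trust_score(history: TaskClassHistory, k: int = 50) -> float:
    """success_rate x confidence(volume), scoped to a single task class."""
    volume = history.successes + history.failures
    if volume == 0:
        return 0.0  # cold start: no evidence, so no behavioral trust
    success_rate = history.successes / volume
    return success_rate * confidence(volume, k)


# Five flawless payments still score low because the volume is small.
new_agent = TaskClassHistory("payment", successes=5, failures=0)
# A long read-only history scores high, but only for the "read" class.
reader = TaskClassHistory("read", successes=450, failures=50)
print(round(trust_score(new_agent), 3), round(trust_score(reader), 3))
```

Because the function uses only counts and one constant, the same inputs always produce the same score, which is the deterministic and auditable property the summary calls out.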

Connections to existing work

- Maps directly to the six-layer governance model that emerged in issue #802 (Runtime Enforcement Mapping for OWASP Agentic Top 10 (ASI01–ASI10))
- Complements the 5-layer decomposition (@QueBallSharken, @desiorac): authorization → execution evidence → mutation authority → boundary integrity → behavioral trust (this spec)
- Feeds into the MITRE ATLAS follow-up artifact @QueBallSharken proposed

Adds a formal evidence type definition for behavioral trust scoring as a control class input to ASI01-ASI03 runtime enforcement. Includes:
- Evidence structure (trust_score, drift_status, verification)
- Trust scoring formula: success_rate × confidence(volume), per-task-class
- Enforceability classification: strong / bounded / detectable-only
- Integration points: OWASP admissibility predicates, mutation boundary (MITRE), TrustProvider interface (W3C/LangGraph)
- Security considerations: Sybil resistance, temporal decay, cold-start

Referenced from discussion OWASP#802 (Runtime Enforcement Mapping).

@desiorac

…table fingerprint

Per @desiorac's review in issue OWASP#802: baseline_version (opaque string) was insufficient because it doesn't provide the monotonic reference property needed for replay-verifiable proofs. Replaced with:
- baseline_snapshot_hash: SHA-256 of canonicalized baseline (JCS)
- baseline_snapshot_ts: when baseline was computed
- Added MUST requirement: verifiers reject enforcement-mode evidence without this field
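
The sketch below shows one way the baseline fingerprint and the reject rule from this commit could look. It is an illustration under assumptions, not the spec's implementation: `canonicalize` only approximates JCS (RFC 8785) with sorted keys and compact separators, which is adequate for simple payloads of strings, integers, booleans and null but not for floats or unusual key ordering cases. The `mode="enforcement"` flag and the `check_evidence` helper are hypothetical names, and the check covers only baseline_snapshot_hash because the commit message says "this field" without naming both.

```python
import hashlib
import json


def canonicalize(baseline: dict) -> bytes:
    """Approximate JCS: sorted keys, no insignificant whitespace, UTF-8 output."""
    text = json.dumps(
        baseline,
        sort_keys=True,
        separators=(",", ":"),
        ensure_ascii=False,
        allow_nan=False,  # NaN/Infinity are not valid JSON, so refuse them
    )
    return text.encode("utf-8")


def baseline_snapshot_hash(baseline: dict) -> str:
    """SHA-256 hex digest of the canonicalized baseline."""
    return hashlib.sha256(canonicalize(baseline)).hexdigest()


def check_evidence(evidence: dict, mode: str) -> None:
    """Reject enforcement-mode evidence that carries no baseline fingerprint."""
    if mode == "enforcement" and not evidence.get("baseline_snapshot_hash"):
        raise ValueError("enforcement-mode evidence requires baseline_snapshot_hash")


# Key order does not change the fingerprint, so any verifier can recompute it.
a = baseline_snapshot_hash({"task_class": "payment", "successes": 120})
b = baseline_snapshot_hash({"successes": 120, "task_class": "payment"})
print(a == b)
```

Unlike an opaque version string, the digest can be recomputed from the baseline material itself, which is what makes a replayed proof checkable.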

@desiorac

Per @desiorac review: high-volume read-only calls inflate confidence for the whole agent, allowing write/payment operations to reach 'strong' enforceability on borrowed trust. Fix: gates MUST evaluate task_class-scoped evidence. cross_class_score is optional for display but MUST NOT be used for enforcement decisions. Enforceability tier is now task-class-specific.

QueBallSharken

Agreed. This is the correct constraint.

Behavioral trust only has enforcement value if it is scoped to the operation class being authorized. High-volume low-risk activity must not inflate admissibility for write/payment/mutation classes.

So I support making this normative:

- task_class is required for gate evaluation
- the evaluated evidence must match the current operation type
- cross_class_score may be retained for display or analysis, but MUST NOT be used for enforcement decisions

That keeps the trust artifact aligned to the guarded primitive and prevents cross-class borrowing of trust.
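
The three rules above can be sketched as a small gate function. This is a minimal sketch, assuming a hypothetical `TrustEvidence` record and a caller-supplied `threshold`; neither the record shape nor any threshold value comes from the PR.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class TrustEvidence:
    """Hypothetical evidence record; field names follow the discussion."""
    task_class: Optional[str]
    trust_score: float
    cross_class_score: Optional[float] = None  # display/analysis only


def admit(evidence: TrustEvidence, operation_class: str, threshold: float) -> bool:
    """Evaluate behavioral trust for one operation at the mutation boundary."""
    if evidence.task_class is None:
        return False  # task_class is required for gate evaluation
    if evidence.task_class != operation_class:
        return False  # evidence must match the current operation type
    # cross_class_score is deliberately never read here.
    return evidence.trust_score >= threshold


read_evidence = TrustEvidence("read", trust_score=0.95, cross_class_score=0.95)
pay_evidence = TrustEvidence("payment", trust_score=0.85)
print(admit(read_evidence, "payment", 0.8))  # read history cannot admit a payment
print(admit(pay_evidence, "payment", 0.8))
```

Failing closed on a missing or mismatched task_class is the design choice that blocks the borrowed-trust case: a strong read history contributes nothing to a payment decision.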

QueBallSharken

Approved.

This closes an actual gap from #802 by turning behavioral trust into a defined evidence type instead of leaving it as an implied concept. The important part is that the enforcement path is now constrained correctly:

- trust evidence is evaluated at the mutation boundary
- "task_class" is required for gate evaluation
- enforcement is bound to the current operation class
- "cross_class_score" is not allowed to influence admissibility

That keeps trust aligned to the guarded primitive and prevents cross-class borrowing, which is the main failure mode this needed to avoid.

I also agree with the follow-up fixes:

- "baseline_snapshot_hash" being explicitly SHA-256 of immutable baseline material
- per-task-class gate evaluation being made normative

Those changes make the artifact more auditable, more deterministic, and more usable as a real enforcement input rather than just descriptive metadata.

Approve.

QueBallSharken · 2026-04-12T08:36:38Z

Clarifying the boundary

This is a useful addition, but I want to keep one distinction explicit in the record.

A behavioral trust evidence type is a control/evidence input to admissibility at a guarded boundary.

It is not by itself the same thing as the broader architecture question BBIS is trying to keep visible.

The unresolved BBIS-class question is whether the governing invariant remains live across all mutation-capable boundaries until the true irreversible mutation authority / primitive, rather than only being checked locally at one boundary.

So I would frame this as:

- a useful enforcement input
- compatible with stronger mutation-bound governance models
- not a substitute for boundary-to-boundary invariant survival analysis or conformance

That distinction matters because otherwise a valid local control can get read as if it settles the larger architectural continuity question, and it does not.

QueBallSharken · 2026-04-23T13:06:31Z

One thing I’d still tighten in the record is the claim boundary around the enforceability tiers.

In BBIS terms, this PR now does a good job defining behavioral trust as a scoped enforcement input at a guarded boundary. What it should still say explicitly is what each tier is and is not claiming.

In particular:

- "Strong" should be read as strong for the guarded boundary and stated enforcement context, not as a claim that the same governing invariant remained live across all mutation-capable boundaries to the true irreversible mutation authority / primitive.
- "Bounded" should state the scope and freshness assumptions under which the trust evidence is being used.
- "Detectable-only" should remain clearly non-preventive.

Related to that, I think the freshness rule still matters: what invalidates behavioral trust evidence, when reevaluation is required, and whether drift forces downgrade or fail-closed behavior at the boundary.

My main reason for pushing this is just to keep a useful local control from being overread as if it settles the larger BBIS-class continuity question. It does not need to do that to still be a good addition.
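
To make the tier and freshness questions above concrete, here is one possible policy, sketched under assumptions rather than taken from the PR: drift fails closed, "bounded" evidence expires after a TTL, and "detectable-only" evidence is logged but never contributes to a decision. The `trust_check` name, the three return values, and the TTL handling are all hypothetical, and a result of "allow" is local to this one boundary.

```python
from enum import Enum


class Tier(Enum):
    STRONG = "strong"
    BOUNDED = "bounded"
    DETECTABLE_ONLY = "detectable-only"


def trust_check(tier: Tier, age_seconds: float, ttl_seconds: float,
                drift_detected: bool) -> str:
    """Return 'allow', 'deny', or 'log-only' for this boundary only."""
    if tier is Tier.DETECTABLE_ONLY:
        return "log-only"  # non-preventive: recorded, never used to decide
    if drift_detected:
        return "deny"  # assumed policy: drift fails closed
    if tier is Tier.BOUNDED and age_seconds > ttl_seconds:
        return "deny"  # stale bounded evidence must be reevaluated
    return "allow"


print(trust_check(Tier.BOUNDED, age_seconds=120, ttl_seconds=60, drift_detected=False))
print(trust_check(Tier.STRONG, age_seconds=0, ttl_seconds=60, drift_detected=False))
```

A downgrade policy (for example, treating drifted "strong" evidence as "detectable-only") would be an equally valid reading of the open question; the sketch picks fail-closed only because it is the simpler of the two to state.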

0xbrainkid requested review from guerilla7, hoeg and itskerenkatz as code owners April 7, 2026 14:55

0xbrainkid mentioned this pull request Apr 7, 2026

Runtime Enforcement Mapping for OWASP Agentic Top 10 (ASI01–ASI10) #802

Open

brainGROWTH added 2 commits April 7, 2026 18:54

QueBallSharken reviewed Apr 8, 2026


QueBallSharken approved these changes Apr 9, 2026


0xbrainkid wants to merge 3 commits into OWASP:main from 0xbrainkid:feat/behavioral-trust-evidence-type
