Add prompt-defense-audit to Red Team Handbook tools#816
Conversation
Adds prompt-defense-audit — a deterministic system prompt defense scanner that checks for defensive posture against 12 common LLM attack vectors using pure regex pattern matching. Maps directly to OWASP LLM Top 10: - LLM01 (Prompt Injection): 6 defense checks - LLM02 (Insecure Output Handling): 1 defense check - LLM06 (Sensitive Info Disclosure): 1 defense check - LLM09 (Overreliance): 1 defense check Zero AI cost, zero dependencies, < 5ms execution. Production-tested at ultralab.tw/probe. GitHub: https://github.com/ppcvote/prompt-defense-audit npm: prompt-defense-audit
|
Status update — added empirical backing since this PR opened:
The MCP audit is particularly relevant for the Red Team Handbook context: it shows that even reference implementations of agent infrastructure ship without basic prompt-level defensive language. A static audit catches this in CI before deployment, complementary to runtime guardrails. Happy to address review feedback when maintainers have a moment. |
|
Update — prompt-defense-audit v1.4.0 released The PR currently references the 12-vector model. v1.4.0 expands to 17 vectors, adding 5 agent-specific defenses derived from a structured analysis of six documented crypto AI agent prompt-injection incidents. Mapping to OWASP LLM Top 10 categories:
Reference incidents (with primary sources in CASE_STUDIES.md):
Companion blog: https://ultralab.tw/en/blog/crypto-ai-agent-prompt-injection-static-analysis If the PR moves forward, happy to update the description to reference the 17-vector model and the case studies. The Red Team Handbook framing fits the new vectors well — they're each grounded in real production failures rather than theoretical attack categories. |
Summary
Adds prompt-defense-audit to the GenAI Red Team Handbook's tools directory and to the resources page.
What This Tool Does
Unlike offensive tools (garak, promptfoo) that probe running LLMs for vulnerabilities,
prompt-defense-audittakes a defensive posture approach: it analyzes system prompt text to determine whether adequate defenses are in place before deployment.Think of it as a configuration audit (is the firewall on?) rather than a penetration test (can I break in?).
OWASP LLM Top 10 Mapping
The tool's 12 defense checks map directly to the OWASP LLM Top 10:
Key Properties
Changes
initiatives/genai_red_team_handbook/tools/prompt-defense-audit/README.md— detailed tool documentation with usage examples and OWASP mappingresources/index.mdlinking to the toolLinks
npm install prompt-defense-audit