Commit 1b98f99

Vedha v1.1.0 — SARIF 2.1.0 report output + README credit/lineage
Adds an opt-in SARIF emission path so Vedha findings can be consumed directly by GitHub Code Scanning, GitLab, Defect Dojo, and any other SARIF-aware scanner UI. Also rewrites the README to credit upstream Shannon clearly and enumerate exactly what Vedha layers on top.

Behaviour:

  ./vedha start -u ... -r ... --report-format sarif

writes <repo>/.shannon/deliverables/comprehensive_security_assessment_report.sarif alongside the existing markdown report. The default (`--report-format md`) is unchanged byte-for-byte.

Wiring:
- apps/cli: new `--report-format md|sarif` flag on `start`. Validated up front. Help text describes the two values.
- apps/cli/docker.ts: forwards VEDHA_REPORT_FORMAT and VEDHA_VERSION to the worker container as env. Env is the right channel because the worker reads them inside `assembleReportActivity` to gate optional emission, and env survives Temporal serialisation without needing pipeline-input plumbing.
- apps/worker/temporal/activities.ts: `assembleReportActivity` now invokes SarifReportOutputProvider after the markdown assembly when VEDHA_REPORT_FORMAT=sarif. SARIF emission is best-effort: a failure there never blocks the markdown path.
- apps/worker/services/sarif-output-provider.ts: new provider. Walks the five `*_exploitation_evidence.md` deliverables that `assembleFinalReport` already consumes; emits one SARIF result per non-empty evidence file with the body as the result message (truncated at 16 KiB).
- apps/worker/services/index.ts: re-exports SarifReportOutputProvider.
- apps/worker/tsconfig.json: excludes __tests__/** from production compile so test code doesn't end up in dist/.

Tool driver advertises five rules tagged with their CWE IDs: vedha.injection (CWE-74), vedha.xss (CWE-79), vedha.auth (CWE-287), vedha.ssrf (CWE-918), vedha.authz (CWE-285).

Test infrastructure:
- vitest dev dep + `test` script on @shannon/worker
- vitest.config.ts in apps/worker
- turbo `test` task wired up
- 5 new SARIF tests covering envelope shape, one-result-per-evidence, empty/missing handling, oversized truncation, output path
- Total: 29/29 pass (24 existing + 5 new)
- pnpm check + build clean

README:
- New "Credit & lineage" section: Vedha is a fork of Shannon by Keygraph. Architecture is theirs; Vedha exists to carry security hardening, propose improvements upstream, and integrate with the Archeon stack.
- New "What Vedha adds over upstream Shannon" section enumerating all 8 security fixes (S-1..S-8) and the SARIF feature.
- Versioning policy table linking Vedha versions to Shannon base.
- Cross-link to KeygraphHQ/shannon#322 (the upstream PR carrying the security hardening for review).

Out of scope for v1.1.0 (deferred follow-ups):
- --max-cost USD kill switch
- --dry-run / read-only mode
- Per-finding line/column SARIF locations (needs structured findings from agents)
1 parent 53dac8c commit 1b98f99

14 files changed

Lines changed: 530 additions & 16 deletions

README.md

Lines changed: 98 additions & 13 deletions
@@ -1,26 +1,111 @@
->[!NOTE]
-> **[📢 New: Shannon is now available via `npx @keygraph/shannon`. →](https://github.com/KeygraphHQ/shannon/discussions/249)**
-
 <div align="center">

-<img src="./assets/github-banner.png" alt="Shannon — AI Pentester for Web Applications and APIs" width="100%">
+# Vedha — Autonomous AI Pentester
+
+**A friendly fork of [Shannon by Keygraph](https://github.com/KeygraphHQ/shannon),
+hardened and extended for production-side use.**

-# Shannon — AI Pentester by Keygraph
+[![Upstream: Shannon](https://img.shields.io/badge/upstream-KeygraphHQ%2Fshannon-blue?logo=github)](https://github.com/KeygraphHQ/shannon)
+[![License: AGPL-3.0](https://img.shields.io/badge/license-AGPL--3.0-blue.svg)](LICENSE)

-<a href="https://trendshift.io/repositories/15604" target="_blank"><img src="https://trendshift.io/api/badge/repositories/15604" alt="KeygraphHQ%2Fshannon | Trendshift" style="width: 250px; height: 55px;" width="250" height="55"/></a>
+</div>

-Shannon is an autonomous, white-box AI pentester for web applications and APIs. <br />
-It analyzes your source code, identifies attack vectors, and executes real exploits to prove vulnerabilities before they reach production.
+## Credit & lineage

----
+Vedha is built on top of **[Shannon](https://github.com/KeygraphHQ/shannon)**,
+the open-source autonomous AI pentester developed by
+**[Keygraph](https://keygraph.io)**. The full pipeline architecture
+(Temporal workflows, the five-domain agent topology, browser-driven
+exploitation, the Wolfi container image, the report assembler) is
+Shannon's. The license is AGPL-3.0, inherited from upstream.
+
+If you're evaluating an AI pentester for the first time, **start with
+upstream Shannon** — it's the canonical project. Vedha exists for
+three reasons:
+
+1. To carry **security hardening patches** that haven't been upstreamed
+yet (see "What Vedha adds" below).
+2. To act as a **testbed for additions** — like SARIF output — that
+we're proposing back to upstream via PR.
+3. To let me run a customised build inside the Archeon agent stack
+without forking the upstream's release cadence.
+
+Where this README still says "Shannon," that's because the upstream
+docs accurately describe behaviour Vedha inherits unchanged. Wherever
+behaviour differs, Vedha's section takes precedence.
+
+## What Vedha adds over upstream Shannon
+
+Three categories of changes layered on top of Shannon Lite:
+
+### 1. Security hardening (8 issues)
+
+| ID | What | File |
+|---|---|---|
+| **S-1** | `sanitizePromptValue()` neutralises `{{...}}` placeholder syntax and `@include(...)` directives in every user-controlled prompt interpolation site (config description, focus/avoid rule descriptions, credentials, auth-context). Prevents prompt injection by anyone who can write a Shannon config. | `apps/worker/src/services/prompt-manager.ts` |
+| **S-2** | Credentials (username / password / TOTP secret) are sanitised before reaching the prompt template. | same |
+| **S-3** | Rule descriptions in `config.avoid` and `config.focus` are sanitised before interpolation. | same |
+| **S-4** | `SHANNON_HOST_UID` / `SHANNON_HOST_GID` are validated as numeric and within `1..2_000_000` before they reach `groupadd`/`useradd`. Rejects 0 (root), negatives, and non-numeric input that would otherwise feed `userdel ; rm -rf /`-style payloads into a privileged command. | `entrypoint.sh` |
+| **S-5** | Container temp dirs (`/app`, `/tmp/.cache`, `/tmp/.config`, `/tmp/.npm`) drop from `chmod 777` to `chmod 770`. | `Dockerfile` |
+| **S-6** | URL is parsed once up front with a try/catch and an `http`/`https` scheme allowlist, instead of crashing mid-setup with a raw `TypeError` on a malformed input. | `apps/cli/src/index.ts` |
+| **S-7** | The `session.json` polling loop now distinguishes `ENOENT` (the expected steady-state) and `SyntaxError` (worker mid-write) from real I/O errors (`EACCES`, `EIO`, `ENOTDIR`), so a permissions issue surfaces with a real diagnostic instead of an indefinite spinner. | same |
+| **S-8** | Splash falls back to plain ASCII when the terminal doesn't advertise UTF-8, instead of emitting `?`/mojibake on raw cmd.exe / locale-less SSH / some CI log streams. | `apps/cli/src/splash.ts` |
+
+These patches were originally written for Vedha and have also been
+proposed back to upstream Shannon as
+[KeygraphHQ/shannon#322](https://github.com/KeygraphHQ/shannon/pull/322).
+
+### 2. SARIF 2.1.0 report output

-<a href="https://discord.gg/9ZqQPuhJB7"><img src="./assets/discord.png" height="40" alt="Join Discord"></a>
-<a href="https://keygraph.io/"><img src="./assets/Keygraph_Button.png" height="40" alt="Visit Keygraph.io"></a>
+A new `--report-format sarif` flag emits a SARIF 2.1.0 file alongside
+the markdown report, so findings can be ingested by:
+
+- **GitHub Code Scanning** (auto-uploaded by `github/codeql-action/upload-sarif`)
+- **GitLab CI** security dashboards
+- **Defect Dojo**, **SonarQube**, and any other SARIF-aware scanner UI
+
+```bash
+./vedha start --url https://example.com --repo my-repo --report-format sarif
+# writes:
+# <repo>/.shannon/deliverables/comprehensive_security_assessment_report.md
+# <repo>/.shannon/deliverables/comprehensive_security_assessment_report.sarif
+```
+
+The tool driver advertises five rules tagged with their CWE IDs:
+`vedha.injection` (CWE-74), `vedha.xss` (CWE-79),
+`vedha.auth` (CWE-287), `vedha.ssrf` (CWE-918),
+`vedha.authz` (CWE-285). Default behaviour (`md`) is unchanged —
+SARIF is opt-in.
+
+### 3. Branding & integration
+
+- CLI rebranded to `./vedha` / `npx @archeon/vedha` (Shannon's `./shannon` invocation works too via the legacy entrypoint).
+- State directory at `~/.vedha/` instead of `~/.shannon/`.
+- Logo and ASCII splash refreshed.
+
+## Versioning & sync policy
+
+Vedha tracks Shannon mainline at coarse cadence — typically a couple of
+releases behind. When upstream ships a meaningful change (a CVE fix, a
+new vulnerability domain, a workflow refactor), Vedha syncs and tests
+before tagging.
+
+| Vedha version | Based on Shannon | What's new in Vedha |
+|---|---|---|
+| **v1.0.0** | pre-v1.1.0 main | Initial fork; 8 security fixes (S-1..S-8) |
+| **v1.1.0** *(this release)* | pre-v1.1.0 main | + SARIF 2.1.0 output (`--report-format sarif`) |
+
+For all upstream features not listed under "What Vedha adds," refer to
+[Shannon's documentation](https://github.com/KeygraphHQ/shannon)
+Vedha inherits them unchanged.

 ---
-</div>

-## What is Shannon?
+> The remainder of this README is Shannon's documentation, lightly
+> edited for Vedha. Behaviour described below applies to Vedha
+> identically unless explicitly noted.
+
+## What is Shannon? (inherited)

 Shannon is an AI pentester developed by [Keygraph](https://keygraph.io). It performs white-box security testing of web applications and their underlying APIs by combining source code analysis with live exploitation.
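
Two of the CLI-side fixes in the hardening table, S-6 and S-7, are easy to picture in code. The sketch below is not taken from `apps/cli/src/index.ts`, which this view does not show for those fixes; `parseTargetUrl` and `classifyPollError` are hypothetical names, and the only runtime API used is the standard `URL` constructor. It illustrates one plausible shape for "parse once with an allowlist" and "separate expected polling errors from real I/O errors".

```typescript
// Hypothetical helpers illustrating S-6 and S-7; not the code shipped in Vedha.

const ALLOWED_SCHEMES = new Set(['http:', 'https:']);

/** S-6: parse the target once and reject anything that is not http(s). */
function parseTargetUrl(raw: string): URL {
  let url: URL;
  try {
    url = new URL(raw);
  } catch {
    // new URL() throws a TypeError on malformed input; surface a readable error.
    throw new Error(`Invalid URL: '${raw}'`);
  }
  if (!ALLOWED_SCHEMES.has(url.protocol)) {
    throw new Error(`Unsupported URL scheme '${url.protocol}' (expected http or https)`);
  }
  return url;
}

type PollOutcome = 'retry' | 'fatal';

/** S-7: decide whether a failed read of session.json is expected or a real error. */
function classifyPollError(err: unknown): PollOutcome {
  // The worker may be mid-write, so a JSON parse failure is treated as transient.
  if (err instanceof SyntaxError) return 'retry';
  // ENOENT is the steady state until the worker creates the file.
  const code = (err as { code?: unknown } | null)?.code;
  if (code === 'ENOENT') return 'retry';
  // EACCES, EIO, ENOTDIR, and anything unrecognised should reach the user.
  return 'fatal';
}
```

A polling loop built on this would keep spinning on `'retry'` and print the underlying error on `'fatal'`, which matches the behaviour the table describes.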

apps/cli/src/commands/start.ts

Lines changed: 2 additions & 0 deletions
@@ -24,6 +24,7 @@ export interface StartArgs {
   output?: string;
   pipelineTesting: boolean;
   router: boolean;
+  reportFormat: 'md' | 'sarif';
   version: string;
 }

@@ -125,6 +126,7 @@ export async function start(args: StartArgs): Promise<void> {
     taskQueue,
     containerName,
     envFlags: buildEnvFlags(),
+    reportFormat: args.reportFormat,
     ...(config && { config }),
     ...(hasCredentials && { credentials: credentialsPath }),
     ...(promptsDir && { promptsDir }),

apps/cli/src/docker.ts

Lines changed: 10 additions & 0 deletions
@@ -196,6 +196,7 @@ export interface WorkerOptions {
   outputDir?: string;
   workspace: string;
   pipelineTesting?: boolean;
+  reportFormat?: 'md' | 'sarif';
 }

 /**
@@ -244,6 +245,15 @@ export function spawnWorker(opts: WorkerOptions): ChildProcess {
   // Environment
   args.push(...opts.envFlags);

+  // Forward Vedha-specific runtime flags as env. Done as env (rather than
+  // CLI args) because the worker reads them inside the activity to gate
+  // optional output emission, and env survives Temporal serialisation
+  // without needing pipeline-input plumbing.
+  if (opts.reportFormat && opts.reportFormat !== 'md') {
+    args.push('-e', `VEDHA_REPORT_FORMAT=${opts.reportFormat}`);
+  }
+  args.push('-e', `VEDHA_VERSION=${opts.version}`);
+
   // Container settings
   args.push('--shm-size', '2gb', '--security-opt', 'seccomp=unconfined');
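
The worker side of this handshake (`assembleReportActivity`) is not part of the hunks shown here. As a rough illustration of what the commit message describes, reading `VEDHA_REPORT_FORMAT` and treating SARIF emission as best-effort, a minimal sketch might look like the following. Only the environment variable name comes from the commit; `resolveReportFormat`, `emitSarifBestEffort`, and the `WarnLogger` shape are assumptions made for the example.

```typescript
// Hypothetical worker-side counterpart to the env forwarding in docker.ts.
// Function names and the logger shape are illustrative assumptions.

type ReportFormat = 'md' | 'sarif';

interface WarnLogger {
  warn: (message: string) => void;
}

/** Treat anything other than the literal 'sarif' as the markdown default. */
function resolveReportFormat(env: Record<string, string | undefined>): ReportFormat {
  return env.VEDHA_REPORT_FORMAT === 'sarif' ? 'sarif' : 'md';
}

/**
 * Run the optional SARIF step after the markdown report has been assembled.
 * Any failure is logged and swallowed so it cannot fail the activity.
 */
async function emitSarifBestEffort(
  env: Record<string, string | undefined>,
  emit: () => Promise<string>,
  logger: WarnLogger,
): Promise<string | undefined> {
  if (resolveReportFormat(env) !== 'sarif') return undefined;
  try {
    // emit() is expected to resolve to the path of the written SARIF file.
    return await emit();
  } catch (err) {
    const reason = err instanceof Error ? err.message : String(err);
    logger.warn(`SARIF emission failed (markdown report unaffected): ${reason}`);
    return undefined;
  }
}
```

Because `spawnWorker` only sets the variable when the format is not `md`, an unset variable and an unrecognised value both fall back to markdown in this sketch.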

apps/cli/src/index.ts

Lines changed: 19 additions & 0 deletions
@@ -70,6 +70,10 @@ Options for 'start':
   -w, --workspace <name>    Named workspace (auto-resumes if exists)
   --pipeline-testing        Use minimal prompts for fast testing
   --router                  Route requests through claude-code-router
+  --report-format <fmt>     Report output format: 'md' (default) or 'sarif'
+                            'sarif' emits a SARIF 2.1.0 file alongside the
+                            markdown report for ingestion by GitHub Code
+                            Scanning, GitLab, Defect Dojo, etc.

 Examples:
   ${prefix} start -u https://example.com -r ${mode === 'local' ? 'my-repo' : './my-repo'}
@@ -87,6 +91,8 @@ Monitor workflows at http://localhost:8233
 `);
 }

+type ReportFormat = 'md' | 'sarif';
+
 interface ParsedStartArgs {
   url: string;
   repo: string;
@@ -95,6 +101,7 @@ interface ParsedStartArgs {
   output?: string;
   pipelineTesting: boolean;
   router: boolean;
+  reportFormat: ReportFormat;
 }

 function parseStartArgs(argv: string[]): ParsedStartArgs {
@@ -105,6 +112,7 @@ function parseStartArgs(argv: string[]): ParsedStartArgs {
   let output: string | undefined;
   let pipelineTesting = false;
   let router = false;
+  let reportFormat: ReportFormat = 'md';

   for (let i = 0; i < argv.length; i++) {
     const arg = argv[i];
@@ -152,6 +160,16 @@ function parseStartArgs(argv: string[]): ParsedStartArgs {
       case '--router':
         router = true;
         break;
+      case '--report-format':
+        if (next && !next.startsWith('-')) {
+          if (next !== 'md' && next !== 'sarif') {
+            console.error(`ERROR: --report-format must be 'md' or 'sarif', got '${next}'`);
+            process.exit(1);
+          }
+          reportFormat = next;
+          i++;
+        }
+        break;
       default:
         console.error(`Unknown option: ${arg}`);
         console.error(`Run "${getMode() === 'local' ? './vedha' : 'npx @archeon/vedha'} help" for usage`);
@@ -182,6 +200,7 @@ function parseStartArgs(argv: string[]): ParsedStartArgs {
     repo,
     pipelineTesting,
     router,
+    reportFormat,
     ...(config && { config }),
     ...(workspace && { workspace }),
     ...(output && { output }),
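
One detail of the `--report-format` case is worth spelling out: validation only runs when a value follows the flag. If the value is missing, or the next token starts with `-`, the branch does nothing and the default `md` stays in effect without an error. If failing loudly is preferred, one possible shape is sketched below. `parseReportFormat` is a hypothetical helper, not part of this commit, and it throws rather than calling `process.exit` so the caller decides how to report the problem.

```typescript
type ReportFormat = 'md' | 'sarif';

const REPORT_FORMATS: readonly ReportFormat[] = ['md', 'sarif'];

/**
 * Hypothetical helper: validate the token that follows --report-format.
 * A missing value is an error here, unlike in the committed parser.
 */
function parseReportFormat(next: string | undefined): ReportFormat {
  if (next === undefined || next.startsWith('-')) {
    throw new Error("--report-format requires a value: 'md' or 'sarif'");
  }
  // find() returns the matching literal, so no type assertion is needed.
  const match = REPORT_FORMATS.find((fmt) => fmt === next);
  if (match === undefined) {
    throw new Error(`--report-format must be 'md' or 'sarif', got '${next}'`);
  }
  return match;
}
```

In `parseStartArgs` this would replace the nested `if` with `reportFormat = parseReportFormat(next); i++;` inside a `try`/`catch` that prints the message and exits.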

apps/worker/package.json

Lines changed: 4 additions & 2 deletions
@@ -16,7 +16,8 @@
   "scripts": {
     "build": "tsc",
     "check": "tsc --noEmit",
-    "clean": "rm -rf dist"
+    "clean": "rm -rf dist",
+    "test": "vitest run"
   },
   "dependencies": {
     "@anthropic-ai/claude-agent-sdk": "catalog:",
@@ -32,6 +33,7 @@
     "zx": "^8.0.0"
   },
   "devDependencies": {
-    "@types/js-yaml": "^4.0.9"
+    "@types/js-yaml": "^4.0.9",
+    "vitest": "^4.1.2"
   }
 }
Lines changed: 145 additions & 0 deletions
@@ -0,0 +1,145 @@
+/**
+ * Behavioural tests for SarifReportOutputProvider.
+ *
+ * Covers the contract that consumers (GitHub Code Scanning, GitLab,
+ * Defect Dojo) actually depend on:
+ *   - SARIF 2.1.0 envelope with the expected top-level fields
+ *   - Tool driver advertises the five built-in vulnerability rules
+ *   - One result per non-empty evidence file, ruleId matching the rule
+ *   - Empty / missing evidence files do not produce results
+ *   - Result messages are truncated rather than dropping out at limit
+ */
+
+import fs from 'node:fs/promises';
+import os from 'node:os';
+import path from 'node:path';
+import { afterEach, describe, expect, it } from 'vitest';
+import { SarifReportOutputProvider } from '../services/sarif-output-provider.js';
+import type { ActivityLogger } from '../types/activity-logger.js';
+
+const noopLogger: ActivityLogger = {
+  info: () => undefined,
+  warn: () => undefined,
+  error: () => undefined,
+};
+
+async function setupRepoWithDeliverables(
+  evidence: Record<string, string>,
+): Promise<{ repoPath: string; cleanup: () => Promise<void> }> {
+  const repoPath = await fs.mkdtemp(path.join(os.tmpdir(), 'vedha-sarif-test-'));
+  const deliverablesPath = path.join(repoPath, '.shannon', 'deliverables');
+  await fs.mkdir(deliverablesPath, { recursive: true });
+  for (const [name, body] of Object.entries(evidence)) {
+    await fs.writeFile(path.join(deliverablesPath, name), body, 'utf8');
+  }
+  return {
+    repoPath,
+    cleanup: () => fs.rm(repoPath, { recursive: true, force: true }),
+  };
+}
+
+function makeInput(repoPath: string): { repoPath: string } {
+  return { repoPath };
+}
+
+describe('SarifReportOutputProvider', () => {
+  let cleanup: (() => Promise<void>) | null = null;
+
+  afterEach(async () => {
+    if (cleanup) {
+      await cleanup();
+      cleanup = null;
+    }
+  });
+
+  it('emits a valid SARIF 2.1.0 envelope when at least one finding exists', async () => {
+    const setup = await setupRepoWithDeliverables({
+      'injection_exploitation_evidence.md': '## SQL injection in /api/users\n\nProof: `' + "1' OR '1'='1" + '`',
+    });
+    cleanup = setup.cleanup;
+
+    const provider = new SarifReportOutputProvider('1.1.0');
+    const result = await provider.generate(makeInput(setup.repoPath), noopLogger);
+
+    expect(result.outputPath).toBeDefined();
+    const sarif = JSON.parse(await fs.readFile(result.outputPath as string, 'utf8'));
+    expect(sarif.version).toBe('2.1.0');
+    expect(sarif.$schema).toMatch(/sarif-schema-2\.1\.0/);
+    expect(sarif.runs).toHaveLength(1);
+    expect(sarif.runs[0].tool.driver.name).toBe('Vedha');
+    expect(sarif.runs[0].tool.driver.version).toBe('1.1.0');
+    expect(sarif.runs[0].tool.driver.rules).toHaveLength(5);
+    expect(sarif.runs[0].results).toHaveLength(1);
+    expect(sarif.runs[0].results[0].ruleId).toBe('vedha.injection');
+    expect(sarif.runs[0].results[0].message.text).toMatch(/SQL injection/);
+  });
+
+  it('emits one result per non-empty evidence file', async () => {
+    const setup = await setupRepoWithDeliverables({
+      'injection_exploitation_evidence.md': 'finding',
+      'xss_exploitation_evidence.md': 'finding',
+      'authz_exploitation_evidence.md': 'finding',
+    });
+    cleanup = setup.cleanup;
+
+    const provider = new SarifReportOutputProvider();
+    const result = await provider.generate(makeInput(setup.repoPath), noopLogger);
+    const sarif = JSON.parse(await fs.readFile(result.outputPath as string, 'utf8'));
+
+    expect(sarif.runs[0].results.map((r: { ruleId: string }) => r.ruleId).sort()).toEqual([
+      'vedha.authz',
+      'vedha.injection',
+      'vedha.xss',
+    ]);
+  });
+
+  it('skips empty and missing evidence files', async () => {
+    const setup = await setupRepoWithDeliverables({
+      'injection_exploitation_evidence.md': '',
+      'xss_exploitation_evidence.md': ' \n \t\n',
+      // auth/ssrf/authz: not written at all
+    });
+    cleanup = setup.cleanup;
+
+    const provider = new SarifReportOutputProvider();
+    const result = await provider.generate(makeInput(setup.repoPath), noopLogger);
+    const sarif = JSON.parse(await fs.readFile(result.outputPath as string, 'utf8'));
+
+    expect(sarif.runs[0].results).toHaveLength(0);
+    // Even with zero results, the envelope must be valid.
+    expect(sarif.version).toBe('2.1.0');
+    expect(sarif.runs[0].tool.driver.rules).toHaveLength(5);
+  });
+
+  it('truncates oversized evidence rather than dropping it', async () => {
+    const huge = 'A'.repeat(64 * 1024); // 64 KiB, well above the 16 KiB limit
+    const setup = await setupRepoWithDeliverables({
+      'auth_exploitation_evidence.md': huge,
+    });
+    cleanup = setup.cleanup;
+
+    const provider = new SarifReportOutputProvider();
+    const result = await provider.generate(makeInput(setup.repoPath), noopLogger);
+    const sarif = JSON.parse(await fs.readFile(result.outputPath as string, 'utf8'));
+    const messageText = sarif.runs[0].results[0].message.text as string;
+
+    expect(sarif.runs[0].results).toHaveLength(1);
+    expect(messageText.length).toBeLessThan(huge.length);
+    expect(messageText).toMatch(/\[truncated\]$/);
+  });
+
+  it('writes the SARIF file alongside the markdown report', async () => {
+    const setup = await setupRepoWithDeliverables({
+      'ssrf_exploitation_evidence.md': 'finding',
+    });
+    cleanup = setup.cleanup;
+
+    const provider = new SarifReportOutputProvider();
+    const result = await provider.generate(makeInput(setup.repoPath), noopLogger);
+
+    expect(result.outputPath).toBe(
+      path.join(setup.repoPath, '.shannon', 'deliverables', 'comprehensive_security_assessment_report.sarif'),
+    );
+    await expect(fs.access(result.outputPath as string)).resolves.toBeUndefined();
+  });
+});
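
The provider these tests exercise (`sarif-output-provider.ts`) does not appear in this view, so the following is only a sketch of logic that would satisfy the contract above. The rule IDs, CWE numbers, evidence file names, 16 KiB limit, and `[truncated]` suffix come from the commit message and tests. Everything else is an assumption: the function and constant names, the `level` value, the `properties.tags` layout, the schema URL, and counting the limit in characters rather than bytes. File I/O is left out so the core mapping stays easy to follow.

```typescript
// Sketch only: not the shipped SarifReportOutputProvider.

const MAX_MESSAGE_CHARS = 16 * 1024; // assumption: limit counted in characters
const TRUNCATION_MARKER = '\n[truncated]';

const RULES = [
  { id: 'vedha.injection', cwe: 'CWE-74', file: 'injection_exploitation_evidence.md' },
  { id: 'vedha.xss', cwe: 'CWE-79', file: 'xss_exploitation_evidence.md' },
  { id: 'vedha.auth', cwe: 'CWE-287', file: 'auth_exploitation_evidence.md' },
  { id: 'vedha.ssrf', cwe: 'CWE-918', file: 'ssrf_exploitation_evidence.md' },
  { id: 'vedha.authz', cwe: 'CWE-285', file: 'authz_exploitation_evidence.md' },
] as const;

/** Keep oversized evidence, but cap it and mark that it was cut. */
function truncateMessage(body: string): string {
  if (body.length <= MAX_MESSAGE_CHARS) return body;
  return body.slice(0, MAX_MESSAGE_CHARS) + TRUNCATION_MARKER;
}

/** Build a SARIF 2.1.0 log from evidence bodies keyed by deliverable file name. */
function buildSarifLog(evidence: Record<string, string | undefined>, version: string) {
  const results = RULES.flatMap((rule) => {
    const body = (evidence[rule.file] ?? '').trim();
    // Missing, empty, and whitespace-only files produce no result.
    if (body.length === 0) return [];
    return [{ ruleId: rule.id, level: 'error', message: { text: truncateMessage(body) } }];
  });

  return {
    // Assumed schema URL; the tests only require it to mention sarif-schema-2.1.0.
    $schema: 'https://docs.oasis-open.org/sarif/sarif/v2.1.0/os/schemas/sarif-schema-2.1.0.json',
    version: '2.1.0',
    runs: [
      {
        tool: {
          driver: {
            name: 'Vedha',
            version,
            // All five rules are advertised even when there are no results.
            rules: RULES.map((rule) => ({
              id: rule.id,
              properties: { tags: ['security', rule.cwe] },
            })),
          },
        },
        results,
      },
    ],
  };
}
```

A real provider would read each file from `<repo>/.shannon/deliverables`, pass the bodies through a function like this, and write the serialized log next to the markdown report.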

apps/worker/src/services/index.ts

Lines changed: 1 addition & 0 deletions
@@ -20,3 +20,4 @@ export { Container, getContainer, getOrCreateContainer, removeContainer } from '
 export { ExploitationCheckerService } from './exploitation-checker.js';
 export { loadPrompt } from './prompt-manager.js';
 export { assembleFinalReport, injectModelIntoReport } from './reporting.js';
+export { SarifReportOutputProvider } from './sarif-output-provider.js';
