Security hardening, CLI bug fixes, and SARIF report output by DevamShah · Pull Request #322 · KeygraphHQ/shannon

DevamShah · 2026-04-27T03:25:11Z

Drive-by contribution from a long-time Shannon user. I'm aware of the
"external PRs not accepted at this time" notice in the README — happy
to close this and refile via Issues / private channel if preferred,
or split into smaller pieces. Three orthogonal commits, each
independently revertible.

Summary

Three logically distinct commits, one PR for review convenience:

security: hardening — prompt-injection defences in
prompt-manager.ts, UID/GID validation in entrypoint.sh, drop
chmod 777 to chmod 770 on container temp dirs.
fix(cli): — URL try/catch + scheme allowlist on start,
distinguish ENOENT from real I/O errors in the session-poll loop,
locale-aware splash with ASCII fallback for terminals that don't
render Unicode block art.
feat: SARIF 2.1.0 report output — opt-in --report-format sarif
flag wires a new SarifReportOutputProvider through the existing
ReportOutputProvider DI seam. Default behaviour (md only) is
byte-for-byte unchanged.

Why bundled

These were found while reading the codebase to evaluate Shannon for
internal use. Each commit stands alone — squash, cherry-pick, or close
any one without affecting the others.

Commit 1 — security

ID	What	Where
S-1	Prompt injection via `config.description`	`apps/worker/src/services/prompt-manager.ts`
S-2	Credential injection (username / password / TOTP secret) into prompts	same
S-3	`config.avoid` / `config.focus` rule descriptions injected raw	same
S-4	`SHANNON_HOST_UID` / `SHANNON_HOST_GID` consumed by `groupadd` / `useradd` without numeric or range validation	`entrypoint.sh`
S-5	`chmod 777` on `/app`, `/tmp/.cache`, `/tmp/.config`, `/tmp/.npm`	`Dockerfile`

Threat model. Anyone who can write a Shannon config (a CI secret
leak, a compromised target repo) can today embed {{AUTH_CONTEXT}}
or @include(/etc/passwd) in a description and have the agent treat
it as orchestrator-level instruction. After this change those payloads
are inert text — the new `sanitizePromptValue()` breaks {{...}}
placeholder syntax and @include(...) directives. Newlines are
preserved. Applied uniformly to every user-controlled interpolation
site.

`entrypoint.sh` now rejects non-numeric / out-of-range / zero values
for `SHANNON_HOST_UID` and `SHANNON_HOST_GID` with a clear error,
preventing a malicious env from mapping the pentest user to root or
feeding crafted input into a system command.

`chmod 770` is sufficient: only the pentest user (or a UID remapped
into the pentest group) ever runs in the container — world-write adds
blast radius without functional benefit.

Commit 2 — CLI bug fixes

`new URL(args.url)` was called twice deep in setup with no error
handling; a malformed input crashed mid-setup with a raw `TypeError`.
Now wrapped in try/catch with an explicit `http` / `https` scheme
allowlist (the worker assumes web semantics).
The `session.json` polling loop's bare `catch` swallowed
`EACCES` / `EIO` / `ENOTDIR` alike, so a permissions issue
manifested as an indefinite "Waiting for workflow to start...".
Now distinguishes `ENOENT` (steady-state) and `SyntaxError`
(worker mid-write) from real I/O errors.
`splash.ts` falls back to a plain-ASCII variant when the terminal
doesn't advertise UTF-8. Detection uses `LANG` / `LC_ALL` /
`LC_CTYPE` plus the well-known `WT_SESSION` and
`TERM_PROGRAM=vscode` signals. The Unicode visual is preserved on
every modern terminal and only degrades on raw cmd.exe / locale-less
SSH / some CI log streams.

Commit 3 — SARIF report output

`shannon start ... --report-format sarif` writes
`/.shannon/deliverables/comprehensive_security_assessment_report.sarif`
alongside the markdown report. The default (`md`) is unchanged.

Wiring goes:
`CLI flag → SHANNON_REPORT_FORMAT env → worker.ts:configureReportOutputProvider() → setContainerFactory(SarifReportOutputProvider)`

The provider plugs into the existing `ReportOutputProvider` interface
that's already invoked from `generateReportOutputActivity` —
zero changes to the activity / workflow layer.

Tool driver advertises five rules tagged with their CWE IDs and
the canonical OWASP help URI:
`shannon.injection` (CWE-74), `shannon.xss` (CWE-79),
`shannon.auth` (CWE-287), `shannon.ssrf` (CWE-918),
`shannon.authz` (CWE-285). Each non-empty
`*_exploitation_evidence.md` produces one SARIF `result` with the
evidence body as the message (truncated at 16 KiB to stay under
GitHub's per-result cap).

Out of scope (explicitly v0.1). Per-finding line/column locations
inside source files. The agents don't currently emit structured
per-finding metadata, so the artefact location is the deliverable file
itself. The envelope and consumer wiring shipped here unblock that
follow-up — the result mapping is then a one-function change.

Test plan

`pnpm test` — 21/21 vitest cases green across three suites:
- `prompt-manager.test.ts` (11 cases) pins the
  `sanitizePromptValue` and URL validation contracts
- `uid-gid-validation.test.ts` (5 cases) pins the regex+range
  contract that `entrypoint.sh` enforces, so the bash and TS
  sides can never silently drift
- `sarif-output-provider.test.ts` (5 cases) covers SARIF envelope
  shape, one-result-per-evidence, empty/missing handling,
  oversized truncation, and the output path
`pnpm check` — `tsc --noEmit` clean for both `@keygraph/shannon` and `@shannon/worker`
`pnpm build` — turbo build succeeds; `apps/cli/dist/index.mjs`
bundle 53.21 kB / 14.48 kB gzipped (no growth from main)
Manual: `node apps/cli/dist/index.mjs help` shows the new flag;
`LANG=C node apps/cli/dist/index.mjs info` renders the ASCII
fallback splash; `pnpm check` / `pnpm build` / `pnpm test` all
cached on second run
Pre-existing biome issues on `main` (import-organize warnings in
`workflows.ts` etc.) are not touched. All files added or modified
in this PR pass `biome check` cleanly.

Things I deliberately did not do

No `exec $*` quoting fix in `entrypoint.sh`. The args come
from Docker `CMD`, not user input — the practical injection risk is
near zero, and the fix is fiddly enough that I'd rather not bundle
it.
No `--max-cost` cap or `--dry-run` mode. Both were on my
shortlist but each needs cost-tracker / fs-write-audit infrastructure
I don't yet have a clean read on. Happy to follow up with separate
PRs if there's interest.
No SARIF result locations inside source files. Out of scope until
agents emit structured findings (see Commit 3 description).

Built against `main @ 79caada` (post-v1.1.0). Author: Devam Shah —
no commercial affiliation, contributing personally.

…perms Prompts are how the orchestrator instructs LLM agents. Today, several user-controlled config fields are interpolated raw into those prompts, which lets anyone who can write the config also override agent instructions. The UID/GID handling in entrypoint.sh feeds unvalidated env vars into groupadd/useradd, and the runtime image grants 0o777 on several directories that only ever need owner+group access. Changes: - prompt-manager.ts: add `sanitizePromptValue()` and apply it to every user-controlled interpolation site — config.description, focus/avoid rule descriptions, credentials (username/password/totp_secret), and the auth-context block. The sanitiser breaks `{{...}}` placeholder syntax and `@include(path)` directives so injected values cannot pose as real placeholders or pull in arbitrary files. Newlines remain intact since multi-line descriptions are valid input. - entrypoint.sh: validate SHANNON_HOST_UID and SHANNON_HOST_GID against `^[0-9]+$` and a 1..2_000_000 range before they reach groupadd/useradd. Refuses 0 (root), negatives, non-numeric input, and out-of-range values with a clear error. - Dockerfile: drop `chmod 777` to `chmod 770` on /app, /tmp/.cache, /tmp/.config, /tmp/.npm. The container only runs the pentest user (or a remapped UID added to the pentest group via entrypoint), so world-writable adds blast radius without functional benefit. Also chowns the /tmp/.* dirs to pentest. - vitest infrastructure: adds vitest as a dev dep on @shannon/worker, a `test` script + turbo task, and 16 tests covering the sanitiser spec and the UID/GID validation contract that entrypoint.sh enforces. All 16 pass; `tsc --noEmit` clean. Threat model: an attacker with write access to a Shannon config (a CI secret leak, a compromised repo) could previously embed `{{AUTH_CONTEXT}}` or `@include(/etc/passwd)` in a description and have the agent treat it as orchestrator-level instruction or file inclusion. After this change those payloads are inert text.

Three small reliability fixes that surfaced while preparing the security hardening commit: - start.ts: validate the target URL up front. `new URL(args.url)` was called twice deep in the setup flow with no error handling, so a malformed input crashed mid-setup with a raw TypeError. Add a single try/catch at the top, plus an `http`/`https` scheme allowlist — the worker assumes web semantics, so `file://`, `ftp://`, and `javascript:` should be rejected explicitly rather than silently flowing through. - start.ts: in the session.json polling loop, distinguish ENOENT (the expected steady-state until the worker writes the file) and SyntaxError (the worker is mid-write) from real I/O errors. Today the bare catch silently swallows EACCES, EIO, ENOTDIR alike, so a permissions or filesystem issue manifests as an indefinite "Waiting for workflow to start..." with no diagnostic. Genuine errors now exit with the underlying message. - splash.ts: detect terminal UTF-8 capability and fall back to a plain ASCII splash when the terminal can't render the Unicode block art. The existing splash relies on `█`, `╔` and similar; these render as `?` or mojibake in raw cmd.exe, locale-less SSH sessions, and several CI log streams. UTF-8 detection uses LANG / LC_ALL / LC_CTYPE plus the well-known `WT_SESSION` and `TERM_PROGRAM=vscode` signals, so the visual is preserved on every modern terminal and only degrades when needed.

Adds an opt-in SARIF emission path so Shannon findings can be ingested directly by GitHub Code Scanning, GitLab CI security dashboards, Defect Dojo, and any other consumer of the OASIS SARIF 2.1.0 standard. Behaviour: shannon start -u ... -r ... --report-format sarif writes <repo>/.shannon/deliverables/comprehensive_security_assessment_report.sarif alongside the existing markdown report. The default (`--report-format md`) is unchanged and preserves the current behaviour byte-for-byte. Wiring: - apps/cli: new `--report-format md|sarif` flag on `start`. Validated up front. Help text describes the two values. - apps/cli/docker.ts: forwards SHANNON_REPORT_FORMAT and SHANNON_VERSION to the worker container as env. Env (rather than CLI arg) is the right channel because the worker reads them once at startup to install the provider before any activity input plumbing exists. - apps/worker/temporal/worker.ts: new `configureReportOutputProvider()` inspects SHANNON_REPORT_FORMAT, calls `setContainerFactory()` to inject `SarifReportOutputProvider` for the rest of the worker's lifetime. - apps/worker/services/sarif-output-provider.ts: new provider implementing the existing `ReportOutputProvider` interface. Walks the five `*_exploitation_evidence.md` deliverables that `assembleFinalReport` already consumes; emits one SARIF result per non-empty file with the evidence body as the result message. - apps/worker/services/index.ts: re-exports `SarifReportOutputProvider` for downstream consumers building on the worker package. - apps/worker/tsconfig.json: excludes `__tests__/**` and `vitest.config.ts` from the production compile so test code does not end up in dist/. Tool driver advertises five rules tagged with their CWE IDs and the canonical OWASP help URI: shannon.injection (CWE-74), shannon.xss (CWE-79), shannon.auth (CWE-287), shannon.ssrf (CWE-918), shannon.authz (CWE-285). Scope (v0.1) — and what is intentionally out of scope: - Each non-empty evidence file produces exactly one SARIF result. The result `message` is the evidence body (truncated at 16 KiB to stay within consumer limits, particularly GitHub's per-result cap). - Per-finding line/column locations inside source files are not yet emitted because the agents do not currently produce structured per-finding metadata. That is a natural follow-up — the SARIF envelope and consumer wiring shipped here unblocks it. 5 vitest cases cover the behavioural contract: SARIF envelope shape, one-result-per-non-empty-evidence, empty/missing handling, oversized truncation, and the output path. Total test count across the package is now 21/21 green; `tsc --noEmit` clean for both apps.

Adds an opt-in SARIF emission path so Vedha findings can be consumed directly by GitHub Code Scanning, GitLab, Defect Dojo, and any other SARIF-aware scanner UI. Also rewrites the README to credit upstream Shannon clearly and enumerate exactly what Vedha layers on top. Behaviour: ./vedha start -u ... -r ... --report-format sarif writes <repo>/.shannon/deliverables/comprehensive_security_assessment_report.sarif alongside the existing markdown report. The default (`--report-format md`) is unchanged byte-for-byte. Wiring: - apps/cli: new `--report-format md|sarif` flag on `start`. Validated up front. Help text describes the two values. - apps/cli/docker.ts: forwards VEDHA_REPORT_FORMAT and VEDHA_VERSION to the worker container as env. Env is the right channel because the worker reads them inside `assembleReportActivity` to gate optional emission, and env survives Temporal serialisation without needing pipeline-input plumbing. - apps/worker/temporal/activities.ts: `assembleReportActivity` now invokes SarifReportOutputProvider after the markdown assembly when VEDHA_REPORT_FORMAT=sarif. SARIF emission is best-effort — a failure there never blocks the markdown path. - apps/worker/services/sarif-output-provider.ts: new provider. Walks the five `*_exploitation_evidence.md` deliverables that `assembleFinalReport` already consumes; emits one SARIF result per non-empty evidence file with the body as the result message (truncated at 16 KiB). - apps/worker/services/index.ts: re-exports SarifReportOutputProvider. - apps/worker/tsconfig.json: excludes __tests__/** from production compile so test code doesn't end up in dist/. Tool driver advertises five rules tagged with their CWE IDs: vedha.injection (CWE-74), vedha.xss (CWE-79), vedha.auth (CWE-287), vedha.ssrf (CWE-918), vedha.authz (CWE-285). Test infrastructure: - vitest dev dep + `test` script on @shannon/worker - vitest.config.ts in apps/worker - turbo `test` task wired up - 5 new SARIF tests covering envelope shape, one-result-per-evidence, empty/missing handling, oversized truncation, output path - Total: 29/29 pass (24 existing + 5 new) - pnpm check + build clean README: - New "Credit & lineage" section: Vedha is a fork of Shannon by Keygraph. Architecture is theirs; Vedha exists to carry security hardening, propose improvements upstream, and integrate with the Archeon stack. - New "What Vedha adds over upstream Shannon" section enumerating all 8 security fixes (S-1..S-8) and the SARIF feature. - Versioning policy table linking Vedha versions to Shannon base. - Cross-link to KeygraphHQ/shannon#322 (the upstream PR carrying the security hardening for review). Out of scope for v1.1.0 (deferred follow-ups): - --max-cost USD kill switch - --dry-run / read-only mode - Per-finding line/column SARIF locations (needs structured findings from agents)

DevamShah added 3 commits April 27, 2026 08:43

Alexi5000 mentioned this pull request May 21, 2026

feat(reporting): add SARIF 2.1.0 output provider #343

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Security hardening, CLI bug fixes, and SARIF report output#322

Security hardening, CLI bug fixes, and SARIF report output#322
DevamShah wants to merge 3 commits into
KeygraphHQ:mainfrom
DevamShah:security-bugs-features

DevamShah commented Apr 27, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

DevamShah commented Apr 27, 2026

Summary

Why bundled

Commit 1 — security

Commit 2 — CLI bug fixes

Commit 3 — SARIF report output

Test plan

Things I deliberately did not do

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant