Security hardening, CLI bug fixes, and SARIF report output#322
Open
DevamShah wants to merge 3 commits into
Open
Security hardening, CLI bug fixes, and SARIF report output#322DevamShah wants to merge 3 commits into
DevamShah wants to merge 3 commits into
Conversation
…perms
Prompts are how the orchestrator instructs LLM agents. Today, several
user-controlled config fields are interpolated raw into those prompts,
which lets anyone who can write the config also override agent
instructions. The UID/GID handling in entrypoint.sh feeds unvalidated
env vars into groupadd/useradd, and the runtime image grants 0o777 on
several directories that only ever need owner+group access.
Changes:
- prompt-manager.ts: add `sanitizePromptValue()` and apply it to every
user-controlled interpolation site — config.description, focus/avoid
rule descriptions, credentials (username/password/totp_secret), and
the auth-context block. The sanitiser breaks `{{...}}` placeholder
syntax and `@include(path)` directives so injected values cannot pose
as real placeholders or pull in arbitrary files. Newlines remain
intact since multi-line descriptions are valid input.
- entrypoint.sh: validate SHANNON_HOST_UID and SHANNON_HOST_GID against
`^[0-9]+$` and a 1..2_000_000 range before they reach groupadd/useradd.
Refuses 0 (root), negatives, non-numeric input, and out-of-range
values with a clear error.
- Dockerfile: drop `chmod 777` to `chmod 770` on /app, /tmp/.cache,
/tmp/.config, /tmp/.npm. The container only runs the pentest user
(or a remapped UID added to the pentest group via entrypoint), so
world-writable adds blast radius without functional benefit. Also
chowns the /tmp/.* dirs to pentest.
- vitest infrastructure: adds vitest as a dev dep on @shannon/worker,
a `test` script + turbo task, and 16 tests covering the sanitiser
spec and the UID/GID validation contract that entrypoint.sh enforces.
All 16 pass; `tsc --noEmit` clean.
Threat model: an attacker with write access to a Shannon config (a CI
secret leak, a compromised repo) could previously embed
`{{AUTH_CONTEXT}}` or `@include(/etc/passwd)` in a description and have
the agent treat it as orchestrator-level instruction or file inclusion.
After this change those payloads are inert text.
Three small reliability fixes that surfaced while preparing the security hardening commit: - start.ts: validate the target URL up front. `new URL(args.url)` was called twice deep in the setup flow with no error handling, so a malformed input crashed mid-setup with a raw TypeError. Add a single try/catch at the top, plus an `http`/`https` scheme allowlist — the worker assumes web semantics, so `file://`, `ftp://`, and `javascript:` should be rejected explicitly rather than silently flowing through. - start.ts: in the session.json polling loop, distinguish ENOENT (the expected steady-state until the worker writes the file) and SyntaxError (the worker is mid-write) from real I/O errors. Today the bare catch silently swallows EACCES, EIO, ENOTDIR alike, so a permissions or filesystem issue manifests as an indefinite "Waiting for workflow to start..." with no diagnostic. Genuine errors now exit with the underlying message. - splash.ts: detect terminal UTF-8 capability and fall back to a plain ASCII splash when the terminal can't render the Unicode block art. The existing splash relies on `█`, `╔` and similar; these render as `?` or mojibake in raw cmd.exe, locale-less SSH sessions, and several CI log streams. UTF-8 detection uses LANG / LC_ALL / LC_CTYPE plus the well-known `WT_SESSION` and `TERM_PROGRAM=vscode` signals, so the visual is preserved on every modern terminal and only degrades when needed.
Adds an opt-in SARIF emission path so Shannon findings can be ingested directly by GitHub Code Scanning, GitLab CI security dashboards, Defect Dojo, and any other consumer of the OASIS SARIF 2.1.0 standard. Behaviour: shannon start -u ... -r ... --report-format sarif writes <repo>/.shannon/deliverables/comprehensive_security_assessment_report.sarif alongside the existing markdown report. The default (`--report-format md`) is unchanged and preserves the current behaviour byte-for-byte. Wiring: - apps/cli: new `--report-format md|sarif` flag on `start`. Validated up front. Help text describes the two values. - apps/cli/docker.ts: forwards SHANNON_REPORT_FORMAT and SHANNON_VERSION to the worker container as env. Env (rather than CLI arg) is the right channel because the worker reads them once at startup to install the provider before any activity input plumbing exists. - apps/worker/temporal/worker.ts: new `configureReportOutputProvider()` inspects SHANNON_REPORT_FORMAT, calls `setContainerFactory()` to inject `SarifReportOutputProvider` for the rest of the worker's lifetime. - apps/worker/services/sarif-output-provider.ts: new provider implementing the existing `ReportOutputProvider` interface. Walks the five `*_exploitation_evidence.md` deliverables that `assembleFinalReport` already consumes; emits one SARIF result per non-empty file with the evidence body as the result message. - apps/worker/services/index.ts: re-exports `SarifReportOutputProvider` for downstream consumers building on the worker package. - apps/worker/tsconfig.json: excludes `__tests__/**` and `vitest.config.ts` from the production compile so test code does not end up in dist/. Tool driver advertises five rules tagged with their CWE IDs and the canonical OWASP help URI: shannon.injection (CWE-74), shannon.xss (CWE-79), shannon.auth (CWE-287), shannon.ssrf (CWE-918), shannon.authz (CWE-285). Scope (v0.1) — and what is intentionally out of scope: - Each non-empty evidence file produces exactly one SARIF result. The result `message` is the evidence body (truncated at 16 KiB to stay within consumer limits, particularly GitHub's per-result cap). - Per-finding line/column locations inside source files are not yet emitted because the agents do not currently produce structured per-finding metadata. That is a natural follow-up — the SARIF envelope and consumer wiring shipped here unblocks it. 5 vitest cases cover the behavioural contract: SARIF envelope shape, one-result-per-non-empty-evidence, empty/missing handling, oversized truncation, and the output path. Total test count across the package is now 21/21 green; `tsc --noEmit` clean for both apps.
DevamShah
added a commit
to DevamShah/vedha
that referenced
this pull request
Apr 27, 2026
Adds an opt-in SARIF emission path so Vedha findings can be consumed directly by GitHub Code Scanning, GitLab, Defect Dojo, and any other SARIF-aware scanner UI. Also rewrites the README to credit upstream Shannon clearly and enumerate exactly what Vedha layers on top. Behaviour: ./vedha start -u ... -r ... --report-format sarif writes <repo>/.shannon/deliverables/comprehensive_security_assessment_report.sarif alongside the existing markdown report. The default (`--report-format md`) is unchanged byte-for-byte. Wiring: - apps/cli: new `--report-format md|sarif` flag on `start`. Validated up front. Help text describes the two values. - apps/cli/docker.ts: forwards VEDHA_REPORT_FORMAT and VEDHA_VERSION to the worker container as env. Env is the right channel because the worker reads them inside `assembleReportActivity` to gate optional emission, and env survives Temporal serialisation without needing pipeline-input plumbing. - apps/worker/temporal/activities.ts: `assembleReportActivity` now invokes SarifReportOutputProvider after the markdown assembly when VEDHA_REPORT_FORMAT=sarif. SARIF emission is best-effort — a failure there never blocks the markdown path. - apps/worker/services/sarif-output-provider.ts: new provider. Walks the five `*_exploitation_evidence.md` deliverables that `assembleFinalReport` already consumes; emits one SARIF result per non-empty evidence file with the body as the result message (truncated at 16 KiB). - apps/worker/services/index.ts: re-exports SarifReportOutputProvider. - apps/worker/tsconfig.json: excludes __tests__/** from production compile so test code doesn't end up in dist/. Tool driver advertises five rules tagged with their CWE IDs: vedha.injection (CWE-74), vedha.xss (CWE-79), vedha.auth (CWE-287), vedha.ssrf (CWE-918), vedha.authz (CWE-285). Test infrastructure: - vitest dev dep + `test` script on @shannon/worker - vitest.config.ts in apps/worker - turbo `test` task wired up - 5 new SARIF tests covering envelope shape, one-result-per-evidence, empty/missing handling, oversized truncation, output path - Total: 29/29 pass (24 existing + 5 new) - pnpm check + build clean README: - New "Credit & lineage" section: Vedha is a fork of Shannon by Keygraph. Architecture is theirs; Vedha exists to carry security hardening, propose improvements upstream, and integrate with the Archeon stack. - New "What Vedha adds over upstream Shannon" section enumerating all 8 security fixes (S-1..S-8) and the SARIF feature. - Versioning policy table linking Vedha versions to Shannon base. - Cross-link to KeygraphHQ/shannon#322 (the upstream PR carrying the security hardening for review). Out of scope for v1.1.0 (deferred follow-ups): - --max-cost USD kill switch - --dry-run / read-only mode - Per-finding line/column SARIF locations (needs structured findings from agents)
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Three logically distinct commits, one PR for review convenience:
security:hardening — prompt-injection defences inprompt-manager.ts, UID/GID validation inentrypoint.sh, dropchmod 777tochmod 770on container temp dirs.fix(cli):— URL try/catch + scheme allowlist onstart,distinguish
ENOENTfrom real I/O errors in the session-poll loop,locale-aware splash with ASCII fallback for terminals that don't
render Unicode block art.
feat:SARIF 2.1.0 report output — opt-in--report-format sarifflag wires a new
SarifReportOutputProviderthrough the existingReportOutputProviderDI seam. Default behaviour (md only) isbyte-for-byte unchanged.
Why bundled
These were found while reading the codebase to evaluate Shannon for
internal use. Each commit stands alone — squash, cherry-pick, or close
any one without affecting the others.
Commit 1 — security
config.descriptionapps/worker/src/services/prompt-manager.tsconfig.avoid/config.focusrule descriptions injected rawSHANNON_HOST_UID/SHANNON_HOST_GIDconsumed bygroupadd/useraddwithout numeric or range validationentrypoint.shchmod 777on/app,/tmp/.cache,/tmp/.config,/tmp/.npmDockerfileThreat model. Anyone who can write a Shannon config (a CI secret
leak, a compromised target repo) can today embed
{{AUTH_CONTEXT}}or
@include(/etc/passwd)in a description and have the agent treatit as orchestrator-level instruction. After this change those payloads
are inert text — the new `sanitizePromptValue()` breaks
{{...}}placeholder syntax and
@include(...)directives. Newlines arepreserved. Applied uniformly to every user-controlled interpolation
site.
`entrypoint.sh` now rejects non-numeric / out-of-range / zero values
for `SHANNON_HOST_UID` and `SHANNON_HOST_GID` with a clear error,
preventing a malicious env from mapping the pentest user to root or
feeding crafted input into a system command.
`chmod 770` is sufficient: only the pentest user (or a UID remapped
into the pentest group) ever runs in the container — world-write adds
blast radius without functional benefit.
Commit 2 — CLI bug fixes
handling; a malformed input crashed mid-setup with a raw `TypeError`.
Now wrapped in try/catch with an explicit `http` / `https` scheme
allowlist (the worker assumes web semantics).
`EACCES` / `EIO` / `ENOTDIR` alike, so a permissions issue
manifested as an indefinite "Waiting for workflow to start...".
Now distinguishes `ENOENT` (steady-state) and `SyntaxError`
(worker mid-write) from real I/O errors.
doesn't advertise UTF-8. Detection uses `LANG` / `LC_ALL` /
`LC_CTYPE` plus the well-known `WT_SESSION` and
`TERM_PROGRAM=vscode` signals. The Unicode visual is preserved on
every modern terminal and only degrades on raw cmd.exe / locale-less
SSH / some CI log streams.
Commit 3 — SARIF report output
`shannon start ... --report-format sarif` writes
`/.shannon/deliverables/comprehensive_security_assessment_report.sarif`
alongside the markdown report. The default (`md`) is unchanged.
Wiring goes:
`CLI flag → SHANNON_REPORT_FORMAT env → worker.ts:configureReportOutputProvider() → setContainerFactory(SarifReportOutputProvider)`
The provider plugs into the existing `ReportOutputProvider` interface
that's already invoked from `generateReportOutputActivity` —
zero changes to the activity / workflow layer.
Tool driver advertises five rules tagged with their CWE IDs and
the canonical OWASP help URI:
`shannon.injection` (CWE-74), `shannon.xss` (CWE-79),
`shannon.auth` (CWE-287), `shannon.ssrf` (CWE-918),
`shannon.authz` (CWE-285). Each non-empty
`*_exploitation_evidence.md` produces one SARIF `result` with the
evidence body as the message (truncated at 16 KiB to stay under
GitHub's per-result cap).
Out of scope (explicitly v0.1). Per-finding line/column locations
inside source files. The agents don't currently emit structured
per-finding metadata, so the artefact location is the deliverable file
itself. The envelope and consumer wiring shipped here unblock that
follow-up — the result mapping is then a one-function change.
Test plan
`sanitizePromptValue` and URL validation contracts
contract that `entrypoint.sh` enforces, so the bash and TS
sides can never silently drift
shape, one-result-per-evidence, empty/missing handling,
oversized truncation, and the output path
bundle 53.21 kB / 14.48 kB gzipped (no growth from main)
`LANG=C node apps/cli/dist/index.mjs info` renders the ASCII
fallback splash; `pnpm check` / `pnpm build` / `pnpm test` all
cached on second run
`workflows.ts` etc.) are not touched. All files added or modified
in this PR pass `biome check` cleanly.
Things I deliberately did not do
from Docker `CMD`, not user input — the practical injection risk is
near zero, and the fix is fiddly enough that I'd rather not bundle
it.
shortlist but each needs cost-tracker / fs-write-audit infrastructure
I don't yet have a clean read on. Happy to follow up with separate
PRs if there's interest.
agents emit structured findings (see Commit 3 description).
Built against `main @ 79caada` (post-v1.1.0). Author: Devam Shah —
no commercial affiliation, contributing personally.