Skip to content

Security hardening, CLI bug fixes, and SARIF report output#322

Open
DevamShah wants to merge 3 commits into
KeygraphHQ:mainfrom
DevamShah:security-bugs-features
Open

Security hardening, CLI bug fixes, and SARIF report output#322
DevamShah wants to merge 3 commits into
KeygraphHQ:mainfrom
DevamShah:security-bugs-features

Conversation

@DevamShah
Copy link
Copy Markdown

Drive-by contribution from a long-time Shannon user. I'm aware of the
"external PRs not accepted at this time" notice in the README — happy
to close this and refile via Issues / private channel if preferred,
or split into smaller pieces. Three orthogonal commits, each
independently revertible.

Summary

Three logically distinct commits, one PR for review convenience:

  1. security: hardening — prompt-injection defences in
    prompt-manager.ts, UID/GID validation in entrypoint.sh, drop
    chmod 777 to chmod 770 on container temp dirs.
  2. fix(cli): — URL try/catch + scheme allowlist on start,
    distinguish ENOENT from real I/O errors in the session-poll loop,
    locale-aware splash with ASCII fallback for terminals that don't
    render Unicode block art.
  3. feat: SARIF 2.1.0 report output — opt-in --report-format sarif
    flag wires a new SarifReportOutputProvider through the existing
    ReportOutputProvider DI seam. Default behaviour (md only) is
    byte-for-byte unchanged.

Why bundled

These were found while reading the codebase to evaluate Shannon for
internal use. Each commit stands alone — squash, cherry-pick, or close
any one without affecting the others.

Commit 1 — security

ID What Where
S-1 Prompt injection via config.description apps/worker/src/services/prompt-manager.ts
S-2 Credential injection (username / password / TOTP secret) into prompts same
S-3 config.avoid / config.focus rule descriptions injected raw same
S-4 SHANNON_HOST_UID / SHANNON_HOST_GID consumed by groupadd / useradd without numeric or range validation entrypoint.sh
S-5 chmod 777 on /app, /tmp/.cache, /tmp/.config, /tmp/.npm Dockerfile

Threat model. Anyone who can write a Shannon config (a CI secret
leak, a compromised target repo) can today embed {{AUTH_CONTEXT}}
or @include(/etc/passwd) in a description and have the agent treat
it as orchestrator-level instruction. After this change those payloads
are inert text — the new `sanitizePromptValue()` breaks {{...}}
placeholder syntax and @include(...) directives. Newlines are
preserved. Applied uniformly to every user-controlled interpolation
site.

`entrypoint.sh` now rejects non-numeric / out-of-range / zero values
for `SHANNON_HOST_UID` and `SHANNON_HOST_GID` with a clear error,
preventing a malicious env from mapping the pentest user to root or
feeding crafted input into a system command.

`chmod 770` is sufficient: only the pentest user (or a UID remapped
into the pentest group) ever runs in the container — world-write adds
blast radius without functional benefit.

Commit 2 — CLI bug fixes

  • `new URL(args.url)` was called twice deep in setup with no error
    handling; a malformed input crashed mid-setup with a raw `TypeError`.
    Now wrapped in try/catch with an explicit `http` / `https` scheme
    allowlist (the worker assumes web semantics).
  • The `session.json` polling loop's bare `catch` swallowed
    `EACCES` / `EIO` / `ENOTDIR` alike, so a permissions issue
    manifested as an indefinite "Waiting for workflow to start...".
    Now distinguishes `ENOENT` (steady-state) and `SyntaxError`
    (worker mid-write) from real I/O errors.
  • `splash.ts` falls back to a plain-ASCII variant when the terminal
    doesn't advertise UTF-8. Detection uses `LANG` / `LC_ALL` /
    `LC_CTYPE` plus the well-known `WT_SESSION` and
    `TERM_PROGRAM=vscode` signals. The Unicode visual is preserved on
    every modern terminal and only degrades on raw cmd.exe / locale-less
    SSH / some CI log streams.

Commit 3 — SARIF report output

`shannon start ... --report-format sarif` writes
`/.shannon/deliverables/comprehensive_security_assessment_report.sarif`
alongside the markdown report. The default (`md`) is unchanged.

Wiring goes:
`CLI flag → SHANNON_REPORT_FORMAT env → worker.ts:configureReportOutputProvider() → setContainerFactory(SarifReportOutputProvider)`

The provider plugs into the existing `ReportOutputProvider` interface
that's already invoked from `generateReportOutputActivity` —
zero changes to the activity / workflow layer.

Tool driver advertises five rules tagged with their CWE IDs and
the canonical OWASP help URI:
`shannon.injection` (CWE-74), `shannon.xss` (CWE-79),
`shannon.auth` (CWE-287), `shannon.ssrf` (CWE-918),
`shannon.authz` (CWE-285). Each non-empty
`*_exploitation_evidence.md` produces one SARIF `result` with the
evidence body as the message (truncated at 16 KiB to stay under
GitHub's per-result cap).

Out of scope (explicitly v0.1). Per-finding line/column locations
inside source files. The agents don't currently emit structured
per-finding metadata, so the artefact location is the deliverable file
itself. The envelope and consumer wiring shipped here unblock that
follow-up — the result mapping is then a one-function change.

Test plan

  • `pnpm test` — 21/21 vitest cases green across three suites:
    • `prompt-manager.test.ts` (11 cases) pins the
      `sanitizePromptValue` and URL validation contracts
    • `uid-gid-validation.test.ts` (5 cases) pins the regex+range
      contract that `entrypoint.sh` enforces, so the bash and TS
      sides can never silently drift
    • `sarif-output-provider.test.ts` (5 cases) covers SARIF envelope
      shape, one-result-per-evidence, empty/missing handling,
      oversized truncation, and the output path
  • `pnpm check` — `tsc --noEmit` clean for both `@keygraph/shannon` and `@shannon/worker`
  • `pnpm build` — turbo build succeeds; `apps/cli/dist/index.mjs`
    bundle 53.21 kB / 14.48 kB gzipped (no growth from main)
  • Manual: `node apps/cli/dist/index.mjs help` shows the new flag;
    `LANG=C node apps/cli/dist/index.mjs info` renders the ASCII
    fallback splash; `pnpm check` / `pnpm build` / `pnpm test` all
    cached on second run
  • Pre-existing biome issues on `main` (import-organize warnings in
    `workflows.ts` etc.) are not touched. All files added or modified
    in this PR pass `biome check` cleanly.

Things I deliberately did not do

  • No `exec $*` quoting fix in `entrypoint.sh`. The args come
    from Docker `CMD`, not user input — the practical injection risk is
    near zero, and the fix is fiddly enough that I'd rather not bundle
    it.
  • No `--max-cost` cap or `--dry-run` mode. Both were on my
    shortlist but each needs cost-tracker / fs-write-audit infrastructure
    I don't yet have a clean read on. Happy to follow up with separate
    PRs if there's interest.
  • No SARIF result locations inside source files. Out of scope until
    agents emit structured findings (see Commit 3 description).

Built against `main @ 79caada` (post-v1.1.0). Author: Devam Shah —
no commercial affiliation, contributing personally.

…perms

Prompts are how the orchestrator instructs LLM agents. Today, several
user-controlled config fields are interpolated raw into those prompts,
which lets anyone who can write the config also override agent
instructions. The UID/GID handling in entrypoint.sh feeds unvalidated
env vars into groupadd/useradd, and the runtime image grants 0o777 on
several directories that only ever need owner+group access.

Changes:

- prompt-manager.ts: add `sanitizePromptValue()` and apply it to every
  user-controlled interpolation site — config.description, focus/avoid
  rule descriptions, credentials (username/password/totp_secret), and
  the auth-context block. The sanitiser breaks `{{...}}` placeholder
  syntax and `@include(path)` directives so injected values cannot pose
  as real placeholders or pull in arbitrary files. Newlines remain
  intact since multi-line descriptions are valid input.

- entrypoint.sh: validate SHANNON_HOST_UID and SHANNON_HOST_GID against
  `^[0-9]+$` and a 1..2_000_000 range before they reach groupadd/useradd.
  Refuses 0 (root), negatives, non-numeric input, and out-of-range
  values with a clear error.

- Dockerfile: drop `chmod 777` to `chmod 770` on /app, /tmp/.cache,
  /tmp/.config, /tmp/.npm. The container only runs the pentest user
  (or a remapped UID added to the pentest group via entrypoint), so
  world-writable adds blast radius without functional benefit. Also
  chowns the /tmp/.* dirs to pentest.

- vitest infrastructure: adds vitest as a dev dep on @shannon/worker,
  a `test` script + turbo task, and 16 tests covering the sanitiser
  spec and the UID/GID validation contract that entrypoint.sh enforces.
  All 16 pass; `tsc --noEmit` clean.

Threat model: an attacker with write access to a Shannon config (a CI
secret leak, a compromised repo) could previously embed
`{{AUTH_CONTEXT}}` or `@include(/etc/passwd)` in a description and have
the agent treat it as orchestrator-level instruction or file inclusion.
After this change those payloads are inert text.
Three small reliability fixes that surfaced while preparing the security
hardening commit:

- start.ts: validate the target URL up front. `new URL(args.url)` was
  called twice deep in the setup flow with no error handling, so a
  malformed input crashed mid-setup with a raw TypeError. Add a single
  try/catch at the top, plus an `http`/`https` scheme allowlist — the
  worker assumes web semantics, so `file://`, `ftp://`, and
  `javascript:` should be rejected explicitly rather than silently
  flowing through.

- start.ts: in the session.json polling loop, distinguish ENOENT (the
  expected steady-state until the worker writes the file) and
  SyntaxError (the worker is mid-write) from real I/O errors. Today the
  bare catch silently swallows EACCES, EIO, ENOTDIR alike, so a
  permissions or filesystem issue manifests as an indefinite "Waiting
  for workflow to start..." with no diagnostic. Genuine errors now exit
  with the underlying message.

- splash.ts: detect terminal UTF-8 capability and fall back to a plain
  ASCII splash when the terminal can't render the Unicode block art.
  The existing splash relies on `█`, `╔` and similar; these
  render as `?` or mojibake in raw cmd.exe, locale-less SSH sessions,
  and several CI log streams. UTF-8 detection uses LANG / LC_ALL /
  LC_CTYPE plus the well-known `WT_SESSION` and `TERM_PROGRAM=vscode`
  signals, so the visual is preserved on every modern terminal and
  only degrades when needed.
Adds an opt-in SARIF emission path so Shannon findings can be ingested
directly by GitHub Code Scanning, GitLab CI security dashboards,
Defect Dojo, and any other consumer of the OASIS SARIF 2.1.0 standard.

Behaviour:

  shannon start -u ... -r ... --report-format sarif

writes
  <repo>/.shannon/deliverables/comprehensive_security_assessment_report.sarif

alongside the existing markdown report. The default (`--report-format md`)
is unchanged and preserves the current behaviour byte-for-byte.

Wiring:

- apps/cli: new `--report-format md|sarif` flag on `start`. Validated up
  front. Help text describes the two values.
- apps/cli/docker.ts: forwards SHANNON_REPORT_FORMAT and SHANNON_VERSION
  to the worker container as env. Env (rather than CLI arg) is the right
  channel because the worker reads them once at startup to install the
  provider before any activity input plumbing exists.
- apps/worker/temporal/worker.ts: new `configureReportOutputProvider()`
  inspects SHANNON_REPORT_FORMAT, calls `setContainerFactory()` to inject
  `SarifReportOutputProvider` for the rest of the worker's lifetime.
- apps/worker/services/sarif-output-provider.ts: new provider implementing
  the existing `ReportOutputProvider` interface. Walks the five
  `*_exploitation_evidence.md` deliverables that `assembleFinalReport`
  already consumes; emits one SARIF result per non-empty file with the
  evidence body as the result message.
- apps/worker/services/index.ts: re-exports `SarifReportOutputProvider`
  for downstream consumers building on the worker package.
- apps/worker/tsconfig.json: excludes `__tests__/**` and `vitest.config.ts`
  from the production compile so test code does not end up in dist/.

Tool driver advertises five rules tagged with their CWE IDs and the
canonical OWASP help URI: shannon.injection (CWE-74), shannon.xss
(CWE-79), shannon.auth (CWE-287), shannon.ssrf (CWE-918), shannon.authz
(CWE-285).

Scope (v0.1) — and what is intentionally out of scope:

- Each non-empty evidence file produces exactly one SARIF result. The
  result `message` is the evidence body (truncated at 16 KiB to stay
  within consumer limits, particularly GitHub's per-result cap).
- Per-finding line/column locations inside source files are not yet
  emitted because the agents do not currently produce structured
  per-finding metadata. That is a natural follow-up — the SARIF
  envelope and consumer wiring shipped here unblocks it.

5 vitest cases cover the behavioural contract: SARIF envelope shape,
one-result-per-non-empty-evidence, empty/missing handling, oversized
truncation, and the output path. Total test count across the package
is now 21/21 green; `tsc --noEmit` clean for both apps.
DevamShah added a commit to DevamShah/vedha that referenced this pull request Apr 27, 2026
Adds an opt-in SARIF emission path so Vedha findings can be consumed
directly by GitHub Code Scanning, GitLab, Defect Dojo, and any other
SARIF-aware scanner UI. Also rewrites the README to credit upstream
Shannon clearly and enumerate exactly what Vedha layers on top.

Behaviour:

  ./vedha start -u ... -r ... --report-format sarif

writes
  <repo>/.shannon/deliverables/comprehensive_security_assessment_report.sarif

alongside the existing markdown report. The default (`--report-format md`)
is unchanged byte-for-byte.

Wiring:

- apps/cli: new `--report-format md|sarif` flag on `start`. Validated
  up front. Help text describes the two values.
- apps/cli/docker.ts: forwards VEDHA_REPORT_FORMAT and VEDHA_VERSION
  to the worker container as env. Env is the right channel because the
  worker reads them inside `assembleReportActivity` to gate optional
  emission, and env survives Temporal serialisation without needing
  pipeline-input plumbing.
- apps/worker/temporal/activities.ts: `assembleReportActivity` now
  invokes SarifReportOutputProvider after the markdown assembly when
  VEDHA_REPORT_FORMAT=sarif. SARIF emission is best-effort — a failure
  there never blocks the markdown path.
- apps/worker/services/sarif-output-provider.ts: new provider. Walks
  the five `*_exploitation_evidence.md` deliverables that
  `assembleFinalReport` already consumes; emits one SARIF result per
  non-empty evidence file with the body as the result message
  (truncated at 16 KiB).
- apps/worker/services/index.ts: re-exports SarifReportOutputProvider.
- apps/worker/tsconfig.json: excludes __tests__/** from production
  compile so test code doesn't end up in dist/.

Tool driver advertises five rules tagged with their CWE IDs:
  vedha.injection (CWE-74), vedha.xss (CWE-79), vedha.auth (CWE-287),
  vedha.ssrf (CWE-918), vedha.authz (CWE-285).

Test infrastructure:
- vitest dev dep + `test` script on @shannon/worker
- vitest.config.ts in apps/worker
- turbo `test` task wired up
- 5 new SARIF tests covering envelope shape, one-result-per-evidence,
  empty/missing handling, oversized truncation, output path
- Total: 29/29 pass (24 existing + 5 new)
- pnpm check + build clean

README:
- New "Credit & lineage" section: Vedha is a fork of Shannon by
  Keygraph. Architecture is theirs; Vedha exists to carry security
  hardening, propose improvements upstream, and integrate with the
  Archeon stack.
- New "What Vedha adds over upstream Shannon" section enumerating
  all 8 security fixes (S-1..S-8) and the SARIF feature.
- Versioning policy table linking Vedha versions to Shannon base.
- Cross-link to KeygraphHQ/shannon#322 (the upstream PR carrying
  the security hardening for review).

Out of scope for v1.1.0 (deferred follow-ups):
- --max-cost USD kill switch
- --dry-run / read-only mode
- Per-finding line/column SARIF locations (needs structured findings
  from agents)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant