You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
Issue #1538 reports that spec compliance reviewers in subagent-driven-development can take 20-33 minutes because the spec reviewer prompt tells the reviewer to read the implementation code but does not provide a git range or file scope. In current dev, code-quality-reviewer-prompt.md receives BASE_SHA and HEAD_SHA, while spec-reviewer-prompt.md does not; that leaves the spec reviewer to discover changed files by crawling the broader repository.
This PR addresses the diff-scope part of #1538 only. It does not change model selection guidance; #1496 is already exploring the broader subagent model-selection problem.
What does this PR change?
Adds a Git Range to Review section to skills/subagent-driven-development/spec-reviewer-prompt.md, including BASE_SHA, HEAD_SHA, and explicit git diff --stat / git diff commands. The prompt now tells the spec reviewer to read only files in that diff and explicitly lists broad codebase reads outside the diff range as a thing not to do.
It also adds a static regression test for this prompt shape and wires that test into the Claude Code skill test runner.
Is this change appropriate for the core library?
Yes. This is a general-purpose correction to the core SDD review prompt. It applies to any project using subagent-driven-development, does not add a third-party dependency, does not add a new harness, and does not introduce project-specific behavior.
What alternatives did you consider?
Change model selection guidance so spec reviewers use cheaper models. I did not include that here because feat: add subagent-model-reconciliation skill #1496 already covers model selection more broadly, and bundling it would make this PR less reviewable.
No PR dedicated to adding a git diff range to spec-reviewer-prompt.md was found. #578 is a much broader persistent validator/team-mode proposal. #613 hardens review-skipping gates. #1496 addresses model reconciliation. This PR is narrower than all three.
Environment tested
Harness (e.g. Claude Code, Cursor)
Harness version
Model
Model version/ID
Local shell / Claude Code test files
Claude Code CLI 2.1.141 present; tests run via bash on macOS
N/A
Static regression tests only
New harness support (required if this PR adds a new harness)
Not applicable. This PR does not add or modify harness support.
Evaluation
Initial prompt/context: user asked me to patrol existing PRs, identify high-priority issues, and continue with the highest-value actionable issue. I selected #1538 because it describes a concrete observed SDD slowdown and current dev still shows the asymmetry: code quality reviewer has a git range, spec reviewer does not.
Result before editing spec-reviewer-prompt.md: 7 failures. The prompt lacked ## Git Range to Review, BASE_SHA, HEAD_SHA, the two git diff commands, the changed-file scope rule, and the outside-diff prohibition.
All commands exited 0. I also tried bash tests/claude-code/run-skill-tests.sh --test test-subagent-driven-development-static.sh; it failed before reaching the test because macOS lacks GNU timeout by default. I did not change the runner for that pre-existing local environment issue because it is unrelated to #1538.
Live adversarial agent sessions after the change: 0. This is a narrow prompt-template guard with a RED/GREEN static regression test; I am not claiming multi-session behavioral pressure-test coverage.
Rigor
If this is a skills change: I used superpowers:writing-skills and completed adversarial pressure testing (paste results below)
This change was tested adversarially, not just on the happy path
I did not modify carefully-tuned content (Red Flags table, rationalizations, "human partner" language) without extensive evals showing the change is an improvement
I used the superpowers:writing-skills process for the RED/GREEN shape of this edit: identify the observed baseline failure from #1538, write a failing static regression test before changing the prompt, then make the minimum prompt change needed for the regression to pass. I did not complete live adversarial pressure testing; the unchecked boxes above are intentional rather than omitted.
Human review
A human has reviewed the COMPLETE proposed diff before submission
The complete staged diff was shown before submission. The human partner previously instructed me to treat shown diffs as reviewed for these PRs unless they say otherwise.
This is a narrow follow-up from the issue triage for #1538. The PR only addresses the spec-reviewer diff-scope gap: code-quality-reviewer-prompt.md already receives BASE_SHA / HEAD_SHA, but spec-reviewer-prompt.md did not, which forces the reviewer to discover changed files by crawling the repo.
What changed: add a Git Range to Review section to the spec reviewer prompt, require git diff --stat / git diff, and explicitly scope reads to files in that diff. A static regression test locks this prompt shape in.
What this intentionally does not change: model selection guidance, persistent validator/team mode, citations, or review gate semantics. Those are covered by broader related PRs (#1496, #578, #613) and would make this PR much harder to review.
Verification: the new static regression failed on current dev before the prompt edit and passes after it. test-worktree-path-policy.sh, shell syntax checks, and diff whitespace checks also pass. The full run-skill-tests.sh --test ... wrapper was not used on my macOS machine because it depends on GNU timeout, which is not installed by default; the underlying static test itself passes directly.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
What problem are you trying to solve?
Issue #1538 reports that spec compliance reviewers in
subagent-driven-developmentcan take 20-33 minutes because the spec reviewer prompt tells the reviewer to read the implementation code but does not provide a git range or file scope. In currentdev,code-quality-reviewer-prompt.mdreceivesBASE_SHAandHEAD_SHA, whilespec-reviewer-prompt.mddoes not; that leaves the spec reviewer to discover changed files by crawling the broader repository.This PR addresses the diff-scope part of #1538 only. It does not change model selection guidance; #1496 is already exploring the broader subagent model-selection problem.
What does this PR change?
Adds a
Git Range to Reviewsection toskills/subagent-driven-development/spec-reviewer-prompt.md, includingBASE_SHA,HEAD_SHA, and explicitgit diff --stat/git diffcommands. The prompt now tells the spec reviewer to read only files in that diff and explicitly lists broad codebase reads outside the diff range as a thing not to do.It also adds a static regression test for this prompt shape and wires that test into the Claude Code skill test runner.
Is this change appropriate for the core library?
Yes. This is a general-purpose correction to the core SDD review prompt. It applies to any project using
subagent-driven-development, does not add a third-party dependency, does not add a new harness, and does not introduce project-specific behavior.What alternatives did you consider?
Does this PR contain multiple unrelated changes?
No. The prompt edit and the static test both cover one issue: the spec compliance reviewer needs an explicit task diff boundary.
Existing PRs
I searched all open and closed PRs for:
1538 slow review agents spec reviewer git rangespec reviewer diff scope BASE_SHA HEAD_SHAsubagent-driven-development spec compliance reviewer git diffsubagent review model selection spec reviewerNo PR dedicated to adding a git diff range to
spec-reviewer-prompt.mdwas found. #578 is a much broader persistent validator/team-mode proposal. #613 hardens review-skipping gates. #1496 addresses model reconciliation. This PR is narrower than all three.Environment tested
New harness support (required if this PR adds a new harness)
Not applicable. This PR does not add or modify harness support.
Evaluation
Initial prompt/context: user asked me to patrol existing PRs, identify high-priority issues, and continue with the highest-value actionable issue. I selected #1538 because it describes a concrete observed SDD slowdown and current
devstill shows the asymmetry: code quality reviewer has a git range, spec reviewer does not.Baseline before the change:
Result before editing
spec-reviewer-prompt.md: 7 failures. The prompt lacked## Git Range to Review,BASE_SHA,HEAD_SHA, the twogit diffcommands, the changed-file scope rule, and the outside-diff prohibition.After the change:
Result: 7/7 assertions passed.
Additional verification:
bash tests/claude-code/test-worktree-path-policy.sh bash -n tests/claude-code/test-subagent-driven-development-static.sh tests/claude-code/run-skill-tests.sh tests/claude-code/test-worktree-path-policy.sh git diff --cached --check && git diff --checkAll commands exited 0. I also tried
bash tests/claude-code/run-skill-tests.sh --test test-subagent-driven-development-static.sh; it failed before reaching the test because macOS lacks GNUtimeoutby default. I did not change the runner for that pre-existing local environment issue because it is unrelated to #1538.Live adversarial agent sessions after the change: 0. This is a narrow prompt-template guard with a RED/GREEN static regression test; I am not claiming multi-session behavioral pressure-test coverage.
Rigor
superpowers:writing-skillsand completed adversarial pressure testing (paste results below)I used the
superpowers:writing-skillsprocess for the RED/GREEN shape of this edit: identify the observed baseline failure from #1538, write a failing static regression test before changing the prompt, then make the minimum prompt change needed for the regression to pass. I did not complete live adversarial pressure testing; the unchecked boxes above are intentional rather than omitted.Human review
The complete staged diff was shown before submission. The human partner previously instructed me to treat shown diffs as reviewed for these PRs unless they say otherwise.