Using Claude Agent Teams to Extract Entities #126
Draft
puja-trivedi wants to merge 4 commits into
Contributor
Code Review
This pull request introduces a multi-agent neuroscience NLP extraction pipeline and includes extraction results for three research papers. The review feedback highlights several inconsistencies with the pipeline's output requirements, specifically regarding the formatting of unmapped entities, the inclusion of meta-commentary in metadata fields, and a missing metadata block in one of the result files.
"pipeline_metadata": {
  "extractor_model": "claude-opus-4-7",
  "ontology_mapper_model": "claude-opus-4-7",
  "reviewer_model": "claude-opus-4-7 (sonnet-4-6 would suffice; single-instance orchestration)",
Contributor
The reviewer_model field contains meta-commentary about model selection. This field should only contain the specific model identifier used for the task to maintain data consistency.
Suggested change
- "reviewer_model": "claude-opus-4-7 (sonnet-4-6 would suffice; single-instance orchestration)",
+ "reviewer_model": "claude-opus-4-7",
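The point above, that reviewer_model should hold only a model identifier, could be checked mechanically before results are committed. The following is a minimal sketch, not part of this PR: the helper name find_noncanonical_model_fields and the assumed identifier shape (letters, digits, dots, hyphens, underscores, no spaces) are illustrative assumptions, not rules taken from the pipeline prompt.

```python
import re

# Assumption: a bare model identifier has no spaces or parenthetical notes.
MODEL_ID_PATTERN = re.compile(r"[A-Za-z0-9._-]+")


def find_noncanonical_model_fields(pipeline_metadata: dict) -> dict:
    """Return {field: value} for every *_model field that is not a bare identifier."""
    return {
        key: value
        for key, value in pipeline_metadata.items()
        if key.endswith("_model")
        and not (isinstance(value, str) and MODEL_ID_PATTERN.fullmatch(value))
    }


# Values copied from the diff hunk under review.
metadata = {
    "extractor_model": "claude-opus-4-7",
    "ontology_mapper_model": "claude-opus-4-7",
    "reviewer_model": "claude-opus-4-7 (sonnet-4-6 would suffice; single-instance orchestration)",
}

problems = find_noncanonical_model_fields(metadata)
for field, value in problems.items():
    print(f"{field}: not a bare model identifier: {value!r}")
```

With the suggested change applied, the same call would return an empty dict for this metadata block.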
],
"indirect_references": [],
"extraction_round": 1,
"ontology": "CL",
Contributor
  "year": 2025,
  "doi": "10.1038/s41593-025-01869-7"
},
"entities": {
Contributor
"ontology": null,
"ontology_id": null,
"ontology_label": null,
"confidence": "low",
Contributor
"source_sentences": ["six major dopaminergic MSN subtypes: STRd D1 Striosome, STRd D1 Matrix, STRv D1, STRd D2 Striosome, STRd D2 Matrix and STRv D2."],
"indirect_references": [],
"extraction_round": 1,
"ontology": "CL",
Contributor
…igger with a zero-delta termination rule.
2. Make source_sentences mean every sentence, and add an occurrence count.
3. Give the Reviewer a hard exhaustiveness check, not just a schema check.
…ivalent sequence of '*'
v1 allowed agents to pass review while silently skipping the work the prompt described. v2 requires on-disk artifacts and Reviewer-verifiable evidence for every claim, so the failure modes observed on the basal-ganglia paper run can't recur.

Specific changes:
- Masking-and-rescan loop: each round must write to disk; round N+1 must read that file fresh and emit with per-candidate decisions before updating the entity list. Regex/grep as a substitute for re-reading is explicitly prohibited. Empty residuals on round 2 is now a red flag, not a success signal.
- field constrained to literal substrings verifiable by . Regex metacharacters, escapes, and anchors trigger a hard FAIL. ✅/❌ examples added inline.
- Coreference resolution promoted to a numbered, mandatory step with a flag. Entirely-empty across the output is FAIL; Reviewer spot-checks 5 random pronoun-bearing sentences.
- Reviewer must run in a fresh context and write before loading the Extractor's output. The independent-candidate diff is a checklist item. If true context isolation isn't available, the Reviewer must declare .
- Every Reviewer checklist item now requires an field (file path, shell command output, or structured sub-check). Self-attestation is no longer acceptable.
- Ontology Mapper must declare . Unvalidated IDs trigger a Reviewer spot-check of ≥10 IDs against OLS/HGNC/NCBITaxon.
- Class-coverage sweep extended with / patterns (instrument names, assay names, sample-prep verbs) to catch terms like cryostat and ependymal that v1 missed.
- Added an anti-shortcut table mapping each v1 failure mode to the v2 control that prevents it, so a future executing model sees the trap before falling into it.

v1 prompt retained at pipeline_prompt.md for reference and diffing.
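The literal-substring constraint described in the commit message can be illustrated with a short check. This is a sketch under stated assumptions, not the verification command the v2 prompt names: the function check_literal_substrings is hypothetical, the entries are assumed to be a list of strings shaped like the source_sentences arrays in the diff, and paper_text stands in for the full paper text using the one sentence quoted in this PR.

```python
def check_literal_substrings(paper_text: str, sentences: list[str]) -> list[str]:
    """Return the entries that do not occur verbatim in paper_text."""
    return [s for s in sentences if s not in paper_text]


# Stand-in for the full paper text: the sentence quoted in the diff above.
paper_text = (
    "six major dopaminergic MSN subtypes: STRd D1 Striosome, STRd D1 Matrix, "
    "STRv D1, STRd D2 Striosome, STRd D2 Matrix and STRv D2."
)

sentences = [
    "STRd D1 Striosome, STRd D1 Matrix",  # literal span, occurs verbatim
    r"six major .* subtypes",             # regex-style entry, not a literal span
]

failures = check_literal_substrings(paper_text, sentences)
for entry in failures:
    print(f"FAIL: not a literal substring of the paper text: {entry!r}")
```

Because the comparison uses plain substring membership rather than pattern matching, an entry written as a regular expression fails unless the paper happens to contain that exact character sequence.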