Using Claude Agent Teams to Extract Entities by puja-trivedi · Pull Request #126 · sensein/structsense

puja-trivedi · 2026-05-13T21:38:47Z

trying different prompts using agent teams

gemini-code-assist

Code Review

This pull request introduces a multi-agent neuroscience NLP extraction pipeline and includes extraction results for three research papers. The review feedback highlights several inconsistencies with the pipeline's output requirements, specifically regarding the formatting of unmapped entities, the inclusion of meta-commentary in metadata fields, and a missing metadata block in one of the result files.

gemini-code-assist · 2026-05-13T21:41:43Z

+  "pipeline_metadata": {
+    "extractor_model": "claude-opus-4-7",
+    "ontology_mapper_model": "claude-opus-4-7",
+    "reviewer_model": "claude-opus-4-7 (sonnet-4-6 would suffice; single-instance orchestration)",


The reviewer_model field contains meta-commentary about model selection. This field should only contain the specific model identifier used for the task to maintain data consistency.

Suggested change

-    "reviewer_model": "claude-opus-4-7 (sonnet-4-6 would suffice; single-instance orchestration)",
+    "reviewer_model": "claude-opus-4-7",
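
A check along these lines could flag free-text commentary in model fields before a result file is committed. This is only a sketch: the helper name `bad_model_fields` and the identifier pattern (lowercase letters, digits, dots, hyphens) are assumptions, not rules stated in pipeline_prompt.md.

```python
import re

# Assumption: a bare model identifier uses only lowercase letters, digits,
# dots, and hyphens. Spaces or parentheses indicate commentary.
MODEL_ID = re.compile(r"[a-z0-9][a-z0-9.\-]*")


def bad_model_fields(pipeline_metadata: dict) -> list[str]:
    """Return the *_model keys whose value is not a bare identifier."""
    return [
        key
        for key, value in pipeline_metadata.items()
        if key.endswith("_model")
        and not (isinstance(value, str) and MODEL_ID.fullmatch(value))
    ]


metadata = {
    "extractor_model": "claude-opus-4-7",
    "ontology_mapper_model": "claude-opus-4-7",
    "reviewer_model": "claude-opus-4-7 (sonnet-4-6 would suffice; single-instance orchestration)",
}
print(bad_model_fields(metadata))
```

With the values quoted in this review, only `reviewer_model` is reported.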

gemini-code-assist · 2026-05-13T21:41:43Z

+      ],
+      "indirect_references": [],
+      "extraction_round": 1,
+      "ontology": "CL",


For unmapped entities, the ontology field must be set to null as per the requirements in pipeline_prompt.md (line 165). This issue is also present for entities E012, E014, E026, E040, E041, and E043 in this file.

Suggested change

-      "ontology": "CL",
+      "ontology": null,

gemini-code-assist · 2026-05-13T21:41:43Z

+    "year": 2025,
+    "doi": "10.1038/s41593-025-01869-7"
+  },
+  "entities": {


This file is missing the pipeline_metadata block that is present in the other extraction results. This block is necessary for tracking model versions, reviewer status, and iteration counts as part of the standard pipeline output.
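
A missing block like this can be caught mechanically. The sketch below assumes the two top-level keys visible in this review (`pipeline_metadata` and `entities`) are required; the full list of required keys is not shown here, and `missing_blocks` is a hypothetical helper.

```python
import json

# Assumption: these are the top-level keys every result file must carry.
# Only these two appear in the review; the real list may be longer.
REQUIRED_TOP_LEVEL = ("pipeline_metadata", "entities")


def missing_blocks(result: dict) -> list[str]:
    """Return the required top-level keys absent from a parsed result file."""
    return [key for key in REQUIRED_TOP_LEVEL if key not in result]


# A result with entities but no pipeline_metadata, as in the file flagged above.
result = json.loads('{"entities": {}}')
print(missing_blocks(result))
```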

gemini-code-assist · 2026-05-13T21:41:43Z

+      "ontology": null,
+      "ontology_id": null,
+      "ontology_label": null,
+      "confidence": "low",


For unmapped entities, where the ontology identifiers are null, the confidence field must be set to "none" as specified in pipeline_prompt.md (line 165).

Suggested change

-      "confidence": "low",
+      "confidence": "none",

gemini-code-assist · 2026-05-13T21:41:43Z

+      "source_sentences": ["six major dopaminergic MSN subtypes: STRd D1 Striosome, STRd D1 Matrix, STRv D1, STRd D2 Striosome, STRd D2 Matrix and STRv D2."],
+      "indirect_references": [],
+      "extraction_round": 1,
+      "ontology": "CL",


The ontology field should be set to null for unmapped entities (where ontology_id is null) to adhere to the formatting rules defined in pipeline_prompt.md. This violation is pervasive throughout the file for all unmapped entities.

Suggested change

-      "ontology": "CL",
+      "ontology": null,
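
The two rules for unmapped entities raised in these comments (a null `ontology` and a `confidence` of "none" whenever `ontology_id` is null) could be checked with a small helper. This is a sketch, not part of the PR: `unmapped_violations` is a hypothetical name, and it takes a single entity dict because the exact nesting of `entities` is not shown here.

```python
import json


def unmapped_violations(entity: dict) -> list[str]:
    """List rule violations for an entity whose ontology_id is null."""
    if entity.get("ontology_id") is not None:
        return []  # mapped entity: these rules do not apply
    problems = []
    if entity.get("ontology") is not None:
        problems.append("ontology must be null")
    if entity.get("confidence") != "none":
        problems.append('confidence must be "none"')
    return problems


# Fields combined from the fragments quoted in the review comments.
entity = json.loads(
    '{"ontology": "CL", "ontology_id": null, '
    '"ontology_label": null, "confidence": "low"}'
)
print(unmapped_violations(entity))
```

Running the helper over every entity in a result file would surface the pervasive cases in one pass instead of one review comment per entity.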

v1 allowed agents to pass review while silently skipping the work the prompt described. v2 requires on-disk artifacts and Reviewer-verifiable evidence for every claim, so the failure modes observed on the basal-ganglia paper run can't recur.

Specific changes:
- Masking-and-rescan loop: each round must write to disk; round N+1 must read that file fresh and emit with per-candidate decisions before updating the entity list. Regex/grep as a substitute for re-reading is explicitly prohibited. Empty residuals on round 2 is now a red flag, not a success signal.
- field constrained to literal substrings verifiable by . Regex metacharacters, escapes, and anchors trigger a hard FAIL. ✅/❌ examples added inline.
- Coreference resolution promoted to a numbered, mandatory step with a flag. Entirely-empty across the output is FAIL; Reviewer spot-checks 5 random pronoun-bearing sentences.
- Reviewer must run in a fresh context and write before loading the Extractor's output. The independent-candidate diff is a checklist item. If true context isolation isn't available, the Reviewer must declare .
- Every Reviewer checklist item now requires an field (file path, shell command output, or structured sub-check). Self-attestation is no longer acceptable.
- Ontology Mapper must declare . Unvalidated IDs trigger a Reviewer spot-check of ≥10 IDs against OLS/HGNC/NCBITaxon.
- Class-coverage sweep extended with / patterns (instrument names, assay names, sample-prep verbs) to catch terms like cryostat and ependymal that v1 missed.
- Added an anti-shortcut table mapping each v1 failure mode to the v2 control that prevents it, so a future executing model sees the trap before falling into it.

v1 prompt retained at pipeline_prompt.md for reference and diffing.
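
The literal-substring constraint in the list above can be illustrated with a plain containment test. This is a sketch under assumptions: the field and tool names are elided in the commit message, so `non_literal_quotes` and its arguments are hypothetical, and Python's `in` operator stands in for whatever literal-match tool the prompt names.

```python
def non_literal_quotes(quotes: list[str], paper_text: str) -> list[str]:
    """Return quotes that do not occur verbatim in the paper text.

    `in` performs literal matching, so a regex pattern only passes if the
    paper happens to contain those exact characters.
    """
    return [quote for quote in quotes if quote not in paper_text]


paper_text = (
    "six major dopaminergic MSN subtypes: STRd D1 Striosome, STRd D1 Matrix, "
    "STRv D1, STRd D2 Striosome, STRd D2 Matrix and STRv D2."
)
quotes = [
    "STRd D1 Striosome",    # literal substring of the sentence
    r"STRd D\d Striosome",  # regex pattern, not a literal substring
]
print(non_literal_quotes(quotes, paper_text))
```

A Reviewer could attach the returned list as evidence for the checklist item, since an empty list is reproducible from the paper text alone.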

first draft of agent_team prompt and output

b674391

puja-trivedi marked this pull request as draft May 13, 2026 21:39

gemini-code-assist Bot reviewed May 13, 2026


added second prompt, changes: 1. Replace the discretionary masking trigger with a zero-delta termination rule. 2. Make source_sentences mean every sentence, and add an occurrence count. 3. Give the Reviewer a hard exhaustiveness check, not just a schema check.

15a48b7
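
One reading of a zero-delta termination rule is that extraction rounds repeat until a round adds no new entities. The sketch below illustrates that reading only; the prompt's actual rule is not quoted in this PR, and `run_until_zero_delta`, `toy_round`, and the `max_rounds` cap are all hypothetical.

```python
from typing import Callable, Iterable


def run_until_zero_delta(
    extract_round: Callable[[str, set[str]], Iterable[str]],
    text: str,
    max_rounds: int = 10,
) -> tuple[set[str], int]:
    """Repeat extraction until a round contributes zero new entities.

    Returns the accumulated entities and the number of rounds executed.
    """
    found: set[str] = set()
    for round_no in range(1, max_rounds + 1):
        new = set(extract_round(text, found)) - found
        if not new:  # zero delta: nothing new this round, so stop
            return found, round_no
        found |= new
    return found, max_rounds


def toy_round(text: str, already_found: set[str]) -> list[str]:
    """Toy extractor that surfaces at most one unseen term per round."""
    for term in ("striatum", "dopamine", "cryostat"):
        if term in text and term not in already_found:
            return [term]
    return []


entities, rounds = run_until_zero_delta(toy_round, "dopamine release in the striatum")
print(sorted(entities), rounds)
```

The cap on rounds is a safety net for the sketch; a real run would more likely treat hitting the cap as a failure than as a normal exit.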

puja-trivedi requested review from satra and tekrajchhetri May 14, 2026 19:27

puja-trivedi added 2 commits May 15, 2026 09:44

added masked version of text by replacing the entity name with an equivalent sequence of '*'

af589c6
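
The masking described in this commit message, replacing each entity name with an equally long run of '*', could look roughly like the following. It is a sketch, not the code in the commit: `mask_entities` is a hypothetical name, and masking longer names first is an added assumption so that a short name cannot break up a longer one.

```python
def mask_entities(text: str, entity_names: list[str]) -> str:
    """Replace every occurrence of each entity name with '*' of equal length."""
    # Longest names first, so "STRv D1" is masked before a shorter name like "D1".
    for name in sorted(entity_names, key=len, reverse=True):
        text = text.replace(name, "*" * len(name))
    return text


original = "STRv D1 and STRv D2"
masked = mask_entities(original, ["STRv D1"])
print(masked)
```

Because each replacement has the same length as the name it covers, character offsets in the masked text still line up with the original.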

puja-trivedi wants to merge 4 commits into main from claude_agent_team
