Eidolons

A personal, portable team of AI agents. Named specialists that work alone when the task is sharp, in harmony when it's big — and travel with you from project to project, host to host.

Most AI coding tools ship a single generalist that plans, scouts, builds, and documents all at once — and hits a ceiling fast. Eidolons is a different shape: eight independently-versioned specialists across seven roles, one CLI, dropped into any project. You get sharp boundaries instead of one confused generalist — the right specialist for each phase, over a shared memory that carries context between them.

And the routing is mechanical, not hopeful. Most multi-agent setups are a paragraph in CLAUDE.md the model is free to ignore — so it does, until you name an agent yourself. Eidolons installs real per-host hooks: at session start a deterministic, non-LLM kernel computes the routing decision and injects it — plus recalled memory — into context, on its own, every time. The team travels across Claude Code, Codex, GitHub Copilot, Cursor, and OpenCode, and degrades gracefully to documentary routing wherever a host's hooks aren't sound.

Try it in 60 seconds

Evaluation, not commitment — this drops a read-only ATLAS into a throwaway folder:

curl -sSL https://raw.githubusercontent.com/Rynaro/eidolons/main/cli/install.sh | bash
cd /tmp && mkdir eidolons-demo && cd eidolons-demo
eidolons init --preset minimal --non-interactive

Explore, then rm -rf /tmp/eidolons-demo and walk away. Full flow in Install.

Does it actually work?

Two questions, both measured against a bare host running the same model — no hand-waving.

1. Does wiring in the team change what the host actually does? eidolons eval compliance runs one prompt suite through a headless host twice — once with the harness wired, once with only the prose cortex (≈ a bare host) — and scores how it routes.

Routing _{(Claude Code · k=2 · 56 sessions)}	Prose only	With Eidolons
Stability — picks the right specialist on both runs	16.7%	58.3% _(3.5×)
Routes "which approach?" → the reasoner	0%	100%
Correct target overall	58.3%	66.7%
False delegation on control prompts	0%	0%

The signature win is consistency — the injection makes routing far more deterministic. (This is a SessionStart-only lower bound: headless hosts don't fire per-prompt hooks, so the interactive number is expected higher. Honest writeup: .spectra/research/compliance-eval-2026-06-12.md.)

2. Does the specialist shape beat one generalist pass? On an adversarial-hard coding suite (budget-matched, k=2) — measured in Vivi's own repo — Vivi's parallel-candidate shape lands every fix where a single pass lands two-thirds:

Hard-task fix quality _{(pass², resolved on both runs)}	Single pass	Vivi (fanout)
Adversarial-hard suite	0.67	1.00

Zero reward-hacks in 63 holdout-gated runs. And Kupo, the executor the team delegates micro-tasks to, earned its roster seat on a behavioral additive-proof — 36/36 tasks, pass³ 1.00.

Don't trust our numbers — reproduce the floor yourself. The behavioral evals above are billed and model-dependent, but the routing decision underneath them is a deterministic, non-LLM kernel — so we ship it as a benchmark anyone can run cold, with no API key, no billing, and ~0 tokens:

eidolons eval routing --suite public      # 15 labelled tasks across 12 routing categories

It grades the kernel's output against Eidolons-authored ground truth (evals/routing-suite.yaml); because the kernel is deterministic, pass^k == pass^1 and you'll get the exact same result we do. (--validate-suite self-tests the suite; --json for machine output.) This is the reproducible floor the billed evals build on — verify it, then weigh the rest.

These are early, small-N signals, framed honestly in the research digests and CHANGELOG.md — not marketing.

Meet the team

Eidolon	What it does	Reach for it when…
ATLAS _scout	Maps an unfamiliar codebase without writing a line. Evidence-anchored, read-only by construction.	Auditing a new repo, onboarding, before any change.
SPECTRA _planner	Turns a rough idea or scout report into a decision-ready spec — rubrics, gates, GIVEN/WHEN/THEN.	Planning a feature before you build it.
Vivi _{coder · default}	The default coder. Brownfield, pattern-first, test-anchored — drives a closed edit-run-test loop and gates on `pass^k` instead of one green run.	Shipping the change SPECTRA planned, on a loop-capable host.
APIVR-Δ _{coder · fallback}	Vivi's conservative predecessor, for hosts without the closed loop. Same discipline, non-loop posture — add with `eidolons add apivr`.	A loop-incompetent host, or a cautious builder.
IDG _scriber	Synthesizes docs from sessions, specs, and deltas — provenance-first, with `[GAP]`/`[DISPUTED]` markers.	Chronicling what you just built.
FORGE _reasoner	Deliberates on ambiguous trade-offs. Names alternatives, surfaces assumptions, returns a verdict + confidence.	Two patterns apply and the choice isn't obvious.
VIGIL _debugger	Forensic debugger for failures that resist normal repair. Reproduction-gated, counterfactual-verified.	A flaky test, heisenbug, or unexplained regression.
Kupo _executor	Low-effort delegate target. Patches an ephemeral sandbox, proves it with a real verifier, and proposes the patch back — never writes the real tree.	Offloading trivial localized edits to keep a session lean.
CRYSTALIUM _memory	The shared four-layer memory substrate every member writes to and recalls from — tier-gated writes, hybrid recall, Dream consolidation, principled forgetting.	Carrying context and learned patterns across sessions and members.

Eight shipped specialists across seven capability classes — scout, planner, coder, scriber, reasoner, debugger, executor — plus CRYSTALIUM, the memory substrate underneath them all. Versions and handoff contracts live in roster/index.yaml, the machine-readable source of truth.

How they compose

The team has a default shape: ATLAS scouts, SPECTRA plans, Vivi builds, IDG chronicles. FORGE and VIGIL are lateral specialists — consultable at any stage. CRYSTALIUM sits underneath all of them, the shared memory every member writes handoffs into and recalls from. Partial teams are first-class: bring just ATLAS to an audit, or the full pipeline to a greenfield.

Canonical pipeline

ATLAS ───▶ SPECTRA ───▶  Vivi  ───▶ IDG
  scout      plan         build       chronicle
             ▲             │ ▲
           FORGE ◀── (ambiguity, trade-offs, novel problems)
                           │ │
                         VIGIL ◀── (failure resisted repair; forensic attribution)
                           │
                         Kupo ◀── (localized verifier-backed micro-tasks, PROPOSE-only)

  ╞════════════════ CRYSTALIUM ════════════════╡
   shared memory — every member commits handoff
   artifacts and recalls them (bidirectional)

Handoffs are structured artifacts written to disk, not free-form messages. See methodology/composition.md for the contract table and partial-team matrix.

Mechanical routing — the harness

A descriptor table in a prose file can only suggest delegation; the host decides whether to listen, and usually it doesn't until you name an Eidolon yourself. The decision to build a mechanical harness wasn't a hunch — it's backed by a research synthesis of 112 adversarially-verified capability rows across 18 agents (DOSSIER-HARNESS-2026-06.md). The harness closes that gap with three pieces:

A deterministic routing kernel. eidolons run "<prompt>" --json classifies a prompt against roster/routing.yaml — no LLM, fully reproducible — and emits which Eidolon(s) handle it, at what tier, in what chain.
Per-host hook adapters. eidolons harness install wires that kernel into each host's own lifecycle hooks. At session start the routing artifact and a memory digest are injected as context — the host doesn't have to remember to delegate; the routing arrives on its own.
Graceful degradation. Routing is injected by default (advisory-mechanical). Where a host's hooks are absent or buggy, it silently falls back to documentary routing — never worse than prose.

eidolons harness install            # wire routing + memory injection into detected hosts
eidolons harness install --strict   # opt-in: add tool-boundary delegate-or-deny where sound
eidolons harness status             # per-host effective enforcement tier

For a hard backstop, --strict adds a PreToolUse delegate-or-deny tier that mechanically blocks direct main-loop edits (only delegated subagents may write), with soundness graded per host. When CRYSTALIUM is installed, the session-start hook also runs eidolons memory preflight — a one-shot recall that injects prior project memory, fail-open and bounded. The full per-host capability matrix is in DOSSIER-HARNESS-2026-06.md and docs/architecture.md § "Harness Layer".

Spec-Driven lifecycle — ESL

Routing decides who works. ESL — the Eidolons Spec Lifecycle — decides how a change moves, so non-trivial work runs through a right-sized, auditable lifecycle instead of a one-shot prompt. It isn't a second framework bolted on: the specialists you already have are the lifecycle — SPECTRA specifies, FORGE deliberates, Vivi implements, Kupo/VIGIL verify, IDG archives — and ESL is the thin grammar that sequences them, change by change, on disk under .spectra/changes/. Each Eidolon ships its own lifecycle hop; the cortex orchestrates the rest.

It's built deliberately against the documented failure modes of spec-driven development — over-specification, instruction bloat, spec-as-waterfall, "spec" as a throwaway prompt:

A mechanical right-sizing gate classifies every change by observable signals — trivial → Kupo direct (no ceremony), lite → one-page spec, full → the whole lifecycle. You can't over-specify a one-line fix.
maker ≠ checker, enforced — the implementer and the verifier are mechanically distinct identities, checked on the hand-off envelope. A change cannot self-verify.
Drift-check before archive re-derives the change against its living spec, catching implementation that outran the intent.
Opt-in, then mechanically forced — advisory by default (SHOULD open a change first); escalates to blocking (MUST) once the project crosses mechanical size thresholds (change-count / repo-LOC / full-spec ratio), recorded auditably in the lock. With tonberry installed, the harness injects an ESL reminder at every session start and on every non-trivial routed prompt — not left to memory. Trivial work is always exempt; install auto-assesses (skip with EIDOLONS_SKIP_AUTO_ASSESS=1).

The official implementation is tonberry — a thin (~13 MB, distroless) Go MCP whose verify is byte-identical to a zero-dependency bash 3.2 conformance checker, so the rich runtime and the minimal reference can never drift. Install it with eidolons mcp install tonberry; the contract it implements is Rynaro/eidolons-esl. ESL is opt-in — absent the MCP, the Eidolons route and build exactly as before. Once installed, the surfacing is mechanical — injected at session start every time, not (yet) a hard edit-time block.

Install

One-time, global:

curl -sSL https://raw.githubusercontent.com/Rynaro/eidolons/main/cli/install.sh | bash

This installs the eidolons CLI to ~/.local/bin/eidolons and caches the nexus at ~/.eidolons/nexus.

Per project — empty folders or running projects:

cd <any-project>
eidolons init                # interactive — choose members and preset (offers CRYSTALIUM memory)
eidolons add forge           # add a single member later
eidolons sync                # reconcile installed members to eidolons.yaml
eidolons harness install     # wire mechanical routing + memory injection into your hosts
eidolons verify              # re-check installed Eidolons against the roster's signed metadata

Keep the nexus current with eidolons upgrade self (atomic, integrity-verified, with --check and --rollback); upgrade installed Eidolons with eidolons upgrade. MCP servers — CRYSTALIUM memory and the tonberry ESL runtime — are a separate catalogue managed through eidolons mcp {list,install,upgrade,…}. Commit eidolons.lock alongside eidolons.yaml for reproducible, tamper-evident installs. Full walkthrough: docs/getting-started.md.

Verified releases

Every shipped Eidolon publishes attestation-backed releases through one canonical workflow (eidolon-release-template.yml) hosted here. Each release records its commit, tree, and archive SHA-256 into roster/index.yaml via Roster Intake. Under the default integrity.enforcement: strict posture, eidolons sync and eidolons verify abort with exit 1 if any installed Eidolon's checksum drifts from the signed metadata — the same gate Roster Health runs nightly. Nexus releases use the same model. Read the trust model at docs/release-integrity.md.

What's in this repo

Area	What it contains
`roster/`	Machine-readable registry of every Eidolon — versions, repos, handoffs; the routing table (`routing.yaml`) and MCP catalogue (`mcps.yaml`)
`methodology/`	Design principles, composition contracts, the routing cortex, vocabulary
`research/`	Papers, citations, production patterns, scientific backing
`cli/`	The `eidolons` command-line tool — installs, wires, and orchestrates the team
`schemas/`	JSON Schemas for `eidolons.yaml`, `eidolons.lock`, roster entries, eval suites
`docs/`	Getting started, architecture, CLI reference, MCP store, model management, release integrity
`examples/`	Worked examples: greenfield, brownfield, solo-member, partial-team

Why a nexus, when each Eidolon is independently installable?

Each Eidolon is its own first-class repo, independently versioned — that's a hard invariant. The nexus is a coordinator, not an owner. It exists for what no single Eidolon can hold:

Discovery — without a roster, nobody knows which Eidolons exist or how they relate.
Composition — handoff contracts, pipeline conventions, the routing kernel, and partial-team patterns are shared assets.
Research — the scientific backing for the whole program lives in one place instead of drifting across repos.
Wiring — one eidolons add atlas,spectra,vivi beats fifty lines of clone-and-install docs, and eidolons harness install reaches hosts no individual Eidolon could.
Supply-chain integrity — one canonical signing workflow, one ingestion path, one consumer-side gate. Independent signing schemes would defeat the trust model.

The four-layer architecture (install standard → Eidolon repos → this nexus → consumer project) is documented in docs/architecture.md. The install contract every Eidolon satisfies is the Eidolons Individual Install Standard (Rynaro/eidolons-eiis) — versioned independently; the CLI refuses to install non-conformant members.

Contributing

Per-Eidolon bugs and features belong in that Eidolon's repo (an ATLAS finding → Rynaro/ATLAS). CLI bugs, roster issues, and composition-contract changes belong here. Install-standard questions belong in Rynaro/eidolons-eiis. Unsure which layer owns a concern? docs/architecture.md maps the four layers and their responsibilities.

License

Apache-2.0. See LICENSE.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Eidolons

Try it in 60 seconds

Does it actually work?

Meet the team

How they compose

Mechanical routing — the harness

Spec-Driven lifecycle — ESL

Install

Verified releases

What's in this repo

Contributing

License

About

Uh oh!

Releases 65

Packages

Uh oh!

Contributors

Uh oh!

Languages

Name		Name	Last commit message	Last commit date
Latest commit History 576 Commits
.claude/skills		.claude/skills
.github/workflows		.github/workflows
.spectra		.spectra
art		art
assets		assets
bin		bin
cli		cli
docker		docker
docs		docs
evals		evals
examples		examples
methodology		methodology
research		research
roster		roster
schemas		schemas
scripts		scripts
specs		specs
.gitignore		.gitignore
.mcp.json		.mcp.json
CHANGELOG.md		CHANGELOG.md
CLAUDE.md		CLAUDE.md
DOSSIER-HARNESS-2026-06.md		DOSSIER-HARNESS-2026-06.md
EIDOLONS.md		EIDOLONS.md
LICENSE		LICENSE
MANIFESTO.md		MANIFESTO.md
Makefile		Makefile
README.md		README.md
VERSION		VERSION

Folders and files

Latest commit

History

Repository files navigation

Eidolons

Try it in 60 seconds

Does it actually work?

Meet the team

How they compose

Mechanical routing — the harness

Spec-Driven lifecycle — ESL

Install

Verified releases

What's in this repo

Contributing

License

About

Resources

License

Uh oh!

Stars

Watchers

Forks

Releases 65

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

Packages