A personal, portable team of AI agents. Named specialists that work alone when the task is sharp, in harmony when it's big — and travel with you from project to project, host to host.
Most AI coding tools ship a single generalist that plans, scouts, builds, and documents all at once — and hits a ceiling fast. Eidolons is a different shape: eight independently-versioned specialists across seven roles, one CLI, dropped into any project. You get sharp boundaries instead of one confused generalist — the right specialist for each phase, over a shared memory that carries context between them.
And the routing is mechanical, not hopeful. Most multi-agent setups are a paragraph in CLAUDE.md the model is free to ignore — so it does, until you name an agent yourself. Eidolons installs real per-host hooks: at session start a deterministic, non-LLM kernel computes the routing decision and injects it — plus recalled memory — into context, on its own, every time. The team travels across Claude Code, Codex, GitHub Copilot, Cursor, and OpenCode, and degrades gracefully to documentary routing wherever a host's hooks aren't sound.
Evaluation, not commitment — this drops a read-only ATLAS into a throwaway folder:
curl -sSL https://raw.githubusercontent.com/Rynaro/eidolons/main/cli/install.sh | bash
cd /tmp && mkdir eidolons-demo && cd eidolons-demo
eidolons init --preset minimal --non-interactiveExplore, then rm -rf /tmp/eidolons-demo and walk away. Full flow in Install.
Two questions, both measured against a bare host running the same model — no hand-waving.
1. Does wiring in the team change what the host actually does? eidolons eval compliance runs one prompt suite through a headless host twice — once with the harness wired, once with only the prose cortex (≈ a bare host) — and scores how it routes.
| Routing (Claude Code · k=2 · 56 sessions) | Prose only | With Eidolons |
|---|---|---|
| Stability — picks the right specialist on both runs | 16.7% | 58.3% (3.5×) |
| Routes "which approach?" → the reasoner | 0% | 100% |
| Correct target overall | 58.3% | 66.7% |
| False delegation on control prompts | 0% | 0% |
The signature win is consistency — the injection makes routing far more deterministic. (This is a SessionStart-only lower bound: headless hosts don't fire per-prompt hooks, so the interactive number is expected higher. Honest writeup: .spectra/research/compliance-eval-2026-06-12.md.)
2. Does the specialist shape beat one generalist pass? On an adversarial-hard coding suite (budget-matched, k=2) — measured in Vivi's own repo — Vivi's parallel-candidate shape lands every fix where a single pass lands two-thirds:
| Hard-task fix quality (pass², resolved on both runs) | Single pass | Vivi (fanout) |
|---|---|---|
| Adversarial-hard suite | 0.67 | 1.00 |
Zero reward-hacks in 63 holdout-gated runs. And Kupo, the executor the team delegates micro-tasks to, earned its roster seat on a behavioral additive-proof — 36/36 tasks, pass³ 1.00.
Don't trust our numbers — reproduce the floor yourself. The behavioral evals above are billed and model-dependent, but the routing decision underneath them is a deterministic, non-LLM kernel — so we ship it as a benchmark anyone can run cold, with no API key, no billing, and ~0 tokens:
eidolons eval routing --suite public # 15 labelled tasks across 12 routing categoriesIt grades the kernel's output against Eidolons-authored ground truth (evals/routing-suite.yaml); because the kernel is deterministic, pass^k == pass^1 and you'll get the exact same result we do. (--validate-suite self-tests the suite; --json for machine output.) This is the reproducible floor the billed evals build on — verify it, then weigh the rest.
These are early, small-N signals, framed honestly in the research digests and CHANGELOG.md — not marketing.
| Eidolon | What it does | Reach for it when… | Latest |
|---|---|---|---|
| ATLAS scout | Maps an unfamiliar codebase without writing a line. Evidence-anchored, read-only by construction. | Auditing a new repo, onboarding, before any change. | |
| SPECTRA planner | Turns a rough idea or scout report into a decision-ready spec — rubrics, gates, GIVEN/WHEN/THEN. | Planning a feature before you build it. | |
| Vivi coder · default | The default coder. Brownfield, pattern-first, test-anchored — drives a closed edit-run-test loop and gates on pass^k instead of one green run. |
Shipping the change SPECTRA planned, on a loop-capable host. | |
| APIVR-Δ coder · fallback | Vivi's conservative predecessor, for hosts without the closed loop. Same discipline, non-loop posture — add with eidolons add apivr. |
A loop-incompetent host, or a cautious builder. | |
| IDG scriber | Synthesizes docs from sessions, specs, and deltas — provenance-first, with [GAP]/[DISPUTED] markers. |
Chronicling what you just built. | |
| FORGE reasoner | Deliberates on ambiguous trade-offs. Names alternatives, surfaces assumptions, returns a verdict + confidence. | Two patterns apply and the choice isn't obvious. | |
| VIGIL debugger | Forensic debugger for failures that resist normal repair. Reproduction-gated, counterfactual-verified. | A flaky test, heisenbug, or unexplained regression. | |
| Kupo executor | Low-effort delegate target. Patches an ephemeral sandbox, proves it with a real verifier, and proposes the patch back — never writes the real tree. | Offloading trivial localized edits to keep a session lean. | |
| CRYSTALIUM memory | The shared four-layer memory substrate every member writes to and recalls from — tier-gated writes, hybrid recall, Dream consolidation, principled forgetting. | Carrying context and learned patterns across sessions and members. |
Eight shipped specialists across seven capability classes — scout, planner, coder, scriber, reasoner, debugger, executor — plus CRYSTALIUM, the
memorysubstrate underneath them all. Versions and handoff contracts live inroster/index.yaml, the machine-readable source of truth.
The team has a default shape: ATLAS scouts, SPECTRA plans, Vivi builds, IDG chronicles. FORGE and VIGIL are lateral specialists — consultable at any stage. CRYSTALIUM sits underneath all of them, the shared memory every member writes handoffs into and recalls from. Partial teams are first-class: bring just ATLAS to an audit, or the full pipeline to a greenfield.
Canonical pipeline
ATLAS ───▶ SPECTRA ───▶ Vivi ───▶ IDG
scout plan build chronicle
▲ │ ▲
FORGE ◀── (ambiguity, trade-offs, novel problems)
│ │
VIGIL ◀── (failure resisted repair; forensic attribution)
│
Kupo ◀── (localized verifier-backed micro-tasks, PROPOSE-only)
╞════════════════ CRYSTALIUM ════════════════╡
shared memory — every member commits handoff
artifacts and recalls them (bidirectional)
Handoffs are structured artifacts written to disk, not free-form messages. See methodology/composition.md for the contract table and partial-team matrix.
A descriptor table in a prose file can only suggest delegation; the host decides whether to listen, and usually it doesn't until you name an Eidolon yourself. The decision to build a mechanical harness wasn't a hunch — it's backed by a research synthesis of 112 adversarially-verified capability rows across 18 agents (DOSSIER-HARNESS-2026-06.md). The harness closes that gap with three pieces:
- A deterministic routing kernel.
eidolons run "<prompt>" --jsonclassifies a prompt againstroster/routing.yaml— no LLM, fully reproducible — and emits which Eidolon(s) handle it, at what tier, in what chain. - Per-host hook adapters.
eidolons harness installwires that kernel into each host's own lifecycle hooks. At session start the routing artifact and a memory digest are injected as context — the host doesn't have to remember to delegate; the routing arrives on its own. - Graceful degradation. Routing is injected by default (advisory-mechanical). Where a host's hooks are absent or buggy, it silently falls back to documentary routing — never worse than prose.
eidolons harness install # wire routing + memory injection into detected hosts
eidolons harness install --strict # opt-in: add tool-boundary delegate-or-deny where sound
eidolons harness status # per-host effective enforcement tierFor a hard backstop, --strict adds a PreToolUse delegate-or-deny tier that mechanically blocks direct main-loop edits (only delegated subagents may write), with soundness graded per host. When CRYSTALIUM is installed, the session-start hook also runs eidolons memory preflight — a one-shot recall that injects prior project memory, fail-open and bounded. The full per-host capability matrix is in DOSSIER-HARNESS-2026-06.md and docs/architecture.md § "Harness Layer".
Routing decides who works. ESL — the Eidolons Spec Lifecycle — decides how a change moves, so non-trivial work runs through a right-sized, auditable lifecycle instead of a one-shot prompt. It isn't a second framework bolted on: the specialists you already have are the lifecycle — SPECTRA specifies, FORGE deliberates, Vivi implements, Kupo/VIGIL verify, IDG archives — and ESL is the thin grammar that sequences them, change by change, on disk under .spectra/changes/. Each Eidolon ships its own lifecycle hop; the cortex orchestrates the rest.
It's built deliberately against the documented failure modes of spec-driven development — over-specification, instruction bloat, spec-as-waterfall, "spec" as a throwaway prompt:
- A mechanical right-sizing gate classifies every change by observable signals —
trivial→ Kupo direct (no ceremony),lite→ one-page spec,full→ the whole lifecycle. You can't over-specify a one-line fix. - maker ≠ checker, enforced — the implementer and the verifier are mechanically distinct identities, checked on the hand-off envelope. A change cannot self-verify.
- Drift-check before archive re-derives the change against its living spec, catching implementation that outran the intent.
- Opt-in, then mechanically forced — advisory by default (
SHOULDopen a change first); escalates to blocking (MUST) once the project crosses mechanical size thresholds (change-count / repo-LOC / full-spec ratio), recorded auditably in the lock. With tonberry installed, the harness injects an ESL reminder at every session start and on every non-trivial routed prompt — not left to memory. Trivial work is always exempt; install auto-assesses (skip withEIDOLONS_SKIP_AUTO_ASSESS=1).
The official implementation is tonberry — a thin (~13 MB, distroless) Go MCP whose verify is byte-identical to a zero-dependency bash 3.2 conformance checker, so the rich runtime and the minimal reference can never drift. Install it with eidolons mcp install tonberry; the contract it implements is Rynaro/eidolons-esl. ESL is opt-in — absent the MCP, the Eidolons route and build exactly as before. Once installed, the surfacing is mechanical — injected at session start every time, not (yet) a hard edit-time block.
One-time, global:
curl -sSL https://raw.githubusercontent.com/Rynaro/eidolons/main/cli/install.sh | bashThis installs the eidolons CLI to ~/.local/bin/eidolons and caches the nexus at ~/.eidolons/nexus.
Per project — empty folders or running projects:
cd <any-project>
eidolons init # interactive — choose members and preset (offers CRYSTALIUM memory)
eidolons add forge # add a single member later
eidolons sync # reconcile installed members to eidolons.yaml
eidolons harness install # wire mechanical routing + memory injection into your hosts
eidolons verify # re-check installed Eidolons against the roster's signed metadataKeep the nexus current with eidolons upgrade self (atomic, integrity-verified, with --check and --rollback); upgrade installed Eidolons with eidolons upgrade. MCP servers — CRYSTALIUM memory and the tonberry ESL runtime — are a separate catalogue managed through eidolons mcp {list,install,upgrade,…}. Commit eidolons.lock alongside eidolons.yaml for reproducible, tamper-evident installs. Full walkthrough: docs/getting-started.md.
Every shipped Eidolon publishes attestation-backed releases through one canonical workflow (eidolon-release-template.yml) hosted here. Each release records its commit, tree, and archive SHA-256 into roster/index.yaml via Roster Intake. Under the default integrity.enforcement: strict posture, eidolons sync and eidolons verify abort with exit 1 if any installed Eidolon's checksum drifts from the signed metadata — the same gate Roster Health runs nightly. Nexus releases use the same model. Read the trust model at docs/release-integrity.md.
| Area | What it contains |
|---|---|
roster/ |
Machine-readable registry of every Eidolon — versions, repos, handoffs; the routing table (routing.yaml) and MCP catalogue (mcps.yaml) |
methodology/ |
Design principles, composition contracts, the routing cortex, vocabulary |
research/ |
Papers, citations, production patterns, scientific backing |
cli/ |
The eidolons command-line tool — installs, wires, and orchestrates the team |
schemas/ |
JSON Schemas for eidolons.yaml, eidolons.lock, roster entries, eval suites |
docs/ |
Getting started, architecture, CLI reference, MCP store, model management, release integrity |
examples/ |
Worked examples: greenfield, brownfield, solo-member, partial-team |
Why a nexus, when each Eidolon is independently installable?
Each Eidolon is its own first-class repo, independently versioned — that's a hard invariant. The nexus is a coordinator, not an owner. It exists for what no single Eidolon can hold:
- Discovery — without a roster, nobody knows which Eidolons exist or how they relate.
- Composition — handoff contracts, pipeline conventions, the routing kernel, and partial-team patterns are shared assets.
- Research — the scientific backing for the whole program lives in one place instead of drifting across repos.
- Wiring — one
eidolons add atlas,spectra,vivibeats fifty lines of clone-and-install docs, andeidolons harness installreaches hosts no individual Eidolon could. - Supply-chain integrity — one canonical signing workflow, one ingestion path, one consumer-side gate. Independent signing schemes would defeat the trust model.
The four-layer architecture (install standard → Eidolon repos → this nexus → consumer project) is documented in docs/architecture.md. The install contract every Eidolon satisfies is the Eidolons Individual Install Standard (Rynaro/eidolons-eiis) — versioned independently; the CLI refuses to install non-conformant members.
Per-Eidolon bugs and features belong in that Eidolon's repo (an ATLAS finding → Rynaro/ATLAS). CLI bugs, roster issues, and composition-contract changes belong here. Install-standard questions belong in Rynaro/eidolons-eiis. Unsure which layer owns a concern? docs/architecture.md maps the four layers and their responsibilities.
Apache-2.0. See LICENSE.
