All notable changes to this project will be documented in this file.
The format is based on Keep a Changelog, and this project adheres to Semantic Versioning.
- run/mcp: Make fat-harness acceptance opt-in via
seed.orchestrator.execution_mode: fat_harnessfor fresh CLI/MCP seed execution. Missing/blank execution mode now uses the default runner again until seed authoring and QA guidance consistently emit profile-compatible typed evidence for every AC. This mitigates layered scaffold AC failures reported in #1202.
- providers: GitHub Copilot CLI adapter (
CopilotCliLLMAdapter) — first-class peer of Codex/Gemini/OpenCode adapters. Switch withOUROBOROS_LLM_BACKEND=copilot. Uses localcopilot -pnon-interactive mode withGH_TOKEN/GITHUB_TOKENauth, hard tool envelope via--available-tools+--allow-tool+--add-dir, sandbox-class permission mapping, JSONL stream parsing, recursion guard via shared_OUROBOROS_DEPTHcounter (max depth 5), and auth-error short-circuit on401/missing-token detections. Optional install:pip install ouroboros-ai[copilot](the Copilot CLI itself is installed externally). - opencode: Subagent bridge plugin (
src/ouroboros/opencode/plugin/ouroboros-bridge.ts) — routes MCPouroboros_*tool calls with a_subagentparameter into OpenCode's native Task subagent panes viasession.promptAsync. Fire-and-forget dispatch returns from the hook in ~10ms, eliminating the blocking 200s+ latency of the previoussession.promptapproach. Installed automatically byouroboros setup. See OpenCode Subagent Bridge. - lateral_think: Parallel multi-persona dispatch —
ouroboros_lateral_thinknow acceptspersona="all"orpersonas=["hacker","architect",...]to fan out to multiple lateral-thinking personas in a single call. Each persona runs in its own Task pane with an independent LLM context, eliminating anchoring bias across alternatives. Uses new_subagents(plural) JSON contract, implemented server-side viabuild_lateral_multi_subagent()and plugin-side via MAX_FANOUT=10 parallelpromptAsyncwith per-payload dedupe and error isolation. - opencode/bridge: Plugin v23 recognizes
_subagentsarray for parallel fan-out. Per-payload validation, truncation, and dedupe. One failed dispatch does not abort the rest. Newouroboros_subagentsandouroboros_dispatch_errorsmetadata fields. Backwards compatible with v22 single-payload_subagentcontract.
- skills: Renamed the packaged
resumeskill toresume-sessionso Claude Code's built-in/resumesession picker is no longer shadowed. Useooo resume-sessionor/ouroboros:resume-sessionfor the Ouroboros in-flight session listing. - mcp/security:
FREETEXT_FIELDSallowlist for user-input fields (goals, prompts, descriptions) — shell metacharacters (;,|,&, backticks,$()) are no longer rejected in fields where they are legitimate prose. Structural fields remain strictly validated. - opencode/bridge: Robustness hardening (v22) — no uncaught errors under any input. Adds reject-path logging, frozen-content guards, empty-sessionID guard, client init-order guard, 5-second FNV-1a prompt dedupe, 100 KB prompt byte cap with truncation marker, user-visible
surfaceErr()for dispatch failures (no more silent "dispatched but never ran"), and an absolute outer try/catch so the plugin cannot throw into the opencode runLoop.
- interview: Fix empty response bypass in ClaudeCodeAdapter — empty content now always triggers error regardless of session_id
- interview: Fix sub-agent turn exhaustion — increase max_turns from 1 to 3 so the agent can use tools and still generate the question
- style: Apply ruff format to 4 files
- ci: Resolve ruff and mypy CI failures
- mcp: Initialize EventStore in ExecuteSeedHandler before passing to OrchestratorRunner
- mcp: Remove double-registration in CLI that overwrote dependency-injected handlers with empty ones
- mcp: Return proper MCP error responses (isError:true) instead of error text in success
- mcp: Catch
pydantic.ValidationErrorin ExecuteSeed, MeasureDrift, Evaluate handlers - mcp: Initialize EventStore before EvolutionaryLoop.evolve_step accesses it
- mcp: Forward host/port CLI args to server for SSE transport
- mcp: Remove dead code (discarded EvaluationPipeline/LateralThinker instances)
- mcp: Remove invalid
llm_adapterkwarg from ClaudeAgentAdapter init - orchestrator: Handle DependencyAnalyzer error with all-parallel fallback instead of crash
- seed: Add Pydantic aliases (
typefor field_type,criteriafor evaluation_criteria) - eval: Change EvaluationPipeline/SeedGenerator type annotations from LiteLLMAdapter to LLMAdapter Protocol
- security: Validate nested string values in InputValidator, not just top-level
- security: Use MappingProxyType for frozen dataclass AuthContext.metadata
- protocol: Add credentials param to MCPServer protocol to match implementation
- build: Use dynamic version from
__init__.pyvia hatchling (single source of truth)
- adapter: Handle unknown message types (
rate_limit_event) from Claude Agent SDK with retry logic - interview: Ensure first response is a direct question, not introduction
- mcp: Correct uvx command syntax to use
--python 3.14 --from ouroboros-aifor proper version resolution
Agent System (ouroboros.plugin.agents)
AgentRegistry- Dynamic agent discovery with custom.mdfile support from.claude-plugin/agents/AgentPool- Reusable agent pool with load balancing, auto-scaling, and health monitoringAgentRoleenum - Type-safe role categorization (ANALYSIS, PLANNING, EXECUTION, REVIEW, DOMAIN, PRODUCT, COORDINATION)AgentSpec- Frozen dataclass for agent specifications with tools, capabilities, and model preferences- 4 builtin agents:
executor,planner,verifier,analyst
Skill System (ouroboros.plugin.skills)
SkillRegistry- Hot-reloadable skill discovery from.claude-plugin/skills/MagicKeywordDetector- "ooo:" prefix and trigger keyword routingSkillExecutor- Context-aware skill execution with history trackingSkillDocumentation- Auto-generated documentation from SKILL.md files- 9 new execution mode skills:
autopilot- Autonomous execution from idea to working codeultrawork- Maximum parallelism with parallel agent orchestrationralph- Self-referential loop with verifier verification (includes ultrawork)ultrapilot- Parallel autopilot with file ownership partitioningecomode- Token-efficient execution using haiku and sonnetswarm- N coordinated agents using native runtime teamspipeline- Sequential agent chaining with data passingtutorial- Interactive guided tour for new usersswarm- Team coordination mode
Orchestration (ouroboros.plugin.orchestration)
ModelRouter- PAL (Progressive Auto-escalation) routing with tier selectionScheduler- Parallel task execution with dependency resolution viaTaskGraphRoutingContext- Complexity-aware routing with learning from historyScheduledTask- Task wrapper with priority, dependencies, and timeout support
State Management
- Removed:
StateStore,StateManager,RecoveryManager,StateCompression(dead code — all runtime state managed by EventStore/SQLite)
TUI HUD Components (ouroboros.tui.components)
AgentsPanel- Real-time agent pool status visualizationTokenTracker- Per-agent token usage with cost estimationProgressBar- Multi-phase progress with animated spinnersEventLog- Scrolling event history with color-coded severityHUDDashboard- Unified HUD screen integrating all components
Documentation
docs/compare-alternatives.md- Comparison with other AI agents and frameworksdocs/onboarding-metrics.md- User onboarding metrics and optimization strategiesdocs/marketing/- Marketing assets (social media templates, star campaign, why ouroboros)docs/screenshots/- Screenshot capture guides and production scriptsdocs/videos/- Video production guides and demo scripts- Updated
CONTRIBUTING.md- Full development setup and contribution guide - Updated
docs/architecture.md- Plugin system architecture documentation - Updated
docs/getting-started.md- Enhanced onboarding experience
Developer Experience
- GitHub workflows:
.github/workflows/lint.yml,test.yml,release.yml playground/directory with example models and configurations- 161 new passing tests (149 unit + 12 integration)
Skill Files Updated
- Updated
help,setup,welcomeskills with progressive disclosure - Added 8 new skill SKILL.md files (autopilot, ultrawork, ralph, ultrapilot, ecomode, swarm, pipeline, tutorial)
- Updated CLI onboarding flow to reference new plugin system
- Enhanced skill discovery with automatic trigger keyword indexing
- Improved state persistence across /clear and session restarts
- 161 new tests for plugin system (149 unit + 12 integration)
- All existing TUI and tree tests continue to pass (190 tests)
- Total test count: 1731 passing tests
- CLI Reference (
docs/cli-reference.md) - Complete command reference with examples - Prerequisites section in README with Python 3.14+ requirement
- Contributing section with links to Issues and Discussions
- OSS badges - PyPI version, Python version, License
- Tiered confirmation system for interview rounds:
- Rounds 1-3: Auto-continue (minimum context gathering)
- Rounds 4-15: Ask "Continue?" after each round
- Rounds 16+: Ask "Continue?" with diminishing returns warning
- No hard round limit - User controls when to stop
- New constants:
MIN_ROUNDS_BEFORE_EARLY_EXIT,SOFT_LIMIT_WARNING_THRESHOLD
- Removed
MAX_INTERVIEW_ROUNDShard limit (was 10) is_completenow only checks status (user-controlled completion)record_response()no longer auto-completes at max rounds- System prompt simplified to show "Round N" instead of "Round N of 10"
- Extracted
_run_interview_loop()helper to eliminate code duplication (~60 lines) - State saved immediately after status mutation for consistency
- Updated welcome message to reflect no round limit
- Korean-language requirement documents (
requirement/folder) - Hard round limit enforcement in interview engine
- Code duplication in init.py interview continuation flow
- New security utilities module with comprehensive protection features
- API Key Management
mask_api_key()- Safely mask API keys for logging (shows only last 4 chars)validate_api_key_format()- Basic format validation for API keys
- Sensitive Data Detection
is_sensitive_field()- Detect sensitive field names (api_key, password, token, etc.)is_sensitive_value()- Detect values that look like secretsmask_sensitive_value()- Mask potentially sensitive valuessanitize_for_logging()- Create sanitized copies of dicts for safe logging
- Input Validation
InputValidatorclass with size limits for DoS prevention:MAX_INITIAL_CONTEXT_LENGTH= 50KBMAX_USER_RESPONSE_LENGTH= 10KBMAX_SEED_FILE_SIZE= 1MBMAX_LLM_RESPONSE_LENGTH= 100KB
- Automatic sensitive data masking in structlog processor chain
- API keys, passwords, tokens are now automatically redacted in all log outputs
- Nested dictionaries are recursively sanitized
- Pattern-based detection for values starting with
sk-,pk-,Bearer, etc.
- Input validation now uses
InputValidatorfor consistent size limits start_interview()validates initial context lengthrecord_response()validates user response length
- LLM responses are now validated and truncated if exceeding size limits
- Warning logged when response truncation occurs
- Seed file size is now validated before loading
- Protection against oversized seed files
- API Key Management: Keys are masked in logs, showing only provider prefix and last 4 characters
- Input Validation: All external inputs have size limits to prevent DoS attacks
- Log Sanitization: Sensitive data is automatically masked in all log outputs
- Credentials Protection:
credentials.yamlcontinues to use chmod 600 permissions
- Added comprehensive test suite for security module (39 tests)
- Added sensitive data masking tests for logging module (5 tests)
- All 1341 tests passing
- Initial release with core Ouroboros workflow system
- Big Bang (Phase 0) - Interview and Seed generation
- PAL Router (Phase 1) - Progressive Adaptive LLM selection
- Double Diamond (Phase 2) - Execution engine
- Resilience (Phase 3) - Stagnation detection and lateral thinking
- Evaluation (Phase 4) - Mechanical, semantic, and consensus evaluation
- Secondary Loop (Phase 5) - TODO registry and batch scheduler
- Orchestrator (Epic 8) - Runtime abstraction and orchestration
- CLI interface with Typer
- Event sourcing with SQLite persistence
- Structured logging with structlog
- Various bug fixes and stability improvements
- Initial project structure
- Core types and error hierarchy
- Basic configuration system