Roadmap
Current status
SynthOrg is pre-alpha. The platform, infrastructure, and runtime are built and tested (39,000+ tests in the latest run) and integrated through a REST + WebSocket API, a React 19 dashboard, and a Go CLI. The agent runtime, multi-agent coordinator, work pipeline spine, intake engine, sandbox lifecycle dispatch, and distributed-path consumers are all wired and exercised by deterministic e2e harnesses with a scripted provider (no real LLM spend). What remains in flight is the operator-facing maturity that turns the wired runtime into a polished autonomous studio, plus real-provider acceptance against a live LLM. Progress is tracked openly on the issue tracker.
Available now
Shipped and exercised today (by deterministic e2e harnesses with a scripted provider, zero LLM spend, unless noted):
- API, dashboard, CLI: REST + WebSocket API, the React 19 dashboard, and the Go CLI for Docker orchestration and supply-chain verification.
- Persistence: SQLite (single-node default) and PostgreSQL (multi-instance), dual-backend conformance-tested, with in-process yoyo-managed migrations and ISO 4217 currency stamping on every cost-bearing row. Opt-in TimescaleDB hypertable conversion for the append-only `cost_records` and `audit_entries` tables (`enable_timescaledb: false` by default; ignored gracefully when the extension is absent).
- Provider layer: any LLM via LiteLLM with retry and rate-limit handling; local model management for Ollama and LM Studio.
- Configuration and templates: define a company in YAML; importable agent, department, and company templates with personality presets and locale-aware name generation.
- Agent runtime: a configured provider boots a real agent runtime that executes tasks (LLM + sandboxed tools) under a minimal safety spine (autonomy/trust verdict on tool actions, approval-queue producer for sensitive actions). An empty company (no provider) cleanly rejects task submission.
- Multi-agent coordinator and work pipeline spine: `/coordinate` runs decompose, route, parallel execution, then roll up end-to-end behind the provider-present switch. The shared work pipeline (intake to projects to decompose to solo/team to execute to coordination metrics) is the single integration point every entry adapter feeds, with solo-vs-team decided internally by decomposition.
- Entry adapters: real work-entry paths for the intake engine (`POST /requests/{id}/approve`), the task board (`POST /tasks`), and stated objectives (`POST /objectives`), all driving the pipeline spine.
- Sandbox lifecycle dispatch: `DockerSandbox.execute()` honours `owner_id` and dispatches to the configured per-call / per-agent / per-task lifecycle strategy, with grace-period teardown.
- Distributed dispatch: NATS JetStream queue, worker pool, dead-letter consumer, dedup pruner, and heartbeat subscriber, validated under multi-worker synthetic load (no loss, no duplication).
- Conversational org interface: talk to the company in natural language through clarify-and-propose against the Chief of Staff, per-turn concern routing to the best-fit role agent, multi-agent group chat, human-consented agent-initiated invites, and direct MCP acting under trust (sensitive actions are approval-gated and fail closed when security governance is inactive). The four conversational modes (explain-chat, propose, concern routing, and group chat) are on by default. Explain-chat, propose, and group chat can be toggled per request; concern routing is baked into the proposer at startup and needs a restart to change. Agent-initiated invites and direct MCP acting are off by default.
- Operations: structured logging with correlation tracking and redaction, log shipping, Prometheus metrics, OTLP, HttpOnly-cookie multi-user sessions with CSRF protection, Wolfi apko-composed distroless images, Trivy + Grype scanning, cosign signatures, and SLSA L3 provenance.
- Product studio substrate: persistent project workspace with pluggable git, brownfield codebase intake, living documentation, and a deep requirements interview.
- Operate tier: golden-company benchmark, mission control with run replay, a cost forecast/kill-switch dial, a measurable learning curve, deterministic replay, run narratives, and an adversarial red-team.
- Agent capability layer: a knowledge and provenance retrieval substrate, research mode, continual improvement, governed external API access, and headless-browser and virtual-desktop testing.
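
To make the sandbox lifecycle dispatch item above more concrete, here is a minimal sketch of how a per-call / per-agent / per-task strategy might decide whether to reuse a sandbox. It is illustrative only: `SandboxPool`, `Lifecycle`, `acquire`, and `release` are hypothetical names, not SynthOrg's `DockerSandbox` API, and the sketch assumes that `owner_id` names an agent under the per-agent strategy and a task under the per-task strategy. Grace-period teardown is reduced to an immediate `release`.

```python
from dataclasses import dataclass, field
from enum import Enum
from itertools import count


class Lifecycle(Enum):
    PER_CALL = "per-call"
    PER_AGENT = "per-agent"
    PER_TASK = "per-task"


@dataclass
class SandboxPool:
    """Hypothetical stand-in for a sandbox manager (not the real DockerSandbox)."""

    strategy: Lifecycle
    _live: dict[str, int] = field(default_factory=dict, init=False)
    _ids: count = field(default_factory=count, init=False)

    def acquire(self, owner_id: str) -> tuple[int, bool]:
        """Return (sandbox_id, reusable) for one execute() call."""
        if self.strategy is Lifecycle.PER_CALL:
            # Fresh sandbox every call; nothing is kept for later reuse.
            return next(self._ids), False
        # Assumption: per-agent and per-task differ only in what owner_id
        # names, so both reuse one sandbox per owner.
        if owner_id not in self._live:
            self._live[owner_id] = next(self._ids)
        return self._live[owner_id], True

    def release(self, owner_id: str) -> None:
        """Forget the owner's sandbox (a real system would wait out a grace period)."""
        self._live.pop(owner_id, None)
```

With `SandboxPool(Lifecycle.PER_AGENT)`, two calls for the same `owner_id` share a sandbox, while `Lifecycle.PER_CALL` hands out a new one each time.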
In active development
These turn the wired runtime into a polished autonomous studio. The runtime, coordinator, intake, work pipeline, sandbox dispatch, and distributed-path consumers already run under deterministic harnesses; what remains is operator-facing maturity and real-provider acceptance:
- Self-improvement loop: company-wide signals from existing subsystems producing deployment and product-level improvement proposals through a rule-first hybrid pipeline with mandatory human approval. Components built and unit-tested; live end-to-end run pending.
- Real-provider acceptance: the e2e harness drives the runtime against a deterministic scripted provider, not a real LLM. A real-provider golden-company benchmark and run narrative arrive with the operate tier.
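
The idea of a deterministic scripted provider can be sketched as a test double that replays canned completions in order, so a run makes no LLM calls and yields the same result every time. This is a minimal illustration under stated assumptions: `ScriptedProvider`, `complete`, and `run_task` are hypothetical names, and the two-step "plan then execute" runtime is a toy, not SynthOrg's harness.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class ScriptedProvider:
    """Hypothetical test double: returns pre-written completions in order."""

    script: deque[str]
    calls: int = 0

    def complete(self, prompt: str) -> str:
        # Fail loudly if the runtime asks for more turns than were scripted.
        if not self.script:
            raise RuntimeError(f"script exhausted at prompt: {prompt!r}")
        self.calls += 1
        return self.script.popleft()


def run_task(provider: ScriptedProvider, task: str) -> list[str]:
    """Toy runtime: ask the provider for a plan, then for a result."""
    plan = provider.complete(f"plan: {task}")
    result = provider.complete(f"execute: {plan}")
    return [plan, result]
```

Because the script is fixed, an assertion such as `run_task(ScriptedProvider(deque(["step-1", "done"])), "ship feature") == ["step-1", "done"]` holds on every run. Real-provider acceptance would replace the double with a live LLM client.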
Backlog
Research candidates and longer-term ideas without a scheduled timeframe. See Future Vision for detail.
- Advanced memory architecture (GraphRAG, RL consolidation)
- Distributed multi-node organisational memory consistency (Phase 2 compare-and-set on PostgreSQL advisory locks)
- A2A skill negotiation and inter-org federation (delegation across organisations)
- Community template marketplace
- Kubernetes sandbox backend
- Shift system for agents
- Training mode (learn from senior agents)
See Open Questions for unresolved design decisions.