CLAUDE.md Reference: Infrequently Needed Sections
Moved from root CLAUDE.md to reduce per-message token cost. Read on demand.
Documentation
- Docs: docs/ (Markdown, built with Zensical, config: mkdocs.yml). Also embedded in the web Docker image at /docs/ via a docs-builder stage
- Design spec: docs/design/ (12 pages), Architecture: docs/architecture/, Roadmap: docs/roadmap/
- Security: docs/security.md, Licensing: docs/licensing.md, Reference: docs/reference/
- REST API reference: docs/openapi/index.md (landing page) + docs/openapi/reference.html (Scalar viewer) + docs/openapi/openapi.json (schema). The viewer and schema are generated by scripts/export_openapi.py and written as static siblings of the landing page so zensical copies them through on build. Sitemap policy: scripts/patch_sitemap.py includes reference.html (a real landing page that should be discoverable) but excludes openapi.json (Google never renders raw JSON in search results, and including it created permanent "Discovered, currently not indexed" noise in Search Console).
- Comparison data: data/competitors.yaml, the shared YAML source for docs/reference/comparison.md (generated by scripts/generate_comparison.py) and site/src/pages/compare.astro
- Library reference: docs/api/ (auto-generated via mkdocstrings + Griffe, AST-based)
- Scripts: scripts/: CI/build utilities and development-time validation hooks (relaxed ruff rules: print and deferred imports allowed). Validation hooks include: check_push_rebased.sh (blocks push if behind main), check_bash_no_write.sh (blocks file writes via Bash), check_git_c_cwd.sh (blocks unnecessary git -C), check_web_design_system.py (validates design tokens on web file edits). CI scripts include: evaluate-scan.sh (DRY Trivy JSON result evaluation), cis-scan.sh (CIS Docker Benchmark wrapper), report-image-size.sh (image size reporting to step summary)
- Landing page: site/ (Astro + React islands via @astrojs/react). Includes /get/ CLI install page, /compare/ framework comparison, contact form, interactive dashboard preview, SEO
- Deps: docs group in pyproject.toml (zensical, mkdocstrings[python], griffe-pydantic)
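The sitemap policy described in the REST API reference entry can be sketched as a small filter. This is only an illustration, not the contents of scripts/patch_sitemap.py: the function name `filter_sitemap`, the suffix-based exclusion rule, and the example URLs are all assumptions.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
# Assumption: exclusion is decided by URL suffix; the real script may differ.
EXCLUDED_SUFFIXES = ("/openapi/openapi.json",)


def filter_sitemap(xml_text: str) -> str:
    """Return sitemap XML with excluded <url> entries removed."""
    # Serialize with the sitemap namespace as the default (no prefix).
    ET.register_namespace("", SITEMAP_NS)
    root = ET.fromstring(xml_text)
    for url in list(root.findall(f"{{{SITEMAP_NS}}}url")):
        loc = url.findtext(f"{{{SITEMAP_NS}}}loc", default="").strip()
        if loc.endswith(EXCLUDED_SUFFIXES):
            root.remove(url)
    return ET.tostring(root, encoding="unicode")


# Hypothetical input: one viewer page (kept) and one raw schema (dropped).
EXAMPLE = f"""<urlset xmlns="{SITEMAP_NS}">
  <url><loc>https://example.invalid/docs/openapi/reference.html</loc></url>
  <url><loc>https://example.invalid/docs/openapi/openapi.json</loc></url>
</urlset>"""

patched = filter_sitemap(EXAMPLE)
```

After the call, `patched` still lists reference.html but no longer lists openapi.json, which mirrors the include/exclude rule stated above.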
Docker
# Build and run (from repo root)
cp docker/.env.example docker/.env # configure env vars
docker compose -f docker/compose.yml build
docker compose -f docker/compose.yml up -d
docker compose -f docker/compose.yml down
# Verify
curl http://localhost:3001/api/v1/readyz # backend (direct)
curl http://localhost:3000/api/v1/readyz # backend (via web proxy)
- Images: backend (Wolfi apko-composed distroless, non-root), web (Caddy pure-apko, SPA + API proxy + embedded docs), sandbox (Python + Node.js Wolfi, non-root)
- Config: all Docker files in docker/: Dockerfiles, compose, .env.example. Single root .dockerignore (all images build with context: .)
- Verification: CLI verifies cosign signatures + SLSA provenance at pull time; bypass with --skip-verify
- Tags: version from pyproject.toml, semver, SHA, plus dev tags (v0.8.4-dev.3, dev rolling) for dev channel builds
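The two curl checks in the Verify step can also be scripted. The sketch below uses the same URLs as those commands; the helper names `is_ready` and `check_stack` are illustrative, and treating HTTP 200 as "ready" is an assumption about the readyz endpoint rather than something this document states.

```python
import urllib.request

# URLs taken from the curl commands above; the labels match their comments.
READYZ_URLS = {
    "backend (direct)": "http://localhost:3001/api/v1/readyz",
    "backend (via web proxy)": "http://localhost:3000/api/v1/readyz",
}


def is_ready(url: str, timeout: float = 2.0) -> bool:
    """Return True if the endpoint answers with HTTP 200 (assumed to mean ready)."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status == 200
    except OSError:
        # urllib.error.URLError and HTTPError are OSError subclasses, so this
        # covers refused connections, timeouts, and non-2xx responses.
        return False


def check_stack() -> dict[str, bool]:
    """Probe both readiness URLs and report the result per label."""
    return {name: is_ready(url) for name, url in READYZ_URLS.items()}
```

Calling `check_stack()` after `docker compose -f docker/compose.yml up -d` returns one boolean per endpoint, which makes it easy to tell a backend failure (both false) from a proxy failure (only the port 3000 entry false).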
Package Structure
src/synthorg/
api/ # Litestar REST + WebSocket API, RFC 9457 errors, setup wizard, personality presets, auth/ (role-based access control, HttpOnly cookie sessions, CSRF double-submit, lockout / refresh-token / session repositories under persistence/{sqlite,postgres}/, concurrent session enforcement, user presence, OrgRole enum for org config permissions), guards (HumanRole-based + OrgRole-based with department scoping via require_org_mutation), user management (CRUD + org-role grant/revoke), dto_org (request DTOs for company/department/agent mutations), dto_workflow (request/response DTOs for workflow definition and execution operations), services/org_mutations (read-modify-write config mutation service), auto-wiring, lifecycle (auto-promote first owner), bootstrap (agent registry init from config), template packs (list + live-apply), memory admin (fine-tuning pipeline with orchestrator, checkpoint management, preflight checks, run history, embedder queries), optimistic concurrency (ETag/If-Match), TLS config, tiered rate limiting (unauth by IP, auth by user ID), rate_limits/policies (RATE_LIMIT_POLICIES canonical registry of (max_requests, window_seconds) defaults per operation id + per_op_rate_limit_from_policy helper; operator overrides flow through PerOpRateLimitConfig.overrides separately), workflows (visual workflow definition CRUD, validation, YAML export, blueprint listing, blueprint instantiation, version history, diff, rollback), workflow executions (activate, list, get, cancel), ceremony policy (project + per-department query/override, resolved policy with field origins), quality overrides (per-agent quality score override CRUD), reports (on-demand report generation, period listing), notification_dispatcher (fan-out notification sink), training (training plan CRUD, execution, preview, overrides)
backup/ # Backup/restore orchestrator, scheduler, retention, handlers/
budget/ # Cost tracking, budget enforcement, quota degradation (including synchronous peek for routing-time selector hints), CFO optimization, trend analysis, budget forecasting, configurable currency formatting, risk budget (cumulative risk-unit tracking, risk scoring integration, risk check, risk records), automated reporting (periodic comprehensive reports, spending/performance/task-completion/risk-trends templates, report scheduling config), coordination metrics (9 empirical metrics: efficiency, overhead, error amplification, message density, redundancy, Amdahl ceiling, straggler gap, token/speedup ratio, message overhead), project cost aggregates (durable per-project lifetime cost totals surviving retention pruning)
cli/ # Python CLI module (superseded by top-level cli/ Go binary)
client/ # Client simulation: ai_client, human_client, hybrid_client, pool, adapters, runner, continuous, store, simulation_state, config, models, protocols, feedback/ (binary, scored, criteria_check, adversarial), generators/ (template, llm, dataset, procedural, hybrid), report/ (detailed, summary, metrics_only, json_export)
a2a/ # Optional A2A external gateway (JSON-RPC 2.0 federation), agent_card (safe-subset projection), client (outbound federation), gateway (inbound JSON-RPC dispatcher), models (A2A protocol types), task_mapper + message_mapper (bidirectional mapping), config, security (peer validation, payload limits), well_known (Agent Card discovery), peer_registry, push_verifier (HMAC-SHA256), connection_types/ (a2a_peer registration)
communication/ # Message bus, dispatcher, channels, delegation, conflict resolution, meeting/, event_stream/ (AG-UI SSE hub, event projector, interrupt/resume protocol, evidence package re-exports)
config/ # YAML company config loading and validation
core/ # Shared domain models, base classes, resilience config, immutable (deep_copy_mapping, freeze_recursive for frozen Pydantic field protection), tool_disclosure (ToolL1Metadata, ToolL2Body, ToolL3Resource), tool_constraints (ToolSubConstraints, the five dimension enums, get_sub_constraints; placed in core so core.agent need not import the tools hub)
execution/ # Light leaf for execution-trace types, engine-free so non-engine consumers (e.g. budget.coordination_collector) can import them cold: turn (TurnRecord, NodeType, BehaviorTag), efficiency (EfficiencyRatios, IdealTrajectoryBaseline, compute_efficiency_ratios), view (ExecutionResultView runtime-checkable protocol), parked_context (ParkedContext, the serialised parked-agent snapshot the persistence/worker/API layers name without pulling engine). See ADR-0012.
engine/ # Orchestration, execution loops, task engine (observer registration, background observer dispatch), coordination, checkpoint recovery, structured failure diagnosis (FailureCategory, infer_failure_category, RecoveryResult failure_context/criteria_failed/stagnation_evidence), approval/review gates (no-self-review enforcement via SelfReviewError, immutable DecisionRecord drop-box), stagnation detection, context budget, compaction, hybrid loop, prompt profiles (tier-based prompt adaptation, personality trimming via max_personality_tokens), procedural memory integration (failure-driven), post_execution/ (extracted memory hooks -- distillation capture, procedural memory pipeline, evolution trigger), evolution/ (pluggable trigger/proposer/guard/adapter pipeline, EvolutionService orchestrator, EvolutionConfig with safe defaults, triggers/ (batched/inflection/per-task/composite), proposers/ (separate-analyzer/self-report/composite), adapters/ (identity/strategy-selection/prompt-template), guards/ (rate-limit/review-gate/rollback/shadow-evaluation (with shadow_protocol.py protocols + shadow_providers.py Configured/RecentHistory strategies)/approve-all (no-op fallback when every guard is disabled)/composite)), identity/ (diff utilities, store/ (IdentityVersionStore protocol, append-only + copy-on-write implementations with rollback)), workspace/ (git worktree isolation, merge orchestration, semantic conflict detection), quality/ (step-level quality signal classifier, accuracy-effort ratio, StepQualityClassifier protocol), health/ (two-layer health monitoring pipeline, HealthJudge + TriageFilter, EscalationTicket, NotificationSink wiring), trajectory/ (best-of-K trajectory scoring, TrajectoryScorer, budget guard, TrajectoryConfig), intake/ (IntakeEngine lifecycle walker, strategies/ (DirectIntake pass-through, AgentIntake LLM-driven triage)), review/ (ReviewPipeline chain walker, stages/ (InternalReviewStage, ClientReviewStage)), workflow/ (Kanban board, Agile 
sprints, WIP limits, sprint lifecycle, velocity tracking, ceremony scheduling, strategy migration, strategies/ (pluggable scheduling strategies), velocity_calculators/ (pluggable velocity calculators), definition (visual workflow graph model, node/edge types, validation, YAML export), blueprint_loader (starter blueprint loading), blueprint_models (blueprint data models), blueprints/ (5 YAML starter templates), diff (version diff computation), version (version snapshot model), execution (workflow activation service, execution models, condition evaluator (compound AND/OR/NOT), graph utilities, execution_observer (TaskEngine bridge for lifecycle transitions), execution_activation_helpers (graph walking, conditional processing, task config parsing), execution_lifecycle (execution transitions, status management, task-event handling), subworkflow_registry (subworkflow publishing, version resolution, parent references))), strategy/ (trendslop mitigation: strategic lenses, constitutional principles, confidence calibration, cost tier resolution, lens_assignment (LensAssigner protocol, DiversityMaximizingAssigner round-robin), consensus (ConsensusVelocityDetector, ConsensusAction), premortem (PremortemExecutor protocol, DefaultPremortemExecutor, FailureMode, PremortemOutput))
hr/ # Hiring, firing, onboarding, agent registry (evolve_identity for evolution-approved changes), performance tracking (InflectionSink protocol, PerformanceInflection events for trend direction changes), activity timeline, activity event types, cost event redaction, career history, promotion/demotion, evaluation/ (five-pillar evaluation framework, pluggable pillar scoring strategies, EvaluationConfig), quality scoring (layered composite: CI signal + LLM judge + human override, QualityOverrideStore), scaling/ (dynamic company scaling: ScalingService orchestrator with runtime strategy enable/disable and priority reordering, domain models (ScalingSignal/ScalingContext/ScalingDecision/ScalingActionRecord), enums (ScalingActionType/ScalingOutcome/ScalingStrategyName), error types, ScalingContextBuilder (signal aggregation with graceful degradation), pluggable ScalingStrategy/ScalingSignalSource/ScalingTrigger/ScalingGuard protocols, strategies/ (WorkloadAutoScale, BudgetCap, SkillGap, PerformancePruning with evolution deferral), signals/ (workload, budget, skill, performance read-only adapters), triggers/ (BatchedScalingTrigger with overlap protection, SignalThresholdTrigger with crossing detection, CompositeScalingTrigger), guards/ (ConflictResolver with MappingProxyType-wrapped priority, CooldownGuard, RateLimitGuard with batch-aware enforcement, ApprovalGateGuard, CompositeScalingGuard with public get_guards()), config (per-strategy + trigger + guard + default_hire_level), factory), training/ (pluggable training pipeline: TrainingService orchestrator, TrainingPlan/TrainingResult models, factory, config, selectors/ (role_top_performers, department_diversity, user_curated, composite), extractors/ (procedural, semantic, tool_patterns), curateurs/ (relevance, llm_curated), guards/ (sanitization, volume_cap, review_gate), onboarding_integration)
notifications/ # NotificationSink protocol, NotificationDispatcher fan-out, Notification model (category taxonomy: approval/budget/security/system/agent/health + severity taxonomy), adapters/ (console, ntfy, slack, email), config
ontology/ # Semantic ontology subsystem: @ontology_entity decorator, OntologyBackend protocol, SQLiteOntologyBackend, OntologyService (bootstrap + CRUD), OntologyConfig (6 sub-configs), EntityDefinition/EntityField/EntityRelation models, versioning integration, drift detection types, error hierarchy, observability events
memory/ # Pluggable MemoryBackend, retrieval pipeline (hybrid dense+BM25 sparse with RRF fusion, MMR diversity re-ranking via apply_diversity_penalty with pre-computed bigram cache), tool-based injection strategy with iterative Search-and-Ask reformulation loop (fail-safe reformulator/sufficiency_checker), ToolRegistry memory tool wrappers (SearchMemoryTool, RecallMemoryTool), fail-closed memory filter, agentic query reformulation, org memory, backends/ (composite namespace-based routing, inmemory session-scoped, mem0 Qdrant+SQLite, EmbeddingCostConfig embedding cost tracking), consolidation/ (SimpleConsolidationStrategy, DualModeConsolidationStrategy density-aware, LLMConsolidationStrategy with parallel TaskGroup per-category processing + trajectory-context injection from distillation entries, LLMConsolidationConfig, DistillationRequest capture helper tagged "distillation" EPISODIC, retention, archival), embedding/ (LMEB-ranked model selection, embedder config resolution, fine-tuning pipeline with orchestrator, cancellation, checkpoint management), procedural/ (failure-driven auto-generation, proposer LLM pipeline, SKILL.md materialization, ProceduralMemoryConfig, capture/ (failure/success/hybrid capture strategies), pruning/ (TTL/Pareto/hybrid pruning strategies), propagation/ (none/role-scoped/department-scoped cross-agent propagation))
persistence/ # Pluggable PersistenceBackend, SQLite + Postgres backends, settings + user + artifact + project + preset + workflow definition + workflow execution + workflow version + agent identity versions + fine-tune + decision record (append-only audit drop-box) + risk override + SSRF violation + project cost aggregate + training plan + training result repositories, artifact content storage (pluggable ArtifactStorageBackend, filesystem impl), migrations.py + migration_helpers.py (yoyo-migrations runner coroutines and URL/discovery/result-dataclass helpers, in-process), sqlite/revisions/ + postgres/revisions/ (revision .sql files), optional TimescaleDB hypertable support for append-only time-series tables
versioning/ # Generic versioning infrastructure: VersionSnapshot[T] model, VersioningService[T] (content-addressable deduplication via SHA-256 hash, INSERT OR IGNORE concurrent-write safety), compute_content_hash
telemetry/ # Opt-in product telemetry (disabled by default): TelemetryReporter protocol, TelemetryEvent model, PrivacyScrubber (allowlist + forbidden pattern validation), TelemetryCollector (heartbeat scheduling, deployment ID persistence, environment resolution chain), host_info (Docker daemon `/info` enrichment for startup events via aiodocker), reporters/ (LogfireReporter, NoopReporter), TelemetryConfig
observability/ # Structured logging, correlation tracking, redaction, third-party logger taming, log shipping (syslog, HTTP), compressed archival, events/
providers/ # LLM provider abstraction, presets, model auto-discovery, capabilities, runtime CRUD (management/), local model management (pull/delete/config via LocalModelManager protocol), provider families, discovery SSRF allowlist, health tracking, active health probing, defaults_config (ProviderModelDefaults: last-resort metadata fallbacks when LiteLLM exposes no per-model data, e.g. fallback_max_output_tokens), routing/ (strategy-based model routing, multi-provider resolution with ModelCandidateSelector protocol, QuotaAwareSelector, CheapestSelector)
settings/ # Runtime-editable settings (DB > env > code), Fernet encryption, ConfigResolver, bootstrap_resolver (pre-init env > default), definitions/, subscribers/ (SecuritySubscriber for discovery allowlist hot-reload)
security/ # Rule engine, audit log, output scanner, progressive trust, autonomy levels, timeout policies, LLM fallback evaluator, custom policy rules, risk scoring (pluggable RiskScorer protocol, multi-dimensional RiskScore, DefaultRiskScorer), enforcement modes (active/shadow/disabled via SecurityEnforcementMode), risk override (SecOps risk tier reclassification via RiskTierOverride + SecOpsRiskClassifier), SSRF violation tracking (SsrfViolation model, pending/allowed/denied status for self-healing discovery allowlist)
templates/ # Pre-built company templates (inheritance tree), template merge engine, personality presets, preset discovery/CRUD service, model requirements, tier-to-model matching, locale-aware name generation, workflow config rendering, pack_loader (additive team packs), packs/ (built-in pack YAMLs), uses_packs composition
meta/ # Self-improvement meta-loop: signal aggregation (7 domains), rule engine (9 built-in rules + custom declarative rules via dashboard), improvement strategies (config/architecture/prompt tuning), proposal guards (scope/rollback/rate-limit/approval), rollout (before-after/canary, tiered regression detection), appliers (config/prompt/architecture/code each expose dry_run() validation via shared appliers/_validation.py helpers: parse_dotted_path, apply_diff_to_dict, validate_payload_keys, format_validation_errors), Chief of Staff role. Custom rule authoring: DeclarativeRule, CustomRuleDefinition model, METRIC_REGISTRY (25 metrics), CustomRuleRepository protocol + SQLite impl, CustomRuleController (CRUD + preview). Unified MCP API server: 240+ tools across 21 domains with capability-based scoping (registry, scoper, invoker, tool builders, domain defs, handlers). Service orchestrator, factory, config
tools/ # Tool registry, built-in tools, git SSRF prevention, MCP bridge, sandbox factory (gVisor default overrides via merge_gvisor_defaults), invocation tracking, network_validator (shared SSRF), sub_constraint_enforcer (granular enforcement of core.tool_constraints), disclosure_config (ToolDisclosureConfig), disclosure_metrics (ToolDisclosureMetrics), discovery (ListToolsTool, LoadToolTool, LoadToolResourceTool, ToolDisclosureManager, DeferredDisclosureManager), web/ (HTTP requests, HTML parsing, web search), database/ (SQL query, schema inspection), terminal/ (sandboxed shell commands), design/ (image generation via ImageProvider protocol, diagram DSL generation, asset management), communication/ (SMTP email sending, notification dispatch via NotificationDispatcherProtocol, Jinja2 template formatting), analytics/ (data aggregation via AnalyticsProvider protocol, report generation, metric collection via MetricSink protocol), sandbox/ (4-domain SandboxPolicy model (filesystem/network/process/inference), SandboxRuntimeResolver (gVisor probe + per-category runtime resolution with fallback), SandboxCredentialManager (env var credential stripping))
web/src/ # React 19 dashboard (see web/CLAUDE.md for full structure)
cli/ # Go CLI binary (see cli/CLAUDE.md for full structure)
site/ # Astro landing page (synthorg.io), React islands for interactive sections
data/ # Shared data files (competitors.yaml for comparison page)
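The versioning/ entry mentions content-addressable deduplication via a SHA-256 hash combined with INSERT OR IGNORE. A minimal sketch of that general technique follows. It is not the project's VersioningService: the table layout, the canonical-JSON hashing, and the names `sketch_content_hash` and `save_snapshot` are assumptions made for illustration.

```python
import hashlib
import json
import sqlite3


def sketch_content_hash(payload: dict) -> str:
    """SHA-256 over a canonical JSON encoding (sorted keys, compact separators)."""
    canonical = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def save_snapshot(conn: sqlite3.Connection, entity_id: str, payload: dict) -> bool:
    """Insert a snapshot; return False when identical content already exists."""
    digest = sketch_content_hash(payload)
    cursor = conn.execute(
        "INSERT OR IGNORE INTO snapshots (entity_id, content_hash, body) "
        "VALUES (?, ?, ?)",
        (entity_id, digest, json.dumps(payload, sort_keys=True)),
    )
    conn.commit()
    # rowcount is 1 for a new row and 0 when the UNIQUE constraint ignored it.
    return cursor.rowcount == 1


# Hypothetical schema: uniqueness on (entity, hash) is what makes writes idempotent.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE snapshots ("
    " entity_id TEXT NOT NULL,"
    " content_hash TEXT NOT NULL,"
    " body TEXT NOT NULL,"
    " UNIQUE (entity_id, content_hash))"
)

first = save_snapshot(conn, "agent-1", {"role": "engineer", "level": 2})
duplicate = save_snapshot(conn, "agent-1", {"level": 2, "role": "engineer"})
changed = save_snapshot(conn, "agent-1", {"role": "engineer", "level": 3})
```

Because the hash is computed over a key-sorted encoding, the second call is recognised as a duplicate even though its keys arrive in a different order, and the unique constraint lets concurrent writers race without raising.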
Releasing
- Automated by Release Please: every push to main creates/updates a release PR with changelog
- Version bumping: always-bump-patch strategy; every release bumps patch (e.g. 0.5.3 -> 0.5.4), regardless of commit type. auto-rollover.yml detects when the last stable patch meets the __synthorg_rollover_at_patch threshold in .github/release-please-config.json (default 9) and creates an empty Release-As: 0.(X+1).0 commit to preserve the 0.X.9 -> 0.(X+1).0 pattern automatically. Release-As trailer: for exception bumps (1.0 graduation, explicit version jumps), land a Release-As: X.Y.Z trailer in a commit on main. Two valid routes: (a) final paragraph of a feature-PR body that will be squash-merged (squash copies the trailer into the main-branch commit message, where auto-rollover.yml and Release Please both pick it up); (b) trigger Actions -> Graduate -> Run workflow with target_version + reason. The Graduate workflow mints a synthorg-repo-bot App installation token and creates a signed empty commit on main via the Git Data API, landing a Release-As: X.Y.Z trailer that both RP and auto-rollover pick up. Downgrades and same-version graduations are hard-blocked by the workflow's validation step; fix forward with a higher target instead. The prior "add Release-As: to the RP release PR body" route is deliberately unsupported: that edit never becomes a commit on main until the RP PR merges, so auto-rollover.yml can race ahead and push a conflicting trailer before RP reacts.
- Signed commits: every CI-generated commit on main is produced via the GitHub API under the synthorg-repo-bot App installation token, verifying as {verified: true, reason: "valid"}. main enforces required_signatures via the protect-main ruleset, so an unsigned commit would be rejected outright. One deliberate exception: the BSL Change Date update on the Release Please PR branch (release.yml "Update BSL Change Date" step) commits via GITHUB_TOKEN rather than the App token. The commit lands on the RP PR branch (not main), so the recursion-suppression penalty of GITHUB_TOKEN does not apply, and GitHub's ambient token still produces a signed commit attributed to github-actions[bot] which satisfies branch protection via the eventual squash-merge.
- Release flow: merge release PR -> draft Release + tag -> Docker + CLI workflows build, smoke-test the artifacts at build time (smoke-test-backend-image against the just-built image; smoke-test-cli-binary against the just-built binary), and attach assets to the draft -> finalize-release.yml posts a finalize-release commit status, assembles the Verification section, and publishes the draft. On stable releases, superseded dev pre-releases + tags (those whose base version is at or below the published stable) are then deleted; dev builds targeting a higher, not-yet-released version are preserved. Smoke tests run at build time (not at finalise) so a broken artifact fails the originating PR with a red ❌ on the commit row, not the finalise step after a tag has already been cut.
- Dev channel: every push to main (except Release Please bumps) creates a dev pre-release (e.g. v0.8.4-dev.3) via dev-release.yml. Users opt in with synthorg config set channel dev. Dev releases flow through the same Docker + CLI pipelines as stable releases. When a stable release is published, dev releases and tags whose base version is at or below it are deleted; dev builds targeting a higher, not-yet-released version are preserved (a main push can mint the next version's dev.1 while the previous stable is still finalising). If a dev release is swept while its docker.yml run is still in flight, that run's update-release step skips gracefully (warns, exits 0) rather than failing.
- Nightly verification: deliberately none. The build-time pipeline (docker.yml + cli.yml + finalize-release.yml) is the source of truth for release-body structure, asset signing, and SBOM attachment. App-token signing is a property of the GitHub API auth path (POST /git/commits under an installation token returns a GitHub-signed commit unconditionally), not of any code we own; a misconfigured secret or revoked installation would also fail the next real release, so a nightly canary mostly catches its own implementation drift. Earlier release-pipeline-health.yml and test-signing.yml workflows were removed for that reason.
- Pre-1.0 -> post-1.0 transition: when v1.0.0 ships, always-bump-patch stays in place (the SynthOrg release cadence favours conservative patch bumps). What flips is bump-minor-pre-major: true in the RP config; after 1.0 this flag is dropped so BREAKING CHANGE: footers start producing major bumps again (1.x.y -> 2.0.0). Release-As: trailers keep working unchanged. auto-rollover.yml also keeps working unchanged; patch-rollover is version-independent, and rollover at 1.x.9 -> 1.(x+1).0 continues to use the same mechanism. A follow-up PR will flip the config flag when v1.0.0 lands.
- Config: .github/release-please-config.json, .github/.release-please-manifest.json (do not edit manually)
- Changelog: .github/CHANGELOG.md (auto-generated, do not edit)
- Version locations: pyproject.toml ([tool.commitizen].version), src/synthorg/__init__.py (__version__)
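The patch-bump, rollover, and dev-sweep rules in this section reduce to a few version comparisons, sketched below. This is a reading of the rules as written, not the workflows' actual code: the function names are hypothetical, and interpreting "meets the threshold" as `patch >= rollover_at` is an assumption.

```python
import re

ROLLOVER_AT_PATCH = 9  # default threshold named in the text above

Version = tuple[int, int, int]


def parse_version(tag: str) -> Version:
    """Parse '0.8.4', 'v0.8.4', or 'v0.8.4-dev.3' into (major, minor, patch)."""
    match = re.match(r"^v?(\d+)\.(\d+)\.(\d+)(?:-dev\.\d+)?$", tag)
    if match is None:
        raise ValueError(f"unrecognised version tag: {tag!r}")
    major, minor, patch = (int(part) for part in match.groups())
    return (major, minor, patch)


def next_stable(last_stable: str, rollover_at: int = ROLLOVER_AT_PATCH) -> str:
    """Next release under always-bump-patch, with rollover to the next minor."""
    major, minor, patch = parse_version(last_stable)
    # Assumption: "meets the threshold" is read as patch >= rollover_at.
    if patch >= rollover_at:
        return f"{major}.{minor + 1}.0"
    return f"{major}.{minor}.{patch + 1}"


def is_superseded_dev(dev_tag: str, published_stable: str) -> bool:
    """A dev pre-release is swept when its base version is at or below the stable."""
    return parse_version(dev_tag) <= parse_version(published_stable)
```

For example, `next_stable("0.5.3")` gives `"0.5.4"` and `next_stable("v0.8.9")` gives `"0.9.0"`. With stable v0.8.4 published, `v0.8.4-dev.3` is superseded while `v0.8.5-dev.1` is preserved.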
CI
- Path filtering: dorny/paths-filter; jobs only run when their domain is affected. CLI has its own workflow (cli.yml).
- Jobs: lint (ruff) + type-check (mypy) + test-unit (matrix sharded via pytest-split, balanced from .test_durations.unit; shard count in .github/workflows/ci.yml matrix.shard) + test-integration (matrix sharded via pytest-split, balanced from .test_durations.integration, backed by services: postgres instead of testcontainers; conftest detects SYNTHORG_TEST_POSTGRES_HOST/PORT/USER/PASSWORD/DB and yields a connection-info proxy directly) + test-e2e (single shard, same service container) + test-conformance-sqlite (SQLite-only -k "not postgres" slice of the conformance suite). All four arms set COVERAGE_CORE=sysmon for the lower-overhead coverage.py tracing backend (line + branch parity since coverage 7.7). Each shard collects coverage; test-coverage-aggregate combines them, asserts every shard contributed, and enforces the coverage gate via coverage report --fail-under=$(...) driven by [tool.coverage.report] fail_under in pyproject.toml before a single best-effort Codecov upload. Plus python-audit (pip-audit), dockerfile-lint (hadolint), dashboard (lint/type-check/test under the active-handle gate/build/storybook-build/audit), export-openapi (runs scripts/export_openapi.py once and shares the artifact with the dashboard arm), and .github/actions/install-postgres-18-client (shared composite for PGDG postgresql-client-18 install with SHA-256-pinned signing key). All run in parallel -> ci-pass gate.
- Pages: pages.yml: version extraction from pyproject.toml, OpenAPI export, comparison page generation, Astro + Zensical docs build, GitHub Pages deploy on push to main
- PR Preview: pages-preview.yml: Cloudflare Pages deploy per PR (pr-<number>.synthorg-pr-preview.pages.dev), cleanup on PR close
- Docker: docker.yml: build + Trivy scan + CIS benchmark run on every PR; push to GHCR + cosign sign + SLSA L3 provenance gated by the image-push deployment environment (branch policy main, v*). Build and publish are split into separate jobs per image (build-X + build-X-publish); only the publish half carries packages: write / id-token: write / attestations: write. Shared logic lives in composite actions (build-scan-image, publish-image). CVE triage: .github/.trivyignore.yaml
- CLI: cli.yml: Go lint/test/build (cross-compile) + govulncheck + fuzz. GoReleaser release on v* tags with cosign signing + SLSA provenance, gated by the release-tags deployment environment (v*-only, no privileged secrets; keeps RELEASE_PLEASE_TOKEN out of the tag path). The release job's gh release upload/download/edit + body-read calls go through .github/scripts/gh_with_retry.sh and the checksums.txt keyless signing through .github/scripts/cosign_sign_with_retry.sh sign-blob; the four attest-build-provenance steps and the SBOM install steps ride bounded continue-on-error retry ladders so a transient Rekor/Sigstore timeout does not fail a release
- Renovate: daily dependency updates via Mend GitHub App. 3 domain groups (Python, Web, Infrastructure), no auto-merge. The Infrastructure group spans Go modules, Dockerfile + docker-compose images, GitHub Actions SHAs, and every custom-regex pin (binary-tool versions like Trivy / Gitleaks / D2 / apko, container-image regexes for state.go / compose.yml / busybox / testcontainers, action version: inputs like golangci-lint / GoReleaser, go install URLs like govulncheck). Config: renovate.json. Use /review-dep-pr before merging
- Security scanning: gitleaks (push/PR + weekly), zizmor (workflow analysis), OSSF Scorecard (weekly), Socket.dev (PR supply chain), ZAP DAST (weekly + manual, rules: .github/zap-rules.tsv)
- Coverage: Codecov (best-effort, CI not gated on availability)
- Dependency review:
dependency-review.yml: license allow-list (permissive + weak-copyleft), per-package GPL exemptions for dev-only tool deps (golangci-lint), PR comment summaries - CLA:
cla.yml: two jobs splitting read and write.cla-check(pull_request_target) runs self-contained bash +gh apiagainst.github/cla-signatures.jsonon thecla-signaturesbranch, with agh_api_retryhelper that does bounded exponential-with-cap retry on transient EPIPE / 5xx (8 attempts, ~10-min budget under a 12-min job timeout) and fails fast on definitive 4xx. It uses the<!-- synthorg-cla-check -->marker for idempotent PR comment updates (PATCH if the marker comment exists, POST on first transition).cla-sign(issue_commentmatching the sign-text body) records the signature via the Git Data API under thesynthorg-repo-botApp token. Bot allowlist (dependabot[bot],renovate[bot],synthorg-repo-bot[bot],github-actions[bot]) skips the CLA on both jobs. - Release:
release.yml: Release Please creates draft release PR. Mints asynthorg-repo-botApp installation token via therelease-runner-setupcomposite action (secrets documented in docs/reference/github-environments.md). Gated by thereleasedeployment environment. Includes a Highlights step that calls GitHub Models (openai/gpt-4.1-miniviaactions/ai-inference, Copilot Pro quota, no new secret) to prepend a three-section summary to the release PR body, wrapped in<!-- HIGHLIGHTS_START -->...<!-- HIGHLIGHTS_END -->markers. Total bullet count is dynamic (1-15) scaled to the changelog volume and distributed across three fixed headers: What you'll notice (user-facing fixes + UX / behaviour changes), What's new (newly-introduced capabilities and extensions), Under the hood (maintenance, deps, refactors, included only when notable). Empty sections are omitted. Opt out per-release by adding aNo-Highlights:trailer (case-insensitive, anywhere on its own line) to the Release Please PR body before the workflow runs.finalize-release.ymlthen promotes the same marker block from the merged release-please PR body into the published release body (release-please builds release notes fromCHANGELOG.mdonly, so without this promotion the Highlights block would stay stranded on the PR; see "Finalise Release" below). The CLI consumes the same Highlights block duringsynthorg updateon stable channels: it walks every release in(installed, target]oldest-to-newest in batches of 3 and renders the styled summary by default, withctoggling between the AI summary and the Release Please commit-based changelog. Releases without a Highlights block (pre-rollout orNo-Highlights:opt-out) fall back to the commit view automatically. Dev pre-releases have no Highlights block by design, so the CLI walk renders a single combined commit list via the GitHub compare API instead. 
Walk is gated to interactive TTY runs;--quiet/--json/--yes/ non-TTY contexts skip the walk and print the terse "Update available" notice + release-notes URL. The LICENSE / PR-body / head-SHA reads and the four required-status POSTs are wrapped by.github/scripts/gh_with_retry.sh(retry transient 401/5xx, fast-fail definitive 4xx);timeout-minutes: 15bounds the stacked retry ladders so a black-holed connection cannot hold therelease-pleaseconcurrency group. - Auto Rollover:
auto-rollover.yml: detects when the last stable tag's patch meets the __synthorg_rollover_at_patch threshold in .github/release-please-config.json (default 9), creates an empty commit on a versioned rollover branch (chore/auto-rollover-v<next>), and opens a PR whose body carries the Release-As: 0.(minor+1).0 trailer so the squash-merge lands it on main and Release Please targets the minor bump. Four skip guards: (1) Release Please release commits and its own prior rollover commits (matched on subject prefix); (2) a history-independent check (gh pr list) that the rollover PR for this exact version branch is already merged or open (skips MERGED / OPEN, but not CLOSED-without-merge, which never took effect); (3) any Release-As: trailer already in the last-stable..HEAD range, evaluated fail-closed so a range that cannot be computed (incomplete fetch) skips the run rather than rolling over; (4) any open Release Please release PR whose body already queues a Release-As: trailer. Gated by the release deployment environment. The empty commit and the rollover branch ref are created via the Git Data API (POST /git/commits + POST /git/refs, force-PATCH if the branch ref already exists) under the App installation token, so the squash-merge onto main ships a verified signature (required by main's signed-commits rule) and triggers downstream Release + Dev Release workflows. The dedup-read gh pr list guards are wrapped by the shared .github/scripts/gh_with_retry.sh helper (bounded exponential retry on transient 401/5xx, fast-fail on definitive 4xx, exit 75 on exhaustion, which here means fail-closed skip); timeout-minutes: 8 accommodates the helper's ~1m45s worst-case ladder. The Git Data API writes stay un-retried so a real write failure pages.
- Graduate:
graduate.yml: workflow_dispatch one-click Release-As: trailer for target versions that skip the normal patch cadence (1.0 graduation, explicit minor jumps). Inputs: target_version + reason. Validates that the target is strictly above the last stable (hard-blocks downgrades). Creates a signed empty commit on main with the trailer via the Git Data API under the App installation token. Gated by the release deployment environment. The parent-tree and verification reads go through .github/scripts/gh_with_retry.sh; the commit POST + ref PATCH stay un-retried so a write failure on this manual, human-watched graduation pages.
- Dev Release:
dev-release.yml: creates semver dev tags (e.g. v0.8.4-dev.3) and draft pre-releases on every push to main (skips Release Please version-bump commits). Tags trigger the existing Docker + CLI workflows for the full build/scan/sign pipeline. Gated by the release deployment environment. Uses the release-runner-setup composite for token mint. Pre-release body is built locally via git log -1 on the head SHA and gh release create --notes-file: title $DEV_TAG (e.g. v0.8.4-dev.5), then a Dev build #N toward vX.Y.Z line, **Commit:** <short SHA>, **Subject:** <commit subject>, the **Full pipeline:** disclaimer, and the channel opt-in tip. Only the short SHA and the commit subject are written into the notes file -- the full commit body (squash-merge PR descriptions of hundreds of lines, nested markdown, tables) is deliberately omitted because it renders poorly on the release page and buries what changed. Variables go through printf '%s' placeholders (the --notes-file route avoids command substitution that bare --notes "..." would suffer if a commit subject contained backticks or $(...)). Failure path: if gh release create returns non-zero (transient API error, 5xx, rate limit), the workflow exits 1 with the orphan tag preserved -- deleting the tag would race the downstream tags: v*-listening workflows that the tag-create push already triggered (cli.yml, docker.yml), 404'ing their actions/checkout step. The orphan tag is later garbage-collected by the same workflow's incremental sweep (keeps 5 most recent dev pre-releases) and by finalize-release.yml's stable-release sweep. The end-of-job regression guard "Verify minted tag survived the run" always re-resolves refs/tags/$DEV_TAG (via if: always() so the guard runs on failure paths where tag loss is most likely) and exits 1 if absent, routing through the existing report-failure job into the dev-release regression tracking issue.
The workflow-tag-lifecycle pre-push gate (scripts/check_workflow_tag_lifecycle.py) statically prevents any future workflow from re-introducing the create-then-conditionally-delete shape. The end-of-run tag-survival check reads through .github/scripts/gh_with_retry.sh so a transient 401 cannot fire a false "tag deleted" alarm (a real 404 still fast-fails and trips the guard).
- Finalise Release:
finalize-release.yml: assembles the release body and publishes the draft once both Docker + CLI workflows succeed for the tag. Body assembly: prepends the AI Highlights block (stable releases only) extracted from the merged release-please PR body via the head_sha → pulls association, then re-applies the Verification section from the per-image marker comments (<!-- CLI_VERIFICATION_DATA -->, <!-- CONTAINER_VERIFICATION_DATA -->, etc.). The strip step that prevents finalise re-runs from doubling sections gates EVERY marker-pair deletion on both START and END being present in the body; sed '/START/,/END/d' is greedy to EOF without an END, which would tank the entire CHANGELOG-derived body if a contributor's commit subject (now propagated verbatim into dev release bodies via dev-release.yml) happened to contain a literal opening marker. The gate applies to HIGHLIGHTS and to all five CLI_*/CONTAINER_* verification-data marker pairs. The FINALIZE_VERIFICATION marker is intentionally greedy-to-EOF: everything after it IS the verification section, rebuilt fresh on each finalise run. Posts a finalize-release commit status (pending at start, success/failure at finish) so workflow_run-triggered failures surface as a red ❌ on the commit row instead of disappearing into the Actions tab. Gated by the release deployment environment. Immutable releases enabled. Handles both stable and dev releases.
Stable-release dev-cleanup deletes every dev release + every orphan dev tag matching vX.Y.Z-dev.N whose base version is at or below the published stable (future-version dev builds are skipped via a sort -V semver compare, so a next-version dev.1 minted during the previous stable's finalise window is not swept out from under its in-flight docker.yml run) -- the inner gh api calls are explicitly capture-and-checked (NOT mapfile < <(...), which silently treats inner-process failures as empty input) and per-tag gh release delete / gh api -X DELETE failures accumulate into a final exit-on-failure check so partial-cleanup is loudly diagnosed. The Highlights propagation path that fetches the release-please PR body splits the gh pr view call into capture + classify so an auth / rate-limit failure surfaces a ::warning:: distinct from "PR was deleted" (legitimate skip with ::notice::). Artifact smoke testing happens at BUILD time in cli.yml and docker.yml via the smoke-test-cli-binary and smoke-test-backend-image composite actions; the finalise step does not re-test (Docker images are content-addressed and CLI archives are SHA-256-verified by the cosign-signed checksums.txt).
- CI failure-surfacing policy: every CI workflow must surface its outcome somewhere visible. Non-schedule failure paths (push / pull_request / workflow_run / release / dispatch) post a commit status or PR check; schedule failure paths open or update a tracking GitHub Issue labelled
automation:ci-health. Schedule-triggered workflows have no commit context to attach to, hence the issue lane; manual workflow_dispatch runs surface failures in the run UI directly so they do not open issues. The shared composite is .github/actions/post-tracking-issue; it dedupes by title across all states (open + closed), so a regression that reappears reopens the same tracker rather than creating a duplicate; consumers that auto-close on success (e.g. ci-preflight.yml) should also unpin in the close path so a closed-and-resolved issue does not stay in the pinned row. Workflows currently using this pattern: apko-lock.yml, ci-preflight.yml, dast.yml, python-audit.yml, evals.yml, scorecard.yml, secret-scan.yml. Pinned tracking-issue label: automation:ci-health. Success events (stable release published, dev pre-release cut, auto-rollover success) deliberately do NOT generate notifications; the GitHub Releases tab and commit row already surface those, and posting them would just spam the tracker.
- SBOM Diff:
sbom-diff.yml: inform-only sticky PR comment on Release Please release PRs. Added / removed components + license category counts from the head backend SBOM vs last stable. dependency-review.yml remains the license gate; this comment is advisory.
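The marker-pair gate described under Finalise Release can be sketched in a few lines. This is a minimal illustration, not the workflow's actual shell code: the helper name strip_marker_block is hypothetical, and it works on characters rather than on whole lines as sed does. Only the HIGHLIGHTS marker strings come from the text above.

```python
def strip_marker_block(body: str, start: str, end: str) -> str:
    """Delete every start..end block, but only when both markers are present.

    An opening marker with no closing marker after it is left untouched,
    so a stray literal marker can never delete the rest of the body.
    """
    while True:
        start_idx = body.find(start)
        if start_idx == -1:
            return body  # no opening marker: nothing to strip
        end_idx = body.find(end, start_idx + len(start))
        if end_idx == -1:
            return body  # unterminated block: refuse to delete to EOF
        body = body[:start_idx] + body[end_idx + len(end):]


START = "<!-- HIGHLIGHTS_START -->"
END = "<!-- HIGHLIGHTS_END -->"

# A well-formed block is removed; a stray opening marker is left alone.
complete = f"{START}\nsummary\n{END}\nchangelog\n"
stray = f"fix: mention {START} in a subject\nchangelog\n"

print(repr(strip_marker_block(complete, START, END)))
print(strip_marker_block(stray, START, END) == stray)
```

The second call is the case the gate exists for: an ungated range delete would remove everything from the stray marker to the end of the body, while the gated version returns the body unchanged.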
Dependencies¶
- Pinned: all versions use == in pyproject.toml
- Groups: test (pytest + plugins, hypothesis), dev (includes test + ruff, mypy, pre-commit, commitizen, pip-audit)
- Required: mem0ai (Mem0 memory backend, the default backend), mmh3 (murmurhash3 for BM25 sparse vector encoding in hybrid search), cryptography (Fernet encryption for sensitive settings at rest), faker (multi-locale agent name generation for templates and setup wizard), httpx (async HTTP client for web tools)
- Install: uv sync installs everything (dev group is default)
- Web dashboard: Node.js 22+, TypeScript 6.0+, dependencies in web/package.json (React 19, react-router, shadcn/ui, Base UI, Tailwind CSS 4, Zustand, @tanstack/react-query, @xyflow/react, @dagrejs/dagre, d3-force, @dnd-kit, Recharts, Motion, cmdk-base, js-yaml, Axios, Lucide React, @fontsource-variable/geist, @fontsource-variable/geist-mono, @fontsource-variable/jetbrains-mono, @fontsource-variable/inter, @fontsource/ibm-plex-mono, @fontsource/ibm-plex-sans, CodeMirror 6, Storybook 10, MSW, msw-storybook-addon, Vitest, @vitest/coverage-v8, @testing-library/react, fast-check, ESLint, @eslint-react/eslint-plugin, eslint-plugin-security, Playwright, @lhci/cli, rollup-plugin-visualizer, cross-env)
- CLI: Go 1.26+, dependencies in cli/go.mod (Cobra, charm.land/huh/v2, charm.land/lipgloss/v2, sigstore-go, go-containerregistry, go-tuf)
- Landing page: dependencies in site/package.json (Astro 6, @astrojs/react, React 19, Tailwind CSS 4, js-yaml)
Property-based Testing (Hypothesis): Deep Dive¶
The short rule in CLAUDE.md: Python uses Hypothesis; profiles live in tests/conftest.py; CI runs deterministic 10-example sweeps; failing examples are real bugs.
Profiles¶
Configured in tests/conftest.py, selected via HYPOTHESIS_PROFILE env var:
- ci: deterministic, max_examples=10 + derandomize=True. Fixed seed per test, same inputs every run (no flakes).
- dev: 1000 examples.
- fuzz: 10,000 examples, no deadline. For dedicated fuzzing sessions.
- extreme: 500,000 examples, no deadline. Overnight deep fuzzing.
.hypothesis/ is gitignored. Failing examples persist to ~/.synthorg/hypothesis-examples/ (write-only shared DB, survives worktree deletion) via _WriteOnlyDatabase in tests/conftest.py.
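For orientation, profile registration along these lines might look like the sketch below. It is not a copy of tests/conftest.py: the fallback to ci when HYPOTHESIS_PROFILE is unset is an assumption, and the _WriteOnlyDatabase wiring is left out. The profile names and example counts come from the list above.

```python
import os

from hypothesis import settings

# Deterministic CI sweep: few examples, derandomized so every run sees the same inputs.
settings.register_profile("ci", max_examples=10, derandomize=True)

# Local quick pass.
settings.register_profile("dev", max_examples=1000)

# Long-running profiles: deadline=None disables Hypothesis' per-example deadline.
settings.register_profile("fuzz", max_examples=10_000, deadline=None)
settings.register_profile("extreme", max_examples=500_000, deadline=None)

# Assumption: fall back to "ci" when the environment variable is not set.
settings.load_profile(os.environ.get("HYPOTHESIS_PROFILE", "ci"))
```

Because profiles are loaded once at import time, switching profiles only requires changing the environment variable for the pytest invocation.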
Running locally¶
- Quick (1000 examples): HYPOTHESIS_PROFILE=dev uv run python -m pytest tests/ -m unit -n 8 -k properties
- Deep (10,000 examples, all @given tests): HYPOTHESIS_PROFILE=fuzz uv run python -m pytest tests/ -m unit -n 8 --timeout=0

--timeout=0 disables the 30s per-test limit that would kill long-running property tests. -k properties is intentionally omitted to cover all 46 files with @given, not just the 12 *_properties.py files.
When Hypothesis finds a failure¶
It is a real bug. The shrunk example is saved to ~/.synthorg/hypothesis-examples/ for analysis but is not replayed automatically (that would block all test runs).
Do NOT just rerun and move on. Read the failing example from the output, fix the underlying bug, and add an explicit @example(...) decorator to the test so the case is permanently covered in CI.
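As an illustration of pinning a shrunk failure, the sketch below adds an explicit @example next to @given. The function under test and the pinned input are hypothetical stand-ins; in practice the value is copied from the failing example that Hypothesis printed.

```python
from hypothesis import example, given, strategies as st


def normalize_whitespace(text: str) -> str:
    """Hypothetical function under test: collapse whitespace runs to single spaces."""
    return " ".join(text.split())


@given(st.text())
@example("a\u00a0 b")  # hypothetical shrunk failure, pinned so it runs on every profile
def test_normalize_is_idempotent(text: str) -> None:
    once = normalize_whitespace(text)
    # Normalizing an already-normalized string must change nothing.
    assert normalize_whitespace(once) == once
```

Explicit examples run in addition to the generated ones, so the pinned case is exercised even under the 10-example ci profile.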
Cross-language equivalents¶
- React: fast-check (fc.assert + fc.property)
- Go: native testing.F fuzz functions (Fuzz*)