Client Simulation¶

The client simulation subsystem generates synthetic workloads that exercise the full task lifecycle end-to-end. Simulated clients (AI-driven, human, or hybrid) submit task requirements through an intake pipeline and review completed deliverables via a configurable review pipeline. This enables systematic evaluation of agent performance, organisational throughput, and quality metrics without real external clients.

Architecture Overview¶

Client Types¶

ClientInterface Protocol¶

All client types implement ClientInterface, providing two operations:

submit_requirement(context): generate or submit a task requirement. Returns None when the client declines to participate.
review_deliverable(context): review a completed deliverable and return feedback with acceptance decision and reasoning.
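
The two operations can be pictured as a structural protocol. The following is a minimal sketch, not the project's actual definition: the async signatures, the dict-shaped context, and the Feedback stand-in are all assumptions made for illustration.

```python
from __future__ import annotations

import asyncio
from dataclasses import dataclass
from typing import Optional, Protocol, runtime_checkable


@dataclass(frozen=True)
class Feedback:
    """Hypothetical stand-in for the feedback object a review returns."""

    accepted: bool
    reason: str


@runtime_checkable
class ClientInterface(Protocol):
    """Assumed shape of the protocol; the real signatures may differ."""

    async def submit_requirement(self, context: dict) -> Optional[dict]: ...

    async def review_deliverable(self, context: dict) -> Feedback: ...


class DecliningClient:
    """Toy client: never submits work, accepts every deliverable."""

    async def submit_requirement(self, context: dict) -> Optional[dict]:
        return None  # None signals that the client declines to participate

    async def review_deliverable(self, context: dict) -> Feedback:
        return Feedback(accepted=True, reason="no objections")


client = DecliningClient()
requirement = asyncio.run(client.submit_requirement({"round": 1}))
feedback = asyncio.run(client.review_deliverable({"task_id": "t-1"}))
```

Because the protocol is structural, DecliningClient satisfies it without inheriting from it; callers must still handle the None return from submit_requirement.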

AIClient¶

LLM-backed client with a configurable persona. Uses CompletionProvider for requirement generation and deliverable review. Prompts are persona-driven and built from the ClientProfile (expertise domains, strictness level).

HumanClient¶

Delegates to the API/dashboard for human input. Uses an async callback pattern for approval flows. No LLM calls; pure API/UI delegation.

HybridClient¶

Composes AIClient + HumanClient: AI drafts requirements and evaluates deliverables, human confirms or overrides decisions.

Client Profile¶

class ClientProfile(BaseModel):
    client_id: NotBlankStr
    name: NotBlankStr
    persona: NotBlankStr
    expertise_domains: tuple[NotBlankStr, ...]
    strictness_level: float  # 0.0 (lenient) to 1.0 (strict)

Profiles control how clients generate requirements and evaluate deliverables. strictness_level influences feedback strategies; stricter clients reject more deliverables and provide more detailed failure analysis.

Request Lifecycle¶

Client requests follow their own state machine, independent of the task lifecycle:

stateDiagram-v2
    [*] --> SUBMITTED
    SUBMITTED --> TRIAGING : intake engine receives
    SUBMITTED --> CANCELLED : rejected at submission
    TRIAGING --> SCOPING : triage complete
    TRIAGING --> CANCELLED : rejected during triage
    SCOPING --> APPROVED : scoping complete
    SCOPING --> CANCELLED : rejected during scoping
    APPROVED --> TASK_CREATED : task created in TaskEngine
    APPROVED --> CANCELLED : rejected before creation
    TASK_CREATED --> [*]
    CANCELLED --> [*]

RequestStatus is independent from TaskStatus. After TASK_CREATED, the task's own lifecycle (CREATED -> ASSIGNED -> ... -> COMPLETED) takes over.
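
The diagram above can be encoded as a transition table. This is a sketch under stated assumptions: the member names come from the diagram, but the string values and the transition helper are hypothetical, and the real RequestStatus may be defined differently.

```python
from enum import Enum


class RequestStatus(str, Enum):
    # Member names come from the diagram; the string values are assumed.
    SUBMITTED = "submitted"
    TRIAGING = "triaging"
    SCOPING = "scoping"
    APPROVED = "approved"
    TASK_CREATED = "task_created"
    CANCELLED = "cancelled"


# Allowed transitions, transcribed edge by edge from the state diagram.
ALLOWED_TRANSITIONS = {
    RequestStatus.SUBMITTED: {RequestStatus.TRIAGING, RequestStatus.CANCELLED},
    RequestStatus.TRIAGING: {RequestStatus.SCOPING, RequestStatus.CANCELLED},
    RequestStatus.SCOPING: {RequestStatus.APPROVED, RequestStatus.CANCELLED},
    RequestStatus.APPROVED: {RequestStatus.TASK_CREATED, RequestStatus.CANCELLED},
    RequestStatus.TASK_CREATED: set(),  # terminal
    RequestStatus.CANCELLED: set(),  # terminal
}


def transition(current: RequestStatus, target: RequestStatus) -> RequestStatus:
    """Return the target status, or raise if the diagram has no such edge."""
    if target not in ALLOWED_TRANSITIONS[current]:
        raise ValueError(f"illegal transition {current.name} -> {target.name}")
    return target
```

Every non-terminal state can move to CANCELLED, and neither terminal state has an outgoing edge, which matches the diagram.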

Requirement Generation¶

Five pluggable strategies implement RequirementGenerator:

| Strategy | Approach | Cost | Variety |
| --- | --- | --- | --- |
| `TemplateGenerator` | Pattern-based with variable slots | Low | Low |
| `LLMGenerator` | LLM-generated novel requirements | High | High |
| `DatasetGenerator` | Loads from curated corpus | Low | Medium |
| `HybridGenerator` | Dataset seeds + LLM refinement | Medium | High |
| `ProceduralGenerator` | Algorithmic with dependency graphs | Low | Medium |

Each returns tuple[TaskRequirement, ...] containing structured requirements with title, description, type, priority, complexity, and acceptance criteria.

Feedback Strategies¶

Four pluggable strategies implement FeedbackStrategy:

| Strategy | Signal | Use Case |
| --- | --- | --- |
| `BinaryFeedback` | Accept/reject with reason | Simple pass/fail evaluation |
| `ScoredFeedback` | Multi-dimensional scoring | Rich feedback for agent learning |
| `CriteriaCheckFeedback` | Per-criterion pass/fail | Structured failure analysis |
| `AdversarialFeedback` | Deliberately strict/ambiguous | Stress testing and edge cases |

All produce ClientFeedback with accepted boolean, reason, optional scores dictionary, and unmet_criteria tuple.
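
To make the shared output shape concrete, here is a small sketch of a per-criterion check that produces a ClientFeedback-like value. The four field names follow the description above; the dataclass itself, the criteria_check helper, and its arguments are hypothetical and simplified.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class ClientFeedback:
    """Simplified stand-in; field names follow the description above."""

    accepted: bool
    reason: str
    scores: Optional[Mapping[str, float]] = None
    unmet_criteria: tuple[str, ...] = ()


def criteria_check(acceptance_criteria: tuple[str, ...], met: set[str]) -> ClientFeedback:
    """Per-criterion pass/fail: accept only when every criterion is met."""
    unmet = tuple(c for c in acceptance_criteria if c not in met)
    if unmet:
        reason = f"{len(unmet)} of {len(acceptance_criteria)} criteria unmet"
        return ClientFeedback(accepted=False, reason=reason, unmet_criteria=unmet)
    return ClientFeedback(accepted=True, reason="all criteria met")
```

A binary strategy would fill only accepted and reason, while a scored strategy would also populate scores; the unmet_criteria tuple is what makes structured failure analysis possible.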

Review Pipeline¶

The review pipeline walks a chain of ReviewStage implementations in order. Each stage returns a ReviewVerdict:

PASS: continue to the next stage.
FAIL: short-circuit; task returns to IN_PROGRESS for rework.
SKIP: stage not applicable; continue to next.

Pipeline progress is tracked in task metadata (not via new TaskStatus values). The task stays in IN_REVIEW throughout pipeline execution.

# Metadata tracked on the task during pipeline execution
# (written as a Python dict literal; None serialises to JSON null)
{
    "review_pipeline": {
        "current_stage": "client",
        "stages_completed": ["internal"],
        "stage_results": {
            "internal": {"verdict": "pass", "reason": None},
            "client": {"verdict": "fail", "reason": "Missing tests"}
        }
    }
}
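
The stage walk and its metadata bookkeeping can be sketched as follows. This is an illustration only: stages are modelled as plain (name, callable) pairs, the task is a dict, and counting a skipped stage as completed is an assumption; the real ReviewStage and ReviewStageResult types are not reproduced here.

```python
from enum import Enum
from typing import Callable, List, Optional, Tuple


class ReviewVerdict(str, Enum):
    PASS = "pass"
    FAIL = "fail"
    SKIP = "skip"


# A stage is modelled as (name, callable) where the callable returns
# (verdict, reason). This shape is assumed for the sketch.
Stage = Tuple[str, Callable[[dict], Tuple[ReviewVerdict, Optional[str]]]]


def run_pipeline(task: dict, stages: List[Stage]) -> ReviewVerdict:
    """Walk the stages in order, recording progress in task metadata."""
    progress = {"current_stage": None, "stages_completed": [], "stage_results": {}}
    task["review_pipeline"] = progress
    for name, stage in stages:
        progress["current_stage"] = name
        verdict, reason = stage(task)
        progress["stage_results"][name] = {"verdict": verdict.value, "reason": reason}
        if verdict is ReviewVerdict.FAIL:
            task["status"] = "IN_PROGRESS"  # short-circuit: back for rework
            return ReviewVerdict.FAIL
        # PASS and SKIP both continue; counting SKIP as completed is assumed.
        progress["stages_completed"].append(name)
    return ReviewVerdict.PASS  # what follows a full pass is out of scope here
```

Running an "internal" stage that passes followed by a "client" stage that fails with "Missing tests" leaves the metadata in the same shape as the example above, and any later stage never runs.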

Built-in Stages¶

InternalReviewStage: wraps existing ReviewGateService logic. Backward-compatible default first stage.
ClientReviewStage: invokes ClientInterface.review_deliverable(). Maps ClientFeedback to ReviewStageResult.

Intake Engine¶

The IntakeEngine manages the ClientRequest lifecycle from SUBMITTED through TASK_CREATED. It routes requests to a configured IntakeStrategy:

DirectIntake: pass-through; creates a task immediately from the requirement with minimal validation.
AgentIntake: routes to an intake agent (PM/Account Manager) for triage, scoping, and approval before task creation.

Boot wiring¶

synthorg.client.runtime_builder.build_client_simulation_runtime constructs the IntakeEngine during app construction whenever a TaskEngine is present, together with a ReviewPipeline made of InternalReviewStage, with the VerificationReviewStage appended when verification_review_enabled is set. create_app attaches the resulting ClientSimulationState, so has_simulation_runtime is true and the /simulations and /requests controllers register. (The /requests/{id}/approve work-entry door is separately gated off by default via simulations.client_intake_enabled; see Client-intake work-entry path.)

The strategy is selected from the simulations settings namespace: intake_strategy ∈ {direct, agent}, intake_model, intake_default_project, review_pipeline_strategy, plus the verification stage's verification_review_enabled / verification_grader / verification_decomposer. Construction reads these keys via the bootstrap resolver (env > registered default) because ConfigResolver is not wired yet. The keys are nonetheless hot (restart_required=False, not read_only_post_init): an on-startup hook re-resolves them from the settings DB once the resolver is wired, and the SimulationsSettingsSubscriber rebuilds the simulation runtime via reload_runtime_services on any change, so a strategy / model / project / review-pipeline / verification change applies with no restart.

The rebuild swaps only the config-driven intake engine and review pipeline (including the verification stage) onto the existing ClientSimulationState via dataclasses.replace, preserving the live client pool and the request / simulation / feedback stores so in-flight work is never discarded. intake_default_project is the project the intake strategy files tasks into and the real work-entry adapter stamps on the work item (see Client-intake work-entry path).

The default direct strategy makes no LLM calls, so the runtime comes online even for an empty company. A selected agent strategy that cannot be satisfied (no provider or no model) degrades to direct with a warning rather than failing boot.
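
Two behaviours from this section lend themselves to a short sketch: the agent-to-direct fallback and the dataclasses.replace rebuild that keeps live state. The SimulationState class, its fields, and both helper functions below are hypothetical stand-ins, not the real ClientSimulationState or runtime builder.

```python
import logging
from dataclasses import dataclass, field, replace
from typing import Optional

logger = logging.getLogger(__name__)


def resolve_intake_strategy(requested: str, *, provider: Optional[object], model: Optional[str]) -> str:
    """Degrade an unsatisfiable 'agent' selection to 'direct' with a warning."""
    if requested == "agent" and (provider is None or not model):
        logger.warning("agent intake needs a provider and a model; using direct")
        return "direct"
    return requested


@dataclass(frozen=True)
class SimulationState:
    """Hypothetical stand-in for ClientSimulationState."""

    intake_strategy: str                 # config-driven, swapped on rebuild
    review_stages: tuple                 # config-driven, swapped on rebuild
    client_pool: list = field(default_factory=list)     # live, preserved
    request_store: dict = field(default_factory=dict)   # live, preserved


def rebuild(state: SimulationState, *, intake_strategy: str, review_stages: tuple) -> SimulationState:
    """Swap only the config-driven parts; untouched fields carry over as-is."""
    return replace(state, intake_strategy=intake_strategy, review_stages=review_stages)
```

dataclasses.replace copies field references rather than deep-copying them, so the rebuilt state shares the same client pool and store objects as the old one, which is what keeps in-flight work intact in this sketch.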

Client-intake work-entry path (benchmark door, off by default)¶

POST /requests/{id}/approve is the synthetic-client intake work-entry path. It role-plays external customers filing work, so it is a benchmark surface, not a standing production front door, and is gated off by default behind simulations.client_intake_enabled (the always-on operator work-entry path is POST /objectives). When the flag is off, the endpoint returns 503 with a message pointing at the setting; enabling the flag takes effect on the next request (hot, no restart).

On approval with the flag on, the request is walked to APPROVED and a background task runs the IntakeEntryAdapter (WorkSource.INTAKE), which maps the ClientRequest onto a WorkItem and drives the work pipeline spine (intake -> projects -> decompose -> solo or team execution). The endpoint returns 202 Accepted with the APPROVED request; the terminal TASK_CREATED or CANCELLED state lands asynchronously and is observable via GET /requests/{id} and the request WebSocket channel. Reviewer scoping_notes from a prior /scope call are folded into the work item's intent body so the manual scope flow is preserved.
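
The gate-then-defer behaviour of the approve endpoint can be sketched without any web framework. Everything here is illustrative: approve_request, the dict-shaped request, the response body, and the background list standing in for a background task runner are assumptions, and the adapter is reduced to a status change.

```python
from __future__ import annotations

from typing import Callable


def approve_request(
    request: dict,
    *,
    client_intake_enabled: bool,
    background: list[Callable[[], None]],
) -> tuple[int, dict]:
    """Return (status_code, body) the way the approve door is described."""
    if not client_intake_enabled:
        # Door is off: refuse and point the caller at the setting.
        return 503, {"detail": "enable simulations.client_intake_enabled"}

    request["status"] = "APPROVED"

    def run_adapter() -> None:
        # Stand-in for the adapter driving the work pipeline; the real
        # outcome may also be CANCELLED.
        request["status"] = "TASK_CREATED"

    background.append(run_adapter)  # deferred, not awaited by the endpoint
    return 202, dict(request)  # snapshot still shows APPROVED
```

The 202 body is a snapshot taken before the deferred job runs, which is why a caller has to poll GET /requests/{id} or watch the WebSocket channel to see the terminal state.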

The adapter is wired only when the door is enabled and the work pipeline is online (engine.pipeline.entry.boot.wire_real_intake_entry, called from the boot runtime-services hook, the post-setup provider reinit, and the SimulationsSettingsSubscriber on a client_intake_enabled change) and attached to the AppState.intake_entry_adapter seam. With the door off, or when no work pipeline is wired (empty company / no provider), no adapter is wired and no client-intake project is seeded, so nothing appears as a standing empty project; approve then surfaces the 503 feature-disabled (or AgentRuntimeNotConfiguredError) response rather than minting a task no agent will run.

The task_board source is the sibling work-entry path: POST /tasks routes a board filing through TaskBoardEntryAdapter (WorkSource.TASK_BOARD) which builds the WorkItem from the user-submitted title/description/project and drives the same spine. The endpoint returns 202 Accepted with a TaskBoardSubmissionResponse envelope (correlation id + echo); the spine creates the task inside its intake phase and the spine-created task surfaces on the tasks WebSocket channel via task.created. Empty-company / no-adapter returns AgentRuntimeNotConfiguredError. The board's column moves remain pure status walks of the spine-created task. The adapter is wired by engine.pipeline.entry.boot.wire_real_task_board_entry (same boot + post-setup hot-swap shape as the intake helper, minus the project bootstrap since board filings carry their own project) and attached to the AppState.task_board_entry_adapter seam.

The simulations.intake_default_project setting (DB > env > registered default, hot) names the project the intake strategy files tasks into and the adapter stamps on the work item. When the door is enabled, wire_real_intake_entry re-reads it live from the settings resolver whenever the adapter is (re)wired, and the project is created at that point (not at boot) so that the pipeline's project-existence check and the created task agree. With the door off (the default), no such project is seeded.

Task Source Tracking¶

Tasks created through client simulation carry a source field:

class TaskSource(StrEnum):
    INTERNAL = "internal"      # Created by agent/human within the org
    CLIENT = "client"          # From a client (real or simulated)
    SIMULATION = "simulation"  # From simulation runner

This enables filtering and analytics by task origin without affecting the task lifecycle state machine.
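
As an example of origin-based analytics, tasks can be counted or filtered by source without touching their status. The task dicts below are made-up sample data, and TaskSource is re-declared with (str, Enum) only to keep the sketch self-contained on older Python versions.

```python
from collections import Counter
from enum import Enum


class TaskSource(str, Enum):
    # Mirrors the enum above; (str, Enum) is used here for portability.
    INTERNAL = "internal"
    CLIENT = "client"
    SIMULATION = "simulation"


# Made-up sample tasks; only the fields needed for the example are present.
tasks = [
    {"id": "t-1", "status": "COMPLETED", "source": TaskSource.CLIENT},
    {"id": "t-2", "status": "IN_REVIEW", "source": TaskSource.SIMULATION},
    {"id": "t-3", "status": "COMPLETED", "source": TaskSource.SIMULATION},
]

counts_by_source = Counter(task["source"] for task in tasks)
simulated_ids = [t["id"] for t in tasks if t["source"] is TaskSource.SIMULATION]
```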

Simulation Runner¶

SimulationRunner orchestrates batch simulation runs:

Spawn a pool of clients (AI/human/hybrid mix per ClientPoolConfig).
Generate requirements via RequirementGenerator.
Submit requirements to IntakeEngine.
Wait for task completion via TaskEngine.
Review deliverables via ClientReviewStage.
Collect metrics (SimulationMetrics).
Generate reports via ReportStrategy.
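
The steps above can be sketched as a single synchronous loop. This is a deliberately simplified illustration: the real runner is presumably asynchronous and concurrent, and run_batch, its callable parameters, the metric fields, and summarise are all hypothetical names.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Callable, Iterable, Optional


@dataclass
class SimulationMetrics:
    """Simplified stand-in; the real metrics are likely richer."""

    submitted: int = 0
    tasks_created: int = 0
    accepted: int = 0
    rejected: int = 0


def run_batch(
    clients: Iterable[str],
    *,
    generate: Callable[[str], Iterable[str]],
    submit: Callable[[str], Optional[str]],
    wait_for_completion: Callable[[str], str],
    review: Callable[[str, str], bool],
) -> SimulationMetrics:
    metrics = SimulationMetrics()
    for client in clients:                       # pool of clients
        for requirement in generate(client):     # requirement generation
            metrics.submitted += 1
            task_id = submit(requirement)        # intake; None = no task created
            if task_id is None:
                continue
            metrics.tasks_created += 1
            deliverable = wait_for_completion(task_id)
            if review(client, deliverable):      # client review
                metrics.accepted += 1
            else:
                metrics.rejected += 1
    return metrics                               # metrics collection


def summarise(metrics: SimulationMetrics) -> str:
    """Tiny report step: one line of totals."""
    return (
        f"submitted={metrics.submitted} created={metrics.tasks_created} "
        f"accepted={metrics.accepted} rejected={metrics.rejected}"
    )
```

Passing the collaborators in as callables keeps the loop testable with stubs, which mirrors how the real runner depends on pluggable generators, intake, and review stages.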

ContinuousMode provides event-driven always-on simulation with scheduled requirement generation and review triggers.

Idempotency¶

POST /api/v1/simulations/ registers the run via SimulationStore.register_if_absent, an atomic check-and-insert under the store's lock. A redelivered request (JetStream redelivery, HTTP 5xx-driven retry, etc.) carrying the same simulation_id returns HTTP 409 Conflict instead of spawning a second runner that races the first on update_status and corrupts metrics. Clients that supply their own simulation_id get retry safety for free; clients that omit it receive a fresh UUID per call and never collide.
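
The check-and-insert can be illustrated with an in-memory store. This is a sketch under assumptions: the real SimulationStore may use a different lock type and record shape, start_simulation is a hypothetical handler, and the 201 success code is chosen for illustration because the text above only specifies the 409.

```python
from __future__ import annotations

import threading
import uuid
from typing import Optional


class SimulationStore:
    """Simplified in-memory stand-in for the real store."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._runs: dict[str, dict] = {}

    def register_if_absent(self, simulation_id: str, record: dict) -> bool:
        """Atomically insert the record; return False if the id exists."""
        with self._lock:  # check and insert happen under one lock
            if simulation_id in self._runs:
                return False
            self._runs[simulation_id] = record
            return True


def start_simulation(store: SimulationStore, simulation_id: Optional[str] = None) -> tuple[int, str]:
    """Return (status_code, simulation_id); the success code is assumed."""
    sim_id = simulation_id or str(uuid.uuid4())  # fresh id when omitted
    if not store.register_if_absent(sim_id, {"status": "pending"}):
        return 409, sim_id  # duplicate delivery: do not spawn a second runner
    return 201, sim_id
```

Doing the membership test and the insert inside the same critical section is what closes the race: two concurrent deliveries of the same id cannot both observe "absent".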

Configuration¶

All configuration is composed into ClientSimulationConfig:

class ClientSimulationConfig(BaseModel):
    pool: ClientPoolConfig           # Pool size, AI/human/hybrid ratios
    generators: RequirementGeneratorConfig  # Strategy + settings
    feedback: FeedbackConfig         # Strategy + scoring rubric
    report: ReportConfig             # Report style discriminator
    runner: SimulationRunnerConfig   # Concurrency, timeouts
    continuous: ContinuousModeConfig # Interval, max concurrent

Configuration & Factories¶

Each client strategy family has a config discriminator that a factory function in synthorg.client.factory dispatches to the concrete implementation. Misconfiguration fails loudly: every factory raises UnknownStrategyError (a ValueError subclass) on an unknown discriminator rather than silently falling back to a default.

| Config discriminator | Factory function | Strategies |
| --- | --- | --- |
| `RequirementGeneratorConfig.strategy` | `build_requirement_generator()` | `template` → `TemplateGenerator`, `llm` → `LLMGenerator`, `dataset` → `DatasetGenerator`, `procedural` → `ProceduralGenerator` |
| `FeedbackConfig.strategy` | `build_feedback_strategy(config, *, client_id)` | `binary` → `BinaryFeedback`, `scored` → `ScoredFeedback`, `criteria_check` → `CriteriaCheckFeedback`, `adversarial` → `AdversarialFeedback` |
| `ReportConfig.strategy` | `build_report_strategy()` | `summary` → `SummaryReport`, `detailed` → `DetailedReport`, `json_export` → `JsonExportReport`, `metrics_only` → `MetricsOnlyReport` |
| `ClientPoolConfig.selection_strategy` | `build_client_pool_strategy()` | `round_robin` → `RoundRobinStrategy`, `weighted_random` → `WeightedRandomStrategy`, `domain_matched` → `DomainMatchedStrategy` |
| `adapter` arg (intake entry point) | `build_entry_point_strategy(adapter, *, project_id=None)` | `direct` → `DirectAdapter`, `project` → `ProjectAdapter`, `intake` → `IntakeAdapter` |
| `IntakeConfig.strategy` | `build_intake_strategy(config, *, task_engine, default_project, provider=None, cost_tracker=None)` | `direct` → `DirectIntake`, `agent` → `AgentIntake` |
| `WorkSource` (work-entry adapter) | `build_work_entry_adapter(source, *, work_pipeline, default_project)` | `intake` → `IntakeEntryAdapter`, `task_board` → `TaskBoardEntryAdapter` |

The factories follow the project-wide pluggable-subsystems pattern (protocol + strategy + factory + config discriminator). No silent defaults: a misspelled discriminator is a hard error at construction time, not a runtime surprise during a simulation.
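
The fail-loudly dispatch can be sketched with a lookup table. The strategy classes and FeedbackConfig below are empty placeholders, only two discriminators are shown, and the error message format is an assumption; only the build_feedback_strategy(config, *, client_id) shape and the ValueError subclass come from the description above.

```python
from dataclasses import dataclass


class UnknownStrategyError(ValueError):
    """Raised when a config discriminator names no known strategy."""


class _Strategy:
    """Placeholder base so the sketch has something to construct."""

    def __init__(self, *, client_id: str) -> None:
        self.client_id = client_id


class BinaryFeedback(_Strategy):
    pass


class ScoredFeedback(_Strategy):
    pass


@dataclass(frozen=True)
class FeedbackConfig:
    strategy: str  # the discriminator


_FEEDBACK_STRATEGIES = {"binary": BinaryFeedback, "scored": ScoredFeedback}


def build_feedback_strategy(config: FeedbackConfig, *, client_id: str) -> _Strategy:
    """Dispatch on the discriminator; never fall back to a default."""
    try:
        strategy_cls = _FEEDBACK_STRATEGIES[config.strategy]
    except KeyError:
        known = ", ".join(sorted(_FEEDBACK_STRATEGIES))
        raise UnknownStrategyError(
            f"unknown feedback strategy {config.strategy!r}; expected one of: {known}"
        ) from None
    return strategy_cls(client_id=client_id)
```

Because UnknownStrategyError subclasses ValueError, existing callers that catch ValueError keep working, while a typo such as "binray" surfaces at construction time.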

Hybrid requirement generator is intentionally excluded from factory dispatch

RequirementGeneratorConfig.strategy="hybrid" does not resolve through build_requirement_generator(). HybridGenerator composes multiple underlying generators with weights, so it has no single-argument factory; callers must construct it manually with a tuple of (generator, weight) pairs. Passing "hybrid" to the factory raises UnknownStrategyError: this is a deliberate deviation from the other strategies, not an oversight.
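
Manual construction from (generator, weight) pairs might look like the following. This is a hypothetical illustration, not the real HybridGenerator: the class name, the zero-argument generator callables, the string requirements, and the seed parameter are all assumptions.

```python
from __future__ import annotations

import random
from typing import Callable, Optional, Sequence


class WeightedHybridGenerator:
    """Hypothetical sketch of a weighted composition of generators."""

    def __init__(
        self,
        generators: Sequence[tuple[Callable[[], str], float]],
        *,
        seed: Optional[int] = None,
    ) -> None:
        if not generators:
            raise ValueError("at least one (generator, weight) pair is required")
        self._generators = [generator for generator, _ in generators]
        self._weights = [weight for _, weight in generators]
        if any(w < 0 for w in self._weights) or sum(self._weights) <= 0:
            raise ValueError("weights must be non-negative and sum to more than 0")
        self._rng = random.Random(seed)  # seedable for reproducible runs

    def generate(self, count: int) -> tuple[str, ...]:
        """Pick a generator per requirement, proportionally to its weight."""
        chosen = self._rng.choices(self._generators, weights=self._weights, k=count)
        return tuple(generator() for generator in chosen)


# Usage: the caller wires the pairs by hand, with no factory involved.
hybrid = WeightedHybridGenerator(
    [(lambda: "from-template", 1.0), (lambda: "from-dataset", 0.0)],
    seed=7,
)
batch = hybrid.generate(3)
```

The constructor needs a sequence of pairs rather than a single discriminator string, which illustrates why a one-argument factory entry does not fit this strategy.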

Observability¶

Event constants in synthorg.observability.events.client and synthorg.observability.events.review_pipeline cover:

Client request lifecycle (submitted, triaging, scoped, approved, rejected)
Client review lifecycle (started, completed, feedback recorded)
Requirement generation events
Simulation run lifecycle (started, round completed, completed)
Review pipeline lifecycle (started, stage completed, completed)
Intake processing (received, accepted, rejected)