Configuration Precedence¶
On-demand reference for how SynthOrg resolves configuration values. The
short rule in CLAUDE.md is the contract; this page is the full
exception registry, source matrix, and rationale.
The rule¶
Three sources, in order, first match wins:
1. Settings Database (per-installation runtime override, set via /settings)
2. Environment Variable (deployment preset; docker-compose, K8s, .env)
3. Code Default (SettingDefinition.default)
The chain is implemented in synthorg.settings.service.SettingsService
and is the only sanctioned way to resolve a runtime-mutable
setting. Direct env-var reads in application code are forbidden except
for the documented bootstrap exceptions below.
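The first-match-wins chain can be illustrated with a minimal sketch. It is not the SettingsService implementation: the function name, its parameters, and the plain-mapping stand-ins for the settings database and the process environment are all assumptions made for illustration.

```python
from typing import Mapping, Tuple


def resolve_setting(
    key: str,
    env_name: str,
    default: str,
    db_rows: Mapping[str, str],
    environ: Mapping[str, str],
) -> Tuple[str, str]:
    """Return (value, source) using DB > env > default; first match wins.

    Hypothetical helper: db_rows and environ are plain mappings standing in
    for the settings database and os.environ.
    """
    if key in db_rows:
        return db_rows[key], "db"
    if env_name in environ:
        return environ[env_name], "env"
    return default, "default"
```

A DB row therefore shadows an env var for the same key, and the code default is only reached when neither is present.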
YAML is not a precedence tier. The company.yaml file is an
ingestion format for company templates (charter, departments, agents,
workflows). Its contents flow into domain tables on synthorg init;
they do not participate in the settings chain.
The three categories¶
Every setting belongs to exactly one of three categories. The category determines which subset of the chain applies.
Category 1: Standard mutable¶
Default category. The full chain applies: a /settings runtime
override (DB row) wins over an env var, which wins over the registered
code default.
Examples: observability.root_log_level, observability.log_level_console,
api.lifecycle_cleanup_enabled, engine.timeout_enforcement_enabled.
Category 2: Read-only post-init¶
Registered with read_only_post_init=True (which implies
restart_required=True). The DB lookup is bypassed on reads, and
SettingsService.set(), set_many(), and delete() raise
SettingReadOnlyError so an operator does not believe a runtime
override took effect when the running process keeps the boot-time value.
delete_namespace() does not raise: a read-only key in the target
namespace is logged as a WARNING (reason="read_only_post_init_swept")
and skipped, so it cannot hold hostage the writable overrides the
operator wants to clear.
For these entries the precedence chain collapses to env > default.
The DB step is bypassed on reads (get, get_namespace, get_all,
get_page, get_versioned) regardless of whether a stale row exists,
because the running process resolves its bootstrap value once and
holds onto it; a row left over from a pre-rename schema or an ops
mistake on a peer node would otherwise surface a value the runtime no
longer honours. The /settings UI therefore reflects the actual
running value, sourced from the env var or registered default at
first read.
The DB-bypass branch lives in src/synthorg/settings/service.py
inside get() (the if not definition.read_only_post_init: guard
around the _resolve_db call) and is mirrored in
_resolve_with_db_lookup for batch reads (the if
definition.read_only_post_init: db_hit = None short-circuit before
the DB row is consulted).
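The Category 2 behaviour described above (DB bypassed on reads, writes rejected) can be sketched as follows. This is a simplified stand-in, not the code in service.py: the Definition and TinySettingsService classes and the in-memory dict used as the DB are hypothetical; only the SettingReadOnlyError name and the read_only_post_init flag come from this page.

```python
from dataclasses import dataclass
from typing import Dict, Mapping, Tuple


class SettingReadOnlyError(Exception):
    """Stand-in for the error raised on writes to read-only settings."""


@dataclass(frozen=True)
class Definition:
    # Hypothetical, trimmed-down registry entry.
    key: str
    env_name: str
    default: str
    read_only_post_init: bool = False


class TinySettingsService:
    """Illustrative service: a dict plays the role of the settings DB."""

    def __init__(self, environ: Mapping[str, str]) -> None:
        self._environ = environ
        self.db: Dict[str, str] = {}

    def get(self, definition: Definition) -> Tuple[str, str]:
        # The DB step runs only for mutable settings; a stale row for a
        # read-only key is never consulted.
        if not definition.read_only_post_init and definition.key in self.db:
            return self.db[definition.key], "db"
        if definition.env_name in self._environ:
            return self._environ[definition.env_name], "env"
        return definition.default, "default"

    def set(self, definition: Definition, value: str) -> None:
        # Reject the write so the operator is not misled into thinking a
        # runtime override took effect.
        if definition.read_only_post_init:
            raise SettingReadOnlyError(definition.key)
        self.db[definition.key] = value
```

With a stale row present for a read-only key, get() still reports the env or default value, which matches what the running process actually uses.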
Examples: api.server_port, api.server_host, api.api_prefix,
communication.nats_url, workers.count, observability.log_directory,
api.cors_allowed_origins, api.trusted_proxies,
api.rate_limiter_enabled.
Category 3: Bootstrap secret (init-time exception)¶
Read once at process start before SettingsService exists. No
registry entry. Pure env. The value is captured into a typed domain
object at the boot site (e.g. JwtSecret, CursorConfig,
SettingsEncryptor, persistence config) and never re-read.
Why not register them as Cat-2 with sensitive=True? Two reasons:
- Persistence URLs and credentials. Rotating DB credentials at
  runtime through the settings UI exposes them to every operator
  holding settings:read, even when sensitive=True masks the displayed
  value. Env-only plus a secret backend is the safer pattern.
- Bootstrap secrets. JWT secret, master key, pagination cursor secret,
  settings encryption key: all read once before the settings service
  exists. A registry entry would be inert for these.
Examples: SYNTHORG_DATABASE_URL, SYNTHORG_DB_PATH,
SYNTHORG_POSTGRES_SSL_MODE, SYNTHORG_CONFIG_PATH,
SYNTHORG_JWT_SECRET, SYNTHORG_MASTER_KEY,
SYNTHORG_PAGINATION_CURSOR_SECRET, SYNTHORG_SETTINGS_KEY.
Discoverability¶
Every (namespace, key) emits one INFO settings.value.resolved event
on its first cold read per process. The payload carries source
(db / env / default) so an operator can audit at startup which
surface supplied each value. Subsequent resolutions stay at DEBUG.
Category 3 secrets do not emit a settings.value.resolved event;
they are read directly at the boot site and logged via the
domain-specific startup event (e.g. API_APP_STARTUP,
SETTINGS_ENCRYPTOR_BOOTSTRAP).
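The "INFO on first cold read, DEBUG afterwards" behaviour for registered settings can be sketched with the standard logging module. The function name, the logger name, and the module-level set used to remember seen keys are assumptions for illustration; the real service may track this differently.

```python
import logging
from typing import Set, Tuple

logger = logging.getLogger("settings_sketch")

# (namespace, key) pairs already announced in this process (hypothetical state).
_announced: Set[Tuple[str, str]] = set()


def log_resolution(namespace: str, key: str, source: str) -> int:
    """Log a settings.value.resolved event and return the level used.

    The first resolution of each (namespace, key) is logged at INFO so a
    startup audit shows which surface supplied the value; repeats drop to
    DEBUG to keep steady-state logs quiet.
    """
    pair = (namespace, key)
    level = logging.DEBUG if pair in _announced else logging.INFO
    _announced.add(pair)
    logger.log(
        level,
        "settings.value.resolved namespace=%s key=%s source=%s",
        namespace,
        key,
        source,
    )
    return level
```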
Source matrix¶
Category 1 examples (DB > env > default)¶
| Setting | Env override | Notes |
|---|---|---|
| observability.root_log_level | SYNTHORG_OBSERVABILITY_ROOT_LOG_LEVEL | Standard mutable. |
| observability.log_level_console | SYNTHORG_LOG_LEVEL | Mutable; overrides the console sink only. |
| telemetry.enabled | SYNTHORG_TELEMETRY_ENABLED | Mutable; the collector reads the env var at boot for the fast-path, then honours runtime DB mutations on the next process restart. |
| engine.timeout_enforcement_enabled | SYNTHORG_ENGINE_TIMEOUT_ENFORCEMENT_ENABLED | Mutable kill-switch. |
| providers.model_refresh_mode | SYNTHORG_PROVIDERS_MODEL_REFRESH_MODE | Config discriminator for the periodic model-refresh subsystem (off / manual_only / detect_only / reconcile_recommend); off is the safe default. The scheduler re-reads it every tick (fail-safe to off), so mode changes apply without a restart. |
| providers.model_refresh_interval_seconds | SYNTHORG_PROVIDERS_MODEL_REFRESH_INTERVAL_SECONDS | Cadence between automatic reconcile cycles (60s..604800s). Re-read by the scheduler each tick (like the mode), so a change applies on the next cycle without a restart. |
| providers.model_refresh_auto_apply_within_family | SYNTHORG_PROVIDERS_MODEL_REFRESH_AUTO_APPLY_WITHIN_FAMILY | Opt-in (default off) auto-apply of strictly in-family upgrades; re-read every cycle. |
| chief_of_staff.propose_enabled | SYNTHORG_CHIEF_OF_STAFF_PROPOSE_ENABLED | On-by-default conversational capability; live-gated per request via ensure_feature_enabled (no restart). The siblings explain_chat_enabled / group_chat_enabled / routing_enabled behave the same. |
| chief_of_staff.alerts_enabled | SYNTHORG_CHIEF_OF_STAFF_ALERTS_ENABLED | Off-by-default autonomous capability (also: learning_enabled / narrative_enabled / invite_enabled). No restart: alerts_enabled is started/stopped live by ChiefOfStaffAlertsSettingsSubscriber; the others are gated per cycle/turn. Each additionally requires the persona master switch self_improvement.chief_of_staff_enabled. |
| chief_of_staff.direct_mcp_enabled | SYNTHORG_CHIEF_OF_STAFF_DIRECT_MCP_ENABLED | Off-by-default autonomous MCP acting; restart_required (KEEP). The actor is built fail-closed at construction (needs engine.has_security_governance) with no per-request governance re-check, so enabling it requires a deliberate redeploy. |
| chief_of_staff.chat_model | SYNTHORG_CHIEF_OF_STAFF_CHAT_MODEL | Per-feature model for conversational turns (also propose_model / routing_model / narrative_model); read live per LLM call, no restart. Auto-filled at setup-complete when left blank. |
| knowledge.enabled | SYNTHORG_KNOWLEDGE_ENABLED | On-by-default knowledge substrate; restart_required (the engine wires at boot). |
| research.enabled | SYNTHORG_RESEARCH_ENABLED | On-by-default research pipeline; restart_required. The model lives in research.model (auto-filled at setup-complete). |
| self_improvement.enabled | SYNTHORG_SELF_IMPROVEMENT_ENABLED | Off-by-default self-modification master switch; read live per cycle by run_cycle (with engine.evolution_enabled), so toggling it applies with no restart. The strategy toggles (config_tuning_enabled / architecture_proposals_enabled / prompt_tuning_enabled), tool_creation_enabled (+ its allowlist), and the analysis / code-mod models are likewise live. |
| self_improvement.code_modification_enabled | SYNTHORG_SELF_IMPROVEMENT_CODE_MODIFICATION_ENABLED | Off-by-default self-modifying code; restart_required (KEEP). validate_prerequisites() verifies GitHub credentials only at startup and the strategy + CodeApplier are built at boot, so enabling it requires a deliberate redeploy. |
| providers.tool_call_feedback_enabled | SYNTHORG_PROVIDERS_TOOL_CALL_FEEDBACK_ENABLED | Master switch for the runtime tool-call failure feedback loop (default true). Re-read live per observation by the ToolCallFeedbackTracker, so toggling it on/off applies without a restart while the sink stays installed. |
| providers.tool_call_failure_threshold | SYNTHORG_PROVIDERS_TOOL_CALL_FAILURE_THRESHOLD | Decayed-score threshold (1..20, default 3) at which a model is downgraded (tool_calls_verified=False). Re-read on each failure. |
| providers.tool_call_failure_decay_half_life_seconds | SYNTHORG_PROVIDERS_TOOL_CALL_FAILURE_DECAY_HALF_LIFE_SECONDS | Half-life (60s..86400s, default 3600s) over which a failure's weight halves, so a transient blip decays away rather than permanently downgrading a capable model. Re-read on each failure. |
Category 2 examples (env > default; DB bypassed)¶
| Setting | Env override | Notes |
|---|---|---|
| api.server_host | SYNTHORG_API_SERVER_HOST | Consumed pre-init via bootstrap_resolver at app construction; registry entry for /settings discoverability. |
| api.server_port | SYNTHORG_API_SERVER_PORT | Same as above. |
| api.api_prefix | SYNTHORG_API_API_PREFIX | Same. |
| api.cors_allowed_origins | SYNTHORG_API_CORS_ALLOWED_ORIGINS | Same. JSON-encoded list. |
| api.trusted_proxies | SYNTHORG_API_TRUSTED_PROXIES | Same. JSON-encoded list. |
| api.rate_limiter_enabled | SYNTHORG_API_RATE_LIMITER_ENABLED | Same. Bool token (true/false/1/0/yes/no). |
| communication.nats_url | SYNTHORG_NATS_URL | Read once by the bus driver at startup. |
| workers.count | SYNTHORG_WORKERS | Read at worker-process boot AND by the worker pool builder. |
| observability.log_directory | SYNTHORG_LOG_DIR | Path-traversal validated at the boot site. |
| budget.coordination_metrics_max_entries | SYNTHORG_BUDGET_COORDINATION_METRICS_MAX_ENTRIES | Sizes the coordination-metrics ring buffer at boot. |
| budget.baseline_window_size | SYNTHORG_BUDGET_BASELINE_WINDOW_SIZE | Sizes the single-agent baseline window at BaselineStore construction. |
Category 3 examples (env only; no registry entry)¶
| Concern | Env var | Boot site |
|---|---|---|
| SQLite path | SYNTHORG_DB_PATH | api/boot_persistence.py, api/app_helpers.py, api/integrations_wiring.py |
| Postgres URL | SYNTHORG_DATABASE_URL | api/boot_persistence.py, api/app_helpers.py |
| Postgres SSL mode | SYNTHORG_POSTGRES_SSL_MODE | api/boot_persistence.py |
| Config-file path | SYNTHORG_CONFIG_PATH | api/boot_persistence.py, backup/factory.py |
| JWT secret | SYNTHORG_JWT_SECRET | api/auth/secret.py |
| Master key (OAuth) | SYNTHORG_MASTER_KEY | integrations/oauth/pkce.py |
| Pagination cursor secret | SYNTHORG_PAGINATION_CURSOR_SECRET | api/cursor_config.py |
| Settings encryption key | SYNTHORG_SETTINGS_KEY | settings/encryption.py |
For the full inventory of SYNTHORG_* env vars, see
environment-variables.md.
Custom env var names (env_var_override)¶
The default env var name for a registered setting is auto-derived as
SYNTHORG_<NAMESPACE>_<KEY>. When an established operator-facing env
var name predates this rule (e.g. the Docker-compose template already
sets SYNTHORG_LOG_DIR), the registry definition can set
env_var_override="SYNTHORG_LOG_DIR" and the resolver will look up
that exact name instead. Settings currently using overrides:
| Registry key | Override env var |
|---|---|
| observability/log_directory | SYNTHORG_LOG_DIR |
| observability/log_level_console | SYNTHORG_LOG_LEVEL |
| communication/nats_url | SYNTHORG_NATS_URL |
| workers/count | SYNTHORG_WORKERS |
| workers/executor_http_timeout_seconds | SYNTHORG_WORKER_HTTP_TIMEOUT_SECONDS |
| tools/sandbox_image | SYNTHORG_SANDBOX_IMAGE |
| tools/sidecar_image | SYNTHORG_SIDECAR_IMAGE |
When env_var_override is set, the auto-derived name is not
consulted: only the override. This keeps the operator surface clean:
exactly one env var name per setting.
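The naming rule is small enough to sketch directly. The function below is a hypothetical helper, not the resolver's actual code; it assumes the namespace and key are simply upper-cased, which matches the examples in the source matrix.

```python
from typing import Optional


def env_var_name(
    namespace: str,
    key: str,
    env_var_override: Optional[str] = None,
) -> str:
    """Return the single env var name consulted for a setting.

    When an override is registered it is used verbatim and the
    auto-derived SYNTHORG_<NAMESPACE>_<KEY> name is not consulted.
    """
    if env_var_override is not None:
        return env_var_override
    return f"SYNTHORG_{namespace.upper()}_{key.upper()}"
```

For example, env_var_name("api", "api_prefix") yields SYNTHORG_API_API_PREFIX, while the log directory resolves to SYNTHORG_LOG_DIR only because its definition carries the override.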
Adding a new setting¶
- Decide which category fits.
  - Category 1 (mutable): register a normal SettingDefinition in the
    appropriate src/synthorg/settings/definitions/<namespace>.py
    module. The env-var override is auto-derived as
    SYNTHORG_<NAMESPACE>_<KEY>; supply env_var_override= if an
    operator-facing name predates the rule.
  - Category 2 (init-time read-only but operator-visible): register
    with restart_required=True and read_only_post_init=True. The
    SettingsService rejects runtime mutation and bypasses the DB on
    reads.
  - Category 3 (bootstrap secret): do not register. Read the env var
    directly at the boot site and document the env var on
    environment-variables.md. Capture into a typed domain object; never
    re-read.
- Consume the value via ConfigResolver.get_*() (post-init) or
  synthorg.settings.bootstrap_resolver.resolve_init_value(...)
  (pre-init). Direct os.environ.get reads in application code outside
  startup are forbidden.
Bootstrap resolver (pre-SettingsService Cat-2 reads)¶
Some Category-2 settings are consumed at app construction time, before
SettingsService has been wired. Examples: rate-limiter middleware
construction, log-sink bootstrap, log-directory selection. Reading
os.environ directly at these sites is drift: the registry already
owns the env var name and the default, and the chain (env > default)
should be applied uniformly.
synthorg.settings.bootstrap_resolver.resolve_init_value(...) is the
sanctioned pre-init resolver. It reads the SettingDefinition from
the registry to obtain the env var name (override or auto-derived)
and the typed default, then returns the env value (if set) or the
registered default. An optional parse callback validates and converts
the env string to the consumer's type; returning None falls back
to the default.
    from synthorg.settings.bootstrap_resolver import resolve_init_value
    from synthorg.settings.enums import SettingNamespace

    resolved = resolve_init_value(
        SettingNamespace.API,
        "rate_limiter_enabled",
        parse=_parse_bool_token,
    )
    rate_limiter_enabled = resolved.value
Used by:
- synthorg.api.app._build_rate_limiter_enabled (rate-limiter middleware boot)
- synthorg.api.app_builders._bootstrap_app_logging (log directory)
- synthorg.observability.setup._apply_console_level_override (console log level)
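To make the env > default fallback and the role of the parse callback concrete, here is a self-contained sketch. It does not reproduce resolve_init_value: the Resolved dataclass, its source field, the resolve_init function, and the case-insensitive token handling in parse_bool_token are assumptions; only the token set (true/false/1/0/yes/no) and the "None falls back to the default" rule come from this page.

```python
from dataclasses import dataclass
from typing import Callable, Generic, Mapping, Optional, TypeVar

T = TypeVar("T")


@dataclass(frozen=True)
class Resolved(Generic[T]):
    # Hypothetical result type; the real resolver exposes at least .value.
    value: T
    source: str


def parse_bool_token(raw: str) -> Optional[bool]:
    """Map a bool token to True/False, or None when it is not recognised."""
    token = raw.strip().lower()  # assumption: tokens are case-insensitive
    if token in {"true", "1", "yes"}:
        return True
    if token in {"false", "0", "no"}:
        return False
    return None


def resolve_init(
    env_name: str,
    default: T,
    environ: Mapping[str, str],
    parse: Callable[[str], Optional[T]],
) -> Resolved[T]:
    """Apply env > default; an unparseable env value falls back to default."""
    raw = environ.get(env_name)
    if raw is not None:
        parsed = parse(raw)
        if parsed is not None:
            return Resolved(parsed, "env")
    return Resolved(default, "default")
```

Falling back on a parser result of None means a typo in a deployment env file degrades to the registered default instead of aborting app construction.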
Pydantic mirror fields (apply_settings_mirrors)¶
Many Pydantic config classes (ApiConfig, ServerConfig,
BudgetConfig, etc.) carry fields that mirror registered settings.
With YAML eliminated from the precedence chain, the Pydantic-tier
default would otherwise drift from the env-tier override resolved by
SettingsService.
Settings-only registered keys (no Pydantic mirror)¶
Some registered settings are consumed exclusively through
SettingsService (or ConfigResolver) and have no corresponding
field on any Pydantic config class. They participate in the standard
precedence chain (DB > env > default) without needing a mirror
declaration. Examples in the company namespace:
- company.name_locales: consumed in src/synthorg/api/controllers/setup/company_helpers.py via SettingsService.get_entry.
- company.description: registered for /settings UI discoverability; no current code consumer.
These keys are NOT fields on RootConfig; treating them as
settings-only avoids the dual-surface drift that the mirror pattern
exists to fix.
synthorg.settings.mirrors.apply_settings_mirrors is the sanctioned
fix. Each Pydantic class with mirror fields declares them via a
MirrorField tuple and attaches a model_validator(mode="before")
that populates unset fields from the registry. The Pydantic field
declarations remain (consumer API unchanged) but the value at
construction time IS the precedence-chain result.
    from typing import Any, ClassVar

    from pydantic import BaseModel, ConfigDict, Field, model_validator

    from synthorg.settings.enums import SettingNamespace
    from synthorg.settings.mirrors import (
        MirrorField, apply_settings_mirrors, parse_bool,
    )


    class MyConfig(BaseModel):
        model_config = ConfigDict(frozen=True, allow_inf_nan=False)

        _MIRROR_FIELDS: ClassVar[tuple[MirrorField, ...]] = (
            MirrorField(
                field="enabled",
                namespace=SettingNamespace.MYNS,
                key="enabled",
                parse=parse_bool,
            ),
        )

        enabled: bool = Field(default=True)

        @model_validator(mode="before")
        @classmethod
        def _apply_mirrors(cls, data: Any) -> Any:
            return apply_settings_mirrors(data, cls._MIRROR_FIELDS)
Available parsers¶
synthorg.settings.mirrors ships the parser callbacks below. A
MirrorField with parse=None applies identity parsing (the raw env
string reaches the field, and the Pydantic field type does any
coercion). A parser returning None signals invalid input; the
registered default is then applied.
| Parser | Signature | Use for |
|---|---|---|
| parse_bool | (str) -> bool \| None | Boolean tokens (true/false/1/0/yes/no). |
| parse_int | (str) -> int \| None | Integer settings. |
| parse_float | (str) -> float \| None | Float settings. |
| parse_str_tuple_json | (str) -> tuple[str, ...] \| None | JSON list-of-strings into a tuple. |
| parse_json_int_pair_dict | (str) -> dict[str, list[int]] \| None | JSON {op: [int, int]} (e.g. PerOpRateLimitConfig.overrides). Top-level shape only; the owning config's mode="before" validator promotes inner lists to tuples and rejects negatives. |
| parse_json_int_dict | (str) -> dict[str, int] \| None | JSON {op: int} (e.g. PerOpConcurrencyConfig.overrides). Top-level shape only; the owning validator rejects non-int / negative values. |
The two JSON-dict parsers deliberately validate only the top-level
JSON structure. Per-entry semantics (non-blank keys, tuple arity,
non-negativity) belong to the owning config's mode="before"
validator so operator-facing error context fires before Pydantic
coercion. See "Validator declaration order" in
conventions.md.
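As an illustration of the parser contract (return the converted value, or None for invalid input), a list-of-strings parser might look like the sketch below. The function name carries a _sketch suffix because this is not the shipped parse_str_tuple_json; its exact validation rules are an assumption.

```python
import json
from typing import Optional, Tuple


def parse_str_tuple_json_sketch(raw: str) -> Optional[Tuple[str, ...]]:
    """Parse a JSON list of strings into a tuple, or return None if invalid.

    Returning None (rather than raising) lets the caller fall back to the
    registered default.
    """
    try:
        data = json.loads(raw)
    except ValueError:  # json.JSONDecodeError is a ValueError subclass
        return None
    if not isinstance(data, list):
        return None
    if not all(isinstance(item, str) for item in data):
        return None
    return tuple(data)
```

A value such as '["https://a.example", "https://b.example"]' for api.cors_allowed_origins would parse to a two-element tuple, while a bare string or a JSON object would be rejected.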
Sentinel-preserving mode: only_if_env_set=True¶
When the Pydantic field's None default carries semantic meaning the
registry default would clobber, set only_if_env_set=True on the
MirrorField. The mirror then fires ONLY when the operator has
explicitly set the env var; if the resolver falls back to the
registered default the Pydantic field keeps its declared default.
Used by:
- AuthConfig.exclude_paths (None = auto-derive from API prefix)
- CoordinationSectionConfig.max_concurrency_per_wave (None = unlimited)
- CeremonyPolicyConfig.{strategy, velocity_calculator, auto_transition, transition_threshold} (None = inherit from level up)
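The difference between the default mirror behaviour and the sentinel-preserving mode can be sketched without Pydantic. Everything here is hypothetical (MirrorSketch, apply_mirrors_sketch, and the environ mapping); the sketch only models the rule that an only_if_env_set mirror populates the field when the env var is explicitly set and otherwise leaves it unset so the declared default (for example None) survives.

```python
from dataclasses import dataclass
from typing import Any, Dict, Mapping, Sequence


@dataclass(frozen=True)
class MirrorSketch:
    # Hypothetical, simplified stand-in for a mirror declaration.
    field: str
    env_name: str
    registry_default: Any
    only_if_env_set: bool = False


def apply_mirrors_sketch(
    data: Mapping[str, Any],
    mirrors: Sequence[MirrorSketch],
    environ: Mapping[str, str],
) -> Dict[str, Any]:
    """Populate unset fields from env or the registry default."""
    out = dict(data)
    for mirror in mirrors:
        if mirror.field in out:
            continue  # an explicitly supplied value is left alone
        if mirror.env_name in environ:
            out[mirror.field] = environ[mirror.env_name]
        elif not mirror.only_if_env_set:
            out[mirror.field] = mirror.registry_default
        # else: stay unset so the field's declared default (e.g. None) applies
    return out
```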
Selecting between the three resolution helpers¶
| Use case | Helper |
|---|---|
| Settings consumed at app construction, before SettingsService exists | bootstrap_resolver.resolve_init_value |
| Settings consumed via a Pydantic Config field whose value comes from RootConfig | mirrors.apply_settings_mirrors |
| Runtime-mutable settings consumed per request | ConfigResolver.get_*() (post-init) |
| Hot-reloadable knobs needing one snapshot per process tick | Bridge-config snapshot pattern below |
Protocol constants are not settings¶
Wire-protocol numerics such as JSON-RPC error codes
(JSONRPC_PARSE_ERROR: int = -32700), framing thresholds, or
specification-mandated limits are NOT operator-tunable policy:
changing the value silently breaks interop with peers that read the
public spec. Express them as typed module-level constants and let
scripts/check_no_magic_numbers.py recognise the annotation as the
named-constant signal. Import Final directly from typing; the
gate matches only the bare names int, float, Final, Final[int],
and Final[float], so qualified forms such as typing.Final[int]
still flag. Examples:
from typing import Final
JSONRPC_PARSE_ERROR: int = -32700
A2A_TASK_NOT_FOUND: int = -32001
_MAX_FRAME_SIZE: Final[int] = 16384
Do not register these in settings/definitions/. The precedence
chain is for values that an operator may legitimately tune; protocol
constants are part of the algorithm.
Bridge-config snapshot pattern (hot-reloadable AppState fields)¶
For controller / service knobs that should be hot-reloadable but cost
too much to resolve through ConfigResolver.get_*() on every request,
the canonical pattern is a frozen Pydantic snapshot on AppState
populated at startup and hot-swapped by a settings subscriber on
operator-driven changes. Reference implementation:
api.max_lifecycle_events_per_query consumed by
ActivityController.list_activities.
The pattern has four pieces:
- Frozen bridge model. A class in synthorg/settings/bridge_configs.py
  (e.g. ApiBridgeConfig) with model_config = ConfigDict(frozen=True,
  allow_inf_nan=False, extra="forbid"), one field per setting it
  carries, and defaults that match the registered defaults. The model is
  the single source of truth for the fallback value: no controller
  carries a duplicate constant.
- Resolver builder. ConfigResolver.get_<ns>_bridge_config() resolves
  every field at once via _resolve_bridge_fields().
- BridgeConfigState owner + accessors. The app_state.bridge_config
  owner default-constructs each bridge model so consumers always see a
  valid snapshot, even before _apply_bridge_config has run.
  app_state.bridge_config.<name> returns the current snapshot;
  app_state.bridge_config.swap_<name>(config) does a wholesale replace
  under a per-bridge threading.Lock;
  app_state.bridge_config.mutate_<name>({field: value, ...}) applies a
  partial update under the same lock so two concurrent subscribers
  cannot lose each other's writes.
- Settings subscriber. A SettingsSubscriber implementation in
  synthorg/settings/subscribers/<name>_bridge_subscriber.py whose
  _WATCHED set lists every hot-reloadable field. On change, the
  subscriber resolves the new value and calls mutate_* with the
  single-field update; mutate_* re-validates the merged dict via
  model_validate(...) against the field's Field(ge=..., le=...) bounds,
  so an out-of-range value raises ValidationError and the prior
  snapshot is retained. Module-load-time guard: every key in _WATCHED
  is asserted to exist on the bridge model so a typo or rename surfaces
  at import, not on the next operator hot-reload.
Use this pattern when the setting is hot-reloadable
(restart_required=False) but per-request resolver lookup would add
overhead or coupling. For restart-required knobs (e.g.
ws_auth_timeout_seconds) the simpler set_*() pattern in
_apply_bridge_config is sufficient.
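The snapshot-and-swap mechanics can be sketched with the standard library alone. This swaps in a frozen dataclass with __post_init__ validation in place of the frozen Pydantic model, so it raises ValueError rather than ValidationError; the class names, the default of 1000, and the 1..10000 bounds are hypothetical.

```python
import threading
from dataclasses import dataclass, replace
from typing import Any, Mapping


@dataclass(frozen=True)
class ApiBridgeSketch:
    # Hypothetical default and bounds; the real ones live on the bridge model.
    max_lifecycle_events_per_query: int = 1000

    def __post_init__(self) -> None:
        if not 1 <= self.max_lifecycle_events_per_query <= 10_000:
            raise ValueError("max_lifecycle_events_per_query out of range")


class BridgeStateSketch:
    """Owns the current snapshot and serialises updates with a lock."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._api = ApiBridgeSketch()  # valid snapshot from the start

    @property
    def api(self) -> ApiBridgeSketch:
        return self._api

    def swap_api(self, config: ApiBridgeSketch) -> None:
        with self._lock:
            self._api = config

    def mutate_api(self, updates: Mapping[str, Any]) -> ApiBridgeSketch:
        with self._lock:
            # replace() builds and validates a new snapshot; if validation
            # raises, the assignment never happens and the old one stays.
            self._api = replace(self._api, **updates)
            return self._api
```

Because readers only ever see a complete immutable snapshot, a request handler can read state.api once and use its fields without holding the lock.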
Bootstrap-wiring trace (ghost-wired settings gate)¶
A registered setting whose consuming machinery exists but is never instantiated at boot is ghost-wired: the value resolves cleanly through the precedence chain, but no code path that reads it ever runs in default config. The standing gate scripts/check_setting_to_startup_trace.py (pre-push and CI) detects two ghost-service patterns and matches settings to them via three matchers. The full mechanics, the suppression marker, and the baseline-file contract have their own reference: Bootstrap-Wiring Trace (Ghost-Wired Settings Gate).
Restart-required justification gate¶
A setting flagged restart_required=True (or read_only_post_init=True,
which implies it) is skipped by the settings-change dispatcher
(settings/dispatcher.py): the operator edit lands in the DB but no
subscriber runs, so the running process keeps the boot value. The audit
behind #2514 found the large majority of those flags were immutable by
omission, not necessity, and converted the per-request / per-tick knobs to
hot-reloadable using the existing seams:
- a per-request / per-call ConfigResolver.get_*() read (e.g.
  api.max_meeting_context_keys, api.readiness_probe_timeout_seconds,
  integrations.oauth_http_timeout_seconds);
- a set_*() setter on a live object pushed by a SettingsSubscriber
  (e.g. the WsAuthLimits knobs, the HttpBatchHandler HTTP batch knobs,
  backup.path, a2a.client_timeout_seconds,
  engine.timeout_enforcement_enabled);
- a bridge-config snapshot + subscriber (e.g. the
  tools.docker_sidecar_* resource limits, read per container launch
  through the sidecar cache);
- a reload_runtime_services trigger for knobs build_runtime_services
  already re-reads (the engine classifier / matcher knobs,
  external_api.enabled / provider_type,
  coordination.enable_coordination_middleware,
  budget.benchmark_provider / model_tier_overrides).
scripts/check_setting_restart_required_justified.py (pre-push + CI) keeps
this from regressing: every restart-bound definition must either carry a
per-line # lint-allow: restart-required -- <reason> marker on its
register(...) block, or sit on the baseline
scripts/setting_restart_required_baseline.txt (the genuine OS/transport
keeps plus the namespaces deferred to #2515 / #2516). It fails when a new
unjustified restart-bound setting appears and warns (passing) on a stale
baseline entry. Regenerate the baseline (rare, explicit approval) with
--update-baseline.
Security toggle write guardrail¶
security.enabled, audit_enabled, post_tool_scanning_enabled, and
output_scan_policy_type are hot-reloadable, but weakening them is a
deliberate-action decision: turning a boolean off, or switching
output_scan_policy_type to log_only, requires confirm=True plus a
non-blank reason and actor at the write path
(settings/write_governance.py, enforced centrally in
SettingsService.set / set_many, surfaced via the dedicated
POST /settings/security/import endpoint). Enabling / tightening applies
immediately with no gate. The per-request interceptor reads the live config
through app_state.security_runtime_config, which the
SecurityBridgeSettingsSubscriber swaps on an authorised change.
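The asymmetry of the guardrail (tightening is free, weakening needs confirmation) can be sketched as a single check. This is not the logic in settings/write_governance.py: the function name, the exception type, and the way a "weakening" change is detected are assumptions based only on the rule stated above.

```python
class WeakeningNotConfirmedError(Exception):
    """Hypothetical error for an unconfirmed weakening write."""


def guard_security_write(
    key: str,
    old: object,
    new: object,
    *,
    confirm: bool = False,
    reason: str = "",
    actor: str = "",
) -> bool:
    """Return True if the change weakens security; raise if it is ungoverned.

    Weakening means turning a boolean off, or switching
    output_scan_policy_type to log_only.
    """
    turning_off = old is True and new is False
    to_log_only = (
        key == "output_scan_policy_type"
        and new == "log_only"
        and old != "log_only"
    )
    weakening = turning_off or to_log_only
    if weakening and not (confirm and reason.strip() and actor.strip()):
        raise WeakeningNotConfirmedError(key)
    return weakening
```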
Kill-Switch Idiom (MANDATORY)¶
Every long-running async loop in src/synthorg/ MUST be pause-able
at runtime via a <namespace>.<service>_enabled boolean setting,
without restarting the process. The canonical shape:
- Register the flag in src/synthorg/settings/definitions/<ns>.py with
  SettingType.BOOLEAN, default="true", and a description that names the
  gated service. The setting participates in the full DB > env > default
  precedence chain.
- Add a fail-safe-to-enabled resolver helper next to the loop. The
  "no resolver wired" fast-path returns True directly so a service
  constructed in a test or pre-startup context (where
  app_state.has_config_resolver is False / config_resolver is None)
  does not crash on a None.get_bool access:
    async def _resolve_<x>_enabled(...) -> bool:
        if not app_state.has_config_resolver:
            return True
        try:
            return await app_state.config_resolver.get_bool(<ns>, "<x>_enabled")
        except asyncio.CancelledError:
            raise
        except (MemoryError, RecursionError):
            raise
        except Exception as exc:
            logger.warning(<event>, error_type=type(exc).__name__,
                           error=safe_error_description(exc))
            return True
- Gate the loop body per iteration (or per call for non-loop
  surfaces like NotificationDispatcher.dispatch):
    while not self._stop_event.is_set():
        if await self._resolve_enabled():
            await self._do_work()
        else:
            logger.debug(<paused_event>, reason="paused_by_setting")
        await asyncio.sleep(self._interval)
The fail-safe-to-enabled rule is non-negotiable: a settings-backend outage must not silently silence the surface. Operators silence by setting the value explicitly.
Reference implementations (symbol-only references; line numbers churn):
api.lifecycle_helpers._ticket_cleanup_loop,
api.lifecycle_helpers._audit_retention_loop,
api.webhook_cleanup._webhook_receipt_cleanup_loop,
providers.health_prober.ProviderHealthProber._run_loop,
notifications.dispatcher.NotificationDispatcher.dispatch,
communication.conflict_resolution.escalation.sweeper.EscalationExpirationSweeper._run.
Per-line opt-out:
# lint-allow: long-running-loop-kill-switch -- <reason> on the
while line itself, or on one of the two preceding source lines
(leading comment block / decorator). The justification is mandatory
and must be non-empty (mirrors the existing # lint-allow:
markers). Suppression is per-loop: a function with two unguarded
long-running loops needs two markers, otherwise a function-wide
opt-out could silently mask a new sibling loop added later.
Pre-existing not-yet-pause-able loops live in
scripts/long_running_loops_kill_switch_baseline.txt; the gate
fails when a NEW loop missing the kill-switch lands.
Enforced by scripts/check_long_running_loops_have_kill_switch.py
(pre-push + CI). Scope: the gate scans every long-running
while True: / while not <stop_event>.is_set(): inside an
async def under src/synthorg/, so the loop-bodied surfaces above
(_ticket_cleanup_loop, ProviderHealthProber._run_loop,
_webhook_receipt_cleanup_loop) are lint-enforced. Per-call
non-loop surfaces such as NotificationDispatcher.dispatch are
covered by project convention and reviewed by CodeRabbit / human
review, but they sit outside the AST gate's loop-shaped detection.
Sandbox image cache¶
The Pydantic field defaults in
src/synthorg/tools/sandbox/docker_config.py no longer read
SYNTHORG_SANDBOX_IMAGE / SYNTHORG_SIDECAR_IMAGE directly from
os.environ; the canonical resolution path is
tools.sandbox_image / tools.sidecar_image registered in
definitions/tools.py with env_var_override= matching the
historical env var names. _apply_bridge_config resolves both
once at startup and writes them into the process-singleton cache
in tools/sandbox/_image_resolution.py. Tests override the cache
via set_resolved_*_image(...); the autouse fixture
_isolate_sandbox_image_resolution in
tests/unit/tools/sandbox/conftest.py clears the cache around
every sandbox test.