Security & Trust Policies

Every tool invocation in SynthOrg passes through the SecOps security pipeline. This guide covers how to configure autonomy levels, trust strategies, approval workflows, custom policies, and output scanning. For the internal architecture of the security subsystem, see the Security reference.

Autonomy Levels

Autonomy levels control which actions require human approval. Set the company-wide level in config.autonomy.level, with optional per-agent overrides:

Level	Value	Behavior
Full	`full`	Agents execute all actions without approval
Semi	`semi`	Risky actions (deploy, db:admin, org:fire) require approval
Supervised	`supervised`	Most actions require approval
Locked	`locked`	All actions require approval

config:
  autonomy:
    level: semi

agents:
  - role: "Junior Developer"
    autonomy_level: supervised  # more restrictive than company default
  - role: "CEO"
    autonomy_level: full        # less restrictive than company default
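The override behavior above can be sketched in a few lines. This is a minimal illustration, not SynthOrg's implementation: the function name `effective_autonomy` is hypothetical, and it assumes that an agent without an `autonomy_level` inherits the company-wide value.

```python
from typing import Optional

# Autonomy level values from the table above.
AUTONOMY_LEVELS = ("full", "semi", "supervised", "locked")


def effective_autonomy(company_level: str, agent_level: Optional[str] = None) -> str:
    """Return the agent override when present, otherwise the company default."""
    level = agent_level if agent_level is not None else company_level
    if level not in AUTONOMY_LEVELS:
        raise ValueError(f"unknown autonomy level: {level!r}")
    return level


print(effective_autonomy("semi", "supervised"))  # agent with an override
print(effective_autonomy("semi"))                # agent without an override
```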

Trust Strategies

Trust strategies control how agents earn (or lose) access to higher-privilege tool categories over time. Configure via the trust section:

The available strategies are Disabled, Weighted, Per-Category, and Milestone.

Disabled: all agents start and remain at initial_level, with no automatic trust progression.

trust:
  strategy: disabled
  initial_level: standard

When to use: Simple setups, fully autonomous orgs, or when you manage trust externally.

Weighted: a trust score is computed from weighted factors, and agents are promoted when their score exceeds a threshold.

trust:
  strategy: weighted
  initial_level: restricted
  weights:
    task_difficulty: 0.3
    completion_rate: 0.25
    error_rate: 0.25
    human_feedback: 0.2
  promotion_thresholds:
    restricted_to_standard:
      score: 0.7
      requires_human_approval: false
    standard_to_elevated:
      score: 0.9
      requires_human_approval: true  # REQUIRED (security invariant)

Weights must sum to 1.0 (within 0.01 tolerance).

When to use: Gradual trust building based on agent performance metrics.
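As a rough sketch of the weighted strategy, the snippet below checks the sum-to-1.0 rule and computes a score as a plain weighted sum. The function names are hypothetical, and the scoring formula is an assumption: it presumes every factor is already normalized to the range 0 to 1 with higher meaning better (so an error rate would be supplied as one minus the rate). The source only states that the score is computed from weighted factors.

```python
import math


def validate_weights(weights: dict[str, float], tolerance: float = 0.01) -> None:
    """Raise if the weights do not sum to 1.0 within the stated tolerance."""
    total = sum(weights.values())
    if abs(total - 1.0) > tolerance:
        raise ValueError(f"weights sum to {total:.3f}, expected 1.0 +/- {tolerance}")


def trust_score(weights: dict[str, float], factors: dict[str, float]) -> float:
    """Weighted sum of normalized factors (assumed formula, see lead-in)."""
    validate_weights(weights)
    return sum(weights[name] * factors[name] for name in weights)


weights = {
    "task_difficulty": 0.3,
    "completion_rate": 0.25,
    "error_rate": 0.25,
    "human_feedback": 0.2,
}
factors = {name: 1.0 for name in weights}  # a perfect agent under this assumption
print(math.isclose(trust_score(weights, factors), 1.0))
```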

Per-Category: independent trust levels per action category. Each category can have its own promotion criteria.

trust:
  strategy: per_category
  initial_level: restricted
  initial_category_levels:
    code: restricted
    vcs: sandboxed
    deploy: sandboxed
  category_criteria:
    code:
      restricted_to_standard:
        tasks_completed: 10
        quality_score_min: 7.0
        requires_human_approval: false
      standard_to_elevated:
        tasks_completed: 50
        quality_score_min: 8.5
        requires_human_approval: true  # REQUIRED (security invariant)
    vcs:
      sandboxed_to_restricted:
        tasks_completed: 5
        quality_score_min: 7.0

When to use: Fine-grained control where some action categories are more sensitive than others.

Note

Every category in category_criteria must have a matching entry in initial_category_levels. Categories with criteria but no initial level produce a validation error.
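The rule in this note amounts to a set comparison between the two mappings. A minimal sketch, with a hypothetical helper name and the parsed `trust` section represented as a plain dictionary:

```python
def missing_initial_levels(trust: dict) -> list[str]:
    """List categories that have criteria but no initial level."""
    criteria = trust.get("category_criteria", {})
    initial = trust.get("initial_category_levels", {})
    return sorted(category for category in criteria if category not in initial)


trust = {
    "strategy": "per_category",
    "initial_category_levels": {"code": "restricted", "deploy": "sandboxed"},
    "category_criteria": {"code": {}, "vcs": {}},
}
print(missing_initial_levels(trust))  # "vcs" has criteria but no initial level
```

An empty result means the configuration satisfies the rule. Extra entries in `initial_category_levels` without criteria, such as `deploy` in the example config, are not flagged.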

Milestone: gate-based trust with explicit criteria per transition.

trust:
  strategy: milestone
  initial_level: restricted
  milestones:
    restricted_to_standard:
      tasks_completed: 20
      quality_score_min: 7.5
      time_active_days: 7
      clean_history_days: 3
      auto_promote: true
      requires_human_approval: false
    standard_to_elevated:
      tasks_completed: 100
      quality_score_min: 8.5
      time_active_days: 30
      clean_history_days: 14
      auto_promote: false
      requires_human_approval: true  # REQUIRED (security invariant)
  re_verification:
    enabled: true
    interval_days: 90
    decay_on_idle_days: 30
    decay_on_error_rate: 0.15

auto_promote and requires_human_approval are mutually exclusive per milestone: a single milestone cannot set both to true.

When to use: Organizations that want time-based gates and periodic re-verification.
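A milestone gate can be pictured as a set of minimum thresholds that must all hold at once. The sketch below is illustrative only: the class names and the `AgentStats` fields are hypothetical, and treating each threshold as inclusive (`>=`) is an assumption the source does not confirm.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Milestone:
    tasks_completed: int
    quality_score_min: float
    time_active_days: int
    clean_history_days: int
    auto_promote: bool
    requires_human_approval: bool

    def __post_init__(self) -> None:
        # Mirrors the mutual-exclusivity rule stated above.
        if self.auto_promote and self.requires_human_approval:
            raise ValueError("auto_promote and requires_human_approval are mutually exclusive")


@dataclass(frozen=True)
class AgentStats:  # hypothetical summary of an agent's history
    tasks_completed: int
    quality_score: float
    days_active: int
    clean_days: int


def meets_milestone(stats: AgentStats, milestone: Milestone) -> bool:
    """True when every gate is satisfied (inclusive thresholds assumed)."""
    return (
        stats.tasks_completed >= milestone.tasks_completed
        and stats.quality_score >= milestone.quality_score_min
        and stats.days_active >= milestone.time_active_days
        and stats.clean_days >= milestone.clean_history_days
    )


restricted_to_standard = Milestone(20, 7.5, 7, 3, auto_promote=True, requires_human_approval=False)
print(meets_milestone(AgentStats(25, 8.0, 10, 5), restricted_to_standard))
```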

Security invariant: standard_to_elevated

The standard_to_elevated transition must always set requires_human_approval: true, regardless of trust strategy. Validation enforces this and it cannot be overridden: setting requires_human_approval: false on this transition produces a validation error.
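A check for this invariant might look like the following sketch. The function name is hypothetical, and rejecting a transition that omits the flag entirely is an assumption: the source only says that an explicit `false` is rejected.

```python
def check_elevation_invariant(transitions: dict[str, dict]) -> None:
    """Raise unless standard_to_elevated, when present, requires human approval."""
    rule = transitions.get("standard_to_elevated")
    if rule is None:
        return  # the transition is not configured, so there is nothing to check
    if rule.get("requires_human_approval") is not True:
        raise ValueError("standard_to_elevated must set requires_human_approval: true")


check_elevation_invariant({
    "restricted_to_standard": {"score": 0.7, "requires_human_approval": False},
    "standard_to_elevated": {"score": 0.9, "requires_human_approval": True},
})
print("invariant holds")
```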

Tool Access Levels

Trust levels map to tool access categories:

Level	Value	Access
Sandboxed	`sandboxed`	Sandbox-only execution, no filesystem or network
Restricted	`restricted`	Read-only filesystem, limited network
Standard	`standard`	Read-write filesystem, version control, code execution
Elevated	`elevated`	All categories including deployment, database admin
Custom	`custom`	Explicit allow/deny lists (ignores the hierarchy)

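The sandboxed, restricted, standard, and elevated levels form a hierarchy in which each level includes all categories from the levels below it. The custom level sits outside this hierarchy and relies on its explicit allow/deny lists.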
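One way to picture the hierarchy is as an ordered tuple, where a level satisfies a requirement if it sits at or above the required level. This is a sketch with a hypothetical function name. The `custom` level is rejected here because it is resolved through allow/deny lists instead.

```python
# Ordered from least to most privileged, per the table above.
HIERARCHY = ("sandboxed", "restricted", "standard", "elevated")


def level_includes(granted: str, required: str) -> bool:
    """True when `granted` is at or above `required` in the hierarchy."""
    for level in (granted, required):
        if level not in HIERARCHY:
            raise ValueError(f"{level!r} is not part of the hierarchy")
    return HIERARCHY.index(granted) >= HIERARCHY.index(required)


print(level_includes("standard", "restricted"))
print(level_includes("restricted", "elevated"))
```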

Re-verification

For the milestone strategy, re-verification periodically re-evaluates trust:

Field	Type	Default	Description
`enabled`	bool	`false`	Whether re-verification is active
`interval_days`	int	`90`	Days between re-verifications
`decay_on_idle_days`	int	`30`	Demote one level after this many idle days
`decay_on_error_rate`	float	`0.15`	Demote if error rate exceeds this threshold
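The two decay triggers can be sketched as a single predicate using the defaults from the table. Several details here are assumptions and not documented behavior: the function names are hypothetical, the idle check is treated as inclusive, and demotion is floored at `sandboxed`.

```python
HIERARCHY = ("sandboxed", "restricted", "standard", "elevated")


def should_demote(
    idle_days: int,
    error_rate: float,
    decay_on_idle_days: int = 30,
    decay_on_error_rate: float = 0.15,
) -> bool:
    """True when either decay trigger fires."""
    idle_too_long = idle_days >= decay_on_idle_days    # inclusive: an assumption
    too_many_errors = error_rate > decay_on_error_rate  # "exceeds" per the table
    return idle_too_long or too_many_errors


def demote_one_level(level: str) -> str:
    """Step one level down, never below the lowest level (assumed floor)."""
    index = HIERARCHY.index(level)
    return HIERARCHY[max(index - 1, 0)]


if should_demote(idle_days=45, error_rate=0.02):
    print(demote_one_level("standard"))
```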

Security Configuration

The security section controls the SecOps rule engine, output scanning, and audit logging:

security:
  enabled: true
  audit_enabled: true
  post_tool_scanning_enabled: true
  output_scan_policy_type: autonomy_tiered
  hard_deny_action_types:
    - "deploy:production"
    - "db:admin"
    - "org:fire"
  auto_approve_action_types:
    - "code:read"
    - "docs:write"

Security Fields

Field	Type	Default	Description
`enabled`	bool	`true`	Master switch for the security subsystem
`audit_enabled`	bool	`true`	Record audit entries for all evaluations
`post_tool_scanning_enabled`	bool	`true`	Scan tool output for secrets and PII
`hard_deny_action_types`	list	`["deploy:production", "db:admin", "org:fire"]`	Actions always denied
`auto_approve_action_types`	list	`["code:read", "docs:write"]`	Actions always approved
`output_scan_policy_type`	string	`"autonomy_tiered"`	Output scan response policy
`custom_policies`	list	`[]`	User-defined policy rules

Warning

hard_deny_action_types and auto_approve_action_types must not overlap. Overlapping entries produce a validation error.
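The overlap rule reduces to a set intersection. A minimal sketch with a hypothetical helper name:

```python
def overlapping_action_types(hard_deny: list[str], auto_approve: list[str]) -> list[str]:
    """Return action types that appear in both lists."""
    return sorted(set(hard_deny) & set(auto_approve))


hard_deny = ["deploy:production", "db:admin", "org:fire"]
auto_approve = ["code:read", "docs:write", "db:admin"]  # "db:admin" is a deliberate mistake
print(overlapping_action_types(hard_deny, auto_approve))
```

Any non-empty result corresponds to a configuration that validation would reject.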

Rule Engine

The rule engine runs synchronous checks against every tool invocation:

security:
  rule_engine:
    credential_patterns_enabled: true
    data_leak_detection_enabled: true
    destructive_op_detection_enabled: true
    path_traversal_detection_enabled: true
    max_argument_length: 100000
    custom_allow_bypasses_detectors: false

Built-in Detectors

Detector	Config Flag	What It Catches
Credential patterns	`credential_patterns_enabled`	API keys, passwords, tokens in arguments
Data leak detection	`data_leak_detection_enabled`	PII, sensitive file paths, internal URLs
Destructive operations	`destructive_op_detection_enabled`	`rm -rf`, `DROP TABLE`, force-push
Path traversal	`path_traversal_detection_enabled`	`../` sequences, path escape attempts

Each detector can be independently enabled or disabled.
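To show how independently toggled detectors can fit together, the sketch below wires two of the config flags to simple regular expressions. The patterns are deliberately simplified stand-ins, not the built-in detection rules, and both the `run_detectors` name and the choice to treat a missing flag as enabled are assumptions.

```python
import re

# Simplified illustrative patterns keyed by the config flag that enables them.
DETECTORS = {
    "destructive_op_detection_enabled": re.compile(r"rm\s+-rf|DROP\s+TABLE", re.IGNORECASE),
    "path_traversal_detection_enabled": re.compile(r"\.\./"),
}


def run_detectors(argument: str, config: dict[str, bool]) -> list[str]:
    """Return the flags of every enabled detector that matches the argument."""
    findings = []
    for flag, pattern in DETECTORS.items():
        if config.get(flag, True) and pattern.search(argument):
            findings.append(flag)
    return findings


print(run_detectors("cat ../../etc/passwd", {}))
print(run_detectors("rm -rf /tmp/build", {"destructive_op_detection_enabled": False}))
```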

Custom Security Policies

Define custom rules to allow, deny, or escalate specific action types:

security:
  custom_policies:
    - name: "block-external-comms"
      description: "Prevent agents from sending external communications"
      action_types:
        - "comms:external"
      verdict: deny
      risk_level: high
      enabled: true
    - name: "escalate-deploys"
      description: "Escalate staging deployments for review"
      action_types:
        - "deploy:staging"
      verdict: escalate
      risk_level: medium

Policy Rule Fields

Field	Type	Default	Description
`name`	string	(required)	Unique rule identifier
`description`	string	`""`	Human-readable description
`action_types`	list	`[]`	Action types this rule applies to (`category:action` format)
`verdict`	string	`"deny"`	Verdict: `allow`, `deny`, or `escalate`
`risk_level`	string	`"medium"`	Risk level: `low`, `medium`, `high`, `critical`
`enabled`	bool	`true`	Whether this rule is active

Action Types

Action types follow a category:action format. Built-in types include:

Category	Actions
`code`	`read`, `write`, `create`, `delete`, `refactor`
`test`	`write`, `run`
`docs`	`write`
`vcs`	`read`, `commit`, `push`, `branch`
`deploy`	`staging`, `production`
`comms`	`internal`, `external`
`budget`	`spend`, `exceed`
`org`	`hire`, `fire`, `promote`
`db`	`query`, `mutate`, `admin`
`arch`	`decide`

Bypass mode restriction

When custom_allow_bypasses_detectors is true, custom policies are evaluated before the built-in detectors in the pipeline. In this mode, custom policies may only use the deny verdict: allow and escalate would skip all security detectors, so they are rejected at validation time.
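The bypass restriction can be expressed as a filter over the configured policies. This sketch uses a hypothetical function name and applies the documented default verdict of `deny` when a rule omits the field. Whether disabled rules are exempt is not stated in the source, so every rule is checked here.

```python
def invalid_bypass_policies(custom_policies: list[dict], bypass_enabled: bool) -> list[str]:
    """Return names of rules that would be rejected in bypass mode."""
    if not bypass_enabled:
        return []
    return [
        policy["name"]
        for policy in custom_policies
        if policy.get("verdict", "deny") != "deny"
    ]


policies = [
    {"name": "block-external-comms", "verdict": "deny"},
    {"name": "escalate-deploys", "verdict": "escalate"},
]
print(invalid_bypass_policies(policies, bypass_enabled=True))
```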

LLM Security Fallback

For actions that the rule engine cannot classify with high confidence, an LLM from a different provider family can provide cross-validation:

security:
  llm_fallback:
    enabled: true
    model: "example-medium-001"
    timeout_seconds: 10.0
    max_input_tokens: 2000
    on_error: escalate
    reason_visibility: generic
    argument_truncation: per_value

Field	Type	Default	Description
`enabled`	bool	`false`	Whether LLM fallback is active
`model`	string	`null`	Model ID (auto-selects cross-family if null)
`timeout_seconds`	float	`10.0`	Maximum time for the LLM call
`max_input_tokens`	int	`2000`	Token budget cap for eval prompts
`on_error`	string	`"escalate"`	Policy when LLM call fails: `use_rule_verdict`, `escalate`, `deny`
`reason_visibility`	string	`"generic"`	How much of the verdict reason is exposed: `full`, `generic`, `category`
`argument_truncation`	string	`"per_value"`	Truncation strategy: `whole_string`, `per_value`, `keys_and_values`
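The three `on_error` policies can be sketched as a small wrapper around the LLM call. The names `evaluate_with_fallback` and `llm_call` are hypothetical. The callable stands in for whatever performs the model request and is assumed to raise on timeout or failure.

```python
from typing import Callable

ON_ERROR_POLICIES = ("use_rule_verdict", "escalate", "deny")


def evaluate_with_fallback(
    rule_verdict: str,
    llm_call: Callable[[], str],
    on_error: str = "escalate",
) -> str:
    """Return the LLM verdict, or apply the on_error policy if the call fails."""
    if on_error not in ON_ERROR_POLICIES:
        raise ValueError(f"unknown on_error policy: {on_error!r}")
    try:
        return llm_call()
    except Exception:
        # use_rule_verdict keeps the rule engine's answer; the other two
        # policies are themselves the resulting verdict.
        return rule_verdict if on_error == "use_rule_verdict" else on_error


def timed_out() -> str:
    raise TimeoutError("LLM call exceeded timeout_seconds")


print(evaluate_with_fallback("allow", timed_out))
print(evaluate_with_fallback("allow", timed_out, on_error="use_rule_verdict"))
```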

Output Scanning

After tool execution, the output scanner checks for leaked secrets and PII:

Policy	Value	Behavior
Redact	`redact`	Replace matches with `[REDACTED]` and return
Withhold	`withhold`	Clear the entire output (fail-closed)
Log only	`log_only`	Log findings but pass output through
Autonomy-tiered	`autonomy_tiered`	Delegate response based on agent's autonomy level (default; falls back to `redact`)

security:
  output_scan_policy_type: autonomy_tiered
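The difference between the policies is easiest to see on a concrete output string. The sketch below is illustrative only: the function name is hypothetical, the single pattern (the shape of an AWS access key ID) stands in for the real scanner's rules, and `autonomy_tiered` is reduced to its documented `redact` fallback because the per-level mapping is not described here.

```python
import logging
import re

# One illustrative secret pattern; a real scanner would use many more.
SECRET_PATTERN = re.compile(r"AKIA[0-9A-Z]{16}")


def apply_scan_policy(output: str, policy: str) -> str:
    """Apply an output scan policy to tool output containing possible secrets."""
    if SECRET_PATTERN.search(output) is None:
        return output  # nothing found, so the output passes through unchanged
    if policy in ("redact", "autonomy_tiered"):  # tiered falls back to redact
        return SECRET_PATTERN.sub("[REDACTED]", output)
    if policy == "withhold":
        return ""  # fail closed: clear the entire output
    if policy == "log_only":
        logging.warning("secret pattern found in tool output")
        return output
    raise ValueError(f"unknown output scan policy: {policy!r}")


sample = "key=AKIAABCDEFGHIJKLMNOP"
print(apply_scan_policy(sample, "redact"))
```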