SEC-1: Prompt Safety, HTML Parsing, and Secret-Log Redaction¶
On-demand reference for the SEC-1 cluster. Short rules in CLAUDE.md:
- Wrap untrusted strings at LLM call sites via
wrap_untrusted()fromsynthorg.engine.prompt_safety. - Never call
lxml.html.fromstringdirectly; useHTMLParseGuard. - Never call any
loggerseverity (exception/warning/error/info/debug) witherror=str(exc)(or any wrapper that smugglesstr(exc)through, including[:200],or fallback, f-strings, and**{"error": ...}dict-unpack); uselogger.warning(EVENT, error_type=type(exc).__name__, error=safe_error_description(exc))instead. The rule is global, not credential-bearing-paths-only:logger.exceptionadds traceback frame-locals, andstr(exc)onhttpx.HTTPStatusError/psycopg.Error/ OAuth provider errors embeds URL or POSTed body content into the log record at every severity level.
Untrusted-content fences at LLM call sites¶
Any attacker-controllable string interpolated into an LLM prompt MUST be wrapped via wrap_untrusted(tag, content) from synthorg.engine.prompt_safety, and the enclosing system prompt MUST append untrusted_content_directive(tags) so the model is explicitly told those fences contain untrusted data.
Attacker-controllable surfaces¶
Task title / description, acceptance criteria, artifact payloads, tool results, tool-invocation arguments, code diffs, multi-tenant strategy config, proposal / alert / query fields, rule metadata, triage requirements, generator context, peer-agent contributions in meeting protocols.
Standard tags¶
TAG_TASK_DATATAG_TASK_FACTTAG_UNTRUSTED_ARTIFACTTAG_TOOL_RESULTTAG_TOOL_ARGUMENTSTAG_CODE_DIFFTAG_CONFIG_VALUETAG_CRITERIA_JSONTAG_PEER_CONTRIBUTIONTAG_MEMORY_ENTRY
Fence breakout protection¶
wrap_untrusted escapes literal </tag> in content (case-insensitively, including whitespace-terminated variants like </tag > or </tag\t>).
Key reference call sites¶
This list is non-exhaustive; treat it as a navigational starting point for new SEC-1 audits. Any LLM call site that interpolates an attacker-controllable string is in scope, whether or not it appears here.
format_task_instructionTaskLedgerMiddlewareLLMRubricGrader._prepare_payload_text_wrap_tool_resultbuild_review_message(semantic_llm_prompt)build_strategic_prompt_sections_encode_decomposer_payloadbuild_task_message/build_system_message(engine/decomposition/llm_prompt.py)separate_analyzer._build_user_message(evolution proposer)LlmSecurityEvaluator._build_messages(tool-invocation arguments viaTAG_TOOL_ARGUMENTS)ChiefOfStaffChat.explain_proposal/.explain_alert/.ask(three surfaces undermeta/chief_of_staff/chat.pyplus directive-append inprompts.pytemplates)CodeModificationStrategy._build_user_prompt(rule metadata + signal context)_BaseSemanticDetector._prompt(four subclasses inengine/classification/semantic_detectors.py)LLMGenerator._build_prompt(client/generators/llm.py)AgentIntake._build_prompt(engine/intake/strategies/agent_intake.py)LLMConsolidationStrategy._build_user_promptand._build_system_prompt(memory/consolidation/llm_strategy.py): wraps each entry underTAG_MEMORY_ENTRY; trajectory-context entries reuse the same tag.LlmCalibrationSampler._build_prompt(hr/performance/llm_calibration_sampler.py): wraps the free-forminteraction_summaryunderTAG_TASK_DATA; bounded numeric metrics are emitted as plain text.SuccessMemoryProposer._build_user_messageand module_SYSTEM_PROMPT(memory/procedural/success_proposer.py): execution context is fenced underTAG_TASK_DATA.SafetyClassifier._build_messages(security/safety_classifier.py): the actiondescription(only attacker-controllable field) is fenced underTAG_TASK_DATA; bounded label fields (tool name, action type, risk level) stayhtml.escaped. The system prompt is computed lazily via_system_prompt()to avoid a circular import throughsynthorg.engine.__init__.- Meeting protocol prompt builders (peer-contribution wrapping):
build_agenda_prompt(communication/meeting/_prompts.py): wraps agenda title / context / items inTAG_TASK_DATARoundRobinProtocol.runandRoundRobinProtocol._run_discussion_rounds(communication/meeting/round_robin.py): both transcript-build paths wrap each turn's content via the shared_format_transcript_entryhelper usingTAG_PEER_CONTRIBUTION_build_conflict_check_prompt/_build_discussion_prompt/_build_synthesis_prompt(communication/meeting/structured_phases.py)_build_synthesis_prompt(communication/meeting/position_papers.py)_render_system_promptincommunication/meeting/agent_caller.pyappends the directive listing bothTAG_TASK_DATAandTAG_PEER_CONTRIBUTIONfor every meeting LLM call
Completion config pinning¶
LLM sites that previously invoked provider.complete() without an explicit CompletionConfig now pin temperature + max_tokens at construction (via __init__ params) so prompt-fingerprint stability can be asserted in tests.
Injection detector (tool results)¶
Tool-result interpolation additionally runs an advisory injection-pattern detector (TOOL_INJECTION_PATTERN_DETECTED) covering closing-tag look-alikes for every standard fence (</task-data>, </task-fact>, </tool-result>, </tool-arguments>, </untrusted-artifact>, </code-diff>, </config-value>, </criteria-json>, </peer-contribution>) plus common override phrases. The telemetry sample is scrubbed via scrub_secret_tokens before logging.
HTML parsing: XXE protection¶
Never call lxml.html.fromstring directly on attacker-controlled input. Use HTMLParseGuard in synthorg.tools.html_parse_guard, which:
- Pre-scans for DOCTYPE with SYSTEM/PUBLIC identifiers and any
<!ENTITY>declaration (rejecting viaXXEDetectedError,is_retryable=False). - Parses with a module-scope
lxml.html.HTMLParser(no_network=True, remove_blank_text=True, recover=True, huge_tree=False).
sanitize() catches XXEDetectedError explicitly so the pre-scan's TOOL_HTML_PARSE_XXE_DETECTED event is the single log entry per rejection (no duplicate TOOL_HTML_PARSE_ERROR). Generic parse failures log error=safe_error_description(exc) without exc_info=True so attacker-controlled payload bytes are not serialised via traceback frame locals.
Secret-log redaction¶
NEVER use these patterns, anywhere in the codebase:
logger.exception(EVENT, error=str(exc))
logger.warning(EVENT, error=str(exc))
logger.error(EVENT, error=str(exc))
logger.info(EVENT, error=str(exc))
logger.debug(EVENT, error=str(exc))
The rule is unconditional. The risk is most acute on credential-bearing paths (OAuth flows, secret backends, settings encryption, A2A client/gateway, API auth middleware, persistence repos), but the pattern is forbidden globally because:
logger.exceptionattaches a traceback whose serialised frame-locals can leakclient_secret/refresh_token/ Fernet ciphertext sitting on the stack at any call site.str(exc)onhttpx.HTTPStatusError/psycopg.Error/ similar embeds URL or posted credential bodies into the message field. The embedded-URL/body risk is independent of severity: adebug/info/warning/errorcall still ends up shipping the credential to whatever sink the operator wires the logger to. The gate therefore covers all five severity methods.
A site that "doesn't handle credentials today" can be one refactor away from carrying a request body or connection string into its frame.
Use instead:
from synthorg.observability import safe_error_description
logger.warning(EVENT, error_type=type(exc).__name__, error=safe_error_description(exc))
exc_info=True is forbidden by default (the structlog exc-info processor still serialises traceback frame-locals even when the error= field is redacted). For genuine framework-boundary handlers that operate downstream of a frame-local scrubber (e.g., a hardened crash sink), opt out per-line with # lint-allow: exc-info -- <reason> on the same physical line as the exc_info=True, keyword. The reason field is mandatory and non-empty.
Caller-facing detail is preserved via raise ... from exc.
Belt-and-braces masking¶
The scrub_event_fields structlog processor masks every log record (covering escaped-quote JSON values, URL form values with stray % bytes, and Authorization: headers).
Pre-commit gate¶
scripts/check_logger_exception_str_exc.py enforces two rules unconditionally (no global allowlist, no baseline) for every logger severity (exception, warning, error, info, debug) on bare logger, attribute-chain loggers (self._logger, audit_logger), or any Name whose id contains logger. The exc_info=True rule (rule 2 below) supports a required same-line per-call opt-out marker (# lint-allow: exc-info -- <reason> with a mandatory non-empty reason); this is a per-instance carve-out for genuine framework-boundary handlers, not a global allowlist or list-based baseline:
- Leak-shape rule (
error=value): theerror=value subtree is walked via a customast.walktraversal that excludesCall.argsand class-introspection chains (sof"{type(exc).__name__}",f"{exc.__class__.__name__}",f"{safe_error_description(exc)}"are not flagged) and that flags any of: str(<exc_like>)calls where<exc_like>isName/Attribute/Subscript(coversstr(exc),str(self._inner),str(exc.args[0])).FormattedValueinterpolation with conversion-1/!s/!r/!aof any leaf whose Name id or Attribute terminal-attr matches_EXCEPTION_LEAF_NAMES(exc, e, err, error, exception, cause, original, inner, _inner).- One-level Name-binding indirection:
error_msg = str(exc); ...; error=error_msg(or any RHS leak shape includingf"...{exc}...") -- the alias is collected per-function-scope and flagged when later passed aserror=. -
Wrapper combinations are walked:
str(exc)[:200](Subscript),str(exc) or fallback(BoolOp),str(exc) if cond else fallback(IfExp),str(exc) + " ctx"(BinOp),f"failed: {str(exc)}"(JoinedStr),**{"error": str(exc)}(Dict-unpack). -
exc_info=Truerule: any literalexc_info=Truekwarg on a logger call is flagged, with a per-line# lint-allow: exc-info -- <reason>opt-out (mandatory non-empty reason). The marker must sit on the same physical line as theexc_info=True,keyword so reviewers and tooling can locate the opt-out without scanning the file.
The matcher is the source of truth; the gate's docstring (scripts/check_logger_exception_str_exc.py) describes the AST shapes covered and the rationale per-rule. The script's filename is preserved (rather than renamed) so the pre-commit hook ID no-new-logger-exception-str-exc stays stable in .pre-commit-config.yaml and CI job references.
OTLP span redaction posture¶
The structlog secret-log redaction policy above covers the structlog sink only: log records that flow through synthorg.observability.get_logger. OpenTelemetry spans are a separate transport (OTLP exporter), so the structlog exc_info=True ban does not transitively cover spans. Instead, the per-transport rules are:
- Span exception attributes: never call
span.record_exception(exc)in production code paths -- it serialises the full traceback (and frame locals) into the OTLP exporter, bypassing every redaction step the structlog sink applies. The middleware's exception handler insrc/synthorg/api/middleware.pyinstead sets the OTel-semconv attributes directly:
span.set_attribute("exception.type", type(exc).__name__)
span.set_attribute("exception.message", safe_error_description(exc))
span.set_status(Status(StatusCode.ERROR, type(exc).__name__))
The message is scrubbed via safe_error_description so credentials embedded in exception strings (httpx response bodies, psycopg connection strings, OAuth tokens) cannot reach the OTLP exporter.
-
Auto-instrumentation opt-out: when wrapping code in
tracer.start_as_current_span(...), passrecord_exception=Falseandset_status_on_exception=Falseso the OTel SDK's default exception-on-context-exit behaviour does not undo the redaction by stampingstr(exc)(unscrubbed) into the span before theset_attributecalls run. -
Span events: code that calls
span.add_event(name, attributes)is responsible for applyingsafe_error_description(or equivalent scrubbing) to every attribute that may carry exception strings, request bodies, or other attacker-controllable content.
The pre-commit check_logger_exception_str_exc.py gate does not cover OTel spans (it AST-walks logger calls only). New OTel call sites must self-police; reviewers should reject any span.record_exception outside test fixtures.