Providers
LLM provider abstraction -- protocol, base class, drivers, capabilities, routing, and resilience.
Protocol
protocol
Typed protocol for completion providers.
The engine and tests type-hint against CompletionProvider for loose
coupling. Concrete adapters and test doubles satisfy it structurally.
CompletionProvider
Bases: Protocol
Structural interface every LLM provider adapter must satisfy.
Defines three async methods: complete for non-streaming chat
completion, stream for streaming completion, and
get_model_capabilities for capability metadata lookup.
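Structural typing means a class satisfies the protocol by having matching methods, with no inheritance required. The sketch below is a reduced illustration only: the two dataclasses are hypothetical stand-ins for the real domain models, only complete is modelled, and runtime_checkable is added here purely so the isinstance check works (the source does not say whether the real protocol uses it).

```python
import asyncio
from dataclasses import dataclass
from typing import Protocol, runtime_checkable


# Hypothetical stand-ins for the real domain models (assumptions for this sketch).
@dataclass(frozen=True)
class ChatMessage:
    role: str
    content: str


@dataclass(frozen=True)
class CompletionResponse:
    content: str
    model: str


@runtime_checkable
class CompletionProvider(Protocol):
    """Reduced protocol: only `complete` is shown."""

    async def complete(
        self, messages: list[ChatMessage], model: str
    ) -> CompletionResponse: ...


class EchoProvider:
    """Test double: no base class, it matches the protocol by shape alone."""

    async def complete(
        self, messages: list[ChatMessage], model: str
    ) -> CompletionResponse:
        return CompletionResponse(content=messages[-1].content, model=model)


async def ask(provider: CompletionProvider, text: str) -> str:
    """Caller code type-hints against the protocol, not a concrete adapter."""
    response = await provider.complete([ChatMessage("user", text)], "example-model")
    return response.content
```

Because ask depends only on the protocol, a real adapter and EchoProvider are interchangeable from its point of view.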
complete
async
Execute a non-streaming chat completion.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| messages | list[ChatMessage] | Conversation history. | required |
| model | str | Model identifier to use. | required |
| tools | list[ToolDefinition] \| None | Available tools for function calling. | None |
| config | CompletionConfig \| None | Optional completion parameters. | None |
Returns:
| Type | Description |
|---|---|
| CompletionResponse | The full completion response. |
Source code in src/synthorg/providers/protocol.py
stream
async
Execute a streaming chat completion.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| messages | list[ChatMessage] | Conversation history. | required |
| model | str | Model identifier to use. | required |
| tools | list[ToolDefinition] \| None | Available tools for function calling. | None |
| config | CompletionConfig \| None | Optional completion parameters. | None |
Returns:
| Type | Description |
|---|---|
| AsyncIterator[StreamChunk] | Async iterator of stream chunks. |
Source code in src/synthorg/providers/protocol.py
get_model_capabilities
async
Return capability metadata for the given model.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| model | str | Model identifier. | required |
Returns:
| Type | Description |
|---|---|
| ModelCapabilities | Static capability and cost information. |
Base Provider
base
Abstract base class for completion providers.
Concrete adapters subclass BaseCompletionProvider and implement
the _do_* hooks. The base class handles input validation,
automatic retry, rate limiting, and provides a cost-computation helper.
BaseCompletionProvider
Bases: ABC
Shared base for all completion provider adapters.
Subclasses implement three hooks:
- _do_complete -- raw non-streaming call
- _do_stream -- raw streaming call
- _do_get_model_capabilities -- capability lookup
The public methods validate inputs before delegating to hooks.
When a retry_handler and/or rate_limiter are provided,
calls are automatically wrapped with retry and rate-limiting logic.
A static compute_cost helper is available for subclasses to
build TokenUsage records from raw token counts.
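The validate-then-delegate split described above is a template-method pattern. A minimal sketch follows, assuming simplified signatures (plain strings instead of the real models) and omitting retry and rate limiting entirely; SketchBaseProvider and UppercaseProvider are hypothetical names, and the local InvalidRequestError is a stand-in for the provider-layer error.

```python
import asyncio
from abc import ABC, abstractmethod


class InvalidRequestError(Exception):
    """Stand-in for the provider-layer error of the same name."""


class SketchBaseProvider(ABC):
    """Hypothetical, reduced base class: validate once, then call the hook."""

    async def complete(self, messages: list[str], model: str) -> str:
        # Validation lives in the public method so every subclass gets it.
        if not messages:
            raise InvalidRequestError("messages must not be empty")
        if not model.strip():
            raise InvalidRequestError("model must not be blank")
        return await self._do_complete(messages, model)

    @abstractmethod
    async def _do_complete(self, messages: list[str], model: str) -> str:
        """Raw provider call; subclasses implement only this hook."""


class UppercaseProvider(SketchBaseProvider):
    """Toy adapter: 'completes' by upper-casing the last message."""

    async def _do_complete(self, messages: list[str], model: str) -> str:
        return messages[-1].upper()
```

A subclass never re-implements validation; it only supplies the raw call.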
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| retry_handler | RetryHandler \| None | Optional retry handler for transient errors. | None |
| rate_limiter | RateLimiter \| None | Optional client-side rate limiter. | None |
Source code in src/synthorg/providers/base.py
complete
async
Validate inputs, delegate to _do_complete.
Applies rate limiting and retry automatically when configured.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| messages | list[ChatMessage] | Conversation history. | required |
| model | str | Model identifier to use. | required |
| tools | list[ToolDefinition] \| None | Available tools for function calling. | None |
| config | CompletionConfig \| None | Optional completion parameters. | None |
Returns:
| Type | Description |
|---|---|
| CompletionResponse | The completion response. |
Raises:
| Type | Description |
|---|---|
| InvalidRequestError | If messages are empty or model is blank. |
| RetryExhaustedError | If all retries are exhausted. |
Source code in src/synthorg/providers/base.py
stream
async
Validate inputs, delegate to _do_stream.
Only the initial connection setup is retried; mid-stream errors are not retried.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| messages | list[ChatMessage] | Conversation history. | required |
| model | str | Model identifier to use. | required |
| tools | list[ToolDefinition] \| None | Available tools for function calling. | None |
| config | CompletionConfig \| None | Optional completion parameters. | None |
Returns:
| Type | Description |
|---|---|
| AsyncIterator[StreamChunk] | Async iterator of stream chunks. |
Raises:
| Type | Description |
|---|---|
| InvalidRequestError | If messages are empty or model is blank. |
| RetryExhaustedError | If all retries are exhausted. |
Source code in src/synthorg/providers/base.py
get_model_capabilities
async
Validate model identifier, delegate to _do_get_model_capabilities.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| model | str | Model identifier. | required |
Returns:
| Type | Description |
|---|---|
| ModelCapabilities | Static capability and cost information. |
Raises:
| Type | Description |
|---|---|
| InvalidRequestError | If model is blank. |
Source code in src/synthorg/providers/base.py
compute_cost
staticmethod
Build a TokenUsage from raw token counts and per-1k rates.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| input_tokens | int | Number of input tokens (must be >= 0). | required |
| output_tokens | int | Number of output tokens (must be >= 0). | required |
| cost_per_1k_input | float | Cost per 1,000 input tokens in USD (base currency; finite and >= 0). | required |
| cost_per_1k_output | float | Cost per 1,000 output tokens in USD (base currency; finite and >= 0). | required |
Returns:
| Type | Description |
|---|---|
| TokenUsage | Populated |
Raises:
| Type | Description |
|---|---|
| InvalidRequestError | If any parameter is negative or non-finite. |
Source code in src/synthorg/providers/base.py
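As a worked illustration of per-1k pricing, the sketch below assumes the cost is simply tokens / 1000 times the rate, summed over input and output. The source does not show the actual formula or any rounding, so treat this as an assumption; the TokenUsage dataclass is a stand-in, and ValueError stands in for InvalidRequestError.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class TokenUsage:
    """Stand-in with only the fields listed in this reference."""

    input_tokens: int
    output_tokens: int
    cost_usd: float


def compute_cost(
    input_tokens: int,
    output_tokens: int,
    cost_per_1k_input: float,
    cost_per_1k_output: float,
) -> TokenUsage:
    """Assumed formula: (tokens / 1000) * rate, summed over input and output."""
    for value in (input_tokens, output_tokens, cost_per_1k_input, cost_per_1k_output):
        # Reject negative and non-finite inputs (ValueError is a stand-in here).
        if not math.isfinite(value) or value < 0:
            raise ValueError("parameters must be finite and >= 0")
    cost = (input_tokens / 1000) * cost_per_1k_input
    cost += (output_tokens / 1000) * cost_per_1k_output
    return TokenUsage(input_tokens, output_tokens, cost)


# 2,000 input tokens at 0.5 per 1k plus 500 output tokens at 2.0 per 1k.
usage = compute_cost(2000, 500, cost_per_1k_input=0.5, cost_per_1k_output=2.0)
```

Under this assumed formula the example works out to 1.0 USD for input plus 1.0 USD for output.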
Models
models
Provider-layer domain models for chat completion requests and responses.
ZERO_TOKEN_USAGE
module-attribute
ZERO_TOKEN_USAGE = TokenUsage(input_tokens=0, output_tokens=0, cost_usd=0.0)
Additive identity for TokenUsage.
TokenUsage
pydantic-model
Bases: BaseModel
Token counts and cost for a single completion call.
This is the lightweight provider-layer record. The budget layer's
synthorg.budget.CostRecord adds agent/task context around it.
Attributes:
| Name | Type | Description |
|---|---|---|
| input_tokens | int | Number of input (prompt) tokens. |
| output_tokens | int | Number of output (completion) tokens. |
| total_tokens | int | Sum of input and output tokens (computed). |
| cost_usd | float | Estimated cost in USD (base currency) for this call. |
Config:
- frozen: True
- allow_inf_nan: False
Fields:
- input_tokens (int)
- output_tokens (int)
- cost_usd (float)
ToolDefinition
pydantic-model
Bases: BaseModel
Schema for a tool the model can invoke.
Uses raw JSON Schema for parameters_schema because every LLM
provider consumes it natively.
Note
The parameters_schema dict is shallowly frozen by Pydantic's
frozen=True -- field reassignment is prevented but nested
contents can still be mutated in place. BaseTool.to_definition()
provides a deep-copied schema, and ToolInvoker deep-copies
arguments at the execution boundary, so no additional caller-side
copying is needed for standard tool/provider workflows. Direct
consumers outside these paths should deep-copy if they intend to
modify the schema. See the tech stack page (docs/architecture/tech-stack.md).
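The shallow-freeze caveat in the note can be demonstrated without Pydantic. The sketch below uses a frozen stdlib dataclass as a stand-in (an assumption: it shows the same reassignment-versus-nested-mutation distinction, but the exception type differs from what Pydantic raises). ToolDefinitionSketch is a hypothetical name.

```python
import copy
from dataclasses import FrozenInstanceError, dataclass
from typing import Any


@dataclass(frozen=True)
class ToolDefinitionSketch:
    """Stand-in for a frozen model holding a mutable dict."""

    name: str
    parameters_schema: dict[str, Any]


definition = ToolDefinitionSketch("search", {"type": "object", "properties": {}})

# Field reassignment is blocked by the frozen wrapper.
try:
    definition.parameters_schema = {}
    reassigned = True
except FrozenInstanceError:
    reassigned = False

# Safe for direct consumers: modify a deep copy, not the shared schema.
private_schema = copy.deepcopy(definition.parameters_schema)
private_schema["properties"]["query"] = {"type": "string"}

# Unsafe: in-place mutation of nested contents is NOT prevented.
definition.parameters_schema["properties"]["limit"] = {"type": "integer"}
```

After this runs, the shared schema contains limit but not query, which is why consumers outside the standard tool/provider paths should copy before editing.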
Attributes:
| Name | Type | Description |
|---|---|---|
| name | NotBlankStr | Tool name. |
| description | str | Human-readable description of the tool. |
| parameters_schema | dict[str, Any] | JSON Schema dict describing the tool parameters. |
Config:
- frozen: True
- allow_inf_nan: False
Fields:
- name (NotBlankStr)
- description (str)
- parameters_schema (dict[str, Any])
ToolCall
pydantic-model
Bases: BaseModel
A tool invocation requested by the model.
Note
The arguments dict is shallowly frozen by Pydantic's
frozen=True -- field reassignment is prevented but nested
contents can still be mutated in place. The ToolInvoker
deep-copies arguments before passing them to tool
implementations. See the tech stack page (docs/architecture/tech-stack.md).
Attributes:
| Name | Type | Description |
|---|---|---|
| id | NotBlankStr | Provider-assigned tool call identifier. |
| name | NotBlankStr | Name of the tool to invoke. |
| arguments | dict[str, Any] | Parsed arguments dict. |
Config:
- frozen: True
- allow_inf_nan: False
Fields:
- id (NotBlankStr)
- name (NotBlankStr)
- arguments (dict[str, Any])
ToolResult
pydantic-model
Bases: BaseModel
Result of executing a tool call, sent back to the model.
Attributes:
| Name | Type | Description |
|---|---|---|
| tool_call_id | NotBlankStr | The |
| content | str | String content returned by the tool. |
| is_error | bool | Whether the tool execution failed. |
Config:
- frozen: True
- allow_inf_nan: False
Fields:
- tool_call_id (NotBlankStr)
- content (str)
- is_error (bool)
ChatMessage
pydantic-model
Bases: BaseModel
A single message in a chat completion conversation.
Attributes:
| Name | Type | Description |
|---|---|---|
| role | MessageRole | Message role (system, user, assistant, tool). |
| content | str \| None | Text content of the message. |
| tool_calls | tuple[ToolCall, ...] | Tool calls requested by the assistant (assistant only). |
| tool_result | ToolResult \| None | Result of a tool execution (tool role only). |
Config:
- frozen: True
- allow_inf_nan: False
Fields:
- role (MessageRole)
- content (str | None)
- tool_calls (tuple[ToolCall, ...])
- tool_result (ToolResult | None)
Validators:
- _validate_role_constraints
CompletionConfig
pydantic-model
Bases: BaseModel
Optional parameters for a completion request.
All fields are optional -- the provider fills in defaults.
Attributes:
| Name | Type | Description |
|---|---|---|
| temperature | float \| None | Sampling temperature (0.0-2.0). Actual valid range may vary by provider. |
| max_tokens | int \| None | Maximum tokens to generate. |
| stop_sequences | tuple[str, ...] | Sequences that stop generation. |
| top_p | float \| None | Nucleus sampling threshold. |
| timeout | float \| None | Request timeout in seconds. |
Config:
- frozen: True
- allow_inf_nan: False
Fields:
- temperature (float | None)
- max_tokens (int | None)
- stop_sequences (tuple[str, ...])
- top_p (float | None)
- timeout (float | None)
CompletionResponse
pydantic-model
Bases: BaseModel
Result of a non-streaming completion call.
Attributes:
| Name | Type | Description |
|---|---|---|
| content | str \| None | Generated text content (may be |
| tool_calls | tuple[ToolCall, ...] | Tool calls the model wants to execute. |
| finish_reason | FinishReason | Why the model stopped generating. |
| usage | TokenUsage | Token usage and cost breakdown. |
| model | NotBlankStr | Model identifier that served the request. |
| provider_request_id | str \| None | Provider-assigned request ID for debugging. |
Config:
- frozen: True
- allow_inf_nan: False
Fields:
- content (str | None)
- tool_calls (tuple[ToolCall, ...])
- finish_reason (FinishReason)
- usage (TokenUsage)
- model (NotBlankStr)
- provider_request_id (str | None)
Validators:
- _validate_has_output
StreamChunk
pydantic-model
Bases: BaseModel
A single chunk from a streaming completion response.
The event_type discriminator determines which optional fields are
populated.
Attributes:
| Name | Type | Description |
|---|---|---|
| event_type | StreamEventType | Type of stream event. |
| content | str \| None | Text delta (for |
| tool_call_delta | ToolCall \| None | Tool call received during streaming (for |
| usage | TokenUsage \| None | Final token usage (for |
| error_message | str \| None | Error description (for |
Config:
- frozen: True
- allow_inf_nan: False
Fields:
- event_type (StreamEventType)
- content (str | None)
- tool_call_delta (ToolCall | None)
- usage (TokenUsage | None)
- error_message (str | None)
Validators:
- _validate_event_fields
add_token_usage
Create a new TokenUsage with summed token counts and cost.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| a | TokenUsage | First usage record. | required |
| b | TokenUsage | Second usage record. | required |
Returns:
| Type | Description |
|---|---|
| TokenUsage | New |
| TokenUsage | ( |
Source code in src/synthorg/providers/models.py
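ZERO_TOKEN_USAGE being the additive identity means a list of usage records can be folded into a total without special-casing the empty list. A minimal sketch, using a dataclass stand-in for the Pydantic model; total_usage is a hypothetical helper, not part of the documented API.

```python
from dataclasses import dataclass
from functools import reduce


@dataclass(frozen=True)
class TokenUsage:
    """Stand-in for the Pydantic model documented above."""

    input_tokens: int
    output_tokens: int
    cost_usd: float

    @property
    def total_tokens(self) -> int:
        return self.input_tokens + self.output_tokens


ZERO_TOKEN_USAGE = TokenUsage(input_tokens=0, output_tokens=0, cost_usd=0.0)


def add_token_usage(a: TokenUsage, b: TokenUsage) -> TokenUsage:
    """Return a new record; neither input is modified."""
    return TokenUsage(
        input_tokens=a.input_tokens + b.input_tokens,
        output_tokens=a.output_tokens + b.output_tokens,
        cost_usd=a.cost_usd + b.cost_usd,
    )


def total_usage(records: list[TokenUsage]) -> TokenUsage:
    """Hypothetical helper: fold records, starting from the identity."""
    return reduce(add_token_usage, records, ZERO_TOKEN_USAGE)


total = total_usage([TokenUsage(10, 5, 0.25), TokenUsage(20, 10, 0.5)])
```

With an empty list the fold simply returns ZERO_TOKEN_USAGE.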
Enums
enums
Provider-layer enumerations.
Errors
errors
Provider error hierarchy.
Every provider error carries an is_retryable flag so retry logic
can decide whether to attempt again without inspecting concrete
exception types.
ProviderError
Bases: Exception
Base exception for all provider-layer errors.
Attributes:
| Name | Type | Description |
|---|---|---|
| message | | Human-readable error description. |
| context | MappingProxyType[str, Any] | Immutable metadata about the error (provider, model, etc.). |
| is_retryable | bool | Whether the caller should retry the request. |
Note
When converted to string, sensitive context keys (api_key, token, secret, password, authorization) are automatically redacted regardless of casing.
Initialize a provider error.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| message | str | Human-readable error description. | required |
| context | dict[str, Any] \| None | Arbitrary metadata about the error. Stored as an immutable mapping; defaults to empty if not provided. | None |
Source code in src/synthorg/providers/errors.py
__str__
Format error with optional context metadata.
Sensitive keys (api_key, token, etc.) are redacted to prevent accidental secret leakage in logs and tracebacks.
Source code in src/synthorg/providers/errors.py
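One way the redaction described above could work is sketched here. Several details are assumptions not confirmed by the source: keys are matched exactly after lower-casing (the real implementation may match substrings), the placeholder is "***", and the output layout is invented for illustration. ProviderErrorSketch is a hypothetical name.

```python
from types import MappingProxyType
from typing import Any, Optional

# Key names taken from the note above; exact-match comparison is an assumption.
_SENSITIVE_KEYS = frozenset({"api_key", "token", "secret", "password", "authorization"})


class ProviderErrorSketch(Exception):
    """Hypothetical reduced error: immutable context, redacting __str__."""

    is_retryable = False

    def __init__(self, message: str, *, context: Optional[dict[str, Any]] = None) -> None:
        super().__init__(message)
        self.message = message
        # Copy first so later changes to the caller's dict cannot leak in.
        self.context = MappingProxyType(dict(context or {}))

    def __str__(self) -> str:
        if not self.context:
            return self.message
        parts = []
        for key, value in self.context.items():
            # Case-insensitive check, so "API_KEY" is redacted too.
            shown = "***" if key.lower() in _SENSITIVE_KEYS else repr(value)
            parts.append(f"{key}={shown}")
        return f"{self.message} ({', '.join(parts)})"


err = ProviderErrorSketch("request failed", context={"API_KEY": "sk-1", "model": "m"})
```

Redacting in __str__ rather than at construction keeps the raw context available to code that needs it while protecting logs and tracebacks.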
AuthenticationError
Bases: ProviderError
Invalid or missing API credentials.
Source code in src/synthorg/providers/errors.py
RateLimitError
Bases: ProviderError
Provider rate limit exceeded.
Initialize a rate limit error.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| message | str | Human-readable error description. | required |
| retry_after | float \| None | Seconds to wait before retrying, if provided by the provider. | None |
| context | dict[str, Any] \| None | Arbitrary metadata about the error. | None |
Source code in src/synthorg/providers/errors.py
ModelNotFoundError
Bases: ProviderError
Requested model does not exist or is not available.
Source code in src/synthorg/providers/errors.py
InvalidRequestError
Bases: ProviderError
Malformed request (bad parameters, too many tokens, etc.).
Source code in src/synthorg/providers/errors.py
ContentFilterError
Bases: ProviderError
Request or response blocked by the provider's content filter.
Source code in src/synthorg/providers/errors.py
ProviderTimeoutError
Bases: ProviderError
Request timed out waiting for provider response.
Source code in src/synthorg/providers/errors.py
ProviderConnectionError
Bases: ProviderError
Network-level failure connecting to the provider.
Source code in src/synthorg/providers/errors.py
ProviderInternalError
Bases: ProviderError
Provider returned a server-side error (5xx).
Source code in src/synthorg/providers/errors.py
DriverNotRegisteredError
Bases: ProviderError
Requested provider driver is not registered in the registry.
Source code in src/synthorg/providers/errors.py
DriverAlreadyRegisteredError
Bases: ProviderError
A driver with this name is already registered.
Reserved for future use if the registry gains mutable operations (add/remove after construction). Not currently raised.
Source code in src/synthorg/providers/errors.py
DriverFactoryNotFoundError
Bases: ProviderError
No factory found for the requested driver type string.
Source code in src/synthorg/providers/errors.py
ProviderAlreadyExistsError
Bases: ProviderError
A provider with this name already exists.
Source code in src/synthorg/providers/errors.py
ProviderNotFoundError
Bases: ProviderError
A provider with this name does not exist.
Source code in src/synthorg/providers/errors.py
ProviderValidationError
Bases: ProviderError
Provider configuration failed validation.
Source code in src/synthorg/providers/errors.py
Capabilities
capabilities
Model capability descriptors for provider routing decisions.
ModelCapabilities
pydantic-model
Bases: BaseModel
Static capability and cost metadata for a single LLM model.
Used by the routing layer to decide which model handles a request based on required features (tools, vision, streaming) and cost.
Attributes:
| Name | Type | Description |
|---|---|---|
| model_id | NotBlankStr | Provider model identifier (e.g. |
| provider | NotBlankStr | Provider name (e.g. |
| max_context_tokens | int | Maximum context window size in tokens. |
| max_output_tokens | int | Maximum output tokens per request. |
| supports_tools | bool | Whether the model supports tool/function calling. |
| supports_vision | bool | Whether the model accepts image inputs. |
| supports_streaming | bool | Whether the model supports streaming responses. |
| supports_streaming_tool_calls | bool | Whether tool calls can be streamed. |
| supports_system_messages | bool | Whether system messages are accepted. |
| cost_per_1k_input | float | Cost per 1,000 input tokens in USD. |
| cost_per_1k_output | float | Cost per 1,000 output tokens in USD. |
Config:
- frozen: True
- allow_inf_nan: False
Fields:
- model_id (NotBlankStr)
- provider (NotBlankStr)
- max_context_tokens (int)
- max_output_tokens (int)
- supports_tools (bool)
- supports_vision (bool)
- supports_streaming (bool)
- supports_streaming_tool_calls (bool)
- supports_system_messages (bool)
- cost_per_1k_input (float)
- cost_per_1k_output (float)
Validators:
- _validate_cross_field_constraints
supports_streaming_tool_calls
pydantic-field
Supports streaming tool calls
Registry
registry
Provider registry -- the Employment Agency.
Maps provider names to concrete BaseCompletionProvider driver
instances. Built from config via from_config, which reads each
provider's driver field to select the appropriate factory.
ProviderRegistry
Immutable registry of named provider drivers.
Use from_config to build a registry from a config dict, or
construct directly with a pre-built mapping.
Examples:
Build from config::
    registry = ProviderRegistry.from_config(
        root_config.providers,
    )
    driver = registry.get("example-provider")
    response = await driver.complete(messages, "medium")
Check membership::
    if "example-provider" in registry:
        ...
Initialize with a name -> driver mapping.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| drivers | dict[str, BaseCompletionProvider] | Mutable dict of provider name to driver instance. The registry takes ownership and freezes a copy. | required |
Source code in src/synthorg/providers/registry.py
get
Look up a driver by provider name.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| name | str | Provider name (e.g. | required |
Returns:
| Type | Description |
|---|---|
| BaseCompletionProvider | The registered driver instance. |
Raises:
| Type | Description |
|---|---|
| DriverNotRegisteredError | If no driver is registered. |
Source code in src/synthorg/providers/registry.py
list_providers
__contains__
__len__
from_config
classmethod
Build a registry from a provider config dict.
For each provider, reads the driver field to select a
factory. The factory is called with
(provider_name, config) to produce a driver instance.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| providers | dict[str, ProviderConfig] | Provider config dict (key = provider name). | required |
| factory_overrides | dict[str, object] \| None | Optional driver-type -> factory mapping for testing or native SDK swaps. | None |
Returns:
| Type | Description |
|---|---|
| Self | A new |
Raises:
| Type | Description |
|---|---|
| DriverFactoryNotFoundError | If a provider's |
Source code in src/synthorg/providers/registry.py
LiteLLM Driver
litellm_driver
LiteLLM-backed completion driver.
Wraps litellm.acompletion behind the BaseCompletionProvider
contract, mapping between domain models and LiteLLM's chat-completion
API.
LiteLLMDriver
Bases: BaseCompletionProvider
Completion driver backed by LiteLLM.
Uses litellm.acompletion for both streaming and non-streaming
calls. Model identifiers are prefixed with the LiteLLM routing key
(litellm_provider if set, otherwise the provider name -- e.g.
example-provider/example-medium-001) so LiteLLM routes to the
correct backend.
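The prefixing rule can be sketched as a small helper. litellm_model_name is a hypothetical function written for illustration, not the driver's actual code, and it does not guard against identifiers that already carry a prefix.

```python
from typing import Optional


def litellm_model_name(
    provider_name: str,
    model_id: str,
    litellm_provider: Optional[str] = None,
) -> str:
    """Prefix the model ID with the routing key LiteLLM dispatches on."""
    # Prefer the explicit litellm_provider; fall back to the provider name.
    routing_key = litellm_provider or provider_name
    return f"{routing_key}/{model_id}"


default_name = litellm_model_name("example-provider", "example-medium-001")
overridden = litellm_model_name(
    "example-provider", "example-medium-001", litellm_provider="other-backend"
)
```

Here default_name is "example-provider/example-medium-001", matching the example in the description above.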
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| provider_name | str | Provider key from config (e.g. | required |
| config | ProviderConfig | Provider configuration including API key, base URL, and model definitions. | required |
Raises:
| Type | Description |
|---|---|
| ProviderError | All LiteLLM exceptions are mapped to the |
Source code in src/synthorg/providers/drivers/litellm_driver.py
Routing
models
Domain models for the routing engine.
ResolvedModel
pydantic-model
Bases: BaseModel
A fully resolved model reference.
Attributes:
| Name | Type | Description |
|---|---|---|
| provider_name | NotBlankStr | Provider that owns this model (e.g. |
| model_id | NotBlankStr | Concrete model identifier (e.g. |
| alias | NotBlankStr \| None | Short alias used in routing rules, if any. |
| cost_per_1k_input | float | Cost per 1,000 input tokens in USD (base currency). |
| cost_per_1k_output | float | Cost per 1,000 output tokens in USD (base currency). |
| max_context | int | Maximum context window size in tokens. |
| estimated_latency_ms | int \| None | Estimated median latency in milliseconds. |
Config:
- frozen: True
- allow_inf_nan: False
Fields:
- provider_name (NotBlankStr)
- model_id (NotBlankStr)
- alias (NotBlankStr | None)
- cost_per_1k_input (float)
- cost_per_1k_output (float)
- max_context (int)
- estimated_latency_ms (int | None)
RoutingRequest
pydantic-model
Bases: BaseModel
Inputs to a routing decision.
Not all fields are used by every strategy:
- ManualStrategy requires model_override.
- RoleBasedStrategy requires agent_level.
- CostAwareStrategy uses task_type and remaining_budget.
- FastestStrategy uses task_type and remaining_budget.
- SmartStrategy uses all fields in priority order.
Attributes:
| Name | Type | Description |
|---|---|---|
| agent_level | SeniorityLevel \| None | Seniority level of the requesting agent. |
| task_type | NotBlankStr \| None | Task type label (e.g. |
| model_override | NotBlankStr \| None | Explicit model reference for manual routing. |
| remaining_budget | float \| None | Per-request cost ceiling. Compared against each model's |
Config:
- frozen: True
- allow_inf_nan: False
Fields:
- agent_level (SeniorityLevel | None)
- task_type (NotBlankStr | None)
- model_override (NotBlankStr | None)
- remaining_budget (float | None)
remaining_budget
pydantic-field
Per-request cost ceiling in USD (base currency), compared against model total_cost_per_1k. Not a total session budget.
RoutingDecision
pydantic-model
Bases: BaseModel
Output of a routing decision.
Attributes:
| Name | Type | Description |
|---|---|---|
| resolved_model | ResolvedModel | The chosen model. |
| strategy_used | NotBlankStr | Name of the strategy that produced this decision. |
| reason | NotBlankStr | Human-readable explanation. |
| fallbacks_tried | tuple[str, ...] | Model refs that were tried before the final choice. |
Config:
- frozen: True
- allow_inf_nan: False
Fields:
- resolved_model (ResolvedModel)
- strategy_used (NotBlankStr)
- reason (NotBlankStr)
- fallbacks_tried (tuple[str, ...])
router
Model router -- main entry point for routing decisions.
Constructed from RoutingConfig and a provider config dict.
Delegates to strategy implementations.
ModelRouter
Route requests to the appropriate LLM model.
Examples:
Build from config::
    router = ModelRouter(
        routing_config=root_config.routing,
        providers=root_config.providers,
    )
    decision = router.route(
        RoutingRequest(agent_level=SeniorityLevel.SENIOR),
    )
Initialize the router.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| routing_config | RoutingConfig | Routing configuration (strategy, rules, fallback). | required |
| providers | dict[str, ProviderConfig] | Provider configurations keyed by provider name. | required |
| selector | ModelCandidateSelector \| None | Optional candidate selector for multi-provider model resolution. Defaults to | None |
Raises:
| Type | Description |
|---|---|
| UnknownStrategyError | If the configured strategy is not recognized. |
Source code in src/synthorg/providers/routing/router.py
route
Route a request to a model.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| request | RoutingRequest | Routing inputs. | required |
Returns:
| Type | Description |
|---|---|
| RoutingDecision | A routing decision with the chosen model. |
Raises:
| Type | Description |
|---|---|
| ModelResolutionError | If a required model cannot be found. |
| NoAvailableModelError | If all candidates are exhausted. |
Source code in src/synthorg/providers/routing/router.py
strategies
Routing strategies -- stateless implementations of RoutingStrategy.
Each strategy selects a model given a RoutingRequest, a
RoutingConfig, and a ModelResolver. Strategies are stateless
singletons registered in a module-level mapping.
STRATEGY_MAP
module-attribute
STRATEGY_MAP = MappingProxyType(
    {
        STRATEGY_NAME_MANUAL: _MANUAL,
        STRATEGY_NAME_ROLE_BASED: _ROLE_BASED,
        STRATEGY_NAME_COST_AWARE: _COST_AWARE,
        STRATEGY_NAME_FASTEST: _FASTEST,
        STRATEGY_NAME_SMART: _SMART,
        STRATEGY_NAME_CHEAPEST: _COST_AWARE,
    }
)
Maps config strategy names to singleton instances.
RoutingStrategy
Bases: Protocol
Protocol for model routing strategies.
select
Select a model for the given request.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| request | RoutingRequest | Routing inputs (agent level, task type, etc.). | required |
| config | RoutingConfig | Routing configuration (rules, fallback chain). | required |
| resolver | ModelResolver | Model resolver for alias/ID lookup. | required |
Returns:
| Type | Description |
|---|---|
| RoutingDecision | A routing decision with the chosen model. |
Raises:
| Type | Description |
|---|---|
| ModelResolutionError | If the requested model cannot be found. |
| NoAvailableModelError | If all candidates are exhausted. |
Source code in src/synthorg/providers/routing/strategies.py
ManualStrategy
Resolve an explicit model override.
Requires request.model_override to be set.
select
Select the explicitly requested model.
Raises:
| Type | Description |
|---|---|
| ModelResolutionError | If |
Source code in src/synthorg/providers/routing/strategies.py
RoleBasedStrategy
Select model based on agent seniority level.
Matches the first routing rule where rule.role_level equals
request.agent_level. If no rule matches, uses the seniority
catalog's typical_model_tier as a fallback lookup.
select
Select model based on role level.
Raises:
| Type | Description |
|---|---|
| ModelResolutionError | If no agent_level is set. |
| NoAvailableModelError | If all candidates are exhausted. |
Source code in src/synthorg/providers/routing/strategies.py
CostAwareStrategy
Select the cheapest model, optionally respecting a budget.
Matches task_type rules first, then falls back to the cheapest
model from the resolver.
select
Select the cheapest available model.
Raises:
| Type | Description |
|---|---|
| NoAvailableModelError | If no models are registered. |
Source code in src/synthorg/providers/routing/strategies.py
FastestStrategy
Select the fastest model, optionally respecting a budget.
Matches task_type rules first, then falls back to the fastest
model from the resolver. When no models have latency data,
delegates to cheapest (lower-cost models are typically smaller
and faster, making cost a reasonable proxy).
select
Select the fastest available model.
Raises:
| Type | Description |
|---|---|
| NoAvailableModelError | If no models are registered. |
Source code in src/synthorg/providers/routing/strategies.py
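The latency-first selection with a cost fallback might look roughly like the sketch below. It leaves out task_type rule matching, uses a hypothetical ModelSketch record and pick_fastest function, raises LookupError in place of NoAvailableModelError, and assumes total_cost_per_1k is the sum of the input and output rates (the source names the property but does not define it).

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ModelSketch:
    """Hypothetical reduced model record for this illustration."""

    model_id: str
    cost_per_1k_input: float
    cost_per_1k_output: float
    estimated_latency_ms: Optional[int] = None

    @property
    def total_cost_per_1k(self) -> float:
        # Assumption: combined rate is input rate plus output rate.
        return self.cost_per_1k_input + self.cost_per_1k_output


def pick_fastest(
    models: list[ModelSketch], remaining_budget: Optional[float] = None
) -> ModelSketch:
    """Lowest latency within budget; cheapest if nothing has latency data."""
    candidates = [
        m for m in models
        if remaining_budget is None or m.total_cost_per_1k <= remaining_budget
    ]
    if not candidates:
        raise LookupError("no model fits the budget")  # stand-in error type
    timed = [m for m in candidates if m.estimated_latency_ms is not None]
    if timed:
        return min(timed, key=lambda m: m.estimated_latency_ms)
    # No latency data at all: fall back to cost as a proxy for speed.
    return min(candidates, key=lambda m: m.total_cost_per_1k)


large = ModelSketch("example-large-001", 1.0, 3.0, estimated_latency_ms=900)
small = ModelSketch("example-small-001", 0.25, 0.5, estimated_latency_ms=300)
tiny = ModelSketch("example-tiny-001", 0.125, 0.125)
```

In this sketch the budget filter runs before the latency comparison, so a fast but over-budget model is never chosen.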
SmartStrategy
Combined strategy with priority-based signal merging.
Priority order: model_override > task_type rules > role_level rules > seniority default > cheapest available (budget-aware) > global fallback_chain > exhausted.
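A priority chain like this can be expressed as an ordered list of resolvers where the first one that yields a model wins. The sketch below is a generic illustration, not the strategy's actual code: select_by_priority, the Resolver alias, and the lambda resolvers are all hypothetical, and LookupError stands in for NoAvailableModelError.

```python
from collections.abc import Callable, Sequence
from typing import Optional

# A resolver returns a model ID, or None when its signal does not apply.
Resolver = Callable[[], Optional[str]]


def select_by_priority(steps: Sequence[tuple[str, Resolver]]) -> tuple[str, str]:
    """Return (signal name, model ID) from the first resolver that matches."""
    for signal, resolve in steps:
        model = resolve()
        if model is not None:
            return signal, model
    raise LookupError("all candidates exhausted")  # stand-in error type


# Steps are listed from highest to lowest priority.
steps = [
    ("model_override", lambda: None),           # no override given
    ("task_type rule", lambda: None),           # no rule matched
    ("role_level rule", lambda: "example-medium-001"),
    ("cheapest available", lambda: "example-small-001"),
]
signal, model = select_by_priority(steps)
```

Returning the signal name alongside the model gives the caller something to record as the reason for the decision.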
select
Select a model using all available signals.
Raises:
| Type | Description |
|---|---|
| NoAvailableModelError | If all candidates are exhausted. |
Source code in src/synthorg/providers/routing/strategies.py
Resilience
retry
Retry handler with exponential backoff and jitter.
RetryHandler
Wraps async callables with retry logic.
Retries transient errors (is_retryable=True) using exponential
backoff with optional jitter. Non-retryable errors raise immediately.
After exhausting max_retries, raises RetryExhaustedError.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| config | RetryConfig | Retry configuration. | required |
Source code in src/synthorg/providers/resilience/retry.py
execute
async
Execute func with retry on transient errors.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| func | Callable[[], Coroutine[object, object, T]] | Zero-argument async callable to execute. | required |
Returns:
| Type | Description |
|---|---|
| T | The return value of func. |
Raises:
| Type | Description |
|---|---|
| RetryExhaustedError | If all retries are exhausted. |
| ProviderError | If the error is non-retryable. |
Source code in src/synthorg/providers/resilience/retry.py
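A minimal sketch of retry with exponential backoff and jitter follows. Apart from max_retries and is_retryable, every name here is an assumption: base_delay, max_delay, and the "full jitter" choice (a random delay between zero and the computed backoff) are illustrative, since the RetryConfig fields are not listed in this reference.

```python
import asyncio
import random
from collections.abc import Awaitable, Callable
from typing import TypeVar

T = TypeVar("T")


class TransientError(Exception):
    """Stand-in for a provider error flagged as retryable."""

    is_retryable = True


class RetryExhausted(Exception):
    """Stand-in for RetryExhaustedError; keeps the last error."""

    def __init__(self, original_error: Exception) -> None:
        super().__init__(f"retries exhausted: {original_error}")
        self.original_error = original_error


async def execute_with_retry(
    func: Callable[[], Awaitable[T]],
    *,
    max_retries: int = 3,
    base_delay: float = 0.5,   # hypothetical parameter
    max_delay: float = 30.0,   # hypothetical parameter
    jitter: bool = True,
) -> T:
    attempt = 0
    while True:
        try:
            return await func()
        except Exception as exc:
            # Non-retryable errors propagate immediately.
            if not getattr(exc, "is_retryable", False):
                raise
            if attempt >= max_retries:
                raise RetryExhausted(exc) from exc
            # Delay doubles each attempt, capped at max_delay.
            delay = min(max_delay, base_delay * 2**attempt)
            if jitter:
                delay = random.uniform(0, delay)
            await asyncio.sleep(delay)
            attempt += 1
```

With max_retries=3 the callable runs at most four times: the initial attempt plus three retries.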
rate_limiter
Client-side rate limiter with RPM and concurrency controls.
RateLimiter
Client-side rate limiter with RPM tracking and concurrency control.
Uses a sliding window for RPM tracking and an asyncio semaphore for
concurrency limiting. Supports pause-until from provider
retry_after hints.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| config | RateLimiterConfig | Rate limiter configuration. | required |
| provider_name | str | Provider name for logging context. | required |
Source code in src/synthorg/providers/resilience/rate_limiter.py
acquire
async
Wait for an available slot.
Blocks until both the RPM window and concurrency semaphore allow a new request. Also respects any active pause.
Source code in src/synthorg/providers/resilience/rate_limiter.py
release
pause
Block new requests for the duration given by the seconds argument.
Called when a RateLimitError with retry_after is received.
If pause is called more than once, a later call replaces the current pause-until deadline only when it extends it.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| seconds | float | Duration to pause in seconds. Must be finite and non-negative. | required |
Raises:
| Type | Description |
|---|---|
| ValueError | If seconds is negative or not finite. |
Source code in src/synthorg/providers/resilience/rate_limiter.py
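The combination of a sliding RPM window, a concurrency semaphore, and a pause-until deadline can be sketched as below. This is an illustration under stated assumptions, not the documented implementation: SketchRateLimiter and its constructor parameters (max_rpm, max_concurrent, window_seconds) are hypothetical, since the RateLimiterConfig fields are not listed here.

```python
import asyncio
import math
import time
from collections import deque


class SketchRateLimiter:
    """Hypothetical limiter: sliding window + semaphore + pause-until."""

    def __init__(self, max_rpm: int, max_concurrent: int, window_seconds: float = 60.0):
        self._max_rpm = max_rpm
        self._window = window_seconds
        self._stamps = deque()  # monotonic start times of recent requests
        self._semaphore = asyncio.Semaphore(max_concurrent)
        self._pause_until = 0.0

    def pause(self, seconds: float) -> None:
        if not math.isfinite(seconds) or seconds < 0:
            raise ValueError("seconds must be finite and non-negative")
        # Only ever extend the deadline, never shorten it.
        self._pause_until = max(self._pause_until, time.monotonic() + seconds)

    async def acquire(self) -> None:
        await self._semaphore.acquire()  # concurrency slot first
        while True:
            now = time.monotonic()
            # Drop timestamps that have left the sliding window.
            while self._stamps and now - self._stamps[0] >= self._window:
                self._stamps.popleft()
            wait = self._pause_until - now
            if len(self._stamps) >= self._max_rpm:
                # Window is full: wait until the oldest request ages out.
                wait = max(wait, self._stamps[0] + self._window - now)
            if wait <= 0:
                self._stamps.append(now)
                return
            await asyncio.sleep(wait)

    def release(self) -> None:
        self._semaphore.release()
```

Callers would pair acquire with release in a try/finally so a failed request still frees its concurrency slot.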
errors
Resilience-specific error types.
RetryExhaustedError
Bases: ProviderError
All retry attempts exhausted for a retryable error.
Raised by RetryHandler when max_retries is reached.
The engine layer catches this to trigger fallback chains.
Attributes:
| Name | Type | Description |
|---|---|---|
| original_error | | The last retryable error that was raised. |
Initialize with the original error that exhausted retries.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| original_error | ProviderError | The last retryable | required |