Python SDK
This page covers the full Python SDK path: trace binding, explicit execute control, tool loops, simulation, and private-agent routing.
If you are migrating an existing app and only need the OpenAI-compatible gateway path, start with Quick Start or the SDK overview.
Requires Python ≥ 3.9. Runtime dependencies: httpx, pydantic.
Installation
pip install olyx
Or using Poetry / uv:
poetry add olyx
# or
uv add olyx
Full SDK (explicit control)
Use the Olyx client directly when you want explicit trace control, tool-loop orchestration, and typed SDK results.
import os
import olyx
olyx.configure(
    api_key=os.environ["OLYX_API_KEY"],
    fail_open=False,  # fail-closed by default; see Safety Valve
)
client = olyx.Olyx()
For private-agent deployments, add base_url as a single extra line (see Private Agent Routes).
client.execute — the primary call
execute is the governed call path for prompt execution. In the Python SDK, each execute call must be bound to a trace:
trace = client.traces.create(
    metadata={"user_id": "u_123", "org_id": "org_abc", "intent": "translation"}
)
result = client.execute(
    trace_id=trace.id,
    input="Translate to French: Hello, world.",
)
Response shape
result.output # model output string | None
result.model # resolved model identifier | None
result.step_id # step ID in the trace graph
result.reason # present when blocked
result.status # "tool_calls_pending" | None
result.tool_calls # list[ToolCall]
result.bypass # True if fail_open bypass path was used
result.raw # raw response payload
result.blocked # convenience property
result.tool_calls_pending # convenience property
Simulate / dry-run
Python SDK 0.1.x does not yet expose a first-class client.simulate helper. Use client.execute for governed runtime calls, and use the dashboard or API reference when you need low-level dry-run automation.
This section stays intentionally lightweight until the Python helper lands, so application code does not need to copy a raw HTTP workaround into production.
Policy hooks
Project policy is enforced server-side. In Python SDK 0.1.0, client.execute does not accept a policy object.
Use trace metadata (user_id, intent, feature, etc.) to provide routing context and governance attribution.
Blocked responses
A blocked response is a governance event, not an exception:
# Inside a request handler; log_governance_event is your own logging helper.
trace = client.traces.create(metadata={"user_id": "u_123"})
result = client.execute(trace_id=trace.id, input="...")
if result.blocked:
    log_governance_event(result.reason, step_id=result.step_id)
else:
    return {"output": result.output}
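Because a block arrives as a normal result rather than an exception, a handler can translate it into an HTTP response in one place. The following is a minimal sketch, not part of the SDK: FakeResult is a hypothetical stand-in that carries only the fields read here, and the choice of status 403 for a block is an assumption.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class FakeResult:
    """Hypothetical stand-in exposing only the fields this sketch reads."""
    output: Optional[str] = None
    reason: Optional[str] = None
    step_id: Optional[str] = None
    blocked: bool = False


def to_http_response(result) -> Tuple[dict, int]:
    """Map an execute result to a (body, status) pair."""
    if result.blocked:
        # A block is a policy decision, so it is reported as a refusal (assumed 403)
        body = {"error": "blocked", "reason": result.reason, "step_id": result.step_id}
        return body, 403
    return {"output": result.output}, 200
```

Any object with the same four attributes works, so the function can be exercised in tests without a live client.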
Tool calls
Pass tool definitions to execute. Olyx tracks the loop state on the trace; your code runs each pending tool call and returns the results:
import json
tools = [{
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Get current weather for a city",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}]
trace = client.traces.create(metadata={"user_id": "u_123", "intent": "weather_lookup"})
result = client.execute(trace_id=trace.id, input="What is the weather in London?", tools=tools)
# Keep answering tool calls until none are pending.
while result.tool_calls_pending:
    tool_results = []
    for call in result.tool_calls:
        # dispatch_tool is your own function that runs the named tool.
        output = dispatch_tool(call.name, call.arguments)
        tool_results.append({
            "tool_call_id": call.id,
            "name": call.name,
            "content": json.dumps(output),
        })
    result = client.execute(
        trace_id=trace.id,
        parent_step_id=result.step_id,
        tool_results=tool_results,
    )
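The loop above calls dispatch_tool, which is application code rather than part of the SDK. One possible shape is a small name-to-function registry, sketched below. TOOL_REGISTRY, the tool decorator, and the placeholder get_weather handler are illustrative names, and the sketch assumes call.arguments may arrive either as a JSON string or as an already-parsed dict.

```python
import json
from typing import Any, Callable, Dict, Union

# Registry mapping tool names to plain Python callables (illustrative).
TOOL_REGISTRY: Dict[str, Callable[..., Any]] = {}


def tool(name: str):
    """Decorator that registers a function under a tool name."""
    def register(fn):
        TOOL_REGISTRY[name] = fn
        return fn
    return register


@tool("get_weather")
def get_weather(city: str) -> dict:
    # Placeholder data; a real handler would call a weather service.
    return {"city": city, "forecast": "unknown"}


def dispatch_tool(name: str, arguments: Union[str, Dict[str, Any]]) -> Any:
    """Run the registered tool; return an error payload instead of raising."""
    # Assumption: arguments may be a JSON string or an already-parsed dict.
    if isinstance(arguments, str):
        try:
            arguments = json.loads(arguments or "{}")
        except json.JSONDecodeError as exc:
            return {"error": f"invalid arguments: {exc.msg}"}
    handler = TOOL_REGISTRY.get(name)
    if handler is None:
        return {"error": f"unknown tool: {name}"}
    try:
        return handler(**arguments)
    except Exception as exc:  # report tool failures back to the model
        return {"error": f"{type(exc).__name__}: {exc}"}
```

Returning an error payload rather than raising keeps the loop running, so the model can see the failure and respond to it. Whether that is preferable to aborting the trace depends on the application.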
MCP tools in service objects
Initialize one Olyx client per service and create traces per request:
class DocumentService:
    def __init__(self, user_id: str):
        self.user_id = user_id
        self.client = olyx.Olyx()

    def summarise(self, document_id: str):
        trace = self.client.traces.create(
            metadata={"user_id": self.user_id, "intent": "summarization"}
        )
        return self.client.execute(
            trace_id=trace.id,
            input=f"Summarise document {document_id}",
            tools=self._mcp_tools(),
        )

    def _mcp_tools(self):
        return [{
            "type": "mcp",
            "server_label": "documents",
            "server_url": os.environ["DOCS_MCP_URL"],
            "require_approval": "never",
        }]
In FastAPI, keep a shared client on app state:
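from contextlib import asynccontextmanager
from fastapi import FastAPI
import olyx

@asynccontextmanager
async def lifespan(app: FastAPI):
    # One client for the whole process, created at startup
    app.state.olyx = olyx.Olyx()
    yield

app = FastAPI(lifespan=lifespan)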
Multi-step workflows (explicit trace control)
For workflows spanning multiple model turns, bind all steps to one trace:
trace = client.traces.create(
    metadata={"user_id": "u_123", "task": "research_report"},
    revenue=2.00,
)
step1 = client.execute(
    trace_id=trace.id,
    input="Find recent papers on transformer efficiency.",
)
step2 = client.execute(
    trace_id=trace.id,
    input=f"Summarise: {step1.output}",
)
client.traces.complete(trace.id)
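If a step raises partway through a workflow, the final traces.complete call is skipped. One way to guard against that is a small context manager, sketched here. completed_trace is a hypothetical helper, not an SDK function, and it assumes that completing a trace after a failure is the behaviour you want.

```python
from contextlib import contextmanager


@contextmanager
def completed_trace(client, **create_kwargs):
    """Create a trace and always call traces.complete on exit."""
    trace = client.traces.create(**create_kwargs)
    try:
        yield trace
    finally:
        # Runs on success and on exceptions, so traces are not left open
        client.traces.complete(trace.id)
```

Usage would look like `with completed_trace(client, metadata={"user_id": "u_123"}) as trace:` followed by the execute calls bound to trace.id.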
Embeddings
trace = client.traces.create(metadata={"user_id": "u_123", "intent": "embedding"})
emb = client.embeddings.create(
    trace_id=trace.id,
    input=["Document one.", "Document two."],
    model="text-embedding-3-small",
)
print(len(emb.embeddings))
print(emb.usage_tokens)
If input validation fails, the SDK raises olyx.ValidationError.
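A common next step is comparing two returned vectors. The sketch below computes cosine similarity in plain Python. It assumes each entry of emb.embeddings is a sequence of floats, which this page does not state explicitly, and it returns 0.0 for zero vectors as a convention chosen for this example.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # convention chosen here for zero vectors
    return dot / (norm_a * norm_b)
```

Under that assumption, `cosine_similarity(emb.embeddings[0], emb.embeddings[1])` would score the two documents embedded above.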
User retention analytics
Attach user and feature metadata to traces so usage is attributable in analytics:
trace = client.traces.create(
    metadata={
        "user_id": "u_123",
        "org_id": "org_abc",
        "intent": "email_draft",
        "feature": "sales_assistant",
    }
)
result = client.execute(
    trace_id=trace.id,
    input="Draft a follow-up email for this deal.",
)
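Attribution is only as good as the metadata that call sites send. A small builder, sketched below, keeps the keys consistent across an application. build_trace_metadata is a hypothetical helper, and treating user_id as required is an assumption of this sketch rather than an SDK rule.

```python
from typing import Dict, Optional


def build_trace_metadata(
    user_id: str,
    org_id: Optional[str] = None,
    intent: Optional[str] = None,
    feature: Optional[str] = None,
) -> Dict[str, str]:
    """Return trace metadata with unset fields omitted."""
    if not user_id:
        raise ValueError("user_id is required for attribution")
    fields = {"user_id": user_id, "org_id": org_id, "intent": intent, "feature": feature}
    # Drop keys that were not provided so analytics never sees null values
    return {key: value for key, value in fields.items() if value is not None}
```

The result can be passed directly as the metadata argument of client.traces.create.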
The Safety Valve: Fail-Closed vs. Fail-Open
Fail-closed (default): if the gateway is unreachable, execute raises olyx.CircuitBreakerError.
Fail-open: configure fallback provider settings and pass fail_open=True when needed:
olyx.configure(
    api_key=os.environ["OLYX_API_KEY"],
    fail_open=True,
    fallback_provider_url="https://api.openai.com/v1",
    fallback_api_key=os.environ["OPENAI_API_KEY"],
    fallback_model="gpt-4o-mini",
)
client = olyx.Olyx()
trace = client.traces.create(metadata={"user_id": "u_123"})
result = client.execute(
    trace_id=trace.id,
    input="Summarise this internal changelog.",
    fail_open=True,
)
print(result.bypass) # True when bypass path was used
Testing
Use a dedicated test project with a project-scoped API key. All SDK calls route to the real Olyx backend — traces are created, policy is enforced, and executions count against your quota exactly as in production. Test-environment behaviour is verifiable against real governance rules.
import os
import olyx
olyx.configure(
    api_key=os.environ["OLYX_TEST_API_KEY"],
    base_url=os.environ.get("OLYX_BASE_URL", "https://olyx.ai"),
    mock=False,
)
client = olyx.Olyx()
Set a spend cap on the test project key to bound runaway test costs. Use a separate test project so production trace history stays clean.
Controlling test outcomes
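Where a client.simulate helper is available, use it to exercise your routing policy without invoking a model. Python SDK 0.1.x does not yet expose one (see Simulate / dry-run), so treat this call as illustrative: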
result = client.simulate(input="What is 2+2?")
# result.status == "resolved"
# result.estimated_cost == 0.00018
Use client.checks to test guardrail logic against specific inputs:
check = client.checks(trace_id=trace.id, input=user_input)
if not check.allowed:
    return {"error": "Request blocked"}, 403
Offline testing (enterprise)
Enterprise plans include an offline testing flag that enables zero-network test execution. The SDK reads this capability at initialisation via GET /api/v1/sdk/config:
olyx.configure(
    api_key=os.environ["OLYX_API_KEY"],
    offline=os.environ.get("OLYX_ENV") == "test",  # only resolves if plan permits
)
Offline mode returns locally-generated stub responses with the same shape as real responses — no HTTP call, no trace, no quota consumption. Use it in CI pipelines with strict egress controls or air-gapped environments.
If offline=True is set on a plan without the offline_testing feature, the SDK raises olyx.ConfigurationError rather than silently falling back to online mode.
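For unit tests of your own handler logic, as opposed to tests of governance behaviour, one option is a hand-written stand-in that is injected in place of the client. The sketch below is not an SDK feature: StubResult, StubClient, and the answer function are hypothetical, and the stub mirrors only the result fields the handler reads.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class StubResult:
    """Hypothetical stand-in carrying the fields the handler reads."""
    output: Optional[str] = None
    reason: Optional[str] = None
    step_id: str = "step_stub"
    blocked: bool = False


@dataclass
class StubClient:
    """Returns queued results from execute and records each call."""
    queued: List[StubResult]
    calls: List[dict] = field(default_factory=list)

    def execute(self, **kwargs) -> StubResult:
        self.calls.append(kwargs)
        return self.queued.pop(0)


def answer(client, trace_id: str, question: str) -> dict:
    """Example application code under test."""
    result = client.execute(trace_id=trace_id, input=question)
    if result.blocked:
        return {"error": result.reason}
    return {"output": result.output}
```

A stub like this checks only your own branching. Policy, routing, and quota behaviour still need the test project described above.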
Error reference
All SDK errors inherit from olyx.OlyxError and include .status plus optional .code.
| Class | When raised |
|---|---|
| olyx.AuthError | 401: missing, revoked, or expired API key |
| olyx.NotFoundError | 404: resource not found or belongs to another account |
| olyx.ValidationError | 400/422: request validation failed |
| olyx.RateLimitError | 429: rate limit or spend cap reached |
| olyx.ServerError | 5xx from the gateway |
| olyx.GatewayError | Network or transport failure while reaching the gateway |
| olyx.CircuitBreakerError | Gateway unreachable and fail_open is False |
| olyx.ConfigurationError | Invalid or missing SDK configuration |
trace = client.traces.create(metadata={"user_id": "u_123"})
try:
    result = client.execute(trace_id=trace.id, input="...")
except olyx.CircuitBreakerError:
    return {"error": "AI service temporarily unavailable."}, 503
except olyx.RateLimitError as e:
    if e.code == "CIRCUIT_OPEN":
        pass
    elif e.code == "LOOP_DETECTED":
        pass
except olyx.AuthError:
    pass
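Some of these errors can be transient, so a caller may want to retry with a delay. The following is a generic exponential-backoff sketch under stated assumptions: call_with_backoff is a hypothetical helper, not an SDK function, and which exceptions are worth retrying is left to the caller. A 429 caused by a spend cap, for instance, will not clear on retry.

```python
import time
from typing import Callable, Tuple, Type, TypeVar

T = TypeVar("T")


def call_with_backoff(
    fn: Callable[[], T],
    retry_on: Tuple[Type[BaseException], ...],
    attempts: int = 3,
    base_delay: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call fn, retrying on the given exceptions with exponential backoff."""
    for attempt in range(attempts):
        try:
            return fn()
        except retry_on:
            if attempt == attempts - 1:
                raise  # out of attempts: surface the last error
            sleep(base_delay * (2 ** attempt))  # 0.5s, 1s, 2s, ...
    raise ValueError("attempts must be at least 1")
```

With the SDK, a call might look like `call_with_backoff(lambda: client.execute(trace_id=trace.id, input="..."), (olyx.RateLimitError,))`. The sleep parameter is injectable so tests can run without real delays.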
Private Agent Routes
The Olyx Agent is a lightweight, outbound-only container for selected private beta deployments. Your application points the SDK at an internal hostname, and the agent forwards Olyx requests outbound through your normal network controls.
Use the agent when the hosted gateway cannot reach an internal provider endpoint or when your deployment needs an internal egress point. Most beta teams can start with the hosted gateway and add the agent later.
Agent quickstart (private deployment)
import os
import olyx
olyx.configure(
    api_key=os.environ["OLYX_API_KEY"],
    base_url="http://olyx-agent:4000",
)
client = olyx.Olyx()
client.ping() # {"status": "ok", ...}
Then run execute as normal — only base_url changes.
Start the agent
docker run -d \
  --name olyx-agent \
  -e OLYX_API_KEY="$OLYX_API_KEY" \
  -p 4000:4000 \
  olyxlabs/olyx-agent:latest
The agent exposes the same API shape as the hosted gateway. It applies your project-level policy before forwarding requests through the configured outbound path.
Point the SDK at the agent
import os
import olyx
olyx.configure(
    api_key=os.environ["OLYX_API_KEY"],
    base_url="http://olyx-agent:4000",  # internal hostname
    fail_open=False,
)
SDK behavior is the same from the application perspective — only base_url changes.
Kubernetes sidecar
Run the agent as a sidecar in the same pod as your application:
# deployment.yaml (relevant section)
containers:
  - name: app
    image: your-app:latest
    env:
      - name: OLYX_GATEWAY_URL
        value: "http://localhost:4000"
  - name: olyx-agent
    image: olyxlabs/olyx-agent:latest
    env:
      - name: OLYX_API_KEY
        valueFrom:
          secretKeyRef:
            name: olyx-secrets
            key: api-key
    ports:
      - containerPort: 4000
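The manifest sets OLYX_GATEWAY_URL on the app container, so the application has to read that variable and pass it to the SDK as base_url. A minimal sketch of that wiring follows. olyx_settings_from_env is a hypothetical helper, and it assumes the app process also has OLYX_API_KEY available, as the configure examples on this page do.

```python
import os
from typing import Any, Dict, Mapping


def olyx_settings_from_env(env: Mapping[str, str] = os.environ) -> Dict[str, Any]:
    """Build keyword arguments for olyx.configure from environment variables."""
    # Assumption: the app process has OLYX_API_KEY; a missing key raises KeyError
    settings: Dict[str, Any] = {"api_key": env["OLYX_API_KEY"]}
    gateway_url = env.get("OLYX_GATEWAY_URL")
    if gateway_url:
        # Only override base_url when the sidecar URL is present
        settings["base_url"] = gateway_url.rstrip("/")
    return settings
```

The application would then call `olyx.configure(**olyx_settings_from_env())` at startup, so the same image runs against the hosted gateway when the variable is unset.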
Operational behavior
| Behavior | Detail |
|---|---|
| Outbound-first | Designed for deployments where your network initiates connections outward. |
| Credential placement | Keep the Olyx API key in the agent or secret manager rather than hardcoding it in app code. |
| Network visibility | Route Olyx-bound model traffic through infrastructure your team already monitors. |
| Policy path | Project-level routing, cost caps, and PII checks still happen before provider execution. |
| Fail-closed default | If the agent is unreachable, olyx.CircuitBreakerError is raised unless you explicitly opt into fail-open behavior. |
TLS
If your network terminates TLS at an internal boundary, pass your CA bundle to the agent container:
docker run -d \
  --name olyx-agent \
  -e OLYX_API_KEY="$OLYX_API_KEY" \
  -v /etc/ssl/internal:/etc/ssl/internal:ro \
  -e SSL_CERT_FILE="/etc/ssl/internal/ca-bundle.crt" \
  -p 4000:4000 \
  olyxlabs/olyx-agent:latest
Never disable certificate verification (verify=False) in production. Use it only in local development against a self-signed certificate.
Timeout tuning
Internal networks have much lower latency than the public internet. Tighten the SDK timeout so a silently failed agent surfaces faster:
olyx.configure(
    api_key=os.environ["OLYX_API_KEY"],
    base_url="http://olyx-agent:4000",
    timeout=5.0,  # seconds; default is 30.0
)
Verifying connectivity
curl -s http://olyx-agent:4000/up
# → {"status":"ok","version":"1.4.2"}
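The same probe can be run from Python, for example in a deploy script or readiness check. The sketch below uses only the standard library. agent_is_up is a hypothetical helper, and it assumes the /up response is the JSON object shown above with a status field. The opener parameter exists so the function can be tested without a network.

```python
import json
import urllib.request
from typing import Callable


def agent_is_up(
    base_url: str,
    timeout: float = 5.0,
    opener: Callable = urllib.request.urlopen,
) -> bool:
    """Return True when GET {base_url}/up reports status "ok"."""
    url = base_url.rstrip("/") + "/up"
    try:
        with opener(url, timeout=timeout) as response:
            payload = json.loads(response.read().decode("utf-8"))
    except (OSError, ValueError):
        # URLError is an OSError subclass; ValueError covers malformed JSON
        return False
    return isinstance(payload, dict) and payload.get("status") == "ok"
```

Calling `agent_is_up("http://olyx-agent:4000")` returns False rather than raising when the agent is unreachable, which suits a check that should report instead of crash.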
In FastAPI, add a startup check in the lifespan:
from contextlib import asynccontextmanager
from fastapi import FastAPI
import olyx

@asynccontextmanager
async def lifespan(app: FastAPI):
    app.state.olyx = olyx.Olyx()
    app.state.olyx.ping()  # startup connectivity check
    yield
In Django, add it to AppConfig.ready():
from django.apps import AppConfig
from django.conf import settings

class AIConfig(AppConfig):
    def ready(self):
        import olyx
        if not settings.DEBUG:
            olyx.Olyx().ping()
Gateway migration through the agent
Existing code using the OpenAI Python SDK can route through the agent by changing the base URL to your internal agent hostname:
import os
from openai import OpenAI
client = OpenAI(
    api_key=os.environ["OLYX_API_KEY"],
    base_url="http://olyx-agent:4000/v1",
)
# Existing code stays unchanged; the agent applies PII scrubbing and routing
response = client.chat.completions.create(model="gpt-4o", messages=[...])
For custom TLS with the OpenAI SDK, pass an httpx client that trusts your internal CA bundle:
import os
import httpx
from openai import OpenAI
http_client = httpx.Client(verify="/etc/ssl/internal/ca-bundle.crt")
client = OpenAI(
    api_key=os.environ["OLYX_API_KEY"],
    base_url="https://olyx-agent.internal/v1",
    http_client=http_client,
)
Regional routing
If you run services in multiple regions, put regional agent instances behind your own internal routing layer and point
the SDK at that stable base_url. Keep the first beta deployment simple; add regional routing only after trace latency
shows that it matters.