Python SDK

This page covers the full Python SDK path: trace binding, explicit execute control, tool loops, simulation, and private-agent routing. If you are migrating an existing app and only need the OpenAI-compatible gateway path, start with Quick Start or the SDK overview.

Requires Python ≥ 3.9. Runtime dependencies: httpx, pydantic.

flowchart LR
  APP[PYTHON APP]
  GATEWAY[OPENAI-COMPATIBLE GATEWAY]
  SDK[OLYX SDK]
  TRACE[TRACE]
  EXEC[CLIENT EXECUTE]
  GATE[OLYX GATEWAY OR AGENT]
  MODEL[MODEL PROVIDER]
  APP --> GATEWAY --> GATE --> MODEL
  APP --> SDK --> TRACE --> EXEC --> GATE --> MODEL

Installation

pip install olyx

Or using Poetry / uv:

poetry add olyx

# or
uv add olyx

Full SDK (explicit control)

Use the Olyx client directly when you want explicit trace control, tool-loop orchestration, and typed SDK results.

import os
import olyx

olyx.configure(
    api_key   = os.environ["OLYX_API_KEY"],
    fail_open = False,  # fail-closed by default — see Safety Valve
)

client = olyx.Olyx()

For private-agent deployments, add base_url as a single extra line (see Private Agent Routes).


client.execute — the primary call

execute is the governed call path for prompt execution. In the Python SDK, each execute call must be bound to a trace:

trace = client.traces.create(
    metadata = {"user_id": "u_123", "org_id": "org_abc", "intent": "translation"}
)

result = client.execute(
    trace_id = trace.id,
    input    = "Translate to French: Hello, world.",
)

Response shape

result.output              # model output string | None
result.model               # resolved model identifier | None
result.step_id             # step ID in the trace graph
result.reason              # present when blocked
result.status              # "tool_calls_pending" | None
result.tool_calls          # list[ToolCall]
result.bypass              # True if fail_open bypass path was used
result.raw                 # raw response payload

result.blocked             # convenience property
result.tool_calls_pending  # convenience property
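The fields above can be folded into a single outcome label for logging or branching in application code. This is a minimal sketch that reads only the attributes listed in the response shape; the `classify_result` helper and its label strings are illustrative and not part of the SDK.

```python
from types import SimpleNamespace


def classify_result(result) -> str:
    """Map an execute result onto one coarse outcome label.

    Only reads attributes listed in the response shape; the label
    strings are hypothetical and chosen for illustration.
    """
    if result.blocked:
        return "blocked"
    if result.tool_calls_pending:
        return "tool_calls_pending"
    if result.bypass:
        return "bypassed"
    return "completed"


# Stand-in object with the documented attribute names, for a quick check.
demo = SimpleNamespace(blocked=False, tool_calls_pending=False, bypass=True)
print(classify_result(demo))
```

Checking `blocked` first keeps governance events from being mistaken for empty outputs further down the handler.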

Simulate / dry-run

The Python SDK 0.1.x does not yet expose a first-class client.simulate helper. Use client.execute for governed runtime calls, and use the dashboard or API reference when you need low-level dry-run automation.

This section stays intentionally lightweight until the Python helper lands, so application code does not need to copy a raw HTTP workaround into production.


Policy hooks

Project policy is enforced server-side. In Python SDK 0.1.0, client.execute does not currently accept a policy object.
Use trace metadata (user_id, intent, feature, etc.) to provide routing context and governance attribution.
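Because metadata is the only per-request context the client sends, it can help to build it in one place. The sketch below is one possible helper; `build_trace_metadata` is a hypothetical name, and treating `user_id` and `intent` as required is this sketch's convention rather than a documented SDK rule.

```python
from typing import Any, Dict, Optional


def build_trace_metadata(
    user_id: str,
    intent: str,
    feature: Optional[str] = None,
    **extra: Any,
) -> Dict[str, Any]:
    """Assemble trace metadata, dropping keys whose value is None."""
    if not user_id or not intent:
        raise ValueError("user_id and intent are required for attribution")
    metadata = {"user_id": user_id, "intent": intent, "feature": feature, **extra}
    return {key: value for key, value in metadata.items() if value is not None}
```

The result can be passed directly as `metadata=` to `client.traces.create`.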


Blocked responses

A blocked response is a governance event, not an exception:

def handle_request(client):
    trace = client.traces.create(metadata={"user_id": "u_123"})
    result = client.execute(trace_id=trace.id, input="...")

    if result.blocked:
        # log_governance_event is an application-defined logging hook
        log_governance_event(result.reason, step_id=result.step_id)
        return None
    return {"output": result.output}

Tool calls

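Pass tool definitions to execute, then loop while tool calls are pending: run each call in your own code and send the results back on the same trace: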

import json

tools = [{
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Get current weather for a city",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}]

trace = client.traces.create(metadata={"user_id": "u_123", "intent": "weather_lookup"})
result = client.execute(trace_id=trace.id, input="What is the weather in London?", tools=tools)

while result.tool_calls_pending:
    tool_results = []
    for call in result.tool_calls:
        output = dispatch_tool(call.name, call.arguments)
        tool_results.append({
            "tool_call_id": call.id,
            "name": call.name,
            "content": json.dumps(output),
        })

    result = client.execute(
        trace_id = trace.id,
        parent_step_id = result.step_id,
        tool_results = tool_results,
    )
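The loop above calls `dispatch_tool`, which is application code. One way to write it is a small registry keyed by tool name. This is a sketch under assumptions: the `get_weather` body is a placeholder, and `call.arguments` is assumed to arrive either as a JSON string or as an already-parsed dict, so both are handled.

```python
import json
from typing import Any, Callable, Dict, Union


def get_weather(city: str) -> Dict[str, Any]:
    # Hypothetical local implementation; a real one would query a weather service.
    return {"city": city, "forecast": "unknown"}


# Maps tool names from the definitions passed to execute onto local functions.
TOOL_REGISTRY: Dict[str, Callable[..., Any]] = {"get_weather": get_weather}


def dispatch_tool(name: str, arguments: Union[str, Dict[str, Any]]) -> Any:
    """Run the registered function for `name` with the model-supplied arguments."""
    handler = TOOL_REGISTRY.get(name)
    if handler is None:
        # Return an error payload so the loop can report it instead of crashing.
        return {"error": f"unknown tool: {name}"}
    # Assumption: arguments may be a JSON string or a parsed dict.
    kwargs = json.loads(arguments) if isinstance(arguments, str) else dict(arguments)
    return handler(**kwargs)
```

Returning an error payload for unknown tools keeps the loop running so the model can see the failure in the next step.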

MCP tools in service objects

Initialize one Olyx client per service and create traces per request:

class DocumentService:
    def __init__(self, user_id: str):
        self.user_id = user_id
        self.client = olyx.Olyx()

    def summarise(self, document_id: str):
        trace = self.client.traces.create(
            metadata={"user_id": self.user_id, "intent": "summarization"}
        )
        return self.client.execute(
            trace_id = trace.id,
            input = f"Summarise document {document_id}",
            tools = self._mcp_tools(),
        )

    def _mcp_tools(self):
        return [{
            "type": "mcp",
            "server_label": "documents",
            "server_url": os.environ["DOCS_MCP_URL"],
            "require_approval": "never",
        }]

In FastAPI, keep a shared client on app state:

from contextlib import asynccontextmanager

import olyx
from fastapi import FastAPI

@asynccontextmanager
async def lifespan(app: FastAPI):
    # One shared client for the whole process
    app.state.olyx = olyx.Olyx()
    yield

app = FastAPI(lifespan=lifespan)

Multi-step workflows (explicit trace control)

For workflows spanning multiple model turns, bind all steps to one trace:

trace = client.traces.create(
    metadata = {"user_id": "u_123", "task": "research_report"},
    revenue  = 2.00,
)

step1 = client.execute(
    trace_id = trace.id,
    input = "Find recent papers on transformer efficiency.",
)
step2 = client.execute(
    trace_id = trace.id,
    input = f"Summarise: {step1.output}",
)

client.traces.complete(trace.id)
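If a step raises partway through, the final `client.traces.complete` call above is never reached. A context manager is one way to guarantee it runs. This sketch uses only `traces.create` and `traces.complete` as shown above; `completed_trace` is a hypothetical helper, and completing a trace after a failure is a design choice you may not want in every project.

```python
from contextlib import contextmanager


@contextmanager
def completed_trace(client, **create_kwargs):
    """Create a trace and call traces.complete when the block exits."""
    trace = client.traces.create(**create_kwargs)
    try:
        yield trace
    finally:
        # Runs on success and on exceptions alike; adjust if failed
        # workflows should be left open in your project.
        client.traces.complete(trace.id)
```

Usage would look like `with completed_trace(client, metadata={"user_id": "u_123"}) as trace:` followed by the execute calls for each step.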

Embeddings

trace = client.traces.create(metadata={"user_id": "u_123", "intent": "embedding"})
emb = client.embeddings.create(
    trace_id = trace.id,
    input = ["Document one.", "Document two."],
    model = "text-embedding-3-small",
)

print(len(emb.embeddings))
print(emb.usage_tokens)

If input validation fails, the SDK raises olyx.ValidationError.
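A common next step is comparing the returned vectors. The sketch below computes cosine similarity in plain Python. It assumes each entry of `emb.embeddings` is a sequence of floats, which this page does not state explicitly.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)
```

Under that assumption, `cosine_similarity(emb.embeddings[0], emb.embeddings[1])` would score the two documents from the example above.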


User retention analytics

Attach user and feature metadata to traces so usage is attributable in analytics:

trace = client.traces.create(
    metadata = {
        "user_id": "u_123",
        "org_id": "org_abc",
        "intent": "email_draft",
        "feature": "sales_assistant",
    }
)

result = client.execute(
    trace_id = trace.id,
    input = "Draft a follow-up email for this deal.",
)

The Safety Valve: Fail-Closed vs. Fail-Open

Fail-closed (default): if the gateway is unreachable, execute raises olyx.CircuitBreakerError.

Fail-open: configure fallback provider settings and pass fail_open=True when needed:

olyx.configure(
    api_key = os.environ["OLYX_API_KEY"],
    fail_open = True,
    fallback_provider_url = "https://api.openai.com/v1",
    fallback_api_key = os.environ["OPENAI_API_KEY"],
    fallback_model = "gpt-4o-mini",
)

client = olyx.Olyx()
trace = client.traces.create(metadata={"user_id": "u_123"})
result = client.execute(
    trace_id = trace.id,
    input = "Summarise this internal changelog.",
    fail_open = True,
)

print(result.bypass)  # True when bypass path was used
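Since a bypassed call skips the governed path, it may be worth limiting fail-open to requests where that is acceptable. The sketch below makes the per-call decision from the trace intent. `FAIL_OPEN_INTENTS` and `execute_with_valve` are hypothetical names, and passing `fail_open=False` explicitly is assumed to behave like the fail-closed default.

```python
# Hypothetical allow-list: intents where bypassing the gateway is acceptable.
FAIL_OPEN_INTENTS = frozenset({"summarization"})


def execute_with_valve(client, trace_id: str, prompt: str, intent: str):
    """Call client.execute, opting into fail-open only for allow-listed intents."""
    return client.execute(
        trace_id=trace_id,
        input=prompt,
        # Assumption: fail_open=False here matches the fail-closed default.
        fail_open=intent in FAIL_OPEN_INTENTS,
    )
```

Callers can still inspect `result.bypass` afterwards to record when the fallback path was actually taken.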

Testing

Use a dedicated test project with a project-scoped API key. All SDK calls route to the real Olyx backend — traces are created, policy is enforced, and executions count against your quota exactly as in production. Test-environment behaviour is verifiable against real governance rules.

import os
import olyx

olyx.configure(
    api_key  = os.environ["OLYX_TEST_API_KEY"],
    base_url = os.environ.get("OLYX_BASE_URL", "https://olyx.ai"),
    mock     = False,
)
client = olyx.Olyx()

Set a spend cap on the test project key to bound runaway test costs. Use a separate test project so production trace history stays clean.

Controlling test outcomes

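Use client.simulate to exercise your routing policy without invoking a model. Python SDK 0.1.x does not expose this helper yet (see Simulate / dry-run), so the call below applies only to SDK versions that include it: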

result = client.simulate(input="What is 2+2?")
# result.status == "resolved"
# result.estimated_cost == 0.00018

Use client.checks to test guardrail logic against specific inputs:

def guard_input(client, trace, user_input):
    check = client.checks(trace_id=trace.id, input=user_input)
    if not check.allowed:
        return {"error": "Request blocked"}, 403
    return None

Offline testing (enterprise)

Enterprise plans include an offline testing flag that enables zero-network test execution. The SDK reads this capability at initialisation via GET /api/v1/sdk/config:

olyx.configure(
    api_key  = os.environ["OLYX_API_KEY"],
    offline  = os.environ.get("OLYX_ENV") == "test",  # only resolves if plan permits
)

Offline mode returns locally-generated stub responses with the same shape as real responses — no HTTP call, no trace, no quota consumption. Use it in CI pipelines with strict egress controls or air-gapped environments.

If offline=True is set on a plan without the offline_testing feature, the SDK raises olyx.ConfigurationError rather than silently falling back to online mode.


Error reference

All SDK errors inherit from olyx.OlyxError and include .status plus optional .code.

| Class | When raised |
| --- | --- |
| olyx.AuthError | 401: missing, revoked, or expired API key |
| olyx.NotFoundError | 404: resource not found or belongs to another account |
| olyx.ValidationError | 400/422: request validation failed |
| olyx.RateLimitError | 429: rate limit or spend cap reached |
| olyx.ServerError | 5xx from the gateway |
| olyx.GatewayError | Network or transport failure while reaching the gateway |
| olyx.CircuitBreakerError | Gateway unreachable and fail_open is False |
| olyx.ConfigurationError | Invalid or missing SDK configuration |

def run_prompt(client):
    trace = client.traces.create(metadata={"user_id": "u_123"})

    try:
        result = client.execute(trace_id=trace.id, input="...")
    except olyx.CircuitBreakerError:
        return {"error": "AI service temporarily unavailable."}, 503
    except olyx.RateLimitError as e:
        if e.code == "CIRCUIT_OPEN":
            pass  # handle an open circuit
        elif e.code == "LOOP_DETECTED":
            pass  # handle a detected loop
    except olyx.AuthError:
        pass  # handle a missing, revoked, or expired key
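In a web handler, the error classes in this section usually need to become an HTTP response. The sketch below maps them by class name so it runs without importing the SDK. The status codes and messages are illustrative choices, not SDK behavior, and `error_to_response` is a hypothetical helper. It reads the optional `.code` attribute described at the top of this section.

```python
from typing import Any, Dict, Tuple

# Hypothetical mapping from SDK error class names to (HTTP status, message).
RESPONSES = {
    "ValidationError": (400, "Invalid request."),
    "RateLimitError": (429, "Too many requests, retry later."),
    "CircuitBreakerError": (503, "AI service temporarily unavailable."),
    "GatewayError": (503, "AI service temporarily unavailable."),
    "ServerError": (503, "AI service temporarily unavailable."),
}


def error_to_response(exc: Exception) -> Tuple[Dict[str, Any], int]:
    """Turn an SDK error into a (body, http_status) pair."""
    # Unlisted classes (for example AuthError) fall back to a generic 500,
    # so key or configuration problems are not exposed to end users.
    http_status, message = RESPONSES.get(type(exc).__name__, (500, "AI request failed."))
    body: Dict[str, Any] = {"error": message}
    code = getattr(exc, "code", None)
    if code is not None:
        body["code"] = code
    return body, http_status
```

A handler could then use a single `except olyx.OlyxError as e: return error_to_response(e)` clause instead of one branch per class.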

Private Agent Routes

The Olyx Agent is a lightweight, outbound-only container for selected private beta deployments. Your application points the SDK at an internal hostname, and the agent forwards Olyx requests outbound through your normal network controls.

Use the agent when the hosted gateway cannot reach an internal provider endpoint or when your deployment needs an internal egress point. Most beta teams can start with the hosted gateway and add the agent later.

Agent quickstart (private deployment)

import os
import olyx

olyx.configure(
    api_key  = os.environ["OLYX_API_KEY"],
    base_url = "http://olyx-agent:4000",
)

client = olyx.Olyx()
client.ping()  # {"status": "ok", ...}

Then run execute as normal — only base_url changes.

Start the agent

docker run -d \
  --name olyx-agent \
  -e OLYX_API_KEY="$OLYX_API_KEY" \
  -p 4000:4000 \
  olyxlabs/olyx-agent:latest

The agent exposes the same API shape as the hosted gateway. It applies your project-level policy before forwarding requests through the configured outbound path.

Point the SDK at the agent

import os
import olyx

olyx.configure(
    api_key  = os.environ["OLYX_API_KEY"],
    base_url = "http://olyx-agent:4000",   # internal hostname
    fail_open = False,
)

SDK behavior is the same from the application perspective — only base_url changes.

Kubernetes sidecar

Run the agent as a sidecar in the same pod as your application:

# deployment.yaml (relevant section)
containers:
  - name: app
    image: your-app:latest
    env:
      - name: OLYX_GATEWAY_URL
        value: "http://localhost:4000"
  - name: olyx-agent
    image: olyxlabs/olyx-agent:latest
    env:
      - name: OLYX_API_KEY
        valueFrom:
          secretKeyRef:
            name: olyx-secrets
            key: api-key
    ports:
      - containerPort: 4000
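The manifest sets `OLYX_GATEWAY_URL` on the app container, while the SDK examples on this page pass `base_url` explicitly. This page does not say whether the SDK reads that variable on its own, so the sketch below assumes it does not and wires it through by hand. `sdk_settings_from_env` is a hypothetical helper.

```python
import os
from typing import Dict, Mapping, Optional


def sdk_settings_from_env(env: Optional[Mapping[str, str]] = None) -> Dict[str, str]:
    """Build keyword arguments for olyx.configure from environment variables."""
    env = os.environ if env is None else env
    settings: Dict[str, str] = {}
    # The manifest above sets no key on the app container, so the key is
    # passed through only when it is present.
    if "OLYX_API_KEY" in env:
        settings["api_key"] = env["OLYX_API_KEY"]
    # Assumption: the SDK does not pick up OLYX_GATEWAY_URL automatically.
    if "OLYX_GATEWAY_URL" in env:
        settings["base_url"] = env["OLYX_GATEWAY_URL"]
    return settings
```

The application would then call `olyx.configure(**sdk_settings_from_env())` at startup.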

Operational behavior

| Behavior | Detail |
| --- | --- |
| Outbound-first | Designed for deployments where your network initiates connections outward. |
| Credential placement | Keep the Olyx API key in the agent or secret manager rather than hardcoding it in app code. |
| Network visibility | Route Olyx-bound model traffic through infrastructure your team already monitors. |
| Policy path | Project-level routing, cost caps, and PII checks still happen before provider execution. |
| Fail-closed default | If the agent is unreachable, olyx.CircuitBreakerError is raised unless you explicitly opt into fail-open behavior. |

TLS

If your network terminates TLS at an internal boundary, pass your CA bundle to the agent container:

docker run -d \
  --name olyx-agent \
  -e OLYX_API_KEY="$OLYX_API_KEY" \
  -v /etc/ssl/internal:/etc/ssl/internal:ro \
  -e SSL_CERT_FILE="/etc/ssl/internal/ca-bundle.crt" \
  -p 4000:4000 \
  olyxlabs/olyx-agent:latest

Never pass verify=False in production. Use it only in local development against a self-signed cert.

Timeout tuning

Internal networks have much lower latency than the public internet. Tighten the SDK timeout so a silently failed agent surfaces faster:

olyx.configure(
    api_key  = os.environ["OLYX_API_KEY"],
    base_url = "http://olyx-agent:4000",
    timeout  = 5.0,  # seconds; default is 30.0
)

Verifying connectivity

curl -s http://olyx-agent:4000/up
# → {"status":"ok","version":"1.4.2"}

In FastAPI, add a startup check in the lifespan:

from contextlib import asynccontextmanager

import olyx
from fastapi import FastAPI

@asynccontextmanager
async def lifespan(app: FastAPI):
    app.state.olyx = olyx.Olyx()
    app.state.olyx.ping()  # connectivity check at startup
    yield

In Django, add it to AppConfig.ready():

from django.apps import AppConfig
from django.conf import settings

class AIConfig(AppConfig):
    def ready(self):
        import olyx
        if not settings.DEBUG:
            olyx.Olyx().ping()
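When the agent runs as a sidecar, it may not be listening yet when the application starts, so a single startup ping can fail spuriously. The sketch below retries `client.ping()` a few times before giving up. `wait_for_agent` is a hypothetical helper, and it assumes `ping` raises an exception on failure, which this page does not confirm.

```python
import time


def wait_for_agent(client, attempts: int = 5, delay: float = 1.0) -> dict:
    """Call client.ping() until it succeeds or the attempts run out."""
    last_error = None
    for attempt in range(1, attempts + 1):
        try:
            return client.ping()
        except Exception as exc:  # assumption: ping raises when the agent is down
            last_error = exc
            if attempt < attempts:
                time.sleep(delay)
    raise RuntimeError(f"agent not reachable after {attempts} attempts") from last_error
```

It can replace the bare `ping()` call in either startup hook above. Keep `attempts * delay` well under your orchestrator's startup timeout.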

Gateway migration through the agent

Existing code using the OpenAI Python SDK can route through the agent by changing the base URL to your internal agent hostname:

import os

from openai import OpenAI

client = OpenAI(
    api_key  = os.environ["OLYX_API_KEY"],
    base_url = "http://olyx-agent:4000/v1",
)

# All existing code unchanged; PII scrubbing and routing are applied by the agent
response = client.chat.completions.create(model="gpt-4o", messages=[...])

For custom TLS with the OpenAI SDK, pass an httpx client pointing at your agent:

import os

import httpx
from openai import OpenAI

http_client = httpx.Client(verify="/etc/ssl/internal/ca-bundle.crt")

client = OpenAI(
    api_key     = os.environ["OLYX_API_KEY"],
    base_url    = "https://olyx-agent.internal/v1",
    http_client = http_client,
)

Regional routing

If you run services in multiple regions, put regional agent instances behind your own internal routing layer and point the SDK at that stable base_url. Keep the first beta deployment simple; add regional routing only after trace latency shows that it matters.
