Traces
A trace is the durable record for one logical AI task. Keep one trace per user-visible action so the cost, latency, routing, replay, and security data stay easy to read later.
During closed beta, treat traces as operational observability records. They are built to help engineers debug, replay, and understand AI traffic; they are not a billing processor, compliance archive, or provider invoice.
| Metric | Original | Replay |
|---|---|---|
| Model | gpt-4o-mini | |
| Cost | $0.00032 | |
| Latency | 1,240ms | |
| Grade | A | |
| Delta | baseline | |
CHECK Input validation 12ms
| Signal | Value |
|---|---|
| allowed | true |
| pii_detected | false |
| injection_attempt | false |
| secret_leaked | false |
| risk_score | 0.02 |
ROUTE Policy routing 3ms
RUN gpt-4o-mini $0.00032 1,240ms
| Model | Cost | Tier | Status |
|---|---|---|---|
| gpt-4o-mini | $0.00032 | medium | selected |
| gpt-4o | $0.00318 | complex | |
| claude-haiku-4-5 | $0.00025 | simple | |
LOG Trace recorded 0ms
| Field | Value |
|---|---|
| optimization_grade | A |
| total_cost | 0.00032 |
| total_latency_ms | 1255 |
| grades | {"overall":"A","waste":"A","latency":"B"} |
Trace Lifecycle
Most integrations follow the same flow: create the trace, execute the work inside it, complete it when the task is done, and read it only when you need the detailed record.
| Step | What happens | API surface |
|---|---|---|
| Create | Open a trace before model work starts. Attach metadata and optional request revenue. | POST /api/v1/traces |
| Execute | Run one or more governed model calls inside that trace. | POST /api/v1/executions or SDK execute() |
| Complete | Close the trace after the last model/tool step so Olyx can calculate grades and summary fields. | PATCH /api/v1/traces/:id/complete |
| Read | Retrieve the trace when you need steps, routing details, graph data, or security signals. | GET /api/v1/traces/:id |
Completing the trace returns the computed summary. Read it again only when you need the full steps, graph, routing decision, or security detail.
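As a rough illustration of the lifecycle table, the sketch below maps each step to the HTTP method and path it uses. `LIFECYCLE` and `request_for` are hypothetical names introduced here and are not part of the Olyx SDK; the code only builds the request line and sends nothing.

```python
from typing import Optional, Tuple

# Hypothetical lookup table built from the lifecycle table above.
# "{id}" marks where the trace id is substituted.
LIFECYCLE = {
    "create": ("POST", "/api/v1/traces"),
    "execute": ("POST", "/api/v1/executions"),
    "complete": ("PATCH", "/api/v1/traces/{id}/complete"),
    "read": ("GET", "/api/v1/traces/{id}"),
}


def request_for(step: str, trace_id: Optional[str] = None) -> Tuple[str, str]:
    """Return (method, path) for a lifecycle step without sending anything."""
    method, template = LIFECYCLE[step]
    if "{id}" in template:
        if not trace_id:
            raise ValueError(f"{step} requires a trace id")
        return method, template.format(id=trace_id)
    return method, template


print(request_for("create"))
print(request_for("complete", "550e8400-e29b-41d4-a716-446655440000"))
```

Keeping the mapping in one place makes it easier to see that only the complete and read steps address a specific trace by id.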
Creating a Trace
Create the trace first so that later execution, cost, and replay data land on the right record. Attach any metadata you want to filter on later, such as user_id, tenant_id, feature, plan, or your internal request id.
metadata must be a JSON object when provided. revenue is optional and lets Olyx calculate gross margin in cost summaries.
POST /api/v1/traces
Authorization: Bearer ak_<key_id>.<secret>
Content-Type: application/json
{
"metadata": {
"user_id": "user_123",
"feature": "translation",
"plan": "team"
},
"revenue": 0.50
}
The create response is intentionally lightweight. It confirms that the trace exists and stays small because no execution has happened yet.
{
"id": "550e8400-e29b-41d4-a716-446655440000",
"status": "pending",
"created_at": "2026-04-12T09:00:00Z",
"metadata": {
"user_id": "user_123",
"feature": "translation",
"plan": "team"
},
"revenue": 0.50
}
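The body rules for trace creation (metadata must be a JSON object when provided, revenue is optional) can be checked on the client before the request is sent. The following is a minimal sketch under that reading; `build_create_trace_body` is a hypothetical helper, not an SDK function, and it only serializes the body.

```python
import json


def build_create_trace_body(metadata=None, revenue=None) -> str:
    """Serialize a POST /api/v1/traces body, validating the documented rules."""
    body = {}
    if metadata is not None:
        # A JSON object corresponds to a Python dict; lists and strings are rejected.
        if not isinstance(metadata, dict):
            raise TypeError("metadata must be a JSON object (a dict)")
        body["metadata"] = metadata
    if revenue is not None:
        if not isinstance(revenue, (int, float)):
            raise TypeError("revenue must be a number")
        body["revenue"] = revenue
    return json.dumps(body)


print(build_create_trace_body({"user_id": "user_123", "feature": "translation"}, 0.50))
```

Both fields are omitted from the body when not supplied, which matches the description of them as optional.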
Executing Inside a Trace
Pass the trace id to every governed model call so Olyx can connect routing, cost, latency, tool behavior, and security signals back to the original action.
POST /api/v1/executions
Authorization: Bearer ak_<key_id>.<secret>
Content-Type: application/json
{
"trace_id": "550e8400-e29b-41d4-a716-446655440000",
"input": "Translate the following to French: Hello, world."
}
For application code, prefer the SDK execute() wrapper. It handles the canonical endpoint, request shape, errors, and response normalization for the language you are using.
Completing a Trace
Call complete once after the last model call or tool loop for the user-visible task has finished.
PATCH /api/v1/traces/550e8400-e29b-41d4-a716-446655440000/complete
Authorization: Bearer ak_<key_id>.<secret>
Completion marks the trace as completed, calculates the summary, and returns the values you usually want for dashboards and replays.
{
"id": "550e8400-e29b-41d4-a716-446655440000",
"status": "completed",
"created_at": "2026-04-12T09:00:00Z",
"metadata": {
"user_id": "user_123",
"feature": "translation",
"plan": "team"
},
"revenue": 0.50,
"optimization_grade": "B",
"grades": {
"overall": "B",
"waste": "A",
"latency": "B"
},
"total_cost": 0.00318,
"summary": {
"total_latency_ms": 1240.5,
"total_cost": 0.00318,
"revenue": 0.50,
"gross_margin": 0.49682,
"optimization_grade": "B",
"grades": {
"overall": "B",
"waste": "A",
"latency": "B"
},
"by_model": {
"gpt-4o": 0.00318
},
"by_infrastructure": {
"public_cloud": 0.00318,
"private": 0.0
},
"step_count": 2,
"chain_depth": 0,
"tool_overhead_ms": 0.0,
"stall_probability": 0.02,
"latency_p50": 870.0,
"latency_p95": 1200.0,
"latency_p99": 1238.0
}
}
You do not have to complete immediately. Complete after your own post-processing finishes, but complete every trace you want to include in cost reporting, grades, replay workflows, or operational dashboards.
Retrieving a Trace
Retrieve a trace when you need the full diagnostic record: steps, graph structure, routing decision, and security signals. The list endpoint stays small; the show endpoint is the detailed view.
GET /api/v1/traces/550e8400-e29b-41d4-a716-446655440000
Authorization: Bearer ak_<key_id>.<secret>
{
"id": "550e8400-e29b-41d4-a716-446655440000",
"status": "completed",
"created_at": "2026-04-12T09:00:00Z",
"optimization_grade": "B",
"grades": {
"overall": "B",
"waste": "A",
"latency": "B"
},
"intent": "translation",
"step_count": 2,
"chain_depth": 0,
"tool_overhead_ms": 0.0,
"stall_probability": 0.02,
"pii_detected": false,
"injection_attempt": false,
"secret_leaked": false,
"secret_match_count": 0,
"tool_fidelity_score": 0.98,
"shadow_score": null,
"shadow_model": null,
"summary": {
"total_latency_ms": 1240.5,
"total_cost": 0.00318,
"revenue": 0.50,
"gross_margin": 0.49682,
"optimization_grade": "B",
"grades": {
"overall": "B",
"waste": "A",
"latency": "B"
},
"by_model": {
"gpt-4o": 0.00318
},
"by_infrastructure": {
"public_cloud": 0.00318,
"private": 0.0
},
"step_count": 2,
"chain_depth": 0,
"tool_overhead_ms": 0.0,
"stall_probability": 0.02,
"latency_p50": 870.0,
"latency_p95": 1200.0,
"latency_p99": 1238.0
},
"steps": [
{
"id": 101,
"type": "check",
"created_at": "2026-04-12T09:00:00Z",
"input": "Translate the following to French: Hello, world.",
"output": {
"allowed": true
},
"cost": null,
"meta": {
"model": null,
"latency_ms": 12.4
},
"parent_step_id": null
},
{
"id": 102,
"type": "run",
"created_at": "2026-04-12T09:00:01Z",
"input": "Translate the following to French: Hello, world.",
"output": "Bonjour le monde.",
"cost": 0.00318,
"meta": {
"model": "gpt-4o",
"latency_ms": 1228.1
},
"parent_step_id": null
}
],
"graph": {
"id": "550e8400-e29b-41d4-a716-446655440000",
"children": []
},
"routing_decision": {
"decision": "gpt-4o",
"score": {
"latency": 0.82,
"cost": 0.91
},
"metadata": {
"selected_from": "project_policy",
"reasoning": "translation request selected a general model",
"candidates": ["gpt-4o", "gpt-4o-mini"],
"resolution": {
"strategy": "balanced",
"attempted_tiers": ["simple", "standard"],
"fallback_used": false,
"fallback_source": null
}
}
}
}
For the fuller operational breakdown, see Summary Fields, Security Signals, Trace Status, Listing Traces, and Trace Steps.
Summary Fields
Summary fields are the compact operational view of the trace. They are returned from completion and from trace retrieval.
| Field | Description |
|---|---|
| total_latency_ms | Sum of recorded step latency for the trace. Useful for request-level debugging. |
| total_cost | Combined model cost for all cost-bearing steps in the trace. |
| revenue | Optional request revenue supplied at trace creation. |
| gross_margin | revenue - total_cost. Use this for unit economics, not invoicing. |
| optimization_grade | Composite A-F signal for the trace. Equivalent to grades.overall when available. |
| grades.overall | Weighted cost and latency grade. |
| grades.waste | Cost-efficiency grade relative to the cheapest registered viable model. |
| grades.latency | Latency grade based on total run-step time. |
| by_model | Cost grouped by model identifier. |
| by_infrastructure | Cost grouped into public_cloud and private buckets. |
| step_count | Total number of steps recorded on the trace. |
| chain_depth | Number of tool-call steps; a quick signal for agentic depth. |
| tool_overhead_ms | Time spent waiting on tool execution. |
| stall_probability | Probability from 0 to 1 that the execution path was likely to stall or loop. |
| latency_p50, latency_p95, latency_p99 | Percentile latency values across run steps for the trace. |
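To make the arithmetic behind a few of these fields concrete, the sketch below recomputes total_cost, total_latency_ms, by_model, step_count, and gross_margin from a list of steps shaped like the retrieval example. It follows the field descriptions in the table and is not Olyx's actual implementation; `summarize_steps` is a hypothetical name.

```python
from collections import defaultdict


def summarize_steps(steps, revenue=None):
    """Recompute a few summary fields from trace steps (illustrative only)."""
    total_cost = 0.0
    total_latency_ms = 0.0
    by_model = defaultdict(float)
    for step in steps:
        meta = step.get("meta") or {}
        total_latency_ms += meta.get("latency_ms") or 0.0
        cost = step.get("cost")
        if cost is not None:  # e.g. the check step has a null cost
            total_cost += cost
            by_model[meta.get("model")] += cost
    summary = {
        "total_cost": total_cost,
        "total_latency_ms": total_latency_ms,
        "by_model": dict(by_model),
        "step_count": len(steps),
    }
    if revenue is not None:
        # gross_margin = revenue - total_cost, as defined in the table
        summary["gross_margin"] = revenue - total_cost
    return summary


# Steps reduced to the fields used here, taken from the retrieval example.
steps = [
    {"type": "check", "cost": None, "meta": {"model": None, "latency_ms": 12.4}},
    {"type": "run", "cost": 0.00318, "meta": {"model": "gpt-4o", "latency_ms": 1228.1}},
]
summary = summarize_steps(steps, revenue=0.50)
print(summary)
```

With the example values, the two step latencies (12.4 and 1228.1) add up to the documented total_latency_ms of 1240.5, and 0.50 minus 0.00318 gives the documented gross_margin of 0.49682, up to floating-point rounding.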
Security Signals
Security signals are returned at the top level of the trace detail response. They summarize what Olyx detected while the request moved through the governed path.
| Field | Description |
|---|---|
| pii_detected | PII was found in at least one request or response path. |
| injection_attempt | Prompt-injection language or behavior was detected. |
| secret_leaked | A secret or token-like value appeared in model output. |
| secret_match_count | Number of distinct secret-like matches detected. |
| tool_fidelity_score | Average tool-argument fidelity score. Values near 1.0 are clean; lower values suggest invented or mismatched tool arguments. |
| shadow_score | Optional score from shadow-model evaluation, when configured. |
| shadow_model | Optional shadow model used for evaluation. |
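One possible way to act on these signals is to collect the ones that fired into a list for review. The sketch below assumes a trace detail response already parsed into a dict; `security_flags` and the 0.9 fidelity threshold are illustrative assumptions, not values defined by Olyx.

```python
BOOLEAN_SIGNALS = ("pii_detected", "injection_attempt", "secret_leaked")


def security_flags(trace, min_tool_fidelity=0.9):
    """Return the names of security signals that warrant a closer look."""
    flags = [name for name in BOOLEAN_SIGNALS if trace.get(name)]
    score = trace.get("tool_fidelity_score")
    # The threshold is an arbitrary example; tune it for your own traffic.
    if score is not None and score < min_tool_fidelity:
        flags.append("low_tool_fidelity")
    return flags


clean_trace = {
    "pii_detected": False,
    "injection_attempt": False,
    "secret_leaked": False,
    "secret_match_count": 0,
    "tool_fidelity_score": 0.98,
}
print(security_flags(clean_trace))
```

A missing or null tool_fidelity_score is treated as "no signal" rather than as a failure, since the field may be absent when no tools were called.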
Trace Status
| Status | Meaning |
|---|---|
| pending | The trace is open. Execution may still be running or the application has not called complete. |
| completed | The trace was closed and summaries have been calculated. |
| replay | The trace was produced by replaying an earlier trace. |
| failed | Execution hit an unrecoverable error, such as provider failure, timeout, or exhausted fallback. |
Listing Traces
Use listing for tables, polling, and dashboard views. It returns summaries, not full step detail.
GET /api/v1/traces?page=1&per_page=50&status=completed
Authorization: Bearer ak_<key_id>.<secret>
{
"data": [
{
"id": "550e8400-e29b-41d4-a716-446655440000",
"status": "completed",
"created_at": "2026-04-12T09:00:00Z",
"optimization_grade": "B",
"grades": {
"overall": "B",
"waste": "A",
"latency": "B"
},
"intent": "translation",
"total_cost": 0.00318
}
],
"meta": {
"page": 1,
"per_page": 50
}
}
| Parameter | Description |
|---|---|
| page | Page number. Defaults to 1. |
| per_page | Results per page. Defaults to 50; maximum is 100. |
| status | Filter by pending, completed, replay, or failed. |
| replay_of | Return replay traces derived from a source trace id. |
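The parameters above can be assembled into a request path with the standard library. This is a minimal sketch: `list_traces_path` is a hypothetical helper, and rejecting out-of-range values on the client is a design choice made here, since the document does not say how the server treats them.

```python
from urllib.parse import urlencode

VALID_STATUSES = {"pending", "completed", "replay", "failed"}


def list_traces_path(page=1, per_page=50, status=None, replay_of=None):
    """Build the path and query string for GET /api/v1/traces."""
    if page < 1:
        raise ValueError("page must be 1 or greater")
    if not 1 <= per_page <= 100:
        raise ValueError("per_page must be between 1 and 100")
    params = {"page": page, "per_page": per_page}
    if status is not None:
        if status not in VALID_STATUSES:
            raise ValueError(f"unknown status: {status}")
        params["status"] = status
    if replay_of is not None:
        params["replay_of"] = replay_of
    # urlencode keeps the insertion order of the dict.
    return "/api/v1/traces?" + urlencode(params)


print(list_traces_path(status="completed"))
```

Called with only status="completed", the helper reproduces the example request path shown earlier in this section.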
End-to-End Workflow
These examples show the normal shape: create the trace, execute inside it, complete it in a finally/ensure path, then read the full details only when you need the diagnostic record.
TypeScript:
import Olyx, {
  GatewayError,
  RateLimitError,
} from "@olyx-labs/olyx";
const client = new Olyx({ apiKey: process.env.OLYX_API_KEY! });
const trace = await client.traces.create({
  metadata: {
    userId: "u_123",
    feature: "translation",
  },
  revenue: 0.10,
});
try {
  const result = await client.execute({
    traceId: trace.data.id,
    input: "Translate to French: Hello, world.",
  });
  if (result.blocked) {
    throw new Error(result.data.reason ?? "Request blocked by policy");
  }
  console.log(result.data.output);
} catch (error) {
  if (error instanceof RateLimitError) {
    throw new Error(`Rate limited: ${error.message}`);
  }
  if (error instanceof GatewayError) {
    throw new Error("AI gateway temporarily unavailable");
  }
  throw error;
} finally {
  try {
    const completion = await client.traces.complete(trace.data.id);
    console.log(completion.data.totalCost);
  } catch (completionError) {
    console.warn("Trace completion failed", completionError);
  }
}
const details = await client.traces.find(trace.data.id);
console.log(details.data.summary?.grossMargin);
Python:
import os
import olyx
olyx.configure(api_key=os.environ["OLYX_API_KEY"])
client = olyx.Olyx()
trace = client.traces.create(
    metadata={
        "user_id": "u_123",
        "feature": "translation",
    },
    revenue=0.10,
)
try:
    result = client.execute(
        trace_id=trace.id,
        input="Translate to French: Hello, world.",
    )
    if result.blocked:
        raise RuntimeError(result.reason or "Request blocked by policy")
    print(result.output)
except olyx.RateLimitError as exc:
    raise RuntimeError(f"Rate limited: {exc.message}") from exc
except olyx.GatewayError as exc:
    raise RuntimeError("AI gateway temporarily unavailable") from exc
finally:
    try:
        completion = client.traces.complete(trace.id)
        print(completion.get("total_cost"))
    except Exception:
        pass
details = client.traces.find(trace.id)
print(details.summary.get("gross_margin"))
Ruby:
require "olyx"
client = Olyx::Client.new(api_key: ENV.fetch("OLYX_API_KEY"))
trace = client.traces.create(
  metadata: {
    user_id: "u_123",
    feature: "translation"
  },
  revenue: 0.10
)
begin
  result = client.execute(
    trace_id: trace.id,
    input: "Translate to French: Hello, world."
  )
  raise(result.reason || "Request blocked by policy") if result.blocked?
  puts result.output
rescue Olyx::RateLimitError => e
  raise "Rate limited: #{e.message}"
rescue Olyx::GatewayError => e
  raise "AI gateway temporarily unavailable: #{e.message}"
ensure
  begin
    completion = client.traces.complete(trace.id)
    puts completion.total_cost
  rescue Olyx::Error
    nil
  end
end
details = client.traces.find(trace.id)
puts details.dig(:summary, "gross_margin")
Trace Steps
Each trace contains ordered steps showing exactly what happened. Steps are the most useful part of the trace when you are debugging why a request cost more than expected, why a tool loop happened, or why a model was selected.
| Step type | What it represents |
|---|---|
| check | Safety screening of the input before model execution. |
| route | Model selection decision and routing metadata. |
| run | A model call with model, latency, and cost. |
| fanout | Parallel execution across multiple models. |
| evaluate | Evaluation when multiple candidate responses were produced. |
| select | Final model or response selected from a fanout. |
| blocked | Request stopped by policy, safety, missing configuration, or routing failure. |
| tool_call | The model requested a tool invocation. |
| tool_result | Your application returned tool output or a tool error. |
| embedding | An embedding request with token usage and cost. |
| image_generation | An image generation request with model and cost. |
| assistant_thread | One turn in a stateful multi-turn assistant conversation. |
| log | Application-defined data attached to the trace without a model call. |
| passthrough | Request forwarded without policy modification. |
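When debugging, a quick tally of step types is often the first thing to look at. The sketch below counts steps by type and reports any type not listed in the table; `KNOWN_STEP_TYPES`, `step_type_counts`, and `unknown_step_types` are hypothetical names, and the sample steps are reduced to the type field only.

```python
from collections import Counter

# Step types copied from the table above.
KNOWN_STEP_TYPES = {
    "check", "route", "run", "fanout", "evaluate", "select", "blocked",
    "tool_call", "tool_result", "embedding", "image_generation",
    "assistant_thread", "log", "passthrough",
}


def step_type_counts(steps):
    """Count how many steps of each type a trace contains."""
    return Counter(step.get("type") for step in steps)


def unknown_step_types(steps):
    """Return, sorted, any step types that the table does not list."""
    return sorted({str(step.get("type")) for step in steps} - KNOWN_STEP_TYPES)


steps = [
    {"type": "check"},
    {"type": "route"},
    {"type": "tool_call"},
    {"type": "tool_result"},
    {"type": "run"},
]
counts = step_type_counts(steps)
print(counts["tool_call"], unknown_step_types(steps))
```

Since chain_depth is described as the number of tool-call steps, the tool_call count from this tally is a reasonable cross-check against that summary field.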
Tool Steps
When a model requests tools, Olyx records a tool_call step. When your application returns tool output, Olyx records a tool_result step. The next run step shows how the model used the returned tool data.
[
{
"type": "tool_call",
"output": [
{
"name": "get_weather",
"arguments": {
"city": "Paris"
}
}
],
"meta": {
"latency_ms": 0
}
},
{
"type": "tool_result",
"output": [
{
"name": "get_weather",
"result": "18 C, overcast"
}
],
"meta": {
"latency_ms": 84
}
},
{
"type": "run",
"output": "It is 18 C and overcast in Paris.",
"cost": 0.00241,
"meta": {
"model": "gpt-4o",
"latency_ms": 640
}
}
]
A failed tool call should be recorded as a tool_result with an error payload. The following model step shows whether the model retried, selected a fallback, or produced a degraded answer.
{
"type": "tool_result",
"output": [
{
"name": "get_weather",
"error": "timeout after 5000ms"
}
],
"meta": {
"latency_ms": 5004
}
}
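To find tool failures in a retrieved trace, one approach is to scan tool_result steps for entries that carry an error key, as in the payload above. This is a sketch under the assumption that tool_result output is always a list of objects shaped like the examples; `failed_tool_results` is a hypothetical helper.

```python
def failed_tool_results(steps):
    """Collect tool_result entries that carry an error payload."""
    failures = []
    for index, step in enumerate(steps):
        if step.get("type") != "tool_result":
            continue
        latency_ms = (step.get("meta") or {}).get("latency_ms")
        for item in step.get("output") or []:
            if "error" in item:
                failures.append({
                    "step_index": index,  # position in the steps list
                    "name": item.get("name"),
                    "error": item["error"],
                    "latency_ms": latency_ms,
                })
    return failures


steps = [
    {"type": "tool_call",
     "output": [{"name": "get_weather", "arguments": {"city": "Paris"}}],
     "meta": {"latency_ms": 0}},
    {"type": "tool_result",
     "output": [{"name": "get_weather", "error": "timeout after 5000ms"}],
     "meta": {"latency_ms": 5004}},
]
failures = failed_tool_results(steps)
print(failures)
```

The step index makes it easy to look at the next step in the list and see how the model reacted to the failure.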
Log Steps
Use log steps for application events that should travel with the trace but are not model calls: user feedback, business outcome, selected A/B variant, cache hit status, or a workflow checkpoint.
POST /api/v1/logs
Authorization: Bearer ak_<key_id>.<secret>
Content-Type: application/json
{
"trace_id": "550e8400-e29b-41d4-a716-446655440000",
"output": {
"user_rating": 5,
"accepted": true,
"ab_variant": "B"
}
}
The create response returns a small receipt instead of echoing the payload:
{
"step_id": 52,
"trace_id": "550e8400-e29b-41d4-a716-446655440000",
"type": "log",
"parent_step_id": null,
"created_at": "2026-04-12T09:00:00Z"
}
Use step_id when you need to link a later step to this event. Read the trace with GET /api/v1/traces/:id when you need the stored output back for analysis.
Embedding, Image, and Assistant Steps
embedding and image_generation steps carry model, cost, and latency in the same way a run step does. Embedding steps include token usage in output; image steps can include generated image URLs or provider metadata.
assistant_thread steps represent turns in a stateful multi-turn conversation. The thread_turn value is the 1-based turn index within the trace.
Failed Traces
A failed trace means execution encountered an unrecoverable error, such as a provider 5xx, gateway timeout, or exhausted fallback path.
{
"id": "550e8400-e29b-41d4-a716-446655440001",
"status": "failed",
"created_at": "2026-04-12T09:01:00Z",
"summary": {
"total_cost": 0.0,
"optimization_grade": null
},
"steps": [
{
"type": "check",
"output": {
"allowed": true
},
"meta": {
"latency_ms": 11.2
}
},
{
"type": "blocked",
"output": {
"reason": "provider_error",
"message": "OpenAI returned 503 after all fallback models were exhausted."
},
"meta": {
"latency_ms": 4820
}
}
]
}
Use GET /api/v1/traces?status=failed to build error-rate views, inspect provider instability, or drive retry workflows. Failed traces can still be useful: they show how far the request traveled before the failure occurred.
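A simple error-rate view can be derived from listed traces, and the last step of a failed trace shows where it stopped. The sketch below is one way to do both; `error_rate` and `last_step_reached` are hypothetical helpers, and counting only completed and failed traces in the denominator (ignoring pending and replay) is an assumption made here.

```python
def error_rate(traces):
    """Share of finished traces (completed or failed) that failed."""
    finished = [t for t in traces if t.get("status") in ("completed", "failed")]
    if not finished:
        return 0.0
    failed = sum(1 for t in finished if t["status"] == "failed")
    return failed / len(finished)


def last_step_reached(trace):
    """Return (step type, reason) for the final step, or None without steps."""
    steps = trace.get("steps") or []
    if not steps:
        return None
    step = steps[-1]
    output = step.get("output")
    reason = output.get("reason") if isinstance(output, dict) else None
    return step.get("type"), reason


traces = [
    {"status": "completed"},
    {"status": "completed"},
    {"status": "completed"},
    {"status": "failed"},
    {"status": "pending"},
]
failed_trace = {
    "status": "failed",
    "steps": [
        {"type": "check", "output": {"allowed": True}},
        {"type": "blocked", "output": {"reason": "provider_error"}},
    ],
}
print(error_rate(traces), last_step_reached(failed_trace))
```

In this sample, one of four finished traces failed, and the failed trace stopped at a blocked step with reason provider_error, matching the example response above.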
Your trace data is isolated to your project. A trace from another account is never visible through your API key or dashboard session.