Traces

A trace is the durable record for one logical AI task. Keep one trace per user-visible action so the cost, latency, routing, replay, and security data stay easy to read later.

During closed beta, treat traces as operational observability records. They are built to help engineers debug, replay, and understand AI traffic; they are not a billing processor, compliance archive, or provider invoice.

Metric	Original	Replay
Model	gpt-4o-mini
Cost	$0.00032
Latency	1,240ms
Grade	A
Delta	baseline

CHECK Input validation 12ms

ANALYZER OUTPUT

allowed	true
pii_detected	false
injection_attempt	false
secret_leaked	false
risk_score	0.02

ROUTE Policy routing 3ms

WHY THIS MODEL?

Chosen gpt-4o-mini

Tier medium

Reason Lowest cost model registered for the Medium complexity tier

Fallback not triggered

RUN gpt-4o-mini $0.00032 1,240ms

COST COMPARISON

Model	Cost	Tier
gpt-4o-mini	$0.00032	medium	selected
gpt-4o	$0.00318	complex
claude-haiku-4-5	$0.00025	simple

Fallback chain gpt-4o-mini → gpt-4o not triggered

LOG Trace recorded 0ms

RECORDED

optimization_grade	A
total_cost	0.00032
total_latency_ms	1255
grades	{"overall":"A","waste":"A","latency":"B"}

Trace Lifecycle

Most integrations follow the same flow: create the trace, execute the work inside it, complete it when the task is done, and read it only when you need the detailed record.

Step	What happens	API surface
Create	Open a trace before model work starts. Attach metadata and optional request revenue.	`POST /api/v1/traces`
Execute	Run one or more governed model calls inside that trace.	`POST /api/v1/executions` or SDK `execute()`
Complete	Close the trace after the last model/tool step so Olyx can calculate grades and summary fields.	`PATCH /api/v1/traces/:id/complete`
Read	Retrieve the trace when you need steps, routing details, graph data, or security signals.	`GET /api/v1/traces/:id`

Completing the trace returns the computed summary. Retrieve the trace afterward only when you need the full steps, graph, routing decision, or security detail.

Creating a Trace

Create the trace first so the later execution, cost, and replay data land on the right record. Attach any metadata you want to filter on later, such as user_id, tenant_id, feature, plan, or your internal request id.

`metadata` must be a JSON object when provided. `revenue` is optional and lets Olyx calculate gross margin in cost summaries.

POST /api/v1/traces
Authorization: Bearer ak_<key_id>.<secret>
Content-Type: application/json

{
  "metadata": {
    "user_id": "user_123",
    "feature": "translation",
    "plan": "team"
  },
  "revenue": 0.50
}

The create response is intentionally lightweight. It confirms that the trace exists and stays small because no execution has happened yet.

{
  "id": "550e8400-e29b-41d4-a716-446655440000",
  "status": "pending",
  "created_at": "2026-04-12T09:00:00Z",
  "metadata": {
    "user_id": "user_123",
    "feature": "translation",
    "plan": "team"
  },
  "revenue": 0.50
}
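If you call the REST endpoint directly, it can help to validate the body before sending it. The following is a minimal sketch of the rule that `metadata` must be a JSON object; `build_create_trace_body` is a hypothetical helper name, not part of any Olyx SDK.

```python
import json
from typing import Any, Optional


def build_create_trace_body(
    metadata: Optional[dict[str, Any]] = None,
    revenue: Optional[float] = None,
) -> str:
    """Serialize a create-trace request body, rejecting non-object metadata."""
    body: dict[str, Any] = {}
    if metadata is not None:
        # The API expects a JSON object, so lists, strings, and numbers are rejected here.
        if not isinstance(metadata, dict):
            raise TypeError("metadata must be a JSON object (a dict)")
        body["metadata"] = metadata
    if revenue is not None:
        # revenue is optional; it is only sent when the caller supplies it.
        body["revenue"] = revenue
    # json.dumps raises TypeError if metadata holds values JSON cannot represent.
    return json.dumps(body)


body = build_create_trace_body(
    metadata={"user_id": "user_123", "feature": "translation", "plan": "team"},
    revenue=0.50,
)
print(body)
```

Failing locally on malformed metadata keeps the error close to the code that built it, instead of surfacing later as an API validation error.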

Executing Inside a Trace

Pass the trace id to every governed model call so Olyx can connect routing, cost, latency, tool behavior, and security signals back to the original action.

POST /api/v1/executions
Authorization: Bearer ak_<key_id>.<secret>
Content-Type: application/json

{
  "trace_id": "550e8400-e29b-41d4-a716-446655440000",
  "input": "Translate the following to French: Hello, world."
}

For application code, prefer the SDK `execute()` wrapper. It handles the canonical endpoint, request shape, errors, and response normalization for the language you are using.
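Where an SDK is not an option, the raw request shown above can be assembled with the Python standard library. This sketch only builds the request and does not send it. `BASE_URL` is a placeholder host and `build_execution_request` is a hypothetical helper; neither comes from the Olyx documentation.

```python
import json
import urllib.request

# Placeholder host (an assumption); replace with the host for your Olyx environment.
BASE_URL = "https://olyx.example.invalid"


def build_execution_request(api_key: str, trace_id: str, input_text: str) -> urllib.request.Request:
    """Build (but do not send) a POST /api/v1/executions request tied to a trace."""
    payload = json.dumps({"trace_id": trace_id, "input": input_text}).encode("utf-8")
    return urllib.request.Request(
        url=f"{BASE_URL}/api/v1/executions",
        data=payload,
        method="POST",
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )


request = build_execution_request(
    api_key="ak_<key_id>.<secret>",
    trace_id="550e8400-e29b-41d4-a716-446655440000",
    input_text="Translate the following to French: Hello, world.",
)
print(request.get_method(), request.full_url)
# Sending is left to the caller, for example urllib.request.urlopen(request, timeout=30).
```

Keeping the `trace_id` in one builder function makes it harder to issue a governed call that is not attached to its trace.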

Completing a Trace

Call complete once after the last model call or tool loop for the user-visible task has finished.

PATCH /api/v1/traces/550e8400-e29b-41d4-a716-446655440000/complete
Authorization: Bearer ak_<key_id>.<secret>

Completion marks the trace as completed, calculates the summary, and returns the values you usually want for dashboards and replays.

{
  "id": "550e8400-e29b-41d4-a716-446655440000",
  "status": "completed",
  "created_at": "2026-04-12T09:00:00Z",
  "metadata": {
    "user_id": "user_123",
    "feature": "translation",
    "plan": "team"
  },
  "revenue": 0.50,
  "optimization_grade": "B",
  "grades": {
    "overall": "B",
    "waste": "A",
    "latency": "B"
  },
  "total_cost": 0.00318,
  "summary": {
    "total_latency_ms": 1240.5,
    "total_cost": 0.00318,
    "revenue": 0.50,
    "gross_margin": 0.49682,
    "optimization_grade": "B",
    "grades": {
      "overall": "B",
      "waste": "A",
      "latency": "B"
    },
    "by_model": {
      "gpt-4o": 0.00318
    },
    "by_infrastructure": {
      "public_cloud": 0.00318,
      "private": 0.0
    },
    "step_count": 2,
    "chain_depth": 0,
    "tool_overhead_ms": 0.0,
    "stall_probability": 0.02,
    "latency_p50": 870.0,
    "latency_p95": 1200.0,
    "latency_p99": 1238.0
  }
}

You do not have to complete a trace immediately; completion can wait until your own post-processing finishes. Do complete every trace you want included in cost reporting, grades, replay workflows, or operational dashboards.
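One way to guarantee that completion happens exactly once, even when post-processing raises, is a context manager. The sketch below uses a stand-in `FakeTraces` class so it runs without network access; the class, its methods, and `open_trace` are illustrative assumptions, not the real SDK surface.

```python
from contextlib import contextmanager
from typing import Iterator


class FakeTraces:
    """Stand-in for an SDK traces client; records calls instead of making HTTP requests."""

    def __init__(self) -> None:
        self.completed: list[str] = []

    def create(self) -> str:
        return "trace-1"

    def complete(self, trace_id: str) -> None:
        self.completed.append(trace_id)


@contextmanager
def open_trace(traces: FakeTraces) -> Iterator[str]:
    """Create a trace, yield its id, and complete it once when the block exits."""
    trace_id = traces.create()
    try:
        yield trace_id
    finally:
        try:
            traces.complete(trace_id)
        except Exception as exc:
            # A completion failure is reported but must not mask the original error.
            print(f"Trace completion failed: {exc}")


traces = FakeTraces()
try:
    with open_trace(traces) as trace_id:
        # Simulate the application's own post-processing failing mid-task.
        raise RuntimeError("post-processing failed")
except RuntimeError:
    pass

print(traces.completed)
```

The trace is still completed after the simulated failure, so it remains eligible for cost reporting and dashboards.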

Retrieving a Trace

Retrieve a trace when you need the full diagnostic record: steps, graph structure, routing decision, and security signals. The list endpoint stays small; the show endpoint is the detailed view.

GET /api/v1/traces/550e8400-e29b-41d4-a716-446655440000
Authorization: Bearer ak_<key_id>.<secret>

{
  "id": "550e8400-e29b-41d4-a716-446655440000",
  "status": "completed",
  "created_at": "2026-04-12T09:00:00Z",
  "optimization_grade": "B",
  "grades": {
    "overall": "B",
    "waste": "A",
    "latency": "B"
  },
  "intent": "translation",
  "step_count": 2,
  "chain_depth": 0,
  "tool_overhead_ms": 0.0,
  "stall_probability": 0.02,
  "pii_detected": false,
  "injection_attempt": false,
  "secret_leaked": false,
  "secret_match_count": 0,
  "tool_fidelity_score": 0.98,
  "shadow_score": null,
  "shadow_model": null,
  "summary": {
    "total_latency_ms": 1240.5,
    "total_cost": 0.00318,
    "revenue": 0.50,
    "gross_margin": 0.49682,
    "optimization_grade": "B",
    "grades": {
      "overall": "B",
      "waste": "A",
      "latency": "B"
    },
    "by_model": {
      "gpt-4o": 0.00318
    },
    "by_infrastructure": {
      "public_cloud": 0.00318,
      "private": 0.0
    },
    "step_count": 2,
    "chain_depth": 0,
    "tool_overhead_ms": 0.0,
    "stall_probability": 0.02,
    "latency_p50": 870.0,
    "latency_p95": 1200.0,
    "latency_p99": 1238.0
  },
  "steps": [
    {
      "id": 101,
      "type": "check",
      "created_at": "2026-04-12T09:00:00Z",
      "input": "Translate the following to French: Hello, world.",
      "output": {
        "allowed": true
      },
      "cost": null,
      "meta": {
        "model": null,
        "latency_ms": 12.4
      },
      "parent_step_id": null
    },
    {
      "id": 102,
      "type": "run",
      "created_at": "2026-04-12T09:00:01Z",
      "input": "Translate the following to French: Hello, world.",
      "output": "Bonjour le monde.",
      "cost": 0.00318,
      "meta": {
        "model": "gpt-4o",
        "latency_ms": 1228.1
      },
      "parent_step_id": null
    }
  ],
  "graph": {
    "id": "550e8400-e29b-41d4-a716-446655440000",
    "children": []
  },
  "routing_decision": {
    "decision": "gpt-4o",
    "score": {
      "latency": 0.82,
      "cost": 0.91
    },
    "metadata": {
      "selected_from": "project_policy",
      "reasoning": "translation request selected a general model",
      "candidates": ["gpt-4o", "gpt-4o-mini"],
      "resolution": {
        "strategy": "balanced",
        "attempted_tiers": ["simple", "standard"],
        "fallback_used": false,
        "fallback_source": null
      }
    }
  }
}

For the fuller operational breakdown, see Summary Fields, Security Signals, Trace Status, Listing Traces, and Trace Steps.

Summary Fields

Summary fields are the compact operational view of the trace. They are returned from completion and from trace retrieval.

Field	Description
`total_latency_ms`	Sum of recorded step latency for the trace. Useful for request-level debugging.
`total_cost`	Combined model cost for all cost-bearing steps in the trace.
`revenue`	Optional request revenue supplied at trace creation.
`gross_margin`	`revenue - total_cost`. Use this for unit economics, not invoicing.
`optimization_grade`	Composite A-F signal for the trace. Equivalent to `grades.overall` when available.
`grades.overall`	Weighted cost and latency grade.
`grades.waste`	Cost-efficiency grade relative to the cheapest registered viable model.
`grades.latency`	Latency grade based on total run-step time.
`by_model`	Cost grouped by model identifier.
`by_infrastructure`	Cost grouped into `public_cloud` and `private` buckets.
`step_count`	Total number of steps recorded on the trace.
`chain_depth`	Number of tool-call steps; a quick signal for agentic depth.
`tool_overhead_ms`	Time spent waiting on tool execution.
`stall_probability`	Probability from `0` to `1` that the execution path was likely to stall or loop.
`latency_p50`, `latency_p95`, `latency_p99`	Percentile latency values across run steps for the trace.
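To make the field definitions concrete, the sketch below recomputes a few of them from the step list of the retrieval example. It follows the descriptions in the table and is not Olyx's actual implementation; `summarize` is a hypothetical helper.

```python
from collections import defaultdict
from typing import Any, Optional


def summarize(steps: list[dict[str, Any]], revenue: Optional[float] = None) -> dict[str, Any]:
    """Recompute a few summary fields from a trace's steps, per the field descriptions."""
    total_cost = 0.0
    total_latency_ms = 0.0
    by_model: dict[str, float] = defaultdict(float)
    for step in steps:
        meta = step.get("meta") or {}
        total_latency_ms += meta.get("latency_ms") or 0.0
        cost = step.get("cost")
        if cost is not None:  # steps such as check report a null cost
            total_cost += cost
            by_model[meta.get("model") or "unknown"] += cost
    return {
        "total_cost": total_cost,
        "total_latency_ms": total_latency_ms,
        # gross_margin is revenue - total_cost, and is undefined without revenue.
        "gross_margin": None if revenue is None else revenue - total_cost,
        "by_model": dict(by_model),
        "step_count": len(steps),
        # chain_depth counts tool-call steps.
        "chain_depth": sum(1 for s in steps if s.get("type") == "tool_call"),
    }


steps = [
    {"type": "check", "cost": None, "meta": {"model": None, "latency_ms": 12.4}},
    {"type": "run", "cost": 0.00318, "meta": {"model": "gpt-4o", "latency_ms": 1228.1}},
]
summary = summarize(steps, revenue=0.50)
print(round(summary["gross_margin"], 5), round(summary["total_latency_ms"], 1))
```

A local recomputation like this is useful for sanity-checking dashboards; treat the API's own summary as the source of record.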

Security Signals

Security signals are returned at the top level of the trace detail response. They summarize what Olyx detected while the request moved through the governed path.

Field	Description
`pii_detected`	PII was found in at least one request or response path.
`injection_attempt`	Prompt-injection language or behavior was detected.
`secret_leaked`	A secret or token-like value appeared in model output.
`secret_match_count`	Number of distinct secret-like matches detected.
`tool_fidelity_score`	Average tool-argument fidelity score. Values near `1.0` are clean; lower values suggest invented or mismatched tool arguments.
`shadow_score`	Optional score from shadow-model evaluation, when configured.
`shadow_model`	Optional shadow model used for evaluation.
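A common use of these fields is to flag traces for manual review. The sketch below reads the signals from a trace detail payload. The `0.9` fidelity cutoff and the `review_reasons` helper are illustrative assumptions; choose a threshold that fits your own tool usage.

```python
from typing import Any

# Assumption: 0.9 is an illustrative cutoff, not an Olyx default.
FIDELITY_THRESHOLD = 0.9


def review_reasons(trace: dict[str, Any], threshold: float = FIDELITY_THRESHOLD) -> list[str]:
    """Return the security signals that suggest a trace needs a human look."""
    reasons = [
        flag
        for flag in ("pii_detected", "injection_attempt", "secret_leaked")
        if trace.get(flag)
    ]
    score = trace.get("tool_fidelity_score")
    # A missing score is skipped rather than treated as low fidelity.
    if score is not None and score < threshold:
        reasons.append("low_tool_fidelity")
    return reasons


clean = {
    "pii_detected": False,
    "injection_attempt": False,
    "secret_leaked": False,
    "tool_fidelity_score": 0.98,
}
risky = {**clean, "pii_detected": True, "tool_fidelity_score": 0.61}

print(review_reasons(clean))
print(review_reasons(risky))
```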

Trace Status

Status	Meaning
`pending`	The trace is open. Execution may still be running or the application has not called complete.
`completed`	The trace was closed and summaries have been calculated.
`replay`	The trace was produced by replaying an earlier trace.
`failed`	Execution hit an unrecoverable error, such as provider failure, timeout, or exhausted fallback.

Listing Traces

Use listing for tables, polling, and dashboard views. It returns summaries, not full step detail.

GET /api/v1/traces?page=1&per_page=50&status=completed
Authorization: Bearer ak_<key_id>.<secret>

{
  "data": [
    {
      "id": "550e8400-e29b-41d4-a716-446655440000",
      "status": "completed",
      "created_at": "2026-04-12T09:00:00Z",
      "optimization_grade": "B",
      "grades": {
        "overall": "B",
        "waste": "A",
        "latency": "B"
      },
      "intent": "translation",
      "total_cost": 0.00318
    }
  ],
  "meta": {
    "page": 1,
    "per_page": 50
  }
}

Parameter	Description
`page`	Page number. Defaults to `1`.
`per_page`	Results per page. Defaults to `50`; maximum is `100`.
`status`	Filter by `pending`, `completed`, `replay`, or `failed`.
`replay_of`	Return replay traces derived from a source trace id.
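The parameters above can be checked client-side before a request is made. This sketch builds the list path and enforces the documented defaults, the `per_page` maximum, and the allowed status values; `traces_list_path` is a hypothetical helper name.

```python
from typing import Optional
from urllib.parse import urlencode

VALID_STATUSES = {"pending", "completed", "replay", "failed"}
MAX_PER_PAGE = 100


def traces_list_path(
    page: int = 1,
    per_page: int = 50,
    status: Optional[str] = None,
    replay_of: Optional[str] = None,
) -> str:
    """Build the path and query string for GET /api/v1/traces."""
    if page < 1:
        raise ValueError("page must be 1 or greater")
    if not 1 <= per_page <= MAX_PER_PAGE:
        raise ValueError(f"per_page must be between 1 and {MAX_PER_PAGE}")
    if status is not None and status not in VALID_STATUSES:
        raise ValueError(f"unknown status: {status}")
    params: dict[str, object] = {"page": page, "per_page": per_page}
    # Optional filters are only included when set.
    if status is not None:
        params["status"] = status
    if replay_of is not None:
        params["replay_of"] = replay_of
    return "/api/v1/traces?" + urlencode(params)


print(traces_list_path(status="completed"))
```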

End-to-End Workflow

These TypeScript, Python, and Ruby examples show the normal shape: create the trace, execute inside it, complete it in a finally/ensure path, then read the full details only when you need the diagnostic record.

import Olyx, {
  GatewayError,
  RateLimitError,
} from "@olyx-labs/olyx";

const client = new Olyx({ apiKey: process.env.OLYX_API_KEY! });

const trace = await client.traces.create({
  metadata: {
    userId: "u_123",
    feature: "translation",
  },
  revenue: 0.10,
});

try {
  const result = await client.execute({
    traceId: trace.data.id,
    input: "Translate to French: Hello, world.",
  });

  if (result.blocked) {
    throw new Error(result.data.reason ?? "Request blocked by policy");
  }

  console.log(result.data.output);
} catch (error) {
  if (error instanceof RateLimitError) {
    throw new Error(`Rate limited: ${error.message}`);
  }

  if (error instanceof GatewayError) {
    throw new Error("AI gateway temporarily unavailable");
  }

  throw error;
} finally {
  try {
    const completion = await client.traces.complete(trace.data.id);
    console.log(completion.data.totalCost);
  } catch (completionError) {
    console.warn("Trace completion failed", completionError);
  }
}

const details = await client.traces.find(trace.data.id);
console.log(details.data.summary?.grossMargin);

import os
import olyx

olyx.configure(api_key=os.environ["OLYX_API_KEY"])
client = olyx.Olyx()

trace = client.traces.create(
    metadata={
        "user_id": "u_123",
        "feature": "translation",
    },
    revenue=0.10,
)

try:
    result = client.execute(
        trace_id=trace.id,
        input="Translate to French: Hello, world.",
    )

    if result.blocked:
        raise RuntimeError(result.reason or "Request blocked by policy")

    print(result.output)
except olyx.RateLimitError as exc:
    raise RuntimeError(f"Rate limited: {exc.message}") from exc
except olyx.GatewayError as exc:
    raise RuntimeError("AI gateway temporarily unavailable") from exc
finally:
    try:
        completion = client.traces.complete(trace.id)
        print(completion.get("total_cost"))
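    except Exception as completion_error:
        # Report the failure without masking the original error.
        print(f"Trace completion failed: {completion_error}")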

details = client.traces.find(trace.id)
print(details.summary.get("gross_margin"))

require "olyx"

client = Olyx::Client.new(api_key: ENV.fetch("OLYX_API_KEY"))

trace = client.traces.create(
  metadata: {
    user_id: "u_123",
    feature: "translation"
  },
  revenue: 0.10
)

begin
  result = client.execute(
    trace_id: trace.id,
    input: "Translate to French: Hello, world."
  )

  raise(result.reason || "Request blocked by policy") if result.blocked?

  puts result.output
rescue Olyx::RateLimitError => e
  raise "Rate limited: #{e.message}"
rescue Olyx::GatewayError => e
  raise "AI gateway temporarily unavailable: #{e.message}"
ensure
  begin
    completion = client.traces.complete(trace.id)
    puts completion.total_cost
  rescue Olyx::Error => e
    warn "Trace completion failed: #{e.message}"
  end
end

details = client.traces.find(trace.id)
puts details.dig(:summary, "gross_margin")

Trace Steps

Each trace contains ordered steps showing exactly what happened. Steps are the most useful part of the trace when you are debugging why a request cost more than expected, why a tool loop happened, or why a model was selected.

Step type	What it represents
`check`	Safety screening of the input before model execution.
`route`	Model selection decision and routing metadata.
`run`	A model call with model, latency, and cost.
`fanout`	Parallel execution across multiple models.
`evaluate`	Evaluation when multiple candidate responses were produced.
`select`	Final model or response selected from a fanout.
`blocked`	Request stopped by policy, safety, missing configuration, or routing failure.
`tool_call`	The model requested a tool invocation.
`tool_result`	Your application returned tool output or a tool error.
`embedding`	An embedding request with token usage and cost.
`image_generation`	An image generation request with model and cost.
`assistant_thread`	One turn in a stateful multi-turn assistant conversation.
`log`	Application-defined data attached to the trace without a model call.
`passthrough`	Request forwarded without policy modification.

Tool Steps

When a model requests tools, Olyx records a `tool_call` step. When your application returns tool output, Olyx records a `tool_result` step. The next `run` step shows how the model used the returned tool data.

[
  {
    "type": "tool_call",
    "output": [
      {
        "name": "get_weather",
        "arguments": {
          "city": "Paris"
        }
      }
    ],
    "meta": {
      "latency_ms": 0
    }
  },
  {
    "type": "tool_result",
    "output": [
      {
        "name": "get_weather",
        "result": "18 C, overcast"
      }
    ],
    "meta": {
      "latency_ms": 84
    }
  },
  {
    "type": "run",
    "output": "It is 18 C and overcast in Paris.",
    "cost": 0.00241,
    "meta": {
      "model": "gpt-4o",
      "latency_ms": 640
    }
  }
]

A failed tool call should be recorded as a `tool_result` with an error payload. The model step that follows it shows whether the model retried, selected a fallback, or produced a degraded answer.

{
  "type": "tool_result",
  "output": [
    {
      "name": "get_weather",
      "error": "timeout after 5000ms"
    }
  ],
  "meta": {
    "latency_ms": 5004
  }
}
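When debugging a tool loop, it is often enough to scan the steps for tool errors and the time spent in tool results. The sketch below does that over a step list shaped like the examples above. Summing `tool_result` latency is only an approximation of tool overhead, and `tool_report` is a hypothetical helper.

```python
from typing import Any


def tool_report(steps: list[dict[str, Any]]) -> dict[str, Any]:
    """Count tool calls, sum tool_result latency, and collect tool errors."""
    calls = sum(1 for s in steps if s.get("type") == "tool_call")
    overhead_ms = 0.0
    errors: list[str] = []
    for step in steps:
        if step.get("type") != "tool_result":
            continue
        overhead_ms += (step.get("meta") or {}).get("latency_ms") or 0.0
        for item in step.get("output") or []:
            # A failed tool call carries an "error" key instead of "result".
            if "error" in item:
                errors.append(f'{item.get("name")}: {item["error"]}')
    return {"tool_calls": calls, "tool_overhead_ms": overhead_ms, "errors": errors}


steps = [
    {
        "type": "tool_call",
        "output": [{"name": "get_weather", "arguments": {"city": "Paris"}}],
        "meta": {"latency_ms": 0},
    },
    {
        "type": "tool_result",
        "output": [{"name": "get_weather", "error": "timeout after 5000ms"}],
        "meta": {"latency_ms": 5004},
    },
]
print(tool_report(steps))
```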

Log Steps

Use log steps for application events that should travel with the trace but are not model calls: user feedback, business outcome, selected A/B variant, cache hit status, or a workflow checkpoint.

POST /api/v1/logs
Authorization: Bearer ak_<key_id>.<secret>
Content-Type: application/json

{
  "trace_id": "550e8400-e29b-41d4-a716-446655440000",
  "output": {
    "user_rating": 5,
    "accepted": true,
    "ab_variant": "B"
  }
}

The create response returns a small receipt instead of echoing the payload:

{
  "step_id": 52,
  "trace_id": "550e8400-e29b-41d4-a716-446655440000",
  "type": "log",
  "parent_step_id": null,
  "created_at": "2026-04-12T09:00:00Z"
}

Use `step_id` when you need to link a later step to this event. Read the trace with `GET /api/v1/traces/:id` when you need the stored output back for analysis.

Embedding, Image, and Assistant Steps

`embedding` and `image_generation` steps carry model, cost, and latency in the same way a `run` step does. Embedding steps include token usage in `output`; image steps can include generated image URLs or provider metadata.

`assistant_thread` steps represent turns in a stateful multi-turn conversation. The `thread_turn` value is the 1-based turn index within the trace.

Failed Traces

A failed trace means execution encountered an unrecoverable error, such as a provider 5xx, gateway timeout, or exhausted fallback path.

{
  "id": "550e8400-e29b-41d4-a716-446655440001",
  "status": "failed",
  "created_at": "2026-04-12T09:01:00Z",
  "summary": {
    "total_cost": 0.0,
    "optimization_grade": null
  },
  "steps": [
    {
      "type": "check",
      "output": {
        "allowed": true
      },
      "meta": {
        "latency_ms": 11.2
      }
    },
    {
      "type": "blocked",
      "output": {
        "reason": "provider_error",
        "message": "OpenAI returned 503 after all fallback models were exhausted."
      },
      "meta": {
        "latency_ms": 4820
      }
    }
  ]
}

Use GET /api/v1/traces?status=failed to build error-rate views, inspect provider instability, or drive retry workflows. Failed traces can still be useful: they show how far the request traveled before the failure occurred.
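As a rough illustration of an error-rate view, the sketch below computes a failure rate from listed trace summaries and shows the step path a failed trace followed. Counting only `completed` and `failed` traces in the denominator is an assumption, and both helper names are hypothetical.

```python
from typing import Any


def failure_rate(traces: list[dict[str, Any]]) -> float:
    """Share of closed traces that failed; open and replay traces are ignored."""
    closed = [t for t in traces if t.get("status") in ("completed", "failed")]
    if not closed:
        return 0.0
    failed = sum(1 for t in closed if t["status"] == "failed")
    return failed / len(closed)


def step_path(trace: dict[str, Any]) -> list[str]:
    """Ordered step types, showing how far the request traveled before it stopped."""
    return [step.get("type", "unknown") for step in trace.get("steps") or []]


listed = [
    {"id": "a", "status": "completed"},
    {"id": "b", "status": "completed"},
    {"id": "c", "status": "failed"},
    {"id": "d", "status": "pending"},
]
failed_trace = {
    "status": "failed",
    "steps": [
        {"type": "check", "output": {"allowed": True}},
        {"type": "blocked", "output": {"reason": "provider_error"}},
    ],
}

print(f"{failure_rate(listed):.2%}")
print(step_path(failed_trace))
```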

Your trace data is isolated to your project. A trace from another account is never visible through your API key or dashboard session.