Traces

A trace is the durable record for one logical AI task. Keep one trace per user-visible action so the cost, latency, routing, replay, and security data stay easy to read later.

During closed beta, treat traces as operational observability records. They are built to help engineers debug, replay, and understand AI traffic; they are not a billing processor, compliance archive, or provider invoice.

TRACE 550e8400-e29b-41d4 COMPLETED

CHECK  Input validation  12ms
  ANALYZER OUTPUT
    allowed: true
    pii_detected: false
    injection_attempt: false
    secret_leaked: false
    risk_score: 0.02

ROUTE  Policy routing  3ms
  WHY THIS MODEL?
    Chosen: gpt-4o-mini
    Tier: medium
    Reason: Lowest cost model registered for the Medium complexity tier
    Fallback: not triggered

RUN  gpt-4o-mini  $0.00032  1,240ms
  COST COMPARISON
    Model | Cost | Tier
    gpt-4o-mini | $0.00032 | medium (selected)
    gpt-4o | $0.00318 | complex
    claude-haiku-4-5 | $0.00025 | simple
    Fallback chain: gpt-4o-mini → gpt-4o (not triggered)

LOG  Trace recorded  0ms
  RECORDED
    optimization_grade: A
    total_cost: 0.00032
    total_latency_ms: 1255
    grades: {"overall":"A","waste":"A","latency":"B"}

Trace Lifecycle

Most integrations follow the same flow: create the trace, execute the work inside it, complete it when the task is done, and read it only when you need the detailed record.

| Step | What happens | API surface |
| --- | --- | --- |
| Create | Open a trace before model work starts. Attach metadata and optional request revenue. | POST /api/v1/traces |
| Execute | Run one or more governed model calls inside that trace. | POST /api/v1/executions or SDK execute() |
| Complete | Close the trace after the last model/tool step so Olyx can calculate grades and summary fields. | PATCH /api/v1/traces/:id/complete |
| Read | Retrieve the trace when you need steps, routing details, graph data, or security signals. | GET /api/v1/traces/:id |

Completing the trace returns the computed summary, so most integrations can stop there. Retrieve the trace afterwards only when you need the full steps, graph, routing decision, or security detail.
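One way to make sure the complete step always runs is to wrap the lifecycle in a context manager. The sketch below illustrates the pattern with an in-memory stand-in; `InMemoryTraces` and `traced` are hypothetical names for this example and are not part of the Olyx SDK.

```python
import uuid
from contextlib import contextmanager


class InMemoryTraces:
    """Hypothetical stand-in for a traces client; not the Olyx SDK."""

    def __init__(self):
        self.records = {}

    def create(self, metadata=None, revenue=None):
        trace_id = str(uuid.uuid4())
        self.records[trace_id] = {
            "id": trace_id,
            "status": "pending",
            "metadata": metadata or {},
            "revenue": revenue,
        }
        return self.records[trace_id]

    def complete(self, trace_id):
        self.records[trace_id]["status"] = "completed"
        return self.records[trace_id]


@contextmanager
def traced(traces, **create_kwargs):
    """Create a trace, yield it, and complete it even if the body raises."""
    trace = traces.create(**create_kwargs)
    try:
        yield trace
    finally:
        traces.complete(trace["id"])


traces = InMemoryTraces()
try:
    with traced(traces, metadata={"feature": "translation"}) as trace:
        raise RuntimeError("model call failed")  # simulate an execution error
except RuntimeError:
    pass

print(trace["status"])  # the trace was completed despite the error
```

The same shape should carry over to a real client: the `finally` clause plays the role of the complete call, so a failure in the body cannot leave the trace open.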

Creating a Trace

Create the trace first so the later execution, cost, and replay data land on the right record. Attach any metadata you want to filter on later, such as user_id, tenant_id, feature, plan, or your internal request id.

metadata must be a JSON object when provided. revenue is optional and lets Olyx calculate gross margin in cost summaries.

POST /api/v1/traces
Authorization: Bearer ak_<key_id>.<secret>
Content-Type: application/json

{
  "metadata": {
    "user_id": "user_123",
    "feature": "translation",
    "plan": "team"
  },
  "revenue": 0.50
}

The create response is intentionally lightweight. It confirms that the trace exists and stays small because no execution has happened yet.

{
  "id": "550e8400-e29b-41d4-a716-446655440000",
  "status": "pending",
  "created_at": "2026-04-12T09:00:00Z",
  "metadata": {
    "user_id": "user_123",
    "feature": "translation",
    "plan": "team"
  },
  "revenue": 0.50
}
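As a rough illustration of the request rules above, the following sketch builds the create body and rejects metadata that is not a JSON object before anything is sent. `build_create_payload` is a hypothetical helper for this example; the server's own validation remains authoritative.

```python
import json


def build_create_payload(metadata=None, revenue=None):
    """Build the JSON body for POST /api/v1/traces.

    Client-side checks only; this is an illustrative helper, not SDK code.
    """
    payload = {}
    if metadata is not None:
        if not isinstance(metadata, dict):
            raise TypeError("metadata must be a JSON object (a dict)")
        payload["metadata"] = metadata
    if revenue is not None:
        payload["revenue"] = float(revenue)
    # json.dumps raises TypeError if metadata holds non-serializable values.
    return json.dumps(payload)


body = build_create_payload(
    metadata={"user_id": "user_123", "feature": "translation", "plan": "team"},
    revenue=0.50,
)
print(body)
```

Both fields are optional here, so calling the helper with no arguments yields an empty JSON object.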

Executing Inside a Trace

Pass the trace id to every governed model call so Olyx can connect routing, cost, latency, tool behavior, and security signals back to the original action.

POST /api/v1/executions
Authorization: Bearer ak_<key_id>.<secret>
Content-Type: application/json

{
  "trace_id": "550e8400-e29b-41d4-a716-446655440000",
  "input": "Translate the following to French: Hello, world."
}

For application code, prefer the SDK execute() wrapper. It handles the canonical endpoint, request shape, errors, and response normalization for the language you are using.

Completing a Trace

Call complete once after the last model call or tool loop for the user-visible task has finished.

PATCH /api/v1/traces/550e8400-e29b-41d4-a716-446655440000/complete
Authorization: Bearer ak_<key_id>.<secret>

Completion marks the trace as completed, calculates the summary, and returns the values you usually want for dashboards and replays.

{
  "id": "550e8400-e29b-41d4-a716-446655440000",
  "status": "completed",
  "created_at": "2026-04-12T09:00:00Z",
  "metadata": {
    "user_id": "user_123",
    "feature": "translation",
    "plan": "team"
  },
  "revenue": 0.50,
  "optimization_grade": "B",
  "grades": {
    "overall": "B",
    "waste": "A",
    "latency": "B"
  },
  "total_cost": 0.00318,
  "summary": {
    "total_latency_ms": 1240.5,
    "total_cost": 0.00318,
    "revenue": 0.50,
    "gross_margin": 0.49682,
    "optimization_grade": "B",
    "grades": {
      "overall": "B",
      "waste": "A",
      "latency": "B"
    },
    "by_model": {
      "gpt-4o": 0.00318
    },
    "by_infrastructure": {
      "public_cloud": 0.00318,
      "private": 0.0
    },
    "step_count": 2,
    "chain_depth": 0,
    "tool_overhead_ms": 0.0,
    "stall_probability": 0.02,
    "latency_p50": 870.0,
    "latency_p95": 1200.0,
    "latency_p99": 1238.0
  }
}

You do not have to complete a trace immediately; it is fine to wait until your own post-processing finishes. Do complete every trace you want included in cost reporting, grades, replay workflows, or operational dashboards.

Retrieving a Trace

Retrieve a trace when you need the full diagnostic record: steps, graph structure, routing decision, and security signals. The list endpoint stays small; the show endpoint is the detailed view.

GET /api/v1/traces/550e8400-e29b-41d4-a716-446655440000
Authorization: Bearer ak_<key_id>.<secret>

The response contains the full record:

{
  "id": "550e8400-e29b-41d4-a716-446655440000",
  "status": "completed",
  "created_at": "2026-04-12T09:00:00Z",
  "optimization_grade": "B",
  "grades": {
    "overall": "B",
    "waste": "A",
    "latency": "B"
  },
  "intent": "translation",
  "step_count": 2,
  "chain_depth": 0,
  "tool_overhead_ms": 0.0,
  "stall_probability": 0.02,
  "pii_detected": false,
  "injection_attempt": false,
  "secret_leaked": false,
  "secret_match_count": 0,
  "tool_fidelity_score": 0.98,
  "shadow_score": null,
  "shadow_model": null,
  "summary": {
    "total_latency_ms": 1240.5,
    "total_cost": 0.00318,
    "revenue": 0.50,
    "gross_margin": 0.49682,
    "optimization_grade": "B",
    "grades": {
      "overall": "B",
      "waste": "A",
      "latency": "B"
    },
    "by_model": {
      "gpt-4o": 0.00318
    },
    "by_infrastructure": {
      "public_cloud": 0.00318,
      "private": 0.0
    },
    "step_count": 2,
    "chain_depth": 0,
    "tool_overhead_ms": 0.0,
    "stall_probability": 0.02,
    "latency_p50": 870.0,
    "latency_p95": 1200.0,
    "latency_p99": 1238.0
  },
  "steps": [
    {
      "id": 101,
      "type": "check",
      "created_at": "2026-04-12T09:00:00Z",
      "input": "Translate the following to French: Hello, world.",
      "output": {
        "allowed": true
      },
      "cost": null,
      "meta": {
        "model": null,
        "latency_ms": 12.4
      },
      "parent_step_id": null
    },
    {
      "id": 102,
      "type": "run",
      "created_at": "2026-04-12T09:00:01Z",
      "input": "Translate the following to French: Hello, world.",
      "output": "Bonjour le monde.",
      "cost": 0.00318,
      "meta": {
        "model": "gpt-4o",
        "latency_ms": 1228.1
      },
      "parent_step_id": null
    }
  ],
  "graph": {
    "id": "550e8400-e29b-41d4-a716-446655440000",
    "children": []
  },
  "routing_decision": {
    "decision": "gpt-4o",
    "score": {
      "latency": 0.82,
      "cost": 0.91
    },
    "metadata": {
      "selected_from": "project_policy",
      "reasoning": "translation request selected a general model",
      "candidates": ["gpt-4o", "gpt-4o-mini"],
      "resolution": {
        "strategy": "balanced",
        "attempted_tiers": ["simple", "standard"],
        "fallback_used": false,
        "fallback_source": null
      }
    }
  }
}

For the fuller operational breakdown, see Summary Fields, Security Signals, Trace Status, Listing Traces, and Trace Steps.

Summary Fields

Summary fields are the compact operational view of the trace. They are returned from completion and from trace retrieval.

| Field | Description |
| --- | --- |
| total_latency_ms | Sum of recorded step latency for the trace. Useful for request-level debugging. |
| total_cost | Combined model cost for all cost-bearing steps in the trace. |
| revenue | Optional request revenue supplied at trace creation. |
| gross_margin | revenue - total_cost. Use this for unit economics, not invoicing. |
| optimization_grade | Composite A-F signal for the trace. Equivalent to grades.overall when available. |
| grades.overall | Weighted cost and latency grade. |
| grades.waste | Cost-efficiency grade relative to the cheapest registered viable model. |
| grades.latency | Latency grade based on total run-step time. |
| by_model | Cost grouped by model identifier. |
| by_infrastructure | Cost grouped into public_cloud and private buckets. |
| step_count | Total number of steps recorded on the trace. |
| chain_depth | Number of tool-call steps; a quick signal for agentic depth. |
| tool_overhead_ms | Time spent waiting on tool execution. |
| stall_probability | Probability from 0 to 1 that the execution path was likely to stall or loop. |
| latency_p50, latency_p95, latency_p99 | Percentile latency values across run steps for the trace. |
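The relationships between these fields can be checked client-side. The sketch below recomputes gross_margin from revenue and total_cost, and assumes (this is not stated in the field descriptions) that the by_model and by_infrastructure groupings each add up to total_cost. `check_summary` is a hypothetical helper name.

```python
import math


def check_summary(summary, tolerance=1e-9):
    """Return a list of inconsistencies found in a trace summary dict."""
    problems = []
    total_cost = summary["total_cost"]

    revenue = summary.get("revenue")
    if revenue is not None:
        expected_margin = revenue - total_cost
        if not math.isclose(summary["gross_margin"], expected_margin, abs_tol=tolerance):
            problems.append("gross_margin != revenue - total_cost")

    # Assumption: each grouping partitions the total cost of the trace.
    for field in ("by_model", "by_infrastructure"):
        if field not in summary:
            continue
        grouped = sum(summary[field].values())
        if not math.isclose(grouped, total_cost, abs_tol=tolerance):
            problems.append(f"{field} does not sum to total_cost")

    return problems


summary = {
    "total_cost": 0.00318,
    "revenue": 0.50,
    "gross_margin": 0.49682,
    "by_model": {"gpt-4o": 0.00318},
    "by_infrastructure": {"public_cloud": 0.00318, "private": 0.0},
}
print(check_summary(summary))
```

The comparison uses a small tolerance rather than exact equality because the values are floating-point amounts.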

Security Signals

Security signals are returned at the top level of the trace detail response. They summarize what Olyx detected while the request moved through the governed path.

| Field | Description |
| --- | --- |
| pii_detected | PII was found in at least one request or response path. |
| injection_attempt | Prompt-injection language or behavior was detected. |
| secret_leaked | A secret or token-like value appeared in model output. |
| secret_match_count | Number of distinct secret-like matches detected. |
| tool_fidelity_score | Average tool-argument fidelity score. Values near 1.0 are clean; lower values suggest invented or mismatched tool arguments. |
| shadow_score | Optional score from shadow-model evaluation, when configured. |
| shadow_model | Optional shadow model used for evaluation. |
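A simple triage pass over these signals might look like the following sketch. `review_reasons` is a hypothetical helper, and the 0.8 fidelity threshold is an illustrative choice, not an Olyx default.

```python
def review_reasons(trace, min_tool_fidelity=0.8):
    """List the security signals on a trace detail dict that merit a look.

    min_tool_fidelity is an illustrative threshold, not an Olyx default.
    """
    reasons = []
    for flag in ("pii_detected", "injection_attempt", "secret_leaked"):
        if trace.get(flag):
            reasons.append(flag)
    # Fields may be null, so treat a missing or null count as zero.
    if (trace.get("secret_match_count") or 0) > 0 and "secret_leaked" not in reasons:
        reasons.append("secret_match_count")
    fidelity = trace.get("tool_fidelity_score")
    if fidelity is not None and fidelity < min_tool_fidelity:
        reasons.append("low_tool_fidelity")
    return reasons


clean = {
    "pii_detected": False,
    "injection_attempt": False,
    "secret_leaked": False,
    "secret_match_count": 0,
    "tool_fidelity_score": 0.98,
}
risky = {**clean, "pii_detected": True, "tool_fidelity_score": 0.41}

print(review_reasons(clean))
print(review_reasons(risky))
```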

Trace Status

| Status | Meaning |
| --- | --- |
| pending | The trace is open. Execution may still be running or the application has not called complete. |
| completed | The trace was closed and summaries have been calculated. |
| replay | The trace was produced by replaying an earlier trace. |
| failed | Execution hit an unrecoverable error, such as provider failure, timeout, or exhausted fallback. |

Listing Traces

Use listing for tables, polling, and dashboard views. It returns summaries, not full step detail.

GET /api/v1/traces?page=1&per_page=50&status=completed
Authorization: Bearer ak_<key_id>.<secret>

The response contains one summary per trace:

{
  "data": [
    {
      "id": "550e8400-e29b-41d4-a716-446655440000",
      "status": "completed",
      "created_at": "2026-04-12T09:00:00Z",
      "optimization_grade": "B",
      "grades": {
        "overall": "B",
        "waste": "A",
        "latency": "B"
      },
      "intent": "translation",
      "total_cost": 0.00318
    }
  ],
  "meta": {
    "page": 1,
    "per_page": 50
  }
}

| Parameter | Description |
| --- | --- |
| page | Page number. Defaults to 1. |
| per_page | Results per page. Defaults to 50; maximum is 100. |
| status | Filter by pending, completed, replay, or failed. |
| replay_of | Return replay traces derived from a source trace id. |
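To walk every page, a client can build the query string from the listing parameters and keep requesting until a page comes back short. In the sketch below, `list_path`, `iter_traces`, and the `fetch_page` callable are hypothetical; stopping on a short page is an assumption, since the example response's meta carries no total count.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

MAX_PER_PAGE = 100  # documented maximum for per_page


def list_path(page=1, per_page=50, status=None, replay_of=None):
    """Build the path and query string for GET /api/v1/traces."""
    params = {"page": page, "per_page": min(per_page, MAX_PER_PAGE)}
    if status is not None:
        params["status"] = status
    if replay_of is not None:
        params["replay_of"] = replay_of
    return "/api/v1/traces?" + urlencode(params)


def iter_traces(fetch_page, per_page=50, **filters):
    """Yield trace summaries page by page.

    fetch_page is a caller-supplied function that takes a path and
    returns the decoded JSON body of the list response.
    """
    per_page = min(per_page, MAX_PER_PAGE)
    page = 1
    while True:
        body = fetch_page(list_path(page=page, per_page=per_page, **filters))
        rows = body["data"]
        yield from rows
        if len(rows) < per_page:  # short (or empty) page: nothing left
            return
        page += 1


def fake_fetch(path):
    """In-memory stand-in for an HTTP call, serving five fake traces."""
    query = parse_qs(urlsplit(path).query)
    page, size = int(query["page"][0]), int(query["per_page"][0])
    ids = [f"trace_{i}" for i in range(5)]
    chunk = ids[(page - 1) * size : page * size]
    return {"data": [{"id": i} for i in chunk], "meta": {"page": page, "per_page": size}}


print(list_path(status="completed"))
print([t["id"] for t in iter_traces(fake_fetch, per_page=2)])
```

Passing the fetch function in keeps the pagination logic independent of any particular HTTP library or SDK.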

End-to-End Workflow

These examples show the normal shape: create the trace, execute inside it, complete it in a finally/ensure path, then read the full details only when you need the diagnostic record.

TypeScript

import Olyx, {
  GatewayError,
  RateLimitError,
} from "@olyx-labs/olyx";

const client = new Olyx({ apiKey: process.env.OLYX_API_KEY! });

const trace = await client.traces.create({
  metadata: {
    userId: "u_123",
    feature: "translation",
  },
  revenue: 0.10,
});

try {
  const result = await client.execute({
    traceId: trace.data.id,
    input: "Translate to French: Hello, world.",
  });

  if (result.blocked) {
    throw new Error(result.data.reason ?? "Request blocked by policy");
  }

  console.log(result.data.output);
} catch (error) {
  if (error instanceof RateLimitError) {
    throw new Error(`Rate limited: ${error.message}`);
  }

  if (error instanceof GatewayError) {
    throw new Error("AI gateway temporarily unavailable");
  }

  throw error;
} finally {
  try {
    const completion = await client.traces.complete(trace.data.id);
    console.log(completion.data.totalCost);
  } catch (completionError) {
    console.warn("Trace completion failed", completionError);
  }
}

const details = await client.traces.find(trace.data.id);
console.log(details.data.summary?.grossMargin);

Python

import os
import olyx

olyx.configure(api_key=os.environ["OLYX_API_KEY"])
client = olyx.Olyx()

trace = client.traces.create(
    metadata={
        "user_id": "u_123",
        "feature": "translation",
    },
    revenue=0.10,
)

try:
    result = client.execute(
        trace_id=trace.id,
        input="Translate to French: Hello, world.",
    )

    if result.blocked:
        raise RuntimeError(result.reason or "Request blocked by policy")

    print(result.output)
except olyx.RateLimitError as exc:
    raise RuntimeError(f"Rate limited: {exc}") from exc
except olyx.GatewayError as exc:
    raise RuntimeError("AI gateway temporarily unavailable") from exc
finally:
    try:
        completion = client.traces.complete(trace.id)
        print(completion.get("total_cost"))
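    except Exception as completion_error:
        # Do not mask the original error, but leave a signal that completion failed.
        print(f"Trace completion failed: {completion_error}")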

details = client.traces.find(trace.id)
print(details.summary.get("gross_margin"))

Ruby

require "olyx"

client = Olyx::Client.new(api_key: ENV.fetch("OLYX_API_KEY"))

trace = client.traces.create(
  metadata: {
    user_id: "u_123",
    feature: "translation"
  },
  revenue: 0.10
)

begin
  result = client.execute(
    trace_id: trace.id,
    input: "Translate to French: Hello, world."
  )

  raise(result.reason || "Request blocked by policy") if result.blocked?

  puts result.output
rescue Olyx::RateLimitError => e
  raise "Rate limited: #{e.message}"
rescue Olyx::GatewayError => e
  raise "AI gateway temporarily unavailable: #{e.message}"
ensure
  begin
    completion = client.traces.complete(trace.id)
    puts completion.total_cost
  rescue Olyx::Error => e
    warn "Trace completion failed: #{e.message}"
  end
end

details = client.traces.find(trace.id)
puts details.dig(:summary, "gross_margin")

Trace Steps

Each trace contains ordered steps showing exactly what happened. Steps are the most useful part of the trace when you are debugging why a request cost more than expected, why a tool loop happened, or why a model was selected.

| Step type | What it represents |
| --- | --- |
| check | Safety screening of the input before model execution. |
| route | Model selection decision and routing metadata. |
| run | A model call with model, latency, and cost. |
| fanout | Parallel execution across multiple models. |
| evaluate | Evaluation when multiple candidate responses were produced. |
| select | Final model or response selected from a fanout. |
| blocked | Request stopped by policy, safety, missing configuration, or routing failure. |
| tool_call | The model requested a tool invocation. |
| tool_result | Your application returned tool output or a tool error. |
| embedding | An embedding request with token usage and cost. |
| image_generation | An image generation request with model and cost. |
| assistant_thread | One turn in a stateful multi-turn assistant conversation. |
| log | Application-defined data attached to the trace without a model call. |
| passthrough | Request forwarded without policy modification. |
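When debugging an unexpectedly expensive or slow trace, it can help to aggregate the steps array by type. The sketch below is one way to do that over the step shape shown in the trace detail response; `summarize_steps` is a hypothetical helper name.

```python
from collections import defaultdict


def summarize_steps(steps):
    """Aggregate count, cost, and latency per step type."""
    totals = defaultdict(lambda: {"count": 0, "cost": 0.0, "latency_ms": 0.0})
    for step in steps:
        bucket = totals[step["type"]]
        bucket["count"] += 1
        # cost is null on steps that carry no model cost, such as check.
        bucket["cost"] += step.get("cost") or 0.0
        bucket["latency_ms"] += (step.get("meta") or {}).get("latency_ms") or 0.0
    return dict(totals)


steps = [
    {"type": "check", "cost": None, "meta": {"model": None, "latency_ms": 12.4}},
    {"type": "run", "cost": 0.00318, "meta": {"model": "gpt-4o", "latency_ms": 1228.1}},
]
for step_type, totals in summarize_steps(steps).items():
    print(step_type, totals)
```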

Tool Steps

When a model requests tools, Olyx records a tool_call step. When your application returns tool output, Olyx records a tool_result step. The next run step shows how the model used the returned tool data.

[
  {
    "type": "tool_call",
    "output": [
      {
        "name": "get_weather",
        "arguments": {
          "city": "Paris"
        }
      }
    ],
    "meta": {
      "latency_ms": 0
    }
  },
  {
    "type": "tool_result",
    "output": [
      {
        "name": "get_weather",
        "result": "18 C, overcast"
      }
    ],
    "meta": {
      "latency_ms": 84
    }
  },
  {
    "type": "run",
    "output": "It is 18 C and overcast in Paris.",
    "cost": 0.00241,
    "meta": {
      "model": "gpt-4o",
      "latency_ms": 640
    }
  }
]

A failed tool call should be recorded as a tool_result with an error payload. The following model step shows whether the model retried, selected a fallback, or produced a degraded answer.

{
  "type": "tool_result",
  "output": [
    {
      "name": "get_weather",
      "error": "timeout after 5000ms"
    }
  ],
  "meta": {
    "latency_ms": 5004
  }
}
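Failed tool calls can be pulled out of a steps array by looking for tool_result entries that carry an error key, following the payload shape shown above. `failed_tool_results` is a hypothetical helper for this sketch.

```python
def failed_tool_results(steps):
    """Return (step_index, tool_name, error) for each failed tool result."""
    failures = []
    for index, step in enumerate(steps):
        if step.get("type") != "tool_result":
            continue
        for item in step.get("output") or []:
            if isinstance(item, dict) and "error" in item:
                failures.append((index, item.get("name"), item["error"]))
    return failures


steps = [
    {"type": "tool_call", "output": [{"name": "get_weather", "arguments": {"city": "Paris"}}]},
    {"type": "tool_result", "output": [{"name": "get_weather", "error": "timeout after 5000ms"}]},
    {"type": "run", "output": "I could not fetch the weather right now."},
]
print(failed_tool_results(steps))
```

The returned index points at the failing tool_result, so the step at the next index is the one to inspect for how the model reacted.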

Log Steps

Use log steps for application events that should travel with the trace but are not model calls: user feedback, business outcome, selected A/B variant, cache hit status, or a workflow checkpoint.

POST /api/v1/logs
Authorization: Bearer ak_<key_id>.<secret>
Content-Type: application/json

{
  "trace_id": "550e8400-e29b-41d4-a716-446655440000",
  "output": {
    "user_rating": 5,
    "accepted": true,
    "ab_variant": "B"
  }
}

The create response returns a small receipt instead of echoing the payload:

{
  "step_id": 52,
  "trace_id": "550e8400-e29b-41d4-a716-446655440000",
  "type": "log",
  "parent_step_id": null,
  "created_at": "2026-04-12T09:00:00Z"
}

Use step_id when you need to link a later step to this event. Read the trace with GET /api/v1/traces/:id when you need the stored output back for analysis.
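For callers not using an SDK, the log request can be prepared with the standard library alone. The sketch below builds the request without sending it. `BASE_URL` is a placeholder host, `build_log_request` is a hypothetical helper, and passing parent_step_id in the request body is an assumption inferred from the receipt shape rather than something confirmed here.

```python
import json
import urllib.request

BASE_URL = "https://olyx.example.com"  # placeholder host, not a real endpoint


def build_log_request(api_key, trace_id, output, parent_step_id=None):
    """Prepare (but do not send) a POST /api/v1/logs request."""
    payload = {"trace_id": trace_id, "output": output}
    if parent_step_id is not None:
        # Assumption: the request accepts parent_step_id, mirroring the receipt.
        payload["parent_step_id"] = parent_step_id
    return urllib.request.Request(
        BASE_URL + "/api/v1/logs",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


request = build_log_request(
    api_key="ak_<key_id>.<secret>",
    trace_id="550e8400-e29b-41d4-a716-446655440000",
    output={"user_rating": 5, "accepted": True, "ab_variant": "B"},
)
print(request.get_method(), request.full_url)
print(request.data.decode("utf-8"))
```

Once the host and key are real, the prepared request can be sent with `urllib.request.urlopen(request)`.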

Embedding, Image, and Assistant Steps

embedding and image_generation steps carry model, cost, and latency in the same way a run step does. Embedding steps include token usage in output; image steps can include generated image URLs or provider metadata.

assistant_thread steps represent turns in a stateful multi-turn conversation. The thread_turn value is the 1-based turn index within the trace.

Failed Traces

A failed trace means execution encountered an unrecoverable error, such as a provider 5xx, gateway timeout, or exhausted fallback path.

{
  "id": "550e8400-e29b-41d4-a716-446655440001",
  "status": "failed",
  "created_at": "2026-04-12T09:01:00Z",
  "summary": {
    "total_cost": 0.0,
    "optimization_grade": null
  },
  "steps": [
    {
      "type": "check",
      "output": {
        "allowed": true
      },
      "meta": {
        "latency_ms": 11.2
      }
    },
    {
      "type": "blocked",
      "output": {
        "reason": "provider_error",
        "message": "OpenAI returned 503 after all fallback models were exhausted."
      },
      "meta": {
        "latency_ms": 4820
      }
    }
  ]
}

Use GET /api/v1/traces?status=failed to build error-rate views, inspect provider instability, or drive retry workflows. Failed traces can still be useful: they show how far the request traveled before the failure occurred.
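An error-rate view along these lines could start from a tally of failure reasons. The sketch below assumes you have already retrieved the detail record for each failed trace, since the list endpoint returns summaries without steps; `failure_reasons` is a hypothetical helper, and reading the reason from the last blocked step follows the example above.

```python
from collections import Counter


def failure_reasons(traces):
    """Count the reason on the last blocked step of each failed trace."""
    counts = Counter()
    for trace in traces:
        if trace.get("status") != "failed":
            continue
        blocked = [s for s in trace.get("steps", []) if s.get("type") == "blocked"]
        if blocked:
            reason = (blocked[-1].get("output") or {}).get("reason", "unknown")
        else:
            reason = "unknown"
        counts[reason] += 1
    return counts


traces = [
    {
        "status": "failed",
        "steps": [
            {"type": "check", "output": {"allowed": True}},
            {"type": "blocked", "output": {"reason": "provider_error"}},
        ],
    },
    {"status": "completed", "steps": []},
    {"status": "failed", "steps": []},
]
print(failure_reasons(traces))
```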

Your trace data is isolated to your project. A trace from another account is never visible through your API key or dashboard session.
