TypeScript SDK

This page covers the full TypeScript SDK path: trace binding, explicit execute control, tool loops, simulation, and private-agent routing. If you are migrating an existing app and only need the OpenAI-compatible gateway path, start with Quick Start or the SDK overview.

flowchart LR
  APP[NODE APP]
  GATEWAY[OPENAI-COMPATIBLE GATEWAY]
  SDK[OLYX SDK]
  TRACE[TRACE]
  EXEC[CLIENT EXECUTE]
  GATE[OLYX GATEWAY OR AGENT]
  MODEL[MODEL PROVIDER]
  APP --> GATEWAY --> GATE --> MODEL
  APP --> SDK --> TRACE --> EXEC --> GATE --> MODEL

Full SDK (explicit control)

The Olyx SDK is a control plane entry point, not a transport helper. It does not wrap OpenAI’s client or replicate provider APIs. Instead, it gives your application a single governed call site that handles routing, PII scrubbing, cost tracking, and observability as a unit.

The right shape:

import Olyx from "@olyx-labs/olyx";

const client = new Olyx({ apiKey: process.env.OLYX_API_KEY! });

// Create a trace first — every execution belongs to one
const trace  = await client.traces.create({ metadata: { userId: "u_123" } });
const result = await client.execute({
  traceId: trace.data.id,
  input:   "Translate to French: Hello, world.",
});

console.log(result.data.output);  // "Bonjour le monde."
console.log(result.data.model);   // "gpt-4o-mini"

Not:

// Don't do this — low leverage, just clones OpenAI
await openaiClient.chat.completions.create({ model: "gpt-4o", messages: [...] });

How responses are shaped. Every resource method returns an OlyxResult<T>. Access the typed payload via .data. Convenience properties — .blocked, .toolCallsPending, .toolCalls, .bypass — live directly on the result object, not on .data.

Requires Node.js ≥ 18 (uses the built-in fetch API). No runtime dependencies.

Installation

npm install @olyx-labs/olyx
# or: yarn add @olyx-labs/olyx  /  pnpm add @olyx-labs/olyx

TypeScript types are bundled — no @types/olyx needed.

Configuration

Configure once when your application starts:

import Olyx from "@olyx-labs/olyx";

const client = new Olyx({
  apiKey:  process.env.OLYX_API_KEY!,
  baseUrl: process.env.OLYX_GATEWAY_URL,
  failOpen: false,  // fail-closed by default — see Safety Valve
});

All options:

Option	Type	Default	What it does
`apiKey`	`string`	—	Your `ak_live_...` key. Required.
`baseUrl`	`string`	SDK default	Gateway URL. Override for private-agent routes.
`failOpen`	`boolean`	`false`	Allow direct provider bypass when gateway is unreachable.
`fallbackProviderUrl`	`string`	—	OpenAI-compatible endpoint to use when failing open. Required when `failOpen` is enabled.
`fallbackApiKey`	`string`	`OLYX_FALLBACK_API_KEY` env	API key for the fallback provider. Read from the `OLYX_FALLBACK_API_KEY` environment variable when omitted; one of the two must be set when `failOpen` is enabled.
`fallbackModel`	`string`	`gpt-4o-mini`	Model to call on the fallback provider.
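
The fallback rules in the table can be checked once at startup, before any client is constructed. The sketch below is a hypothetical helper, not part of the SDK: `FailOpenOptions` and `failOpenProblems` are illustrative names, and the SDK may well perform its own validation.

```typescript
// Hypothetical startup check. Field names mirror the options table above;
// the helper itself is not part of the Olyx SDK.
interface FailOpenOptions {
  failOpen?: boolean;
  fallbackProviderUrl?: string;
  fallbackApiKey?: string;
}

// Returns a list of problems; an empty list means the options are consistent.
function failOpenProblems(
  opts: FailOpenOptions,
  env: Record<string, string | undefined> = {},
): string[] {
  // Fail-closed clients need no fallback settings at all.
  if (!opts.failOpen) return [];

  const problems: string[] = [];
  if (!opts.fallbackProviderUrl) {
    problems.push("fallbackProviderUrl is required when failOpen is enabled");
  }
  // The table lists OLYX_FALLBACK_API_KEY as the default source for the key.
  if (!opts.fallbackApiKey && !env["OLYX_FALLBACK_API_KEY"]) {
    problems.push(
      "fallbackApiKey or OLYX_FALLBACK_API_KEY is required when failOpen is enabled",
    );
  }
  return problems;
}
```

In an application you would pass `process.env` as the second argument and refuse to start if the returned list is non-empty.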

client.execute — the primary call

client.execute is the single entry point for all governed AI calls. Under the hood it:

Routes the input through the safety check
Selects the best model tier for this request
Calls the model with PII removed
Records cost, latency, and routing decision to the trace
Returns structured output alongside governance metadata

const trace  = await client.traces.create({
  metadata: { userId: "u_123", intent: "translation" },
  revenue:  0.10,  // what you charge for this call — enables margin tracking
});

const result = await client.execute({
  traceId: trace.data.id,
  input:   "Translate to French: Hello, world.",
});

// Mark the trace complete — triggers optimization grading
await client.traces.complete(trace.data.id);

Response shape

result.data.output    // model output string
result.data.stepId    // step ID — links to this exact call in the dashboard
result.data.model     // resolved model, e.g. "gpt-4o-mini"
result.data.status    // undefined (normal) | "tool_calls_pending" | "blocked"
result.data.reason    // reason string when blocked

// Convenience properties on the result wrapper:
result.blocked           // true if a safety or policy rule blocked the request
result.toolCallsPending  // true when the model wants to call a tool
result.toolCalls         // array of tool call objects (when toolCallsPending is true)
result.bypass            // true if the call bypassed governance (failOpen path)
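
Because the flags live on the wrapper and the payload lives on `.data`, it can help to collapse a result into one discriminated value before branching. This is a minimal sketch under assumptions: `ExecuteResultLike`, `Outcome`, and `classify` are illustrative names, and the interface covers only the fields listed above rather than the SDK's real `OlyxResult<T>`.

```typescript
// Minimal structural type covering only the fields listed above. The real
// OlyxResult<T> exported by the SDK may carry more.
interface ExecuteResultLike {
  blocked: boolean;
  toolCallsPending: boolean;
  bypass: boolean;
  data: { output?: string; reason?: string; stepId?: string };
}

type Outcome =
  | { kind: "blocked"; reason: string }
  | { kind: "tool_calls_pending" }
  | { kind: "ok"; output: string; governed: boolean };

// Checks the wrapper flags in priority order: blocked, then pending tools,
// then a normal completion (which may or may not have bypassed governance).
function classify(result: ExecuteResultLike): Outcome {
  if (result.blocked) {
    return { kind: "blocked", reason: result.data.reason ?? "unspecified" };
  }
  if (result.toolCallsPending) {
    return { kind: "tool_calls_pending" };
  }
  return {
    kind: "ok",
    output: result.data.output ?? "",
    governed: !result.bypass,
  };
}
```

A `switch` on `classify(result).kind` then gives the compiler enough information to flag any unhandled case.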

Messages API

Pass a full conversation history instead of a plain string:

const result = await client.execute({
  traceId:  trace.data.id,
  messages: [
    { role: "system",    content: "You are a helpful assistant." },
    { role: "user",      content: "What is the capital of France?" },
    { role: "assistant", content: "Paris." },
    { role: "user",      content: "And Germany?" },
  ],
});

Simulate / dry-run

Before committing to a call, ask Olyx what it would do — no model is invoked, no cost is incurred:

const sim = await client.simulate.create({
  input:    "Summarise this quarterly financial report...",
  metadata: { userId: "u_123", intent: "summarization" },
});

console.log(sim.data.status);         // "resolved" | "blocked" | "unconfigured"
console.log(sim.data.model);          // "gpt-4o-mini" | null
console.log(sim.data.estimatedCost);  // 0.00045 (USD)
console.log(sim.data.riskScore);      // 0.08  (0–1; above 0.7 routes to Secure tier)
console.log(sim.data.tier);           // "medium" | null
console.log(sim.data.fallbackPath);   // ["gpt-4o-mini", "gpt-4o"]
console.log(sim.data.reason);         // null unless blocked/unconfigured

Full response shape

sim.data.status         // "resolved" | "blocked" | "unconfigured"
sim.data.model          // chosen model or null
sim.data.estimatedCost  // estimated USD cost
sim.data.riskScore      // composite risk score (0–1)
sim.data.tier           // matched tier
sim.data.fallbackPath   // ordered fallback models, chosen first
sim.data.reason         // human-readable reason when blocked/unconfigured

Use simulate for pre-flight budget gates, compliance checks, and user-facing cost previews before a model is invoked: estimatedCost feeds the budget check, and fallbackPath shows the failover order.
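
A pre-flight budget gate over a simulation result might look like the sketch below. It assumes the response shape listed above; `SimulationLike`, `Gate`, and `budgetGate` are hypothetical names, and treating `estimatedCost` as nullable is an assumption rather than documented behaviour.

```typescript
// Structural view of the simulate response, limited to the fields used here.
// Nullability of estimatedCost is an assumption.
interface SimulationLike {
  data: {
    status: "resolved" | "blocked" | "unconfigured";
    estimatedCost: number | null;
    reason: string | null;
  };
}

type Gate = { proceed: true } | { proceed: false; why: string };

// Decides whether to go ahead with the real execute call.
function budgetGate(sim: SimulationLike, maxUsd: number): Gate {
  const { status, estimatedCost, reason } = sim.data;
  // A blocked or unconfigured simulation can never proceed.
  if (status !== "resolved") {
    return { proceed: false, why: reason ?? status };
  }
  if (estimatedCost === null) {
    return { proceed: false, why: "no cost estimate available" };
  }
  if (estimatedCost > maxUsd) {
    return {
      proceed: false,
      why: `estimated cost ${estimatedCost} USD exceeds limit ${maxUsd} USD`,
    };
  }
  return { proceed: true };
}
```

Call `client.simulate.create(...)` first, pass the result to `budgetGate`, and only call `client.execute` when `proceed` is true.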

Policy hooks

Project policy is enforced server-side. In version 0.1.x of the TypeScript SDK, client.execute does not accept a separate policy object or an arbitrary metadata payload.

Attach context to the trace before executing. That keeps attribution, routing context, and dashboard search tied to the whole workflow instead of one step.

const trace = await client.traces.create({
  metadata: {
    userId: "u_123",
    intent: "summarization",
    feature: "research_assistant",
  },
});

const result = await client.execute({
  traceId:  trace.data.id,
  input:    "Summarise our internal acquisition memo.",
});

Use project settings and model registry configuration for routing, spend caps, and selected private-agent behavior.

Context	Where to set it
User or tenant attribution	`client.traces.create({ metadata })`
Spend caps	Project and API key settings
Model routing	Model Registry and routing tiers
Private-agent paths	Agent deployment plus project routing settings

Blocked responses

A blocked response is a governance event, not an exception:

const result = await client.execute({ traceId: trace.data.id, input: "..." });

if (result.blocked) {
  // Input was stopped by a safety rule, PII guardrail, or policy constraint.
  // result.data.stepId links to the exact blocked step in the dashboard.
  logGovernanceEvent(result.data.reason, { stepId: result.data.stepId });
} else {
  res.json({ output: result.data.output });
}

Tool calls

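Pass tool definitions to execute and Olyx translates the schema for each provider automatically. When the model requests a tool, the result comes back with toolCallsPending set; your code runs the tool and passes the results to the next execute call until no calls remain: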

const tools = [{
  name:        "get_weather",
  description: "Get current weather for a city",
  parameters:  {
    type:       "object",
    properties: { city: { type: "string" } },
    required:   ["city"],
  },
}];

const trace = await client.traces.create({ metadata: { userId: "u_123" } });
let result = await client.execute({ traceId: trace.data.id, input: "What is the weather in London?", tools });

while (result.toolCallsPending) {
  const toolResults = await Promise.all(
    result.toolCalls.map(async (call) => ({
      toolCallId: call.id,
      name:       call.name,
      content:    await dispatchTool(call.name, call.input),
    }))
  );

  // Continue the same trace — pass back the tool results
  result = await client.execute({
    traceId:      trace.data.id,
    parentStepId: result.data.stepId,
    toolResults,
  });
}

console.log(result.data.output);
await client.traces.complete(trace.data.id);
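
The loop above calls `dispatchTool` without defining it. One possible shape is a registry keyed by tool name, sketched below. The handler body is a stub, the `ToolHandler` type is illustrative, and returning a JSON string as `content` is an assumption about what the SDK accepts.

```typescript
// Hypothetical handler signature: takes the tool input, returns text content.
type ToolHandler = (input: Record<string, unknown>) => Promise<string> | string;

const toolHandlers: Record<string, ToolHandler> = {
  // Stub only: a real handler would call a weather service here.
  get_weather: (input) =>
    JSON.stringify({ city: String(input["city"]), note: "stub response" }),
};

// Looks up the handler by name and converts failures into an error payload,
// so the model sees what went wrong instead of the loop throwing.
async function dispatchTool(
  name: string,
  input: Record<string, unknown>,
): Promise<string> {
  const handler = toolHandlers[name];
  if (!handler) {
    return JSON.stringify({ error: `unknown tool: ${name}` });
  }
  try {
    return await handler(input);
  } catch (err) {
    const message = err instanceof Error ? err.message : String(err);
    return JSON.stringify({ error: message });
  }
}
```

Returning errors as content rather than throwing is a design choice: it keeps the trace intact and lets the model recover, but you may prefer to abort the loop for tools with side effects.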

MCP tools in service classes

Initialize the client once and declare MCP servers as private configuration. Each service owns its own MCP scope — the client is reused across calls.

import Olyx from "@olyx-labs/olyx";

class DocumentService {
  private client = new Olyx({ apiKey: process.env.OLYX_API_KEY! });

  async summarise(documentId: string, userId: string): Promise<string> {
    const trace  = await this.client.traces.create({
      metadata: { userId, intent: "summarization" },
    });
    const result = await this.client.execute({
      traceId:  trace.data.id,
      input:    `Summarise document ${documentId}`,
      tools:    this.mcpTools(),
    });
    await this.client.traces.complete(trace.data.id);
    return result.data.output ?? "";
  }

  private mcpTools() {
    return [{
      type:             "mcp",
      serverLabel:      "documents",
      serverUrl:        process.env.DOCS_MCP_URL!,
      requireApproval:  "never",
    }];
  }
}

Services with different MCP servers compose naturally:

class SearchService {
  private client = new Olyx({ apiKey: process.env.OLYX_API_KEY! });

  async query(q: string, userId: string): Promise<string> {
    const trace  = await this.client.traces.create({
      metadata: { userId, intent: "search" },
    });
    const result = await this.client.execute({
      traceId:  trace.data.id,
      input:    q,
      tools:    this.mcpTools(),
    });
    await this.client.traces.complete(trace.data.id);
    return result.data.output ?? "";
  }

  private mcpTools() {
    return [{
      type:            "mcp",
      serverLabel:     "search",
      serverUrl:       process.env.SEARCH_MCP_URL!,
      requireApproval: "never",
    }];
  }
}

class AnalyticsService {
  private client = new Olyx({ apiKey: process.env.OLYX_API_KEY! });

  async insight(metric: string, windowDays: number, userId: string): Promise<string> {
    const trace  = await this.client.traces.create({
      metadata: { userId, intent: "analytics" },
    });
    const result = await this.client.execute({
      traceId:  trace.data.id,
      input:    `Explain the trend in ${metric} over the last ${windowDays} days.`,
      tools:    this.mcpTools(),
    });
    await this.client.traces.complete(trace.data.id);
    return result.data.output ?? "";
  }

  private mcpTools() {
    return [{
      type:            "mcp",
      serverLabel:     "data_warehouse",
      serverUrl:       process.env.DW_MCP_URL!,
      requireApproval: "never",
      vpcOnly:         true,
    }];
  }
}

vpcOnly is a routing intent flag. Entitlement and access control are enforced by the Olyx gateway, not by client-side SDK checks.

In Express / Next.js, instantiate services once at startup so each client is created once and reused across requests:

// app.ts
export const documentService  = new DocumentService();
export const searchService    = new SearchService();
export const analyticsService = new AnalyticsService();

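// routes/reports.ts
import { Router } from "express";
import { analyticsService } from "../app.js";

export const reportsRouter = Router();

// Assumes auth middleware has already populated req.user.
reportsRouter.get("/reports/:metric", async (req, res) => {
  const output = await analyticsService.insight(req.params.metric, 30, req.user.id);
  res.json({ output });
});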

Multi-step workflows (explicit trace control)

For agentic workflows that span multiple execute calls — where you want the full chain visible as a single trace — bind calls to an explicit trace:

// Create a trace once for the full workflow
const trace = await client.traces.create({
  metadata: { userId: "u_123", task: "research_report" },
  revenue:  2.00,  // what you charge for this workflow
});

// Each call bound to the same trace
const step1 = await client.execute({
  traceId: trace.data.id,
  input:   "Find recent papers on transformer efficiency.",
});

const step2 = await client.execute({
  traceId: trace.data.id,
  input:   `Summarise: ${step1.data.output}`,
});

// Complete triggers optimization grading across all steps
await client.traces.complete(trace.data.id);

// All steps are linked under one trace in the dashboard
console.log(trace.data.id);
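
If several workflows follow this create, execute, complete pattern, a small wrapper can make sure the trace is always completed. The sketch below is hypothetical: `withTrace` and `TraceClientLike` are illustrative names, and the interface describes only the two trace methods used on this page.

```typescript
// Structural view of the client: just the trace methods shown on this page.
interface TraceClientLike {
  traces: {
    create(args: {
      metadata?: Record<string, unknown>;
      revenue?: number;
    }): Promise<{ data: { id: string } }>;
    complete(id: string): Promise<unknown>;
  };
}

// Creates a trace, runs the workflow against it, and completes the trace
// whether the workflow resolves or throws.
async function withTrace<T>(
  client: TraceClientLike,
  args: { metadata?: Record<string, unknown>; revenue?: number },
  work: (traceId: string) => Promise<T>,
): Promise<T> {
  const trace = await client.traces.create(args);
  try {
    return await work(trace.data.id);
  } finally {
    // Completing in `finally` is a design choice: failed workflows are
    // closed too, rather than left open in the dashboard.
    await client.traces.complete(trace.data.id);
  }
}
```

Usage would look like `await withTrace(client, { metadata: { userId: "u_123" } }, async (traceId) => { ... })`, with each `client.execute` call inside the callback passing `traceId`.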

Direct model runs

client.runs.create bypasses the routing engine and calls a specific model directly. Use it when you need a controlled comparison (A/B between two models on the same input), a test harness that must hit a known model, or a debug call where routing itself is the variable you’re holding fixed.

const [runA, runB] = await Promise.all([
  client.runs.create({ traceId: trace.data.id, model: "gpt-4o",      input: "Summarise: ..." }),
  client.runs.create({ traceId: trace.data.id, model: "gpt-4o-mini", input: "Summarise: ..." }),
]);

// Compare quality vs cost before choosing a routing tier
console.log(runA.data.output, runB.data.output);

Response shape:

run.data.output   // model output string
run.data.model    // the model that was called — confirms the exact version used
run.data.stepId   // links this call to the trace in the dashboard

Unlike execute, runs.create does not apply routing rules, fallback chains, or cost gating. PII scrubbing and audit logging still apply — the call is always traceable.

Safety checks without a model call

Run the full safety filter in isolation — no token costs, just a pass/fail result. Use this to gate user input before an expensive model call, or to surface which specific check failed.

const check = await client.checks.create({
  traceId: trace.data.id,
  input:   userInput,
});

if (!check.data.allowed) {
  return res.status(400).json({ error: "Input did not pass safety check." });
}

Response shape:

check.data.allowed          // false if any check failed
check.data.stepId           // optional step id written to the trace
check.data.reason           // present when allowed is false
check.data.meta.piiDetected
check.data.meta.injectionAttempt
check.data.meta.secretLeaked
check.data.meta.riskScore

Log reason plus meta.riskScore alongside your governance event so support can diagnose blocks quickly.
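
One way to build that log record is sketched below. It assumes the three detection fields are booleans and `riskScore` is a number, which the response shape suggests but does not state; `CheckLike`, `GovernanceEvent`, and `toGovernanceEvent` are illustrative names.

```typescript
// Structural view of the checks response. Field types are assumptions
// inferred from the names in the response shape above.
interface CheckLike {
  data: {
    allowed: boolean;
    stepId?: string;
    reason?: string;
    meta: {
      piiDetected: boolean;
      injectionAttempt: boolean;
      secretLeaked: boolean;
      riskScore: number;
    };
  };
}

interface GovernanceEvent {
  stepId: string | null;
  reason: string;
  riskScore: number;
  flags: string[];
}

// Returns null for allowed input; otherwise a flat record for your logger.
function toGovernanceEvent(check: CheckLike): GovernanceEvent | null {
  if (check.data.allowed) return null;
  const { meta, reason, stepId } = check.data;
  const flags: string[] = [];
  if (meta.piiDetected) flags.push("pii_detected");
  if (meta.injectionAttempt) flags.push("injection_attempt");
  if (meta.secretLeaked) flags.push("secret_leaked");
  return {
    stepId: stepId ?? null,
    reason: reason ?? "unspecified",
    riskScore: meta.riskScore,
    flags,
  };
}
```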

Attaching logs to a trace

Attach arbitrary data (user feedback, downstream metrics) to a trace without making a model call:

const log = await client.logs.create({
  traceId:      trace.data.id,
  parentStepId: step2.data.stepId,  // link to the specific step
  output:       { rating: 5, feedback: "Great summary!", latencyMs: 340 },
});

log.data.stepId        // recorded log step
log.data.traceId       // trace that received the log
log.data.parentStepId  // linked parent step, if supplied

Embeddings

The TypeScript SDK does not expose a dedicated client.embeddings resource. Use the OpenAI-compatible gateway path from Quick Start instead — point your existing embedding client at the Olyx gateway and PII scrubbing is applied before the vector is generated:

import OpenAI from "openai";

const openai = new OpenAI({
  apiKey:  process.env.OLYX_API_KEY,
  baseURL: "https://olyx.ai/v1",
});

// PII is stripped before the vector is generated.
// If any chunk is blocked, the request throws with status 422.
const response = await openai.embeddings.create({
  model: "text-embedding-3-small",
  input: ["Document one.", "Document two."],
});

Binding embedding calls to a workflow trace

Pass X-Olyx-Trace-Id as a request header to attach the embedding call to an existing SDK-created trace. The proxy then adds the step to your trace instead of creating a new one — giving you a single trace for the full workflow with one total cost and one optimization grade.

const trace = await client.traces.create({ metadata: { userId: "u_123" } });

// Create the embeddings client with the trace bound
const openai = new OpenAI({
  apiKey:         process.env.OLYX_API_KEY,
  baseURL:        "https://olyx.ai/v1",
  defaultHeaders: { "X-Olyx-Trace-Id": trace.data.id },
});

// This embedding step is now part of the trace
await openai.embeddings.create({ model: "text-embedding-3-small", input: ["..."] });

// So is this execute call
const result = await client.execute({ traceId: trace.data.id, input: "Summarise..." });

// Completing the trace grades all steps together
await client.traces.complete(trace.data.id);

The gateway enforces customer scoping: if the trace belongs to a different account, the request is rejected with a 404, the same response as for a trace that does not exist. Passing a completed trace ID returns a 422; start a new trace instead.

A native client.embeddings.create SDK method is on the roadmap.

User retention analytics

Attach user and feature metadata to the trace before executing. This gives Olyx enough context to group AI feature usage by user, tenant, and product surface.

const trace = await client.traces.create({
  metadata: {
    userId:  "u_123",
    orgId:   "org_abc",
    intent:  "email_draft",
    feature: "sales_assistant",
  },
});

const result = await client.execute({
  traceId:  trace.data.id,
  input:    "Draft a follow-up email for this deal.",
});

With consistent userId and intent tagging across calls, the Olyx dashboard surfaces:

Per-user AI cost — identify your highest-value users and price accordingly
Feature adoption — which AI surfaces are used vs. ignored
Optimization grades per cohort — are heavy users getting efficient routing?
Blocked call rate — flag users hitting safety guardrails repeatedly

Use this signal to inform retention decisions: users with high AI engagement and low block rates are your stickiest users.

The Safety Valve: Fail-Closed vs. Fail-Open

Fail-closed (default): if the gateway is unreachable, execute throws CircuitBreakerError. No prompt leaves your process ungoverned.

Fail-open: set failOpen: true to call fallbackProviderUrl directly during outages.

// Per-call override: only this call may bypass governance.
// The client must already be configured with a fallbackProviderUrl.
const result = await client.execute({
  traceId:  trace.data.id,
  input:    "Summarise this internal changelog.",
  failOpen: true,
});

// true only when the gateway was unreachable and the fallback provider was used;
// a bypassed call has no audit trail
console.log(result.bypass);

// Global opt-in: every execute call on this client instance may bypass
const failOpenClient = new Olyx({
  apiKey:              process.env.OLYX_API_KEY!,
  failOpen:            true,
  fallbackProviderUrl: "https://api.openai.com/v1",
  fallbackModel:       "gpt-4o-mini",
});

Testing

Use a dedicated test project with a project-scoped API key. All SDK calls route to the real Olyx backend: traces are created, policy is enforced, and executions count against your quota exactly as in production, so what you observe in tests reflects your real governance rules.

import Olyx from "@olyx-labs/olyx";

const client = new Olyx({
  apiKey:  process.env.OLYX_TEST_API_KEY!,
  baseUrl: process.env.OLYX_BASE_URL ?? "https://olyx.ai",
});

Set a spend cap on the test project key to bound runaway test costs. Use a separate test project to keep production trace history clean.

Controlling test outcomes

Use client.simulate to exercise your routing policy without invoking a model:

const sim = await client.simulate.create({
  input: "What is 2+2?",
});
// sim.data.status         → "resolved"
// sim.data.estimatedCost  → 0.00018
// sim.data.model          → "gpt-4o-mini"

Use client.checks to test guardrail logic against specific inputs:

const check = await client.checks.create({
  traceId: trace.data.id,
  input:   userInput,
});
if (!check.data.allowed) {
  return res.status(403).json({ error: "Request blocked" });
}

`traces.complete` result shape

const completion = await client.traces.complete(trace.data.id);

completion.data.id                 // trace ID
completion.data.status             // "completed"
completion.data.optimizationGrade  // "A" | "B" | "C" | "D"
completion.data.totalCost          // total USD cost across all steps
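
In a test suite, the grade can serve as a regression gate. The sketch below assumes "A" is the best grade and "D" the worst, which the ordering above implies but does not state; `gradeAtLeast` is a hypothetical helper.

```typescript
type Grade = "A" | "B" | "C" | "D";

// Best to worst. The ordering is an assumption based on the listed values.
const GRADE_ORDER: Grade[] = ["A", "B", "C", "D"];

// True when `actual` is as good as or better than `minimum`.
function gradeAtLeast(actual: Grade, minimum: Grade): boolean {
  return GRADE_ORDER.indexOf(actual) <= GRADE_ORDER.indexOf(minimum);
}
```

A test could then assert `gradeAtLeast(completion.data.optimizationGrade, "B")` after completing the trace.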

Offline testing (enterprise)

Enterprise plans include an offline testing flag that enables zero-network test execution. The SDK reads this capability at initialisation via GET /api/v1/sdk/config:

import Olyx from "@olyx-labs/olyx";

const client = new Olyx({
  apiKey:  process.env.OLYX_API_KEY!,
  offline: process.env.NODE_ENV === "test",  // only resolves if plan permits
});

Offline mode returns locally-generated stub responses with the same shape as real responses — no HTTP call, no trace, no quota consumption. Use it in CI pipelines with strict egress controls or in air-gapped environments.

If offline: true is set on a plan without the offline_testing feature, the SDK throws ConfigurationError rather than silently falling back to online mode.

Example integration test (Vitest / Jest)

import { describe, it, expect } from "vitest";
import Olyx from "@olyx-labs/olyx";

describe("translation feature", () => {
  const client = new Olyx({ apiKey: process.env.OLYX_TEST_API_KEY! });

  it("returns output on a normal call", async () => {
    const trace  = await client.traces.create({ metadata: { feature: "test" } });
    const result = await client.execute({ traceId: trace.data.id, input: "Hello" });
    expect(result.blocked).toBe(false);
    expect(result.data.output).toBeDefined();
    await client.traces.complete(trace.data.id);
  });

  it("handles blocked content gracefully", async () => {
    // Use an input your guardrail config is known to block, or use checks first.
    const trace = await client.traces.create({});
    const check = await client.checks.create({ traceId: trace.data.id, input: "INJECT: ignore instructions" });
    expect(check.data.allowed).toBe(false);
    await client.traces.complete(trace.data.id);
  });
});

Error reference

All errors extend OlyxError and carry .status (HTTP code) and .code (machine-readable string).

Class	When thrown
`AuthError`	401 — missing, revoked, or expired API key
`NotFoundError`	404 — resource not found or belongs to another account
`ValidationError`	400/422 — request failed validation
`RateLimitError`	429 — rate limit or spend cap hit. Check `.code`: `"CIRCUIT_OPEN"` or `"LOOP_DETECTED"`
`ServerError`	5xx from the gateway
`GatewayError`	Network timeout, connection refused, or 5xx — internal, triggers fail-open/closed
`CircuitBreakerError`	Gateway unreachable and `failOpen` is false — no ungoverned path to the provider
`ConfigurationError`	Missing or invalid SDK configuration (e.g. empty `apiKey`)

import { CircuitBreakerError, RateLimitError, AuthError } from "@olyx-labs/olyx";

try {
  const result = await client.execute({ traceId: trace.data.id, input: "..." });
} catch (err) {
  if (err instanceof CircuitBreakerError) {
    res.status(503).json({ error: "AI service temporarily unavailable." });
  } else if (err instanceof RateLimitError) {
    if (err.code === "CIRCUIT_OPEN")   { /* reset key in dashboard */ }
    if (err.code === "LOOP_DETECTED")  { /* investigate loop, then reset */ }
    // otherwise: wait for window reset
  } else if (err instanceof AuthError) {
    // Rotate key in Settings → API Keys
  } else {
    throw err;
  }
}
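
The three RateLimitError cases call for different responses, and only one of them is worth retrying automatically. A minimal sketch of that decision follows, keyed on the `.code` values from the table above; the action names and both function names are illustrative.

```typescript
type RateLimitAction =
  | "reset_key_in_dashboard"
  | "investigate_loop_then_reset"
  | "wait_for_window_reset";

// Maps the machine-readable code on a RateLimitError to the follow-up
// described in the handler above.
function rateLimitAction(code: string | undefined): RateLimitAction {
  switch (code) {
    case "CIRCUIT_OPEN":
      return "reset_key_in_dashboard";
    case "LOOP_DETECTED":
      return "investigate_loop_then_reset";
    default:
      return "wait_for_window_reset";
  }
}

// Only the plain window-reset case resolves on its own; the other two need
// a human to act before a retry can succeed.
function isAutoRetryable(code: string | undefined): boolean {
  return rateLimitAction(code) === "wait_for_window_reset";
}
```

Inside the `RateLimitError` branch you would call `isAutoRetryable(err.code)` and page an operator when it returns false.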

Private Agent Routes

The Olyx Agent is a lightweight, outbound-only container for selected private beta deployments. Your application points the SDK at an internal hostname, and the agent forwards Olyx requests outbound through your normal network controls.

Use the agent when the hosted gateway cannot reach an internal provider endpoint or when your deployment needs an internal egress point. Most beta teams can start with the hosted gateway and add the agent later.

Start the agent

docker run -d \
  --name olyx-agent \
  -e OLYX_API_KEY="$OLYX_API_KEY" \
  -p 4000:4000 \
  olyxlabs/olyx-agent:latest

The agent exposes the same API shape as the hosted gateway. It applies your project-level policy before forwarding requests through the configured outbound path.

Point the SDK at the agent

const client = new Olyx({
  apiKey:  process.env.OLYX_API_KEY!,
  baseUrl: "http://olyx-agent:4000",   // internal hostname
});

SDK behavior is the same from the application perspective — only baseUrl changes.

Kubernetes sidecar

Run the agent as a sidecar in the same pod as your application:

# deployment.yaml (relevant section)
containers:
  - name: app
    image: your-app:latest
    env:
      - name: OLYX_GATEWAY_URL
        value: "http://localhost:4000"
  - name: olyx-agent
    image: olyxlabs/olyx-agent:latest
    env:
      - name: OLYX_API_KEY
        valueFrom:
          secretKeyRef:
            name: olyx-secrets
            key: api-key
    ports:
      - containerPort: 4000

Operational behavior

Behavior	Detail
Outbound-first	Designed for deployments where your network initiates connections outward.
Credential placement	Keep the Olyx API key in the agent or secret manager rather than hardcoding it in app code.
Network visibility	Route Olyx-bound model traffic through infrastructure your team already monitors.
Policy path	Project-level routing, cost caps, and PII checks still happen before provider execution.
Fail-closed default	If the agent is unreachable, `CircuitBreakerError` is thrown unless you explicitly opt into fail-open behavior.

TLS

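If your network terminates TLS at an internal boundary, make Node.js trust your CA. Node.js reads NODE_EXTRA_CA_CERTS only once, when the process launches, so assigning process.env.NODE_EXTRA_CA_CERTS at runtime has no effect. Set it in the environment that starts the process instead:

# shell, Dockerfile ENV, or Kubernetes env entry; must be set before node starts
export NODE_EXTRA_CA_CERTS=/etc/ssl/internal/ca-bundle.crt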

Never set NODE_TLS_REJECT_UNAUTHORIZED=0 in production: it disables TLS certificate verification. Reserve it for local development against a self-signed certificate.

Verifying connectivity

curl -s http://olyx-agent:4000/up
# → {"status":"ok","version":"1.4.2"}

In your application startup:

const health = await client.ping();
if (health.status !== "ok") {
  throw new Error("Olyx agent unreachable at startup");
}
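
With a sidecar, the application container can start before the agent is listening, so a single ping at startup may fail spuriously. The sketch below retries a few times before giving up. It takes the ping as a plain function so it does not depend on the SDK; `waitForHealthy` and its defaults are hypothetical.

```typescript
// Retries a health probe until it reports "ok" or attempts run out.
// `ping` is any function with the shape used above, for example
// () => client.ping().
async function waitForHealthy(
  ping: () => Promise<{ status: string }>,
  attempts = 5,
  delayMs = 500,
): Promise<void> {
  let lastError: unknown = new Error("no attempts made");
  for (let i = 0; i < attempts; i++) {
    try {
      const health = await ping();
      if (health.status === "ok") return;
      lastError = new Error(`unexpected status: ${health.status}`);
    } catch (err) {
      // Connection refused and similar failures land here.
      lastError = err;
    }
    // Pause between attempts, but not after the final one.
    if (i < attempts - 1) {
      await new Promise((resolve) => setTimeout(resolve, delayMs));
    }
  }
  throw new Error(
    `Olyx agent not healthy after ${attempts} attempts: ${String(lastError)}`,
  );
}
```

At startup this would be called as `await waitForHealthy(() => client.ping())`, letting the thrown error stop the process if the agent never comes up.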

Gateway migration through the agent

Existing code using the OpenAI SDK can route through the agent by changing the base URL to your internal agent hostname:

import OpenAI from "openai";

const client = new OpenAI({
  apiKey:  process.env.OLYX_API_KEY,
  baseURL: "http://olyx-agent:4000/v1",
});

// All existing code unchanged — PII scrubbing and routing applied by the agent
const response = await client.chat.completions.create({ model: "gpt-4o", messages: [...] });

Regional routing

If you run services in multiple regions, put regional agent instances behind your own internal routing layer and point the SDK at that stable baseUrl. Keep the first beta deployment simple; add regional routing only after trace latency shows that it matters.