Model Registry
The Model Registry is the list of models a project is allowed to use. Application code should call Olyx through the SDK; the registry decides which provider endpoint, credentials, token rates, and routing tier are available behind that call.
During closed beta, start with one or two models you can reason about. Add private routes, fallback chains, and additional providers only when trace data shows you need them.
Why Use It
The registry keeps provider configuration out of application call sites.
| Capability | What it gives you |
|---|---|
| Provider changes without app rewrites | Change model configuration in one project instead of every service. |
| Project-specific rates | Track cost using the rates your team configures for that model. |
| Routing tiers | Map request classes to model identifiers. |
| Private routes | Represent agent-reachable models or internal endpoints when beta deployments need them. |
| Fallbacks | Define a controlled secondary route for non-sensitive workloads. |
The registry is not a pricing source of truth for every provider. Keep provider prices current in your own settings and verify them during billing review.
Registry Shape
Each registry entry describes one model endpoint.
| Field | Description |
|---|---|
| name | Display label for humans. |
| identifier | Stable model name used by routing settings and traces. |
| provider | Provider adapter: openai, anthropic, gemini, bedrock, azure, or internal. |
| base_url | Provider or internal endpoint URL. |
| is_public | Whether the endpoint is treated as a public provider route. |
| input_cost_per_1k | Prompt-token rate used for project cost summaries. |
| output_cost_per_1k | Completion-token rate used for project cost summaries. |
| data_retention_days | Retention setting for trace data associated with the model. |
| api_key | Provider credential, stored server-side and not returned after creation. |
| additional_config | Provider-specific options, fallback identifiers, or region metadata. |
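As a rough illustration of how the two per-1k rate fields could feed a cost summary, the sketch below multiplies token counts by the configured rates. The `ModelRates` container and `estimate_cost` helper are hypothetical names for this example, not part of the Olyx SDK, and the actual summary logic may differ.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelRates:
    """Per-1k-token rates copied from a registry entry (hypothetical container)."""

    input_cost_per_1k: float
    output_cost_per_1k: float


def estimate_cost(rates: ModelRates, prompt_tokens: int, completion_tokens: int) -> float:
    """Estimate one call's cost from its token counts and the configured rates."""
    input_cost = prompt_tokens / 1000 * rates.input_cost_per_1k
    output_cost = completion_tokens / 1000 * rates.output_cost_per_1k
    return input_cost + output_cost


# Illustrative rates; use whatever your team has configured for the model.
rates = ModelRates(input_cost_per_1k=0.00015, output_cost_per_1k=0.0006)
cost = estimate_cost(rates, prompt_tokens=2000, completion_tokens=500)
print(f"{cost:.6f}")  # 2.0 * 0.00015 + 0.5 * 0.0006, about 0.0006
```

Because the rates are project-configured, a stale rate produces a stale estimate, which is why they should be checked during billing review.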
Adding a Model
Add models from the dashboard during normal beta setup. The API reference contains the low-level create and update endpoint details for automation.
Start with a minimal entry:
{
  "name": "GPT-4o Mini",
  "identifier": "gpt-4o-mini",
  "provider": "openai",
  "base_url": "https://api.openai.com/v1",
  "is_public": true,
  "input_cost_per_1k": 0.00015,
  "output_cost_per_1k": 0.0006,
  "data_retention_days": 30,
  "currency": "usd"
}
After creation, responses should show whether a provider credential exists without returning the secret itself.
| Response field | Why it matters |
|---|---|
| has_api_key | Confirms the model has a stored credential without exposing it. |
| fallback_identifier | Shows the configured fallback model, if any. |
| additional_config | Shows provider-specific metadata that is safe to return. |
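One way to picture the response behavior is a function that derives a safe view from a stored entry. This is a minimal sketch, assuming the stored entry is a plain dictionary; `to_response` is a hypothetical helper, and the real server-side handling is not documented here.

```python
from typing import Any, Dict


def to_response(stored: Dict[str, Any]) -> Dict[str, Any]:
    """Build a response view of a stored entry that never includes the secret."""
    # Copy every field except the credential itself.
    response = {key: value for key, value in stored.items() if key != "api_key"}
    # Report only whether a credential exists.
    response["has_api_key"] = bool(stored.get("api_key"))
    # Surface the fallback, if one is configured (assumed to live in additional_config).
    config = stored.get("additional_config") or {}
    response["fallback_identifier"] = config.get("fallback_identifier")
    return response


stored_entry = {
    "identifier": "gpt-4o",
    "api_key": "sk-example-not-a-real-key",
    "additional_config": {"fallback_identifier": "gpt-4o-mini"},
}
view = to_response(stored_entry)
print("api_key" in view, view["has_api_key"], view["fallback_identifier"])
# prints: False True gpt-4o-mini
```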
Public vs Private Models
Use is_public as a routing and review signal. Setting it does not make a network path private; it tells Olyx how the project intends to classify the model route.
| Model type | Typical use |
|---|---|
| Public | Hosted providers reached through public APIs. |
| Private | Agent-reachable model servers, internal OpenAI-compatible endpoints, or selected private deployments. |
Private model routes require infrastructure setup outside the registry. For most closed-beta teams, start with public models and add private agent routes only when the workload needs that network posture.
Routing Tiers
Routing tiers map project behavior to registered model identifiers. The SDK call stays the same; project settings decide which registered model handles the request.
Example project setting:
{
  "routing_tiers": {
    "simple": "gemini-2.0-flash",
    "medium": "gpt-4o",
    "complex": "claude-sonnet-4-6",
    "secure": "private-llama"
  }
}
| Tier | Use it for |
|---|---|
| simple | Short, low-risk, low-complexity requests. |
| medium | Default application work and structured output. |
| complex | Longer context, code, deep reasoning, or higher-quality needs. |
| secure | Selected sensitive workloads when a private route is configured. |
Each value must match a registered model identifier in the project.
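The matching rule can be checked before a settings change goes live. The sketch below is one possible pre-flight check, assuming you have the project's registered identifiers available as a set; `find_unregistered_tiers` is a hypothetical helper, not an SDK function.

```python
from typing import Dict, Set


def find_unregistered_tiers(routing_tiers: Dict[str, str], registered: Set[str]) -> Dict[str, str]:
    """Return the tiers whose target identifier is missing from the registry."""
    return {
        tier: identifier
        for tier, identifier in routing_tiers.items()
        if identifier not in registered
    }


routing_tiers = {
    "simple": "gemini-2.0-flash",
    "medium": "gpt-4o",
    "complex": "claude-sonnet-4-6",
    "secure": "private-llama",
}
# Assumed registry contents for this example: the private model is not registered yet.
registered = {"gemini-2.0-flash", "gpt-4o", "claude-sonnet-4-6"}

print(find_unregistered_tiers(routing_tiers, registered))
# prints: {'secure': 'private-llama'}
```

An empty result means every tier points at a registered model.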
Fallback Chains
Fallback chains give non-sensitive workloads a secondary path when the primary model is unavailable. Keep them short and easy to explain during beta.
{
  "additional_config": {
    "fallback_identifier": "gpt-4o-mini"
  }
}
| Rule | Why |
|---|---|
| Use registered identifiers | The fallback must exist in the same project registry. |
| Keep chains short | Long chains are harder to debug and compare. |
| Avoid sensitive downgrades | Sensitive workloads should not silently fall back to an unintended public route. |
| Test with replays | Replays help compare output and cost before changing live routing. |
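The first three rules lend themselves to an automated check. The following is a sketch under stated assumptions: the registry is modeled as a dictionary keyed by identifier, the length limit of 3 is an arbitrary illustrative value, and both helper names are hypothetical. It does not describe how Olyx resolves fallbacks internally.

```python
from typing import Any, Dict, List


def resolve_fallback_chain(
    registry: Dict[str, Dict[str, Any]], start: str, max_length: int = 3
) -> List[str]:
    """Follow fallback_identifier links from start, rejecting bad chains."""
    chain: List[str] = []
    current = start
    while current is not None:
        if current not in registry:
            raise ValueError(f"unregistered identifier: {current}")
        if current in chain:
            raise ValueError(f"fallback cycle at: {current}")
        chain.append(current)
        if len(chain) > max_length:
            raise ValueError(f"chain longer than {max_length} models")
        config = registry[current].get("additional_config") or {}
        current = config.get("fallback_identifier")
    return chain


def has_sensitive_downgrade(registry: Dict[str, Dict[str, Any]], chain: List[str]) -> bool:
    """True when a private primary model would fall back to a public route."""
    primary_is_private = not registry[chain[0]].get("is_public", True)
    return primary_is_private and any(
        registry[identifier].get("is_public", True) for identifier in chain[1:]
    )


registry = {
    "gpt-4o": {"is_public": True, "additional_config": {"fallback_identifier": "gpt-4o-mini"}},
    "gpt-4o-mini": {"is_public": True},
    "private-llama": {"is_public": False, "additional_config": {"fallback_identifier": "gpt-4o-mini"}},
}

print(resolve_fallback_chain(registry, "gpt-4o"))
# prints: ['gpt-4o', 'gpt-4o-mini']
print(has_sensitive_downgrade(registry, resolve_fallback_chain(registry, "private-llama")))
# prints: True
```

Treating a missing is_public value as public is a deliberately conservative default for this check: an unclassified fallback is flagged rather than silently trusted.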
Tool Support
Tool definitions should be written once at the SDK layer. Olyx normalizes them for the selected model path where the provider adapter supports tool calling.
| Provider path | Tool handling |
|---|---|
| OpenAI-compatible | OpenAI function-calling format, passed through as-is. |
| Anthropic | Translates OpenAI function-calling format to Anthropic’s native tool schema. |
| Google Gemini | Translates OpenAI function-calling format to Gemini functionDeclarations. |
| AWS Bedrock | Translates to Bedrock Converse toolConfig / toolSpec shape. |
| Azure OpenAI | OpenAI function-calling format, passed through as-is. |
| MCP workflows | Your application executes tools and continues the trace with tool results. |
For SDK examples, see the MCP and Architecture pages. Keep provider tool differences out of application business logic wherever possible.
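To make the translation rows in the table concrete, the sketch below converts one OpenAI-style tool definition into the shapes the Anthropic and Bedrock Converse APIs publicly document. The two function names are hypothetical, and this is not Olyx's adapter code; the real adapters may handle more fields and edge cases.

```python
from typing import Any, Dict

# Schema used when an OpenAI tool declares no parameters.
EMPTY_SCHEMA: Dict[str, Any] = {"type": "object", "properties": {}}


def openai_tool_to_anthropic(tool: Dict[str, Any]) -> Dict[str, Any]:
    """Map an OpenAI function tool to Anthropic's name/description/input_schema shape."""
    function = tool["function"]
    return {
        "name": function["name"],
        "description": function.get("description", ""),
        "input_schema": function.get("parameters", EMPTY_SCHEMA),
    }


def openai_tool_to_bedrock(tool: Dict[str, Any]) -> Dict[str, Any]:
    """Map an OpenAI function tool to a Bedrock Converse toolSpec entry."""
    function = tool["function"]
    return {
        "toolSpec": {
            "name": function["name"],
            "description": function.get("description", ""),
            "inputSchema": {"json": function.get("parameters", EMPTY_SCHEMA)},
        }
    }


# Example tool definition in OpenAI function-calling format.
get_weather = {
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Look up current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}

print(openai_tool_to_anthropic(get_weather)["input_schema"]["required"])
# prints: ['city']
print(list(openai_tool_to_bedrock(get_weather)["toolSpec"].keys()))
# prints: ['name', 'description', 'inputSchema']
```

Writing the definition once in the OpenAI format and translating at the adapter boundary is what keeps these differences out of application code.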
Recommended Beta Setup
Start small enough that each route can be explained from traces.
| Phase | Setup |
|---|---|
| Day one | One public model, one project-scoped key, one staging trace. |
| First production traffic | Add production project, spend caps, and a medium/default route. |
| After trace history | Tune token rates, routing tiers, and grading baselines from real data. |
| When needed | Add private agent routes, fallback chains, or provider-specific models. |
This keeps the registry useful without making the closed-beta setup look overbuilt.