The LLM router acts as a unified gateway that provides access to multiple AI model providers through a single OpenAI-compatible API. Configure providers, models, rate limits, and more.

  1. Provider Configuration - Set up OpenAI, Anthropic, Google, and AWS Bedrock
  2. Model Management - Configure models and create aliases
  3. Token Rate Limiting - Control token consumption per user and model
  4. Token Forwarding - Allow users to provide their own API keys
  5. Header Rules - Transform and manage HTTP headers for providers

Enable LLM routing in your nexus.toml:

```toml
[llm]
enabled = true   # Enable LLM functionality
path = "/llm"    # API endpoint path

# Configure OpenAI provider
[llm.providers.openai]
type = "openai"
api_key = "{{ env.OPENAI_API_KEY }}"

# Must explicitly configure models
[llm.providers.openai.models.gpt-4]
[llm.providers.openai.models."gpt-3.5-turbo"]

# Configure Anthropic provider
[llm.providers.anthropic]
type = "anthropic"
api_key = "{{ env.ANTHROPIC_API_KEY }}"

[llm.providers.anthropic.models."claude-3-5-sonnet-20241022"]
```
With routing enabled, the OpenAI-compatible endpoints are available under the configured path:

```bash
# List available models
curl http://localhost:8000/llm/v1/models

# Chat completion
curl -X POST http://localhost:8000/llm/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "openai/gpt-4",
    "messages": [{"role": "user", "content": "Hello!"}]
  }'
```
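The completion endpoint returns the standard OpenAI chat completion shape. The response below is purely illustrative: IDs, timestamps, and token counts are made up, and the exact contents of fields such as model depend on the configured provider:

```json
{
  "id": "chatcmpl-abc123",
  "object": "chat.completion",
  "created": 1730000000,
  "model": "openai/gpt-4",
  "choices": [
    {
      "index": 0,
      "message": { "role": "assistant", "content": "Hello! How can I help you today?" },
      "finish_reason": "stop"
    }
  ],
  "usage": { "prompt_tokens": 9, "completion_tokens": 9, "total_tokens": 18 }
}
```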

Models are prefixed with their provider name:

  • Format: {provider_name}/{model_id}
  • Examples: openai/gpt-4, anthropic/claude-3-5-sonnet-20241022

Note: All models must be explicitly configured; requests for models that are not configured return a 404 error.

Here's a comprehensive configuration showing multiple providers and features:

```toml
[llm]
enabled = true
path = "/llm"

# OpenAI with multiple models
[llm.providers.openai]
type = "openai"
api_key = "{{ env.OPENAI_API_KEY }}"
forward_token = true  # Allow user-provided keys

[llm.providers.openai.models.gpt-4]
[llm.providers.openai.models."gpt-3.5-turbo"]

[llm.providers.openai.models.smart]
rename = "gpt-4"  # Alias: "openai/smart" → "gpt-4"

# Token rate limiting for OpenAI
[llm.providers.openai.rate_limits.per_user]
input_token_limit = 100000
interval = "60s"

# Anthropic configuration
[llm.providers.anthropic]
type = "anthropic"
api_key = "{{ env.ANTHROPIC_API_KEY }}"

[llm.providers.anthropic.models."claude-3-5-sonnet-20241022"]

[llm.providers.anthropic.models.fast]
rename = "claude-3-haiku-20240307"

# AWS Bedrock
[llm.providers.bedrock]
type = "bedrock"
region = "us-east-1"

[llm.providers.bedrock.models.claude]
rename = "anthropic.claude-3-sonnet-20240229-v1:0"
```

The LLM router is compatible with any OpenAI client library:

Python:

```python
from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:8000/llm/v1",
    api_key="not-used"
)
```
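A minimal usage sketch with the client above, using the prefixed model names described earlier (aliases such as openai/smart are requested the same way):

```python
# Request a completion through the router; the model name follows the
# {provider_name}/{model_id} format.
response = client.chat.completions.create(
    model="openai/gpt-4",
    messages=[{"role": "user", "content": "Summarize what an LLM router does."}],
)
print(response.choices[0].message.content)
```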
TypeScript:

```typescript
import OpenAI from 'openai';

const openai = new OpenAI({
  baseURL: 'http://localhost:8000/llm/v1',
  apiKey: 'not-used'
});
```
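Streaming works through the same clients. A minimal Python sketch using the client defined above, assuming the router streams chunks in the standard OpenAI SSE format:

```python
# Stream a response; content arrives incrementally as server-sent events.
stream = client.chat.completions.create(
    model="anthropic/claude-3-5-sonnet-20241022",
    messages=[{"role": "user", "content": "Write a haiku about routers."}],
    stream=True,
)
for chunk in stream:
    if chunk.choices and chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="", flush=True)
```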
Key features:

  • Unified API: Single endpoint for all providers
  • OpenAI Compatible: Works with existing OpenAI clients
  • Model Aliases: Create custom names for models
  • Token Rate Limiting: Control usage per user and model
  • Token Forwarding: Users can provide their own API keys (see the sketch after this list)
  • Header Transformation: Forward, insert, remove, and rename HTTP headers
  • Streaming Support: Real-time responses via SSE
  • Multiple Providers: Mix models from different vendors
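For token forwarding, a minimal sketch of the client side. The assumption here (not confirmed on this page) is that with forward_token = true the key supplied by the client as api_key, sent as the standard Authorization header, is forwarded to the provider in place of the key from nexus.toml; see the Token Forwarding docs for the exact header Nexus expects:

```python
import os
from openai import OpenAI

# Hypothetical token-forwarding setup: the user supplies their own provider
# key, which the router is assumed to forward when forward_token = true.
client = OpenAI(
    base_url="http://localhost:8000/llm/v1",
    api_key=os.environ["OPENAI_API_KEY"],  # the user's own OpenAI key
)
```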
Best practices:

  1. Explicit Model Configuration: Only configure models you need
  2. Use Environment Variables: Never hardcode API keys
  3. Configure Rate Limits: Protect against excessive usage (see the sketch after this list)
  4. Create Meaningful Aliases: Simplify model names for users
  5. Monitor Usage: Track token consumption and costs
  6. Test Thoroughly: Verify models before production
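For rate limits specifically, a sketch of client-side handling, assuming the router rejects requests that exceed a configured token limit with HTTP 429, which OpenAI clients surface as RateLimitError:

```python
import time
import openai

# Retry once after a pause if the per-user token limit is exceeded.
# Assumes the router answers with HTTP 429 when a configured limit is hit.
try:
    response = client.chat.completions.create(
        model="openai/gpt-4",
        messages=[{"role": "user", "content": "Hello!"}],
    )
except openai.RateLimitError:
    time.sleep(60)  # wait out the configured interval (e.g. "60s")
    response = client.chat.completions.create(
        model="openai/gpt-4",
        messages=[{"role": "user", "content": "Hello!"}],
    )
```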

For debugging, check:

  • Model availability: GET /llm/v1/models
  • Nexus logs: nexus --log debug
  • Provider authentication
  • Rate limit configuration
  • Network connectivity to providers