The LLM router acts as a unified gateway that provides access to multiple AI model providers through a single OpenAI-compatible API. Configure providers, models, rate limits, and more.
- Provider Configuration - Set up OpenAI, Anthropic, Google, and AWS Bedrock
- Model Management - Configure models and create aliases
- Token Rate Limiting - Control token consumption per user and model
- Token Forwarding - Allow users to provide their own API keys
- Header Rules - Transform and manage HTTP headers for providers
Enable LLM routing in your `nexus.toml`:
```toml
[llm]
enabled = true # Enable LLM functionality

# Configure protocol endpoints
[llm.protocols.openai]
enabled = true
path = "/llm/openai" # OpenAI-compatible endpoint

[llm.protocols.anthropic]
enabled = true
path = "/llm/anthropic" # Anthropic-compatible endpoint

# Configure OpenAI provider
[llm.providers.openai]
type = "openai"
api_key = "{{ env.OPENAI_API_KEY }}"

# Must explicitly configure models
[llm.providers.openai.models.gpt-4]
[llm.providers.openai.models."gpt-3.5-turbo"]

# Configure Anthropic provider
[llm.providers.anthropic]
type = "anthropic"
api_key = "{{ env.ANTHROPIC_API_KEY }}"
[llm.providers.anthropic.models."claude-3-5-sonnet-20241022"]# List available models (OpenAI protocol)
curl http://localhost:8000/llm/openai/v1/models
# Chat completion (OpenAI protocol)
curl -X POST http://localhost:8000/llm/openai/v1/chat/completions \
-H "Content-Type: application/json" \
-d '{
"model": "openai/gpt-4",
"messages": [{"role": "user", "content": "Hello!"}]
}'
# Chat completion (Anthropic protocol)
curl -X POST http://localhost:8000/llm/anthropic/v1/messages \
-H "Content-Type: application/json" \
-H "x-api-key: not-used" \
-H "anthropic-version: 2023-06-01" \
-d '{
"model": "anthropic/claude-3-5-sonnet-20241022",
"messages": [{"role": "user", "content": "Hello!"}],
"max_tokens": 1024
}'Models are prefixed with their provider name:
- Format: `{provider_name}/{model_id}`
- Examples: `openai/gpt-4`, `anthropic/claude-3-5-sonnet-20241022`
Note: All models must be explicitly configured; requests for models that are not configured return a 404 error.
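From a client's perspective, that 404 surfaces as a not-found error. A minimal sketch with the OpenAI Python SDK (the base URL matches the configuration above; the model name here is deliberately one that is not declared in `nexus.toml`):

```python
from openai import OpenAI, NotFoundError

client = OpenAI(
    base_url="http://localhost:8000/llm/openai/v1",
    api_key="not-used",
)

try:
    client.chat.completions.create(
        model="openai/some-unconfigured-model",  # hypothetical: not declared in nexus.toml
        messages=[{"role": "user", "content": "Hello!"}],
    )
except NotFoundError:
    # The gateway returns 404 for models missing from the configuration.
    print("model not configured")
```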
Here's a comprehensive configuration showing multiple providers and features:
```toml
[llm]
enabled = true

# Enable both protocols
[llm.protocols.openai]
enabled = true
path = "/llm/openai"

[llm.protocols.anthropic]
enabled = true
path = "/llm/anthropic"

# OpenAI with multiple models
[llm.providers.openai]
type = "openai"
api_key = "{{ env.OPENAI_API_KEY }}"
forward_token = true # Allow user-provided keys

[llm.providers.openai.models.gpt-4]
[llm.providers.openai.models."gpt-3.5-turbo"]

[llm.providers.openai.models.smart]
rename = "gpt-4" # Alias: "openai/smart" → "gpt-4"

# Token rate limiting for OpenAI
[llm.providers.openai.rate_limits.per_user]
input_token_limit = 100000
interval = "60s"

# Anthropic configuration
[llm.providers.anthropic]
type = "anthropic"
api_key = "{{ env.ANTHROPIC_API_KEY }}"

[llm.providers.anthropic.models."claude-3-5-sonnet-20241022"]

[llm.providers.anthropic.models.fast]
rename = "claude-3-haiku-20240307-v1:0"
# AWS Bedrock
[llm.providers.bedrock]
type = "bedrock"
region = "us-east-1"

[llm.providers.bedrock.models.claude]
rename = "anthropic.claude-3-sonnet-20240229-v1:0"
```
The LLM router supports both OpenAI and Anthropic client libraries:

OpenAI SDK (Python):

```python
from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:8000/llm/openai/v1",
    api_key="not-used"
)
```

OpenAI SDK (TypeScript):

```typescript
import OpenAI from 'openai';
const openai = new OpenAI({
  baseURL: 'http://localhost:8000/llm/openai/v1',
  apiKey: 'not-used'
});
```

Anthropic SDK (Python):

```python
from anthropic import Anthropic
client = Anthropic(
    base_url="http://localhost:8000/llm/anthropic",
    api_key="not-used"
)
```

Anthropic SDK (TypeScript):

```typescript
import Anthropic from '@anthropic-ai/sdk';
const anthropic = new Anthropic({
  baseURL: 'http://localhost:8000/llm/anthropic',
  apiKey: 'not-used'
});
```

Claude Code (environment variables):

```bash
export ANTHROPIC_BASE_URL="http://localhost:8000/llm/anthropic"
export ANTHROPIC_MODEL="anthropic/claude-3-5-sonnet-20241022"
```

Key features:

- Multi-Protocol Support: Native OpenAI and Anthropic protocol endpoints
- Unified Access: Route to any provider through either protocol
- Client Compatibility: Works with OpenAI, Anthropic, and Claude Code clients
- Model Aliases: Create custom names for models
- Token Rate Limiting: Control usage per user and model
- Token Forwarding: Users can provide their own API keys
- Header Transformation: Forward, insert, remove, and rename HTTP headers
- Streaming Support: Real-time responses via SSE (see the streaming sketch after this list)
- Multiple Providers: Mix models from different vendors
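The streaming path can be exercised with the OpenAI Python SDK; a minimal sketch, assuming the gateway runs at localhost:8000 as configured earlier and relays SSE chunks in the standard OpenAI format:

```python
from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:8000/llm/openai/v1",
    api_key="not-used",
)

# stream=True makes the server send incremental chunks over SSE.
stream = client.chat.completions.create(
    model="openai/gpt-4",
    messages=[{"role": "user", "content": "Stream a short haiku."}],
    stream=True,
)

for chunk in stream:
    if chunk.choices and chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="", flush=True)
print()
```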
Best practices:

- Explicit Model Configuration: Only configure models you need
- Use Environment Variables: Never hardcode API keys
- Configure Rate Limits: Protect against excessive usage
- Create Meaningful Aliases: Simplify model names for users
- Monitor Usage: Track token consumption and costs
- Test Thoroughly: Verify models before production
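On the rate-limiting point, clients should be prepared for requests to be refused once a configured budget (such as the 100,000 input tokens per 60s above) is exhausted. A hedged sketch, assuming exhausted limits surface as HTTP 429, which the OpenAI SDK raises as RateLimitError; `ask_with_backoff` is a hypothetical helper:

```python
import time

from openai import OpenAI, RateLimitError

client = OpenAI(
    base_url="http://localhost:8000/llm/openai/v1",
    api_key="not-used",
)

def ask_with_backoff(prompt: str, retries: int = 3) -> str:
    # Retry with exponential backoff when the per-user token budget is exhausted.
    for attempt in range(retries):
        try:
            response = client.chat.completions.create(
                model="openai/gpt-4",
                messages=[{"role": "user", "content": prompt}],
            )
            return response.choices[0].message.content
        except RateLimitError:
            time.sleep(2 ** attempt)  # assumption: the budget resets within the interval
    raise RuntimeError("rate limit still exceeded after retries")

print(ask_with_backoff("Hello!"))
```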
For debugging, check:
- Model availability (OpenAI): `GET /llm/openai/v1/models`
- Model availability (Anthropic): `GET /llm/anthropic/v1/models`
- Nexus logs: `nexus --log debug`
- Provider authentication
- Rate limit configuration
- Network connectivity to providers
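The two model-availability checks can be scripted. A small sketch using the `requests` library (endpoint paths as configured above; it assumes both protocols return an OpenAI-style model list):

```python
import requests

# Endpoints as configured in nexus.toml; adjust host/port for your deployment.
for endpoint in (
    "http://localhost:8000/llm/openai/v1/models",
    "http://localhost:8000/llm/anthropic/v1/models",
):
    resp = requests.get(endpoint, timeout=10)
    resp.raise_for_status()
    # Assumption: the response body looks like {"data": [{"id": ...}, ...]}.
    ids = [m["id"] for m in resp.json().get("data", [])]
    print(endpoint, "->", ids)
```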
- Start with Provider Configuration
- Set up Model Management
- Configure Token Rate Limiting
- Learn how to use the API
- Integrate with Claude Code