Nexus supports token forwarding, allowing users to provide their own API keys at request time instead of using the configured keys. This feature enables flexible billing models and user-managed API access.
Token forwarding allows:
- Users to bring their own API keys
- Separate billing per user
- Development with personal keys
- Fallback to configured keys when needed
Enable token forwarding for any provider by setting `forward_token = true`:
```toml
[llm.providers.openai]
type = "openai"
api_key = "{{ env.OPENAI_API_KEY }}"  # Fallback key (optional with forwarding)
forward_token = true                  # Enable token forwarding

[llm.providers.anthropic]
type = "anthropic"
# No api_key required when token forwarding is enabled
forward_token = true

[llm.providers.google]
type = "google"
api_key = "{{ env.GOOGLE_API_KEY }}"
forward_token = false  # Explicitly disabled (default)
```
When token forwarding is enabled, users pass their API key in the `X-Provider-API-Key` header:
```shell
# OpenAI with a user-provided key
curl -X POST http://localhost:8000/llm/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "X-Provider-API-Key: sk-your-openai-key" \
  -d '{
    "model": "openai/gpt-4",
    "messages": [{"role": "user", "content": "Hello"}]
  }'

# Anthropic with a user-provided key
curl -X POST http://localhost:8000/llm/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "X-Provider-API-Key: sk-ant-your-anthropic-key" \
  -d '{
    "model": "anthropic/claude-3-opus-20240229",
    "messages": [{"role": "user", "content": "Hello"}]
  }'
```
When `forward_token = true`:
- User-provided keys (via the `X-Provider-API-Key` header) take priority
- Falls back to the configured key if no header is provided
- Returns a 401 error if neither key is available

When `forward_token = false`:
- Always uses the configured API key
- Ignores the `X-Provider-API-Key` header
- Returns a 401 error if no configured key exists
```python
from openai import OpenAI

# Using your own API key
client = OpenAI(
    base_url="http://localhost:8000/llm/v1",
    api_key="not-used",  # Required by the SDK but ignored
    default_headers={
        "X-Provider-API-Key": "sk-your-openai-key"
    },
)

response = client.chat.completions.create(
    model="openai/gpt-4",
    messages=[{"role": "user", "content": "Hello!"}],
)
```
```javascript
import OpenAI from 'openai';

const openai = new OpenAI({
  baseURL: 'http://localhost:8000/llm/v1',
  apiKey: 'not-used', // Required by SDK but ignored
  defaultHeaders: {
    'X-Provider-API-Key': 'sk-your-openai-key'
  }
});

const completion = await openai.chat.completions.create({
  model: 'openai/gpt-4',
  messages: [{ role: 'user', content: 'Hello!' }]
});
```
```javascript
async function callNexusLLM(apiKey, model, messages) {
  const response = await fetch('http://localhost:8000/llm/v1/chat/completions', {
    method: 'POST',
    headers: {
      'Content-Type': 'application/json',
      'X-Provider-API-Key': apiKey // User's own API key
    },
    body: JSON.stringify({ model, messages })
  });
  return response.json();
}

// User provides their own key
const result = await callNexusLLM(
  'sk-user-api-key',
  'openai/gpt-4',
  [{ role: 'user', content: 'Hello' }]
);
```
Allow users to optionally provide their own keys:
```toml
[llm.providers.openai]
type = "openai"
api_key = "{{ env.COMPANY_OPENAI_KEY }}"  # Company pays by default
forward_token = true                      # Users can override with their key

[llm.providers.anthropic]
type = "anthropic"
api_key = "{{ env.COMPANY_ANTHROPIC_KEY }}"
forward_token = true
```
Developers use personal keys:
```toml
[llm.providers.openai]
type = "openai"
# No default key - developers must provide their own
forward_token = true

[llm.providers.anthropic]
type = "anthropic"
forward_token = true
```
Some providers allow forwarding, others don't:
```toml
# Users can use their own OpenAI keys
[llm.providers.openai]
type = "openai"
api_key = "{{ env.OPENAI_API_KEY }}"
forward_token = true

# Company Anthropic key only
[llm.providers.anthropic]
type = "anthropic"
api_key = "{{ env.COMPANY_ANTHROPIC_KEY }}"
forward_token = false  # No user keys allowed

# Always use company Google key
[llm.providers.google]
type = "google"
api_key = "{{ env.GOOGLE_API_KEY }}"
# forward_token defaults to false
```
Important: Token forwarding is not supported for AWS Bedrock providers.
Unlike other providers that use simple API keys, AWS Bedrock requires:
- AWS credentials (access key ID, secret access key, session tokens)
- AWS Signature Version 4 (SigV4) signing
- Request-specific signatures based on content and timestamp
- Complex authentication flow
Due to this complexity, Bedrock providers must use pre-configured AWS credentials:
```toml
[llm.providers.bedrock]
type = "bedrock"
region = "us-east-1"
profile = "production"
# forward_token is not supported - setting it is a validation error
```
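To see why a forwarded static header cannot work for Bedrock: the SigV4 signature is derived from a key chain and the request itself, so it changes with every payload and timestamp. A stdlib-only sketch of the key-derivation step, using dummy values:

```python
import hashlib
import hmac

def sigv4_signing_key(secret_key: str, date: str, region: str, service: str) -> bytes:
    """Derive the SigV4 signing key: an HMAC chain over date, region, and service."""
    k_date = hmac.new(("AWS4" + secret_key).encode(), date.encode(), hashlib.sha256).digest()
    k_region = hmac.new(k_date, region.encode(), hashlib.sha256).digest()
    k_service = hmac.new(k_region, service.encode(), hashlib.sha256).digest()
    return hmac.new(k_service, b"aws4_request", hashlib.sha256).digest()

# The final signature covers the request content, so it differs per request:
key = sigv4_signing_key("dummy-secret", "20240101", "us-east-1", "bedrock")
sig_a = hmac.new(key, b"request-payload-a", hashlib.sha256).hexdigest()
sig_b = hmac.new(key, b"request-payload-b", hashlib.sha256).hexdigest()
```

Because every signature depends on the exact bytes being sent, there is no single credential a client could place in a header for Nexus to pass through.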
Security:
- Nexus passes keys directly to providers
- Invalid keys result in provider-specific error responses
- Keys are not logged or stored by Nexus

Rate limiting:
- Server-level rate limits still apply
- Token-based rate limits require client identification
Combine OAuth2 for user authentication with token forwarding:
```toml
# Require OAuth2 for access
[server.oauth]
url = "https://auth.example.com/.well-known/jwks.json"
poll_interval = "5m"
expected_issuer = "https://auth.example.com"
expected_audience = "nexus-api"

[server.oauth.protected_resource]
resource = "https://nexus.example.com"
authorization_servers = ["https://auth.example.com"]

# Allow authenticated users to use their own API keys
[llm.providers.openai]
type = "openai"
forward_token = true
```
Users must then provide both the OAuth2 token and their provider API key:
```shell
curl -X POST http://localhost:8000/llm/v1/chat/completions \
  -H "Authorization: Bearer <oauth-token>" \
  -H "X-Provider-API-Key: sk-user-api-key" \
  -H "Content-Type: application/json" \
  -d '{"model": "openai/gpt-4", "messages": [...]}'
```
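In code, the two credentials can be assembled in one place. A small sketch; the helper name is hypothetical and the token values are placeholders:

```python
def combined_auth_headers(oauth_token: str, provider_key: str) -> dict[str, str]:
    """Headers for a Nexus deployment requiring OAuth2 plus a forwarded key."""
    return {
        "Authorization": f"Bearer {oauth_token}",      # satisfies the OAuth2 check
        "X-Provider-API-Key": provider_key,            # forwarded to the provider
        "Content-Type": "application/json",
    }
```

The resulting dict can be passed as `default_headers` to an SDK client or as `headers` to a plain HTTP call, matching the curl example above.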
- Review API Usage for integration examples
- Configure Rate Limiting for token-forwarded requests
- Set up monitoring for API key usage