Control which models are exposed through your Nexus instance and create custom aliases for better usability.

Models must be explicitly configured for each provider. The model configuration uses the following format:

```toml
[llm.providers.<provider_name>.models.<api_model_name>]
# Model will be available as "<provider_name>/<api_model_name>"
# By default, <api_model_name> is also used as the upstream model name
```

Example:

```toml
[llm.providers.openai.models.gpt-4]
# Available as "openai/gpt-4", maps to upstream "gpt-4"

[llm.providers.openai.models."gpt-3.5-turbo"]
# Model names with special characters must be quoted
```

Create custom model names that map to different upstream model identifiers:

```toml
[llm.providers.openai.models.smart]
rename = "gpt-4" # API: "openai/smart" → Upstream: "gpt-4"

[llm.providers.anthropic.models.fast]
rename = "claude-3-5-sonnet-20241022" # API: "anthropic/fast" → Upstream: "claude-3-5-sonnet-20241022"

[llm.providers.google.models."my-gemini"]
rename = "gemini-1.5-pro" # API: "google/my-gemini" → Upstream: "gemini-1.5-pro"
```
Aliases enable several patterns:

  1. User-Friendly Names: Simplify complex model identifiers

```toml
[llm.providers.bedrock.models.claude]
rename = "anthropic.claude-3-sonnet-20240229-v1:0" # Use "bedrock/claude" instead of the long model ID
```

  2. Abstraction Layer: Hide provider-specific naming

```toml
[llm.providers.openai.models.chat]
rename = "gpt-4"

[llm.providers.anthropic.models.chat]
rename = "claude-3-5-sonnet-20241022"
# Both available as "*/chat" for consistency
```

  3. Version Management: Switch models without changing client code

```toml
# Easy to update when new versions are released
[llm.providers.openai.models.latest]
rename = "gpt-4-turbo-2024-11-01" # Update this when newer versions are available
```

Models are prefixed with their provider instance name:

  • Format: {provider_name}/{model_id}
  • Examples:
    • openai/gpt-4
    • anthropic/claude-3-5-sonnet-20241022
    • google/gemini-1.5-pro
    • azure_openai/gpt-4 (custom provider name)
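
Clients pass the prefixed name as the model identifier. As a minimal sketch, assuming Nexus also exposes an OpenAI-compatible chat completions endpoint under the same `/llm/v1` prefix (the exact path may differ in your deployment):

```bash
# Hypothetical chat request; "openai/smart" is the alias configured above
curl http://localhost:8000/llm/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "openai/smart",
    "messages": [{"role": "user", "content": "Hello!"}]
  }'
```

Nexus resolves the `openai/` prefix to the provider instance and, because `smart` is an alias, forwards the request upstream as `gpt-4`.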
To see which models are available, query the models endpoint:

```
GET /llm/v1/models
```

Returns all explicitly configured models:

{ "object": "list", "data": [ { "id": "openai/gpt-4", "object": "model", "owned_by": "openai" }, { "id": "openai/smart", // Custom alias "object": "model", "owned_by": "openai" }, { "id": "anthropic/fast", // Custom alias "object": "model", "owned_by": "anthropic" } ] }
To print just the model IDs:

```bash
curl http://localhost:8000/llm/v1/models | jq '.data[].id'
```

Organize models by their capabilities:

```toml
# Chat models
[llm.providers.openai.models."gpt-4"]
[llm.providers.openai.models."gpt-3.5-turbo"]

# Code models
[llm.providers.anthropic.models.coder]
rename = "claude-3-5-sonnet-20241022" # Best for coding tasks
```

Create tiers based on cost and performance:

```toml
# Economy tier
[llm.providers.openai.models.economy]
rename = "gpt-3.5-turbo"

[llm.providers.anthropic.models.economy]
rename = "claude-3-haiku-20240307"

# Standard tier
[llm.providers.openai.models.standard]
rename = "gpt-4"

[llm.providers.anthropic.models.standard]
rename = "claude-3-sonnet-20240229"

# Premium tier
[llm.providers.openai.models.premium]
rename = "gpt-4-turbo-preview"

[llm.providers.anthropic.models.premium]
rename = "claude-3-opus-20240229"
```

Configure models for specific use cases:

```toml
# Customer support
[llm.providers.openai.models.support]
rename = "gpt-3.5-turbo" # Fast and cost-effective

# Content generation
[llm.providers.anthropic.models.writer]
rename = "claude-3-5-sonnet-20241022" # Excellent writing capabilities

# Code review
[llm.providers.openai.models.reviewer]
rename = "gpt-4" # Strong reasoning for code analysis

# Translation
[llm.providers.google.models.translator]
rename = "gemini-1.5-pro" # Good multilingual support
```

Control which models are available based on configuration:

Only expose specific models:

```toml
[llm.providers.openai]
type = "openai"
api_key = "{{ env.OPENAI_API_KEY }}"

# Only expose GPT-4, not GPT-3.5
[llm.providers.openai.models.gpt-4]

# GPT-3.5-turbo is NOT configured, so it's not accessible
```

Different models for different environments:

```toml
# Development environment
[llm.providers.openai.models.dev]
rename = "gpt-3.5-turbo" # Cheaper for development

# Production environment (use an environment variable)
[llm.providers.openai.models.prod]
rename = "{{ env.PRODUCTION_MODEL }}" # Set to "gpt-4" in production
```

When a client requests an unconfigured model:

{ "error": { "message": "Model 'openai/gpt-5' not found", "type": "invalid_request_error", "code": "model_not_found" } }

Common issues and solutions:

  1. Model names with dots: Must be quoted

```toml
# Wrong
[llm.providers.google.models.gemini-1.5-pro]

# Correct
[llm.providers.google.models."gemini-1.5-pro"]
```

  2. Duplicate aliases: Each model name must be unique within a provider

```toml
# Wrong - duplicate "smart" name
[llm.providers.openai.models.smart]
rename = "gpt-4"

[llm.providers.openai.models.smart] # Error!
rename = "gpt-3.5-turbo"
```
Recommended practices:

  1. Start Small: Only configure models you actually use
  2. Consistent Naming: Use a clear naming convention across providers
  3. Document Aliases: Keep documentation of what each alias maps to
  4. Version in Aliases: Include version info when relevant
  5. Test Models: Verify models work before production deployment
  6. Monitor Usage: Track which models are used most frequently

When migrating from automatic model discovery:

  1. List Current Usage: Check logs to see which models are actually used
  2. Add Configurations: Start with the most-used models
  3. Test Thoroughly: Verify all client applications still work (a quick before-and-after check is sketched after this list)
  4. Gradual Rollout: Consider using multiple Nexus instances during migration
  5. Update Documentation: Ensure all model references are updated
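
One way to sanity-check the migration is to snapshot the model list before and after switching to explicit configuration, then diff the two. A minimal sketch using the models endpoint documented above (instance URL assumed):

```bash
# Before the migration: record which models the instance exposes
curl -s http://localhost:8000/llm/v1/models | jq -r '.data[].id' | sort > models-before.txt

# After adding explicit model configurations: record again and compare
curl -s http://localhost:8000/llm/v1/models | jq -r '.data[].id' | sort > models-after.txt
diff models-before.txt models-after.txt
```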

AWS Bedrock provides access to foundation models from multiple vendors. Here are the most commonly used models:

```toml
[llm.providers.bedrock]
type = "bedrock"
region = "us-east-1"

# Claude Opus 4.1 - Most capable (latest)
[llm.providers.bedrock.models."anthropic.claude-opus-4-1-20250805-v1:0"]

# Claude Sonnet 3.7 - Balanced performance (latest)
[llm.providers.bedrock.models."anthropic.claude-3-7-sonnet-20250219-v1:0"]

# Claude Haiku 3.5 - Fast and efficient
[llm.providers.bedrock.models."anthropic.claude-3-5-haiku-20241022-v1:0"]

# Create aliases for easier use
[llm.providers.bedrock.models.claude-opus]
rename = "anthropic.claude-opus-4-1-20250805-v1:0"

[llm.providers.bedrock.models.claude-sonnet]
rename = "anthropic.claude-3-7-sonnet-20250219-v1:0"

[llm.providers.bedrock.models.claude-haiku]
rename = "anthropic.claude-3-5-haiku-20241022-v1:0"
```

Tool Support: All Claude models (Opus, Sonnet, and Haiku) via Bedrock have excellent support for function calling and tools.
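
As an illustration, a tool-calling request routed through the `claude-sonnet` alias configured above might look like this (the OpenAI-compatible chat completions path and the `get_weather` function are assumptions for the sketch):

```bash
curl http://localhost:8000/llm/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "bedrock/claude-sonnet",
    "messages": [{"role": "user", "content": "What is the weather in Paris?"}],
    "tools": [{
      "type": "function",
      "function": {
        "name": "get_weather",
        "description": "Return the current weather for a city",
        "parameters": {
          "type": "object",
          "properties": {"city": {"type": "string"}},
          "required": ["city"]
        }
      }
    }]
  }'
```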

Amazon Nova models:

```toml
# Nova Pro - Advanced reasoning
[llm.providers.bedrock.models."amazon.nova-pro-v1:0"]

# Nova Lite - Efficient performance
[llm.providers.bedrock.models."amazon.nova-lite-v1:0"]

# Nova Micro - Ultra-fast responses
[llm.providers.bedrock.models."amazon.nova-micro-v1:0"]
```

Meta Llama models:

```toml
# Llama 3.1 405B - Largest and most capable
[llm.providers.bedrock.models."meta.llama3-1-405b-instruct-v1:0"]

# Llama 3.1 70B - High performance
[llm.providers.bedrock.models."meta.llama3-1-70b-instruct-v1:0"]

# Llama 3.1 8B - Efficient
[llm.providers.bedrock.models."meta.llama3-1-8b-instruct-v1:0"]
```

Other vendors:

```toml
# Mistral Large
[llm.providers.bedrock.models."mistral.mistral-large-2402-v1:0"]

# Cohere Command R+
[llm.providers.bedrock.models."cohere.command-r-plus-v1:0"]

# DeepSeek R1 - Reasoning optimized
[llm.providers.bedrock.models."deepseek.deepseek-r1"]
```
A combined setup with semantic aliases:

```toml
[llm.providers.bedrock]
type = "bedrock"
region = "us-east-1"

# Semantic aliases for different use cases
[llm.providers.bedrock.models.chat]
rename = "anthropic.claude-3-sonnet-20240229-v1:0"

[llm.providers.bedrock.models.fast]
rename = "anthropic.claude-3-haiku-20240307-v1:0"

[llm.providers.bedrock.models.powerful]
rename = "anthropic.claude-3-opus-20240229-v1:0"

[llm.providers.bedrock.models.coding]
rename = "meta.llama3-1-70b-instruct-v1:0"
```

For a complete list of available Bedrock models and their full IDs, refer to the AWS Bedrock documentation.
