Control which models are exposed through your Nexus instance and create custom aliases for better usability.

Models must be explicitly configured for each provider. The model configuration uses the following format:

```toml
[llm.providers.<provider_name>.models.<api_model_name>]
# Model will be available as "<provider_name>/<api_model_name>"
# By default, <api_model_name> is also used as the upstream model name
```

Example:

```toml
[llm.providers.openai.models.gpt-4]
# Available as "openai/gpt-4", maps to upstream "gpt-4"

[llm.providers.openai.models."gpt-3.5-turbo"]
# Model names with special characters must be quoted
```

Create custom model names that map to different upstream model identifiers:

```toml
[llm.providers.openai.models.smart]
rename = "gpt-4" # API: "openai/smart" → Upstream: "gpt-4"

[llm.providers.anthropic.models.fast]
rename = "claude-3-5-sonnet-20241022" # API: "anthropic/fast" → Upstream: "claude-3-5-sonnet-20241022"

[llm.providers.google.models."my-gemini"]
rename = "gemini-1.5-pro" # API: "google/my-gemini" → Upstream: "gemini-1.5-pro"
```
Aliases enable several patterns:

  1. User-Friendly Names: Simplify complex model identifiers

```toml
[llm.providers.bedrock.models.claude]
rename = "anthropic.claude-3-sonnet-20240229-v1:0" # Use "bedrock/claude" instead of the long model ID
```

  2. Abstraction Layer: Hide provider-specific naming

```toml
[llm.providers.openai.models.chat]
rename = "gpt-4"

[llm.providers.anthropic.models.chat]
rename = "claude-3-5-sonnet-20241022"
# Both available as "*/chat" for consistency
```

  3. Version Management: Switch models without changing client code

```toml
# Easy to update when new versions are released
[llm.providers.openai.models.latest]
rename = "gpt-4-turbo-2024-11-01" # Update this when newer versions are available
```

Models are prefixed with their provider instance name:

  • Format: {provider_name}/{model_id}
  • Examples:
    • openai/gpt-4
    • anthropic/claude-3-5-sonnet-20241022
    • google/gemini-1.5-pro
    • azure_openai/gpt-4 (custom provider name)
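
Clients pass the prefixed name as the model identifier. As a minimal sketch, assuming Nexus also exposes an OpenAI-compatible chat completions endpoint under the same `/llm/v1` prefix (the exact path may differ in your deployment):

```bash
# Hypothetical chat request; "openai/smart" is the alias configured above
curl http://localhost:8000/llm/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "openai/smart",
    "messages": [{"role": "user", "content": "Hello!"}]
  }'
```

Nexus resolves the `openai/` prefix to the provider instance and, because `smart` is an alias, forwards the request upstream as `gpt-4`.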
To see which models are available, query the models endpoint:

```
GET /llm/v1/models
```

Returns all explicitly configured models:

{ "object": "list", "data": [ { "id": "openai/gpt-4", "object": "model", "owned_by": "openai" }, { "id": "openai/smart", // Custom alias "object": "model", "owned_by": "openai" }, { "id": "anthropic/fast", // Custom alias "object": "model", "owned_by": "anthropic" } ] }
To print just the model IDs:

```bash
curl http://localhost:8000/llm/v1/models | jq '.data[].id'
```

Organize models by their capabilities:

```toml
# Chat models
[llm.providers.openai.models."gpt-4"]
[llm.providers.openai.models."gpt-3.5-turbo"]

# Code models
[llm.providers.anthropic.models.coder]
rename = "claude-3-5-sonnet-20241022" # Best for coding tasks
```

Create tiers based on cost and performance:

```toml
# Economy tier
[llm.providers.openai.models.economy]
rename = "gpt-3.5-turbo"

[llm.providers.anthropic.models.economy]
rename = "claude-3-haiku-20240307"

# Standard tier
[llm.providers.openai.models.standard]
rename = "gpt-4"

[llm.providers.anthropic.models.standard]
rename = "claude-3-sonnet-20240229"

# Premium tier
[llm.providers.openai.models.premium]
rename = "gpt-4-turbo-preview"

[llm.providers.anthropic.models.premium]
rename = "claude-3-opus-20240229"
```

Configure models for specific use cases:

```toml
# Customer support
[llm.providers.openai.models.support]
rename = "gpt-3.5-turbo" # Fast and cost-effective

# Content generation
[llm.providers.anthropic.models.writer]
rename = "claude-3-5-sonnet-20241022" # Excellent writing capabilities

# Code review
[llm.providers.openai.models.reviewer]
rename = "gpt-4" # Strong reasoning for code analysis

# Translation
[llm.providers.google.models.translator]
rename = "gemini-1.5-pro" # Good multilingual support
```

Control which models are available based on configuration:

Only expose specific models:

```toml
[llm.providers.openai]
type = "openai"
api_key = "{{ env.OPENAI_API_KEY }}"

# Only expose GPT-4, not GPT-3.5
[llm.providers.openai.models.gpt-4]

# GPT-3.5-turbo is NOT configured, so it's not accessible
```

Different models for different environments:

```toml
# Development environment
[llm.providers.openai.models.dev]
rename = "gpt-3.5-turbo" # Cheaper for development

# Production environment (use an environment variable)
[llm.providers.openai.models.prod]
rename = "{{ env.PRODUCTION_MODEL }}" # Set to "gpt-4" in production
```

When a client requests an unconfigured model:

{ "error": { "message": "Model 'openai/gpt-5' not found", "type": "invalid_request_error", "code": "model_not_found" } }

Common issues and solutions:

  1. Model names with dots: Must be quoted

```toml
# Wrong
[llm.providers.google.models.gemini-1.5-pro]

# Correct
[llm.providers.google.models."gemini-1.5-pro"]
```

  2. Duplicate aliases: Each model name must be unique within a provider

```toml
# Wrong - duplicate "smart" name
[llm.providers.openai.models.smart]
rename = "gpt-4"

[llm.providers.openai.models.smart] # Error!
rename = "gpt-3.5-turbo"
```
Recommended practices:

  1. Start Small: Only configure models you actually use
  2. Consistent Naming: Use a clear naming convention across providers
  3. Document Aliases: Keep documentation of what each alias maps to
  4. Version in Aliases: Include version info when relevant
  5. Test Models: Verify models work before production deployment
  6. Monitor Usage: Track which models are used most frequently

When migrating from automatic model discovery:

  1. List Current Usage: Check logs to see which models are actually used
  2. Add Configurations: Start with the most-used models
  3. Test Thoroughly: Verify all client applications still work (a quick before-and-after check is sketched after this list)
  4. Gradual Rollout: Consider using multiple Nexus instances during migration
  5. Update Documentation: Ensure all model references are updated
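
One way to sanity-check the migration is to snapshot the model list before and after switching to explicit configuration, then diff the two. A minimal sketch using the models endpoint documented above (instance URL assumed):

```bash
# Before the migration: record which models the instance exposes
curl -s http://localhost:8000/llm/v1/models | jq -r '.data[].id' | sort > models-before.txt

# After adding explicit model configurations: record again and compare
curl -s http://localhost:8000/llm/v1/models | jq -r '.data[].id' | sort > models-after.txt
diff models-before.txt models-after.txt
```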

AWS Bedrock provides access to foundation models from multiple vendors. Here are the most commonly used models:

```toml
[llm.providers.bedrock]
type = "bedrock"
region = "us-east-1"

# Claude Opus 4.1 - Most capable (latest)
[llm.providers.bedrock.models."anthropic.claude-opus-4-1-20250805-v1:0"]

# Claude Sonnet 3.7 - Balanced performance (latest)
[llm.providers.bedrock.models."anthropic.claude-3-7-sonnet-20250219-v1:0"]

# Claude Haiku 3.5 - Fast and efficient
[llm.providers.bedrock.models."anthropic.claude-3-5-haiku-20241022-v1:0"]

# Create aliases for easier use
[llm.providers.bedrock.models.claude-opus]
rename = "anthropic.claude-opus-4-1-20250805-v1:0"

[llm.providers.bedrock.models.claude-sonnet]
rename = "anthropic.claude-3-7-sonnet-20250219-v1:0"

[llm.providers.bedrock.models.claude-haiku]
rename = "anthropic.claude-3-5-haiku-20241022-v1:0"
```

Tool Support: All Claude models (Opus, Sonnet, and Haiku) via Bedrock have excellent support for function calling and tools.
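
As an illustration, a tool-calling request routed through the `claude-sonnet` alias configured above might look like this (the OpenAI-compatible chat completions path and the `get_weather` function are assumptions for the sketch):

```bash
curl http://localhost:8000/llm/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "bedrock/claude-sonnet",
    "messages": [{"role": "user", "content": "What is the weather in Paris?"}],
    "tools": [{
      "type": "function",
      "function": {
        "name": "get_weather",
        "description": "Return the current weather for a city",
        "parameters": {
          "type": "object",
          "properties": {"city": {"type": "string"}},
          "required": ["city"]
        }
      }
    }]
  }'
```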

Amazon Nova models:

```toml
# Nova Pro - Advanced reasoning
[llm.providers.bedrock.models."amazon.nova-pro-v1:0"]

# Nova Lite - Efficient performance
[llm.providers.bedrock.models."amazon.nova-lite-v1:0"]

# Nova Micro - Ultra-fast responses
[llm.providers.bedrock.models."amazon.nova-micro-v1:0"]
```

Meta Llama models:

```toml
# Llama 3.1 405B - Largest and most capable
[llm.providers.bedrock.models."meta.llama3-1-405b-instruct-v1:0"]

# Llama 3.1 70B - High performance
[llm.providers.bedrock.models."meta.llama3-1-70b-instruct-v1:0"]

# Llama 3.1 8B - Efficient
[llm.providers.bedrock.models."meta.llama3-1-8b-instruct-v1:0"]
```

Other vendors:

```toml
# Mistral Large
[llm.providers.bedrock.models."mistral.mistral-large-2402-v1:0"]

# Cohere Command R+
[llm.providers.bedrock.models."cohere.command-r-plus-v1:0"]

# DeepSeek R1 - Reasoning optimized
[llm.providers.bedrock.models."deepseek.deepseek-r1"]
```
A combined setup with semantic aliases:

```toml
[llm.providers.bedrock]
type = "bedrock"
region = "us-east-1"

# Semantic aliases for different use cases
[llm.providers.bedrock.models.chat]
rename = "anthropic.claude-3-sonnet-20240229-v1:0"

[llm.providers.bedrock.models.fast]
rename = "anthropic.claude-3-haiku-20240307-v1:0"

[llm.providers.bedrock.models.powerful]
rename = "anthropic.claude-3-opus-20240229-v1:0"

[llm.providers.bedrock.models.coding]
rename = "meta.llama3-1-70b-instruct-v1:0"
```

For a complete list of available Bedrock models and their full IDs, refer to the AWS Bedrock documentation.
