Nexus provides rate limiting capabilities to protect your server from abuse and ensure fair resource usage. You can configure global and per-IP rate limits at the server level.
Enable rate limiting in your nexus.toml:
```toml
[server.rate_limits]
enabled = true
storage = "memory" # or use Redis for distributed rate limiting

[server.rate_limits.global]
limit = 1000
interval = "60s"

[server.rate_limits.per_ip]
limit = 100
interval = "60s"
```

- `enabled`: Enable or disable rate limiting (default: `false`)
- `storage`: Storage backend, either `"memory"` (default) or a Redis configuration
### Global Limits (`server.rate_limits.global`)

- `limit`: Maximum requests across all clients
- `interval`: Time window (e.g., `"60s"`, `"5m"`, `"1h"`)
### Per-IP Limits (`server.rate_limits.per_ip`)

- `limit`: Maximum requests per IP address
- `interval`: Time window
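The `limit`/`interval` semantics above can be sketched as a fixed-window counter. This is an illustrative model only; Nexus does not document its exact algorithm, and the class and method names here are hypothetical:

```python
import time
from collections import defaultdict

class FixedWindowLimiter:
    """Minimal fixed-window sketch of limit/interval semantics.

    Illustrative only: Nexus's internal algorithm may differ
    (e.g., it could use a sliding window).
    """

    def __init__(self, limit, interval_secs):
        self.limit = limit
        self.interval = interval_secs
        self.counters = defaultdict(int)  # window index -> request count

    def allow(self, now=None):
        now = time.monotonic() if now is None else now
        window = int(now // self.interval)  # which fixed window "now" falls in
        if self.counters[window] >= self.limit:
            return False  # budget for this window is exhausted
        self.counters[window] += 1
        return True

# A per-IP limit is conceptually one such counter per client address:
per_ip = defaultdict(lambda: FixedWindowLimiter(limit=100, interval_secs=60))
```

A global limit is the same structure with a single shared counter instead of one per IP.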
The default `"memory"` backend uses an in-memory rate limiter, suitable for single-instance deployments:

```toml
[server.rate_limits]
storage = "memory"
```

For distributed rate limiting across multiple Nexus instances, use Redis:
```toml
[server.rate_limits]
storage = { type = "redis", url = "redis://localhost:6379" }
```

- `url`: Redis connection URL (default: `"redis://localhost:6379/0"`)
- `key_prefix`: Prefix for rate limit keys (default: `"nexus:rate_limits:"`)
- `pool.max_size`: Maximum connection pool size (default: 16)
- `pool.min_idle`: Minimum idle connections (default: 0)
- `pool.timeout_create`: Timeout for creating connections (optional)
- `pool.timeout_wait`: Timeout for waiting for a connection (optional)
- `pool.timeout_recycle`: Timeout for recycling connections (optional)
- `tls.enabled`: Enable TLS connection (default: `false`)
- `tls.insecure`: Skip TLS certificate verification (optional)
- `tls.ca_cert_path`: Path to CA certificate (optional)
- `tls.client_cert_path`: Path to client certificate (optional)
- `tls.client_key_path`: Path to client private key (optional)
- `response_timeout`: Timeout for Redis responses (optional)
- `connection_timeout`: Timeout for Redis connections (optional)
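To see why Redis makes the limits work across instances: distributed fixed-window limiters are commonly built on Redis's `INCR` + `EXPIRE` pattern, where every instance increments the same key. The sketch below models that pattern with an in-memory stand-in for Redis so it is self-contained; the key layout is hypothetical (only the `nexus:rate_limits:` prefix comes from the documented default), and Nexus's actual scheme may differ:

```python
class FakeRedis:
    """In-memory stand-in for the two Redis commands this sketch uses."""

    def __init__(self):
        self.store = {}  # key -> [count, expires_at]

    def incr(self, key, now):
        entry = self.store.get(key)
        if entry is None or now >= entry[1]:
            entry = [0, float("inf")]  # fresh key, no TTL until EXPIRE is set
            self.store[key] = entry
        entry[0] += 1
        return entry[0]

    def expire(self, key, ttl, now):
        self.store[key][1] = now + ttl

def allow(redis, client_ip, limit, interval, now, prefix="nexus:rate_limits:"):
    """Classic INCR + EXPIRE fixed-window check, shared by all instances."""
    key = f"{prefix}per_ip:{client_ip}"   # key layout is a hypothetical example
    count = redis.incr(key, now)
    if count == 1:
        redis.expire(key, interval, now)  # first hit starts the window's TTL
    return count <= limit
```

Because every Nexus instance talks to the same Redis keys, the limit applies to the total traffic, not per instance.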
```toml
[server.rate_limits.storage]
type = "redis"
url = "redis://username:password@redis.example.com:6379/0"
key_prefix = "nexus:rate_limits:prod:"
response_timeout = "10s"
connection_timeout = "10s"

# With connection pool configuration
[server.rate_limits.storage.pool]
max_size = 20
min_idle = 5
timeout_create = "5s"
timeout_wait = "5s"

# With TLS configuration
[server.rate_limits.storage.tls]
enabled = true
ca_cert_path = "/etc/ssl/certs/redis-ca.pem"
```

When a client exceeds a rate limit, Nexus responds with:
- HTTP status code `429 Too Many Requests`
- A `Retry-After` header indicating when the client can retry (in seconds)
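A well-behaved client should back off for the duration given by `Retry-After` before retrying. A minimal sketch of that decision, assuming the header carries a plain number of seconds as described above (the helper name and the fallback delay are this example's own choices):

```python
def retry_delay(status, headers, default=1.0):
    """Return seconds to wait before retrying, or None if no retry is needed.

    Assumes Retry-After carries a number of seconds, as Nexus sends on 429.
    """
    if status != 429:
        return None  # not rate limited; nothing to wait for
    value = headers.get("Retry-After")
    try:
        return max(0.0, float(value))
    except (TypeError, ValueError):
        return default  # header missing or unparsable; use a fallback delay

# Typical client loop (pseudocode): if retry_delay(...) is not None,
# time.sleep(that many seconds) and reissue the request.
```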
For more granular control, you can also configure:
Limit requests to specific MCP servers:
```toml
[mcp.servers.expensive_api]
url = "https://api.example.com/mcp"

[mcp.servers.expensive_api.rate_limits]
limit = 10
interval = "60s"
```

Limit usage of specific tools within MCP servers:
```toml
[mcp.servers.my_api.rate_limits.tools]
expensive_operation = { limit = 5, interval = "300s" }
bulk_process = { limit = 2, interval = "600s" }
```

See MCP rate limiting for details.
Limit token consumption for AI models:
```toml
[llm.providers.openai.rate_limits.per_user]
input_token_limit = 100000
interval = "60s"
```

See LLM rate limiting for details.
Rate limits are evaluated in the following order:

1. Server-level limits (checked first via middleware):
   - Global limits: total requests across all clients
   - Per-IP limits: requests per IP address
2. Module-specific limits (checked after server limits pass):
   - For MCP requests:
     1. Tool-specific limits (most specific)
     2. MCP server limits (least specific)
   - For LLM requests:
     1. Model-specific token limits with user group (most specific)
     2. Model-specific token limits
     3. Provider-level token limits with user group
     4. Provider-level token limits (least specific)
All applicable limits are enforced: a request must pass every rate limit check to succeed. Only after the server-level middleware limits pass does the request proceed to the module-specific checks.
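The all-must-pass evaluation above can be sketched as a short-circuiting chain of checks. The wiring below is a hypothetical illustration of the documented order, not Nexus's actual internals:

```python
def evaluate_limits(checks):
    """Run rate-limit checks in order; a request succeeds only if every
    applicable check passes. Returns (allowed, name_of_failed_check)."""
    for name, allow in checks:
        if not allow():
            return False, name  # short-circuit on the first exhausted limit
    return True, None

# Hypothetical wiring mirroring the evaluation order for an MCP request:
checks = [
    ("server.global", lambda: True),
    ("server.per_ip", lambda: True),
    ("mcp.tool",      lambda: False),  # most specific module limit, exhausted
    ("mcp.server",    lambda: True),
]
allowed, blocked_by = evaluate_limits(checks)
```

Here the request is rejected by the tool-specific limit even though every other limit still has capacity.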
- Start Conservative: Begin with lower limits and increase based on usage patterns
- Monitor Usage: Track rate limit hits to identify patterns
- Use Redis in Production: For multi-instance deployments
- Different Intervals: Use appropriate time windows for different resources
- Test Limits: Verify configuration before production deployment
- Log Rate Limit Events: Monitor for potential abuse or misconfiguration
Monitor these metrics to optimize rate limiting:
- Rate limit hit frequency by IP
- Top IPs hitting rate limits
- Rate limit effectiveness (blocked vs allowed requests)
- Redis connection pool utilization (if using Redis)
- Enable Client Identification for per-user rate limiting
- Configure CORS for browser-based clients
- Set up CSRF Protection for additional security