Nexus provides comprehensive OpenTelemetry metrics for monitoring all aspects of the system including server operations, LLM interactions, and MCP tool executions. All metrics follow OpenTelemetry semantic conventions and can be exported to any OpenTelemetry-compatible backend.
Notes:
- All histograms are delta temporality histograms that also function as counters (the count field tracks the number of observations)
- Delta histograms report the change since the last export, not cumulative values
- Many metrics include client.id and client.group attributes when Client Identification is enabled (a configuration sketch follows this list). These attributes allow you to track usage per client and implement tiered access controls.
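Client Identification itself is turned on in nexus.toml. The snippet below is only a sketch of what that can look like; the key names (server.client_identification, client_id.http_header, group_id.http_header) are assumptions for illustration, so consult the Client Identification guide for the authoritative schema.
# Sketch only: key names are assumptions, not the authoritative schema
[server.client_identification]
enabled = true
client_id.http_header = "x-client-id"    # surfaces as the client.id attribute
group_id.http_header = "x-client-group"  # surfaces as the client.group attribute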
To enable metrics, configure telemetry in your nexus.toml:
[telemetry.exporters.otlp]
enabled = true
endpoint = "http://localhost:4317"
See the complete telemetry configuration guide for all options including:
- Service identification and resource attributes
- Protocol selection (gRPC vs HTTP)
- Batch export optimization
- Integration examples for popular backends
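As a rough illustration of how these options fit together (only enabled and endpoint are shown above; the remaining keys such as service_name, resource_attributes, protocol, and batch_export are assumptions, so defer to the configuration guide for the exact schema):
# Sketch only: keys beyond enabled/endpoint are assumptions
[telemetry]
service_name = "nexus"                       # service identification

[telemetry.resource_attributes]
"deployment.environment" = "production"      # extra resource attributes

[telemetry.exporters.otlp]
enabled = true
endpoint = "http://localhost:4317"
protocol = "grpc"                            # or "http"

[telemetry.exporters.otlp.batch_export]
scheduled_delay = "5s"                       # batch export tuning
max_export_batch_size = 512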
Metric: http.server.request.duration
Type: Delta Histogram with Counter (milliseconds)
Description: Tracks the duration of HTTP server requests
Attributes:
- http.request.method: HTTP method (GET, POST, etc.)
- http.response.status_code: HTTP response status code
- http.route: The matched route pattern
Use Case: Monitor API latency, identify slow endpoints, track error rates
These metrics are available when using Redis as the rate limiting backend. See Rate Limiting Configuration for setup details.
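For reference, a Redis-backed rate limiting setup in nexus.toml looks roughly like the sketch below; the key names (server.rate_limits.storage and the pool settings) are assumptions, so use the Rate Limiting Configuration guide for the exact schema.
# Sketch only: key names are assumptions
[server.rate_limits]
enabled = true

[server.rate_limits.storage]
type = "redis"
url = "redis://localhost:6379"

[server.rate_limits.storage.pool]
max_size = 16    # pool utilization shows up in the redis.pool.connections.* gauges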
Metric: redis.command.duration
Type: Delta Histogram with Counter (milliseconds)
Description: Tracks execution time of Redis operations
Attributes:
- operation: Type of Redis operation
  - check_and_consume: HTTP rate limit checking
  - check_and_consume_tokens: Token-based rate limit checking
- status: Operation status (success or error)
- tokens: Number of tokens (only for token operations)
Use Case: Monitor Redis performance, identify bottlenecks in rate limiting
Metric: redis.pool.connections.in_use
Type: Gauge
Description: Current number of connections checked out from the pool
Attributes: None
Use Case: Monitor connection pool utilization
Metric: redis.pool.connections.available
Type: Gauge
Description: Current number of connections available in the pool
Attributes: None
Use Case: Ensure adequate pool capacity
These metrics follow the OpenTelemetry GenAI semantic conventions for AI model operations. See LLM Configuration for provider setup details.
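For context, a minimal provider definition in nexus.toml might look like the sketch below; the llm.providers layout and the environment-variable templating are assumptions, so follow the LLM Configuration guide for the real schema. The resulting gen_ai.request.model attribute uses the provider-prefixed form, e.g. "openai/gpt-4".
# Sketch only: key names are assumptions
[llm.providers.openai]
type = "openai"
api_key = "{{ env.OPENAI_API_KEY }}"

[llm.providers.openai.models.gpt-4]
# exposed to clients (and in gen_ai.request.model) as "openai/gpt-4"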
Metric: gen_ai.client.operation.duration
Type: Delta Histogram with Counter (milliseconds)
Description: Tracks the total duration of LLM chat completion operations
Attributes:
- gen_ai.system: Always "nexus.llm"
- gen_ai.operation.name: Always "chat.completions"
- gen_ai.request.model: The model identifier (e.g., "openai/gpt-4")
- gen_ai.response.finish_reason: How the response ended (stop/length/tool_calls/content_filter)
- client.id: Client identifier (from x-client-id header, see Client Identification)
- client.group: Client group (from x-client-group header, see Client Identification)
- error.type: Error type for failed requests:
  - invalid_request: Malformed request
  - authentication_failed: Invalid API key
  - insufficient_quota: Quota exceeded
  - model_not_found: Unknown model
  - rate_limit_exceeded: Provider or token rate limit hit
  - streaming_not_supported: Streaming unavailable for model
  - invalid_model_format: Incorrect model name format
  - provider_not_found: Unknown provider
  - internal_error: Server error
  - provider_api_error: Upstream provider error
  - connection_error: Network failure
Use Case: Monitor LLM latency, compare performance across providers and models
Metric: gen_ai.client.time_to_first_token
Type: Delta Histogram with Counter (milliseconds)
Description: Duration until the first token is received in streaming responses
Attributes:
- gen_ai.system: Always "nexus.llm"
- gen_ai.operation.name: Always "chat.completions"
- gen_ai.request.model: The model identifier
- client.id: Client identifier (see Client Identification)
- client.group: Client group (see Client Identification)
Use Case: Monitor streaming response latency, critical for user experience
Metric: gen_ai.client.input.token.usage
Type: Counter
Description: Cumulative count of input tokens consumed
Attributes:
- gen_ai.system: Always "nexus.llm"
- gen_ai.request.model: The model identifier
- client.id: Client identifier (see Client Identification)
- client.group: Client group (see Client Identification)
Metric: gen_ai.client.output.token.usage
Type: Counter
Description: Cumulative count of output tokens generated
Attributes:
- gen_ai.system: Always "nexus.llm"
- gen_ai.request.model: The model identifier
- client.id: Client identifier (see Client Identification)
- client.group: Client group (see Client Identification)
Metric: gen_ai.client.total.token.usage
Type: Counter
Description: Cumulative total tokens (input + output)
Attributes:
- gen_ai.system: Always "nexus.llm"
- gen_ai.request.model: The model identifier
- client.id: Client identifier (see Client Identification)
- client.group: Client group (see Client Identification)
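The token usage counters pair naturally with token-based rate limits (the check_and_consume_tokens Redis operation above, and the rate_limit_exceeded error type). The sketch below shows one way a per-model token limit could be expressed; the key names and structure are assumptions, so check the Rate Limiting and LLM Configuration guides for the real schema.
# Sketch only: structure and key names are assumptions
[llm.providers.openai.models.gpt-4.rate_limits.per_user]
input_token_limit = 100000   # tokens allowed per interval
interval = "60s"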
These metrics cover Model Context Protocol operations and tool executions. See MCP Configuration for server setup details.
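As a point of reference, downstream MCP servers are declared in nexus.toml; the snippet below is a sketch only, and the mcp.servers key layout (url, cmd) is an assumption, so rely on the MCP Configuration guide for the exact schema. The server name becomes the server_name attribute on execute operations.
# Sketch only: key names are assumptions
[mcp.servers.github]
url = "https://example.com/mcp"   # a remote (downstream) MCP server

[mcp.servers.filesystem]
cmd = ["npx", "-y", "@modelcontextprotocol/server-filesystem", "/tmp"]   # a local STDIO server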
Metric: mcp.tool.call.duration
Type: Delta Histogram with Counter (milliseconds)
Description: Tracks the duration of MCP tool invocations including both built-in and downstream tools
Attributes:
- tool_name: Name of the tool being called
- tool_type: Type of tool (builtin or downstream)
- status: Operation status (success or error)
- client.id: Client identifier (see Client Identification)
- client.group: Client group (see Client Identification)
- Additional for search operations:
  - keyword_count: Number of keywords in search query
  - result_count: Number of results returned
- Additional for execute operations on downstream tools:
  - server_name: Name of the downstream MCP server
- Additional for errors:
  - error.type: Specific error type:
    - parse_error: Invalid JSON (-32700)
    - invalid_request: Not a valid request (-32600)
    - method_not_found: Method/tool does not exist (-32601)
    - invalid_params: Invalid method parameters (-32602)
    - internal_error: Internal server error (-32603)
    - rate_limit_exceeded: Rate limit hit (-32000)
    - server_error: Other server errors (-32001 to -32099)
    - unknown: Any other error code
Use Case: Monitor tool performance, identify slow or failing tools, track usage patterns
Metric: mcp.tools.list.duration
Type: Delta Histogram with Counter (milliseconds)
Description: Tracks the duration of listing available tools from MCP servers
Attributes:
- method: Always "list_tools"
- status: Operation status (success or error)
- client.id: Client identifier (see Client Identification)
- client.group: Client group (see Client Identification)
Use Case: Monitor tool discovery performance and server responsiveness
Metric: mcp.prompt.request.duration
Type: Delta Histogram with Counter (milliseconds)
Description: Tracks the duration of prompt-related operations (list/get)
Attributes:
- method: Operation type (list_prompts or get_prompt)
- status: Operation status (success or error)
- client.id: Client identifier (see Client Identification)
- client.group: Client group (see Client Identification)
- Additional for errors:
  - error.type: Same error types as tool call duration
Use Case: Monitor prompt template retrieval and listing performance
Metric: mcp.resource.request.duration
Type: Delta Histogram with Counter (milliseconds)
Description: Tracks the duration of resource-related operations (list/read)
Attributes:
- method: Operation type (list_resources or read_resource)
- status: Operation status (success or error)
- client.id: Client identifier (see Client Identification)
- client.group: Client group (see Client Identification)
- Additional for errors:
  - error.type: Same error types as tool call duration
Use Case: Monitor resource access patterns and performance
- Telemetry Configuration - Complete configuration guide
- Telemetry Overview - All telemetry types
- Server Configuration - Server settings
- LLM Configuration - LLM provider setup
- MCP Configuration - MCP server setup