Nexus provides comprehensive OpenTelemetry metrics covering server operations, LLM interactions, and MCP tool executions. All metrics follow OpenTelemetry semantic conventions and can be exported to any OpenTelemetry-compatible backend.

Notes:

  • All histograms use delta temporality and also function as counters (the count field tracks the number of observations)
  • Delta histograms report the change since the last export, not cumulative values
  • Many metrics include client.id and client.group attributes when Client Identification is enabled. These attributes let you track usage per client and implement tiered access controls (a configuration sketch follows these notes).
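
The client.id and client.group attributes are derived from request headers once Client Identification is turned on. A minimal sketch of what that might look like in nexus.toml is shown below; the table and key names (client_identification, client_id, group_id, http_header) are illustrative assumptions and should be checked against the Client Identification guide.

  # Illustrative sketch; key names are assumptions, see Client Identification.
  [server.client_identification]
  enabled = true

  # Read the client identifier and group from the same headers the metric
  # attributes reference (x-client-id / x-client-group).
  [server.client_identification.client_id]
  http_header = "x-client-id"

  [server.client_identification.group_id]
  http_header = "x-client-group"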

To enable metrics, configure telemetry in your nexus.toml:

  [telemetry.exporters.otlp]
  enabled = true
  endpoint = "http://localhost:4317"

See the complete telemetry configuration guide for all options, including the following (an illustrative sketch of these options appears after the list):

  • Service identification and resource attributes
  • Protocol selection (gRPC vs HTTP)
  • Batch export optimization
  • Integration examples for popular backends
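
As a rough illustration of those options, the sketch below extends the minimal OTLP block above. The key names used for service identification, resource attributes, protocol selection, and batch export (service_name, resource_attributes, protocol, batch_export and its settings) are assumptions here; consult the telemetry configuration guide for the exact schema.

  # Illustrative sketch; key names are assumptions, see the telemetry guide.
  [telemetry]
  service_name = "nexus"

  # Resource attributes attached to every exported metric
  [telemetry.resource_attributes]
  environment = "production"

  [telemetry.exporters.otlp]
  enabled = true
  endpoint = "http://localhost:4317"
  # Protocol selection: gRPC or HTTP
  protocol = "grpc"

  # Batch export optimization
  [telemetry.exporters.otlp.batch_export]
  scheduled_delay = "5s"
  max_export_batch_size = 512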

Metric: http.server.request.duration
Type: Delta Histogram with Counter (milliseconds)
Description: Tracks the duration of HTTP server requests

Attributes:

  • http.request.method: HTTP method (GET, POST, etc.)
  • http.response.status_code: HTTP response status code
  • http.route: The matched route pattern

Use Case: Monitor API latency, identify slow endpoints, track error rates

The following Redis metrics are available when Redis is used as the rate limiting backend. See Rate Limiting Configuration for setup details.
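
For context, a minimal sketch of pointing rate limiting at Redis might look like the following; the table and key names (rate_limits, storage, url, pool_size) are illustrative assumptions, so refer to Rate Limiting Configuration for the actual schema.

  # Illustrative sketch; key names are assumptions, see Rate Limiting Configuration.
  [server.rate_limits]
  enabled = true

  # Using Redis instead of in-memory storage is what causes the
  # redis.command.* and redis.pool.* metrics below to be emitted.
  [server.rate_limits.storage]
  url = "redis://localhost:6379"
  pool_size = 16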

Metric: redis.command.duration
Type: Delta Histogram with Counter (milliseconds)
Description: Tracks execution time of Redis operations

Attributes:

  • operation: Type of Redis operation
    • check_and_consume: HTTP rate limit checking
    • check_and_consume_tokens: Token-based rate limit checking
  • status: Operation status (success or error)
  • tokens: Number of tokens (only for token operations)

Use Case: Monitor Redis performance, identify bottlenecks in rate limiting

Metric: redis.pool.connections.in_use
Type: Gauge
Description: Current number of connections checked out from the pool
Attributes: None
Use Case: Monitor connection pool utilization

Metric: redis.pool.connections.available
Type: Gauge
Description: Current number of connections available in the pool
Attributes: None
Use Case: Ensure adequate pool capacity

The following LLM metrics follow the OpenTelemetry GenAI semantic conventions for AI model operations. See LLM Configuration for provider setup details.
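
For reference, a minimal provider block that would generate these metrics might look like the sketch below; the provider table layout and key names (llm.providers, type, api_key) are assumptions, so see LLM Configuration for the supported providers and exact keys.

  # Illustrative sketch; key names are assumptions, see LLM Configuration.
  [llm]
  enabled = true

  # Requests routed to "openai/gpt-4" through this provider would carry
  # gen_ai.request.model = "openai/gpt-4" on the metrics below.
  [llm.providers.openai]
  type = "openai"
  api_key = "sk-..."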

Metric: gen_ai.client.operation.duration
Type: Delta Histogram with Counter (milliseconds)
Description: Tracks the total duration of LLM chat completion operations

Attributes:

  • gen_ai.system: Always "nexus.llm"
  • gen_ai.operation.name: Always "chat.completions"
  • gen_ai.request.model: The model identifier (e.g., "openai/gpt-4")
  • gen_ai.response.finish_reason: How the response ended (stop/length/tool_calls/content_filter)
  • client.id: Client identifier (from x-client-id header, see Client Identification)
  • client.group: Client group (from x-client-group header, see Client Identification)
  • error.type: Error type for failed requests:
    • invalid_request - Malformed request
    • authentication_failed - Invalid API key
    • insufficient_quota - Quota exceeded
    • model_not_found - Unknown model
    • rate_limit_exceeded - Provider or token rate limit hit
    • streaming_not_supported - Streaming unavailable for model
    • invalid_model_format - Incorrect model name format
    • provider_not_found - Unknown provider
    • internal_error - Server error
    • provider_api_error - Upstream provider error
    • connection_error - Network failure

Use Case: Monitor LLM latency, compare performance across providers and models

Metric: gen_ai.client.time_to_first_token
Type: Delta Histogram with Counter (milliseconds)
Description: Duration until the first token is received in streaming responses

Attributes:

  • gen_ai.system: Always "nexus.llm"
  • gen_ai.operation.name: Always "chat.completions"
  • gen_ai.request.model: The model identifier
  • client.id: Client identifier (see Client Identification)
  • client.group: Client group (see Client Identification)

Use Case: Monitor streaming response latency, critical for user experience

Metric: gen_ai.client.input.token.usage
Type: Counter
Description: Cumulative count of input tokens consumed

Attributes:

Metric: gen_ai.client.output.token.usage
Type: Counter
Description: Cumulative count of output tokens generated

Attributes:

Metric: gen_ai.client.total.token.usage
Type: Counter
Description: Cumulative total tokens (input + output)

Attributes:

The following metrics cover Model Context Protocol (MCP) operations and tool executions. See MCP Configuration for server setup details.
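
For reference, a minimal downstream MCP server entry might look like the sketch below; the table and key names (mcp.servers, url) are illustrative assumptions, so see MCP Configuration for the exact schema and transport options.

  # Illustrative sketch; key names are assumptions, see MCP Configuration.
  [mcp]
  enabled = true

  # The downstream server name ("github" here) is what shows up as the
  # server_name attribute on execute operations in mcp.tool.call.duration.
  [mcp.servers.github]
  url = "https://example.com/mcp"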

Metric: mcp.tool.call.duration
Type: Delta Histogram with Counter (milliseconds)
Description: Tracks the duration of MCP tool invocations including both built-in and downstream tools

Attributes:

  • tool_name: Name of the tool being called
  • tool_type: Type of tool (builtin or downstream)
  • status: Operation status (success or error)
  • client.id: Client identifier (see Client Identification)
  • client.group: Client group (see Client Identification)
  • Additional for search operations:
    • keyword_count: Number of keywords in search query
    • result_count: Number of results returned
  • Additional for execute operations on downstream tools:
    • server_name: Name of the downstream MCP server
  • Additional for errors:
    • error.type: Specific error type:
      • parse_error - Invalid JSON (-32700)
      • invalid_request - Not a valid request (-32600)
      • method_not_found - Method/tool does not exist (-32601)
      • invalid_params - Invalid method parameters (-32602)
      • internal_error - Internal server error (-32603)
      • rate_limit_exceeded - Rate limit hit (-32000)
      • server_error - Other server errors (-32001 to -32099)
      • unknown - Any other error code

Use Case: Monitor tool performance, identify slow or failing tools, track usage patterns

Metric: mcp.tools.list.duration
Type: Delta Histogram with Counter (milliseconds)
Description: Tracks the duration of listing available tools from MCP servers

Attributes:

Use Case: Monitor tool discovery performance and server responsiveness

Metric: mcp.prompt.request.duration
Type: Delta Histogram with Counter (milliseconds)
Description: Tracks the duration of prompt-related operations (list/get)

Attributes:

  • method: Operation type (list_prompts or get_prompt)
  • status: Operation status (success or error)
  • client.id: Client identifier (see Client Identification)
  • client.group: Client group (see Client Identification)
  • Additional for errors:
    • error.type: Same error types as tool call duration

Use Case: Monitor prompt template retrieval and listing performance

Metric: mcp.resource.request.duration
Type: Delta Histogram with Counter (milliseconds)
Description: Tracks the duration of resource-related operations (list/read)

Attributes:

  • method: Operation type (list_resources or read_resource)
  • status: Operation status (success or error)
  • client.id: Client identifier (see Client Identification)
  • client.group: Client group (see Client Identification)
  • Additional for errors:
    • error.type: Same error types as tool call duration

Use Case: Monitor resource access patterns and performance
