We're excited to announce that Nexus now provides complete observability through OpenTelemetry integration. Metrics arrived in version 0.3.5, distributed tracing followed in 0.4.0, and logs export landed in 0.4.1, so Nexus now delivers monitoring, tracing, and logging capabilities, all through industry-standard OpenTelemetry protocols.
OpenTelemetry provides a vendor-neutral, standardized approach to observability that integrates seamlessly with your existing monitoring stack. Whether you're using Prometheus, Grafana Cloud, Datadog, or AWS CloudWatch, Nexus's telemetry data flows directly into your preferred backend with minimal configuration.
Getting started requires just a few lines in your `nexus.toml`:
```toml
[telemetry]
service_name = "nexus-production"

[telemetry.exporters.otlp]
enabled = true
endpoint = "http://otel-collector:4317"
protocol = "grpc"
timeout = 10000

# Optional: fine-tune trace sampling
[telemetry.traces]
sample_rate = 0.1 # Sample 10% of requests
```
This configuration enables all three observability signals—metrics, traces, and logs—sending them to your OpenTelemetry collector for processing.
Our metrics implementation follows OpenTelemetry semantic conventions, providing standardized measurements across three key areas:
Track every interaction with language models through metrics like `gen_ai.client.operation.duration` and token usage counters. Monitor time-to-first-token for streaming responses, track token consumption by model and client, and identify performance bottlenecks across providers.
Measure tool performance with `mcp.tool.call.duration` and related metrics. Understand which tools are most frequently used, monitor success rates and error patterns, and track search operations with keyword and result count attributes.
Keep tabs on HTTP request latency with `http.server.request.duration` and Redis operation health. Monitor connection pool utilization, track rate limiting impact, and ensure optimal backend performance.
Learn more about available metrics and their attributes in our metrics documentation.
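Duration metrics like these are exported as OpenTelemetry histograms, so percentiles are estimated from bucket counts on the backend rather than computed exactly. A minimal sketch of that estimation, with made-up bucket boundaries and counts for illustration:

```python
def estimate_percentile(bounds, counts, q):
    """Estimate the q-th percentile (0..1) from explicit histogram buckets.

    bounds: upper bucket boundaries in seconds, e.g. [0.1, 0.5, 1.0];
    counts: observations per bucket, with one extra entry for the
    overflow bucket (observations above the last boundary).
    """
    total = sum(counts)
    target = q * total
    cumulative = 0
    for i, count in enumerate(counts):
        if cumulative + count >= target and count > 0:
            lower = bounds[i - 1] if i > 0 else 0.0
            # The overflow bucket has no upper bound; clamp to the last boundary.
            upper = bounds[i] if i < len(bounds) else bounds[-1]
            # Linear interpolation within the bucket.
            return lower + (upper - lower) * (target - cumulative) / count
        cumulative += count
    return bounds[-1]

# Example: 80 requests under 100ms, 15 under 500ms, 5 under 1s, none above.
p95 = estimate_percentile([0.1, 0.5, 1.0], [80, 15, 5, 0], 0.95)  # 0.5 seconds
```

This is the same interpolation most backends (e.g. Prometheus's `histogram_quantile`) apply, which is why choosing bucket boundaries near your latency targets matters.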
Version 0.4.0 introduced distributed tracing that visualizes request flows across your entire AI infrastructure. Each request generates a hierarchical span structure showing:
- HTTP Request Spans: Root spans capturing method, route, status, and client identification
- MCP Operation Spans: Detailed tool interaction tracking with authentication context
- LLM Operation Spans: Model parameters, token usage, and response details
- Redis Operation Spans: Rate limiting checks and connection pool metrics
Configure sampling rates from 0.0 to 1.0 to balance observability with overhead:
```toml
[telemetry.traces]
sample_rate = 0.05 # 5% sampling for production
max_events_per_span = 128
max_attributes_per_span = 128
```
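Head-based sampling makes the keep/drop decision once, when a trace starts. A simplified sketch modeled on OpenTelemetry's `TraceIdRatioBased` sampler (how Nexus decides internally is an implementation detail, but deriving the decision from the trace ID keeps it consistent across services):

```python
import random

def trace_id_ratio_sample(trace_id: int, sample_rate: float) -> bool:
    """Deterministic head sampling: the same trace ID always yields the
    same decision, so every service handling a request agrees on it.
    Compares the low 64 bits of the 128-bit trace ID against a threshold."""
    threshold = int(sample_rate * (1 << 64))
    return (trace_id & ((1 << 64) - 1)) < threshold

# With sample_rate = 0.05, roughly 5% of random trace IDs are kept.
random.seed(0)
kept = sum(trace_id_ratio_sample(random.getrandbits(128), 0.05)
           for _ in range(10_000))
```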
The tracing system maintains W3C trace context propagation, ensuring spans remain connected across service boundaries. Explore configuration options in our tracing guide.
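W3C trace context travels in the `traceparent` HTTP header, whose format is fixed by the spec: a version, a 16-byte trace ID, an 8-byte parent span ID, and trace flags, all hex-encoded and dash-separated. A small parsing sketch (the header value below is the example from the W3C specification):

```python
from dataclasses import dataclass

@dataclass
class TraceParent:
    version: str
    trace_id: str   # 32 hex chars (16 bytes)
    parent_id: str  # 16 hex chars (8 bytes)
    flags: str      # 2 hex chars; bit 0 = sampled

def parse_traceparent(header: str) -> TraceParent:
    version, trace_id, parent_id, flags = header.split("-")
    assert len(trace_id) == 32 and len(parent_id) == 16
    return TraceParent(version, trace_id, parent_id, flags)

tp = parse_traceparent("00-4bf92f3577b34da6a3ce929d0e0e4736-00f067aa0ba902b7-01")
sampled = int(tp.flags, 16) & 0x01 == 1  # the downstream service honors this bit
```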
The latest release completes the observability trinity with structured log export. Every log entry is automatically enriched with trace correlation when emitted within an active span:
- Automatic Trace Correlation: Logs include trace_id and span_id for request tracking
- Rich Attributes: Source location, module path, and custom resource attributes
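Concretely, a correlated log record carries the same trace and span IDs as the span that was active when it was emitted. The sketch below follows OTLP/JSON field naming; the body and attribute values are illustrative, not an exact schema:

```json
{
  "severityText": "INFO",
  "body": "tool call completed",
  "traceId": "4bf92f3577b34da6a3ce929d0e0e4736",
  "spanId": "00f067aa0ba902b7",
  "attributes": {
    "code.filepath": "src/mcp/tools.rs",
    "code.namespace": "nexus::mcp"
  }
}
```

Matching `traceId` against a trace in your tracing backend is what lets you jump from a single log line to the full request that produced it.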
Control log verbosity and export separately if needed:
```toml
# Optional: separate log export endpoint
[telemetry.logs.exporters.otlp]
enabled = true
endpoint = "http://loki-otlp:4318"
protocol = "http"
```
Detailed configuration options are available in our logs documentation.
All telemetry features are designed for production use:
- Compile-time optimization: Zero overhead when telemetry is disabled
- Efficient batching: Asynchronous export with configurable batch settings
- Smart defaults: Delta temporality histograms and automatic high-cardinality limiting
- Flexible backends: Works with any OTLP-compatible collector
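Delta temporality means each export reports only the change since the previous export, rather than a running total since process start, which keeps points small and makes restarts easy to handle. The two views are interchangeable via a running difference (a toy sketch):

```python
from itertools import accumulate

def cumulative_to_delta(points):
    """Convert cumulative counter readings (running totals) into per-interval deltas."""
    return [curr - prev for prev, curr in zip([0] + points, points)]

# A counter read at four export intervals, cumulative vs. delta view:
cumulative = [10, 25, 25, 40]
delta = cumulative_to_delta(cumulative)       # [10, 15, 0, 15]
assert list(accumulate(delta)) == cumulative  # the two views are equivalent
```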
These observability features provide the foundation for advanced capabilities we're building, including automatic anomaly detection, cost tracking per model and client, and intelligent alerting based on usage patterns. The OpenTelemetry integration ensures Nexus fits seamlessly into your existing observability stack while providing AI-specific insights that generic monitoring tools miss.
To get started with the latest version:
```bash
docker pull ghcr.io/grafbase/nexus:stable
```