Configure Nexus telemetry to export OpenTelemetry metrics, traces, and logs to your observability backend. Nexus follows OpenTelemetry semantic conventions for consistency with other tools in your monitoring stack.
Enable telemetry in your nexus.toml:
[telemetry]
service_name = "nexus-production" # Optional, defaults to "nexus"
# Resource attributes for all telemetry
[telemetry.resource_attributes]
environment = "production"
region = "us-east-1"
team = "platform"
# OTLP exporter configuration
[telemetry.exporters.otlp]
enabled = true # Must be true to export telemetry
endpoint = "http://localhost:4317" # Your OTLP collector endpoint
protocol = "grpc" # or "http" depending on your setup
timeout = "60s"
# Batch export settings (optional, these are defaults)
[telemetry.exporters.otlp.batch_export]
scheduled_delay = "5s"
max_queue_size = 2048
max_export_batch_size = 512
max_concurrent_exports = 1
[telemetry]
service_name = "nexus-production" # Identifies your service in metrics
Default: "nexus"
Add metadata that will be attached to all telemetry data:
[telemetry.resource_attributes]
environment = "production" # Environment name
region = "us-east-1" # Geographic region
team = "platform" # Owning team
version = "1.2.3" # Application version
datacenter = "aws-east" # Data center location
These attributes appear as labels in metrics and help with filtering, grouping, and correlation.
The OpenTelemetry Protocol (OTLP) exporter sends data to collectors or backends:
[telemetry.exporters.otlp]
enabled = true # Required to enable export
endpoint = "http://localhost:4317" # OTLP endpoint URL
protocol = "grpc" # Protocol: "grpc" or "http"
timeout = "60s" # Request timeout
- enabled: Must be true to activate telemetry export
- endpoint: URL of your OTLP receiver (collector, Grafana Agent, etc.)
- protocol:
  - "grpc" (default): More efficient, binary protocol
  - "http": Better for proxies and load balancers
- timeout: Maximum time to wait for export requests
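As a quick sanity check on the exporter settings above, the following Python sketch compares an endpoint's port with the conventional OTLP ports (4317 for gRPC, 4318 for HTTP). The function name `check_otlp_endpoint` is hypothetical and not part of Nexus, and a collector can listen on any port, so a warning here is only a hint.

```python
from urllib.parse import urlparse

# Conventional OTLP ports; a collector may be configured to use others.
CONVENTIONAL_PORTS = {"grpc": 4317, "http": 4318}


def check_otlp_endpoint(endpoint: str, protocol: str) -> list:
    """Return human-readable warnings for a likely endpoint/protocol mismatch."""
    if protocol not in CONVENTIONAL_PORTS:
        return [f"unknown protocol {protocol!r}; expected 'grpc' or 'http'"]

    warnings = []
    url = urlparse(endpoint)
    if url.scheme not in ("http", "https"):
        warnings.append(f"endpoint scheme {url.scheme!r} is not http or https")

    expected = CONVENTIONAL_PORTS[protocol]
    if url.port is not None and url.port != expected:
        warnings.append(
            f"port {url.port} is unusual for {protocol}; {expected} is conventional"
        )
    return warnings
```

For example, `check_otlp_endpoint("http://localhost:4317", "http")` returns one warning, because 4317 is the conventional gRPC port.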
Control how telemetry data is batched for export:
[telemetry.exporters.otlp.batch_export]
scheduled_delay = "5s" # How often to export
max_queue_size = 2048 # Maximum items in queue
max_export_batch_size = 512 # Items per export batch
max_concurrent_exports = 1 # Parallel export requests
- scheduled_delay: Lower values = more real-time, higher network overhead
- max_queue_size: Increase if data is being dropped during spikes
- max_export_batch_size: Larger batches = more efficient, but higher memory usage
- max_concurrent_exports: Usually keep at 1 unless your backend supports high concurrency
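To reason about max_queue_size, it can help to estimate how long the queue absorbs a stalled collector before data is dropped. The Python sketch below is a back-of-the-envelope model, not a description of Nexus internals: it assumes nothing is exported during the stall, and the incoming rate in the usage note is a hypothetical figure you would replace with your own.

```python
def queue_headroom_seconds(max_queue_size: int, items_per_second: float) -> float:
    """Seconds until the queue fills if no export succeeds in the meantime."""
    if items_per_second <= 0:
        return float("inf")
    return max_queue_size / items_per_second


def batches_to_drain(max_queue_size: int, max_export_batch_size: int) -> int:
    """Number of export batches needed to empty a full queue (ceiling division)."""
    return -(-max_queue_size // max_export_batch_size)
```

With the default max_queue_size of 2048 and a hypothetical 512 items per second, `queue_headroom_seconds(2048, 512)` gives 4.0 seconds of headroom, and `batches_to_drain(2048, 512)` shows that a full queue takes 4 batches to empty.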
Configure distributed tracing:
[telemetry.tracing]
sampling = 0.15 # Sample 15% of requests (0.0 to 1.0)
parent_based_sampler = false # Respect parent's sampling decision (default: false)
# Collection limits (per span)
[telemetry.tracing.collect]
max_events_per_span = 128
max_attributes_per_span = 128
max_links_per_span = 128
max_attributes_per_event = 128
max_attributes_per_link = 128
[telemetry.tracing.propagation]
trace_context = false # W3C Trace Context (default: false)
aws_xray = false # AWS X-Ray format (default: false)
# Override global OTLP exporter for traces (optional)
[telemetry.tracing.exporters.otlp]
enabled = true
endpoint = "http://traces-collector:4317"
protocol = "grpc"
timeout = "30s"
- sampling: Fraction of requests to trace (0.0-1.0)
  - Production: 0.01-0.1 (1-10%)
  - Development: 1.0 (100%)
- parent_based_sampler: Parent-based sampling strategy (default: false)
  - When true: Respects the upstream service's sampling decision from trace context
  - When false: Uses the local sampling ratio regardless of the parent trace
  - Benefits when true: Ensures complete distributed traces and consistent sampling across services
- collect: Per-span collection limits
  - max_events_per_span: Maximum events per span (default: 128)
  - max_attributes_per_span: Maximum attributes per span (default: 128)
  - max_links_per_span: Maximum links per span (default: 128)
  - max_attributes_per_event: Maximum attributes per event (default: 128)
  - max_attributes_per_link: Maximum attributes per link (default: 128)
- propagation: Context propagation formats
  - trace_context: W3C standard (default: false)
  - aws_xray: For AWS environments (default: false)
- exporters: Override global OTLP settings specifically for traces
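The interaction between sampling and parent_based_sampler can be sketched in Python as follows. This is an illustrative model only: it uses a trace-ID-ratio decision similar in spirit to the ratio samplers in OpenTelemetry SDKs, and whether Nexus makes the decision exactly this way is an assumption. The function name `should_sample` is hypothetical.

```python
from typing import Optional

_MASK_64 = (1 << 64) - 1


def should_sample(
    trace_id: int,
    ratio: float,
    parent_sampled: Optional[bool] = None,
    parent_based: bool = False,
) -> bool:
    """Decide whether to record a trace.

    parent_sampled is None when the request carries no upstream trace context.
    """
    # With parent-based sampling, an upstream decision wins when one exists.
    if parent_based and parent_sampled is not None:
        return parent_sampled

    # Otherwise compare the low 64 bits of the trace ID with a ratio-derived
    # bound, so the same trace ID always yields the same decision.
    bound = int(ratio * (1 << 64))
    return (trace_id & _MASK_64) < bound
```

Because the decision depends only on the trace ID and the ratio, every service using the same ratio reaches the same verdict for a given trace, which is what keeps sampled traces complete.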
Configure metrics export:
# Override global OTLP exporter for metrics (optional)
[telemetry.metrics.exporters.otlp]
enabled = true
endpoint = "http://metrics-collector:4317"
protocol = "grpc" # or "http"
timeout = "30s"
# Batch export settings for metrics (optional)
[telemetry.metrics.exporters.otlp.batch_export]
scheduled_delay = "10s"
max_queue_size = 4096
max_export_batch_size = 1024
max_concurrent_exports = 1
If not specified, metrics will use the global OTLP exporter configuration.
Configure structured log export via OpenTelemetry:
# Override global OTLP exporter for logs (optional)
[telemetry.logs.exporters.otlp]
enabled = true
endpoint = "http://logs-collector:4317"
protocol = "grpc" # or "http"
timeout = "30s"
# Batch export settings for logs (optional)
[telemetry.logs.exporters.otlp.batch_export]
scheduled_delay = "10s" # Batch logs for 10 seconds
max_queue_size = 8192 # Buffer for log spikes
max_export_batch_size = 2048 # Large batches for efficiency
max_concurrent_exports = 1 # Parallel export requests
If not specified, logs will use the global OTLP exporter configuration.
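The fallback from a signal-specific exporter to the global one can be pictured with a short Python sketch. It treats the parsed nexus.toml as a nested dict and assumes a signal-level table replaces the global table as a whole; whether Nexus instead merges the two key by key is not stated here, so treat that as an assumption. The function name `effective_otlp_config` is hypothetical.

```python
def effective_otlp_config(telemetry: dict, signal: str) -> dict:
    """Pick the OTLP exporter table for a signal: 'tracing', 'metrics', or 'logs'."""
    override = telemetry.get(signal, {}).get("exporters", {}).get("otlp")
    if override is not None:
        return override
    # No signal-specific table: fall back to [telemetry.exporters.otlp].
    return telemetry.get("exporters", {}).get("otlp", {})


# Parsed form of a config with a global exporter and a logs override.
telemetry = {
    "exporters": {"otlp": {"enabled": True, "endpoint": "http://localhost:4317"}},
    "logs": {
        "exporters": {
            "otlp": {"enabled": True, "endpoint": "http://logs-collector:4317"}
        }
    },
}
```

Here `effective_otlp_config(telemetry, "logs")` resolves to the logs collector, while `effective_otlp_config(telemetry, "metrics")` falls back to the global endpoint.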
Control log verbosity using the --log flag or NEXUS_LOG environment variable:
# Set log level
nexus --log info # Production (default)
nexus --log debug # Development
nexus --log trace # Maximum verbosity
nexus --log off # Disable logging
# Per-module configuration
nexus --log "nexus=debug,tower_http=info"
# Using environment variable
NEXUS_LOG=debug nexus
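To make the two filter forms above concrete, this Python sketch splits a filter string into a global level and per-module levels. It covers only the `level` and `module=level` forms shown in the examples; any further syntax Nexus may accept is not handled, and the function name `parse_log_filter` is hypothetical.

```python
from typing import Dict, Optional, Tuple


def parse_log_filter(spec: str) -> Tuple[Optional[str], Dict[str, str]]:
    """Split a filter such as 'nexus=debug,tower_http=info' into its parts.

    Returns (global_level, per_module_levels); global_level is None when the
    string contains only module=level directives.
    """
    global_level = None
    per_module = {}
    for directive in spec.split(","):
        directive = directive.strip()
        if not directive:
            continue
        if "=" in directive:
            module, level = directive.split("=", 1)
            per_module[module.strip()] = level.strip()
        else:
            global_level = directive
    return global_level, per_module
```

For instance, `parse_log_filter("nexus=debug,tower_http=info")` yields no global level and a per-module mapping for `nexus` and `tower_http`.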
Control output format with --log-style or NEXUS_LOG_STYLE:
nexus --log-style json # Structured JSON output
nexus --log-style color # Colorized terminal output
nexus --log-style text # Plain text output
# Using environment variable
NEXUS_LOG_STYLE=json nexus
See Logs documentation for details on log attributes, correlation, and queries.
# otel-collector-config.yaml
receivers:
  otlp:
    protocols:
      grpc:
        endpoint: 0.0.0.0:4317
      http:
        endpoint: 0.0.0.0:4318
exporters:
  prometheus:
    endpoint: "0.0.0.0:8889"
    namespace: nexus
    const_labels:
      environment: production
  debug: # Placeholder for traces and logs; the prometheus exporter accepts metrics only
service:
  pipelines:
    metrics:
      receivers: [otlp]
      exporters: [prometheus]
    traces:
      receivers: [otlp]
      exporters: [debug] # Replace with your trace backend
    logs:
      receivers: [otlp]
      exporters: [debug] # Replace with your logs backend
Nexus configuration:
[telemetry.exporters.otlp]
enabled = true
endpoint = "http://localhost:4317"
protocol = "grpc"
Since custom headers aren't supported directly, use a local collector:
# otel-collector-config.yaml
receivers:
  otlp:
    protocols:
      grpc:
        endpoint: 0.0.0.0:4317
exporters:
  otlphttp:
    endpoint: https://otlp-gateway-prod-us-central-0.grafana.net/otlp
    headers:
      authorization: Basic ${env:GRAFANA_CLOUD_TOKEN}
service:
  pipelines:
    metrics:
      receivers: [otlp]
      exporters: [otlphttp]
    traces:
      receivers: [otlp]
      exporters: [otlphttp]
    logs:
      receivers: [otlp]
      exporters: [otlphttp]
Nexus configuration:
[telemetry.exporters.otlp]
enabled = true
endpoint = "http://localhost:4317"
protocol = "grpc"
Export via the Datadog Agent with OTLP support:
# datadog.yaml
otlp_config:
  receiver:
    protocols:
      grpc:
        endpoint: 0.0.0.0:4317
      http:
        endpoint: 0.0.0.0:4318
Nexus configuration:
[telemetry.exporters.otlp]
enabled = true
endpoint = "http://localhost:4317"
protocol = "grpc"
Via OpenTelemetry Collector with AWS EMF exporter:
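# otel-collector-config.yaml
receivers:
  otlp:
    protocols:
      grpc:
        endpoint: 0.0.0.0:4317
exporters:
  awsemf:
    region: us-east-1
    namespace: Nexus
    dimension_rollup_option: NoDimensionRollup
  awsxray: # Must be defined to be used in the traces pipeline
    region: us-east-1
  awscloudwatchlogs: # Must be defined to be used in the logs pipeline
    region: us-east-1
    log_group_name: "nexus" # Placeholder name, choose your own
    log_stream_name: "nexus" # Placeholder name, choose your own
service:
  pipelines:
    metrics:
      receivers: [otlp]
      exporters: [awsemf]
    traces:
      receivers: [otlp]
      exporters: [awsxray] # AWS X-Ray for traces
    logs:
      receivers: [otlp]
      exporters: [awscloudwatchlogs] # CloudWatch for logs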
For high-traffic environments, optimize batch settings:
[telemetry.exporters.otlp.batch_export]
scheduled_delay = "10s" # Less frequent exports
max_queue_size = 4096 # Buffer more data
max_export_batch_size = 1024 # Larger batches
max_concurrent_exports = 2 # More parallelism if supported
For near real-time data:
[telemetry.exporters.otlp.batch_export]
scheduled_delay = "1s" # Very frequent exports
max_export_batch_size = 256 # Smaller batches
Minimize resource usage:
[telemetry.exporters.otlp.batch_export]
scheduled_delay = "30s" # Infrequent exports
max_queue_size = 512 # Smaller buffer
max_export_batch_size = 128 # Small batches
- Check Configuration:
  [telemetry.exporters.otlp]
  enabled = true # Must be explicitly enabled
- Verify Endpoint Connectivity:
  # For gRPC protocol
  grpcurl -plaintext localhost:4317 list
  # For HTTP protocol
  curl -v http://localhost:4318/v1/metrics
- Enable Debug Logging:
  nexus --log debug 2>&1 | grep -i telemetry
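If grpcurl or curl is not available on the host, a plain TCP check answers the first question, namely whether anything is listening on the OTLP port at all. The Python sketch below uses only the standard library; `can_connect` is a hypothetical helper, and a successful connection does not prove that the listener speaks OTLP.

```python
import socket


def can_connect(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and DNS resolution failures.
        return False
```

Calling `can_connect("localhost", 4317)` checks the conventional gRPC port, and `can_connect("localhost", 4318)` checks the HTTP one.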
- Wrong protocol: Ensure your collector supports the protocol you've configured
- Network issues: Firewall blocking OTLP ports (4317 for gRPC, 4318 for HTTP)
- Resource exhaustion: Queue full due to slow collector or network
- Authentication: Some backends require authentication via collector
- High memory usage: Reduce max_queue_size or max_export_batch_size
- Export failures: Increase timeout or check collector capacity
- Missing data: Increase max_queue_size if the queue is overflowing
[telemetry]
service_name = "nexus-dev"
[telemetry.resource_attributes]
environment = "development"
[telemetry.exporters.otlp]
enabled = true
endpoint = "http://localhost:4317"
protocol = "grpc"
[telemetry.exporters.otlp.batch_export]
scheduled_delay = "5s" # Default is fine
[telemetry.tracing]
sampling = 1.0 # Sample everything in dev
parent_based_sampler = false # Don't need parent-based in dev
[telemetry]
service_name = "nexus-prod"
[telemetry.resource_attributes]
environment = "production"
region = "{{ env.AWS_REGION }}"
version = "{{ env.APP_VERSION }}"
[telemetry.exporters.otlp]
enabled = true
endpoint = "{{ env.OTEL_ENDPOINT }}"
protocol = "grpc"
timeout = "30s" # Shorter timeout for prod
[telemetry.exporters.otlp.batch_export]
scheduled_delay = "10s" # Less frequent for efficiency
max_queue_size = 4096 # Handle traffic spikes
max_export_batch_size = 1024 # Efficient batching
[telemetry.tracing]
sampling = 0.1 # Sample 10% of requests
parent_based_sampler = true # Respect upstream sampling for complete traces
[telemetry.tracing.propagation]
trace_context = true # Enable W3C trace context
aws_xray = false # Or true if using AWS
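The production example relies on `{{ env.NAME }}` placeholders being filled from environment variables. The Python sketch below imitates that substitution so you can preview what a config would resolve to before deploying. It is not the Nexus template engine: the function name `render_env_placeholders` is hypothetical, and raising an error for a missing variable is this sketch's choice, since how Nexus treats unset variables is not described here.

```python
import os
import re

# Matches placeholders such as {{ env.AWS_REGION }} with optional inner spaces.
_PLACEHOLDER = re.compile(r"\{\{\s*env\.([A-Za-z_][A-Za-z0-9_]*)\s*\}\}")


def render_env_placeholders(text: str, env=None) -> str:
    """Replace each {{ env.NAME }} placeholder with the value of NAME."""
    env = os.environ if env is None else env

    def substitute(match):
        name = match.group(1)
        if name not in env:
            raise KeyError(f"environment variable {name} is not set")
        return env[name]

    return _PLACEHOLDER.sub(substitute, text)
```

Passing an explicit mapping, for example `render_env_placeholders(line, {"AWS_REGION": "us-east-1"})`, makes it easy to test a config without touching the real environment.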
- Network Security: Use TLS-enabled collectors in production
- Data Sensitivity: Keep secrets and personal data out of resource attributes, since they are attached to all exported telemetry
- Access Control: Ensure only authorized services can send to your OTLP endpoint
- Data Retention: Configure appropriate retention policies in your backend
- Telemetry Overview - Understanding available telemetry types
- Metrics - All available metrics and queries
- Traces - Distributed tracing spans and configuration
- Logs - Structured application logs and correlation
- Server Configuration - HTTP server settings
- LLM Configuration - Language model settings
- MCP Configuration - Tool protocol settings