Telemetry Configuration

Configure Nexus telemetry to export OpenTelemetry metrics, traces, and logs to your observability backend. Nexus follows OpenTelemetry semantic conventions for consistency with other tools in your monitoring stack.

Basic Configuration

Enable telemetry in your nexus.toml:


[telemetry]
service_name = "nexus-production"  # Optional, defaults to "nexus"

# Resource attributes for all telemetry
[telemetry.resource_attributes]
environment = "production"
region = "us-east-1"
team = "platform"

# OTLP exporter configuration
[telemetry.exporters.otlp]
enabled = true                       # Must be true to export telemetry
endpoint = "http://localhost:4317"   # Your OTLP collector endpoint
protocol = "grpc"                    # or "http" depending on your setup
timeout = "60s"

# Optional: Additional headers for authentication
# For gRPC protocol:
[telemetry.exporters.otlp.grpc.headers]
authorization = "Bearer {{ env.OTLP_TOKEN }}"
x-nexus-shard = "primary"

# Optional: TLS configuration for gRPC
# [telemetry.exporters.otlp.grpc.tls]
# domain_name = "collector.example.com"
# ca = "/path/to/ca.crt"

# For HTTP protocol (use otlp.http.headers instead):
# [telemetry.exporters.otlp.http.headers]
# authorization = "Bearer {{ env.OTLP_TOKEN }}"
# x-nexus-shard = "primary"

# Batch export settings (optional, these are defaults)
[telemetry.exporters.otlp.batch_export]
scheduled_delay = "5s"
max_queue_size = 2048
max_export_batch_size = 512
max_concurrent_exports = 1

Configuration Options

Service Identification


[telemetry]
service_name = "nexus-production"  # Identifies your service in exported telemetry

Default: "nexus"

Resource Attributes

Add metadata that will be attached to all telemetry data:


[telemetry.resource_attributes]
environment = "production"     # Environment name
region = "us-east-1"          # Geographic region
team = "platform"             # Owning team
version = "1.2.3"             # Application version
datacenter = "aws-east"       # Data center location

These attributes are attached to metrics (as labels), traces, and logs, and help with filtering, grouping, and correlation.

OTLP Exporter

The OpenTelemetry Protocol (OTLP) exporter sends data to collectors or backends:


[telemetry.exporters.otlp]
enabled = true                       # Required to enable export
endpoint = "http://localhost:4317"   # OTLP endpoint URL
protocol = "grpc"                    # Protocol: "grpc" or "http"
timeout = "60s"                      # Request timeout

# Optional: Custom headers for authentication (protocol-specific)
# For gRPC:
[telemetry.exporters.otlp.grpc.headers]
authorization = "Bearer {{ env.OTLP_TOKEN }}"
x-custom-header = "value"

# Optional: TLS configuration for gRPC
[telemetry.exporters.otlp.grpc.tls]
domain_name = "custom_name"       # Override server name for TLS verification
key = "/path/to/key.pem"          # Client certificate key
cert = "/path/to/cert.pem"        # Client certificate
ca = "/path/to/ca.crt"            # Custom CA certificate

# For HTTP:
# [telemetry.exporters.otlp.http.headers]
# authorization = "Bearer {{ env.OTLP_TOKEN }}"
# x-custom-header = "value"

Configuration Options:

enabled: Must be true to activate telemetry export
endpoint: URL of your OTLP receiver (collector, Grafana Agent, etc.)
protocol:
- "grpc" (default) - More efficient, binary protocol
- "http" - Better for proxies and load balancers
timeout: Maximum time to wait for export requests
headers (optional): Custom headers for authentication or routing
- Supports environment variable substitution with {{ env.VAR_NAME }}
- Headers are applied to all requests for this exporter
- Protocol-specific header validation applies (e.g., gRPC metadata rules)
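
To illustrate the options above, here is a sketch of the same exporter switched to HTTP. It assumes a collector listening on port 4318, the conventional OTLP/HTTP port; the host, port, and timeout value are placeholders rather than Nexus defaults.

```toml
[telemetry.exporters.otlp]
enabled = true
endpoint = "http://localhost:4318"   # Assumed collector address (OTLP/HTTP port)
protocol = "http"
timeout = "30s"                      # Example value

# With protocol = "http", headers go under otlp.http.headers
[telemetry.exporters.otlp.http.headers]
authorization = "Bearer {{ env.OTLP_TOKEN }}"
```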

Header and TLS Configuration

Custom headers and TLS can be configured for authentication and secure communication:

Headers

Headers must be configured under the protocol-specific section:


# For gRPC protocol (when protocol = "grpc")
[telemetry.exporters.otlp.grpc.headers]
authorization = "Bearer {{ env.OTLP_TOKEN }}"    # Authentication
x-routing-key = "nexus-prod"                     # Custom routing
x-tenant-id = "{{ env.TENANT_ID }}"             # Multi-tenancy

# For HTTP protocol (when protocol = "http")
[telemetry.exporters.otlp.http.headers]
Authorization = "Bearer {{ env.OTLP_TOKEN }}"    # HTTP Authorization header
X-Routing-Key = "nexus-prod"                     # Custom routing
X-Tenant-Id = "{{ env.TENANT_ID }}"             # Multi-tenancy

TLS Configuration (gRPC only)

For secure gRPC connections, configure TLS certificates:


[telemetry.exporters.otlp.grpc.tls]
domain_name = "collector.example.com"  # Server name for TLS verification
key = "/etc/nexus/certs/client.key"    # Client certificate key (for mTLS)
cert = "/etc/nexus/certs/client.crt"   # Client certificate (for mTLS)
ca = "/etc/nexus/certs/ca.crt"         # Custom CA certificate

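# Alternatively, read the values from environment variables. A TOML file may
# define this table only once, so use this form instead of the one above:
# [telemetry.exporters.otlp.grpc.tls]
# domain_name = "{{ env.OTLP_TLS_DOMAIN }}"
# key = "{{ env.CLIENT_KEY_PATH }}"
# cert = "{{ env.CLIENT_CERT_PATH }}"
# ca = "{{ env.CA_CERT_PATH }}"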

Configuration Guidelines:

Environment Variables: Use {{ env.VAR_NAME }} for sensitive values
Protocol Rules:
- gRPC: Headers become metadata, cannot start with "grpc-" (reserved)
- HTTP: Standard HTTP header rules apply
TLS Options:
- domain_name: Override the server name used for TLS verification
- key + cert: Enable mutual TLS (mTLS) authentication
- ca: Use custom CA certificate instead of system CA bundle
Inheritance: Signal-specific exporters (traces/metrics/logs) can override global settings
Security: Keep tokens and certificate paths in environment variables
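
The gRPC naming rule is easy to trip over. As an illustration (the header names here are made up), the first entry below is acceptable, while the commented-out one uses the reserved prefix and would be expected to fail validation. How Nexus reports that failure is not specified here.

```toml
[telemetry.exporters.otlp.grpc.headers]
x-routing-key = "nexus-prod"          # OK: custom metadata key
# grpc-routing-key = "nexus-prod"     # Not allowed: "grpc-" prefix is reserved
```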

Batch Export Settings

Control how telemetry data is batched for export:


[telemetry.exporters.otlp.batch_export]
scheduled_delay = "5s"               # How often to export
max_queue_size = 2048                # Maximum items in queue
max_export_batch_size = 512          # Items per export batch
max_concurrent_exports = 1           # Parallel export requests

Batch Configuration Guidelines:

scheduled_delay: Lower values = more real-time, higher network overhead
max_queue_size: Increase if data is being dropped during spikes
max_export_batch_size: Larger batches = more efficient, but higher memory usage
max_concurrent_exports: Usually keep at 1 unless your backend supports high concurrency
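
As a worked example of these trade-offs, the sketch below targets short traffic spikes: it raises only the queue size and leaves the other settings at their defaults. The value 8192 is an arbitrary illustration, not a recommendation.

```toml
[telemetry.exporters.otlp.batch_export]
scheduled_delay = "5s"           # Default
max_queue_size = 8192            # 4x the default of 2048; room for 16 batches of 512
max_export_batch_size = 512      # Default
max_concurrent_exports = 1       # Default
```

A larger queue costs memory, so lower it again if memory usage becomes a concern.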

Tracing Configuration

Configure distributed tracing:


[telemetry.tracing]
sampling = 0.15                   # Sample 15% of requests (0.0 to 1.0)
parent_based_sampler = false      # Set to true to respect the parent's sampling decision (default: false)

# Collection limits (per span)
[telemetry.tracing.collect]
max_events_per_span = 128
max_attributes_per_span = 128
max_links_per_span = 128
max_attributes_per_event = 128
max_attributes_per_link = 128

[telemetry.tracing.propagation]
trace_context = false             # W3C Trace Context (default: false)
aws_xray = false                  # AWS X-Ray format (default: false)

# Override global OTLP exporter for traces (optional)
[telemetry.tracing.exporters.otlp]
enabled = true
endpoint = "http://traces-collector:4317"
protocol = "grpc"
timeout = "30s"

# Optional: Headers specific to trace export (protocol-specific)
# For gRPC:
[telemetry.tracing.exporters.otlp.grpc.headers]
authorization = "Bearer {{ env.TRACE_TOKEN }}"
x-trace-priority = "high"

# For HTTP:
# [telemetry.tracing.exporters.otlp.http.headers]
# authorization = "Bearer {{ env.TRACE_TOKEN }}"
# x-trace-priority = "high"

Tracing Options:

sampling: Fraction of requests to trace (0.0-1.0)
- Production: 0.01-0.1 (1-10%)
- Development: 1.0 (100%)
parent_based_sampler: Parent-based sampling strategy (default: false)
- When true: Respects upstream service's sampling decision from trace context
- When false: Uses local sampling ratio regardless of parent trace
- Benefits when enabled: Complete distributed traces and consistent sampling across services
collect: Per-span collection limits
- max_events_per_span: Maximum events per span (default: 128)
- max_attributes_per_span: Maximum attributes per span (default: 128)
- max_links_per_span: Maximum links per span (default: 128)
- max_attributes_per_event: Maximum attributes per event (default: 128)
- max_attributes_per_link: Maximum attributes per link (default: 128)
propagation: Context propagation formats
- trace_context: W3C standard (default: false)
- aws_xray: For AWS environments (default: false)
exporters: Override global OTLP settings specifically for traces

Metrics Configuration

Configure metrics export:


# Override global OTLP exporter for metrics (optional)
[telemetry.metrics.exporters.otlp]
enabled = true
endpoint = "http://metrics-collector:4317"
protocol = "grpc"    # or "http"
timeout = "30s"

# Optional: Headers specific to metrics export (protocol-specific)
# For gRPC:
[telemetry.metrics.exporters.otlp.grpc.headers]
authorization = "Bearer {{ env.METRICS_TOKEN }}"

# For HTTP:
# [telemetry.metrics.exporters.otlp.http.headers]
# authorization = "Bearer {{ env.METRICS_TOKEN }}"

# Batch export settings for metrics (optional)
[telemetry.metrics.exporters.otlp.batch_export]
scheduled_delay = "10s"
max_queue_size = 4096
max_export_batch_size = 1024
max_concurrent_exports = 1

If not specified, metrics will use the global OTLP exporter configuration.

Logs Configuration

Configure structured log export via OpenTelemetry:


# Override global OTLP exporter for logs (optional)
[telemetry.logs.exporters.otlp]
enabled = true
endpoint = "http://logs-collector:4317"
protocol = "grpc"    # or "http"
timeout = "30s"

# Optional: Headers specific to logs export (protocol-specific)
# For gRPC:
[telemetry.logs.exporters.otlp.grpc.headers]
authorization = "Bearer {{ env.LOGS_TOKEN }}"

# For HTTP:
# [telemetry.logs.exporters.otlp.http.headers]
# authorization = "Bearer {{ env.LOGS_TOKEN }}"

# Batch export settings for logs (optional)
[telemetry.logs.exporters.otlp.batch_export]
scheduled_delay = "10s"               # Batch logs for 10 seconds
max_queue_size = 8192                # Buffer for log spikes
max_export_batch_size = 2048         # Large batches for efficiency
max_concurrent_exports = 1           # Parallel export requests

If not specified, logs will use the global OTLP exporter configuration.
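
The fallback behavior can be sketched as follows: one global exporter, plus an override for traces only. The collector hostnames are placeholders. The override repeats every field it needs, because whether omitted fields are inherited from the global block is not stated here.

```toml
# Global exporter: used by metrics and logs, which define no override
[telemetry.exporters.otlp]
enabled = true
endpoint = "http://collector:4317"
protocol = "grpc"
timeout = "30s"

# Traces go to a dedicated collector instead
[telemetry.tracing.exporters.otlp]
enabled = true
endpoint = "http://traces-collector:4317"
protocol = "grpc"
timeout = "30s"
```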

Log Level Control:

Control log verbosity using the --log flag or NEXUS_LOG environment variable:


# Set log level
nexus --log info           # Production (default)
nexus --log debug          # Development
nexus --log trace          # Maximum verbosity
nexus --log off            # Disable logging

# Per-module configuration
nexus --log "nexus=debug,tower_http=info"

# Using environment variable
NEXUS_LOG=debug nexus

Control output format with --log-style or NEXUS_LOG_STYLE:


nexus --log-style json     # Structured JSON output
nexus --log-style color    # Colorized terminal output
nexus --log-style text     # Plain text output

# Using environment variable
NEXUS_LOG_STYLE=json nexus

See the Logs documentation for details on log attributes, correlation, and queries.

Integration Examples

Prometheus via OpenTelemetry Collector


# otel-collector-config.yaml
receivers:
  otlp:
    protocols:
      grpc:
        endpoint: 0.0.0.0:4317
      http:
        endpoint: 0.0.0.0:4318

exporters:
  prometheus:
    endpoint: "0.0.0.0:8889"
    namespace: nexus
    const_labels:
      environment: production

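service:
  pipelines:
    metrics:
      receivers: [otlp]
      exporters: [prometheus]
    # The prometheus exporter only accepts metrics. To forward traces and logs,
    # define exporters for those backends and add pipelines for them:
    # traces:
    #   receivers: [otlp]
    #   exporters: [your_trace_exporter]
    # logs:
    #   receivers: [otlp]
    #   exporters: [your_logs_exporter]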

Nexus configuration:


[telemetry.exporters.otlp]
enabled = true
endpoint = "http://localhost:4317"
protocol = "grpc"

Grafana Cloud

You can now send telemetry directly to Grafana Cloud using custom headers:


# Direct connection to Grafana Cloud
[telemetry.exporters.otlp]
enabled = true
endpoint = "https://otlp-gateway-prod-us-central-0.grafana.net/otlp"
protocol = "http"
timeout = "30s"

[telemetry.exporters.otlp.http.headers]
# The variable must hold the Base64 encoding of "<instance ID>:<API token>"
authorization = "Basic {{ env.GRAFANA_CLOUD_TOKEN }}"

Alternatively, you can still use a local collector if you need additional processing:


# otel-collector-config.yaml (optional)
receivers:
  otlp:
    protocols:
      grpc:
        endpoint: 0.0.0.0:4317

exporters:
  otlphttp:
    endpoint: https://otlp-gateway-prod-us-central-0.grafana.net/otlp
    headers:
      authorization: Basic ${env:GRAFANA_CLOUD_TOKEN}

service:
  pipelines:
    metrics:
      receivers: [otlp]
      exporters: [otlphttp]
    traces:
      receivers: [otlp]
      exporters: [otlphttp]
    logs:
      receivers: [otlp]
      exporters: [otlphttp]

Datadog

Export via the Datadog Agent with OTLP support:


# datadog.yaml
otlp_config:
  receiver:
    protocols:
      grpc:
        endpoint: 0.0.0.0:4317
      http:
        endpoint: 0.0.0.0:4318

Nexus configuration:


[telemetry.exporters.otlp]
enabled = true
endpoint = "http://localhost:4317"
protocol = "grpc"

AWS CloudWatch

Export via the OpenTelemetry Collector (contrib distribution) using the AWS EMF, X-Ray, and CloudWatch Logs exporters:


# otel-collector-config.yaml
receivers:
  otlp:
    protocols:
      grpc:
        endpoint: 0.0.0.0:4317

exporters:
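  awsemf:
    region: us-east-1
    namespace: Nexus
    dimension_rollup_option: NoDimensionRollup
  awsxray:                          # Referenced by the traces pipeline below
    region: us-east-1
  awscloudwatchlogs:                # Referenced by the logs pipeline below
    region: us-east-1
    log_group_name: "/nexus/logs"   # Example name
    log_stream_name: "nexus"        # Example name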

service:
  pipelines:
    metrics:
      receivers: [otlp]
      exporters: [awsemf]
    traces:
      receivers: [otlp]
      exporters: [awsxray]  # AWS X-Ray for traces
    logs:
      receivers: [otlp]
      exporters: [awscloudwatchlogs]  # CloudWatch for logs
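
The matching Nexus configuration is not shown for this integration. A sketch, assuming the collector runs on the same host and that X-Ray trace headers are wanted for correlation with other AWS services:

```toml
[telemetry.exporters.otlp]
enabled = true
endpoint = "http://localhost:4317"   # Assumed local collector address
protocol = "grpc"

[telemetry.tracing.propagation]
aws_xray = true                      # AWS X-Ray propagation format
```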

Performance Tuning

High Volume Deployments

For high-traffic environments, optimize batch settings:


[telemetry.exporters.otlp.batch_export]
scheduled_delay = "10s"      # Less frequent exports
max_queue_size = 4096        # Buffer more data
max_export_batch_size = 1024 # Larger batches
max_concurrent_exports = 2   # More parallelism if supported

Low Latency Requirements

For near real-time data:


[telemetry.exporters.otlp.batch_export]
scheduled_delay = "1s"       # Very frequent exports
max_export_batch_size = 256  # Smaller batches

Resource Constrained Environments

Minimize resource usage:


[telemetry.exporters.otlp.batch_export]
scheduled_delay = "30s"      # Infrequent exports
max_queue_size = 512         # Smaller buffer
max_export_batch_size = 128  # Small batches
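
Batch settings are only one lever. If traces dominate the volume, the tracing options described earlier can reduce it further. A sketch with illustrative values (none of these numbers are recommendations from the Nexus project):

```toml
[telemetry.tracing]
sampling = 0.01                  # Trace 1% of requests

[telemetry.tracing.collect]
max_events_per_span = 32         # Default: 128
max_attributes_per_span = 32     # Default: 128
max_links_per_span = 32          # Default: 128
```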

Troubleshooting

Telemetry Not Working

Check Configuration:


[telemetry.exporters.otlp]
enabled = true  # Must be explicitly enabled

Verify Endpoint Connectivity:


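# For gRPC protocol (an error about missing reflection support still confirms the port is reachable)
grpcurl -plaintext localhost:4317 list

# For HTTP protocol (any HTTP response, including an error status for this GET, confirms the endpoint is reachable)
curl -v http://localhost:4318/v1/metrics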

Enable Debug Logging:


nexus --log debug 2>&1 | grep -i telemetry

Common Configuration Errors

Wrong protocol: Ensure your collector supports the protocol you've configured
Network issues: Firewall blocking OTLP ports (4317 for gRPC, 4318 for HTTP)
Resource exhaustion: Queue full due to slow collector or network
Authentication: Missing or invalid credentials; set the protocol-specific headers (see Header and TLS Configuration) or authenticate through a collector

Performance Issues

High memory usage: Reduce max_queue_size or max_export_batch_size
Export failures: Increase timeout or check collector capacity
Missing data: Increase max_queue_size if queue is overflowing

Environment-Specific Configurations

Development


[telemetry]
service_name = "nexus-dev"

[telemetry.resource_attributes]
environment = "development"

[telemetry.exporters.otlp]
enabled = true
endpoint = "http://localhost:4317"
protocol = "grpc"

[telemetry.exporters.otlp.batch_export]
scheduled_delay = "5s"  # Default is fine

[telemetry.tracing]
sampling = 1.0  # Sample everything in dev
parent_based_sampler = false  # Don't need parent-based in dev

Production


[telemetry]
service_name = "nexus-prod"

[telemetry.resource_attributes]
environment = "production"
region = "{{ env.AWS_REGION }}"
version = "{{ env.APP_VERSION }}"

[telemetry.exporters.otlp]
enabled = true
endpoint = "{{ env.OTEL_ENDPOINT }}"
protocol = "grpc"
timeout = "30s"  # Shorter timeout for prod

# Authentication headers for production (gRPC)
[telemetry.exporters.otlp.grpc.headers]
authorization = "Bearer {{ env.OTEL_AUTH_TOKEN }}"
x-environment = "production"

# TLS configuration for secure connections (optional)
[telemetry.exporters.otlp.grpc.tls]
domain_name = "{{ env.OTEL_TLS_DOMAIN }}"  # e.g., "telemetry.company.com"
ca = "{{ env.CA_CERT_PATH }}"              # Custom CA if needed
# For mTLS authentication:
# key = "{{ env.CLIENT_KEY_PATH }}"
# cert = "{{ env.CLIENT_CERT_PATH }}"

[telemetry.exporters.otlp.batch_export]
scheduled_delay = "10s"      # Less frequent for efficiency
max_queue_size = 4096        # Handle traffic spikes
max_export_batch_size = 1024 # Efficient batching

[telemetry.tracing]
sampling = 0.1  # Sample 10% of requests
parent_based_sampler = true  # Respect upstream sampling for complete traces

[telemetry.tracing.propagation]
trace_context = true  # Enable W3C trace context
aws_xray = false      # Or true if using AWS

Security Considerations

Network Security: Use TLS-enabled collectors in production
Data Sensitivity: Be careful with resource attributes, since they are attached to all exported telemetry
Access Control: Ensure only authorized services can send to your OTLP endpoint
Data Retention: Configure appropriate retention policies in your backend

Telemetry Overview - Understanding available telemetry types
Metrics - All available metrics and queries
Traces - Distributed tracing spans and configuration
Logs - Structured application logs and correlation
Server Configuration - HTTP server settings
LLM Configuration - Language model settings
MCP Configuration - Tool protocol settings