MillerByte.Logging.Api

Resilience & Observability

Last updated: 1/22/2026

The logging package includes built-in resilience patterns using Polly, health checks for monitoring, and OpenTelemetry integration for distributed tracing.

Polly Resilience Policies

Enable automatic retry and circuit breaker policies for MongoDB operations:

builder.Services.AddApiLogging(options =>
{
    // Enable Polly policies
    options.UsePollyResilience = true;
    
    // Retry configuration
    options.RetryCount = 3;
    options.InitialRetryDelayMs = 100;
    options.MaxRetryDelayMs = 2000;
    options.RetryJitterMaxMs = 100;
    options.UseExponentialBackoff = true;
    
    // Circuit breaker configuration
    options.CircuitBreakerEnabled = true;
    options.CircuitBreakerFailureThreshold = 5;
    options.CircuitBreakerSamplingDurationSeconds = 30;
    options.CircuitBreakerMinimumThroughput = 10;
    options.CircuitBreakerDurationSeconds = 30;
});

Retry Policy Behavior

  • Retries transient MongoDB errors (timeouts, connection failures)
  • Uses exponential backoff with jitter so that many clients do not retry at the same instant (the thundering-herd problem)
  • Configurable retry count and delay limits
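
As a rough illustration of how the retry settings might combine, the sketch below computes the wait before each retry. It is not the package's implementation: the RetryDelaySketch helper is hypothetical, and it assumes the delay doubles per attempt, is capped at the maximum, and has the jitter added after the cap.

```csharp
using System;

// Hypothetical helper, not part of MillerByte.Logging.Api.
public static class RetryDelaySketch
{
    // attempt is 1-based: the first retry waits roughly initialDelayMs.
    public static TimeSpan Compute(
        int attempt, int initialDelayMs, int maxDelayMs, int jitterMaxMs, Random random)
    {
        if (attempt < 1) throw new ArgumentOutOfRangeException(nameof(attempt));

        // Double the base delay on each attempt: 100, 200, 400, ...
        double exponential = initialDelayMs * Math.Pow(2, attempt - 1);

        // Cap the backoff so late retries do not wait without bound.
        double capped = Math.Min(exponential, maxDelayMs);

        // Assumption: jitter of 0..jitterMaxMs is added after the cap,
        // so clients that failed together do not retry in lockstep.
        int jitter = random.Next(0, jitterMaxMs + 1);

        return TimeSpan.FromMilliseconds(capped + jitter);
    }
}
```

Under these assumptions, and with the values shown above (100 ms initial, 2000 ms maximum, 100 ms jitter), the third retry would wait between 400 and 500 ms.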

Circuit Breaker Behavior

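  • Opens once MongoDB failures reach the configured failure threshold
  • Fails fast while open, which prevents cascading failures when MongoDB is unavailable
  • Automatically probes for recovery after the break duration (CircuitBreakerDurationSeconds) elapses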
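
The closed, open, and half-open cycle described above can be sketched as a small state machine. This is a simplified illustration, not Polly's or the package's implementation: CircuitBreakerSketch and BreakerState are hypothetical names, the breaker counts consecutive failures only (the sampling duration and minimum throughput settings are not modeled), and it is not thread-safe.

```csharp
using System;

public enum BreakerState { Closed, Open, HalfOpen }

// Hypothetical, simplified breaker for illustration only.
public sealed class CircuitBreakerSketch
{
    private readonly int _failureThreshold;
    private readonly TimeSpan _breakDuration;
    private readonly Func<DateTimeOffset> _clock;
    private int _failures;
    private DateTimeOffset _openedAt;

    public BreakerState State { get; private set; } = BreakerState.Closed;

    public CircuitBreakerSketch(
        int failureThreshold, TimeSpan breakDuration, Func<DateTimeOffset> clock)
    {
        _failureThreshold = failureThreshold;
        _breakDuration = breakDuration;
        _clock = clock;
    }

    // Returns true when a call may proceed.
    public bool AllowRequest()
    {
        // After the break duration, move to half-open so probe calls can run.
        if (State == BreakerState.Open && _clock() - _openedAt >= _breakDuration)
            State = BreakerState.HalfOpen;

        return State != BreakerState.Open;
    }

    public void RecordSuccess()
    {
        // Any success (including a successful probe) closes the circuit.
        _failures = 0;
        State = BreakerState.Closed;
    }

    public void RecordFailure()
    {
        _failures++;

        // A failed probe reopens immediately; otherwise open at the threshold.
        if (State == BreakerState.HalfOpen || _failures >= _failureThreshold)
        {
            State = BreakerState.Open;
            _openedAt = _clock();
        }
    }
}
```

Passing the clock in as a delegate keeps the sketch testable without real waiting.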

Fallback Logging

Continue capturing logs even when MongoDB is unavailable:

options.EnableFallbackLogging = true;
options.FallbackLogFilePath = "/var/log/apilogging-fallback/";

// When MongoDB fails, logs are written to:
// /var/log/apilogging-fallback/actions_{timestamp}.jsonl
// /var/log/apilogging-fallback/sessions_{timestamp}.jsonl
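
To show what a .jsonl fallback file holds, here is a minimal sketch of appending one entry per line. FallbackWriterSketch is a hypothetical helper, and the hourly timestamp format in the file name is an assumption; the package's actual {timestamp} format is not documented here.

```csharp
using System;
using System.Globalization;
using System.IO;
using System.Text.Json;

// Hypothetical helper, not part of MillerByte.Logging.Api.
public static class FallbackWriterSketch
{
    // Appends one entry as a single JSON line and returns the file path used.
    public static string Append<T>(
        string directory, string prefix, T entry, DateTimeOffset now)
    {
        Directory.CreateDirectory(directory); // no-op if it already exists

        // Assumed timestamp format: one file per UTC hour.
        string stamp = now.UtcDateTime.ToString("yyyyMMddHH", CultureInfo.InvariantCulture);
        string path = Path.Combine(directory, $"{prefix}_{stamp}.jsonl");

        // JSON Lines: one compact JSON document per line.
        string line = JsonSerializer.Serialize(entry);
        File.AppendAllText(path, line + "\n");
        return path;
    }
}
```

Because each line is a complete JSON document, a later replay job can read the file line by line and re-insert the entries once MongoDB is reachable again.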

Health Checks

Register health checks for monitoring:

// In Program.cs
builder.Services.AddApiLogging(options => { /* ... */ })
    .AddApiLoggingHealthChecks();

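// Configure endpoints
// (HealthCheckOptions is in Microsoft.AspNetCore.Diagnostics.HealthChecks)
app.MapHealthChecks("/health/live", new HealthCheckOptions
{
    // Liveness: run no checks, so the endpoint reports Healthy whenever the app can respond
    Predicate = _ => false
});

app.MapHealthChecks("/health/ready", new HealthCheckOptions
{
    // Readiness: run only the checks tagged "ready"
    Predicate = check => check.Tags.Contains("ready")
});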

Health Check Details

The health check verifies:

  • MongoDB connection is available
  • Database can be queried
  • Required collections exist

OpenTelemetry Integration

Enable distributed tracing with OpenTelemetry:

builder.Services.AddApiLogging(options =>
{
    options.EnableOpenTelemetry = true;
    
    // Trace/span IDs are automatically captured from Activity.Current
    // and stored with each log entry
});

// OpenTelemetry setup
builder.Services.AddOpenTelemetry()
    .WithTracing(tracing =>
    {
        tracing
            .AddAspNetCoreInstrumentation()
            .AddHttpClientInstrumentation()
            .AddSource("MillerByte.Logging.Api")  // Logging package source
            .AddOtlpExporter();
    });

Captured Tracing Data

Each API action log includes:

  • TraceId - W3C trace ID
  • SpanId - Current span ID
  • ParentSpanId - Parent span ID
  • TraceState - W3C trace state

{
  "TraceId": "abcd1234567890abcd1234567890abcd",
  "SpanId": "1234567890abcdef",
  "ParentSpanId": "fedcba0987654321",
  "TraceState": "congo=t61rcWkgMzE"
}
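
The fields above map onto properties of System.Diagnostics.Activity. The sketch below shows one way they could be read from Activity.Current; TraceContextSketch and TraceCapture are hypothetical names, and the package's own capture code may differ.

```csharp
using System.Diagnostics;

// Hypothetical shape for the captured fields.
public sealed record TraceContextSketch(
    string? TraceId, string? SpanId, string? ParentSpanId, string? TraceState);

public static class TraceCapture
{
    // Reads W3C trace context from the ambient activity, if there is one.
    public static TraceContextSketch FromCurrentActivity()
    {
        Activity? activity = Activity.Current;
        if (activity is null)
            return new TraceContextSketch(null, null, null, null);

        return new TraceContextSketch(
            activity.TraceId.ToString(),      // 32 hex characters in W3C format
            activity.SpanId.ToString(),       // 16 hex characters in W3C format
            activity.ParentSpanId.ToString(),
            activity.TraceStateString);       // null when no tracestate was propagated
    }
}
```

Outside of a traced request Activity.Current is null, so the sketch returns empty fields rather than throwing.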

Background Processing

Configure the background channel for batch log processing:

options.ChannelCapacity = 10000;         // Max queued items
options.MaxBatchSize = 100;              // Batch size for writes
options.BatchFlushIntervalMs = 1000;     // Max wait between flushes
options.ProcessorWorkerCount = 4;        // Parallel workers
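
To make the interaction between MaxBatchSize and BatchFlushIntervalMs concrete, the following sketch shows a single worker loop that flushes when the batch is full or when the interval elapses, whichever comes first. BatchReaderSketch is a hypothetical illustration built on System.Threading.Channels, not the package's processor; the writeBatch delegate stands in for the MongoDB write.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Channels;
using System.Threading.Tasks;

// Hypothetical worker loop for illustration only.
public static class BatchReaderSketch
{
    // Runs until the channel is completed and drained.
    public static async Task RunAsync<T>(
        ChannelReader<T> reader, int maxBatchSize, TimeSpan flushInterval,
        Func<IReadOnlyList<T>, Task> writeBatch)
    {
        var batch = new List<T>(maxBatchSize);

        // Wait (without a deadline) for the first item of the next batch.
        while (await reader.WaitToReadAsync())
        {
            // The flush window starts once data is available.
            using var window = new CancellationTokenSource(flushInterval);
            try
            {
                while (batch.Count < maxBatchSize &&
                       await reader.WaitToReadAsync(window.Token))
                {
                    while (batch.Count < maxBatchSize && reader.TryRead(out var item))
                        batch.Add(item);
                }
            }
            catch (OperationCanceledException)
            {
                // Flush interval elapsed: write whatever has been collected.
            }

            if (batch.Count > 0)
            {
                await writeBatch(batch.ToArray());
                batch.Clear();
            }
        }
    }
}
```

Running several such loops against the same reader would correspond to a worker count greater than one, since a channel reader can be shared by multiple consumers.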

Channel Overflow Behavior

// When the channel is full, choose one of:
options.DropLogsOnOverflow = true;   // Discard new logs
// OR
options.DropLogsOnOverflow = false;  // Block until space is available

// Overflow events
// The service logs a warning whenever items are dropped
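
The two overflow modes correspond closely to the full modes of a bounded channel in System.Threading.Channels. The sketch below assumes that mapping for illustration; OverflowChannelSketch is a hypothetical helper, and whether the package is built this way is not confirmed by this page. The itemDropped callback overload requires .NET 6 or later.

```csharp
using System;
using System.Threading.Channels;

// Hypothetical helper, not part of MillerByte.Logging.Api.
public static class OverflowChannelSketch
{
    public static Channel<T> Create<T>(
        int capacity, bool dropOnOverflow, Action<T>? onDropped = null)
    {
        var options = new BoundedChannelOptions(capacity)
        {
            // DropWrite discards the item being written when the channel is full.
            // Wait makes WriteAsync pause until a reader frees a slot.
            FullMode = dropOnOverflow
                ? BoundedChannelFullMode.DropWrite
                : BoundedChannelFullMode.Wait
        };

        // The callback runs for every discarded item, which makes it a
        // natural place to log a warning or increment a counter.
        return Channel.CreateBounded<T>(options, onDropped);
    }
}
```

Note that in DropWrite mode TryWrite still returns true for a discarded item, so the callback is the only signal that something was dropped.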

MongoDB Connection Resilience

options.MongoClientSettings = new MongoClientSettings
{
    // Connection pooling
    MaxConnectionPoolSize = 100,
    MinConnectionPoolSize = 10,
    
    // Timeouts
    ConnectTimeout = TimeSpan.FromSeconds(10),
    ServerSelectionTimeout = TimeSpan.FromSeconds(30),
    SocketTimeout = TimeSpan.FromSeconds(60),
    
    // Read/Write settings
    ReadPreference = ReadPreference.SecondaryPreferred,
    WriteConcern = WriteConcern.WMajority,
    
    // Retry
    RetryWrites = true,
    RetryReads = true
};

Metrics

The logging service exposes metrics for monitoring:

// Available metrics (when OpenTelemetry is enabled):
// - apilogging_actions_logged_total
// - apilogging_sessions_created_total
// - apilogging_channel_queue_size
// - apilogging_batch_write_duration_ms
// - apilogging_mongodb_errors_total
// - apilogging_circuit_breaker_state
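
For readers unfamiliar with how such metrics are produced, the sketch below defines three of them with System.Diagnostics.Metrics. It is an assumption-laden illustration: LoggingMetricsSketch is a hypothetical class, the instrument types (counter, histogram, observable gauge) are guesses based on the metric names, and the meter name is taken from the AddMeter call used elsewhere on this page.

```csharp
using System;
using System.Diagnostics.Metrics;

// Hypothetical instrument definitions; the package's real ones may differ.
public sealed class LoggingMetricsSketch : IDisposable
{
    private readonly Meter _meter = new("MillerByte.Logging.Api");
    private readonly Counter<long> _actionsLogged;
    private readonly Histogram<double> _batchWriteDuration;

    public LoggingMetricsSketch(Func<int> queueSize)
    {
        _actionsLogged = _meter.CreateCounter<long>("apilogging_actions_logged_total");
        _batchWriteDuration = _meter.CreateHistogram<double>(
            "apilogging_batch_write_duration_ms", unit: "ms");

        // Observable gauges are sampled by the exporter on each collection,
        // so the callback should be cheap.
        _meter.CreateObservableGauge("apilogging_channel_queue_size", queueSize);
    }

    // Call after each batch write completes.
    public void RecordBatch(int actionCount, double elapsedMs)
    {
        _actionsLogged.Add(actionCount);
        _batchWriteDuration.Record(elapsedMs);
    }

    public void Dispose() => _meter.Dispose();
}
```

An exporter only sees these instruments if the meter name is registered, which is what the AddMeter("MillerByte.Logging.Api") call in the OpenTelemetry setup does.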

Monitoring Dashboard Setup

Example Prometheus/Grafana queries:

# Actions logged per second, averaged over the last minute
# (multiply by 60 for a per-minute figure)
rate(apilogging_actions_logged_total[1m])

# Queue depth (should stay low)
apilogging_channel_queue_size

# MongoDB error rate
rate(apilogging_mongodb_errors_total[5m])

# Circuit breaker in open state (alert when this query returns any series)
apilogging_circuit_breaker_state == 1

Complete Resilient Configuration

builder.Services.AddApiLogging(options =>
{
    // Connection
    options.ConnectionString = builder.Configuration["MongoDB:ConnectionString"];
    options.DatabaseName = "ApiLogs";
    
    // Polly resilience
    options.UsePollyResilience = true;
    options.RetryCount = 3;
    options.UseExponentialBackoff = true;
    options.CircuitBreakerEnabled = true;
    options.CircuitBreakerFailureThreshold = 5;
    options.CircuitBreakerDurationSeconds = 30;
    
    // Fallback
    options.EnableFallbackLogging = true;
    options.FallbackLogFilePath = "/var/log/apilogging/";
    
    // Background processing
    options.ChannelCapacity = 10000;
    options.MaxBatchSize = 100;
    options.BatchFlushIntervalMs = 500;
    options.ProcessorWorkerCount = 4;
    options.DropLogsOnOverflow = true;
    
    // Observability
    options.EnableOpenTelemetry = true;
})
.AddApiLoggingHealthChecks();

// OpenTelemetry
builder.Services.AddOpenTelemetry()
    .WithTracing(tracing =>
    {
        tracing
            .AddAspNetCoreInstrumentation()
            .AddSource("MillerByte.Logging.Api")
            .AddOtlpExporter(otlp =>
            {
                otlp.Endpoint = new Uri("http://otel-collector:4317");
            });
    })
    .WithMetrics(metrics =>
    {
        metrics
            .AddAspNetCoreInstrumentation()
            .AddMeter("MillerByte.Logging.Api")
            .AddOtlpExporter();
    });

Graceful Shutdown

The background processor handles graceful shutdown automatically:

// On application shutdown:
// 1. Channel stops accepting new items
// 2. Remaining items in channel are flushed
// 3. Final batch is written to MongoDB
// 4. Service reports completion

// Configure shutdown timeout
options.ShutdownTimeoutMs = 30000;  // 30 seconds max
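
The shutdown steps above can be sketched with a channel and a bounded wait. This is an illustration under assumptions, not the package's processor: ShutdownSketch is a hypothetical helper, and workerTask stands for whatever task is reading from the channel and writing batches.

```csharp
using System;
using System.Threading.Channels;
using System.Threading.Tasks;

// Hypothetical helper, not part of MillerByte.Logging.Api.
public static class ShutdownSketch
{
    // Returns true if the worker drained the channel before the timeout.
    public static async Task<bool> DrainAsync<T>(
        ChannelWriter<T> writer, Task workerTask, TimeSpan timeout)
    {
        // Step 1: stop accepting new items; queued items remain readable.
        writer.TryComplete();

        // Steps 2 and 3: give the worker up to `timeout` to flush what is left.
        Task finished = await Task.WhenAny(workerTask, Task.Delay(timeout));

        // Step 4: report whether the flush completed in time.
        if (finished != workerTask) return false;

        await workerTask; // surface any exception thrown by the worker
        return true;
    }
}
```

A false result means some queued items may not have been written, which is the case the shutdown timeout is meant to bound.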