Learn-skills.dev logging

Logging setup, structured logging, and log management. Use when user asks to "add logging", "set up structured logging", "configure log levels", "create a logger", "set up log rotation", "send logs to ELK", "configure Winston", "set up Pino", "add request logging", "implement audit logging", "log formatting", "log correlation", "debug logging", "log sampling", "log filtering", or mentions logging best practices, structured logging, log aggregation, log levels, observability, log rotation, or centralized logging.

Install

Source · Clone the upstream repo:

git clone https://github.com/NeverSight/learn-skills.dev

Claude Code · Install into ~/.claude/skills/:

T=$(mktemp -d) && git clone --depth=1 https://github.com/NeverSight/learn-skills.dev "$T" && mkdir -p ~/.claude/skills && cp -r "$T/data/skills-md/1mangesh1/dev-skills-collection/logging" ~/.claude/skills/neversight-learn-skills-dev-logging && rm -rf "$T"

Manifest: data/skills-md/1mangesh1/dev-skills-collection/logging/SKILL.md

Source content

Logging

Comprehensive guide to application logging: structured logging, log levels, library choices, centralized aggregation, security considerations, and production-grade patterns across languages and frameworks.

Logging Levels

Standard logging levels from least to most severe:

| Level | Value | Purpose | Example |
|-------|-------|---------|---------|
| TRACE | 10 | Ultra-fine-grained diagnostic detail | Entering/exiting functions, variable values |
| DEBUG | 20 | Diagnostic information for developers | SQL queries, cache lookups, parsed configs |
| INFO | 30 | Normal operational events | Server started, user logged in, job completed |
| WARN | 40 | Unexpected but recoverable situations | Deprecated API used, retry attempt, slow query |
| ERROR | 50 | Failure in a specific operation | Database connection failed, API call timeout |
| FATAL | 60 | System-wide unrecoverable failure | Out of memory, missing critical config, corrupted state |

Level Selection Guidelines

  • Production: Set to INFO (or WARN for high-throughput services).
  • Staging: Set to DEBUG to catch issues before production.
  • Development: Set to TRACE for maximum visibility.
  • Per-module overrides: Allow specific modules to log at a finer level without flooding the entire log.
  • Dynamic level changes: Support runtime log level adjustment without restarts (e.g., via admin API or environment variable reload), as sketched below.
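
A minimal sketch of such a runtime level-change endpoint, assuming an Express app and the Pino logger configured in the Node.js section below (route path and payload shape are illustrative):

// admin-log-level.js - hypothetical runtime log-level endpoint
const express = require('express');
const logger = require('./logger'); // the Pino instance from the Node.js section

const admin = express.Router();

// PUT /admin/log-level with body { "level": "debug" }
admin.put('/log-level', express.json(), (req, res) => {
  const { level } = req.body;
  if (!logger.levels.values[level]) {
    return res.status(400).json({ error: `unknown level: ${level}` });
  }
  logger.level = level; // Pino applies the new minimum level immediately
  return res.json({ level });
});

module.exports = admin;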

Structured Logging (JSON Format)

Always use structured logging in production. Plain text logs are difficult to parse, filter, and aggregate.

Structured Log Entry Example

{
  "timestamp": "2025-09-15T14:23:07.412Z",
  "level": "ERROR",
  "logger": "com.myapp.UserService",
  "message": "Failed to fetch user profile",
  "service": "user-service",
  "environment": "production",
  "version": "2.4.1",
  "host": "ip-10-0-3-42",
  "correlationId": "req_a1b2c3d4e5",
  "traceId": "4bf92f3577b34da6a3ce929d0e0e4736",
  "spanId": "00f067aa0ba902b7",
  "userId": "usr_88291",
  "error": {
    "type": "DatabaseTimeoutError",
    "message": "Connection timed out after 5000ms",
    "stack": "DatabaseTimeoutError: Connection timed out...\n    at Pool.query (/app/db.js:42:11)"
  },
  "duration_ms": 5023,
  "metadata": {
    "db_host": "primary-rds.us-east-1",
    "retry_count": 3
  }
}

Why Structured Logging Matters

  1. Machine-parseable: Log aggregation tools can index and query fields directly.
  2. Consistent schema: Every log entry follows the same shape, enabling reliable alerting.
  3. Contextual richness: Attach arbitrary metadata (user ID, request ID, tenant ID) to each entry.
  4. Filterable: Query logs by level, service, user, time range, or any field combination; the sketch below contrasts this with plain text.
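
A quick contrast, using a Winston-style logger.info(message, meta) API (values are illustrative):

// Unstructured: the user ID and retry count are trapped inside a string
logger.info('Failed to fetch profile for user usr_88291 after 3 retries');

// Structured: each field is independently indexable and filterable
logger.info('Failed to fetch user profile', { userId: 'usr_88291', retryCount: 3 });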

Node.js Logging

Winston

Winston is one of the most widely used Node.js logging libraries, built around a transport-based architecture.

// logger.js - Winston configuration
const winston = require('winston');

const logger = winston.createLogger({
  level: process.env.LOG_LEVEL || 'info',
  format: winston.format.combine(
    winston.format.timestamp({ format: 'YYYY-MM-DDTHH:mm:ss.SSSZ' }),
    winston.format.errors({ stack: true }),
    winston.format.json()
  ),
  defaultMeta: {
    service: process.env.SERVICE_NAME || 'my-app',
    environment: process.env.NODE_ENV || 'development',
    version: process.env.APP_VERSION || '0.0.0',
  },
  transports: [
    // Write errors to a dedicated file
    new winston.transports.File({
      filename: 'logs/error.log',
      level: 'error',
      maxsize: 10 * 1024 * 1024, // 10MB
      maxFiles: 5,
      tailable: true,
    }),
    // Write all logs to combined file
    new winston.transports.File({
      filename: 'logs/combined.log',
      maxsize: 50 * 1024 * 1024, // 50MB
      maxFiles: 10,
      tailable: true,
    }),
  ],
});

// In development, also log to console with colorized output
if (process.env.NODE_ENV !== 'production') {
  logger.add(new winston.transports.Console({
    format: winston.format.combine(
      winston.format.colorize(),
      winston.format.simple()
    ),
  }));
}

module.exports = logger;

Pino

Pino is among the fastest Node.js loggers, optimized for low overhead in production.

// logger.js - Pino configuration
const pino = require('pino');

const logger = pino({
  level: process.env.LOG_LEVEL || 'info',
  formatters: {
    level(label) {
      return { level: label };
    },
    bindings(bindings) {
      return {
        pid: bindings.pid,
        host: bindings.hostname,
        service: process.env.SERVICE_NAME || 'my-app',
      };
    },
  },
  timestamp: pino.stdTimeFunctions.isoTime,
  redact: {
    paths: ['req.headers.authorization', 'req.headers.cookie', 'body.password',
            'body.ssn', 'body.creditCard', '*.token', '*.secret'],
    censor: '[REDACTED]',
  },
  serializers: {
    err: pino.stdSerializers.err,
    req: pino.stdSerializers.req,
    res: pino.stdSerializers.res,
  },
});

module.exports = logger;

Pino with Express Middleware

const pinoHttp = require('pino-http');
const logger = require('./logger');

app.use(pinoHttp({
  logger,
  autoLogging: {
    ignore: (req) => req.url === '/health' || req.url === '/ready',
  },
  customLogLevel: (req, res, err) => {
    if (res.statusCode >= 500 || err) return 'error';
    if (res.statusCode >= 400) return 'warn';
    return 'info';
  },
  customSuccessMessage: (req, res) => {
    return `${req.method} ${req.url} completed with ${res.statusCode}`;
  },
  customErrorMessage: (req, res, err) => {
    return `${req.method} ${req.url} failed with ${res.statusCode}: ${err.message}`;
  },
  customProps: (req) => ({
    correlationId: req.headers['x-correlation-id'] || req.id,
  }),
}));

Bunyan

const bunyan = require('bunyan');

const logger = bunyan.createLogger({
  name: 'my-app',
  level: process.env.LOG_LEVEL || 'info',
  serializers: bunyan.stdSerializers,
  streams: [
    { level: 'info', stream: process.stdout },
    { level: 'error', path: 'logs/error.log' },
    {
      level: 'debug',
      type: 'rotating-file',
      path: 'logs/debug.log',
      period: '1d',
      count: 7,
    },
  ],
});

Python Logging

Standard Library logging

# logging_config.py
import logging
import logging.config
import json
from datetime import datetime, timezone


class JSONFormatter(logging.Formatter):
    """Custom formatter that outputs structured JSON log entries."""

    def format(self, record):
        log_entry = {
            "timestamp": datetime.now(timezone.utc).isoformat(),
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
            "module": record.module,
            "function": record.funcName,
            "line": record.lineno,
        }
        if record.exc_info and record.exc_info[0] is not None:
            log_entry["exception"] = {
                "type": record.exc_info[0].__name__,
                "message": str(record.exc_info[1]),
                "traceback": self.formatException(record.exc_info),
            }
        # Include any extra fields passed via the `extra` parameter
        reserved = logging.LogRecord("", 0, "", 0, "", (), None).__dict__
        for key, value in record.__dict__.items():
            if key not in reserved and key not in ("message", "asctime"):
                log_entry[key] = value
        return json.dumps(log_entry)


LOGGING_CONFIG = {
    "version": 1,
    "disable_existing_loggers": False,
    "formatters": {
        "json": {
            "()": JSONFormatter,
        },
        "standard": {
            "format": "%(asctime)s [%(levelname)s] %(name)s: %(message)s",
        },
    },
    "handlers": {
        "console": {
            "class": "logging.StreamHandler",
            "formatter": "json",
            "stream": "ext://sys.stdout",
        },
        "file": {
            "class": "logging.handlers.RotatingFileHandler",
            "formatter": "json",
            "filename": "logs/app.log",
            "maxBytes": 10485760,  # 10MB
            "backupCount": 5,
        },
        "error_file": {
            "class": "logging.handlers.RotatingFileHandler",
            "formatter": "json",
            "filename": "logs/error.log",
            "maxBytes": 10485760,
            "backupCount": 5,
            "level": "ERROR",
        },
    },
    "root": {
        "level": "INFO",
        "handlers": ["console", "file", "error_file"],
    },
}

logging.config.dictConfig(LOGGING_CONFIG)
logger = logging.getLogger(__name__)

structlog

import structlog
import logging

structlog.configure(
    processors=[
        structlog.contextvars.merge_contextvars,
        structlog.processors.add_log_level,
        structlog.processors.StackInfoRenderer(),
        structlog.dev.set_exc_info,
        structlog.processors.TimeStamper(fmt="iso"),
        structlog.processors.JSONRenderer(),
    ],
    wrapper_class=structlog.make_filtering_bound_logger(logging.INFO),
    context_class=dict,
    logger_factory=structlog.PrintLoggerFactory(),
    cache_logger_on_first_use=True,
)

log = structlog.get_logger()

# Usage with bound context
request_log = log.bind(
    correlation_id="req_a1b2c3",
    user_id="usr_88291",
    service="user-service",
)
request_log.info("processing_request", endpoint="/api/users", method="GET")
request_log.error("request_failed", error="timeout", duration_ms=5023)

Loguru

from loguru import logger
import sys

# Remove default handler and configure structured output
logger.remove()
logger.add(
    sys.stdout,
    format="{time:YYYY-MM-DDTHH:mm:ss.SSS}Z | {level: <8} | {name}:{function}:{line} | {message}",
    level="INFO",
    serialize=True,  # Output as JSON
)
logger.add(
    "logs/app.log",
    rotation="100 MB",
    retention="30 days",
    compression="gz",
    serialize=True,
    level="DEBUG",
)
logger.add(
    "logs/error.log",
    rotation="50 MB",
    retention="90 days",
    compression="gz",
    level="ERROR",
    backtrace=True,
    diagnose=True,
)

# Usage
logger.info("User logged in", user_id="usr_88291", ip="192.168.1.10")

Java Logging

SLF4J + Logback

<!-- logback.xml -->
<configuration>
  <appender name="CONSOLE" class="ch.qos.logback.core.ConsoleAppender">
    <encoder class="net.logstash.logback.encoder.LogstashEncoder">
      <includeMdcKeyName>correlationId</includeMdcKeyName>
      <includeMdcKeyName>userId</includeMdcKeyName>
      <customFields>{"service":"user-service","environment":"${ENV:-dev}"}</customFields>
    </encoder>
  </appender>

  <appender name="FILE" class="ch.qos.logback.core.rolling.RollingFileAppender">
    <file>logs/application.log</file>
    <rollingPolicy class="ch.qos.logback.core.rolling.SizeAndTimeBasedRollingPolicy">
      <fileNamePattern>logs/application.%d{yyyy-MM-dd}.%i.log.gz</fileNamePattern>
      <maxFileSize>100MB</maxFileSize>
      <maxHistory>30</maxHistory>
      <totalSizeCap>3GB</totalSizeCap>
    </rollingPolicy>
    <encoder class="net.logstash.logback.encoder.LogstashEncoder" />
  </appender>

  <root level="INFO">
    <appender-ref ref="CONSOLE" />
    <appender-ref ref="FILE" />
  </root>

  <!-- Per-package log levels -->
  <logger name="com.myapp.repository" level="DEBUG" />
  <logger name="org.hibernate.SQL" level="DEBUG" />
  <logger name="org.springframework.web" level="WARN" />
</configuration>
// Usage with SLF4J
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import org.slf4j.MDC;

public class UserService {
    private static final Logger log = LoggerFactory.getLogger(UserService.class);

    public User getUser(String userId) {
        MDC.put("userId", userId);
        MDC.put("correlationId", RequestContext.getCorrelationId());
        try {
            log.info("Fetching user profile");
            User user = userRepository.findById(userId);
            log.debug("User fetched successfully, roles={}", user.getRoles());
            return user;
        } catch (Exception e) {
            log.error("Failed to fetch user profile", e);
            throw e;
        } finally {
            MDC.clear();
        }
    }
}

Log4j2

<!-- log4j2.xml -->
<Configuration status="WARN">
  <Appenders>
    <Console name="Console" target="SYSTEM_OUT">
      <JsonLayout compact="true" eventEol="true" stacktraceAsString="true">
        <KeyValuePair key="service" value="${env:SERVICE_NAME:-my-app}" />
        <KeyValuePair key="environment" value="${env:ENV:-dev}" />
      </JsonLayout>
    </Console>
    <RollingFile name="File" fileName="logs/app.log"
                 filePattern="logs/app-%d{yyyy-MM-dd}-%i.log.gz">
      <JsonLayout compact="true" eventEol="true" />
      <Policies>
        <SizeBasedTriggeringPolicy size="100MB" />
        <TimeBasedTriggeringPolicy interval="1" modulate="true" />
      </Policies>
      <DefaultRolloverStrategy max="30" />
    </RollingFile>
  </Appenders>
  <Loggers>
    <Root level="info">
      <AppenderRef ref="Console" />
      <AppenderRef ref="File" />
    </Root>
  </Loggers>
</Configuration>

Go Logging

slog (Standard Library, Go 1.21+)

package main

import (
    "context"
    "log/slog"
    "os"
)

func setupLogger() *slog.Logger {
    opts := &slog.HandlerOptions{
        Level:     slog.LevelInfo,
        AddSource: true,
        ReplaceAttr: func(groups []string, a slog.Attr) slog.Attr {
            // Redact sensitive fields
            if a.Key == "password" || a.Key == "token" || a.Key == "secret" {
                a.Value = slog.StringValue("[REDACTED]")
            }
            return a
        },
    }
    handler := slog.NewJSONHandler(os.Stdout, opts)
    return slog.New(handler)
}

func main() {
    logger := setupLogger()
    slog.SetDefault(logger)

    // Basic logging
    slog.Info("server starting", "port", 8080, "version", "2.4.1")

    // With context and groups
    ctx := context.Background()
    slog.InfoContext(ctx, "processing request",
        slog.Group("request",
            slog.String("method", "GET"),
            slog.String("path", "/api/users"),
            slog.String("correlationId", "req_a1b2c3"),
        ),
        slog.Group("user",
            slog.String("id", "usr_88291"),
        ),
    )
}

Zap (Uber)

package main

import (
    "go.uber.org/zap"
    "go.uber.org/zap/zapcore"
    "time"
)

func NewLogger() (*zap.Logger, error) {
    config := zap.Config{
        Level:            zap.NewAtomicLevelAt(zap.InfoLevel),
        Development:      false,
        Encoding:         "json",
        EncoderConfig: zapcore.EncoderConfig{
            TimeKey:        "timestamp",
            LevelKey:       "level",
            NameKey:        "logger",
            CallerKey:      "caller",
            MessageKey:     "message",
            StacktraceKey:  "stacktrace",
            LineEnding:     zapcore.DefaultLineEnding,
            EncodeLevel:    zapcore.LowercaseLevelEncoder,
            EncodeTime:     zapcore.ISO8601TimeEncoder,
            EncodeDuration: zapcore.MillisDurationEncoder,
            EncodeCaller:   zapcore.ShortCallerEncoder,
        },
        OutputPaths:      []string{"stdout", "logs/app.log"},
        ErrorOutputPaths:  []string{"stderr"},
        InitialFields: map[string]interface{}{
            "service": "user-service",
        },
    }
    return config.Build()
}

func main() {
    logger, err := NewLogger()
    if err != nil {
        panic(err) // fail fast if the logger cannot be built
    }
    defer logger.Sync()

    logger.Info("request processed",
        zap.String("method", "GET"),
        zap.String("path", "/api/users"),
        zap.Duration("latency", 42*time.Millisecond),
        zap.Int("status", 200),
    )
}

Zerolog

package main

import (
    "os"
    "time"

    "github.com/rs/zerolog"
    "github.com/rs/zerolog/log"
)

func init() {
    zerolog.TimeFieldFormat = time.RFC3339Nano
    zerolog.SetGlobalLevel(zerolog.InfoLevel)

    log.Logger = zerolog.New(os.Stdout).
        With().
        Timestamp().
        Str("service", "user-service").
        Str("version", "2.4.1").
        Caller().
        Logger()
}

func main() {
    log.Info().
        Str("method", "GET").
        Str("path", "/api/users").
        Dur("latency", 42*time.Millisecond).
        Int("status", 200).
        Msg("request processed")
}

Request/Response Logging Middleware

Express.js Middleware

const { v4: uuidv4 } = require('uuid');
const { AsyncLocalStorage } = require('async_hooks');
const logger = require('./logger');

const asyncLocalStorage = new AsyncLocalStorage();

function requestLoggingMiddleware(req, res, next) {
  const correlationId = req.headers['x-correlation-id'] || uuidv4();
  const startTime = process.hrtime.bigint();

  // Store context for the lifetime of this request
  const context = { correlationId, startTime };
  req.correlationId = correlationId;
  res.setHeader('X-Correlation-Id', correlationId);

  // Log the incoming request
  logger.info('request_received', {
    correlationId,
    method: req.method,
    url: req.originalUrl,
    ip: req.ip,
    userAgent: req.get('user-agent'),
  });

  // Capture response finish
  const originalEnd = res.end;
  res.end = function (...args) {
    const duration = Number(process.hrtime.bigint() - startTime) / 1e6;
    logger.info('request_completed', {
      correlationId,
      method: req.method,
      url: req.originalUrl,
      statusCode: res.statusCode,
      duration_ms: Math.round(duration * 100) / 100,
      contentLength: res.get('content-length'),
    });
    return originalEnd.apply(res, args);
  };

  asyncLocalStorage.run(context, () => next());
}

// Helper to retrieve correlation ID anywhere in the call stack
function getCorrelationId() {
  const store = asyncLocalStorage.getStore();
  return store?.correlationId || 'unknown';
}

module.exports = { requestLoggingMiddleware, getCorrelationId };

Python FastAPI Middleware

import time
import uuid
from contextvars import ContextVar
from starlette.middleware.base import BaseHTTPMiddleware
import structlog

correlation_id_var: ContextVar[str] = ContextVar("correlation_id", default="unknown")
log = structlog.get_logger()


class RequestLoggingMiddleware(BaseHTTPMiddleware):
    async def dispatch(self, request, call_next):
        correlation_id = request.headers.get("X-Correlation-Id", str(uuid.uuid4()))
        correlation_id_var.set(correlation_id)
        structlog.contextvars.bind_contextvars(correlation_id=correlation_id)

        start_time = time.perf_counter()
        log.info(
            "request_received",
            method=request.method,
            path=request.url.path,
            client_ip=request.client.host,
        )

        try:
            response = await call_next(request)
            duration_ms = (time.perf_counter() - start_time) * 1000
            log.info(
                "request_completed",
                method=request.method,
                path=request.url.path,
                status_code=response.status_code,
                duration_ms=round(duration_ms, 2),
            )
            response.headers["X-Correlation-Id"] = correlation_id
            return response
        except Exception as exc:
            duration_ms = (time.perf_counter() - start_time) * 1000
            log.error(
                "request_failed",
                method=request.method,
                path=request.url.path,
                error=str(exc),
                duration_ms=round(duration_ms, 2),
            )
            raise
        finally:
            structlog.contextvars.unbind_contextvars("correlation_id")

Correlation IDs and Distributed Tracing

Propagation Pattern

Client --> API Gateway --> Service A --> Service B --> Database
  |           |               |              |
  |    correlationId     correlationId   correlationId
  |    traceId           traceId         traceId
  |    spanId: span-1    spanId: span-2  spanId: span-3

Implementation Rules

  1. Generate a correlation ID at the system boundary (API gateway or first service to receive the request).
  2. Propagate via HTTP headers (X-Correlation-Id, or traceparent for W3C Trace Context).
  3. Attach to every log entry within that request scope.
  4. Pass downstream in all outgoing HTTP calls, message queue messages, and async tasks, as in the sketch after this list.
  5. Use OpenTelemetry for standardized trace propagation across languages and frameworks.
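
A sketch of rule 4 for outgoing HTTP calls, reusing the getCorrelationId() helper from the Express middleware section above (module path is illustrative):

// http-client.js - propagate the correlation ID on every outgoing request
const axios = require('axios');
const { getCorrelationId } = require('./middleware');

const httpClient = axios.create({ timeout: 5000 });

httpClient.interceptors.request.use((config) => {
  // Inject the current request's correlation ID into downstream calls
  config.headers['X-Correlation-Id'] = getCorrelationId();
  return config;
});

module.exports = httpClient;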

OpenTelemetry Integration

// opentelemetry-setup.js
const { NodeTracerProvider } = require('@opentelemetry/sdk-trace-node');
const { SimpleSpanProcessor } = require('@opentelemetry/sdk-trace-base');
const { OTLPTraceExporter } = require('@opentelemetry/exporter-trace-otlp-http');
const { getNodeAutoInstrumentations } = require('@opentelemetry/auto-instrumentations-node');
const { registerInstrumentations } = require('@opentelemetry/instrumentation');

const provider = new NodeTracerProvider();

// SimpleSpanProcessor exports each span synchronously; prefer BatchSpanProcessor in production
provider.addSpanProcessor(new SimpleSpanProcessor(
  new OTLPTraceExporter({ url: 'http://otel-collector:4318/v1/traces' })
));

provider.register();

registerInstrumentations({
  instrumentations: [getNodeAutoInstrumentations()],
});

Log Rotation and Retention

Linux logrotate Configuration

# /etc/logrotate.d/myapp
/var/log/myapp/*.log {
    daily
    rotate 30
    compress
    delaycompress
    missingok
    notifempty
    # copytruncate keeps the app writing to the same file handle (no signal needed),
    # at the risk of dropping a few lines during truncation; it makes the
    # postrotate reload below optional
    copytruncate
    maxsize 100M
    dateext
    dateformat -%Y%m%d
    postrotate
        /usr/bin/systemctl reload myapp > /dev/null 2>&1 || true
    endscript
}

Retention Policy Guidelines

| Log Type | Retention | Rationale |
|----------|-----------|-----------|
| Application logs | 30 days | Debugging recent issues |
| Access logs | 90 days | Security auditing, compliance |
| Audit logs | 1-7 years | Regulatory compliance (HIPAA, SOX, PCI-DSS) |
| Error logs | 90 days | Root cause analysis |
| Debug logs | 7 days | Short-term diagnostics only |
| Security event logs | 1 year+ | Incident response, forensics |
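
Where logs are stored in Elasticsearch, retention can be enforced automatically with an ILM (Index Lifecycle Management) policy. A minimal sketch that rolls indices daily and deletes them after 30 days (policy name is illustrative):

PUT _ilm/policy/app-logs-30d
{
  "policy": {
    "phases": {
      "hot": {
        "actions": {
          "rollover": { "max_age": "1d", "max_primary_shard_size": "50gb" }
        }
      },
      "delete": {
        "min_age": "30d",
        "actions": { "delete": {} }
      }
    }
  }
}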

Docker Logging Configuration

Set rotation defaults for the json-file driver in /etc/docker/daemon.json (applies to containers created after the daemon restarts):

{
  "log-driver": "json-file",
  "log-opts": {
    "max-size": "50m",
    "max-file": "5",
    "compress": "true",
    "labels": "service,environment"
  }
}

Centralized Logging

ELK Stack (Elasticsearch, Logstash, Kibana)

# docker-compose.yml - ELK stack
version: "3.8"
services:
  elasticsearch:
    image: docker.elastic.co/elasticsearch/elasticsearch:8.11.0
    environment:
      - discovery.type=single-node
      - xpack.security.enabled=false
      - "ES_JAVA_OPTS=-Xms1g -Xmx1g"
    ports:
      - "9200:9200"
    volumes:
      - es-data:/usr/share/elasticsearch/data

  logstash:
    image: docker.elastic.co/logstash/logstash:8.11.0
    volumes:
      - ./logstash/pipeline:/usr/share/logstash/pipeline
    ports:
      - "5044:5044"
    depends_on:
      - elasticsearch

  kibana:
    image: docker.elastic.co/kibana/kibana:8.11.0
    ports:
      - "5601:5601"
    environment:
      - ELASTICSEARCH_HOSTS=http://elasticsearch:9200
    depends_on:
      - elasticsearch

volumes:
  es-data:

Logstash Pipeline

# logstash/pipeline/logstash.conf
input {
  beats { port => 5044 }
  tcp {
    port => 5000
    codec => json_lines
  }
}

filter {
  if [level] == "ERROR" or [level] == "FATAL" {
    mutate { add_tag => ["alert_worthy"] }
  }
  if [duration_ms] and [duration_ms] > 5000 {
    mutate { add_tag => ["slow_request"] }
  }
  # Remove sensitive fields
  mutate {
    remove_field => ["password", "token", "secret", "authorization"]
  }
}

output {
  elasticsearch {
    hosts => ["http://elasticsearch:9200"]
    index => "logs-%{[service]}-%{+YYYY.MM.dd}"
  }
}

Grafana Loki (Lightweight Alternative)

# promtail-config.yml
server:
  http_listen_port: 9080

positions:
  filename: /tmp/positions.yaml

clients:
  - url: http://loki:3100/loki/api/v1/push

scrape_configs:
  - job_name: application
    static_configs:
      - targets: ["localhost"]
        labels:
          job: myapp
          __path__: /var/log/myapp/*.log
    pipeline_stages:
      - json:
          expressions:
            level: level
            service: service
            correlationId: correlationId
      - labels:
          level:
          service:

Datadog Log Collection

# datadog-agent.yaml
logs_enabled: true
logs_config:
  container_collect_all: true
  processing_rules:
    - type: mask_sequences
      name: mask_credit_cards
      replace_placeholder: "[MASKED_CC]"
      pattern: '\b\d{4}[\s-]?\d{4}[\s-]?\d{4}[\s-]?\d{4}\b'
    - type: mask_sequences
      name: mask_ssn
      replace_placeholder: "[MASKED_SSN]"
      pattern: '\b\d{3}-\d{2}-\d{4}\b'

What to Log and What NOT to Log

Always Log

  • Request/response metadata (method, URL, status code, duration)
  • Authentication events (login, logout, failed attempts, token refresh); see the example after this list
  • Authorization failures (access denied)
  • Business-critical operations (order placed, payment processed, user created)
  • Errors and exceptions with full stack traces
  • External service calls (API, database, cache) with latency
  • Configuration changes and deployments
  • Health check results and dependency status
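
For instance, a failed-login event recorded with consistent structured fields (helper and field names are illustrative):

// Hypothetical helper: record an authentication failure with consistent fields
function logFailedLogin(logger, { userId, ip, reason, attempt }) {
  logger.warn('auth_login_failed', {
    event: 'AUTH_LOGIN_FAILED',
    userId,
    ip,
    reason,   // e.g., 'invalid_password', 'account_locked'
    attempt,  // consecutive failures for this account
  });
}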

Never Log

  • Passwords or authentication credentials
  • API keys, tokens, or secrets
  • PII: Social Security numbers, full credit card numbers, date of birth
  • PHI: Medical records, health information (HIPAA)
  • Session tokens or cookies in full
  • Request/response bodies containing sensitive user data
  • Encryption keys or certificates
  • Database connection strings with embedded credentials

Redaction Patterns

// Pino redaction (recommended approach)
const logger = pino({
  redact: {
    paths: [
      'req.headers.authorization',
      'req.headers.cookie',
      'body.password',
      'body.ssn',
      'body.creditCard',
      'body.dateOfBirth',
      '*.apiKey',
      '*.secret',
      '*.token',
    ],
    censor: '[REDACTED]',
  },
});

// Manual redaction utility
function redactSensitive(obj) {
  const sensitiveKeys = /password|secret|token|apikey|authorization|ssn|credit/i;
  return JSON.parse(JSON.stringify(obj, (key, value) => {
    if (sensitiveKeys.test(key)) return '[REDACTED]';
    return value;
  }));
}

Performance Logging (Timing and Metrics)

Timer Utility

class PerformanceLogger {
  constructor(logger) {
    this.logger = logger;
  }

  async time(label, fn, metadata = {}) {
    const start = process.hrtime.bigint();
    try {
      const result = await fn();
      const duration = Number(process.hrtime.bigint() - start) / 1e6;
      this.logger.info('operation_completed', {
        operation: label,
        duration_ms: Math.round(duration * 100) / 100,
        success: true,
        ...metadata,
      });
      return result;
    } catch (error) {
      const duration = Number(process.hrtime.bigint() - start) / 1e6;
      this.logger.error('operation_failed', {
        operation: label,
        duration_ms: Math.round(duration * 100) / 100,
        success: false,
        error: error.message,
        ...metadata,
      });
      throw error;
    }
  }
}

// Usage
const perfLogger = new PerformanceLogger(logger);
const users = await perfLogger.time('fetch_users', () => db.query('SELECT * FROM users'), {
  table: 'users',
  query_type: 'select',
});

Slow Query Detection

import functools
import time
import structlog

log = structlog.get_logger()

def log_slow(threshold_ms=1000):
    """Decorator that logs a warning if a function exceeds the threshold."""
    def decorator(fn):
        @functools.wraps(fn)
        async def wrapper(*args, **kwargs):
            start = time.perf_counter()
            try:
                result = await fn(*args, **kwargs)
                return result
            finally:
                elapsed_ms = (time.perf_counter() - start) * 1000
                if elapsed_ms > threshold_ms:
                    log.warning(
                        "slow_operation",
                        function=fn.__name__,
                        duration_ms=round(elapsed_ms, 2),
                        threshold_ms=threshold_ms,
                    )
        return wrapper
    return decorator

@log_slow(threshold_ms=500)
async def fetch_user_profile(user_id: str):
    return await db.users.find_one({"_id": user_id})

Audit Logging Patterns

Audit Log Schema

{
  "timestamp": "2025-09-15T14:23:07.412Z",
  "eventType": "USER_ROLE_CHANGED",
  "actor": {
    "id": "usr_admin_001",
    "type": "user",
    "ip": "10.0.1.50",
    "userAgent": "Mozilla/5.0..."
  },
  "target": {
    "type": "user",
    "id": "usr_88291",
    "name": "john.doe@example.com"
  },
  "action": "UPDATE",
  "changes": {
    "before": { "role": "viewer" },
    "after": { "role": "admin" }
  },
  "result": "SUCCESS",
  "correlationId": "req_a1b2c3d4e5",
  "metadata": {
    "reason": "Promotion approved by VP Engineering",
    "approvalTicket": "JIRA-4521"
  }
}

Audit Logger Implementation

class AuditLogger {
  constructor(logger, store) {
    this.logger = logger;
    this.store = store; // Persistent, append-only store
  }

  async log({ eventType, actor, target, action, changes, result, metadata }) {
    const entry = {
      timestamp: new Date().toISOString(),
      eventType,
      actor,
      target,
      action,
      changes,
      result,
      correlationId: getCorrelationId(),
      metadata,
    };

    // Write to append-only audit store (database or immutable log)
    await this.store.append(entry);

    // Also send to standard logger for aggregation
    this.logger.info('audit_event', entry);
  }
}

// Usage
await auditLogger.log({
  eventType: 'USER_ROLE_CHANGED',
  actor: { id: req.user.id, type: 'user', ip: req.ip },
  target: { type: 'user', id: targetUserId },
  action: 'UPDATE',
  changes: { before: { role: 'viewer' }, after: { role: 'admin' } },
  result: 'SUCCESS',
  metadata: { reason: 'Promotion', approvalTicket: 'JIRA-4521' },
});

Log Format Standards

Recommended Fields for Every Log Entry

| Field | Type | Required | Description |
|-------|------|----------|-------------|
| timestamp | ISO 8601 string | Yes | When the event occurred |
| level | string | Yes | Log severity level |
| message | string | Yes | Human-readable description |
| service | string | Yes | Name of the emitting service |
| environment | string | Yes | dev, staging, production |
| correlationId | string | Recommended | Request trace identifier |
| traceId | string | Recommended | Distributed trace ID (OpenTelemetry) |
| host | string | Recommended | Hostname or pod name |
| version | string | Recommended | Application version |
| error | object | Conditional | Error details (when level >= ERROR) |
| duration_ms | number | Conditional | Operation duration in milliseconds |

Alert-Worthy Log Patterns

Configure alerts when these patterns appear:

| Pattern | Alert Level | Action |
|---------|-------------|--------|
| FATAL level log | P1 - Critical | Page on-call immediately |
| Error rate > 5% over 5 minutes | P1 - Critical | Page on-call |
| Repeated auth failures from same IP | P2 - High | Notify security team |
| 5xx response rate spike | P2 - High | Page on-call |
| Slow query > 10s | P3 - Medium | Create ticket |
| Disk space for logs > 85% | P3 - Medium | Notify ops team |
| Deprecated API usage increase | P4 - Low | Create ticket for next sprint |
| New unique error message | P4 - Low | Review in daily triage |
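
In practice these checks live in the log aggregator or metrics system (e.g., an Elasticsearch watcher or a Prometheus rule). For illustration, a minimal in-process sketch of the error-rate-over-a-window pattern (class name and thresholds are illustrative):

// error-rate-monitor.js - sliding-window error-rate check
class ErrorRateMonitor {
  constructor({ windowMs = 5 * 60 * 1000, threshold = 0.05, onAlert }) {
    this.windowMs = windowMs;   // 5-minute window
    this.threshold = threshold; // alert above 5% errors
    this.onAlert = onAlert;
    this.events = [];           // { ts, isError }
  }

  record(isError) {
    const now = Date.now();
    this.events.push({ ts: now, isError });
    // Evict events that have aged out of the window
    while (this.events.length && now - this.events[0].ts > this.windowMs) {
      this.events.shift();
    }
    const errors = this.events.filter((e) => e.isError).length;
    const rate = errors / this.events.length;
    if (rate > this.threshold) {
      this.onAlert({ rate, errors, total: this.events.length });
    }
  }
}

// Usage: feed it from the request-completion hook
const logger = require('./logger');
const monitor = new ErrorRateMonitor({
  onAlert: ({ rate, errors, total }) =>
    logger.error('error_rate_exceeded', { rate, errors, total }),
});
// e.g., in middleware: monitor.record(res.statusCode >= 500);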

Contextual Logging

Node.js AsyncLocalStorage (Continuation-Local Storage)

const { v4: uuidv4 } = require('uuid');
const { AsyncLocalStorage } = require('async_hooks');

const asyncLocalStorage = new AsyncLocalStorage();

// Middleware to establish context
function contextMiddleware(req, res, next) {
  const context = {
    correlationId: req.headers['x-correlation-id'] || uuidv4(),
    userId: req.user?.id,
    tenantId: req.headers['x-tenant-id'],
    requestPath: req.path,
  };
  asyncLocalStorage.run(context, next);
}

// Logger wrapper that automatically includes context
function createContextualLogger(baseLogger) {
  // Each level method merges the current async context into the entry
  const withContext = (level) => (msg, extra = {}) => {
    const ctx = asyncLocalStorage.getStore() || {};
    baseLogger[level]({ ...ctx, ...extra, message: msg });
  };
  return {
    info: withContext('info'),
    warn: withContext('warn'),
    error: withContext('error'),
    debug: withContext('debug'),
  };
}

Java MDC (Mapped Diagnostic Context)

import org.slf4j.MDC;
import javax.servlet.*;
import javax.servlet.http.HttpServletRequest;
import java.io.IOException;
import java.util.UUID;

public class MDCFilter implements Filter {
    @Override
    public void doFilter(ServletRequest req, ServletResponse res, FilterChain chain)
            throws IOException, ServletException {
        try {
            HttpServletRequest httpReq = (HttpServletRequest) req;
            String correlationId = httpReq.getHeader("X-Correlation-Id");
            if (correlationId == null) {
                correlationId = UUID.randomUUID().toString();
            }
            MDC.put("correlationId", correlationId);
            MDC.put("userId", getCurrentUserId(httpReq));
            MDC.put("requestPath", httpReq.getRequestURI());

            chain.doFilter(req, res);
        } finally {
            MDC.clear();
        }
    }
}

Python contextvars

import uuid
from contextvars import ContextVar

import structlog

# Define context variables
correlation_id: ContextVar[str] = ContextVar("correlation_id", default="none")
user_id: ContextVar[str] = ContextVar("user_id", default="anonymous")
tenant_id: ContextVar[str] = ContextVar("tenant_id", default="unknown")

# Configure structlog to automatically include contextvars
structlog.configure(
    processors=[
        structlog.contextvars.merge_contextvars,
        structlog.processors.add_log_level,
        structlog.processors.TimeStamper(fmt="iso"),
        structlog.processors.JSONRenderer(),
    ],
)

# In middleware
async def logging_middleware(request, call_next):
    structlog.contextvars.bind_contextvars(
        correlation_id=request.headers.get("X-Correlation-Id", str(uuid.uuid4())),
        user_id=getattr(request.state, "user_id", "anonymous"),
        tenant_id=request.headers.get("X-Tenant-Id", "unknown"),
    )
    response = await call_next(request)
    structlog.contextvars.unbind_contextvars("correlation_id", "user_id", "tenant_id")
    return response

Log Sampling for High-Traffic Systems

In high-throughput services (thousands of requests per second), logging every event becomes cost-prohibitive. Sampling reduces volume while preserving visibility.

Sampling Strategies

| Strategy | Description | Use Case |
|----------|-------------|----------|
| Rate-based | Log 1 in N requests | General traffic reduction |
| Probabilistic | Random chance per event (e.g., 10%) | Uniform sampling across all traffic |
| Priority-based | Always log errors; sample info/debug | Ensure errors are never lost |
| Head-based | Decide at request start, propagate | Consistent per-request view |
| Tail-based | Decide after request completes | Sample only interesting requests (errors, slow) |
| Adaptive | Adjust rate based on current volume | Handle traffic spikes gracefully |

Implementation

class SampledLogger {
  constructor(logger, options = {}) {
    this.logger = logger;
    this.sampleRate = options.sampleRate || 0.1; // 10% of info/debug logs
    this.alwaysLogLevels = new Set(options.alwaysLogLevels || ['warn', 'error', 'fatal']);
    this.rateBasedInterval = options.rateBasedInterval || null; // e.g., 100 = log 1 in 100
    this.counter = 0;
  }

  _shouldLog(level) {
    if (this.alwaysLogLevels.has(level)) return true;
    // Rate-based sampling when an interval is configured
    if (this.rateBasedInterval) {
      return this.counter++ % this.rateBasedInterval === 0;
    }
    // Otherwise probabilistic sampling for non-critical levels
    return Math.random() < this.sampleRate;
  }

  info(msg, meta = {}) {
    if (this._shouldLog('info')) {
      this.logger.info(msg, { ...meta, sampled: true, sampleRate: this.sampleRate });
    }
  }

  debug(msg, meta = {}) {
    if (this._shouldLog('debug')) {
      this.logger.debug(msg, { ...meta, sampled: true, sampleRate: this.sampleRate });
    }
  }

  warn(msg, meta = {}) {
    this.logger.warn(msg, meta); // Always log warnings
  }

  error(msg, meta = {}) {
    this.logger.error(msg, meta); // Always log errors
  }
}

// Usage: only 10% of info/debug logs are emitted
const sampledLogger = new SampledLogger(logger, {
  sampleRate: 0.1,
  alwaysLogLevels: ['warn', 'error', 'fatal'],
});

OpenTelemetry Tail-Based Sampling

# otel-collector-config.yaml
processors:
  tail_sampling:
    decision_wait: 10s
    num_traces: 100000
    policies:
      - name: always-sample-errors
        type: status_code
        status_code:
          status_codes: [ERROR]
      - name: always-sample-slow
        type: latency
        latency:
          threshold_ms: 5000
      - name: probabilistic-sample-rest
        type: probabilistic
        probabilistic:
          sampling_percentage: 10
