install
source · Clone the upstream repo
git clone https://github.com/Intense-Visions/harness-engineering
Claude Code · Install into ~/.claude/skills/
T=$(mktemp -d) && git clone --depth=1 https://github.com/Intense-Visions/harness-engineering "$T" && mkdir -p ~/.claude/skills && cp -r "$T/agents/skills/claude-code/resilience-dead-letter" ~/.claude/skills/intense-visions-harness-engineering-resilience-dead-letter && rm -rf "$T"
manifest: agents/skills/claude-code/resilience-dead-letter/SKILL.md
Dead Letter Queue Pattern
Handle permanently failing messages with dead letter queues for safe inspection, alerting, and reprocessing
When to Use
- Message processing still fails after retries are exhausted
- Poison messages must not block the main queue
- Failed messages need manual inspection to diagnose issues
- An event-driven system must account for every message, including the ones it cannot process
Instructions
- Configure a dead letter queue (DLQ) alongside every main processing queue.
- After N retry attempts (typically 3-5), move the message to the DLQ instead of retrying forever.
- Preserve the original message payload, headers, error details, and attempt count in the DLQ entry.
- Set up alerts on DLQ message count — messages in the DLQ indicate a processing problem.
- Build a reprocessing mechanism to replay DLQ messages back to the main queue after fixing the issue.
- Add a DLQ dashboard for operators to inspect, diagnose, and manually resolve failed messages.
```typescript
// queues/dead-letter.ts
import { randomUUID } from 'node:crypto';

export interface DeadLetterEntry<T> {
  id: string;
  originalQueue: string;
  payload: T;
  error: string;
  attempts: number;
  firstFailedAt: string;
  lastFailedAt: string;
  metadata: Record<string, unknown>;
}

export class DeadLetterQueue<T> {
  // In-memory store for illustration only; entries are lost on restart.
  private entries: Map<string, DeadLetterEntry<T>> = new Map();

  constructor(
    private readonly name: string,
    private readonly onDeadLetter?: (entry: DeadLetterEntry<T>) => void
  ) {}

  // Stores a failed message, assigning its id and lastFailedAt timestamp.
  add(entry: Omit<DeadLetterEntry<T>, 'id' | 'lastFailedAt'>): void {
    const id = randomUUID();
    const deadLetter: DeadLetterEntry<T> = {
      ...entry,
      id,
      lastFailedAt: new Date().toISOString(),
    };
    this.entries.set(id, deadLetter);
    this.onDeadLetter?.(deadLetter);
    console.error(`[DLQ:${this.name}] Message dead-lettered: ${entry.error}`, {
      id,
      attempts: entry.attempts,
    });
  }

  list(): DeadLetterEntry<T>[] {
    return Array.from(this.entries.values());
  }

  get(id: string): DeadLetterEntry<T> | undefined {
    return this.entries.get(id);
  }

  remove(id: string): boolean {
    return this.entries.delete(id);
  }

  // Removes the entry and hands back its payload for replay.
  // The entry is deleted before the caller has reprocessed it, so the
  // caller must add it again if the replay fails.
  reprocess(id: string): T | undefined {
    const entry = this.entries.get(id);
    if (entry) {
      this.entries.delete(id);
      return entry.payload;
    }
    return undefined;
  }

  get count(): number {
    return this.entries.size;
  }
}
```
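```typescript
// workers/order-processor.ts
import { DeadLetterQueue } from '../queues/dead-letter';

interface OrderMessage {
  orderId: string;
  items: Array<{ productId: string; qty: number }>;
}

// `alerting` and `orderService` are application-specific dependencies that this
// example does not define; they are declared here only so the file type-checks.
declare const alerting: {
  send(alert: { severity: string; message: string; context: Record<string, unknown> }): void;
};
declare const orderService: { process(message: OrderMessage): Promise<void> };

const delay = (ms: number) => new Promise<void>((resolve) => setTimeout(resolve, ms));

const dlq = new DeadLetterQueue<OrderMessage>('orders', (entry) => {
  // Alert on dead letter
  alerting.send({
    severity: 'warning',
    message: `Order processing failed: ${entry.error}`,
    context: { orderId: entry.payload.orderId, attempts: entry.attempts },
  });
});

const MAX_RETRIES = 3;

async function processMessage(
  message: OrderMessage,
  attempt = 1,
  firstFailedAt?: string
): Promise<void> {
  try {
    await orderService.process(message);
  } catch (error) {
    // Record the time of the first failure and carry it through the retries.
    const failedAt = firstFailedAt ?? new Date().toISOString();

    if (attempt >= MAX_RETRIES) {
      dlq.add({
        originalQueue: 'orders',
        payload: message,
        error: error instanceof Error ? error.message : String(error),
        attempts: attempt,
        firstFailedAt: failedAt,
        metadata: { lastAttemptError: String(error) },
      });
      return; // Do not rethrow: the message is safely in the DLQ
    }

    // Retry with exponential backoff (2s, then 4s with MAX_RETRIES = 3)
    await delay(1000 * Math.pow(2, attempt));
    return processMessage(message, attempt + 1, failedAt);
  }
}
```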
Details
Cloud provider DLQs:
- AWS SQS: Configure `RedrivePolicy` with `maxReceiveCount` and `deadLetterTargetArn`
- Azure Service Bus: Built-in DLQ subqueue on every queue/subscription
- Google Pub/Sub: Configure `deadLetterPolicy` on the subscription
- RabbitMQ: Declare `x-dead-letter-exchange` on the queue
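As a rough illustration of the SQS item above, the sketch below builds the queue attribute that attaches a DLQ to a main queue. `RedrivePolicy` is passed to SQS as a JSON-encoded string. The helper name `buildRedriveAttributes`, the ARN, and the chosen `maxReceiveCount` are placeholders for this example, and the resulting map would still have to be passed to whatever SDK or infrastructure tool creates the queue.

```typescript
// Hypothetical helper (not part of any AWS SDK): builds the attribute map
// for the *main* queue so that SQS moves repeatedly received messages to the DLQ.
interface RedrivePolicyInput {
  deadLetterTargetArn: string; // ARN of the dead letter queue
  maxReceiveCount: number; // receives allowed before SQS dead-letters the message
}

function buildRedriveAttributes(input: RedrivePolicyInput): Record<string, string> {
  if (!Number.isInteger(input.maxReceiveCount) || input.maxReceiveCount < 1) {
    throw new RangeError('maxReceiveCount must be a positive integer');
  }
  return {
    // SQS expects the policy as a JSON string, not a nested object.
    RedrivePolicy: JSON.stringify({
      deadLetterTargetArn: input.deadLetterTargetArn,
      maxReceiveCount: input.maxReceiveCount,
    }),
  };
}

const attributes = buildRedriveAttributes({
  deadLetterTargetArn: 'arn:aws:sqs:us-east-1:123456789012:orders-dlq', // placeholder ARN
  maxReceiveCount: 5,
});
console.log(attributes.RedrivePolicy);
```

Keeping the policy in one small function makes it easy to apply the same retry limit to every queue, in line with the 3-5 attempts suggested in the instructions.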
BullMQ (Node.js) dead letter pattern:
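```typescript
import { Job, Queue, Worker } from 'bullmq';

// Assumptions not in the original snippet: a local Redis connection, a
// separate 'orders-dlq' queue, and an application-defined `processOrder` handler.
const connection = { host: 'localhost', port: 6379 };
declare const processOrder: (job: Job) => Promise<void>;

const queue = new Queue('orders', { connection });
const dlqQueue = new Queue('orders-dlq', { connection });

const worker = new Worker('orders', processOrder, {
  connection,
  settings: {
    // Jobs must be added with `attempts` and `backoff: { type: 'custom' }` for this to apply.
    // Some BullMQ versions expect a single `settings.backoffStrategy` function
    // instead of this map; check the documentation for the version in use.
    backoffStrategies: {
      custom: (attemptsMade: number) => Math.pow(2, attemptsMade) * 1000,
    },
  },
});

worker.on('failed', async (job, err) => {
  // Treat a job without an explicit `attempts` option as having a single attempt.
  if (job && job.attemptsMade >= (job.opts.attempts ?? 1)) {
    // Retries exhausted: move to DLQ
    await dlqQueue.add('dead-letter', {
      originalJob: job.data,
      error: err.message,
      attempts: job.attemptsMade,
    });
  }
});
```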
Reprocessing strategy, once the underlying issue is fixed:
- Inspect DLQ messages to confirm the fix addresses the failure
- Replay messages one at a time to verify
- Batch replay remaining messages
- Monitor for new DLQ entries
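The replay steps above can be sketched as a staged replay: a few canary messages go first, one at a time, and the rest follow in batches only if every canary succeeds. This is a minimal sketch, not a definitive implementation. `stagedReplay`, `ReplayEntry`, and the `publish` callback (which stands in for "send back to the main queue") are names invented for this example.

```typescript
interface ReplayEntry<T> {
  id: string;
  payload: T;
}

interface ReplayResult {
  replayed: string[]; // ids published successfully
  failed: string[]; // ids whose publish was rejected
  aborted: boolean; // true if a canary failed and the batch stage was skipped
}

// `publish` is a hypothetical callback that sends one payload to the main queue.
async function stagedReplay<T>(
  entries: ReplayEntry<T>[],
  publish: (payload: T) => Promise<void>,
  options: { canaryCount?: number; batchSize?: number } = {}
): Promise<ReplayResult> {
  const canaryCount = options.canaryCount ?? 1;
  const batchSize = options.batchSize ?? 10;
  const result: ReplayResult = { replayed: [], failed: [], aborted: false };

  // Stage 1: replay canaries one at a time and stop at the first failure,
  // since a failing canary suggests the fix does not cover these messages.
  for (const entry of entries.slice(0, canaryCount)) {
    try {
      await publish(entry.payload);
      result.replayed.push(entry.id);
    } catch (_error) {
      result.failed.push(entry.id);
      result.aborted = true;
      return result;
    }
  }

  // Stage 2: replay the remaining entries in batches, recording each outcome.
  const rest = entries.slice(canaryCount);
  for (let i = 0; i < rest.length; i += batchSize) {
    const batch = rest.slice(i, i + batchSize);
    const outcomes = await Promise.all(
      batch.map((entry) =>
        publish(entry.payload).then(
          () => ({ id: entry.id, ok: true }),
          () => ({ id: entry.id, ok: false })
        )
      )
    );
    for (const outcome of outcomes) {
      (outcome.ok ? result.replayed : result.failed).push(outcome.id);
    }
  }
  return result;
}
```

With the in-memory `DeadLetterQueue` from earlier, one option is to pass `dlq.list()` as `entries` and call `dlq.remove(id)` only for the ids in `replayed`, so that a message leaves the DLQ only after it has been published successfully.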
DLQ metrics to track: message count (should trend toward zero), age of the oldest message, inflow rate (new messages per hour), and error category.
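One way to derive these four metrics from a list of DLQ entries is sketched below. It assumes each entry carries an `error` string and an ISO `lastFailedAt` timestamp, as in the `DeadLetterEntry` shape earlier. The `categorize` rules are purely illustrative; a real system would more likely use structured error codes than regular expressions.

```typescript
interface DlqRecord {
  error: string;
  lastFailedAt: string; // ISO timestamp of when the message was dead-lettered
}

interface DlqMetrics {
  count: number;
  oldestAgeMs: number; // age of the oldest entry, 0 when the DLQ is empty
  inflowLastHour: number; // entries dead-lettered within the last hour
  byCategory: Record<string, number>;
}

// Illustrative categorizer: the categories and patterns are assumptions.
function categorize(error: string): string {
  if (/timeout|timed out/i.test(error)) return 'timeout';
  if (/valid|schema|parse/i.test(error)) return 'validation';
  return 'other';
}

function computeDlqMetrics(records: DlqRecord[], now: Date = new Date()): DlqMetrics {
  const nowMs = now.getTime();
  const hourMs = 60 * 60 * 1000;
  let oldestAgeMs = 0;
  let inflowLastHour = 0;
  const byCategory: Record<string, number> = {};

  for (const record of records) {
    const ageMs = nowMs - Date.parse(record.lastFailedAt);
    oldestAgeMs = Math.max(oldestAgeMs, ageMs);
    if (ageMs <= hourMs) inflowLastHour += 1;

    const category = categorize(record.error);
    byCategory[category] = (byCategory[category] ?? 0) + 1;
  }

  return { count: records.length, oldestAgeMs, inflowLastHour, byCategory };
}
```

An alert could then fire when `count` exceeds a threshold or when `oldestAgeMs` shows that entries are sitting unreviewed. The thresholds themselves depend on the system and are not specified here.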
Process
- Read the instructions and examples in this document.
- Apply the patterns to your implementation, adapting to your specific context.
- Verify your implementation against the details and edge cases listed above.
Harness Integration
- Type: knowledge — this skill is a reference document, not a procedural workflow.
- No tools or state — consumed as context by other skills and agents.
Success Criteria
- The patterns described in this document are applied correctly in the implementation.
- Edge cases and anti-patterns listed in this document are avoided.