install
source · Clone the upstream repo
git clone https://github.com/Intense-Visions/harness-engineering
Claude Code · Install into ~/.claude/skills/
T=$(mktemp -d) && git clone --depth=1 https://github.com/Intense-Visions/harness-engineering "$T" && mkdir -p ~/.claude/skills && cp -r "$T/agents/skills/claude-code/otel-metrics-pattern" ~/.claude/skills/intense-visions-harness-engineering-otel-metrics-pattern-ba4318 && rm -rf "$T"
manifest:
agents/skills/claude-code/otel-metrics-pattern/SKILL.md
OpenTelemetry Metrics Pattern
Record application metrics with OpenTelemetry counters, histograms, and gauges for monitoring and alerting
When to Use
- Tracking request rates, error rates, and latency distributions
- Monitoring business metrics (orders processed, payments completed)
- Setting up SLI/SLO monitoring dashboards
- Replacing Prometheus client with vendor-neutral OpenTelemetry metrics
Instructions
- Get a meter from the registered MeterProvider: `metrics.getMeter('service-name', '1.0.0')`.
- Choose the right instrument: Counter for monotonically increasing values, Histogram for distributions, Gauge for point-in-time values.
- Add attributes (labels) to metric recordings for dimensional analysis.
- Keep attribute cardinality low — high cardinality (user IDs as labels) causes metric explosion.
- Use semantic conventions for metric names: `http.server.request.duration`, `http.server.active_requests`.
- Register metric instruments once at startup, then record values throughout the application.
```typescript
// telemetry/metrics.ts
import { metrics, ValueType } from '@opentelemetry/api';

const meter = metrics.getMeter('order-service', '1.0.0');

// Counter — monotonically increasing (total requests, errors)
export const orderCounter = meter.createCounter('orders.created', {
  description: 'Total number of orders created',
  unit: '1',
});

export const orderErrorCounter = meter.createCounter('orders.errors', {
  description: 'Total number of order creation errors',
  unit: '1',
});

// Histogram — distribution of values (latency, request size)
export const orderDurationHistogram = meter.createHistogram('orders.duration', {
  description: 'Order creation duration',
  unit: 'ms',
  valueType: ValueType.DOUBLE,
});

// UpDownCounter — can increase and decrease (active connections, queue depth)
export const activeOrdersGauge = meter.createUpDownCounter('orders.active', {
  description: 'Number of orders currently being processed',
  unit: '1',
});

// Observable gauge — value is read on collection (memory, CPU).
// In the stable @opentelemetry/api, callbacks are attached with addCallback.
const heapGauge = meter.createObservableGauge('process.memory.heap', {
  description: 'Heap memory usage',
  unit: 'By',
});
heapGauge.addCallback((result) => {
  result.observe(process.memoryUsage().heapUsed);
});
```
```typescript
// Usage in service code
export async function createOrder(userId: string, items: OrderItem[]): Promise<Order> {
  const startTime = performance.now();
  activeOrdersGauge.add(1, { 'order.type': 'standard' });
  try {
    const order = await db.orders.create({ userId, items });
    orderCounter.add(1, {
      'order.type': 'standard',
      'order.status': 'created',
      'payment.method': order.paymentMethod,
    });
    return order;
  } catch (error) {
    orderErrorCounter.add(1, {
      'error.type': error instanceof Error ? error.constructor.name : 'unknown',
    });
    throw error;
  } finally {
    activeOrdersGauge.add(-1, { 'order.type': 'standard' });
    orderDurationHistogram.record(performance.now() - startTime, {
      'order.type': 'standard',
    });
  }
}
```
Details
Instrument types:
| Instrument | Type | Example |
|---|---|---|
| Counter | Monotonic sum | Total requests, bytes sent |
| UpDownCounter | Non-monotonic sum | Active connections, queue depth |
| Histogram | Distribution | Request duration, response size |
| Observable Counter | Async monotonic sum | CPU time |
| Observable UpDownCounter | Async non-monotonic | Thread count |
| Observable Gauge | Async point-in-time | Temperature, memory usage |
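The three synchronous instruments are recorded at call sites; the three observable variants are pull-based — callbacks registered at startup run only when the metric reader collects. A minimal plain-TypeScript model of that collection loop (a hypothetical `ObservableRegistry`, not the real OpenTelemetry API, just to illustrate the mechanism):

```typescript
// Plain-TypeScript sketch of pull-based collection: nothing is recorded
// until collect() is called, mirroring how observable instruments behave.
type Observation = { value: number; attributes: Record<string, string> };
type ObservableCallback = () => Observation;

class ObservableRegistry {
  private callbacks: ObservableCallback[] = [];

  // Register once at startup, like meter.createObservableGauge(...).addCallback(...)
  register(cb: ObservableCallback): void {
    this.callbacks.push(cb);
  }

  // Invoked by the metric reader on each export interval.
  collect(): Observation[] {
    return this.callbacks.map((cb) => cb());
  }
}
```

This is why observable instruments suit values that are cheap to read on demand (memory, thread count) rather than events that must be counted as they happen.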
Attribute cardinality: Each unique combination of attributes creates a separate time series. With 10 status codes and 5 methods, you get 50 time series. Adding user ID (100K users) would create 5 million time series — do not do this.
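The arithmetic behind that warning is just a product of distinct values per attribute; a quick sketch:

```typescript
// Each attribute multiplies the time-series count by its number of distinct values.
function timeSeriesCount(distinctValuesPerAttribute: number[]): number {
  return distinctValuesPerAttribute.reduce((total, n) => total * n, 1);
}

// 10 status codes × 5 methods: a manageable 50 series.
console.log(timeSeriesCount([10, 5])); // 50
// Adding a user-ID attribute with 100,000 values: 5 million series.
console.log(timeSeriesCount([10, 5, 100_000])); // 5000000
```

Put high-cardinality identifiers (user ID, request ID) on spans or logs instead, where they do not create a time series per value.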
Recommended metric names:
- `http.server.request.duration` — request latency histogram
- `http.server.active_requests` — concurrent requests gauge
- `http.client.request.duration` — outgoing request latency
- `db.client.operation.duration` — database query duration
- `messaging.process.duration` — message processing time
Histogram bucket configuration: Default buckets work for most cases. Customize for specific SLOs:
```typescript
// View and ExplicitBucketHistogramAggregation come from the metrics SDK
// (@opentelemetry/sdk-metrics 1.x), not from @opentelemetry/api.
import {
  ExplicitBucketHistogramAggregation,
  MeterProvider,
  View,
} from '@opentelemetry/sdk-metrics';

const meterProvider = new MeterProvider({
  views: [
    new View({
      instrumentName: 'http.server.request.duration',
      aggregation: new ExplicitBucketHistogramAggregation([
        5, 10, 25, 50, 100, 250, 500, 1000, 2500, 5000,
      ]),
    }),
  ],
});
```
RED method: Rate (requests/sec), Errors (error rate), Duration (latency distribution). These three metrics cover most monitoring needs for any service.
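All three RED signals can be recorded from a single wrapper. Here is a plain-TypeScript sketch in which the `RedMetrics` record is a stand-in for real instruments — in practice you would call `counter.add` and `histogram.record` instead of mutating fields:

```typescript
// Stand-in for OpenTelemetry instruments, to keep the sketch self-contained.
interface RedMetrics {
  requests: number;      // Rate: count of requests, divided by the window
  errors: number;        // Errors: count of failed requests
  durationsMs: number[]; // Duration: values to feed into a histogram
}

// Wrap any operation so all three RED signals are recorded exactly once.
async function withRed<T>(m: RedMetrics, op: () => Promise<T>): Promise<T> {
  const start = Date.now();
  m.requests += 1;
  try {
    return await op();
  } catch (err) {
    m.errors += 1;
    throw err;
  } finally {
    // Record duration in finally so failed requests are still timed.
    m.durationsMs.push(Date.now() - start);
  }
}
```

The shape mirrors the createOrder example above: increment on entry, count errors in catch, and record duration in finally.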
Source
https://opentelemetry.io/docs/concepts/signals/metrics/
Process
- Read the instructions and examples in this document.
- Apply the patterns to your implementation, adapting to your specific context.
- Verify your implementation against the details and edge cases listed above.
Harness Integration
- Type: knowledge — this skill is a reference document, not a procedural workflow.
- No tools or state — consumed as context by other skills and agents.
Success Criteria
- The patterns described in this document are applied correctly in the implementation.
- Edge cases and anti-patterns listed in this document are avoided.