install
source · Clone the upstream repo
git clone https://github.com/ai-analyst-lab/ai-analyst
Claude Code · Install into ~/.claude/skills/
T=$(mktemp -d) && git clone --depth=1 https://github.com/ai-analyst-lab/ai-analyst "$T" && mkdir -p ~/.claude/skills && cp -r "$T/.claude/skills/metric-spec" ~/.claude/skills/ai-analyst-lab-ai-analyst-metric-spec && rm -rf "$T"
manifest:
.claude/skills/metric-spec/skill.mdsource content
Skill: Metric Spec
Purpose
Define any metric clearly and completely using a standardized template so there is no ambiguity about what is being measured, how it's calculated, or how to interpret it.
When to Use
Apply this skill when defining a new metric, when a metric is referenced without a clear definition, or when different people are using the same metric name to mean different things. Every metric used in an analysis should have a spec.
Instructions
Metric Spec Template
## Metric: [Name] ### Definition **Plain English:** [One sentence a non-technical person can understand] **Formula:** [Exact calculation] ### Components | Component | Definition | Source | |-----------|-----------|--------| | **Numerator** | [What's being counted/summed in the top] | [Table.column] | | **Denominator** | [What's being counted in the bottom (if ratio)] | [Table.column] | | **Unit of analysis** | [What does one row represent?] | [e.g., per user, per session, per order] | ### Segmentation Dimensions | Dimension | Values | Why | |-----------|--------|-----| | [e.g., Device type] | [mobile, desktop, tablet] | [Different UX → different conversion] | | [e.g., Acquisition channel] | [organic, paid, referral] | [Different intent → different behavior] | | [e.g., Geography] | [US, EU, APAC] | [Different markets → different baselines] | ### Data Source - **Primary table:** [schema.table_name] - **Key columns:** [list] - **Refresh cadence:** [real-time / hourly / daily / weekly] - **Latency:** [how delayed is the data?] - **Reference query:** [SQL query that computes this metric — the canonical implementation] ### Thresholds | Condition | Value | Action | |-----------|-------|--------| | **Healthy** | [e.g., >3.5%] | No action needed | | **Watch** | [e.g., 2.5-3.5%] | Monitor weekly, investigate if persists >2 weeks | | **Investigate** | [e.g., <2.5%] | Root cause analysis within 48 hours | | **Alert** | [e.g., <1.5%] | Escalate to leadership, immediate investigation | ### Known Limitations - [Limitation 1: e.g., "Does not include guest checkouts — only registered users"] - [Limitation 2: e.g., "Affected by bot traffic; filter using is_bot flag"] - [Limitation 3: e.g., "Denominator changes when new markets launch — compare like-for-like"] ### Related Metrics - [Upstream: what drives this metric?] - [Downstream: what does this metric drive?] - [Alternative: other ways to measure the same concept] ### Driver Decomposition (Optional) If this is a key business metric, decompose it into its drivers to enable faster diagnosis when the metric changes. **Decomposition type:** [Multiplicative / Additive] | Driver | Formula | Relationship | Data Source | |--------|---------|-------------|-------------| | [driver 1] | [formula] | [× / +] | [table.column] | | [driver 2] | [formula] | [× / +] | [table.column] | | [driver 3] | [formula] | [× / +] | [table.column] | **Diagnostic rule:** If [parent metric] drops, check these drivers in order: 1. [driver 1] — [why this is the most likely cause / highest leverage] 2. [driver 2] — [what changes in this driver would look like] 3. [driver 3] — [least common but possible] **Verification:** [parent metric] = [driver 1] × [driver 2] × [driver 3] (for multiplicative) or [parent metric] = [driver 1] + [driver 2] + [driver 3] (for additive)
Writing Rules
- Definition must be unambiguous — two different analysts reading the spec should write the same SQL
- Always specify the denominator — "conversion rate" is meaningless without knowing what's in the denominator (visitors? sessions? users?)
- Always specify the time window — "DAU" measured daily is different from "DAU" measured as a 7-day average
- Always specify exclusions — which users/events are filtered out? (test accounts, internal users, bots)
- Thresholds should be based on historical data — not gut feel. State the basis: "Based on 6-month average of 3.8% ± 0.4%"
Examples
Example 1: Conversion Rate
## Metric: Checkout Conversion Rate ### Definition **Plain English:** The percentage of users who visit the checkout page and complete a purchase. **Formula:** (Users who completed purchase) / (Users who viewed checkout page) × 100 ### Components | Component | Definition | Source | |-----------|-----------|--------| | **Numerator** | Distinct users with a `purchase_completed` event within 24h of checkout view | events.event_type = 'purchase_completed' | | **Denominator** | Distinct users with a `checkout_viewed` event | events.event_type = 'checkout_viewed' | | **Unit of analysis** | Per user per day (deduplicated — a user counts once even with multiple checkout views) | ### Segmentation Dimensions | Dimension | Values | Why | |-----------|--------|-----| | Device type | mobile, desktop, tablet | Mobile checkout has different UX friction | | Payment method | credit card, PayPal, Apple Pay | Different failure rates by method | | New vs returning | first purchase, repeat | Different conversion baselines | ### Data Source - **Primary table:** analytics.events - **Key columns:** user_id, event_type, event_timestamp, device_type, properties.payment_method - **Refresh cadence:** Hourly - **Latency:** ~2 hours from event to availability ### Thresholds | Condition | Value | Action | |-----------|-------|--------| | **Healthy** | >3.5% | No action | | **Watch** | 2.5-3.5% | Monitor; check if specific segment is dragging | | **Investigate** | <2.5% | Root cause within 48h; check payment processor, page load times | | **Alert** | <1.5% | Immediate escalation; likely a bug or outage | ### Known Limitations - Does not include guest checkouts (only logged-in users) - 24h attribution window means some slow purchasers are excluded - Bot filtering depends on `is_bot` flag accuracy (~95% reliable)
Example 2: Revenue Metric
## Metric: Monthly Recurring Revenue (MRR) ### Definition **Plain English:** The total monthly revenue from all active subscriptions, normalized to a monthly rate. **Formula:** SUM(active_subscriptions × monthly_equivalent_price) as of the last day of the month ### Components | Component | Definition | Source | |-----------|-----------|--------| | **Numerator** | Sum of monthly-equivalent price for all subscriptions with status='active' on the measurement date | subscriptions.price / (billing_interval_months) | | **Denominator** | N/A (absolute metric, not a ratio) | — | | **Unit of analysis** | Per month, measured on last calendar day | ### Segmentation Dimensions | Dimension | Values | Why | |-----------|--------|-----| | Plan tier | free, starter, pro, enterprise | Different ARPU and churn dynamics | | Billing interval | monthly, annual | Annual has lower churn but deferred revenue | | Cohort month | signup month | Tracks retention and expansion by cohort | ### Thresholds | Condition | Value | Action | |-----------|-------|--------| | **Healthy** | MoM growth >3% | On track for annual targets | | **Watch** | MoM growth 0-3% | Dig into new vs expansion vs churn components | | **Investigate** | MoM growth <0% | Net churn exceeding new business — root cause urgently | ### Known Limitations - Annual subscriptions are divided by 12 for monthly equivalent; actual cash flow differs - Does not include one-time fees, implementation fees, or overages - Enterprise custom pricing may lag in system — verify against finance for board reporting
Example 3: Engagement Metric
## Metric: DAU/MAU Ratio (Stickiness) ### Definition **Plain English:** The percentage of monthly users who use the product on any given day. Higher = more habitual usage. **Formula:** (Average daily active users in the month) / (Monthly active users) × 100 ### Components | Component | Definition | Source | |-----------|-----------|--------| | **Numerator** | Average of daily distinct users with ≥1 meaningful action, averaged across all days in the month | AVG(daily_active_users) where action ∈ meaningful_actions | | **Denominator** | Distinct users with ≥1 meaningful action in the entire month | COUNT(DISTINCT user_id) for the month | | **Unit of analysis** | Per month | ### Segmentation Dimensions | Dimension | Values | Why | |-----------|--------|-----| | User tenure | <30d, 30-90d, 90-365d, >365d | New users have different patterns | | Plan tier | free, paid | Paid users should be stickier | | Platform | web, iOS, Android | Mobile tends to be stickier | ### Thresholds | Condition | Value | Action | |-----------|-------|--------| | **Healthy** | >25% | Strong daily habit (comparable to social apps) | | **Watch** | 15-25% | Typical for B2B SaaS; look for improvement opportunities | | **Investigate** | <15% | Weak daily habit; investigate activation and feature adoption | ### Known Limitations - "Meaningful action" definition matters enormously — login alone should NOT count - Weekday/weekend patterns affect daily averages; consider business-day-only variant for B2B - Bots and automated API calls must be excluded or this metric is inflated
Example 4: Metric with Driver Decomposition
## Metric: Revenue ### Definition **Plain English:** Total revenue from completed orders in a period. **Formula:** COUNT(orders) × AVG(order_value) ### Components | Component | Definition | Source | |-----------|-----------|--------| | **Numerator** | Sum of total_amount for orders with status='completed' | orders.total_amount WHERE status='completed' | | **Denominator** | N/A (absolute metric) | — | | **Unit of analysis** | Per month | ### Driver Decomposition **Decomposition type:** Multiplicative Revenue = Active Users × Orders per User × Average Order Value | Driver | Formula | Relationship | Data Source | |--------|---------|-------------|-------------| | Active Users | COUNT(DISTINCT user_id) with ≥1 order in period | × | orders.user_id | | Orders per User | COUNT(orders) / COUNT(DISTINCT user_id) | × | orders | | Average Order Value | SUM(total_amount) / COUNT(orders) | × | orders.total_amount | **Diagnostic rule:** If Revenue drops, check these drivers in order: 1. Active Users — did fewer users place orders? (acquisition or retention problem) 2. Orders per User — did users buy less frequently? (engagement or value problem) 3. Average Order Value — did users spend less per order? (pricing, mix shift, or promo problem) **Verification:** Revenue = Active Users × Orders per User × AOV
Auto-Registration in Metric Dictionary
After writing a metric spec, automatically register it in the metric dictionary:
- Read
to get the active dataset ID..knowledge/active.yaml - Check
exists. If not, create it..knowledge/datasets/{active}/metrics/index.yaml - Generate a metric
from the metric name: lowercase, hyphens, no spaces (e.g., "Checkout Conversion Rate" →id
).checkout-conversion-rate - If the metric ID already exists in
, update the entry. If new, append it.index.yaml - Write a YAML file at
following the schema in.knowledge/datasets/{active}/metrics/{id}.yaml
. Map metric spec fields to YAML fields:.knowledge/datasets/_metric_schema.yaml
← Formula from specdefinition.formula
← Infer from formula (%, count, currency, ratio)definition.unit
← Infer from thresholds (higher_is_better / lower_is_better)definition.direction
← Primary table from Data Source sectionsource.tables
← Reference query if providedsource.sql
← Segmentation Dimensions column namesdimensions
← Thresholds section valuesguardrails
- Update
with the new/updated entry.index.yaml
Anti-Patterns
- Never define a metric without specifying the denominator — "conversion rate" is meaningless without context
- Never use a metric name that means different things to different teams — if marketing's "conversion" ≠ product's "conversion," create two separate specs
- Never set thresholds without historical data — arbitrary thresholds lead to false alarms or missed problems
- Never skip the "known limitations" section — every metric has caveats, and hiding them doesn't make them go away
- Never use a ratio without understanding what moves the numerator vs. denominator independently — a "improving" conversion rate could mean you lost low-intent traffic, not that you improved the product
Reference Queries for Common Metrics
Use these canonical SQL patterns when computing standard metrics. Replace
{schema} with the active dataset schema (e.g., your_dataset).
Conversion Rate (Event-Based)
-- Conversion rate: % of users who performed action B after action A SELECT COUNT(DISTINCT CASE WHEN b.user_id IS NOT NULL THEN a.user_id END) * 1.0 / NULLIF(COUNT(DISTINCT a.user_id), 0) AS conversion_rate FROM {schema}.events a LEFT JOIN {schema}.events b ON a.user_id = b.user_id AND b.event_type = '{{TARGET_EVENT}}' AND b.timestamp >= a.timestamp AND b.timestamp <= a.timestamp + INTERVAL '{{WINDOW}}' WHERE a.event_type = '{{SOURCE_EVENT}}' AND a.timestamp BETWEEN '{{START_DATE}}' AND '{{END_DATE}}';
Revenue (Order-Based)
-- Total revenue and order count for a period SELECT COUNT(DISTINCT order_id) AS total_orders, SUM(total_amount) AS total_revenue, AVG(total_amount) AS avg_order_value, COUNT(DISTINCT user_id) AS purchasing_users FROM {schema}.orders WHERE status = 'completed' AND order_date BETWEEN '{{START_DATE}}' AND '{{END_DATE}}';
Active Users (DAU / WAU / MAU)
-- Daily/Weekly/Monthly active users SELECT DATE_TRUNC('{{GRANULARITY}}', timestamp) AS period, COUNT(DISTINCT user_id) AS active_users FROM {schema}.events WHERE event_type IN ({{QUALIFYING_EVENTS}}) AND timestamp BETWEEN '{{START_DATE}}' AND '{{END_DATE}}' GROUP BY 1 ORDER BY 1;
Retention Rate (Cohort-Based)
-- Cohort retention: % of users active in period N after signup WITH cohorts AS ( SELECT user_id, DATE_TRUNC('{{GRANULARITY}}', signup_date) AS cohort FROM {schema}.users ), activity AS ( SELECT DISTINCT user_id, DATE_TRUNC('{{GRANULARITY}}', timestamp) AS active_period FROM {schema}.events ) SELECT c.cohort, DATE_DIFF('{{GRANULARITY}}', c.cohort, a.active_period) AS period_number, COUNT(DISTINCT a.user_id) * 1.0 / NULLIF(COUNT(DISTINCT c.user_id), 0) AS retention_rate FROM cohorts c LEFT JOIN activity a ON c.user_id = a.user_id GROUP BY 1, 2 ORDER BY 1, 2;
NPS (Net Promoter Score)
-- Net Promoter Score: % promoters - % detractors SELECT COUNT(CASE WHEN score >= 9 THEN 1 END) * 100.0 / NULLIF(COUNT(*), 0) - COUNT(CASE WHEN score <= 6 THEN 1 END) * 100.0 / NULLIF(COUNT(*), 0) AS nps, COUNT(CASE WHEN score >= 9 THEN 1 END) AS promoters, COUNT(CASE WHEN score BETWEEN 7 AND 8 THEN 1 END) AS passives, COUNT(CASE WHEN score <= 6 THEN 1 END) AS detractors, COUNT(*) AS total_responses FROM {schema}.nps_responses WHERE submitted_at BETWEEN '{{START_DATE}}' AND '{{END_DATE}}';
Usage notes:
- Always replace
with the active dataset's schema prefix{schema} - Replace
placeholders with actual values for the analysis{{VARIABLE}} - These are starting patterns — adapt WHERE clauses and JOINs for your specific data model
- Always validate output with the Data Quality Check skill before drawing conclusions