# mc-agent-toolkit tune-monitor

Analyze a Monte Carlo metric monitor and recommend configuration improvements to reduce alert noise. Fetches a monitor's report, identifies alert patterns, and suggests sensitivity, segment, and schedule changes.

## Install

**Source** · Clone the upstream repo:

```sh
git clone https://github.com/monte-carlo-data/mc-agent-toolkit
```

**Claude Code** · Install into `~/.claude/skills/`:

```sh
T=$(mktemp -d) && git clone --depth=1 https://github.com/monte-carlo-data/mc-agent-toolkit "$T" && mkdir -p ~/.claude/skills && cp -r "$T/skills/tune-monitor" ~/.claude/skills/monte-carlo-data-mc-agent-toolkit-tune-monitor && rm -rf "$T"
```

Manifest: `skills/tune-monitor/SKILL.md`
## Source content

# Tune Monitor: Noise Reduction Analysis

You are a Monte Carlo monitor tuning agent. Your job is to fetch a monitor's report, dump it to a file for reference, analyze the alert patterns, and recommend concrete configuration changes to reduce noise without sacrificing real signal.

Arguments: $ARGUMENTS


## Phase 0: Validate Input

Extract the monitor UUID from `$ARGUMENTS`. It must be a valid UUID (format: `xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx`).
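A minimal shape check for that format, expressed as a regex (the pattern below is illustrative only and is not a field from the Monte Carlo schema):

```yaml
# Illustrative UUID format check: accept the argument only if it matches this pattern (case-insensitive).
uuid_pattern: "^[0-9a-fA-F]{8}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{12}$"
```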

If no UUID is provided or it doesn't look like a UUID, stop and tell the user:

```
Please provide a monitor UUID. Example:
/tune-monitor 94c2dd3a-ef49-40f8-b1c1-741ba057cabf
```


## Phase 1: Fetch Monitor Report

Call `get_monitor_report` with:

- `monitor_uuid`: the UUID from `$ARGUMENTS`
- `max_incidents`: 50

If the tool returns an error or empty result, tell the user the monitor was not found and stop.

Store the full report output. Then write it to a file:

`/tmp/monitor-report-{monitor_uuid}.md`

Tell the user: "Report saved to `/tmp/monitor-report-{monitor_uuid}.md`"

Also fetch the monitor's full config via `get_monitors` with:

- `monitor_ids`: `[{monitor_uuid}]`
- `include_fields`: `[config]`

Run both calls in parallel.
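
For illustration, the two argument sets might look like this (parameter names are the ones listed above; the UUID is the example from Phase 0):

```yaml
# get_monitor_report arguments
monitor_uuid: 94c2dd3a-ef49-40f8-b1c1-741ba057cabf
max_incidents: 50

# get_monitors arguments (issued in parallel with the call above)
monitor_ids:
  - 94c2dd3a-ef49-40f8-b1c1-741ba057cabf
include_fields:
  - config
```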


## Phase 2: Analyze the Report

Analyze the monitor report and config together. Focus on:

### 2a. Alert volume & frequency

- How many incidents in the last 30 days? Last 7 days?
- What is the firing cadence — multiple times per day? Daily? Sporadic?
- Are incidents clustered in time (bursts) or spread evenly?

### 2b. Anomaly patterns

- Which segments (field values) are firing most? Are they the same segments repeatedly?
- Are anomalies consistently marginal (just above threshold) or severe?
- Are any anomalies from sparse/bursty event types that naturally spike?
- Are anomalies caused by known operational events (deployments, batch jobs, bulk user actions)?

### 2c. Current configuration

Extract from the config:

- Monitor type and metric (e.g., `RELATIVE_ROW_COUNT`)
- Segment field(s) and any `where_condition`
- Sensitivity setting (explicit or `AUTO`)
- Schedule interval
- Collection lag
- Audiences / notification channels
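
As a rough illustration, the extracted summary might look like the following (field names follow the bullets above in flattened form; they are not an exact Monte Carlo config schema, and the values are hypothetical):

```yaml
# Hypothetical extraction summary for a segmented row-count monitor (values invented for illustration).
metric: RELATIVE_ROW_COUNT
segment_fields: [event_type]
where_condition: null            # no filter currently set
sensitivity: AUTO                # no explicit override
schedule_interval_minutes: 720   # evaluates every 12h
collection_lag_minutes: 60       # assumed example value
audiences: []                    # no notification routing configured
```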

### 2d. Troubleshooting analysis (if available)

Look at any troubleshooting TL;DRs in the report. Note:

- Are most anomalies assessed as "likely normal data variation"?
- Are there recurring root causes?
- Is there a blind spot (e.g., no upstream metadata)?

## Phase 3: Generate Recommendations

Based on the analysis, produce a prioritized list of recommendations. For each recommendation:

- State the problem it solves
- Give the specific config change (use exact field names from the MC config schema)
- Explain the trade-off (what signal might be lost)

Use this framework to generate recommendations:

### Sensitivity tuning

- If anomalies are consistently marginal (observed value just barely above threshold) AND assessed as normal variation → recommend lowering sensitivity one step (see the sketch after this list):
  - If current sensitivity is `HIGH` → recommend `"sensitivity": "medium"`
  - If current sensitivity is `MEDIUM` or `AUTO` → recommend `"sensitivity": "low"`
- If current sensitivity is already `LOW` and still noisy → note this isn't a sensitivity issue
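
As a config-level sketch (the field name and values come from the bullets above; this is a fragment, not a full monitor config):

```yaml
# One-step reduction for a monitor currently on MEDIUM or AUTO; swap in "medium" if it is on HIGH.
sensitivity: "low"
```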

### WHERE condition / segment exclusion

- If one or more specific segment values fire repeatedly and are assessed as expected behavior (e.g., a sparse/bursty event type, a scheduled batch event) → recommend adding a `where_condition` to exclude them, e.g.:
  `where_condition: "event_type NOT IN ('inactive_monitor', 'agent_evaluation_anom')"`
- If the segment field has very high cardinality with many sparse values → recommend `"high_segment_count": true` or consider removing segmentation (see the fragment after this list)
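
Combined into a single hedged fragment (both fields are named above; whether they apply together depends on the monitor type and current schema):

```yaml
# Exclude known-noisy segment values and flag the segment field as high-cardinality (illustrative fragment).
where_condition: "event_type NOT IN ('inactive_monitor', 'agent_evaluation_anom')"
high_segment_count: true
```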

### Schedule / collection lag / aggregation bucket

- If the monitor fires twice per day but anomalies always resolve within hours → recommend increasing the schedule interval (e.g., from 720 min to 1440 min) to reduce duplicate alerts
- If the monitor aggregates by `hour` and anomalies are caused by sparse or bursty segments (e.g., event types that fire only at certain hours), switching to `"aggregate_by": "day"` can dramatically reduce false positives — the daily bucket smooths out intra-day spikes that are normal over a 24-hour window. Trade-off: you lose hourly granularity and may detect issues later. Recommend this when anomaly values are marginal at the hour level but would be within range at the daily level, OR when the segment naturally has low and variable hourly counts.
- If anomalies are caused by data arriving late → recommend increasing `collection_lag` (see the combined sketch after this list)
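
A combined sketch of these three levers, using the field names cited above (values are examples only; check units and nesting against the monitor's actual config before recommending them):

```yaml
# Illustrative schedule/bucketing changes; recommend only the levers the analysis supports.
interval_minutes: 1440    # was 720: one evaluation per day instead of two
aggregate_by: "day"       # was "hour": daily buckets smooth normal intra-day spikes
collection_lag: 120       # example value; raise only if late-arriving data drives the anomalies
```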

### Snooze / training period

- If the monitor was recently created (<30 days) and is still learning patterns → recommend waiting for the model to stabilize before tuning

### Audience / notification routing

- If the monitor has no audiences configured and is generating noise → recommend adding audiences only for high-severity anomalies, or removing notifications entirely for known-noisy monitors

### Monitor restructure

- If different segment values have fundamentally different expected behaviors → recommend splitting into separate monitors with targeted WHERE conditions per segment
- If no single `where_condition` can cleanly reduce noise → recommend reviewing whether the metric and field combination is the right approach

## Phase 4: Present the Report

Output a structured analysis. This is the primary output — include it in full.

## Monitor Tune Report: {monitor_uuid}

**Monitor:** {display_name or mac_name}
**Table:** {table}
**Metric:** {metric} segmented by {segment_fields}
**Current sensitivity:** {sensitivity or "AUTO (default)"}
**Schedule:** every {interval_minutes / 60}h

### Alert Summary (last 30 days)
- Total alerts: {count}
- Firing frequency: {e.g., "~twice daily", "daily", "sporadic"}
- Most noisy segments: {top 2-3 segment values by alert count}

### Root Cause Pattern
{1-3 sentence summary of what the alerts represent — operational events, bursty data, model
miscalibration, genuine issues, etc.}

### Recommendations

#### 1. {Highest-impact change} [RECOMMENDED]
**Problem:** ...
**Change:**
```yaml
{specific config field}: {new value}
```
**Trade-off:** ...

#### 2. {Second change} [OPTIONAL]
...

#### 3. {Third change} [OPTIONAL]
...

### What NOT to change
{Any configurations that look correct and should be left alone — avoid over-tuning.}

### If these changes are made
{Predict the expected outcome: estimated alert reduction, what genuine anomalies would still fire.}


**Next step:** "Want me to apply any of these changes to the monitor config, or explore the alert
history further?"

---

## Phase 5: Apply Changes (if user requests)

If the user asks to apply a recommendation, use `create_metric_monitor` to update the monitor.
Always pass the existing `uuid` to update rather than create.
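
A hedged sketch of the update arguments (only `uuid` and `dry_run` are named in this skill; the changed field is an example carried over from a Phase 3 recommendation, and the exact payload shape may differ):

```yaml
# Illustrative create_metric_monitor arguments when updating an existing monitor.
uuid: "{monitor_uuid}"   # pass the existing UUID so this is an update, not a new monitor
sensitivity: "low"       # the recommendation being applied (example)
dry_run: true            # preview first; re-run with dry_run: false once the user confirms
```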

### Applying changes
1. **Always dry-run first** (`dry_run=True`, the default) — show the user the preview and confirm
   before applying.
2. **On confirmation**, call again with `dry_run=False`.
3. **Check the returned UUID** — if it differs from the one you passed, tell the user the old
   monitor was replaced with a new one.

---

## Guidelines

- **Be specific.** Generic advice like "reduce sensitivity" is less useful than exact config changes.
- **Prefer surgical changes.** A targeted WHERE condition beats a blunt sensitivity reduction.
- **Preserve signal.** Always explain what genuine anomalies would still be caught after tuning.
- **Cite evidence.** Reference specific incident dates, segment values, and counts from the report.
- **Degrade gracefully.** If troubleshooting runs are missing, note the limited context and
  reason from alert patterns alone.