mc-agent-toolkit · monte-carlo-storage-cost-analysis

install
source · Clone the upstream repo
git clone https://github.com/monte-carlo-data/mc-agent-toolkit
Claude Code · Install into ~/.claude/skills/
T=$(mktemp -d) && git clone --depth=1 https://github.com/monte-carlo-data/mc-agent-toolkit "$T" && mkdir -p ~/.claude/skills && cp -r "$T/skills/storage-cost-analysis" ~/.claude/skills/monte-carlo-data-mc-agent-toolkit-monte-carlo-storage-cost-analysis && rm -rf "$T"
manifest: skills/storage-cost-analysis/SKILL.md
source content

Monte Carlo Storage Cost Analysis Skill

This skill helps identify tables that are wasting storage and recommends safe cleanup actions. It uses Monte Carlo's observability data -- lineage, query activity, monitoring status, and importance scores -- to classify waste patterns and compute safety tiers before recommending any action.

Reference files live next to this skill file. Use the Read tool (not MCP resources) to access them:

  • Waste pattern definitions and classification: references/waste-patterns.md (relative to this file)
  • Safety tier computation: references/safety-tiers.md (relative to this file)

When to activate this skill

Activate when the user:

  • Asks about storage costs, waste, or cleanup opportunities
  • Wants to find unused, unread, or stale tables
  • Asks "which tables can I drop?" or "what's costing us money?"
  • Mentions storage optimization, cost reduction, or warehouse cleanup
  • Wants to identify zombie tables or dead-end pipelines

When NOT to activate this skill

Do not activate when the user is:

  • Just querying data or exploring table contents
  • Creating or modifying monitors (use the monitoring-advisor skill)
  • Investigating data quality incidents (use the prevent skill)
  • Looking at pipeline performance (use the performance-diagnosis skill)

Prerequisites

The following MCP tools must be available (connect to Monte Carlo's MCP server):

  • search -- find tables by name, filter by monitoring status and importance
  • get_table -- get table metadata (size, type, timestamps)
  • get_asset_lineage -- check upstream/downstream dependencies
  • get_queries_for_table -- check read/write query activity
  • get_warehouses -- list available warehouses
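
The workflow sketches below assume a thin wrapper around whichever MCP client session the agent runs in; the helper name mcp_call is hypothetical, not part of Monte Carlo's API:

# Hypothetical helper -- stands in for whatever MCP client the agent uses.
# Each workflow sketch below invokes tools through it by name.
def mcp_call(tool: str, **arguments):
    """Invoke an MCP tool and return its decoded result (wiring assumed)."""
    ...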

Workflow

Step 1: Identify the scope

Ask the user which warehouse to analyze, or use the one they mentioned. Call get_warehouses to list available warehouses if needed.

If the user specifies a schema or tag filter, use search with the appropriate filters to scope the analysis.
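
A minimal sketch of the scoping step, assuming the mcp_call helper above; the name and id fields are assumptions about the response shape:

# Step 1 sketch: enumerate warehouses so the user can pick one.
warehouses = mcp_call("get_warehouses")
for w in warehouses:
    print(w["name"], w["id"])  # field names assumed
warehouse_id = warehouses[0]["id"]  # in practice, the user chooses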

Step 2: Find candidate tables

Use search to find tables that may be waste candidates. Run multiple searches to cover different patterns:

  1. Unmonitored tables: search(query="*", is_monitored=false, resource_ids=[warehouse_id]) -- tables nobody cared enough to monitor
  2. All tables (if the user wants a full analysis): paginate through search results for the target warehouse

For each candidate, note: table name, importance score, monitoring status, MCON.
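
A sketch of the candidate search, using the call shape shown above; the result field names (and any pagination fields) are assumptions:

# Step 2 sketch: unmonitored tables in the target warehouse.
results = mcp_call("search", query="*", is_monitored=False, resource_ids=[warehouse_id])
candidates = [
    # keep only what later steps need; field names assumed
    {"name": t["name"], "mcon": t["mcon"], "importance": t.get("importance_score")}
    for t in results
]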

Step 3: Investigate each candidate

For each candidate table (or top N by size), gather evidence (a combined call sketch follows this list):

  1. Query activity: Call get_queries_for_table(mcon=table_mcon, query_type="source") to check reads, and query_type="destination" to check writes. Focus on:

    • Total read count (zero reads = potential waste)
    • Last read timestamp (stale if >90 days ago)
    • Write frequency (write-only tables are waste candidates)
  2. Downstream dependencies: Call get_asset_lineage(mcons=[table_mcon], direction="DOWNSTREAM") to check if anything consumes this table.

    • has_relationships: false = no downstream consumers (safer to remove)
    • Has downstream consumers = do NOT recommend removal without user review
  3. Table metadata: Call get_table(mcon=table_mcon) for size, type, and last update time.
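
The combined sketch for one candidate, again assuming mcp_call; only has_relationships is named by this skill, the other response fields are assumptions:

# Step 3 sketch: gather the three kinds of evidence for one table.
reads = mcp_call("get_queries_for_table", mcon=table_mcon, query_type="source")
writes = mcp_call("get_queries_for_table", mcon=table_mcon, query_type="destination")
lineage = mcp_call("get_asset_lineage", mcons=[table_mcon], direction="DOWNSTREAM")
meta = mcp_call("get_table", mcon=table_mcon)

evidence = {
    "read_count": len(reads),                        # zero reads = potential waste
    "write_count": len(writes),                      # write-only pattern signal
    "has_downstream": lineage["has_relationships"],  # consumers block removal
    "size_bytes": meta.get("size_bytes"),            # field name assumed
}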

Step 4: Classify waste patterns

Read the references/waste-patterns.md file and classify each table into one of the waste categories based on the evidence gathered. Apply the safety tier computation from references/safety-tiers.md.
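
For illustration only, a sketch of how two of the patterns this skill names (unread, write-only) might fall out of the evidence; the authoritative definitions live in references/waste-patterns.md and references/safety-tiers.md:

# Step 4 sketch: illustrative classification, not the reference logic.
def classify(evidence):
    if evidence["read_count"] == 0 and evidence["write_count"] > 0:
        return "write-only"  # still written to, never read
    if evidence["read_count"] == 0 and not evidence["has_downstream"]:
        return "unread"      # zero queries, no consumers
    return "needs-review"    # defer to references/waste-patterns.md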

Step 5: Present recommendations

Group findings by waste pattern and present to the user:

  1. Safe to remove (safety tier 0-1): Tables with no downstream dependencies, no reads, low importance. Recommend DROP TABLE or archival.
  2. Needs review (safety tier 2-3): Tables with some risk factors. Present the evidence and let the user decide.
  3. Needs lineage investigation (has downstream deps): Tables that have consumers -- never recommend removal without the user verifying downstream impact.

For each recommendation, include:

  • Table name (human-readable, never MCONs)
  • Waste pattern (e.g., "Unread -- zero queries in 90 days")
  • Size and estimated monthly cost (Snowflake: ~$23/TB/month)
  • Safety tier with explanation
  • Specific action: DROP, ARCHIVE, INVESTIGATE, or KEEP
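
The cost figure is plain arithmetic on the ~$23/TB/month Snowflake rate noted under Important rules; a sketch:

# Cost sketch: Snowflake only, approximate; other warehouses get size only.
def estimated_monthly_cost_usd(size_bytes: int) -> float:
    return round(size_bytes / 1e12 * 23, 2)

estimated_monthly_cost_usd(500 * 10**9)  # 0.5 TB -> ~$11.50/month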

Important rules

  • Never recommend removing a table with downstream consumers without explicit lineage verification. Safety first.
  • Always explain WHY a table is waste -- don't just say "drop it."
  • Cost estimates are approximate. Snowflake: ~$23/TB/month. For non-Snowflake warehouses, show size only (no cost estimate) and note that pricing varies.
  • Importance scores are computed metrics, not business criticality. A low importance score doesn't mean the table is safe to remove -- always check lineage and query activity.
  • Present results as a table for easy scanning: table name, waste pattern, size, safety tier, recommendation.
  • Never expose MCONs, UUIDs, or internal identifiers to the user.