mc-agent-toolkit -- monte-carlo-storage-cost-analysis

Clone the full toolkit:

```shell
git clone https://github.com/monte-carlo-data/mc-agent-toolkit
```

Or copy just this skill into your local Claude skills directory:

```shell
T=$(mktemp -d) && git clone --depth=1 https://github.com/monte-carlo-data/mc-agent-toolkit "$T" && mkdir -p ~/.claude/skills && cp -r "$T/skills/storage-cost-analysis" ~/.claude/skills/monte-carlo-data-mc-agent-toolkit-monte-carlo-storage-cost-analysis && rm -rf "$T"
```

`skills/storage-cost-analysis/SKILL.md`

# Monte Carlo Storage Cost Analysis Skill
This skill helps identify tables that are wasting storage and recommends safe cleanup actions. It uses Monte Carlo's observability data -- lineage, query activity, monitoring status, and importance scores -- to classify waste patterns and compute safety tiers before recommending any action.
Reference files live next to this skill file. Use the Read tool (not MCP resources) to access them:
- `references/waste-patterns.md` (relative to this file) -- waste pattern definitions and classification
- `references/safety-tiers.md` (relative to this file) -- safety tier computation
## When to activate this skill
Activate when the user:
- Asks about storage costs, waste, or cleanup opportunities
- Wants to find unused, unread, or stale tables
- Asks "which tables can I drop?" or "what's costing us money?"
- Mentions storage optimization, cost reduction, or warehouse cleanup
- Wants to identify zombie tables or dead-end pipelines
## When NOT to activate this skill
Do not activate when the user is:
- Just querying data or exploring table contents
- Creating or modifying monitors (use the monitoring-advisor skill)
- Investigating data quality incidents (use the prevent skill)
- Looking at pipeline performance (use the performance-diagnosis skill)
## Prerequisites
The following MCP tools must be available (connect to Monte Carlo's MCP server):
- `search` -- find tables by name, filter by monitoring status and importance
- `get_table` -- get table metadata (size, type, timestamps)
- `get_asset_lineage` -- check upstream/downstream dependencies
- `get_queries_for_table` -- check read/write query activity
- `get_warehouses` -- list available warehouses
## Workflow
### Step 1: Identify the scope

Ask the user which warehouse to analyze, or use the one they mentioned. Call `get_warehouses` to list available warehouses if needed.

If the user specifies a schema or tag filter, use `search` with the appropriate filters to scope the analysis.
### Step 2: Find candidate tables

Use `search` to find tables that may be waste candidates. Run multiple searches to cover different patterns:

- Unmonitored tables: `search(query="*", is_monitored=false, resource_ids=[warehouse_id])` -- tables nobody cared enough to monitor
- All tables (if the user wants a full analysis): paginate through `search` results for the target warehouse
For each candidate, note: table name, importance score, monitoring status, MCON.
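Since multiple searches can return the same table, it helps to deduplicate candidates by MCON before investigating. A minimal sketch -- the record fields mirror the list above, but the exact shape of a `search` result is an assumption here:

```python
def merge_candidates(*search_results):
    """Merge results from multiple search calls, deduplicating by MCON.

    Each result is assumed to be a dict carrying at least the fields noted
    above: name, mcon, importance score, monitoring status (names are
    illustrative, not the actual search response schema).
    """
    by_mcon = {}
    for results in search_results:
        for table in results:
            # Keep the first record seen for each MCON.
            by_mcon.setdefault(table["mcon"], table)
    return list(by_mcon.values())
```

This keeps the candidate set stable no matter how many overlapping search passes were run.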
### Step 3: Investigate each candidate

For each candidate table (or top N by size), gather evidence:

- Query activity: Call `get_queries_for_table(mcon=table_mcon, query_type="source")` to check reads, and `query_type="destination"` to check writes. Focus on:
  - Total read count (zero reads = potential waste)
  - Last read timestamp (stale if >90 days ago)
  - Write frequency (write-only tables are waste candidates)
- Downstream dependencies: Call `get_asset_lineage(mcons=[table_mcon], direction="DOWNSTREAM")` to check if anything consumes this table.
  - `has_relationships: false` = no downstream consumers (safer to remove)
  - Has downstream consumers = do NOT recommend removal without user review
- Table metadata: Call `get_table(mcon=table_mcon)` for size, type, and last update time.
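The query-activity criteria above reduce to a few boolean signals. A hedged sketch -- the 90-day threshold comes from this document, while the evidence record itself is hypothetical:

```python
from datetime import datetime, timedelta

STALE_AFTER_DAYS = 90  # "stale if >90 days ago", per the criteria above

def activity_signals(read_count, last_read_at, write_count, now=None):
    """Derive waste signals from one table's query activity.

    read_count / write_count would come from get_queries_for_table with
    query_type="source" / "destination"; last_read_at is the most recent
    read timestamp, or None if the table was never read.
    """
    now = now or datetime.utcnow()
    never_read = read_count == 0
    stale = (last_read_at is not None
             and now - last_read_at > timedelta(days=STALE_AFTER_DAYS))
    write_only = never_read and write_count > 0
    return {"never_read": never_read, "stale": stale, "write_only": write_only}
```

These signals feed the classification in the next step alongside the lineage check.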
### Step 4: Classify waste patterns

Read the `references/waste-patterns.md` file and classify each table into one of the waste categories based on the evidence gathered. Apply the safety tier computation from `references/safety-tiers.md`.
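The real category definitions live in `references/waste-patterns.md`; purely to illustrate the shape of this step, here is a sketch that classifies from the Step 3 evidence. The pattern names are examples drawn from this document, not the reference file's actual taxonomy:

```python
def classify_waste(never_read, stale, write_only, has_downstream):
    """Illustrative waste classification; real rules come from
    references/waste-patterns.md."""
    if has_downstream:
        # Consumers exist: not a waste candidate on this evidence alone.
        return None
    if write_only:
        return "write-only"  # written but never read: dead-end pipeline
    if never_read:
        return "unread"      # zero read queries on record
    if stale:
        return "stale"       # last read more than 90 days ago
    return None
```

In the actual workflow, this decision is made by reading the reference file, not hard-coded.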
### Step 5: Present recommendations

Group findings by waste pattern and present to the user:

- Safe to remove (safety tier 0-1): Tables with no downstream dependencies, no reads, low importance. Recommend `DROP TABLE` or archival.
- Needs review (safety tier 2-3): Tables with some risk factors. Present the evidence and let the user decide.
- Needs lineage investigation (has downstream deps): Tables that have consumers -- never recommend removal without the user verifying downstream impact.
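The grouping above maps to a small decision function. As a sketch -- the tier boundaries come from this section, while the tiers themselves are computed per `references/safety-tiers.md`:

```python
def recommendation_bucket(safety_tier, has_downstream):
    """Map a table's safety tier and lineage status to a presentation bucket."""
    if has_downstream:
        # Downstream consumers always trump the tier number.
        return "needs lineage investigation"
    if safety_tier <= 1:
        return "safe to remove"  # tier 0-1: no consumers, no reads, low importance
    return "needs review"        # tier 2-3: some risk factors remain
```

Note that the downstream check comes first: a table with consumers never lands in "safe to remove" regardless of its tier.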
For each recommendation, include:
- Table name (human-readable, never MCONs)
- Waste pattern (e.g., "Unread -- zero queries in 90 days")
- Size and estimated monthly cost (Snowflake: ~$23/TB/month)
- Safety tier with explanation
- Specific action: DROP, ARCHIVE, INVESTIGATE, or KEEP
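The Snowflake cost estimate is simple arithmetic on table size. A sketch using the ~$23/TB/month approximation stated in this document (for other warehouses, show size only):

```python
SNOWFLAKE_USD_PER_TB_MONTH = 23  # approximate figure, per this skill's guidance

def estimated_monthly_cost_usd(size_bytes):
    """Rough monthly storage cost for a Snowflake table of the given size."""
    # Treating 1 TB as 10^12 bytes; adjust if your billing is TiB-based.
    tb = size_bytes / 1e12
    return round(tb * SNOWFLAKE_USD_PER_TB_MONTH, 2)
```

For example, a 2 TB table comes out to roughly $46/month.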
## Important rules
- Never recommend removing a table with downstream consumers without explicit lineage verification. Safety first.
- Always explain WHY a table is waste -- don't just say "drop it."
- Cost estimates are approximate. Snowflake: ~$23/TB/month. For non-Snowflake warehouses, show size only (no cost estimate) and note that pricing varies.
- Importance scores are computed metrics, not business criticality. A low importance score doesn't mean the table is safe to remove -- always check lineage and query activity.
- Present results as a table for easy scanning: table name, waste pattern, size, safety tier, recommendation.
- Never expose MCONs, UUIDs, or internal identifiers to the user.