Claude-skill-registry bigquery-historical-data-aggregator
Aggregates and analyzes historical data from multiple BigQuery tables with similar schemas. Queries multiple tables using UNION ALL, calculates aggregate metrics (averages, sums, counts), handles table discovery via INFORMATION_SCHEMA, and processes large datasets efficiently with batch queries.
install
source · Clone the upstream repo
git clone https://github.com/majiayu000/claude-skill-registry
Claude Code · Install into ~/.claude/skills/
T=$(mktemp -d) && git clone --depth=1 https://github.com/majiayu000/claude-skill-registry "$T" && mkdir -p ~/.claude/skills && cp -r "$T/skills/data/bigquery-historical-data-aggregator" ~/.claude/skills/majiayu000-claude-skill-registry-bigquery-historical-data-aggregator && rm -rf "$T"
manifest: skills/data/bigquery-historical-data-aggregator/SKILL.md
Instructions
Core Workflow
- Discover Tables: First, query `INFORMATION_SCHEMA.TABLES` to identify all available tables within the target dataset. This ensures the skill adapts to the actual table names present.
- Aggregate Historical Data: Construct a SQL query that uses `UNION ALL` to combine data from all identified tables. Calculate the required aggregate metric (e.g., `AVG(score)`) grouped by the relevant key (e.g., `student_id`, `name`). See the first sketch after this list.
- Handle Large Results: For datasets returning many rows (>50), use a batched querying strategy (e.g., `LIMIT` and `OFFSET`, or filtering by key ranges) to retrieve the complete result set without truncation.
- Join with Latest Data: Read the latest data from the provided source (e.g., a local CSV file). Perform a join between the aggregated historical data and the latest data to enable comparative analysis.
- Calculate Deltas & Filter: Compute the percentage change or difference between historical and latest values. Apply the user-specified threshold filter (e.g., `drop_percentage > 0.25`).
- Output Results: Write the filtered results to the specified output file (e.g., `bad_student.csv`).
- Trigger Critical Actions: For records exceeding a higher, critical threshold (e.g., `drop_percentage > 0.45`), execute immediate actions such as writing critical log entries to a designated logging service. The second sketch after this list covers steps 4–7.
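A minimal sketch of the discovery and aggregation steps, assuming the google-cloud-bigquery Python client; the project/dataset name, the `scores_%` table prefix, and the column names are illustrative assumptions, not defined by the skill:

```python
from google.cloud import bigquery

client = bigquery.Client()  # relies on application-default credentials
dataset = "my_project.student_scores"  # hypothetical project.dataset

# Step 1: discover candidate tables instead of hardcoding their names.
discovery_sql = f"""
    SELECT table_name
    FROM `{dataset}.INFORMATION_SCHEMA.TABLES`
    WHERE table_name LIKE 'scores_%'
"""
tables = [row.table_name for row in client.query(discovery_sql).result()]

# Step 2: UNION ALL every discovered table, then aggregate per student.
union_sql = " UNION ALL ".join(
    f"SELECT student_id, name, score FROM `{dataset}.{t}`" for t in tables
)
agg_sql = f"""
    SELECT student_id, name, AVG(score) AS historical_avg
    FROM ({union_sql})
    GROUP BY student_id, name
"""
historical = {row.student_id: row for row in client.query(agg_sql).result()}
```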
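And a sketch of steps 4–7, continuing from the `historical` dict above; the CSV layout, both thresholds, and the logger stand in for whatever the user actually specifies:

```python
import csv
import logging

log = logging.getLogger("historical-aggregator")  # stand-in for a real logging service

flagged = []
with open("latest_scores.csv", newline="") as f:  # hypothetical latest-data file
    for rec in csv.DictReader(f):
        hist = historical.get(rec["student_id"])  # join key; assumes STRING ids
        if hist is None or hist.historical_avg == 0:
            continue  # no history to compare against
        drop = (hist.historical_avg - float(rec["score"])) / hist.historical_avg
        if drop > 0.25:  # user-specified filter threshold
            flagged.append({"student_id": rec["student_id"], "name": rec["name"],
                            "historical_avg": hist.historical_avg,
                            "latest_score": rec["score"], "drop_percentage": drop})
        if drop > 0.45:  # critical threshold: alert-ready log entry with identifiers
            log.critical("drop %.0f%% for %s (id=%s)", drop * 100,
                         rec["name"], rec["student_id"])

with open("bad_student.csv", "w", newline="") as f:
    fields = ["student_id", "name", "historical_avg", "latest_score", "drop_percentage"]
    writer = csv.DictWriter(f, fieldnames=fields)
    writer.writeheader()
    writer.writerows(flagged)
```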
Key Techniques
- Dynamic Table Inclusion: Use the list from `INFORMATION_SCHEMA` to build the `UNION ALL` query dynamically. Do not hardcode table names.
- Efficient Batch Retrieval: When the final aggregated list or intermediate results are large, retrieve data in manageable chunks using `WHERE` clauses on sequential keys or `LIMIT`/`OFFSET`. A paging sketch follows this list.
- Precise Percentage Calculation: Ensure the percentage change formula is correct: `(historical_value - latest_value) / historical_value`.
- Logging for Notification: When writing critical logs, include all necessary identifiers (e.g., name, ID) and context so downstream systems can trigger alerts or notifications.
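The BigQuery Python client already streams full result sets page by page, but when the skill must issue explicit batched queries, a paging loop of this shape works; the page size of 50 mirrors the threshold above and is otherwise an arbitrary assumption:

```python
def fetch_in_batches(client, base_sql, page_size=50):
    """Page through a query with LIMIT/OFFSET; base_sql needs a stable
    ORDER BY so consecutive pages never overlap or skip rows."""
    rows, offset = [], 0
    while True:
        page = list(client.query(
            f"{base_sql} LIMIT {page_size} OFFSET {offset}").result())
        rows.extend(page)
        if len(page) < page_size:  # short page means the result set is exhausted
            return rows
        offset += page_size

# Usage: fetch_in_batches(client, agg_sql + " ORDER BY student_id")
```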
Error Handling & Validation
- Confirm the target dataset exists before querying.
- Verify that source files (e.g., CSV) exist and are readable.
- Validate that the log bucket or destination for critical alerts exists and is accessible.
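These checks might look like the following; `get_dataset`, `NotFound`, and `Bucket.exists()` are real google-cloud APIs, while the dataset, file, and bucket names are placeholders:

```python
import os
from google.api_core.exceptions import NotFound
from google.cloud import bigquery, storage  # storage requires google-cloud-storage

client = bigquery.Client()

try:
    client.get_dataset("my_project.student_scores")  # placeholder dataset id
except NotFound:
    raise SystemExit("Target dataset does not exist; aborting before any queries.")

csv_path = "latest_scores.csv"  # placeholder source file
if not (os.path.isfile(csv_path) and os.access(csv_path, os.R_OK)):
    raise SystemExit(f"Source file {csv_path} is missing or unreadable.")

if not storage.Client().bucket("critical-alert-logs").exists():  # placeholder bucket
    raise SystemExit("Log bucket for critical alerts is not accessible.")
```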
Bundled Resources
- `scripts/aggregate_query_template.sql`: A parameterized SQL template for the core aggregation logic.
- `references/schema_example.md`: An example schema to illustrate the expected table structure.
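The bundled template is not reproduced here, but a parameterized query of roughly this shape could be run like the following; the template body, parameter name, and threshold are assumptions, while `QueryJobConfig` and `ScalarQueryParameter` are the real client API:

```python
from google.cloud import bigquery

client = bigquery.Client()
# Hypothetical template contents; the real file ships in scripts/.
template_sql = """
    SELECT student_id, name, AVG(score) AS historical_avg
    FROM `my_project.student_scores.scores_2023`  -- placeholder table
    GROUP BY student_id, name
    HAVING AVG(score) < @max_avg
"""
job_config = bigquery.QueryJobConfig(
    query_parameters=[bigquery.ScalarQueryParameter("max_avg", "FLOAT64", 60.0)]
)
rows = client.query(template_sql, job_config=job_config).result()
```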