Claude-skill-registry bigquery-historical-data-aggregator
Aggregates and analyzes historical data from multiple BigQuery tables with similar schemas. Queries multiple tables using UNION ALL, calculates aggregate metrics (averages, sums, counts), handles table discovery via INFORMATION_SCHEMA, and processes large datasets efficiently with batch queries.
install
source · Clone the upstream repo
git clone https://github.com/majiayu000/claude-skill-registry
Claude Code · Install into ~/.claude/skills/
T=$(mktemp -d) && git clone --depth=1 https://github.com/majiayu000/claude-skill-registry "$T" && mkdir -p ~/.claude/skills && cp -r "$T/skills/data/bigquery-historical-data-aggregator" ~/.claude/skills/majiayu000-claude-skill-registry-bigquery-historical-data-aggregator && rm -rf "$T"
manifest: skills/data/bigquery-historical-data-aggregator/SKILL.md
Instructions
Core Workflow
- Discover Tables: First, query `INFORMATION_SCHEMA.TABLES` to identify all available tables within the target dataset. This ensures the skill adapts to the actual table names present.
- Aggregate Historical Data: Construct a SQL query that uses `UNION ALL` to combine data from all identified tables. Calculate the required aggregate metric (e.g., `AVG(score)`) grouped by the relevant key (e.g., `student_id`, `name`). See the first sketch after this list.
- Handle Large Results: For datasets returning many rows (>50), use a batched querying strategy (e.g., `LIMIT` and `OFFSET`, or filtering by key ranges) to retrieve the complete result set without truncation.
- Join with Latest Data: Read the latest data from the provided source (e.g., a local CSV file). Perform a join between the aggregated historical data and the latest data to enable comparative analysis.
- Calculate Deltas & Filter: Compute the percentage change or difference between historical and latest values. Apply the user-specified threshold filter (e.g., `drop_percentage > 0.25`).
- Output Results: Write the filtered results to the specified output file (e.g., `bad_student.csv`).
- Trigger Critical Actions: For records exceeding a higher, critical threshold (e.g., `drop_percentage > 0.45`), execute immediate actions such as writing critical log entries to a designated logging service. The second sketch after this list covers steps 4–7.
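A minimal sketch of the discovery and aggregation steps, assuming the google-cloud-bigquery Python client; the project/dataset name, the `scores_%` table prefix, and the column names are illustrative assumptions, not defined by the skill:

```python
from google.cloud import bigquery

client = bigquery.Client()  # relies on application-default credentials
dataset = "my_project.student_scores"  # hypothetical project.dataset

# Step 1: discover candidate tables instead of hardcoding their names.
discovery_sql = f"""
    SELECT table_name
    FROM `{dataset}.INFORMATION_SCHEMA.TABLES`
    WHERE table_name LIKE 'scores_%'
"""
tables = [row.table_name for row in client.query(discovery_sql).result()]

# Step 2: UNION ALL every discovered table, then aggregate per student.
union_sql = " UNION ALL ".join(
    f"SELECT student_id, name, score FROM `{dataset}.{t}`" for t in tables
)
agg_sql = f"""
    SELECT student_id, name, AVG(score) AS historical_avg
    FROM ({union_sql})
    GROUP BY student_id, name
"""
historical = {row.student_id: row for row in client.query(agg_sql).result()}
```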
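And a sketch of steps 4–7, continuing from the `historical` dict above; the CSV layout, both thresholds, and the logger stand in for whatever the user actually specifies:

```python
import csv
import logging

log = logging.getLogger("historical-aggregator")  # stand-in for a real logging service

flagged = []
with open("latest_scores.csv", newline="") as f:  # hypothetical latest-data file
    for rec in csv.DictReader(f):
        hist = historical.get(rec["student_id"])  # join key; assumes STRING ids
        if hist is None or hist.historical_avg == 0:
            continue  # no history to compare against
        drop = (hist.historical_avg - float(rec["score"])) / hist.historical_avg
        if drop > 0.25:  # user-specified filter threshold
            flagged.append({"student_id": rec["student_id"], "name": rec["name"],
                            "historical_avg": hist.historical_avg,
                            "latest_score": rec["score"], "drop_percentage": drop})
        if drop > 0.45:  # critical threshold: alert-ready log entry with identifiers
            log.critical("drop %.0f%% for %s (id=%s)", drop * 100,
                         rec["name"], rec["student_id"])

with open("bad_student.csv", "w", newline="") as f:
    fields = ["student_id", "name", "historical_avg", "latest_score", "drop_percentage"]
    writer = csv.DictWriter(f, fieldnames=fields)
    writer.writeheader()
    writer.writerows(flagged)
```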
Key Techniques
- Dynamic Table Inclusion: Use the list from `INFORMATION_SCHEMA` to build the `UNION ALL` query dynamically. Do not hardcode table names.
- Efficient Batch Retrieval: When the final aggregated list or intermediate results are large, retrieve data in manageable chunks using `WHERE` clauses on sequential keys or `LIMIT`/`OFFSET`. A paging sketch follows this list.
- Precise Percentage Calculation: Ensure the percentage change formula is correct: `(historical_value - latest_value) / historical_value`.
- Logging for Notification: When writing critical logs, include all necessary identifiers (e.g., name, ID) and context so downstream systems can trigger alerts or notifications.
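The BigQuery Python client already streams full result sets page by page, but when the skill must issue explicit batched queries, a paging loop of this shape works; the page size of 50 mirrors the threshold above and is otherwise an arbitrary assumption:

```python
def fetch_in_batches(client, base_sql, page_size=50):
    """Page through a query with LIMIT/OFFSET; base_sql needs a stable
    ORDER BY so consecutive pages never overlap or skip rows."""
    rows, offset = [], 0
    while True:
        page = list(client.query(
            f"{base_sql} LIMIT {page_size} OFFSET {offset}").result())
        rows.extend(page)
        if len(page) < page_size:  # short page means the result set is exhausted
            return rows
        offset += page_size

# Usage: fetch_in_batches(client, agg_sql + " ORDER BY student_id")
```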
Error Handling & Validation
- Confirm the target dataset exists before querying.
- Verify that source files (e.g., CSV) exist and are readable.
- Validate that the log bucket or destination for critical alerts exists and is accessible.
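These checks might look like the following; `get_dataset`, `NotFound`, and `Bucket.exists()` are real google-cloud APIs, while the dataset, file, and bucket names are placeholders:

```python
import os
from google.api_core.exceptions import NotFound
from google.cloud import bigquery, storage  # storage requires google-cloud-storage

client = bigquery.Client()

try:
    client.get_dataset("my_project.student_scores")  # placeholder dataset id
except NotFound:
    raise SystemExit("Target dataset does not exist; aborting before any queries.")

csv_path = "latest_scores.csv"  # placeholder source file
if not (os.path.isfile(csv_path) and os.access(csv_path, os.R_OK)):
    raise SystemExit(f"Source file {csv_path} is missing or unreadable.")

if not storage.Client().bucket("critical-alert-logs").exists():  # placeholder bucket
    raise SystemExit("Log bucket for critical alerts is not accessible.")
```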
Bundled Resources
- `scripts/aggregate_query_template.sql`: A parameterized SQL template for the core aggregation logic.
- `references/schema_example.md`: An example schema to illustrate the expected table structure.
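The bundled template is not reproduced here, but a parameterized query of roughly this shape could be run like the following; the template body, parameter name, and threshold are assumptions, while `QueryJobConfig` and `ScalarQueryParameter` are the real client API:

```python
from google.cloud import bigquery

client = bigquery.Client()
# Hypothetical template contents; the real file ships in scripts/.
template_sql = """
    SELECT student_id, name, AVG(score) AS historical_avg
    FROM `my_project.student_scores.scores_2023`  -- placeholder table
    GROUP BY student_id, name
    HAVING AVG(score) < @max_avg
"""
job_config = bigquery.QueryJobConfig(
    query_parameters=[bigquery.ScalarQueryParameter("max_avg", "FLOAT64", 60.0)]
)
rows = client.query(template_sql, job_config=job_config).result()
```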