Claude-skill-registry cube-semantic-layer
ALWAYS USE when building semantic layer, working with metrics/dimensions, or integrating dbt with Cube consumption APIs. Use IMMEDIATELY when working with Cube REST API, GraphQL queries, or Postgres wire protocol (psycopg2 connections). Provides research steps for data modeling, dbt integration, MEASURE() syntax, and API validation.
git clone https://github.com/majiayu000/claude-skill-registry
T=$(mktemp -d) && git clone --depth=1 https://github.com/majiayu000/claude-skill-registry "$T" && mkdir -p ~/.claude/skills && cp -r "$T/skills/data/cube-semantic-layer" ~/.claude/skills/majiayu000-claude-skill-registry-cube-semantic-layer && rm -rf "$T"
skills/data/cube-semantic-layer/SKILL.mdCube Semantic Layer Development (Research-Driven)
Philosophy
This skill does NOT prescribe specific semantic models or API patterns. Instead, it guides you to:
- Research the current Cube version and capabilities
- Discover existing Cube data models and dbt integrations
- Validate your implementations against Cube documentation
- Verify integration with dbt models and consumption applications
Pre-Implementation Research Protocol
Step 1: Verify Runtime Environment
ALWAYS run this first:
# Check if Cube is installed cube --version 2>/dev/null || echo "Cube CLI not found" # Check if running locally curl -s http://localhost:4000/readyz || echo "Cube API not reachable"
Critical Questions to Answer:
- Is Cube running locally or remotely?
- What version is available?
- What database connection is configured?
- Is dbt integration enabled?
Step 2: Research SDK State (if unfamiliar)
When to research: If you encounter unfamiliar Cube features or need to validate patterns
Research queries (use WebSearch):
- "Cube [feature] documentation 2025" (e.g., "Cube dbt integration 2025")
- "Cube REST API query format 2025"
- "Cube data modeling metrics dimensions 2025"
Official documentation: https://cube.dev/docs
Key documentation sections:
- Data Modeling: https://cube.dev/docs/product/data-modeling
- dbt Integration: https://cube.dev/docs/product/data-modeling/recipes/dbt
- REST API: https://cube.dev/docs/product/apis-integrations/rest-api
- cube_dbt package: https://cube.dev/docs/product/data-modeling/reference/cube_dbt
Step 3: Discover Existing Patterns
BEFORE creating new Cube models, search for existing implementations:
# Find Cube model files find . -name "*.yml" -path "*/model/*" -o -name "*.yaml" -path "*/model/*" find . -name "*.js" -path "*/model/*" # Find dbt integration usage rg "cube_dbt|from_dbt" --type yaml --type py # Find REST API usage rg "cube.*api|/cubejs-api" --type py --type js
Key questions:
- What Cube models (cubes) already exist?
- Is dbt integration configured? (
package usage)cube_dbt - What metrics and dimensions are defined?
- What API endpoints are exposed?
Step 4: Validate Against Architecture
Check architecture docs for integration requirements:
- Read
for consumption layer requirements/docs/ - Understand dbt model → Cube cube mapping
- Verify API requirements for downstream applications
- Check governance metadata propagation
Implementation Guidance (Not Prescriptive)
Cube Data Model Concepts
Core concepts: Cubes, measures, dimensions, joins, pre-aggregations
Entity hierarchy:
Cube Project ├── Cubes (semantic models on top of tables/views) │ ├── Measures (metrics: SUM, COUNT, AVG, etc.) │ ├── Dimensions (attributes for grouping/filtering) │ ├── Joins (relationships between cubes) │ └── Pre-aggregations (materialized rollups) └── Views (organize cubes for consumers)
Research questions:
- What cubes should be created? (one per dbt model? logical grouping?)
- What measures are needed? (KPIs, aggregations)
- What dimensions? (time, categorical, hierarchical)
- What joins between cubes?
dbt Integration (cube_dbt
package)
cube_dbtCore concept: Generate Cube models from dbt manifest.json
Research questions:
- Should Cube models be generated from dbt? (vs hand-written)
- Where is dbt manifest.json located?
- What dbt models should be exposed in Cube?
- How should dbt model metadata be enriched in Cube?
SDK features to research:
package: dbt integration utilitiescube_dbt
: Load dbt metadatacube_dbt.load_dbt_project()
: Convert dbt model to Cube cubecube_dbt.dbt_model_to_cube()- manifest.json parsing: Extract dbt model metadata
- dbt meta tags: Propagate to Cube dimensions/measures
Cubes (Semantic Models)
Core concept: Cubes define the semantic layer on top of physical tables
Research questions:
- What is the SQL definition? (table, view, subquery)
- What measures should be exposed?
- What dimensions?
- What joins to other cubes?
- What pre-aggregations for performance?
SDK features to research:
- Cube definition: YAML or JavaScript
property: Define data source (table, view, SQL)sql- Measures:
(sum, count, avg, min, max, count_distinct, etc.)type - Dimensions:
(string, number, time, boolean, geo)type - Joins:
(one_to_one, one_to_many, many_to_one)relationship - Pre-aggregations: Materialized rollups for performance
Measures (Metrics)
Core concept: Measures are aggregatable metrics
Research questions:
- What KPIs need to be calculated?
- What aggregation type? (sum, count, average, distinct count)
- Are there calculated measures? (ratios, percentages)
- What filters apply to measures?
SDK features to research:
- Measure types:
,sum
,count
,avg
,min
,max
,count_distinctnumber - Calculated measures:
property with expressionssql - Filters:
property on measuresfilters - Rollup measures: Aggregating pre-aggregations
Dimensions (Attributes)
Core concept: Dimensions are attributes for grouping and filtering
Research questions:
- What categorical dimensions? (status, category, region)
- What time dimensions? (date, timestamp, granularity)
- What hierarchical dimensions? (geography, org structure)
- Should dimensions be hidden or exposed?
SDK features to research:
- Dimension types:
,string
,number
,time
,booleangeo - Time dimensions:
(day, week, month, year)granularities - Subquery dimensions: Complex SQL expressions
- Case expressions: Conditional logic in dimensions
Joins (Relationships)
Core concept: Joins define relationships between cubes
Research questions:
- What relationships exist? (user → orders, product → sales)
- What join type? (one-to-one, one-to-many, many-to-one)
- What join SQL?
- Are joins bidirectional?
SDK features to research:
- Join definition:
,relationship
propertysql - Join types:
,one_to_one
,one_to_manymany_to_one - Join SQL: Foreign key relationships
- Many-to-many: Through tables
Pre-aggregations (Performance)
Core concept: Pre-aggregations are materialized rollups for query performance
Research questions:
- What queries are slow?
- What aggregations are common? (daily rollups, by category)
- What materialization schedule?
- What storage for pre-aggregations?
SDK features to research:
- Pre-aggregation definition: Specify measures, dimensions, granularity
- Refresh strategies:
,everyupdate_window - Partitioning: Time-based partitions
- External storage: Store in data warehouse
REST API (Consumption)
Core concept: Cube exposes REST API for data queries
Research questions:
- What API endpoints are needed?
- What authentication?
- What query patterns? (measures + dimensions, filters, time ranges)
- How should API be consumed? (Python, JavaScript, BI tools)
SDK features to research:
- Query format: JSON with
,measures
,dimensions
,filterstimeDimensions - Authentication: JWT tokens, API keys
- Response format:
(result set) +data
(column metadata)schema - Pagination:
,limitoffset
GraphQL API (Consumption)
Core concept: Cube exposes GraphQL API for flexible querying
Research questions:
- What queries are supported?
- What mutations? (none - read-only)
- What authentication?
- How to query multiple cubes?
SDK features to research:
- Endpoint:
(NOT/cubejs-api/graphql
)/graphql - Query format:
query withcube
,measures
,dimensionsfilters - Authentication: Same as REST API (JWT, API keys)
- Introspection: Schema discovery via
query__schema
Example GraphQL query:
query { cube( measures: ["Orders.count", "Orders.totalAmount"] dimensions: ["Orders.status"] filters: [{ dimension: "Orders.status", operator: "equals", values: ["completed"] }] ) { Orders { count totalAmount status } } }
SQL API (Postgres Wire Protocol)
Core concept: Cube exposes SQL API via Postgres wire protocol for BI tool integration
Research questions:
- What port is SQL API on? (default: 15432)
- What authentication? (user/password)
- What SQL dialect? (Postgres-compatible with Cube extensions)
- What BI tools are supported? (Tableau, Looker, Metabase, psycopg2)
SDK features to research:
- Port:
(default 15432)CUBEJS_PG_SQL_PORT - Authentication:
,CUBEJS_SQL_USERCUBEJS_SQL_PASSWORD - Connection: Standard Postgres clients (psycopg2, JDBC, etc.)
- Query syntax: Use
function for aggregatesMEASURE() - Read-only: INSERT/UPDATE/DELETE not supported
CRITICAL - Cube SQL syntax differs from standard SQL:
-- Cube SQL API syntax (NOT standard Postgres) SELECT MEASURE(count) AS order_count, MEASURE(total_amount) AS total FROM Orders WHERE status = 'completed' GROUP BY 1 -- Dimensions referenced directly (no quotes) SELECT status, MEASURE(count) FROM Orders GROUP BY 1 -- Time dimensions with granularity SELECT DATE_TRUNC('month', created_at) AS month, MEASURE(count) FROM Orders GROUP BY 1
Python psycopg2 example:
import psycopg2 conn = psycopg2.connect( host="localhost", port=15432, user="cube", password="cube_password", database="cube" ) cursor = conn.cursor() cursor.execute("SELECT MEASURE(count) FROM Orders") results = cursor.fetchall()
Docker configuration:
environment: CUBEJS_PG_SQL_PORT: "15432" CUBEJS_SQL_USER: ${CUBEJS_SQL_USER:-cube} CUBEJS_SQL_PASSWORD: ${CUBEJS_SQL_PASSWORD:-cube_password}
Validation Workflow
Before Implementation
- ✅ Verified Cube availability (local or remote)
- ✅ Searched for existing Cube models and dbt integration
- ✅ Read architecture docs for consumption layer requirements
- ✅ Identified dbt models to expose in semantic layer
- ✅ Researched unfamiliar Cube features
During Implementation
- ✅ Using
for dbt integration (if applicable)cube_dbt - ✅ Proper cube definitions with measures and dimensions
- ✅ Joins correctly defined between cubes
- ✅ Pre-aggregations for performance optimization
- ✅ Metadata from dbt propagated to Cube
- ✅ Governance metadata (classifications) exposed
After Implementation
- ✅ Verify Cube models load without errors
- ✅ Test REST API queries (
)/cubejs-api/v1/load - ✅ Test GraphQL API queries (
)/cubejs-api/graphql - ✅ Test SQL API queries (port 15432,
syntax)MEASURE() - ✅ Verify measures calculate correctly
- ✅ Test dimension filtering and grouping
- ✅ Check pre-aggregation materialization
- ✅ Validate integration with downstream apps (BI tools, dashboards)
Context Injection (For Future Claude Instances)
When this skill is invoked, you should:
-
Verify runtime state (don't assume):
cube --version curl -s http://localhost:4000/readyz -
Discover existing patterns (don't invent):
find . -name "*.yml" -path "*/model/*" rg "cube_dbt" --type yaml -
Research when uncertain (don't guess):
- Use WebSearch for "Cube [feature] documentation 2025"
- Check official docs: https://cube.dev/docs
-
Validate against architecture (don't assume requirements):
- Read relevant architecture docs in
/docs/ - Understand dbt model mapping to Cube cubes
- Check consumption API requirements
- Read relevant architecture docs in
-
Check dbt integration (if applicable):
- Verify dbt manifest.json location
- Check
package usagecube_dbt - Understand dbt meta tag propagation
Quick Reference: Common Research Queries
Use these WebSearch queries when encountering specific needs:
- dbt integration: "Cube dbt integration cube_dbt package 2025"
- Data modeling: "Cube data modeling cubes measures dimensions 2025"
- Measures: "Cube measures metrics types documentation 2025"
- Dimensions: "Cube dimensions time dimensions 2025"
- Joins: "Cube joins relationships between cubes 2025"
- Pre-aggregations: "Cube pre-aggregations materialization 2025"
- REST API: "Cube REST API query format 2025"
- GraphQL API: "Cube GraphQL API query format endpoint 2025"
- SQL API: "Cube SQL API Postgres wire protocol MEASURE syntax 2025"
- SQL API psycopg2: "Cube SQL API Python psycopg2 connection 2025"
- Python client: "Cube Python REST API client 2025"
- Integration tests: "Cube integration testing pytest httpx 2025"
Integration Points to Research
dbt → Cube Integration
Key question: How are dbt models exposed in Cube semantic layer?
Research areas:
package installation and configurationcube_dbt- manifest.json loading (
)load_dbt_project() - dbt model → Cube cube conversion
- dbt meta tags → Cube dimensions/measures metadata
- Selective model exposure (which dbt models to include)
Cube → Consumption Applications
Key question: How do applications query the Cube semantic layer?
Research areas:
- REST API query format (JSON)
- SQL API for BI tools (Tableau, Looker, Metabase)
- GraphQL API for frontend apps
- JavaScript SDK for web applications
- Authentication and access control
Governance Metadata Propagation
Key question: How should governance metadata flow to Cube?
Research areas:
- dbt meta tags → Cube dimension metadata
- Classification exposure in Cube API
- Access control (row-level, column-level)
- Audit logging for API queries
Cube Development Workflow
Local Development
# Install Cube CLI npm install -g cubejs-cli # Create new Cube project (if needed) cubejs create my-cube-project -d postgres # Start Cube development server npm run dev # Access Cube Playground open http://localhost:4000
Using dbt Integration
# Install cube_dbt in Cube project # In model/schema.yml cubes: - name: my_cube public: true sql: > SELECT * FROM {{ dbt.ref('my_dbt_model') }} measures: - name: count type: count dimensions: - name: id sql: id type: number
REST API Query
import requests # Query Cube REST API response = requests.post( "http://localhost:4000/cubejs-api/v1/load", json={ "query": { "measures": ["orders.count", "orders.total_amount"], "dimensions": ["orders.status"], "timeDimensions": [{ "dimension": "orders.created_at", "granularity": "day", "dateRange": "last 30 days" }] } }, headers={"Authorization": "Bearer <token>"} ) data = response.json() print(data["data"]) # Result set print(data["annotation"]) # Schema metadata
References
- Cube Documentation: Official documentation
- dbt Integration: Using Cube with dbt
- cube_dbt package: dbt integration reference
- REST API Reference: REST API documentation
- GitHub Repository: Cube source code
- cube_dbt GitHub: dbt integration package
Remember: This skill provides research guidance, NOT prescriptive semantic models. Always:
- Verify Cube availability and version
- Discover existing Cube models and dbt integration patterns
- Research Cube capabilities when needed (use WebSearch liberally)
- Validate against actual consumption API requirements
- Test REST API queries and measure calculations before considering complete
- Understand dbt → Cube integration for metadata propagation