Claude-skill-registry-data metaxy

This skill should be used when the user asks to "define a feature", "create a BaseFeature class", "track feature versions", "set up metadata store", "field-level dependencies", "FieldSpec", "FeatureDep", "run metaxy CLI", "metaxy migrations", or needs guidance on metaxy feature definitions, versioning, metadata stores, CLI commands, or testing patterns.

install
source · Clone the upstream repo
git clone https://github.com/majiayu000/claude-skill-registry-data
Claude Code · Install into ~/.claude/skills/
T=$(mktemp -d) && git clone --depth=1 https://github.com/majiayu000/claude-skill-registry-data "$T" && mkdir -p ~/.claude/skills && cp -r "$T/data/metaxy" ~/.claude/skills/majiayu000-claude-skill-registry-data-metaxy && rm -rf "$T"
manifest: data/metaxy/SKILL.md

Metaxy

Metaxy is a metadata layer for multi-modal data and ML pipelines that manages and tracks feature versions, dependencies, and data lineage across complex computational graphs.

Core Concepts

Feature Definitions

To define a feature, create a class inheriting from mx.BaseFeature with a FeatureSpec metaclass argument:

import metaxy as mx


class MyFeature(
    mx.BaseFeature,
    spec=mx.FeatureSpec(
        key="my/feature",
        id_columns=["sample_id"],
        fields=["embedding", "score"],
    ),
):
    sample_id: str
    embedding: list[float]
    score: float

To add dependencies between features, use the deps parameter with FeatureDep. To specify field-level dependencies (so that a field depends on only part of the upstream data), use FieldSpec with FieldDep or FieldsMapping.
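To see why field-level dependencies matter, the following minimal sketch works out which downstream fields become stale when a single upstream field changes. It uses plain Python rather than the Metaxy API: the FIELD_DEPS map, the feature keys, and the affected_fields helper are all hypothetical and exist only for illustration.

```python
from collections import deque

# Hypothetical field-level dependency map (not taken from Metaxy):
# (feature, field) -> list of upstream (feature, field) pairs.
FIELD_DEPS = {
    ("video/frames", "frames"): [("video/raw", "video")],
    ("video/transcript", "text"): [("video/raw", "audio")],
    ("video/summary", "summary"): [("video/transcript", "text")],
}


def affected_fields(changed, field_deps):
    """Return every field that transitively depends on `changed`."""
    # Invert the map so each field points at its direct dependents.
    dependents = {}
    for field, upstreams in field_deps.items():
        for upstream in upstreams:
            dependents.setdefault(upstream, []).append(field)

    # Breadth-first walk over the dependents.
    seen = set()
    queue = deque([changed])
    while queue:
        current = queue.popleft()
        for child in dependents.get(current, []):
            if child not in seen:
                seen.add(child)
                queue.append(child)
    return seen


# Only fields downstream of the audio track are stale; "frames" is untouched.
stale = affected_fields(("video/raw", "audio"), FIELD_DEPS)
```

With feature-level dependencies alone, any change to video/raw would mark all three downstream features as stale; the field-level map narrows that to the transcript and the summary.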

Data Versioning

Metaxy automatically tracks sample versions and propagates changes through the dependency graph. To trigger recomputation when code changes, set code_version on FieldSpec:

fields = [
    mx.FieldSpec(key="embedding", code_version="2"),  # Bump to invalidate downstream
]
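To illustrate how bumping code_version can invalidate downstream data, here is a minimal sketch of version propagation. It is an assumption about the general technique (hashing a field's code version together with its upstream versions), not a description of how Metaxy computes versions internally. The field_version and graph_versions helpers are hypothetical, and the sketch assumes an acyclic graph.

```python
import hashlib


def field_version(code_version, upstream_versions):
    """Hash a field's code_version together with its upstream field versions."""
    digest = hashlib.sha256()
    digest.update(code_version.encode())
    for version in sorted(upstream_versions):
        digest.update(b"|")
        digest.update(version.encode())
    return digest.hexdigest()[:12]


def graph_versions(code_versions, deps):
    """Compute a version per field; `deps` maps field -> list of upstream fields."""
    resolved = {}

    def resolve(name):
        # Resolve upstream fields first, then hash them into this field's version.
        if name not in resolved:
            upstream = [resolve(parent) for parent in deps.get(name, [])]
            resolved[name] = field_version(code_versions[name], upstream)
        return resolved[name]

    for name in code_versions:
        resolve(name)
    return resolved


# Hypothetical chain: frames -> embedding -> score.
deps = {"embedding": ["frames"], "score": ["embedding"]}
before = graph_versions({"frames": "1", "embedding": "1", "score": "1"}, deps)
after = graph_versions({"frames": "1", "embedding": "2", "score": "1"}, deps)
```

Bumping the code version of embedding changes the versions of embedding and score, while frames keeps its version because it sits upstream of the change.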

Metadata Stores

To configure a metadata store, create a metaxy.toml file or use programmatic configuration:

with mx.MetaxyConfig(stores={"dev": mx.DeltaMetadataStore(root_path="/tmp/metaxy")}).use() as config:
    store = config.get_store("dev")

Supported backends: DuckDB, ClickHouse, BigQuery, LanceDB, Delta Lake.

Feature Graph

To visualize and manage the feature dependency graph, use the CLI:

mx graph render            # Terminal visualization
mx push --store dev        # Push graph to store

CLI

Metaxy provides a CLI (metaxy, or its mx alias) for managing features, metadata, and migrations:

mx list features --verbose     # List features with dependencies
mx graph render                # Visualize feature graph
mx metadata status --all-features  # Check metadata freshness (expensive!)
mx migrations apply            # Apply pending migrations
mx mcp                         # Start MCP server for AI assistants

Testing

To test features in isolation, use context managers to avoid polluting the global registry:

import pytest
import metaxy as mx
from metaxy.metadata_store.delta import DeltaMetadataStore


@pytest.fixture
def metaxy_env(tmp_path):
    with mx.FeatureGraph().use():
        store = DeltaMetadataStore(root_path=tmp_path / "delta_test")
        with mx.MetaxyConfig(stores={"test": store}).use() as config:
            yield config

Examples

For complete code examples, see:

  • examples/feature-definitions.md - Feature classes with dependencies and field-level deps
  • examples/configuration.md - TOML and programmatic configuration
  • examples/metadata-stores.md - Store operations
  • examples/testing.md - Test isolation patterns
  • examples/cli.md - CLI command reference

Documentation

For comprehensive documentation: https://anam-org.github.io/metaxy/

Key pages: