Claude-skill-registry Notebook KISS Builder & Verifier (Pixi + DuckDB)
Create/refactor Jupyter notebooks for AI-agent workflows with per-directory Pixi kernels (pixi.toml), narrative-first KISS structure (markdown above every code cell), robust data loading (DuckDB + TSV/Parquet), beautiful plots, and strict "run-all-cells" validation before reporting completion.
git clone https://github.com/majiayu000/claude-skill-registry
T=$(mktemp -d) && git clone --depth=1 https://github.com/majiayu000/claude-skill-registry "$T" && mkdir -p ~/.claude/skills && cp -r "$T/skills/data/jupyter-notebook-ai-agents-skill" ~/.claude/skills/majiayu000-claude-skill-registry-notebook-kiss-builder-verifier-pixi-duckdb && rm -rf "$T"
skills/data/jupyter-notebook-ai-agents-skill/SKILL.md
When to trigger this skill
Use this skill whenever the user asks to create, refactor, clean up, lint, or productionize a Jupyter notebook (or a Jupytext notebook script) and they care about:
- Reproducibility (restart + run-all should work end-to-end)
- Per-directory environments via pixi.toml (Pixi + pixi-kernel)
- Readable narrative (concise markdown guidance above each code cell)
- Reliable data access (DuckDB + tabular files, correct paths)
- Presentation quality (plots and markdown are visually polished)
If the task is not notebook-centric (e.g., pure library code), do not trigger.
Non‑negotiables (hard rules)
- KISS notebook: short, linear, top-to-bottom, no hidden dependencies between cells.
- Markdown-first: every code cell must be preceded by a markdown cell that:
  - states intent in 1–3 sentences or bullets,
  - tells what artifact/output will appear,
  - notes assumptions (paths, schema, expected shapes).
- Reproducible execution gate: never claim “done” until:
  - you restart the kernel (clean state) and execute all cells in order,
  - you inspect outputs for correctness/sanity (not just “no exceptions”),
  - you fix any warnings/errors that impact correctness.
- Paths must be correct: data files are loaded using paths anchored to the notebook/project directory (see docs/data_loading_duckdb.md). Avoid hard-coded home directories.
- Pretty, tight plots: minimize whitespace; use a cohesive, non-default palette; label axes; include units; readable figure sizes.
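The path rule above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the content of docs/data_loading_duckdb.md: the helper names find_project_root and require_file, the data/ subdirectory, and the choice of pixi.toml as the root marker are all hypothetical.

```python
from pathlib import Path


def find_project_root(start: Path, marker: str = "pixi.toml") -> Path:
    """Walk upward from `start` until a directory containing `marker` is found.

    Assumption: the project root is the nearest ancestor holding pixi.toml.
    """
    start = start.resolve()
    for candidate in (start, *start.parents):
        if (candidate / marker).is_file():
            return candidate
    raise FileNotFoundError(f"No {marker} found at or above {start}")


def require_file(path: Path) -> Path:
    """Fail early, with a readable message, if a data file is missing."""
    if not path.is_file():
        raise FileNotFoundError(f"Expected data file not found: {path}")
    return path


# Typical notebook usage (hypothetical layout and file name):
# PROJECT_ROOT = find_project_root(Path.cwd())
# DATA_DIR = PROJECT_ROOT / "data"
# events_path = require_file(DATA_DIR / "events.tsv")
```

Anchoring on a marker file rather than on the current user's home directory keeps the notebook portable between machines and between subdirectories of the same project.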
Progressive disclosure (keep context lean)
The core rules live here. Load additional guidance only as needed:
- Notebook structure & markdown style: docs/notebook_structure.md
- Pixi + Jupyter kernel setup: docs/pixi_jupyter.md
- Data loading patterns (DuckDB + TSV/Parquet): docs/data_loading_duckdb.md
- Plot styling rules (tight layout, palettes): docs/plot_style.md
- Verification & “definition of done”: docs/verification.md
Templates:
- Jupytext-first notebook template: templates/kiss_notebook_template.py
- Minimal pixi.toml example: templates/pixi.toml
- Optional DuckDB bootstrap: templates/duckdb_bootstrap.sql
Automation scripts (if filesystem + Python execution is available):
- Execute notebook end-to-end: scripts/execute_notebook.py
- Lint structure (markdown above code): scripts/lint_notebook_structure.py
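To make the "markdown above code" check concrete, here is a minimal sketch of what such a lint could do. It is not the actual scripts/lint_notebook_structure.py; the function names are hypothetical. It relies only on the fact that an .ipynb file is JSON with a top-level "cells" list whose entries carry a "cell_type".

```python
import json
from pathlib import Path


def find_unannotated_code_cells(notebook: dict) -> list[int]:
    """Return indices of code cells not immediately preceded by a markdown cell."""
    cells = notebook.get("cells", [])
    offenders = []
    for index, cell in enumerate(cells):
        if cell.get("cell_type") != "code":
            continue
        previous = cells[index - 1] if index > 0 else None
        if previous is None or previous.get("cell_type") != "markdown":
            offenders.append(index)
    return offenders


def lint_notebook_file(path: Path) -> list[int]:
    """Load an .ipynb file (plain JSON) and lint its cell order."""
    with path.open(encoding="utf-8") as handle:
        return find_unannotated_code_cells(json.load(handle))
```

An empty result means every code cell has a markdown cell directly above it. A stricter variant might also reject markdown cells whose source is empty.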
Recommended workflow (agent playbook)
Follow this sequence; do not skip the validation gate.
1) Plan the notebook (outline first)
- Create/confirm the notebook’s narrative outline:
  - Title + 3-line purpose
  - Environment & reproducibility notes
  - Data sources (files, DBs), schema expectations
  - Analysis/EDA/modeling steps
  - Results + conclusions + next steps
2) Scaffold the notebook
- Use the template in templates/kiss_notebook_template.py.
- Keep sections small; each section should have:
  - a markdown heading,
  - 1–3 code cells max.
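A scaffold along these lines might look like the sketch below, written in Jupytext's percent format, where `# %% [markdown]` opens a markdown cell and `# %%` opens a code cell. It is not the contents of templates/kiss_notebook_template.py; the title, section names, and seed value are hypothetical.

```python
# %% [markdown]
# # Example analysis (hypothetical title)
# Purpose: summarize a small set of random samples.
# Environment: run with the kernel defined by this directory's pixi.toml.
# Inputs: none; this sketch builds its data in memory.

# %% [markdown]
# ## Setup
# Intent: import dependencies and fix the random seed.
# Output: none. Assumption: only the standard library is needed.

# %%
import random
import statistics

SEED = 0  # hypothetical seed; any fixed value makes reruns reproducible
rng = random.Random(SEED)

# %% [markdown]
# ## Summary
# Intent: draw 100 samples and report their mean.
# Output: one printed line. Assumption: samples lie in [0, 1).

# %%
samples = [rng.random() for _ in range(100)]
mean_value = statistics.fmean(samples)
print(f"n={len(samples)} mean={mean_value:.3f}")
```

Each code cell sits directly under a markdown cell stating intent, expected output, and assumptions, and the file stays a valid Python script that runs top to bottom.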
3) Implement data access robustly
- Establish PROJECT_ROOT and DATA_DIR.
- Validate file existence before reading.
- Prefer DuckDB for heavy joins/aggregations; keep pandas for presentation.
4) Create high-quality plots
- Use the plot style helper (see docs/plot_style.md).
- No chart junk; tight margins; consistent typography.
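One way such a style helper could look is sketched below, assuming matplotlib is installed. It is not the helper described in docs/plot_style.md; the function name apply_plot_style, the palette, and the plotted values are made up for illustration.

```python
import matplotlib

matplotlib.use("Agg")  # headless backend for scripts; omit this line inside a notebook
import matplotlib.pyplot as plt
from cycler import cycler

# Hypothetical palette; any cohesive, non-default set of colors would do.
PALETTE = ["#264653", "#2a9d8f", "#e9c46a", "#f4a261", "#e76f51"]


def apply_plot_style() -> None:
    """Set compact, consistent defaults for every figure in the notebook."""
    plt.rcParams.update(
        {
            "axes.prop_cycle": cycler(color=PALETTE),
            "figure.figsize": (6.0, 3.5),
            "figure.constrained_layout.use": True,  # trims excess whitespace
            "axes.spines.top": False,
            "axes.spines.right": False,
            "font.size": 10,
            "savefig.bbox": "tight",
        }
    )


apply_plot_style()
fig, ax = plt.subplots()
ax.plot([0, 1, 2, 3], [0.0, 1.2, 1.9, 3.1], marker="o", label="sensor A")
ax.set_xlabel("Time (s)")  # axis labels include units
ax.set_ylabel("Distance (m)")
ax.set_title("Distance over time (made-up data)")
ax.legend(frameon=False)
```

Calling the helper once near the top of the notebook keeps typography and colors consistent across figures without repeating style arguments in every plotting cell.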
5) Validation gate (mandatory)
- Restart kernel → run all cells.
- If you have CLI access, also run scripts/execute_notebook.py for a clean execution.
- Check outputs:
  - row counts, null rates, unique keys, value ranges,
  - plot renders and labels,
  - any randomness is seeded.
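The tabular checks in this list (row counts, null rates, unique keys, value ranges) can be expressed as a small assertion helper. The sketch below uses only the standard library and operates on a list of dict rows; the function name check_table and its parameters are hypothetical, not part of the skill's scripts.

```python
def check_table(rows, key, ranges, max_null_rate=0.0):
    """Return a list of readable problems; an empty list means all checks passed.

    rows:   list of dicts, one per record
    key:    column expected to identify each row uniquely
    ranges: mapping of column name to an inclusive (low, high) pair
    """
    if not rows:
        return ["table is empty"]
    problems = []

    # Unique keys: the key column must identify each row exactly once.
    keys = [row.get(key) for row in rows]
    if len(set(keys)) != len(keys):
        problems.append(f"duplicate values in key column {key!r}")

    for column, (low, high) in ranges.items():
        present = [row[column] for row in rows if row.get(column) is not None]
        # Null rate: share of rows where the column is missing or None.
        null_rate = 1 - len(present) / len(rows)
        if null_rate > max_null_rate:
            problems.append(
                f"{column}: null rate {null_rate:.0%} exceeds {max_null_rate:.0%}"
            )
        # Value range: every present value must lie in [low, high].
        if any(not (low <= value <= high) for value in present):
            problems.append(f"{column}: value outside [{low}, {high}]")
    return problems
```

In a notebook, a final cell could assert that the returned list is empty, so that a restart and run-all fails loudly when the data looks wrong rather than only when an exception is raised.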
6) Report completion only after passing the gate
When reporting back, include:
- how you ran the notebook (restart+run-all, scripts),
- where data paths point,
- what key outputs/figures were produced,
- any caveats (e.g., external files required).