Claude-awesome-stack notebook-refactor
Extract Jupyter notebook cells into tested Python modules while preserving the exploration workflow. Use when converting prototyping notebooks into production code.
install
source · Clone the upstream repo
git clone https://github.com/giacomogaglione/claude-awesome-stack
Claude Code · Install into ~/.claude/skills/
T=$(mktemp -d) && git clone --depth=1 https://github.com/giacomogaglione/claude-awesome-stack "$T" && mkdir -p ~/.claude/skills && cp -r "$T/stacks/python-ml/skills/notebook-refactor" ~/.claude/skills/giacomogaglione-claude-awesome-stack-notebook-refactor && rm -rf "$T"
manifest: stacks/python-ml/skills/notebook-refactor/SKILL.md
source content
Notebook Refactoring Skill
Convert Jupyter notebook exploration code into clean, tested Python modules.
Process
1. Analyze the Notebook
Read the notebook and identify:
- Data loading cells -> `src/data/` module
- Preprocessing/transformation cells -> `src/data/` or `src/preprocessing/` module
- Model definition cells -> `src/models/` module
- Training loop cells -> `src/training/` module
- Evaluation/metrics cells -> `src/evaluation/` module
- Visualization cells -> keep in notebook (these are exploratory)
- Configuration values (magic numbers, paths) -> `src/config/` or config file
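As a rough illustration of the analysis step, the sketch below reads the notebook's JSON structure (code cells carry `cell_type` and `source`) and groups cell indices by destination. The keyword lists in `TARGETS` and the function name `classify_cells` are hypothetical heuristics for illustration, not part of the skill; a real pass would still need a human or model to read each cell.

```python
from collections import defaultdict

# Hypothetical keyword heuristics; tune these for the notebook at hand.
TARGETS = {
    "src/data/": ("read_csv", "read_parquet"),
    "src/models/": ("nn.Module", "Sequential("),
    "src/training/": ("optimizer", ".fit(", "for epoch"),
    "src/evaluation/": ("accuracy", "f1_score", "confusion_matrix"),
    "keep in notebook": ("plt.", "sns.", ".plot("),
}


def classify_cells(notebook: dict) -> dict[str, list[int]]:
    """Group code-cell indices by the first target whose keyword appears."""
    groups: dict[str, list[int]] = defaultdict(list)
    for index, cell in enumerate(notebook.get("cells", [])):
        if cell.get("cell_type") != "code":
            continue
        source = cell.get("source", "")
        # The notebook format stores source as a string or a list of lines.
        text = "".join(source) if isinstance(source, list) else source
        target = next(
            (name for name, keys in TARGETS.items() if any(k in text for k in keys)),
            "unclassified",
        )
        groups[target].append(index)
    return dict(groups)


# A tiny in-memory notebook; a real one would come from json.load(open(path)).
example = {
    "cells": [
        {"cell_type": "markdown", "source": ["# Title"]},
        {"cell_type": "code", "source": ["df = pd.read_csv('data.csv')\n"]},
        {"cell_type": "code", "source": "plt.hist(df['x'])"},
        {"cell_type": "code", "source": ["x = 1"]},
    ]
}
print(classify_cells(example))
```

Cells that land in "unclassified" are good candidates for manual review, since they often hold the configuration values mentioned above.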
2. Extract Functions
For each group of cells:
- Identify inputs and outputs of the cell block
- Extract into a function with:
  - Type-annotated parameters for all inputs
  - A clear return type
  - A docstring explaining what it does and why
- Replace hardcoded values with parameters
- Remove `display()`, `print()` debugging statements
- Keep the notebook cell but replace the code with an import + function call
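The extraction steps above might look like the following minimal sketch. The original cell, the function name `clip_and_scale`, and the module path in the comment are all hypothetical; they only illustrate the shape of the change.

```python
# Hypothetical notebook cell before extraction:
#   values = [min(v, 100.0) for v in values]
#   print(values[:5])
#   values = [v / 100.0 for v in values]


def clip_and_scale(values: list[float], upper: float) -> list[float]:
    """Cap each value at ``upper`` and divide by ``upper``.

    Capping limits the influence of outliers; dividing puts non-negative
    inputs on a common [0, 1] scale. ``upper`` is a parameter so the
    threshold is no longer hardcoded in the cell.
    """
    if upper <= 0:
        raise ValueError("upper must be positive")
    return [min(value, upper) / upper for value in values]


# Notebook cell after extraction: an import plus one call.
# from src.preprocessing.scaling import clip_and_scale  # hypothetical module path
scaled = clip_and_scale([10.0, 250.0, 50.0], upper=100.0)
```

The debugging `print()` is gone, the magic number `100.0` has become the `upper` argument, and the cell itself stays in the notebook as a one-line call.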
3. Write Tests
For each extracted function, write tests that:
- Use small, deterministic test fixtures (not the full dataset)
- Test the function's contract (input types -> output types/shapes)
- Test edge cases (empty input, single row, missing values)
- Use `np.testing.assert_allclose` for numerical outputs
- Do NOT test exact numerical values from model operations (non-deterministic)
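A sketch of what such tests could look like, written as plain functions that pytest would collect. The `standardize` function is a hypothetical stand-in for an extracted preprocessing step; its handling of empty input and constant columns is an assumption made for this example.

```python
import numpy as np


def standardize(x: np.ndarray) -> np.ndarray:
    """Hypothetical extracted function: zero-mean, unit-variance columns."""
    x = np.asarray(x, dtype=float)
    if x.size == 0:
        return x.copy()
    mean = np.nanmean(x, axis=0)
    std = np.nanstd(x, axis=0)
    std = np.where(std == 0, 1.0, std)  # constant columns map to zero
    return (x - mean) / std


def test_contract_shape_and_dtype():
    x = np.array([[1.0, 10.0], [3.0, 30.0]])  # small deterministic fixture
    out = standardize(x)
    assert out.shape == x.shape
    assert out.dtype == np.float64


def test_known_values():
    out = standardize(np.array([[1.0], [3.0]]))
    np.testing.assert_allclose(out, [[-1.0], [1.0]])


def test_empty_input():
    assert standardize(np.empty((0, 2))).shape == (0, 2)


def test_single_row():
    out = standardize(np.array([[5.0, 7.0]]))
    np.testing.assert_allclose(out, [[0.0, 0.0]])


def test_missing_values_stay_missing():
    out = standardize(np.array([[1.0], [np.nan], [3.0]]))
    assert np.isnan(out[1, 0])
    np.testing.assert_allclose(out[[0, 2], 0], [-1.0, 1.0])
```

The tolerance-based comparison applies here because `standardize` is a deterministic transform; for model outputs, the shape and dtype checks in the first test are the safer pattern.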
4. Update the Notebook
After extraction, the notebook should:
- Import from the new modules instead of defining functions inline
- Still be runnable end-to-end
- Serve as a high-level walkthrough / documentation of the pipeline
- Keep exploratory visualizations and analysis inline
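One way to spot-check the first point is to scan the notebook for functions or classes that are still defined inline. This is a minimal sketch using the standard-library `ast` module; the function name `inline_definitions` and the sample notebook are hypothetical, and the magic-line filter is deliberately simple.

```python
import ast


def inline_definitions(notebook: dict) -> list[tuple[int, str]]:
    """Return (cell index, name) for functions/classes still defined in code cells."""
    found: list[tuple[int, str]] = []
    for index, cell in enumerate(notebook.get("cells", [])):
        if cell.get("cell_type") != "code":
            continue
        source = cell.get("source", "")
        text = "".join(source) if isinstance(source, list) else source
        # Drop IPython magics and shell escapes, which are not valid Python.
        lines = [
            line for line in text.splitlines()
            if not line.lstrip().startswith(("%", "!"))
        ]
        try:
            tree = ast.parse("\n".join(lines))
        except SyntaxError:
            continue  # skip cells this simple filter cannot parse
        for node in tree.body:
            if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)):
                found.append((index, node.name))
    return found


refactored = {
    "cells": [
        {
            "cell_type": "code",
            "source": ["%matplotlib inline\n", "from src.data.loading import load_raw\n"],
        },
        {"cell_type": "code", "source": ["def clean(df):\n", "    return df.dropna()\n"]},
    ]
}
print(inline_definitions(refactored))  # [(1, 'clean')]
```

An empty result does not prove the notebook runs end-to-end; it only confirms that top-level definitions have moved out of the cells.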
5. Refactoring Checklist
Before marking complete:
- All extracted functions have type hints
- All extracted functions have tests
- Notebook still runs end-to-end with imports
- No hardcoded paths or magic numbers remain
- No unused imports in extracted modules
- `pyproject.toml` or `setup.py` updated if new packages are needed
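The unused-imports item can be approximated with a short `ast` pass, shown below as a sketch. The helper name `unused_imports` is hypothetical, and the check only looks at bare names, so it misses cases such as re-exports via `__all__`; a dedicated linter such as Pyflakes is more thorough.

```python
import ast


def unused_imports(source: str) -> list[str]:
    """Return imported names that are never referenced as a bare name."""
    tree = ast.parse(source)
    imported: set[str] = set()
    for node in ast.walk(tree):
        if isinstance(node, ast.Import):
            for alias in node.names:
                # "import a.b" binds the top-level name "a".
                imported.add(alias.asname or alias.name.split(".")[0])
        elif isinstance(node, ast.ImportFrom):
            for alias in node.names:
                imported.add(alias.asname or alias.name)
    used = {node.id for node in ast.walk(tree) if isinstance(node, ast.Name)}
    return sorted(imported - used)


module_source = (
    "import os\n"
    "import numpy as np\n"
    "from pathlib import Path\n"
    "\n"
    "def parts(p):\n"
    "    return np.asarray(Path(p).parts)\n"
)
print(unused_imports(module_source))  # ['os']
```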