# Typed Holes Refactoring

> Refactor codebases using the Design by Typed Holes methodology: iterative, test-driven refactoring with formal hole resolution, constraint propagation, and continuous validation. Use when refactoring existing code, optimizing architecture, or consolidating technical debt through systematic hole-driven development.

From the [claude-skill-registry](https://github.com/majiayu000/claude-skill-registry) (`skills/data/cc-polymath-typed-holes-refactor/SKILL.md`). To install:

```sh
T=$(mktemp -d) \
  && git clone --depth=1 https://github.com/majiayu000/claude-skill-registry "$T" \
  && mkdir -p ~/.claude/skills \
  && cp -r "$T/skills/data/cc-polymath-typed-holes-refactor" \
        ~/.claude/skills/majiayu000-claude-skill-registry-typed-holes-refactor \
  && rm -rf "$T"
```
Systematically refactor codebases using the Design by Typed Holes meta-framework: treat architectural unknowns as typed holes, resolve them iteratively with test-driven validation, and propagate constraints through dependency graphs.
## Core Workflow

### Phase 0: Hole Discovery & Setup

1. Create a safe working branch:

   ```sh
   git checkout -b refactor/typed-holes-v1
   # CRITICAL: never work in main; never touch .beads/ in main
   ```

2. Analyze the current state and identify holes:

   ```sh
   python scripts/discover_holes.py  # creates REFACTOR_IR.md with the hole catalog
   ```
The Refactor IR documents:
- Current State Holes: What's unknown about the current system?
- Refactor Holes: What needs resolution to reach the ideal state?
- Constraints: What must be preserved/improved/maintained?
- Dependencies: Which holes block which others?
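To make the IR entries concrete, one way to represent a hole is a small dataclass. This is an illustrative sketch only: the field names (`hole_id`, `question`, `constraints`, `depends_on`, `resolved`) are assumptions, not the exact schema `discover_holes.py` emits.

```python
from dataclasses import dataclass, field

@dataclass
class Hole:
    """One entry in the Refactor IR hole catalog (illustrative schema)."""
    hole_id: str                                       # e.g. "R2"
    question: str                                      # the bounded unknown to resolve
    constraints: list = field(default_factory=list)    # must-hold conditions
    depends_on: list = field(default_factory=list)     # hole ids that block this one
    resolved: bool = False

    def is_ready(self, resolved_ids: set) -> bool:
        """A hole is ready when it is unresolved and all its dependencies are resolved."""
        return not self.resolved and all(d in resolved_ids for d in self.depends_on)

# Example catalog entry: module boundaries depend on the target architecture
r2 = Hole("R2", "How should modules be organized?",
          constraints=["preserve public API"], depends_on=["R1"])
print(r2.is_ready({"R1"}))  # True once R1 is resolved
```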
3. Write baseline characterization tests. Create `tests/characterization/` to capture exact current behavior:

   ```python
   # tests/characterization/test_current_behavior.py

   def test_api_contracts():
       """All public APIs must behave identically post-refactor."""
       for endpoint in discover_public_apis():
           old_result = run_current(endpoint, test_inputs)
           save_baseline(endpoint, old_result)

   def test_performance_baselines():
       """Record current performance - don't regress."""
       baselines = measure_all_operations()
       save_json("baselines.json", baselines)
   ```
Run tests on main branch - they should all pass. These are your safety net.
### Phase 1-N: Iterative Hole Resolution

For each hole (in dependency order):

1. Select the next ready hole:

   ```sh
   python scripts/next_hole.py  # shows holes whose dependencies are resolved
   ```
2. Write validation tests FIRST (test-driven):

   ```python
   # tests/refactor/test_h{N}_resolution.py

   def test_h{N}_resolved():
       """Define what 'resolved correctly' means."""
       # This should FAIL initially
       assert desired_state_achieved()

   def test_h{N}_equivalence():
       """Ensure no behavioral regressions."""
       old_behavior = load_baseline()
       new_behavior = run_refactored()
       assert old_behavior == new_behavior
   ```
3. Implement resolution:
- Refactor code to make tests pass
- Keep characterization tests passing
- Commit incrementally with clear messages
4. Validate the resolution:

   ```sh
   python scripts/validate_resolution.py H{N}  # checks: tests pass, constraints satisfied, main untouched
   ```
5. Propagate constraints:

   ```sh
   python scripts/propagate.py H{N}  # updates dependent holes based on the resolution
   ```
6. Document and commit:

   ```sh
   git add .
   git commit -m "Resolve H{N}: {description}

   - Tests: tests/refactor/test_h{N}_*.py pass
   - Constraints: {constraints satisfied}
   - Propagates to: {dependent holes}"
   ```
### Phase Final: Reporting

Generate a comprehensive delta report:

```sh
python scripts/generate_report.py > REFACTOR_REPORT.md
```
Report includes:
- Hole resolution summary with validation evidence
- Metrics delta (LOC, complexity, coverage, performance)
- Behavioral analysis (intentional changes documented)
- Constraint validation (all satisfied)
- Risk assessment and migration guide
## Key Principles

### 1. Test-Driven Everything
- Write validation criteria BEFORE implementing
- Tests define "correct resolution"
- Characterization tests are sacred - never let them fail
### 2. Hole-Driven Progress
- Resolve holes in dependency order
- Each resolution propagates constraints
- Track everything formally in Refactor IR
### 3. Continuous Validation
Every commit must validate:
- ✅ Characterization tests pass (behavior preserved)
- ✅ Resolution tests pass (hole resolved correctly)
- ✅ Constraints satisfied
- ✅ Main branch untouched
- ✅ `.beads/` intact in main
### 4. Safe by Construction
- Work only in refactor branch
- Main is read-only reference
- Beads are untouchable historical artifacts
### 5. Formal Completeness

The design is complete when:
- All holes resolved and validated
- All constraints satisfied
- All phase gates passed
- Metrics improved or maintained
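These completeness criteria can be checked mechanically. A sketch of the kind of check a dashboard script such as `check_completeness.py` might perform; the input shapes below are illustrative assumptions, not the script's real data model:

```python
def design_complete(holes, constraints, gates):
    """Report whether all holes are resolved, constraints satisfied, gates passed.

    holes:       dict hole_id -> {"resolved": bool}
    constraints: dict constraint_id -> bool (satisfied?)
    gates:       dict gate_name -> bool (passed?)
    """
    unresolved = [h for h, state in holes.items() if not state["resolved"]]
    unsatisfied = [c for c, ok in constraints.items() if not ok]
    failed = [g for g, ok in gates.items() if not ok]
    return {
        "complete": not (unresolved or unsatisfied or failed),
        "unresolved_holes": unresolved,
        "unsatisfied_constraints": unsatisfied,
        "failed_gates": failed,
    }

status = design_complete(
    holes={"R1": {"resolved": True}, "R2": {"resolved": False}},
    constraints={"C1": True},
    gates={"Gate 1": True},
)
print(status["complete"], status["unresolved_holes"])  # False ['R2']
```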
## Hole Quality Framework

### SMART Criteria for Good Holes

Every hole must be:
- **Specific**: Clear, bounded question with a concrete answer
  - ✓ Good: "How should error handling work in the API layer?"
  - ✗ Bad: "How to improve the code?"
- **Measurable**: Has testable validation criteria
  - ✓ Good: "Reduce duplication from 60% to <15%"
  - ✗ Bad: "Make code better"
- **Achievable**: Can be resolved with available information
  - ✓ Good: "Extract parsing logic to separate module"
  - ✗ Bad: "Predict all future requirements"
- **Relevant**: Blocks meaningful progress on refactoring
  - ✓ Good: "Define core interface (blocks 5 other holes)"
  - ✗ Bad: "Decide variable naming convention"
- **Typed**: Clear type/structure for the resolution
  - ✓ Good: `interface Architecture = { layers: Layer[], rules: Rule[] }`
  - ✗ Bad: "Some kind of structure?"
### Hole Estimation Framework
Size holes using these categories:
| Size | Duration | Characteristics | Examples |
|---|---|---|---|
| Nano | 1-2 hours | Simple, mechanical changes | Rename files, update imports |
| Small | 4-8 hours | Single module refactor | Extract class, consolidate functions |
| Medium | 1-3 days | Cross-module changes | Define interfaces, reorganize packages |
| Large | 4-7 days | Architecture changes | Layer extraction, pattern implementation |
| Epic | >7 days | SPLIT THIS HOLE | Too large, break into smaller holes |
Estimation Red Flags:
- More than 3 dependencies → Likely Medium+
- Unclear validation → Add time for discovery
- New patterns/tools → Add learning overhead
### Hole Splitting Guidelines
Split a hole when:
- Estimate exceeds 7 days
- More than 5 dependencies
- Validation criteria unclear
- Multiple distinct concerns mixed
Splitting strategy:
```text
Epic hole: "Refactor entire authentication system"
→ Split into:
  R10_auth_interface:     Define new auth interface (Medium)
  R11_token_handling:     Implement JWT tokens (Small)
  R12_session_management: Refactor sessions (Medium)
  R13_auth_middleware:    Update middleware (Small)
  R14_auth_testing:       Comprehensive test suite (Medium)
```
After splitting:
- Update dependencies in REFACTOR_IR.md
- Run `python scripts/propagate.py` to update the graph
- Re-sync with beads: `python scripts/holes_to_beads.py`
## Common Hole Types

### Architecture Holes

```text
"?R1_target_architecture": "What should the ideal structure be?"
"?R2_module_boundaries":   "How should modules be organized?"
"?R3_abstraction_layers":  "What layers/interfaces are needed?"
```
Validation: Architecture tests, dependency analysis, layer violation checks
### Implementation Holes

```text
"?R4_consolidation_targets": "What code should merge?"
"?R5_extraction_targets":    "What code should split out?"
"?R6_elimination_targets":   "What code should be removed?"
```
Validation: Duplication detection, equivalence tests, dead code analysis
### Quality Holes

```text
"?R7_test_strategy":      "How to validate equivalence?"
"?R8_migration_path":     "How to safely transition?"
"?R9_rollback_mechanism": "How to undo if needed?"
```
Validation: Test coverage metrics, migration dry-runs, rollback tests
See HOLE_TYPES.md for complete catalog.
## Constraint Propagation Rules

### Rule 1: Interface Resolution → Type Constraints

```text
When: Interface hole resolved with concrete types
Then: Propagate type requirements to all consumers

Example:
  Resolve R6: NodeInterface = BaseNode with async run()
  Propagates to:
  → R4: Parallel execution must handle async
  → R5: Error recovery must handle async exceptions
```
### Rule 2: Implementation → Performance Constraints

```text
When: Implementation resolved with resource usage
Then: Propagate limits to dependent holes

Example:
  Resolve R4: Parallelization with max_concurrent=3
  Propagates to:
  → R8: Rate limit = provider_limit / 3
  → R7: Memory budget = 3 * single_operation_memory
```
### Rule 3: Validation → Test Requirements

```text
When: Validation resolved with test requirements
Then: Propagate data needs upstream

Example:
  Resolve R9: Testing needs 50 examples
  Propagates to:
  → R7: Metrics must support batch evaluation
  → R8: Test data collection strategy needed
```
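Mechanically, all three rules do the same thing: attach constraints produced by a resolution to every hole that depends on it. A toy sketch of that pass, assuming dependents and constraints are kept in plain dicts (this is not `propagate.py`'s real API):

```python
def propagate(resolved_id, new_constraints, dependents, constraint_map):
    """Attach constraints produced by one resolution to every dependent hole.

    dependents:     dict hole_id -> list of hole_ids that depend on it
    constraint_map: dict hole_id -> list of constraints accumulated so far (mutated)
    """
    for target in dependents.get(resolved_id, []):
        for c in new_constraints:
            if c not in constraint_map.setdefault(target, []):
                constraint_map[target].append(c)   # avoid duplicate constraints
    return constraint_map

# Rule 1 example: resolving R6 with an async interface
constraints = {}
propagate("R6", ["must handle async"], {"R6": ["R4", "R5"]}, constraints)
print(constraints)  # {'R4': ['must handle async'], 'R5': ['must handle async']}
```

Running this after every resolution keeps each still-open hole's constraint list current before work on it begins.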
See CONSTRAINT_RULES.md for complete propagation rules.
## Success Indicators

### Weekly Progress
- 2-4 holes resolved
- All tests passing
- Constraints satisfied
- Measurable improvements
### Red Flags (Stop & Reassess)
- ❌ Characterization tests fail
- ❌ Hole can't be resolved within constraints
- ❌ Constraints contradict each other
- ❌ No progress for 3+ days
- ❌ Main branch accidentally modified
## Validation Gates

| Gate | Criteria | Check |
|---|---|---|
| Gate 1: Discovery Complete | All holes cataloged, dependencies mapped | [ ] |
| Gate 2: Foundation Holes | Core interfaces resolved, tests pass | [ ] |
| Gate 3: Implementation | All refactor holes resolved, metrics improved | [ ] |
| Gate 4: Production Ready | Migration tested, rollback verified | [ ] |
## Claude-Assisted Workflow

This skill is designed for effective Claude/LLM collaboration. Here's how to divide work:

### Phase 0: Discovery
Claude's Role:
- Run `discover_holes.py` to analyze the codebase
- Suggest holes based on code analysis
- Generate initial REFACTOR_IR.md structure
- Write characterization tests to capture current behavior
- Set up test infrastructure
Your Role:
- Confirm holes are well-scoped
- Prioritize which holes to tackle first
- Review and approve REFACTOR_IR.md
- Define critical constraints
### Phase 1-N: Hole Resolution
Claude's Role:
- Write resolution tests (TDD) BEFORE implementation
- Implement hole resolution to make tests pass
- Run validation scripts: `validate_resolution.py`, `check_foundation.py`
- Update REFACTOR_IR.md with resolution details
- Propagate constraints: `python scripts/propagate.py H{N}`
- Generate commit messages documenting changes
Your Role:
- Make architecture decisions (which pattern, which approach)
- Assess risk and determine constraint priorities
- Review code changes for correctness
- Approve merge to main when complete
### Phase Final: Reporting
Claude's Role:
- Generate comprehensive REFACTOR_REPORT.md
- Document all metrics deltas
- List all validation evidence
- Create migration guides
- Prepare PR description
Your Role:
- Final review of report accuracy
- Approve for production deployment
- Conduct post-refactor retrospective
### Effective Prompting Patterns
Starting a session:
"I need to refactor [description]. Use typed-holes-refactor skill. Start with discovery phase."
Resolving a hole:
"Resolve H3 (target_architecture). Write tests first, then implement. Use [specific pattern/approach]."
Checking progress:
"Run check_completeness.py and show me the dashboard. What's ready to work on next?"
Generating visualizations:
"Generate dependency graph showing bottlenecks and critical path. Use visualize_graph.py with --analyze."
### Claude's Limitations
Claude CANNOT:
- Make subjective architecture decisions (you must decide)
- Determine business-critical constraints (you must specify)
- Run tests that require external services (mock or you run them)
- Merge to main (you must approve and merge)
Claude CAN:
- Analyze code and suggest holes
- Write comprehensive test suites
- Implement resolutions within your constraints
- Generate reports and documentation
- Track progress across sessions (via beads + REFACTOR_IR.md)
## Multi-Session Continuity
At session start:
"Continue typed-holes refactoring. Import beads state and show current status from REFACTOR_IR.md."
Claude will:
- Read REFACTOR_IR.md to understand current state
- Check which holes are resolved
- Identify next ready holes
- Resume where previous session left off
You should:
- Keep REFACTOR_IR.md and .beads/ committed to git
- Export beads state at session end: `bd export -o .beads/issues.jsonl`
- Use /context before starting to ensure Claude has full context
## Beads Integration
Why beads + typed holes?
- Beads tracks issues across sessions (prevents lost work)
- Holes track refactoring-specific state (dependencies, constraints)
- Together: Complete continuity for long-running refactors
### Setup

```sh
# Install beads (once)
go install github.com/steveyegge/beads/cmd/bd@latest

# After running discover_holes.py
python scripts/holes_to_beads.py

# Check what's ready
bd ready --json
```
### Workflow Integration

During hole resolution:

```sh
# Start work on a hole
bd update bd-5 --status in_progress --json

# Implement resolution
# ... write tests, implement code ...

# Validate resolution
python scripts/validate_resolution.py H3

# Close the bead
bd close bd-5 --reason "Resolved H3: target_architecture" --json

# Export state
bd export -o .beads/issues.jsonl
git add .beads/issues.jsonl REFACTOR_IR.md
git commit -m "Resolve H3: Define target architecture"
```
Syncing holes ↔ beads:
```sh
# After updating REFACTOR_IR.md manually
python scripts/holes_to_beads.py  # sync changes to beads

# After resolving holes
python scripts/holes_to_beads.py  # update bead statuses
```
Cross-session continuity:
```sh
# Session start
bd import -i .beads/issues.jsonl
bd ready --json                       # shows ready holes
python scripts/check_completeness.py  # shows overall progress

# Session end
bd export -o .beads/issues.jsonl
git add .beads/issues.jsonl
git commit -m "Session checkpoint: 3 holes resolved"
```
Bead advantages:
- Tracks work across days/weeks
- Shows the dependency graph: `bd deps bd-5`
- Integrates with overall project management
## Scripts Reference

All scripts are in `scripts/`:

- `discover_holes.py` - Analyze codebase and generate REFACTOR_IR.md
- `next_hole.py` - Show next resolvable holes based on dependencies
- `validate_resolution.py` - Check if a hole resolution satisfies constraints
- `propagate.py` - Update dependent holes after a resolution
- `generate_report.py` - Create comprehensive delta report
- `check_discovery.py` - Validate Phase 0 completeness (Gate 1)
- `check_foundation.py` - Validate Phase 1 completeness (Gate 2)
- `check_implementation.py` - Validate Phase 2 completeness (Gate 3)
- `check_production.py` - Validate Phase 3 readiness (Gate 4)
- `check_completeness.py` - Overall progress dashboard
- `visualize_graph.py` - Generate hole dependency visualization
- `holes_to_beads.py` - Sync holes with beads issues

Run any script with `--help` for detailed usage.
## Meta-Consistency
This skill uses its own principles:
| Typed Holes Principle | Application to Refactoring |
|---|---|
| Typed Holes | Architectural unknowns cataloged with types |
| Constraint Propagation | Design constraints flow through dependency graph |
| Iterative Refinement | Hole-by-hole resolution cycles |
| Test-Driven Validation | Tests define correctness |
| Formal Completeness | Gates verify design completeness |
We use the system to refactor the system.
## Advanced Topics
For complex scenarios, see:
- HOLE_TYPES.md - Detailed hole taxonomy
- CONSTRAINT_RULES.md - Complete propagation rules
- VALIDATION_PATTERNS.md - Test patterns for different hole types
- EXAMPLES.md - Complete worked examples
## Quick Start Example

```sh
# 1. Setup
git checkout -b refactor/typed-holes-v1
python scripts/discover_holes.py

# 2. Write baseline tests
# Create tests/characterization/test_*.py

# 3. Resolve first hole
python scripts/next_hole.py              # shows H1 is ready
# Write tests/refactor/test_h1_*.py (fails initially)
# Refactor code until tests pass
python scripts/validate_resolution.py H1
python scripts/propagate.py H1
git commit -m "Resolve H1: ..."

# 4. Repeat for each hole
# ...

# 5. Generate report
python scripts/generate_report.py > REFACTOR_REPORT.md
```
## Troubleshooting

### Characterization tests fail
Symptom: Tests that captured baseline behavior now fail
Resolution:

- Revert or inspect changes: `git diff` to see what changed
- Investigate: What behavior changed and why?
- Decision tree:
  - Intentional change: Update baselines with documentation

    ```python
    # Update baseline with a reason
    save_baseline("v2_api", new_behavior,
                  reason="Switched to async implementation")
    ```

  - Unintentional regression: Fix the code, tests must pass
Prevention: Run characterization tests before AND after each hole resolution.
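The `save_baseline`/`load_baseline` helpers referenced above are not specified by this skill; a minimal JSON-on-disk sketch of how they could work (the `BASELINE_DIR` location and `reason` field are illustrative choices):

```python
import json
from pathlib import Path

BASELINE_DIR = Path("tests/characterization/baselines")  # illustrative location

def save_baseline(name, behavior, reason=None):
    """Persist observed behavior (and an optional change reason) as JSON."""
    BASELINE_DIR.mkdir(parents=True, exist_ok=True)
    payload = {"behavior": behavior, "reason": reason}
    (BASELINE_DIR / f"{name}.json").write_text(json.dumps(payload, indent=2))

def load_baseline(name):
    """Load the recorded behavior for comparison against the refactored code."""
    payload = json.loads((BASELINE_DIR / f"{name}.json").read_text())
    return payload["behavior"]

save_baseline("v2_api", {"status": 200}, reason="Switched to async implementation")
print(load_baseline("v2_api"))  # {'status': 200}
```

Storing the `reason` alongside the data means every intentional baseline update documents itself.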
### Hole can't be resolved
Symptom: Stuck on a hole for >3 days, unclear how to proceed
Resolution:

1. Check dependencies: are they actually resolved?

   ```sh
   python scripts/visualize_graph.py --analyze  # look for unresolved dependencies
   ```

2. Review constraints: are they contradictory?
   - Example: C1 "preserve all behavior" + C5 "change API contract" → contradictory
   - Fix: renegotiate constraints with stakeholders

3. Split the hole, if it is too large:

   ```text
   # Original: R4_consolidate_all (Epic, 10+ days)
   # Split into:
   R4a_consolidate_parsers (Medium, 2 days)
   R4b_consolidate_validators (Small, 1 day)
   R4c_consolidate_handlers (Medium, 2 days)
   ```

4. Check for circular dependencies:

   ```sh
   python scripts/visualize_graph.py  # look for cycles: R4 → R5 → R6 → R4
   ```

   - Fix: break the cycle by introducing an intermediate hole or redefining dependencies
Escalation: If still stuck after 5 days, consider alternative refactoring approach.
### Contradictory Constraints
Symptom: Cannot satisfy all constraints simultaneously
Example:
- C1: "Preserve exact current behavior" (backward compatibility)
- C5: "Reduce response time by 50%" (performance improvement)
- Current behavior includes slow, synchronous operations
Resolution Framework:

1. Identify the conflict:

   ```text
   C1 requires: keep synchronous operations
   C5 requires: switch to async operations
   → Contradiction: can't be both sync and async
   ```

2. Negotiate priorities:

   | Option | C1 | C5 | Tradeoff |
   |---|---|---|---|
   | A: Keep sync | ✓ | ✗ | No performance gain |
   | B: Switch to async | ✗ | ✓ | Breaking change |
   | C: Add async, deprecate sync | ⚠️ | ✓ | Migration burden |

3. Choose a resolution strategy:
   - Relax constraint: change C1 to "Preserve behavior where possible"
   - Add migration period: option C implemented over 2 releases
   - Split into phases: Phase 1 (C1), Phase 2 (C5)

4. Document the decision:

   ```markdown
   ## Constraint Resolution: C1 vs C5

   **Decision**: Relax C1 to allow async migration
   **Rationale**: Performance critical for user experience
   **Migration**: 3-month deprecation period for sync API
   **Approved by**: [Stakeholder], [Date]
   ```
### Circular Dependencies

Symptom: `visualize_graph.py` shows cycles

Example:

```text
R4 (consolidate parsers) → depends on R6 (define interface)
R6 (define interface)    → depends on R4 (needs parser examples)
```
Resolution strategies:

1. Introduce an intermediate hole:

   ```text
   H0_parser_analysis: Analyze existing parsers (no dependencies)
   R6_interface:       Define interface using H0 analysis
   R4_consolidate:     Implement using R6 interface
   ```

2. Redefine dependencies:
   - Maybe R4 doesn't actually need R6
   - Or R6 only needs partial R4 (split R4)

3. Accept iterative refinement:

   ```text
   R6_interface_v1: Initial interface (simple)
   R4_consolidate:  Implement with v1 interface
   R6_interface_v2: Refine based on R4 learnings
   ```
Prevention: Define architecture holes before implementation holes.
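Cycle detection itself is a standard graph traversal. A sketch of how a `visualize_graph.py`-style check could find one cycle via depth-first search with a gray/black coloring (illustrative, not the script's actual code):

```python
def find_cycle(deps):
    """Return one dependency cycle as a list of hole ids, or None if acyclic.

    deps: dict hole_id -> list of hole_ids it depends on
    """
    WHITE, GRAY, BLACK = 0, 1, 2       # unvisited / on current path / done
    color = {h: WHITE for h in deps}
    stack = []

    def visit(h):
        color[h] = GRAY
        stack.append(h)
        for d in deps.get(h, []):
            if color.get(d, WHITE) == GRAY:          # back edge: cycle found
                return stack[stack.index(d):] + [d]
            if color.get(d, WHITE) == WHITE:
                found = visit(d)
                if found:
                    return found
        stack.pop()
        color[h] = BLACK
        return None

    for h in deps:
        if color[h] == WHITE:
            found = visit(h)
            if found:
                return found
    return None

print(find_cycle({"R4": ["R6"], "R6": ["R4"]}))  # ['R4', 'R6', 'R4']
print(find_cycle({"R4": ["R6"], "R6": []}))      # None
```

The returned path shows exactly which holes to restructure when breaking the cycle.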
### No Progress for 3+ Days
Symptom: Feeling stuck, no commits, uncertain how to proceed
Resolution checklist:

1. Review REFACTOR_IR.md: are holes well-defined (SMART criteria)?
   - If not: rewrite holes to be more specific
2. Check hole size: is the current hole estimated at >7 days?
   - If yes: split into smaller holes
3. Run the dashboard: `python scripts/check_completeness.py`
   - Are you working on a blocked hole?
   - Switch to a ready hole instead
4. Visualize dependencies: `python scripts/visualize_graph.py --analyze`
   - Identify bottlenecks
   - Look for parallel work opportunities
5. Review constraints: are they achievable?
   - Renegotiate if necessary
6. Seek external review:
   - Share REFACTOR_IR.md with a colleague
   - Get feedback on the approach
7. Consider an alternative: maybe this refactor isn't feasible
   - Document why
   - Propose a different approach
Reset protocol: If still stuck, revert to last working state and try different approach.
### Estimation Failures
Symptom: Hole taking 3x longer than estimated
Analysis:

1. Why did the estimate fail?
   - Underestimated complexity
   - Unforeseen dependencies
   - Unclear requirements
   - Technical issues (tool problems, infrastructure)
2. Immediate actions:
   - Update REFACTOR_IR.md with a revised estimate
   - If >7 days, split the hole
   - Update beads: `bd update bd-5 --note "Revised estimate: 5 days (was 2)"`
3. Future improvements:
   - Use actual times to calibrate future estimates
   - Add a buffer for discovery (20% overhead)
   - Note uncertainty in the IR: "Estimate: 2-4 days (high uncertainty)"
Learning: Track actual vs estimated time in REFACTOR_REPORT.md for future reference.
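Calibration can be as simple as averaging the ratio of actual to estimated time and scaling new estimates by it. A sketch under that assumption (the history format below is made up for illustration; the skill does not prescribe one):

```python
def calibration_factor(history):
    """Average actual/estimated ratio; multiply future raw estimates by this.

    history: list of (estimated_hours, actual_hours) pairs from resolved holes
    """
    ratios = [actual / estimated for estimated, actual in history]
    return sum(ratios) / len(ratios)

history = [(8, 16), (4, 6), (16, 24)]   # hours: (estimated, actual) per hole
factor = calibration_factor(history)
print(round(factor, 2))                 # 1.67
new_estimate = 8 * factor               # scale the next raw 8-hour estimate
```

Even three or four data points like these usually expose a consistent optimism bias worth correcting for.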
Begin with Phase 0: Discovery. Always work in a branch. Test first, refactor second.