Claude-starter cleanup-dedupe
Detect duplicated code blocks and refactor to DRY where it reduces complexity. Runs jscpd (multi-language), filters by signal-to-noise, and auto-extracts only token-identical blocks ≥30 LOC. Use when the user asks to deduplicate, DRY up, find copy-paste, or consolidate repeated logic.
install
source · Clone the upstream repo
git clone https://github.com/raintree-technology/claude-starter
Claude Code · Install into ~/.claude/skills/
T=$(mktemp -d) && git clone --depth=1 https://github.com/raintree-technology/claude-starter "$T" && mkdir -p ~/.claude/skills && cp -r "$T/skills/code-quality/cleanup-dedupe" ~/.claude/skills/raintree-technology-claude-starter-cleanup-dedupe && rm -rf "$T"
manifest:
skills/code-quality/cleanup-dedupe/SKILL.mdsource content
Find duplicated code and consolidate where it reduces complexity. Auto-extract only token-identical, sizeable blocks. Smaller or divergent duplicates often shouldn't be DRY'd — premature abstraction is worse than 3 similar lines.
Preflight
- Language detect — jscpd handles 150+ languages, so we run it on whatever is present.
- Git state: refuse auto-apply on dirty tree.
- Report dir: ensure
exists..claude/cleanup-reports/ - Read existing util conventions: where does the project keep shared utilities? Look for
,lib/
,utils/
,shared/
, package directories. Extracted code goes there.common/
Detect
# jscpd - the standard for cross-language clone detection npx jscpd --min-tokens 70 --min-lines 30 --threshold 0 --reporters json --output /tmp/jscpd-out .
Parse
/tmp/jscpd-out/jscpd-report.json. Each duplication entry has firstFile, secondFile, lines, tokens, and the actual fragment.
Filter the noise before further analysis:
- Drop matches in
,node_modules/
,dist/
,build/
,.next/
,__generated__/
.*.generated.* - Drop matches inside test files mirroring each other (test setup duplication is often intentional).
- Drop matches that are just type/interface definitions — those go to
.cleanup-types - Drop matches in migration files.
Assess
Write
.claude/cleanup-reports/cleanup-dedupe-{YYYY-MM-DD}.md:
# Duplication Assessment — YYYY-MM-DD ## Summary - Total clones found: N (after filtering noise) - HIGH confidence (auto-extractable): X - MEDIUM (similar but divergent): Y - LOW (structural similarity, intentional): Z - Estimated LOC saved (HIGH only): ~N ## Clones ### Clone 1 — HIGH (extract to `packages/utils/src/format.ts`) - Files: `apps/app/features/holdings/format.ts:15-67`, `apps/admin/features/users/format.ts:22-74` - 52 lines, 380 tokens, identical - Both implement `formatCompactNumber(n: number): string` - Extract to: `packages/utils/src/formatCompactNumber.ts`, both sites import. ### Clone 2 — MEDIUM - Files: A and B - Similar structure but divergent in 3 spots — the formatting differs by locale, the rounding by precision. - Recommendation: don't extract yet — the abstraction needs a parameterization design that the human should decide. ## Critical Assessment [2-3 paragraphs: are duplicates concentrated in one area? Is there a missing shared package? Are the duplicates a sign that an early abstraction would have been wrong, OR that a missing abstraction is hurting?]
Apply
Auto-extract HIGH-confidence only.
Confidence rubric
HIGH (auto-apply):
- ≥30 LOC AND ≥70 tokens (jscpd defaults).
- Token-identical (jscpd's strict mode, after stripping comments and whitespace).
- Same function signature OR reducible to one with no parameter changes.
- Across 2+ files in the same workspace/package boundary OR a clear shared package exists.
- Not in test files, fixtures, or generated code.
MEDIUM (report only):
- Similar but with 1-3 spot divergences — needs parameterization design.
- Cross-package duplication where extraction requires creating a new shared package.
- 15-29 LOC — borderline; sometimes 3 similar functions are clearer than 1 over-parameterized one.
LOW (note, no action):
- Structural similarity (e.g., 5 React components with the same prop-spreading pattern) — usually intentional.
- Test setup boilerplate.
Execution (HIGH only)
- Determine the destination:
- Same package: existing
orlib/
directory.utils/ - Different packages: shared package if one exists (e.g.,
), else escalate to MEDIUM (don't auto-create new packages).packages/utils/
- Same package: existing
- Create the new util file with the extracted function. Use the canonical name from one of the source files; prefer the more descriptive name.
- Replace both source occurrences with imports of the new util.
- Delete the originals.
- Single commit:
.chore(cleanup): cleanup-dedupe — extracted N shared utilities
Verify
npm run check 2>&1 || (npx tsc --noEmit && npx eslint .) npm test 2>&1 || pytest 2>&1 # Re-run jscpd to confirm reduction npx jscpd --min-tokens 70 --min-lines 30 . 2>&1
If verify fails: revert and downgrade all to MEDIUM.
Output
- "Extracted N shared utilities, saved ~M LOC. K duplicates deferred — see report."
- Report path.
- Verify status.
NEVER
- Extract code from generated/
/migration files..d.ts - Create a new shared package automatically — escalate that decision.
- Extract React/Vue components on prop-pattern similarity alone — composition vs. abstraction is design judgment.
- Force an abstraction that requires changing function signatures of either source — that's a behavior change, not a refactor.
- DRY two similar test cases — tests benefit from explicitness; let them duplicate.
- Extract code that has different error handling or different side effects in each location — not actually duplicate behavior.