Claude-starter cleanup-dedupe

Detect duplicated code blocks and refactor to DRY where it reduces complexity. Runs jscpd (multi-language), filters by signal-to-noise, and auto-extracts only token-identical blocks ≥30 LOC. Use when the user asks to deduplicate, DRY up, find copy-paste, or consolidate repeated logic.

install

source · Clone the upstream repo

git clone https://github.com/raintree-technology/claude-starter

Claude Code · Install into ~/.claude/skills/

T=$(mktemp -d) && git clone --depth=1 https://github.com/raintree-technology/claude-starter "$T" && mkdir -p ~/.claude/skills && cp -r "$T/skills/code-quality/cleanup-dedupe" ~/.claude/skills/raintree-technology-claude-starter-cleanup-dedupe && rm -rf "$T"

manifest: skills/code-quality/cleanup-dedupe/SKILL.md

source content

Find duplicated code and consolidate where it reduces complexity. Auto-extract only token-identical, sizeable blocks. Smaller or divergent duplicates often shouldn't be DRY'd — premature abstraction is worse than 3 similar lines.

Preflight

Language detect — jscpd handles 150+ languages, so we run it on whatever is present.
Git state: refuse auto-apply on dirty tree.
Report dir: ensure `.claude/cleanup-reports/` exists.
Read existing util conventions: where does the project keep shared utilities? Look for `lib/`, `utils/`, `shared/`, `common/`, and package directories. Extracted code goes there.
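
The preflight checks could be scripted roughly as follows. This is a minimal sketch, not part of the skill itself: the helper names are hypothetical, and the `packages/` location for package directories is an assumption about a typical monorepo layout.

```python
import subprocess
from pathlib import Path

# Directory names listed in the preflight step.
UTIL_DIR_NAMES = ("lib", "utils", "shared", "common")


def tree_is_clean(porcelain_output: str) -> bool:
    """`git status --porcelain` prints nothing when the working tree is clean."""
    return porcelain_output.strip() == ""


def git_tree_is_clean(repo: Path) -> bool:
    """Run git in `repo` and report whether auto-apply is allowed."""
    result = subprocess.run(
        ["git", "status", "--porcelain"],
        cwd=repo, capture_output=True, text=True, check=True,
    )
    return tree_is_clean(result.stdout)


def ensure_report_dir(repo: Path) -> Path:
    """Create .claude/cleanup-reports/ if it is missing and return its path."""
    report_dir = repo / ".claude" / "cleanup-reports"
    report_dir.mkdir(parents=True, exist_ok=True)
    return report_dir


def find_util_dirs(repo: Path) -> list[Path]:
    """Return existing top-level util directories plus any packages/* directories."""
    found = [repo / name for name in UTIL_DIR_NAMES if (repo / name).is_dir()]
    packages = repo / "packages"  # assumption: monorepo packages live here
    if packages.is_dir():
        found.extend(sorted(p for p in packages.iterdir() if p.is_dir()))
    return found
```

If `git_tree_is_clean` returns `False`, the run would continue in report-only mode rather than applying extractions.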

Detect

# jscpd - the standard for cross-language clone detection
npx jscpd --min-tokens 70 --min-lines 30 --threshold 0 --reporters json --output /tmp/jscpd-out .

Parse `/tmp/jscpd-out/jscpd-report.json`. Each duplication entry has `firstFile`, `secondFile`, `lines`, `tokens`, and the actual fragment.
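
A minimal sketch of reading the report into typed records. It assumes the entries sit under a top-level `duplicates` array and that `firstFile` and `secondFile` each carry `name`, `start`, and `end`. That layout is an assumption about jscpd's JSON reporter and should be checked against an actual report. The `Clone` class and function names are hypothetical.

```python
import json
from dataclasses import dataclass
from pathlib import Path


@dataclass(frozen=True)
class Clone:
    first_file: str
    first_start: int
    first_end: int
    second_file: str
    second_start: int
    second_end: int
    lines: int
    tokens: int
    fragment: str


def parse_report(report: dict) -> list[Clone]:
    """Convert the decoded JSON report into Clone records."""
    clones = []
    for dup in report.get("duplicates", []):  # assumed top-level key
        first, second = dup["firstFile"], dup["secondFile"]
        clones.append(Clone(
            first_file=first["name"],
            first_start=first["start"],
            first_end=first["end"],
            second_file=second["name"],
            second_start=second["start"],
            second_end=second["end"],
            lines=dup["lines"],
            tokens=dup.get("tokens", 0),
            fragment=dup.get("fragment", ""),
        ))
    return clones


def load_report(path: str = "/tmp/jscpd-out/jscpd-report.json") -> list[Clone]:
    """Read the report file written by the Detect step."""
    return parse_report(json.loads(Path(path).read_text(encoding="utf-8")))
```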

Filter the noise before further analysis:

Drop matches in `node_modules/`, `dist/`, `build/`, `.next/`, `__generated__/`, and `*.generated.*`.

Drop matches inside test files mirroring each other (test setup duplication is often intentional).
Drop matches that are just type/interface definitions; those go to `cleanup-types`.
Drop matches in migration files.
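
The path-based filters above can be sketched as a single predicate. The test-file patterns and the `migrations` directory name are assumptions, since the skill does not say how those files are recognized. The type/interface filter is left out because it needs the fragment's content, not just its path. Paths are assumed to use forward slashes.

```python
from fnmatch import fnmatch
from pathlib import PurePosixPath

# Build-output and generated directories named in the filter list.
NOISE_DIRS = {"node_modules", "dist", "build", ".next", "__generated__"}


def is_noise(path: str) -> bool:
    """True for vendored, built, generated, or migration files."""
    p = PurePosixPath(path)
    if NOISE_DIRS & set(p.parts):
        return True
    if fnmatch(p.name, "*.generated.*"):
        return True
    # Assumption: migrations live under a directory named "migrations".
    return "migrations" in p.parts


def is_test_file(path: str) -> bool:
    """Assumption: a few common test naming conventions, not an exhaustive list."""
    p = PurePosixPath(path)
    patterns = ("*.test.*", "*.spec.*", "test_*.py", "*_test.py")
    return "__tests__" in p.parts or any(fnmatch(p.name, pat) for pat in patterns)


def keep_clone(first_file: str, second_file: str) -> bool:
    """Keep a clone only if neither side is noise and it is not test-to-test."""
    if is_noise(first_file) or is_noise(second_file):
        return False
    # Test files mirroring each other are often intentional duplication.
    if is_test_file(first_file) and is_test_file(second_file):
        return False
    return True
```

A clone between a test file and production code is kept here, because the filter only targets test files mirroring each other. The HIGH rubric still excludes it from auto-extraction.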

Assess

Write `.claude/cleanup-reports/cleanup-dedupe-{YYYY-MM-DD}.md`:

# Duplication Assessment — YYYY-MM-DD

## Summary
- Total clones found: N (after filtering noise)
- HIGH confidence (auto-extractable): X
- MEDIUM (similar but divergent): Y
- LOW (structural similarity, intentional): Z
- Estimated LOC saved (HIGH only): ~N

## Clones

### Clone 1 — HIGH (extract to `packages/utils/src/format.ts`)
- Files: `apps/app/features/holdings/format.ts:15-67`, `apps/admin/features/users/format.ts:22-74`
- 52 lines, 380 tokens, identical
- Both implement `formatCompactNumber(n: number): string`
- Extract to: `packages/utils/src/format.ts`; both sites import it.

### Clone 2 — MEDIUM
- Files: A and B
- Similar structure but divergent in 3 spots — the formatting differs by locale, the rounding by precision.
- Recommendation: don't extract yet — the abstraction needs a parameterization design that the human should decide.

## Critical Assessment
[2-3 paragraphs: are duplicates concentrated in one area? Is there a missing shared package? Are the duplicates a sign that an early abstraction would have been wrong, OR that a missing abstraction is hurting?]

Apply

Auto-extract HIGH-confidence only.

Confidence rubric

HIGH (auto-apply):

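≥30 LOC AND ≥70 tokens (the thresholds passed to jscpd in the Detect command, which are stricter than jscpd's defaults).
Token-identical as reported by jscpd. Note that the Detect command does not set `--mode`, and jscpd's `strict` mode compares every token, comments included; use `--mode weak` if comment and whitespace differences should be ignored.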
Same function signature OR reducible to one with no parameter changes.
Across 2+ files in the same workspace/package boundary OR a clear shared package exists.
Not in test files, fixtures, or generated code.

MEDIUM (report only):

Similar but with 1-3 spot divergences — needs parameterization design.
Cross-package duplication where extraction requires creating a new shared package.
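15-29 LOC: borderline; sometimes 3 similar functions are clearer than 1 over-parameterized one. The Detect command's `--min-lines 30` will not surface these, so they appear only if detection is re-run with a lower threshold.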

LOW (note, no action):

Structural similarity (e.g., 5 React components with the same prop-spreading pattern) — usually intentional.
Test setup boilerplate.
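
The rubric can be read as a small decision function. This is a sketch under stated assumptions: the boolean inputs are hypothetical flags that an earlier analysis step would have to compute, and sending clones under 15 LOC to LOW is an assumption, since the rubric does not cover them.

```python
from enum import Enum


class Confidence(Enum):
    HIGH = "HIGH"      # auto-apply
    MEDIUM = "MEDIUM"  # report only
    LOW = "LOW"        # note, no action


def classify(
    *,
    lines: int,
    tokens: int,
    token_identical: bool,
    same_signature: bool,
    destination_exists: bool,
    in_test_or_generated: bool,
    structural_only: bool = False,
) -> Confidence:
    """Map one clone to a confidence tier following the rubric above."""
    # Structural similarity and test/generated code never lead to action.
    if structural_only or in_test_or_generated:
        return Confidence.LOW
    meets_size = lines >= 30 and tokens >= 70
    if meets_size and token_identical and same_signature and destination_exists:
        return Confidence.HIGH
    # Divergent, cross-package without a shared home, or 15-29 LOC.
    if lines >= 15:
        return Confidence.MEDIUM
    # Assumption: anything smaller is not worth reporting as a candidate.
    return Confidence.LOW
```

Every HIGH condition must hold at once. A single missing condition, such as no existing shared package, drops the clone to MEDIUM rather than blocking it outright.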

Execution (HIGH only)

Determine the destination:
- Same package: existing `lib/` or `utils/` directory.
- Different packages: shared package if one exists (e.g., `packages/utils/`), else escalate to MEDIUM (don't auto-create new packages).
Create the new util file with the extracted function. Use the canonical name from one of the source files; prefer the more descriptive name.
Replace both source occurrences with imports of the new util.
Delete the originals.
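
The destination rule in the first step might look like this in code. It is a sketch only: the `apps/` and `packages/` workspace roots, the default shared package path, and both function names are assumptions made for illustration. Returning `None` stands for "escalate to MEDIUM".

```python
from pathlib import PurePosixPath
from typing import Optional


def package_root(path: str, roots: tuple[str, ...] = ("apps", "packages")) -> str:
    """Return e.g. 'apps/app' for a workspace file, or '' for the repo root package."""
    parts = PurePosixPath(path).parts
    if len(parts) >= 2 and parts[0] in roots:
        return f"{parts[0]}/{parts[1]}"
    return ""


def choose_destination(
    first_file: str,
    second_file: str,
    existing_dirs: set[str],
    shared_package: str = "packages/utils",  # assumption: example path from the text
) -> Optional[str]:
    """Pick an existing directory for the extracted util, or None to escalate."""
    a, b = package_root(first_file), package_root(second_file)
    if a == b:
        # Same package: reuse its existing lib/ or utils/ directory.
        for name in ("lib", "utils"):
            candidate = f"{a}/{name}" if a else name
            if candidate in existing_dirs:
                return candidate
        return None
    # Different packages: only an already existing shared package qualifies.
    return shared_package if shared_package in existing_dirs else None
```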

Single commit:

chore(cleanup): cleanup-dedupe — extracted N shared utilities

Verify

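# Note: `||` runs the fallback whenever the first command fails, including on a genuine check or test failure, so read the output as well as the exit status.
npm run check 2>&1 || (npx tsc --noEmit && npx eslint .)
npm test 2>&1 || pytest 2>&1
# Re-run jscpd to confirm reduction
npx jscpd --min-tokens 70 --min-lines 30 . 2>&1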

If verify fails: revert and downgrade all to MEDIUM.
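
One way to wire up the revert-on-failure rule, shown as a sketch. It assumes the extraction is the most recent commit and undoes it with `git revert`. The skill only says "revert", so the exact git command is an assumption. The injectable `run` parameter is a hypothetical convenience for testing.

```python
import subprocess
from typing import Callable, Sequence

Runner = Callable[[Sequence[str]], int]


def run_command(cmd: Sequence[str]) -> int:
    """Run a command and return its exit code."""
    return subprocess.run(list(cmd)).returncode


def verify_or_revert(checks: Sequence[Sequence[str]], run: Runner = run_command) -> bool:
    """Run each check in order; on the first failure, revert and return False."""
    for cmd in checks:
        if run(cmd) != 0:
            # Assumption: the single cleanup commit is HEAD.
            run(["git", "revert", "--no-edit", "HEAD"])
            return False
    return True
```

A caller that gets `False` back would then relabel every HIGH clone as MEDIUM in the report, for example after `verify_or_revert([["npx", "tsc", "--noEmit"], ["npm", "test"]])`.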

Output

"Extracted N shared utilities, saved ~M LOC. K duplicates deferred — see report."
Report path.
Verify status.

NEVER

Extract code from generated/`.d.ts`/migration files.
Create a new shared package automatically — escalate that decision.
Extract React/Vue components on prop-pattern similarity alone — composition vs. abstraction is design judgment.
Force an abstraction that requires changing function signatures of either source — that's a behavior change, not a refactor.
DRY two similar test cases — tests benefit from explicitness; let them duplicate.
Extract code that has different error handling or different side effects in each location — not actually duplicate behavior.