Claude-skill-registry code-clone-assistant
Detect and refactor code duplication with PMD CPD. TRIGGERS - code clones, DRY violations, duplicate code.
install
source · Clone the upstream repo
git clone https://github.com/majiayu000/claude-skill-registry
Claude Code · Install into ~/.claude/skills/
T=$(mktemp -d) && git clone --depth=1 https://github.com/majiayu000/claude-skill-registry "$T" && mkdir -p ~/.claude/skills && cp -r "$T/skills/data/code-clone-assistant" ~/.claude/skills/majiayu000-claude-skill-registry-code-clone-assistant && rm -rf "$T"
manifest: skills/data/code-clone-assistant/SKILL.md
Code Clone Assistant
Detect code clones and guide refactoring using PMD CPD (exact duplicates) + Semgrep (patterns).
Tools
- PMD CPD v7.17.0+: Exact duplicate detection
- Semgrep v1.140.0+: Pattern-based detection
Tested: October 2025 - 30 violations detected across 3 sample files
Coverage: ~3x more violations than using either tool alone
When to Use
Triggers: "find duplicate code", "DRY violations", "refactor similar code", "detect code duplication", "similar validation logic", "repeated patterns", "copy-paste code", "exact duplicates"
Why Two Tools?
PMD CPD and Semgrep detect different clone types:
| Aspect | PMD CPD | Semgrep |
|---|---|---|
| Detects | Exact copy-paste duplicates | Similar patterns with variations |
| Scope | Across files ✅ | Within a file; across files with Semgrep Pro only |
| Matching | Token-based (ignores formatting) | Pattern-based (AST matching) |
| Rules | ❌ No custom rules | ✅ Custom rules |
Result: in the October 2025 test above, using both tools found ~3x more DRY violations than either tool alone.
Clone Types
| Type | Description | PMD CPD | Semgrep |
|---|---|---|---|
| Type-1 | Exact copies | ✅ Default | ✅ |
| Type-2 | Renamed identifiers | ✅ With `--ignore-identifiers` / `--ignore-literals` (language-dependent) | ✅ |
| Type-3 | Near-miss with variations | ⚠️ Partial | ✅ Patterns |
| Type-4 | Semantic clones (same behavior) | ❌ | ❌ |
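To make the Type-1 and Type-2 rows concrete, here is a minimal sketch of token-based comparison using Python's standard `tokenize` module. It is an illustration of the idea only, not how PMD CPD is implemented; the function `token_stream`, its `normalize_names` parameter, and the sample snippets are hypothetical names chosen for this example.

```python
import io
import keyword
import tokenize

# Tokens that carry only commentary or blank-line layout.
_SKIP = {tokenize.COMMENT, tokenize.NL, tokenize.ENDMARKER}

# Structural tokens whose text depends on whitespace; keep the kind, drop the text.
_STRUCTURAL = {
    tokenize.NEWLINE: "NEWLINE",
    tokenize.INDENT: "INDENT",
    tokenize.DEDENT: "DEDENT",
}


def token_stream(source: str, normalize_names: bool = False) -> list[str]:
    """Return the source as token strings with comments and formatting removed.

    With normalize_names=True, every non-keyword identifier becomes "ID" and
    every number or string literal becomes "LIT", so renamed copies compare equal.
    """
    tokens = []
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        if tok.type in _SKIP:
            continue
        if tok.type in _STRUCTURAL:
            tokens.append(_STRUCTURAL[tok.type])
            continue
        text = tok.string
        if normalize_names:
            if tok.type == tokenize.NAME and not keyword.iskeyword(text):
                text = "ID"
            elif tok.type in (tokenize.NUMBER, tokenize.STRING):
                text = "LIT"
        tokens.append(text)
    return tokens


# Hypothetical sample snippets.
ORIGINAL = "def total(items):\n    return sum(i.price for i in items)\n"
# Type-1: same code, different spacing, indentation, and an added comment.
REFORMATTED = (
    "def total( items ):\n"
    "        # sum the prices\n"
    "        return sum( i.price  for i in items )\n"
)
# Type-2: same structure, renamed identifiers.
RENAMED = "def cost(rows):\n    return sum(r.price for r in rows)\n"

if __name__ == "__main__":
    print(token_stream(ORIGINAL) == token_stream(REFORMATTED))  # True
    print(token_stream(ORIGINAL) == token_stream(RENAMED))      # False
    print(token_stream(ORIGINAL, True) == token_stream(RENAMED, True))  # True
```

The reformatted copy matches without any normalization, which is the Type-1 case. The renamed copy matches only once identifiers and literals are collapsed, which is why Type-2 detection depends on an identifier-insensitive comparison. Type-3 clones, where statements are added or removed, would not match under either setting.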
Quick Start Workflow
    # Step 1: Detect exact duplicates (PMD CPD)
    pmd cpd -d . -l python --minimum-tokens 20 -f markdown > pmd-results.md

    # Step 2: Detect pattern violations (Semgrep)
    semgrep --config=clone-rules.yaml --sarif --quiet > semgrep-results.sarif

    # Step 3: Analyze combined results (Claude Code)
    # Parse both outputs, prioritize by severity

    # Step 4: Refactor (Claude Code with user approval)
    # Extract shared functions, consolidate patterns, verify tests
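Step 3 leaves the parsing to the assistant. As a rough sketch of what that could look like for the Semgrep side, the following reads a SARIF 2.1.0 log and orders findings by severity. It assumes the standard SARIF layout (`runs[].results[]` with `ruleId`, `level`, and `locations`); the function names, the `SAMPLE` data, and its rule IDs are hypothetical. A result with no explicit `level` is treated as `warning` here, and rule-level default configuration is not consulted.

```python
import json
from collections import Counter

# SARIF result levels, most severe first.
LEVEL_ORDER = {"error": 0, "warning": 1, "note": 2, "none": 3}


def load_sarif_file(path: str) -> dict:
    """Read a SARIF log (for example semgrep-results.sarif) from disk."""
    with open(path, encoding="utf-8") as handle:
        return json.load(handle)


def load_findings(sarif: dict) -> list[dict]:
    """Flatten every result in every run into a small dict."""
    findings = []
    for run in sarif.get("runs", []):
        for result in run.get("results", []):
            # Use the first location if present; tolerate results without one.
            locations = result.get("locations") or [{}]
            physical = locations[0].get("physicalLocation", {})
            findings.append({
                "rule": result.get("ruleId", "<unknown>"),
                "level": result.get("level", "warning"),  # assumption: missing means warning
                "file": physical.get("artifactLocation", {}).get("uri"),
                "line": physical.get("region", {}).get("startLine"),
            })
    return findings


def prioritize(findings: list[dict]) -> list[dict]:
    """Sort by severity, then by file and line for stable output."""
    return sorted(
        findings,
        key=lambda f: (
            LEVEL_ORDER.get(f["level"], len(LEVEL_ORDER)),
            f["file"] or "",
            f["line"] or 0,
        ),
    )


def _location(uri: str, line: int) -> dict:
    return {"physicalLocation": {"artifactLocation": {"uri": uri},
                                 "region": {"startLine": line}}}


# Hypothetical data shaped like a SARIF log; rule IDs are made up.
SAMPLE = {
    "version": "2.1.0",
    "runs": [{"results": [
        {"ruleId": "duplicate-validation", "level": "warning",
         "locations": [_location("b.py", 10)]},
        {"ruleId": "duplicate-validation", "level": "error",
         "locations": [_location("a.py", 3)]},
        {"ruleId": "repeated-error-handling", "locations": []},
    ]}],
}

if __name__ == "__main__":
    ordered = prioritize(load_findings(SAMPLE))
    for finding in ordered:
        print(finding["level"], finding["rule"], finding["file"], finding["line"])
    print(Counter(f["rule"] for f in ordered))
```

Against a real scan, `prioritize(load_findings(load_sarif_file("semgrep-results.sarif")))` would produce the same kind of ordered list. The PMD CPD markdown report has a different shape and would need its own parser before the two result sets can be merged.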
Reference Documentation
For detailed information, see:
- Detection Commands - PMD CPD and Semgrep command details
- Complete Workflow - Detection, analysis, and presentation phases
- Refactoring Strategies - Approaches for addressing violations