Claude-code-plugins bug-clustering
install
source · Clone the upstream repo
git clone https://github.com/jeremylongshore/claude-code-plugins-plus-skills
Claude Code · Install into ~/.claude/skills/
T=$(mktemp -d) && git clone --depth=1 https://github.com/jeremylongshore/claude-code-plugins-plus-skills "$T" && mkdir -p ~/.claude/skills && cp -r "$T/plugins/mcp/x-bug-triage/skills/bug-clustering" ~/.claude/skills/jeremylongshore-claude-code-plugins-bug-clustering && rm -rf "$T"
manifest:
plugins/mcp/x-bug-triage/skills/bug-clustering/SKILL.mdsource content
Bug Clustering Process
Step-by-step procedure for transforming raw XPost objects into structured, clustered bug candidates with PII redaction and reliability scoring.
Instructions
Step 1: Parse
For each XPost, produce a BugCandidate with all 33 fields using
lib/parser.ts:
- Extract product_surface, feature_area, symptoms, error_strings, repro_hints
- Extract urls, media_keys, language, conversation references
- Determine source_type (mention, reply, quote_post, search_hit)
Step 1.5: Deduplicate
Before classification, run content-similarity deduplication using
lib/dedupe.ts:
- Call
with parsed candidates and thededuplicateCandidates()
fromcandidate_dedup.hybrid_similarity_threshold
(default 0.70)config/cluster-matching-thresholds.json - Uses char-trigram + token-Jaccard hybrid similarity
- Does NOT remove posts — tags them as duplicate groups with a canonical post (highest engagement)
- Only canonical posts and non-duplicates (
) proceed to classificationforward_ids - Log dedup stats:
"{n} posts ({m} unique, {k} duplicate groups)"
Step 2: Classify
Run
lib/classifier.ts on each candidate:
- Assign one of 12 classifications with confidence score (0.0-1.0) and rationale
- Sarcastic bug reports get classified separately — still treated as signal
Step 3: Redact PII
Run
lib/redactor.ts on each candidate:
- Detect 6 PII types: email, API key, phone, account ID, media flag, URL token
- Replace with
tags[REDACTED:type] - Set pii_flags array and raw_text_storage_policy
Step 4: Score Reliability
Run
lib/reporter-scorer.ts on each candidate:
- 4 dimensions: report quality, independence, account authenticity, historical accuracy
- Composite reporter_reliability_score (0.0-1.0)
Step 5: Tag Reporter Category
Match author against approved_accounts config:
- Categories: public, internal, partner, tester
Step 6: Cluster
Using
lib/clusterer.ts and lib/signatures.ts:
- Generate deterministic bug signature from error_strings + symptoms + feature_area
- Match against active_clusters at >=70% signature overlap
- Family-first guard: different ClusterFamilies NEVER cluster together
- New match: create cluster (initial severity "low")
- Existing match: update report_count, last_seen, sub_status
- Resolved match: reopen with sub_status "regression_reopened"
- Suppressed match: skip, log to audit
Step 7: Persist
- Insert candidates to DB via
lib/db.ts - Insert/update clusters and cluster_posts junction
- Write audit events for each classification, redaction, and cluster action
References
Load evidence tier definitions for proper cluster evidence assessment:
!cat skills/x-bug-triage/references/evidence-policy.md
Load data model reference for BugCandidate fields and cluster schemas:
!cat skills/x-bug-triage/references/schemas.md