Aiwg research-gap-detect
Build the mutual citation graph, find connected components, identify isolated clusters, and optionally search for bridge candidates and file gap issues. Automates the manual cluster analysis workflow.
git clone https://github.com/jmagly/aiwg
T=$(mktemp -d) && git clone --depth=1 https://github.com/jmagly/aiwg "$T" && mkdir -p ~/.claude/skills && cp -r "$T/.agents/skills/research-gap-detect" ~/.claude/skills/jmagly-aiwg-research-gap-detect && rm -rf "$T"
.agents/skills/research-gap-detect/SKILL.md
Research Gap Detect
Analyze the research corpus citation graph to find disconnected clusters, isolated papers, and gap opportunities. Optionally searches for bridge paper candidates and files gap issues.
Triggers
- "find research gaps"
- "detect clusters"
- "cluster analysis"
- "find isolated papers"
- "bridge candidate search"
/research-gap-detect
Parameters
- --clusters-only (optional): Only run cluster detection; skip bridge search and issue filing.
- --file-issues (optional): Auto-file gap issues for each disconnected cluster pair.
- --search-bridges (optional): Search external databases for papers that could bridge disconnected clusters.
- --min-cluster-size N (optional): Minimum number of papers a cluster must contain to be reported. Default: 2.
- --format (optional): Output format: full (default), summary, or json.
Execution Flow
Phase 1: Build Citation Graph
- Read the citation-network index (from /corpus-index-build --graph citation-network)
  - If stale or missing: run /corpus-index-build --graph citation-network first
- Build an adjacency list from outgoing + incoming edges
- Treat as undirected for cluster detection (A cites B ≡ A connected to B)
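The adjacency-list step can be sketched as follows. This is a minimal illustration, not the skill's actual implementation: the on-disk format of the citation-network index is not described here, so the sketch assumes the edges have already been loaded as hypothetical `(citing, cited)` pairs, and the function name `build_undirected_adjacency` is invented for the example.

```python
def build_undirected_adjacency(edges, nodes=()):
    """Build an undirected adjacency map from directed citation edges.

    edges: iterable of (citing, cited) id pairs (assumed input shape).
    nodes: optional ids to register even if they have no edges at all.
    """
    adjacency = {}
    for node in nodes:
        adjacency.setdefault(node, set())
    for citing, cited in edges:
        adjacency.setdefault(citing, set())
        adjacency.setdefault(cited, set())
        if citing != cited:  # assumption: self-citations carry no cluster information
            # "A cites B" is stored in both directions, so direction is dropped.
            adjacency[citing].add(cited)
            adjacency[cited].add(citing)
    return adjacency


edges = [("REF-001", "REF-016"), ("REF-024", "REF-016"), ("REF-198", "REF-201")]
graph = build_undirected_adjacency(edges, nodes=["REF-299"])
```

Storing each edge in both directions means incoming and outgoing citations are merged in a single pass, and papers with no edges still appear as keys so they can later be reported as isolated.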
Phase 2: Connected Components (BFS)
Run BFS/connected-components on the undirected citation graph:
- Initialize: all nodes unvisited
- For each unvisited node: BFS to find its connected component
- Collect components sorted by size (largest first)
Output:
    Connected Components: 9

    Cluster 1: "Agentic Workflows" (124 papers)
      Hub: REF-016 (34 connections)
      Topics: agentic-workflows, multi-agent, orchestration
      Sample: REF-001, REF-016, REF-024, REF-121 ...

    Cluster 2: "GUI Agents" (31 papers)
      Hub: REF-198 (12 connections)
      Topics: gui-agents, web-agents, screen-understanding
      Sample: REF-198, REF-201, REF-215 ...

    ...

    Cluster 9: "Isolated" (3 papers)
      No hub (all degree 1)
      REF-299, REF-312, REF-350
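The Phase 2 component search can be sketched with a standard breadth-first traversal. The sketch assumes the graph is a plain dict mapping each paper id to the set of its neighbours. The `hub_of` helper treats the hub as the highest-degree node, which matches the "connections" count in the sample output but is an assumption about how the skill picks hubs.

```python
from collections import deque


def connected_components(adjacency):
    """Return connected components of an undirected graph, largest first."""
    visited = set()
    components = []
    for start in adjacency:
        if start in visited:
            continue
        # BFS from every node that has not been reached yet.
        visited.add(start)
        queue = deque([start])
        component = []
        while queue:
            node = queue.popleft()
            component.append(node)
            for neighbor in adjacency.get(node, ()):
                if neighbor not in visited:
                    visited.add(neighbor)
                    queue.append(neighbor)
        components.append(sorted(component))
    components.sort(key=len, reverse=True)
    return components


def hub_of(component, adjacency):
    """Hypothetical hub rule: the node with the most connections."""
    return max(component, key=lambda node: len(adjacency.get(node, ())))


adjacency = {
    "REF-001": {"REF-016"},
    "REF-016": {"REF-001", "REF-024"},
    "REF-024": {"REF-016"},
    "REF-198": {"REF-201"},
    "REF-201": {"REF-198"},
    "REF-299": set(),
}
components = connected_components(adjacency)
hub = hub_of(components[0], adjacency)
```

Marking a node as visited when it is enqueued, rather than when it is dequeued, keeps each node in the queue at most once, so the traversal stays linear in nodes plus edges.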
Phase 3: Gap Analysis
For each pair of clusters, assess the gap:
- Topic overlap — do the clusters share any tags?
- Temporal overlap — do they cover the same years?
- Author overlap — do any authors appear in both clusters?
- Bridgeability — could a single paper connect them?
Prioritize gaps by:
- Size product — larger clusters disconnected = higher priority
- Topic proximity — clusters with related but not identical topics
- Recency — newer clusters may simply be missing recent cross-citations
Output:
    Gap Analysis: 12 cluster pairs

    Priority 1: "Agentic Workflows" ↔ "GUI Agents"
      Gap: 124 × 31 = 3,844 (size product)
      Topic overlap: agent, llm (2 shared tags)
      Bridge opportunity: HIGH
      Suggested search: "LLM agent GUI interaction orchestration"

    Priority 2: "Evaluation" ↔ "Reproducibility"
      Gap: 45 × 28 = 1,260
      Topic overlap: evaluation, benchmark (2 shared tags)
      Bridge opportunity: MEDIUM
      Suggested search: "reproducible LLM evaluation benchmarks"

    ...
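The pairwise scoring in Phase 3 might look like the sketch below. How the skill weighs size product against topic proximity and recency is not specified, so the ordering used here (size product first, shared-tag count as a tie-breaker) is an assumption. The `rank_gaps` name, the cluster record shape, and the tag sets are likewise illustrative.

```python
from itertools import combinations


def rank_gaps(clusters):
    """Score every cluster pair; clusters maps name -> {"size": int, "tags": set}."""
    gaps = []
    for (name_a, a), (name_b, b) in combinations(clusters.items(), 2):
        gaps.append({
            "pair": (name_a, name_b),
            "size_product": a["size"] * b["size"],
            "shared_tags": sorted(a["tags"] & b["tags"]),
        })
    # Assumed priority rule: bigger disconnected pairs first, then more shared tags.
    gaps.sort(key=lambda g: (g["size_product"], len(g["shared_tags"])), reverse=True)
    return gaps


# Sizes follow the sample output above; the tag sets are made up for the example.
clusters = {
    "Agentic Workflows": {"size": 124, "tags": {"agent", "llm", "orchestration"}},
    "GUI Agents": {"size": 31, "tags": {"agent", "llm", "web-agents"}},
    "Isolated": {"size": 3, "tags": set()},
}
gaps = rank_gaps(clusters)
```

With these inputs the top-ranked pair is "Agentic Workflows" and "GUI Agents", with a size product of 124 × 31 = 3,844 and two shared tags.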
Phase 4: Bridge Search (if --search-bridges)
For each high-priority gap:
- Generate search queries from cluster topic overlap
- Search external databases (Semantic Scholar, arXiv, Google Scholar)
- Filter candidates by:
  - Cites papers from BOTH clusters
  - Published in overlapping time range
  - High citation count (likely to be connecting work)
- Rank candidates by bridge potential
Output:
    Bridge Candidates Found: 8

    For gap "Agentic Workflows" ↔ "GUI Agents":
      1. "WebAgent: World-Centric Web Navigation" (2024)
         Cites: REF-016 (Cluster 1), REF-198 (Cluster 2)
         Citations: 87
         Bridge potential: HIGH

      2. "Agent-E: Vision-Language Planning for Web Tasks" (2024)
         Cites: REF-024 (Cluster 1), REF-201 (Cluster 2)
         Citations: 45
         Bridge potential: MEDIUM
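The core Phase 4 filter, keeping only candidates that cite papers from both clusters, can be sketched as below. The candidate record shape and the `bridge_candidates` name are hypothetical, the external database lookup is left out entirely, and ranking by raw citation count is a stand-in for "bridge potential", which is not defined here. The time-range filter is also omitted for brevity.

```python
def bridge_candidates(candidates, cluster_a, cluster_b):
    """Keep candidates citing at least one paper in each cluster, most cited first.

    candidates: list of dicts with "title", "cites" (corpus ids), "citations" (int).
    cluster_a, cluster_b: sets of corpus ids.
    """
    bridges = []
    for paper in candidates:
        cited = set(paper["cites"])
        hits_a = cited & cluster_a
        hits_b = cited & cluster_b
        if hits_a and hits_b:  # must touch BOTH clusters to count as a bridge
            bridges.append({**paper, "links": (sorted(hits_a), sorted(hits_b))})
    # Assumed ranking: citation count as a rough proxy for bridge potential.
    bridges.sort(key=lambda p: p["citations"], reverse=True)
    return bridges


cluster_1 = {"REF-001", "REF-016", "REF-024"}
cluster_2 = {"REF-198", "REF-201"}
candidates = [
    {"title": "Agent-E", "cites": ["REF-024", "REF-201"], "citations": 45},
    {"title": "One-sided example", "cites": ["REF-016"], "citations": 300},
    {"title": "WebAgent", "cites": ["REF-016", "REF-198"], "citations": 87},
]
ranked = bridge_candidates(candidates, cluster_1, cluster_2)
```

The highly cited "One-sided example" entry is dropped because it only touches one cluster, which is why the both-clusters test runs before any ranking.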
Phase 5: File Issues (if --file-issues)
For each gap with bridge candidates, file a research induction issue:
    ## Research Gap: [Cluster A] ↔ [Cluster B]

    **Gap Size**: [N × M papers disconnected]
    **Bridge Candidates**: [list]
    **Suggested Action**: Induct [top candidate] to connect clusters

    ### Bridge Papers to Induct
    - [ ] "WebAgent: World-Centric Web Navigation" — arxiv:2401.XXXXX
    - [ ] "Agent-E: Vision-Language Planning" — arxiv:2403.XXXXX
Phase 6: Report
    Research Gap Detection
    ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
    Graph: 372 nodes, 1,247 edges
    Connected components: 9
    Largest cluster: 124 papers ("Agentic Workflows")
    Isolated papers: 3

    Gap analysis: 12 cluster pairs
      HIGH priority: 4 (bridge candidates available)
      MEDIUM priority: 5
      LOW priority: 3

    Bridge candidates found: 8 papers
    Issues filed: 4
    Papers recommended for induction: 8
Distinction from research-gap
| Tool | Approach | Output |
|---|---|---|
| research-gap | Intellectual: topic coverage, missing areas, GRADE gaps | Gap report with search queries |
| research-gap-detect | Structural: citation graph topology, disconnected components | Cluster map, bridge candidates, filed issues |
research-gap answers "what topics are we missing?" while research-gap-detect answers "which existing papers don't cite each other but should?"
Examples
    # Full analysis with bridge search
    /research-gap-detect --search-bridges

    # Just show clusters
    /research-gap-detect --clusters-only

    # Detect and auto-file issues
    /research-gap-detect --file-issues

    # Combined: search + file
    /research-gap-detect --search-bridges --file-issues

    # JSON for visualization
    /research-gap-detect --format json
References
- @$AIWG_ROOT/agentic/code/frameworks/research-complete/skills/corpus-index-build/SKILL.md — Builds the citation-network graph
- @$AIWG_ROOT/agentic/code/frameworks/research-complete/skills/citation-backfill/SKILL.md — Prerequisite: complete bidirectional edges
- @$AIWG_ROOT/agentic/code/frameworks/research-complete/skills/research-gap/SKILL.md — Complementary intellectual gap analysis
- @$AIWG_ROOT/agentic/code/frameworks/research-complete/skills/induct-research/SKILL.md — Inducts bridge candidates