install
source · Clone the upstream repo
git clone https://github.com/jmagly/aiwg
Claude Code · Install into ~/.claude/skills/
T=$(mktemp -d) && git clone --depth=1 https://github.com/jmagly/aiwg "$T" && mkdir -p ~/.claude/skills && cp -r "$T/.agents/skills/research-acquire" ~/.claude/skills/jmagly-aiwg-research-acquire && rm -rf "$T"
manifest: .agents/skills/research-acquire/SKILL.md
source content
Research Acquire Command
Download research papers from public repositories and extract metadata.
Instructions
When invoked, perform automated paper acquisition:
- Identify Source
  - Parse DOI, arXiv ID, or URL
  - Determine paper hosting location
  - Check if paper already exists in .aiwg/research/sources/
- Download Paper
  - Attempt direct PDF download from source
  - Try fallback sources (arXiv mirror, Unpaywall, PMC)
  - Save to .aiwg/research/sources/[ref-id].pdf
  - Verify download integrity (file size, PDF structure)
- Extract Metadata
  - Parse PDF metadata (title, authors, year)
  - Query CrossRef/Semantic Scholar for enhanced metadata
  - Extract abstract, keywords, citation count
  - Determine source type (journal, conference, preprint)
- Generate Frontmatter
  - Create YAML frontmatter per @$AIWG_ROOT/agentic/code/frameworks/sdlc-complete/schemas/research/frontmatter-schema.yaml
  - Assign REF-XXX identifier
  - Calculate PDF checksum (SHA-256)
  - Set initial GRADE baseline from source type
- Extract Full Text (default, unless --no-extract-text)
  - Extract full text from PDF to .aiwg/research/sources/text/REF-XXX.txt
  - This text is the primary input for downstream analysis: analysis agents must read this file, not just metadata or abstract
  - If extraction fails (scanned PDF, encrypted): log warning, set full_text_available: false in frontmatter
- Create Finding Document
  - Generate .aiwg/research/findings/REF-XXX-[slug].md from template
  - Populate frontmatter with extracted metadata
  - Add placeholder sections for key findings
  - Update fixity manifest
- Post-Acquisition
  - Log acquisition in .aiwg/research/acquisition-log.yaml
  - Update corpus index
  - Suggest next steps (quality assessment, documentation)
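The identification, integrity, and checksum steps above can be sketched in Python. This is a minimal illustration under stated assumptions, not the skill's actual implementation (the referenced acquisition-service.ts is not shown here). The regular expressions, the function names `classify_identifier`, `looks_like_pdf`, and `sha256_hex`, and the 1 KiB size floor are all hypothetical.

```python
import hashlib
import re

# Hypothetical patterns: the skill does not specify how identifiers are parsed.
# Old-style arXiv IDs (e.g. hep-th/9901001) are deliberately not handled here.
DOI_RE = re.compile(r"^10\.\d{4,9}/\S+$")
ARXIV_RE = re.compile(r"^(?:arxiv:)?(\d{4}\.\d{4,5})(v\d+)?$", re.IGNORECASE)


def classify_identifier(identifier: str) -> tuple[str, str]:
    """Return (kind, normalized value) for a DOI, arXiv ID, or URL."""
    value = identifier.strip()
    if value.lower().startswith(("http://", "https://")):
        return "url", value
    match = ARXIV_RE.match(value)
    if match:
        # Drop the optional "arXiv:" prefix, keep any version suffix.
        return "arxiv", match.group(1) + (match.group(2) or "")
    if DOI_RE.match(value):
        return "doi", value
    raise ValueError(f"unrecognized identifier: {identifier!r}")


def looks_like_pdf(data: bytes, min_size: int = 1024) -> bool:
    """Cheap integrity check: size floor, %PDF- header, %%EOF near the end."""
    if len(data) < min_size:
        return False
    return data.startswith(b"%PDF-") and b"%%EOF" in data[-1024:]


def sha256_hex(data: bytes) -> str:
    """SHA-256 digest of the downloaded bytes, as lowercase hex."""
    return hashlib.sha256(data).hexdigest()
```

A check like `looks_like_pdf` only catches obvious failures such as an HTML error page saved with a .pdf extension. It does not prove the file is a well-formed PDF, so a real implementation would likely open the file with a PDF parser as well.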
Arguments
- [identifier] - DOI, arXiv ID, or URL (required)
- --output [path] - Custom output location (default: auto-generate)
- --ref-id [REF-XXX] - Specific REF-XXX identifier (default: auto-assign)
- --extract-text - Extract full text to .txt file for analysis (default: enabled; use --no-extract-text to skip)
- --no-metadata - Skip metadata enrichment
- --force - Re-download even if paper exists
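The skill is invoked as a slash command, so it has no Python CLI. Still, the defaults and the --extract-text / --no-extract-text pairing listed above can be illustrated with `argparse`. The function `build_parser` and the attribute names it produces are hypothetical. Only the flag names and defaults come from the argument list.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Mirror the documented arguments; an illustrative sketch only."""
    parser = argparse.ArgumentParser(prog="research-acquire")
    parser.add_argument("identifier", help="DOI, arXiv ID, or URL (required)")
    parser.add_argument("--output", metavar="PATH", default=None,
                        help="custom output location (default: auto-generate)")
    parser.add_argument("--ref-id", metavar="REF-XXX", default=None,
                        help="specific identifier (default: auto-assign)")
    # BooleanOptionalAction (Python 3.9+) creates both --extract-text and
    # --no-extract-text; extraction is enabled unless explicitly skipped.
    parser.add_argument("--extract-text", action=argparse.BooleanOptionalAction,
                        default=True, help="extract full text to a .txt file")
    # --no-metadata turns enrichment off; it is on by default.
    parser.add_argument("--no-metadata", dest="metadata", action="store_false",
                        help="skip metadata enrichment")
    parser.add_argument("--force", action="store_true",
                        help="re-download even if the paper exists")
    return parser
```

Because extraction is on by default, passing --extract-text explicitly (as in the last example below) changes nothing; only --no-extract-text alters behavior.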
Examples
# Acquire by DOI
/research-acquire 10.48550/arXiv.2308.08155

# Acquire by arXiv ID
/research-acquire arXiv:2308.08155

# Acquire with custom identifier
/research-acquire https://arxiv.org/pdf/2308.08155.pdf --ref-id REF-022

# Acquire with full text extraction
/research-acquire 10.1145/3377811.3380330 --extract-text
Expected Output
Acquiring Paper: 10.48550/arXiv.2308.08155
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━

Step 1: Resolving identifier
  ✓ DOI resolved to arXiv:2308.08155
  ✓ Paper not found in corpus

Step 2: Downloading PDF
  ✓ Downloaded from arxiv.org (2.4 MB)
  ✓ Saved to .aiwg/research/sources/REF-022.pdf
  ✓ Checksum: a1b2c3d4e5f6...

Step 3: Extracting metadata
  ✓ Title: AutoGen: Enabling Next-Gen LLM Applications...
  ✓ Authors: Wu, Q., Bansal, G., Zhang, J., et al. (9 authors)
  ✓ Year: 2023
  ✓ Source: arXiv preprint
  ✓ Citations: 234 (as of 2026-02-03)

Step 4: Creating finding document
  ✓ Generated .aiwg/research/findings/REF-022-autogen.md
  ✓ Frontmatter populated
  ✓ Template sections added

Step 5: Updating corpus
  ✓ Added to fixity manifest
  ✓ Updated INDEX.md
  ✓ Logged acquisition

━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
Acquisition complete!

REF-ID: REF-022
Title: AutoGen: Enabling Next-Gen LLM Applications...
File: .aiwg/research/sources/REF-022.pdf
Finding: .aiwg/research/findings/REF-022-autogen.md

Next Steps:
  1. /research-quality REF-022 - Assess evidence quality
  2. /research-document REF-022 - Create detailed summary
  3. /research-cite REF-022 - Generate citation
Provenance Tracking
All acquisitions create provenance records:
# .aiwg/research/provenance/records/REF-022-acquisition.yaml
entity:
  id: "urn:aiwg:artifact:.aiwg/research/sources/REF-022.pdf"
  type: "research_paper"
activity:
  id: "urn:aiwg:activity:acquisition:REF-022:001"
  type: "acquisition"
  started_at: "2026-02-03T12:00:00Z"
  ended_at: "2026-02-03T12:00:15Z"
agent:
  id: "urn:aiwg:agent:acquisition-agent"
  type: "aiwg_agent"
source:
  identifier: "10.48550/arXiv.2308.08155"
  url: "https://arxiv.org/pdf/2308.08155.pdf"
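A record with the shape shown above could be assembled as a plain dictionary before being serialized to YAML. The sketch below is a hedged illustration: the function `build_provenance_record` and its `sequence` parameter are hypothetical, and reading the trailing `001` in the activity ID as a zero-padded per-reference counter is an assumption. Serialization is left out so the example needs only the standard library.

```python
from datetime import datetime, timezone


def build_provenance_record(ref_id: str, identifier: str, url: str,
                            started_at: datetime, ended_at: datetime,
                            sequence: int = 1) -> dict:
    """Build an acquisition record; timestamps must be timezone-aware."""

    def stamp(moment: datetime) -> str:
        # Normalize to UTC and format as in the example record.
        return moment.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")

    return {
        "entity": {
            "id": f"urn:aiwg:artifact:.aiwg/research/sources/{ref_id}.pdf",
            "type": "research_paper",
        },
        "activity": {
            # Assumption: the final segment is a zero-padded counter.
            "id": f"urn:aiwg:activity:acquisition:{ref_id}:{sequence:03d}",
            "type": "acquisition",
            "started_at": stamp(started_at),
            "ended_at": stamp(ended_at),
        },
        "agent": {
            "id": "urn:aiwg:agent:acquisition-agent",
            "type": "aiwg_agent",
        },
        "source": {"identifier": identifier, "url": url},
    }
```

Calling it with `ref_id="REF-022"` and the two timestamps from the example reproduces the same IDs and time strings. A YAML library such as PyYAML could then write the dictionary to the records directory.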
References
- @$AIWG_ROOT/agentic/code/frameworks/research-complete/agents/acquisition-agent.md - Acquisition Agent
- @$AIWG_ROOT/src/research/services/acquisition-service.ts - Download implementation
- @$AIWG_ROOT/agentic/code/frameworks/sdlc-complete/schemas/research/frontmatter-schema.yaml - Metadata format
- @.aiwg/research/fixity-manifest.json - Checksum tracking
- @$AIWG_ROOT/agentic/code/frameworks/sdlc-complete/rules/provenance-tracking.md - Provenance requirements