Medical-research-skills arxiv-database
Search and retrieve scientific preprints from arXiv; use it when you need to find papers by keyword/author/category, fetch metadata (abstract, DOI, PDF URL), or download PDFs for offline reading.
install
source · Clone the upstream repo
git clone https://github.com/aipoch/medical-research-skills
Claude Code · Install into ~/.claude/skills/
T=$(mktemp -d) && git clone --depth=1 https://github.com/aipoch/medical-research-skills "$T" && mkdir -p ~/.claude/skills && cp -r "$T/scientific-skills/Evidence Insight/arxiv-database" ~/.claude/skills/aipoch-medical-research-skills-arxiv-database && rm -rf "$T"
manifest:
scientific-skills/Evidence Insight/arxiv-database/SKILL.mdsource content
ArXiv Database Skill
When to Use
- Use this skill when you need search and retrieve scientific preprints from arxiv; use it when you need to find papers by keyword/author/category, fetch metadata (abstract, doi, pdf url), or download pdfs for offline reading in a reproducible workflow.
- Use this skill when a evidence insight task needs a packaged method instead of ad-hoc freeform output.
- Use this skill when the user expects a concrete deliverable, validation step, or file-based result.
- Use this skill when
is the most direct path to complete the request.scripts/arxiv_search.py - Use this skill when you need the
package behavior rather than a generic answer.arxiv-database
Key Features
- Scope-focused workflow aligned to: Search and retrieve scientific preprints from arXiv; use it when you need to find papers by keyword/author/category, fetch metadata (abstract, DOI, PDF URL), or download PDFs for offline reading.
- Packaged executable path(s):
.scripts/arxiv_search.py - Reference material available in
for task-specific guidance.references/ - Structured execution path designed to keep outputs consistent and reviewable.
Dependencies
:Python
. Repository baseline for current packaged skills.3.10+
:Third-party packages
. Add pinned versions if this skill needs stricter environment control.not explicitly version-pinned in this skill package
Example Usage
cd "20260316/scientific-skills/Evidence Insight/arxiv-database" python -m py_compile scripts/arxiv_search.py python scripts/arxiv_search.py --help
Example run plan:
- Confirm the user input, output path, and any required config values.
- Edit the in-file
block or documented parameters if the script uses fixed settings.CONFIG - Run
with the validated inputs.python scripts/arxiv_search.py - Review the generated output and return the final artifact with any assumptions called out.
Implementation Details
- Execution model: validate the request, choose the packaged workflow, and produce a bounded deliverable.
- Input controls: confirm the source files, scope limits, output format, and acceptance criteria before running any script.
- Primary implementation surface:
.scripts/arxiv_search.py - Reference guidance:
contains supporting rules, prompts, or checklists.references/ - Parameters to clarify first: input path, output path, scope filters, thresholds, and any domain-specific constraints.
- Output discipline: keep results reproducible, identify assumptions explicitly, and avoid undocumented side effects.
1. When to Use
- You need to quickly find arXiv preprints by keyword, phrase, author, or category (e.g.,
,cs.AI
).cs.CL - You want to collect paper metadata (title, authors, publication date, abstract/summary, PDF link) for review or indexing.
- You need the latest submissions in a topic area (sorted by submission date or last updated date).
- You want to download one or more PDFs from search results for offline reading or batch processing.
- You have a known arXiv identifier and want to retrieve the corresponding paper directly.
2. Key Features
- arXiv query-based search (supports category filters, author filters, phrases, and ID lookups).
- Configurable result limits (
).--max-results - Sort control (
:--sort-by
,Relevance
,LastUpdatedDate
).SubmittedDate - Metadata output per result (title, authors, published date, abstract/summary, PDF URL; DOI when available via arXiv metadata).
- Optional PDF download for returned results (
) with configurable output directory (--download
).--dir
3. Dependencies
- Python 3.8+
(Python package) — version depends on your environment; install a recent release (e.g.,arxiv
)arxiv>=1.4.0
4. Example Usage
Install dependencies
pip install "arxiv>=1.4.0"
Run searches and downloads
Search for papers in
about reinforcement learning (top 5 results):cs.AI
python scripts/arxiv_search.py --query "cat:cs.AI AND reinforcement learning" --max-results 5
Search for “Large Language Models” in
:cs.CL
python scripts/arxiv_search.py --query "cat:cs.CL AND \"Large Language Models\""
Get the latest 5 papers on “quantum computing” (sorted by submission date):
python scripts/arxiv_search.py --query "quantum computing" --sort-by SubmittedDate --max-results 5
Download a specific paper by arXiv ID:
python scripts/arxiv_search.py --query "id:2101.12345" --download
Download results into a specific directory:
python scripts/arxiv_search.py --query "cat:cs.LG AND diffusion" --max-results 3 --download --dir ./papers
5. Implementation Details
- Entry point:
wraps thescripts/arxiv_search.py
Python API to execute queries against the arXiv search endpoint.arxiv - Query syntax: The
string is passed to arXiv search and can include:--query- Category filters (e.g.,
)cat:cs.AI - Author filters (e.g.,
)au:Smith - Exact phrases using quotes (e.g.,
)"Large Language Models" - ID lookup (e.g.,
)id:2101.12345 - Boolean operators such as
AND
- Category filters (e.g.,
- Result limiting:
controls how many entries are returned (default:--max-results
).10 - Sorting:
selects the ordering of results:--sort-by
(default)RelevanceLastUpdatedDateSubmittedDate
- Downloads: When
is set, the script downloads the PDF for each returned result using the provided PDF URL and saves it to--download
(default: current working directory).--dir - Metadata fields: Each result includes core arXiv metadata (title, authors, published date, summary/abstract, PDF URL). DOI is included when present in arXiv’s metadata for that record.