Medical-research-skills arxiv-database

Search and retrieve scientific preprints from arXiv; use it when you need to find papers by keyword/author/category, fetch metadata (abstract, DOI, PDF URL), or download PDFs for offline reading.

install

source · Clone the upstream repo

git clone https://github.com/aipoch/medical-research-skills

Claude Code · Install into ~/.claude/skills/

T=$(mktemp -d) && git clone --depth=1 https://github.com/aipoch/medical-research-skills "$T" && mkdir -p ~/.claude/skills && cp -r "$T/scientific-skills/Evidence Insight/arxiv-database" ~/.claude/skills/aipoch-medical-research-skills-arxiv-database && rm -rf "$T"

manifest: scientific-skills/Evidence Insight/arxiv-database/SKILL.md

source content

Source: https://github.com/aipoch/medical-research-skills

ArXiv Database Skill

When to Use

Use this skill when you need to search and retrieve scientific preprints from arXiv: finding papers by keyword, author, or category; fetching metadata (abstract, DOI, PDF URL); or downloading PDFs for offline reading in a reproducible workflow.
Use this skill when an Evidence Insight task needs a packaged method instead of ad-hoc freeform output.
Use this skill when the user expects a concrete deliverable, validation step, or file-based result.
Use this skill when `scripts/arxiv_search.py` is the most direct path to complete the request.
Use this skill when you need the `arxiv-database` package behavior rather than a generic answer.

Key Features

Scope-focused workflow: search and retrieve scientific preprints from arXiv by keyword, author, or category; fetch metadata (abstract, DOI, PDF URL); and download PDFs for offline reading.
Packaged executable path: `scripts/arxiv_search.py`.
Reference material is available in `references/` for task-specific guidance.
Structured execution path designed to keep outputs consistent and reviewable.

Dependencies

`Python`: `3.10+`. Repository baseline for current packaged skills.
`Third-party packages`: not explicitly version-pinned in this skill package. Add pinned versions if this skill needs stricter environment control.

Example Usage

cd "20260316/scientific-skills/Evidence Insight/arxiv-database"
python -m py_compile scripts/arxiv_search.py
python scripts/arxiv_search.py --help

Example run plan:

Confirm the user input, output path, and any required config values.
Edit the in-file `CONFIG` block or documented parameters if the script uses fixed settings.
Run `python scripts/arxiv_search.py` with the validated inputs.
Review the generated output and return the final artifact with any assumptions called out.

Implementation Details

Execution model: validate the request, choose the packaged workflow, and produce a bounded deliverable.
Input controls: confirm the source files, scope limits, output format, and acceptance criteria before running any script.
Primary implementation surface: `scripts/arxiv_search.py`.
Reference guidance: `references/` contains supporting rules, prompts, or checklists.
Parameters to clarify first: input path, output path, scope filters, thresholds, and any domain-specific constraints.
Output discipline: keep results reproducible, identify assumptions explicitly, and avoid undocumented side effects.

1. When to Use

You need to quickly find arXiv preprints by keyword, phrase, author, or category (e.g., `cs.AI`, `cs.CL`).
You want to collect paper metadata (title, authors, publication date, abstract/summary, PDF link) for review or indexing.
You need the latest submissions in a topic area (sorted by submission date or last updated date).
You want to download one or more PDFs from search results for offline reading or batch processing.
You have a known arXiv identifier and want to retrieve the corresponding paper directly.

2. Key Features

arXiv query-based search (supports category filters, author filters, phrases, and ID lookups).
Configurable result limits (`--max-results`).
Sort control (`--sort-by`: `Relevance`, `LastUpdatedDate`, or `SubmittedDate`).
Metadata output per result (title, authors, published date, abstract/summary, PDF URL; DOI when available via arXiv metadata).
Optional PDF download for returned results (`--download`) with configurable output directory (`--dir`).
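The per-result metadata listed above could be modeled as a small record type. This is a minimal sketch, not the structure used by `scripts/arxiv_search.py`; the `PaperRecord` class, its field names, and the sample values are hypothetical and chosen only to mirror the fields named in this section.

```python
import json
from dataclasses import asdict, dataclass
from typing import List, Optional


@dataclass
class PaperRecord:
    """Hypothetical container for the metadata fields listed above."""

    title: str
    authors: List[str]
    published: str  # ISO 8601 date string, e.g. "2024-01-31"
    summary: str
    pdf_url: str
    doi: Optional[str] = None  # only some arXiv records carry a DOI

    def to_json(self) -> str:
        # Serialize to JSON so results can be stored or indexed.
        return json.dumps(asdict(self), ensure_ascii=False)


# Placeholder values for illustration only; this is not a real paper.
record = PaperRecord(
    title="Example title",
    authors=["A. Author", "B. Author"],
    published="2024-01-31",
    summary="Example abstract.",
    pdf_url="https://arxiv.org/pdf/0000.00000",
)
print(record.to_json())
```

Keeping `doi` optional reflects the note that a DOI is present only when arXiv's metadata includes one.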

3. Dependencies

Python 3.8+ (the repository baseline listed above is 3.10+)
`arxiv` (Python package): version depends on your environment; install a recent release (e.g., `arxiv>=1.4.0`)
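The requirements above can be checked before running the script. The following is a rough sketch that assumes the minimums stated in this section (Python 3.8 and `arxiv` 1.4.0); the helper names `installed_version`, `version_tuple`, and `check_environment` are hypothetical and not part of the skill.

```python
import re
import sys
from importlib import metadata  # available in Python 3.8+
from typing import List, Optional, Tuple


def installed_version(package: str) -> Optional[str]:
    """Return the installed version string of `package`, or None if absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def version_tuple(version: str) -> Tuple[int, ...]:
    """Parse the leading numeric components, e.g. "1.4.8" -> (1, 4, 8)."""
    parts = []
    for piece in version.split("."):
        match = re.match(r"\d+", piece)
        if match is None:
            break  # stop at the first non-numeric component
        parts.append(int(match.group()))
    return tuple(parts)


def check_environment(
    min_python: Tuple[int, int] = (3, 8),
    min_arxiv: Tuple[int, ...] = (1, 4, 0),
) -> List[str]:
    """Return a list of human-readable problems; an empty list means OK."""
    problems = []
    if sys.version_info[:2] < min_python:
        problems.append("Python is older than the stated minimum")
    found = installed_version("arxiv")
    if found is None:
        problems.append("the arxiv package is not installed")
    elif version_tuple(found) < min_arxiv:
        problems.append(f"arxiv {found} is older than the suggested minimum")
    return problems


for problem in check_environment():
    print("warning:", problem)
```

The version parsing here is deliberately simple; a pinned requirements file is the more robust option if stricter environment control is needed.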

4. Example Usage

Install dependencies

pip install "arxiv>=1.4.0"

Run searches and downloads

Search for papers in `cs.AI` about reinforcement learning (top 5 results):

python scripts/arxiv_search.py --query "cat:cs.AI AND reinforcement learning" --max-results 5

Search for “Large Language Models” in `cs.CL`:

python scripts/arxiv_search.py --query "cat:cs.CL AND \"Large Language Models\""

Get the latest 5 papers on “quantum computing” (sorted by submission date):

python scripts/arxiv_search.py --query "quantum computing" --sort-by SubmittedDate --max-results 5

Download a specific paper by arXiv ID:

python scripts/arxiv_search.py --query "id:2101.12345" --download

Download results into a specific directory:

python scripts/arxiv_search.py --query "cat:cs.LG AND diffusion" --max-results 3 --download --dir ./papers
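Query strings like the ones in the commands above can also be composed programmatically, which avoids shell-quoting mistakes when phrases contain spaces. This is an illustrative sketch only: `build_query` is a hypothetical helper, not something shipped with the skill, and it assumes all clauses are combined with `AND`.

```python
from typing import Optional, Sequence


def build_query(
    terms: Sequence[str] = (),
    category: Optional[str] = None,
    author: Optional[str] = None,
) -> str:
    """Join the given filters with AND into a single arXiv query string."""
    clauses = []
    if category:
        clauses.append(f"cat:{category}")
    if author:
        clauses.append(f"au:{author}")
    for term in terms:
        # Quote multi-word terms so they are matched as an exact phrase.
        clauses.append(f'"{term}"' if " " in term else term)
    if not clauses:
        raise ValueError("at least one term, category, or author is required")
    return " AND ".join(clauses)


print(build_query(["Large Language Models"], category="cs.CL"))
# cat:cs.CL AND "Large Language Models"
print(build_query(["diffusion"], category="cs.LG"))
# cat:cs.LG AND diffusion
```

Note that this sketch always quotes multi-word terms, whereas the first command above passes `reinforcement learning` unquoted; the two forms may not return identical results.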

5. Implementation Details

Entry point: `scripts/arxiv_search.py` wraps the `arxiv` Python API to execute queries against the arXiv search endpoint.
Query syntax: The `--query` string is passed to arXiv search and can include:
- Category filters (e.g., `cat:cs.AI`)
- Author filters (e.g., `au:Smith`)
- Exact phrases using quotes (e.g., `"Large Language Models"`)
- ID lookup (e.g., `id:2101.12345`)
- Boolean operators such as `AND` (arXiv's query syntax also accepts `OR` and `ANDNOT`)
Result limiting: `--max-results` controls how many entries are returned (default: `10`).
Sorting: `--sort-by` selects the ordering of results:
- `Relevance` (default)
- `LastUpdatedDate`
- `SubmittedDate`
Downloads: When `--download` is set, the script downloads the PDF for each returned result using the provided PDF URL and saves it to `--dir` (default: current working directory).
Metadata fields: Each result includes core arXiv metadata (title, authors, published date, summary/abstract, PDF URL). DOI is included when present in arXiv’s metadata for that record.
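The behavior described in this section can be approximated with a short wrapper around the `arxiv` package. The code below is a sketch under stated assumptions, not the contents of `scripts/arxiv_search.py`: it reuses the documented flag names and defaults, and it assumes the installed `arxiv` release provides `Client.results()`, `SortCriterion`, and `Result.download_pdf(dirpath=...)`.

```python
import argparse
import os
from typing import List, Optional

SORT_CHOICES = ("Relevance", "LastUpdatedDate", "SubmittedDate")


def build_parser() -> argparse.ArgumentParser:
    """Define the flags documented above with their stated defaults."""
    parser = argparse.ArgumentParser(description="Search arXiv (illustrative sketch).")
    parser.add_argument("--query", required=True, help="arXiv query string")
    parser.add_argument("--max-results", type=int, default=10)
    parser.add_argument("--sort-by", choices=SORT_CHOICES, default="Relevance")
    parser.add_argument("--download", action="store_true", help="download each PDF")
    parser.add_argument("--dir", default=".", help="directory for downloaded PDFs")
    return parser


def run(argv: Optional[List[str]] = None) -> None:
    """Parse arguments, query arXiv, print metadata, optionally download PDFs."""
    args = build_parser().parse_args(argv)

    # Imported here so the parser can be used without the third-party package.
    import arxiv

    search = arxiv.Search(
        query=args.query,
        max_results=args.max_results,
        # The flag values match the SortCriterion member names.
        sort_by=getattr(arxiv.SortCriterion, args.sort_by),
    )
    for result in arxiv.Client().results(search):
        print(result.title)
        print("  authors:  ", ", ".join(author.name for author in result.authors))
        print("  published:", result.published.date())
        print("  summary:  ", result.summary)
        print("  pdf:      ", result.pdf_url)
        if result.doi:  # DOI is only present for some records
            print("  doi:      ", result.doi)
        if args.download:
            os.makedirs(args.dir, exist_ok=True)
            result.download_pdf(dirpath=args.dir)


# Example call (performs a network request):
# run(["--query", "cat:cs.AI AND reinforcement learning", "--max-results", "5"])
```

Importing `arxiv` inside `run` keeps argument parsing testable in environments where the package is not installed; the actual script may be organized differently.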