Awesome-Agent-Skills-for-Empirical-Research osf-api
Manage open science projects and preprints via the OSF REST API
install
source · Clone the upstream repo
git clone https://github.com/brycewang-stanford/Awesome-Agent-Skills-for-Empirical-Research
Claude Code · Install into ~/.claude/skills/
T=$(mktemp -d) && git clone --depth=1 https://github.com/brycewang-stanford/Awesome-Agent-Skills-for-Empirical-Research "$T" && mkdir -p ~/.claude/skills && cp -r "$T/skills/43-wentorai-research-plugins/skills/literature/fulltext/osf-api" ~/.claude/skills/brycewang-stanford-awesome-agent-skills-for-empirical-research-osf-api && rm -rf "$T"
manifest:
skills/43-wentorai-research-plugins/skills/literature/fulltext/osf-api/SKILL.md
OSF (Open Science Framework) API
Overview
The Open Science Framework (OSF), run by the Center for Open Science, provides infrastructure for the entire research lifecycle: project management, file storage, preprint hosting, and registrations. The API supports search, project creation, file management, and preprint discovery across OSF Preprints, PsyArXiv, SocArXiv, and 25+ community preprint servers. It is free to use, and read access to public content requires no authentication.
API Endpoints
Base URL
https://api.osf.io/v2
Search
```bash
# -g (--globoff) stops curl from treating [] and {} in the URL as glob patterns;
# without it, bracketed parameters such as page[size] are rejected as a bad range.

# Search across all OSF content
curl -g "https://api.osf.io/v2/search/?q=replication+crisis&page[size]=20"

# Search preprints
curl -g "https://api.osf.io/v2/preprints/?filter[q]=machine+learning&page[size]=20"

# Filter by preprint provider
curl -g "https://api.osf.io/v2/preprints/?filter[provider]=psyarxiv&filter[q]=cognitive+bias"

# Search registrations (pre-registered studies)
curl -g "https://api.osf.io/v2/registrations/?filter[q]=randomized+controlled+trial"
```
Projects
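```bash
# Replace {node_id} with the ID of the project you want to inspect.
# -g (--globoff) keeps curl from interpreting the square brackets as a glob range.

# Get public projects
curl -g "https://api.osf.io/v2/nodes/?filter[public]=true&filter[q]=neuroimaging"

# Get project details
curl "https://api.osf.io/v2/nodes/{node_id}/"

# Get project files
curl "https://api.osf.io/v2/nodes/{node_id}/files/"

# Get project contributors
curl "https://api.osf.io/v2/nodes/{node_id}/contributors/"
```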
Preprint Providers
| Provider | Filter | Disciplines |
|---|---|---|
| OSF Preprints | | Multidisciplinary |
| PsyArXiv | `filter[provider]=psyarxiv` | Psychology |
| SocArXiv | | Social sciences |
| EarthArXiv | | Earth sciences |
| BioHackrXiv | | Bioinformatics |
| engrXiv | | Engineering |
| MedArXiv | | Medical sciences |
| NutriXiv | | Nutrition |
Query Parameters
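| Parameter | Description | Example |
|---|---|---|
| `filter[q]` | Text search | `filter[q]=machine+learning` |
| `filter[provider]` | Preprint server | `filter[provider]=psyarxiv` |
| | Subject filter | Subject taxonomy ID |
| | Date filter | |
| `page[size]` | Results per page (max 100) | `page[size]=20` |
| | Page number | |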
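The parameters combine into a single query string. The sketch below shows one way to assemble it with the standard library. `build_preprint_url` is a hypothetical helper, not part of any OSF client, and it uses only the three parameters that appear in the curl examples in the Search section (`filter[q]`, `filter[provider]`, `page[size]`).

```python
from typing import Optional
from urllib.parse import urlencode

BASE_URL = "https://api.osf.io/v2"
MAX_PAGE_SIZE = 100  # upper bound stated in the table above


def build_preprint_url(query: str, provider: Optional[str] = None,
                       page_size: int = 20) -> str:
    """Hypothetical helper: assemble a /preprints/ search URL."""
    if not 1 <= page_size <= MAX_PAGE_SIZE:
        raise ValueError(f"page_size must be between 1 and {MAX_PAGE_SIZE}")
    params = {"filter[q]": query, "page[size]": page_size}
    if provider:
        params["filter[provider]"] = provider
    # urlencode percent-encodes the square brackets and turns spaces into "+"
    return f"{BASE_URL}/preprints/?{urlencode(params)}"


print(build_preprint_url("cognitive bias", provider="psyarxiv"))
```

Percent-encoding the brackets sidesteps shell quoting and curl globbing, so the resulting URL can be passed to any HTTP client as-is.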
Response Structure (Preprint)
```json
{
  "data": [
    {
      "id": "abc12",
      "type": "preprints",
      "attributes": {
        "title": "Replication of the Ego Depletion Effect",
        "description": "We attempted to replicate...",
        "date_created": "2024-06-15T10:00:00Z",
        "date_published": "2024-06-16T08:00:00Z",
        "doi": "10.31234/osf.io/abc12",
        "is_published": true,
        "subjects": [["Social and Behavioral Sciences", "Psychology"]],
        "tags": ["replication", "ego depletion"]
      },
      "relationships": {
        "contributors": {"links": {"related": {"href": "..."}}},
        "primary_file": {"links": {"related": {"href": "..."}}}
      }
    }
  ]
}
```
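To make the nesting concrete, the sketch below flattens a response of this shape into plain rows. It runs offline against a trimmed copy of the sample above. `summarize_preprints` is a hypothetical helper, and it assumes `subjects` arrives as lists of plain strings, as in the sample; a live response may shape that field differently.

```python
from typing import Any, Dict, List

# Trimmed copy of the sample response shown above (not live data).
SAMPLE: Dict[str, Any] = {
    "data": [
        {
            "id": "abc12",
            "type": "preprints",
            "attributes": {
                "title": "Replication of the Ego Depletion Effect",
                "date_published": "2024-06-16T08:00:00Z",
                "doi": "10.31234/osf.io/abc12",
                "subjects": [["Social and Behavioral Sciences", "Psychology"]],
                "tags": ["replication", "ego depletion"],
            },
        }
    ]
}


def summarize_preprints(payload: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Hypothetical helper: pull the commonly used fields out of each item."""
    rows = []
    for item in payload.get("data", []):
        attrs = item.get("attributes") or {}
        # Each subject is a path from broad to narrow; join it into one label.
        subjects = [" > ".join(path) for path in attrs.get("subjects") or []]
        rows.append({
            "id": item.get("id"),
            "title": attrs.get("title"),
            "doi": attrs.get("doi"),
            # Keep only the YYYY-MM-DD part; tolerate a missing or null date.
            "published": (attrs.get("date_published") or "")[:10],
            "subjects": subjects,
            "tags": attrs.get("tags") or [],
        })
    return rows


for row in summarize_preprints(SAMPLE):
    print(row["published"], row["title"], row["subjects"])
```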
Python Usage
```python
from typing import Optional

import requests

BASE_URL = "https://api.osf.io/v2"
TIMEOUT = 30  # seconds; keeps a stalled connection from hanging the script


def search_preprints(query: str, provider: Optional[str] = None, page_size: int = 20) -> list:
    """Search OSF preprints across providers."""
    params = {
        "filter[q]": query,
        "page[size]": page_size,
    }
    if provider:
        params["filter[provider]"] = provider

    resp = requests.get(f"{BASE_URL}/preprints/", params=params, timeout=TIMEOUT)
    resp.raise_for_status()
    data = resp.json()

    results = []
    for item in data.get("data", []):
        attrs = item.get("attributes", {})
        results.append({
            "id": item.get("id"),
            "title": attrs.get("title"),
            "description": (attrs.get("description") or "")[:300],
            "doi": attrs.get("doi"),
            # Guard against a null date before slicing out YYYY-MM-DD
            "date": (attrs.get("date_published") or "")[:10],
            "tags": attrs.get("tags", []),
            "url": f"https://osf.io/{item.get('id')}/",
        })
    return results


def search_registrations(query: str, page_size: int = 20) -> list:
    """Search pre-registered studies on OSF."""
    params = {
        "filter[q]": query,
        "page[size]": page_size,
    }
    resp = requests.get(f"{BASE_URL}/registrations/", params=params, timeout=TIMEOUT)
    resp.raise_for_status()
    data = resp.json()

    results = []
    for item in data.get("data", []):
        attrs = item.get("attributes", {})
        results.append({
            "id": item.get("id"),
            "title": attrs.get("title"),
            "description": (attrs.get("description") or "")[:300],
            # Guard against a null date before slicing out YYYY-MM-DD
            "date_registered": (attrs.get("date_registered") or "")[:10],
            "registration_schema": attrs.get("registration_supplement"),
        })
    return results


def get_project_files(node_id: str) -> list:
    """List the top-level file entries (storage providers) of an OSF project."""
    resp = requests.get(f"{BASE_URL}/nodes/{node_id}/files/", timeout=TIMEOUT)
    resp.raise_for_status()
    data = resp.json()

    providers = []
    for item in data.get("data", []):
        attrs = item.get("attributes", {})
        providers.append({
            "provider": attrs.get("provider"),
            "name": attrs.get("name"),
        })
    return providers


if __name__ == "__main__":
    # Example: search psychology preprints
    preprints = search_preprints("cognitive load", provider="psyarxiv")
    for p in preprints[:5]:
        print(f"[{p['date']}] {p['title']}")
        print(f"  DOI: {p['doi']}")

    # Example: find pre-registered clinical trials
    regs = search_registrations("randomized placebo")
    for r in regs[:5]:
        print(f"[{r['date_registered']}] {r['title']}")
```
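The functions above read only the first page of results. The sketch below is one way to walk further pages. It assumes each response carries a JSON:API-style `links.next` URL that is null or absent on the last page, which this document does not confirm. `fetch_json` and `iter_items` are hypothetical names, and the default fetcher uses `urllib` rather than `requests` so the block has no third-party dependency.

```python
import json
import urllib.request
from typing import Any, Callable, Dict, Iterator, Optional


def fetch_json(url: str) -> Dict[str, Any]:
    """Default fetcher: GET a URL and decode the JSON body."""
    with urllib.request.urlopen(url, timeout=30) as resp:
        return json.load(resp)


def iter_items(first_url: str,
               fetch: Callable[[str], Dict[str, Any]] = fetch_json,
               max_pages: int = 5) -> Iterator[Dict[str, Any]]:
    """Yield items page by page, stopping after max_pages requests."""
    url: Optional[str] = first_url
    pages = 0
    while url and pages < max_pages:
        payload = fetch(url)
        yield from payload.get("data", [])
        # Assumption: the next page URL lives at links.next (null when done).
        url = (payload.get("links") or {}).get("next")
        pages += 1
```

Passing the fetcher as an argument keeps the loop testable without network access, and `max_pages` caps the number of requests a broad query can trigger. A call such as `iter_items("https://api.osf.io/v2/preprints/?filter%5Bprovider%5D=psyarxiv")` would then stream items lazily.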
Use Cases
- Preprint discovery: Search across 25+ preprint servers
- Pre-registration search: Find registered study protocols
- Open data access: Download shared research datasets
- Reproducibility: Access materials, data, and code for published studies
- Project management: Programmatic project and file management