Awesome-Agent-Skills-for-Empirical-Research datacite-api

Resolve dataset DOIs and query research data metadata via DataCite

install
source · Clone the upstream repo
git clone https://github.com/brycewang-stanford/Awesome-Agent-Skills-for-Empirical-Research
Claude Code · Install into ~/.claude/skills/
T=$(mktemp -d) && git clone --depth=1 https://github.com/brycewang-stanford/Awesome-Agent-Skills-for-Empirical-Research "$T" && mkdir -p ~/.claude/skills && cp -r "$T/skills/43-wentorai-research-plugins/skills/literature/metadata/datacite-api" ~/.claude/skills/brycewang-stanford-awesome-agent-skills-for-empirical-research-datacite-api && rm -rf "$T"
manifest: skills/43-wentorai-research-plugins/skills/literature/metadata/datacite-api/SKILL.md
source content

DataCite API Guide

Overview

DataCite is a leading global DOI registration agency focused on research data. While Crossref primarily handles DOIs for publications, DataCite specializes in assigning persistent identifiers to datasets, software, samples, instruments, and other research outputs. DataCite has registered over 50 million DOIs from thousands of data repositories worldwide.

The DataCite REST API provides access to metadata for all DataCite DOIs. It is essential for researchers and developers working with research data discovery, data citation, FAIR (Findable, Accessible, Interoperable, Reusable) data practices, and repository integration. The metadata follows the DataCite Metadata Schema, which is designed specifically for describing research data and includes fields for resource types, funding references, geolocation, and related identifiers.

The API is free and open, and read access requires no authentication. It returns JSON responses following the JSON:API specification and supports filtering, faceting, and pagination.

Authentication

Read-only access requires no authentication: no API key, registration, or email is needed. Write operations (DOI registration and metadata updates) require DataCite member credentials.

Core Endpoints

DOIs: Search and Retrieve Dataset Metadata

  • URL:
    GET https://api.datacite.org/dois
  • Parameters:
    Param            | Type    | Required | Description
    -----------------|---------|----------|------------------------------------------------------
    query            | string  | No       | Full-text search query
    resource-type-id | string  | No       | Filter by resource type (dataset, software, text, etc.)
    affiliation-id   | string  | No       | Filter by creator affiliation ROR ID
    registered       | string  | No       | Filter by registration year (e.g., 2024)
    page[size]       | integer | No       | Results per page (default: 25, max: 1000)
    page[number]     | integer | No       | Page number for pagination
    sort             | string  | No       | Sort field: relevance, created, -created, updated
  • Example:
    # -g (--globoff) stops curl from treating the brackets in page[size] as a glob range
    curl -g "https://api.datacite.org/dois?query=climate+change+dataset&resource-type-id=dataset&page[size]=10&sort=-created"
    
  • Response: JSON:API formatted response with a data array containing DOI records. Each record has attributes with doi, titles, creators, publisher, publicationYear, resourceType, descriptions, subjects, dates, relatedIdentifiers, fundingReferences, and geoLocations.
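The same search can be issued from Python. The following is a minimal sketch using only the standard library; the helper names `build_search_url` and `search_dois` are hypothetical, not part of any DataCite client. Building the query string with `urlencode` percent-encodes the brackets in `page[size]`, which sidesteps shell and curl quoting issues.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

BASE_URL = "https://api.datacite.org/dois"


def build_search_url(query, resource_type=None, page_size=25, sort=None):
    """Return a /dois search URL (hypothetical helper).

    urlencode percent-encodes the brackets in page[size] and turns
    spaces in the query into '+'.
    """
    params = {"query": query, "page[size]": page_size}
    if resource_type:
        params["resource-type-id"] = resource_type
    if sort:
        params["sort"] = sort
    return f"{BASE_URL}?{urlencode(params)}"


def search_dois(query, **kwargs):
    """Fetch one page of results and return the `data` array (needs network access)."""
    with urlopen(build_search_url(query, **kwargs), timeout=30) as response:
        return json.load(response)["data"]
```

Calling `search_dois("climate change dataset", resource_type="dataset", page_size=10, sort="-created")` would mirror the curl example above.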

Single DOI: Direct Lookup

  • URL:
    GET https://api.datacite.org/dois/{doi}
  • Parameters:
    Param | Type   | Required | Description
    ------|--------|----------|--------------------------------------------
    doi   | string | Yes      | The full DOI (e.g., 10.5281/zenodo.1234567)
  • Example:
    curl "https://api.datacite.org/dois/10.5281/zenodo.3678171"
    
  • Response: JSON:API response with complete metadata for the specified DOI, including all DataCite Metadata Schema fields.
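DOIs often arrive as resolver links or with a `doi:` prefix rather than in bare form. The sketch below shows one way to normalize such input before building the lookup URL; `doi_lookup_url` and the prefix list are illustrative assumptions, and the function only constructs the URL without contacting the API.

```python
from urllib.parse import quote

API_ROOT = "https://api.datacite.org/dois/"
# Prefixes commonly seen in front of a DOI; this list is an assumption, extend as needed.
RESOLVER_PREFIXES = ("https://doi.org/", "http://doi.org/", "doi:")


def doi_lookup_url(doi):
    """Strip a resolver prefix if present and return the single-DOI endpoint URL."""
    cleaned = doi.strip()
    for prefix in RESOLVER_PREFIXES:
        if cleaned.lower().startswith(prefix):
            cleaned = cleaned[len(prefix):]
            break
    # Keep the slash between DOI prefix and suffix; percent-encode other unsafe characters.
    return API_ROOT + quote(cleaned, safe="/")
```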

Providers: Data Repository Information

  • URL:
    GET https://api.datacite.org/providers
  • Parameters:
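    Param      | Type    | Required | Description
    -----------|---------|----------|------------------------------------------------------
    query      | string  | No       | Search provider name or description
    region     | string  | No       | Filter by region (e.g., EMEA, Americas, Asia Pacific)
    page[size] | integer | No       | Results per page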
  • Example:
    curl -g "https://api.datacite.org/providers?query=CERN&page[size]=5"
    
  • Response: JSON:API response with provider records including name, displayName, region, memberType, website, and associated repositories and DOI prefixes.

Clients: Repository Details

  • URL:
    GET https://api.datacite.org/clients
  • Parameters:
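    Param       | Type    | Required | Description
    ------------|---------|----------|--------------------------------------------------------
    query       | string  | No       | Search repository name
    provider-id | string  | No       | Filter by provider
    software    | string  | No       | Filter by repository software (e.g., dspace, dataverse)
    page[size]  | integer | No       | Results per page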
  • Example:
    curl -g "https://api.datacite.org/clients?query=zenodo&page[size]=5"
    
  • Response: JSON:API response with repository records including name, DOI count, and service details.

Rate Limits

Read access needs no API key, but it is not unlimited. DataCite's support documentation has described a firewall limit of about 3,000 requests per five-minute window per IP address; treat that figure as indicative and confirm it against the current documentation. The service is operated by a nonprofit organization, so pace requests sensibly and expect sustained high-volume traffic to be throttled without notice. For large-scale data mining, use the DataCite OAI-PMH endpoint or the public data file available at https://datafiles.datacite.org rather than crawling the REST API.
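Request pacing can be as simple as sleeping between page fetches. The sketch below walks `page[number]` results with a fixed delay. `iter_pages` and the `fetch_page` callable are hypothetical names, and the one-second default is an arbitrary choice rather than a DataCite recommendation. The fetcher and the sleep function are injected so the loop can be exercised without network access.

```python
import time


def iter_pages(fetch_page, max_pages=5, delay_seconds=1.0, sleep=time.sleep):
    """Yield records page by page, pausing between requests.

    fetch_page(page_number) is a caller-supplied (hypothetical) callable that
    returns a parsed JSON:API response dict. Iteration stops at the first
    empty `data` array or after max_pages pages.
    """
    for page_number in range(1, max_pages + 1):
        if page_number > 1:
            sleep(delay_seconds)  # pause before every request except the first
        records = fetch_page(page_number).get("data", [])
        if not records:
            return
        yield from records
```

In real use, `fetch_page` would request `https://api.datacite.org/dois` with the desired filters plus `page[number]` set to its argument. Capping `max_pages` keeps an exploratory script from turning into a crawl.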

Common Patterns

Discover Datasets for a Research Topic

Find published datasets related to a research area:

curl -s -g "https://api.datacite.org/dois?query=CRISPR+genome+editing&resource-type-id=dataset&page[size]=5&sort=-created" | jq '.data[] | {doi: .attributes.doi, title: .attributes.titles[0].title, year: .attributes.publicationYear, publisher: .attributes.publisher}'
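For scripts that post-process results, the jq projection above has a direct Python analogue. This is a minimal sketch: `summarize_record` is a hypothetical helper and the sample record is invented for illustration, not real API output. Unlike the jq filter, it returns `None` for a missing title instead of failing on an empty `titles` array.

```python
def summarize_record(record):
    """Project one JSON:API DOI record onto doi, title, year, and publisher."""
    attrs = record.get("attributes", {})
    titles = attrs.get("titles") or []
    return {
        "doi": attrs.get("doi"),
        "title": titles[0].get("title") if titles else None,
        "year": attrs.get("publicationYear"),
        "publisher": attrs.get("publisher"),
    }


# Invented sample record shaped like an entry of the `data` array.
sample = {
    "id": "10.1234/example",
    "attributes": {
        "doi": "10.1234/example",
        "titles": [{"title": "Example CRISPR screen dataset"}],
        "publicationYear": 2024,
        "publisher": "Example Repository",
    },
}
summary = summarize_record(sample)
```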

Find Software Citations

Search for research software registered with DataCite:

curl -s -g "https://api.datacite.org/dois?query=python+machine+learning&resource-type-id=software&page[size]=10" | jq '.data[] | {doi: .attributes.doi, title: .attributes.titles[0].title, year: .attributes.publicationYear}'

Link Datasets to Publications

Use related identifiers to find papers associated with a dataset:

curl -s "https://api.datacite.org/dois/10.5281/zenodo.3678171" | jq '.data.attributes.relatedIdentifiers[] | select(.relationType == "IsSupplementTo" or .relationType == "IsReferencedBy") | {type: .relationType, id: .relatedIdentifier}'
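The same filter can be applied to an already-parsed response in Python. In this sketch, `linked_publications` is a hypothetical helper and the sample response is invented for illustration. The default relation set matches the jq filter above, and other relation types defined by the DataCite Metadata Schema, such as IsCitedBy, can be passed in when needed.

```python
PUBLICATION_RELATIONS = frozenset({"IsSupplementTo", "IsReferencedBy"})


def linked_publications(response, relations=PUBLICATION_RELATIONS):
    """Return related identifiers whose relationType is in `relations`."""
    attrs = response.get("data", {}).get("attributes", {})
    related = attrs.get("relatedIdentifiers") or []  # tolerate a missing or null field
    return [
        {"type": item.get("relationType"), "id": item.get("relatedIdentifier")}
        for item in related
        if item.get("relationType") in relations
    ]


# Invented sample shaped like a single-DOI response.
sample_response = {
    "data": {
        "attributes": {
            "relatedIdentifiers": [
                {"relationType": "IsSupplementTo", "relatedIdentifier": "10.1234/paper"},
                {"relationType": "HasVersion", "relatedIdentifier": "10.1234/data.v2"},
            ]
        }
    }
}
links = linked_publications(sample_response)
```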

References