install
source · Clone the upstream repo
git clone https://github.com/FreedomIntelligence/OpenClaw-Medical-Skills
Claude Code · Install into ~/.claude/skills/
T=$(mktemp -d) && git clone --depth=1 https://github.com/FreedomIntelligence/OpenClaw-Medical-Skills "$T" && mkdir -p ~/.claude/skills && cp -r "$T/skills/bio-entrez-search" ~/.claude/skills/freedomintelligence-openclaw-medical-skills-bio-entrez-search && rm -rf "$T"
OpenClaw · Install into ~/.openclaw/skills/
T=$(mktemp -d) && git clone --depth=1 https://github.com/FreedomIntelligence/OpenClaw-Medical-Skills "$T" && mkdir -p ~/.openclaw/skills && cp -r "$T/skills/bio-entrez-search" ~/.openclaw/skills/freedomintelligence-openclaw-medical-skills-bio-entrez-search && rm -rf "$T"
manifest:
skills/bio-entrez-search/SKILL.mdsafety · automated scan (low risk)
This is a pattern-based risk scan, not a security review. Our crawler flagged:
- references API keys
Always read a skill's source content before installing. Patterns alone don't mean the skill is malicious — but they warrant attention.
source content
<!--
# COPYRIGHT NOTICE
# This file is part of the "Universal Biomedical Skills" project.
# Copyright (c) 2026 MD BABU MIA, PhD <md.babu.mia@mssm.edu>
# All Rights Reserved.
#
# This code is proprietary and confidential.
# Unauthorized copying of this file, via any medium is strictly prohibited.
#
# Provenance: Authenticated by MD BABU MIA
-->
name: bio-entrez-search description: Search NCBI databases using Biopython Bio.Entrez. Use when finding records by keyword, building complex search queries, discovering database structure, or getting global query counts across databases. tool_type: python primary_tool: Bio.Entrez measurable_outcome: Execute skill workflow successfully with valid output within 15 minutes. allowed-tools:
- read_file
- run_shell_command
Entrez Search
Search NCBI databases using Biopython's Entrez module (ESearch, EInfo, EGQuery utilities).
Required Setup
from Bio import Entrez Entrez.email = 'your.email@example.com' # Required by NCBI Entrez.api_key = 'your_api_key' # Optional, raises rate limit 3->10 req/sec
Core Functions
Entrez.esearch() - Search a Database
Search any NCBI database and get matching record IDs.
handle = Entrez.esearch(db='nucleotide', term='human[orgn] AND BRCA1[gene]') record = Entrez.read(handle) handle.close() print(f"Found {record['Count']} records") print(f"IDs: {record['IdList']}") # First 20 IDs by default
Key Parameters:
| Parameter | Description | Default |
|---|---|---|
| Database to search | Required |
| Search query | Required |
| Max IDs to return | 20 |
| Starting index (pagination) | 0 |
| Store results on server | 'n' |
| Sort order | database-specific |
| Date field to search | 'pdat' |
| Records from last N days | None |
| Start date (YYYY/MM/DD) | None |
| End date (YYYY/MM/DD) | None |
ESearch Result Fields:
record['Count'] # Total matching records (string) record['IdList'] # List of record IDs record['RetMax'] # Number of IDs returned record['RetStart'] # Starting index record['QueryKey'] # For history server (if usehistory='y') record['WebEnv'] # For history server (if usehistory='y') record['TranslationSet'] # Query translations applied record['QueryTranslation'] # Final translated query
Entrez.einfo() - Database Information
Get information about available databases or specific database fields.
# List all available databases handle = Entrez.einfo() record = Entrez.read(handle) handle.close() print(record['DbList']) # ['pubmed', 'protein', 'nucleotide', ...] # Get info about specific database handle = Entrez.einfo(db='nucleotide') record = Entrez.read(handle) handle.close() print(f"Description: {record['DbInfo']['Description']}") print(f"Record count: {record['DbInfo']['Count']}") # List searchable fields for field in record['DbInfo']['FieldList']: print(f"{field['Name']}: {field['Description']}")
Database Info Fields:
record['DbInfo']['DbName'] # Database name record['DbInfo']['Description'] # Database description record['DbInfo']['Count'] # Total records in database record['DbInfo']['LastUpdate'] # Last update date record['DbInfo']['FieldList'] # Searchable fields record['DbInfo']['LinkList'] # Available links to other databases
Entrez.egquery() - Global Query
Search across all NCBI databases simultaneously.
handle = Entrez.egquery(term='CRISPR') record = Entrez.read(handle) handle.close() for result in record['eGQueryResult']: if int(result['Count']) > 0: print(f"{result['DbName']}: {result['Count']} records")
Search Query Syntax
NCBI uses a specific query syntax:
Field Tags
# Search specific fields using [field_name] term = 'BRCA1[gene]' # Gene name field term = 'human[orgn]' # Organism field term = 'Homo sapiens[ORGN]' # Full organism name term = 'NM_007294[accn]' # Accession number term = 'Smith J[auth]' # Author (PubMed) term = 'Nature[jour]' # Journal (PubMed) term = '1000:5000[slen]' # Sequence length range term = 'mRNA[fkey]' # Feature key
Boolean Operators
term = 'BRCA1 AND human' # Both terms term = 'cancer OR tumor' # Either term term = 'human NOT mouse' # Exclude term term = '(BRCA1 OR BRCA2) AND human' # Grouping
Date Ranges
# Using date parameters handle = Entrez.esearch( db='pubmed', term='CRISPR', datetype='pdat', # Publication date mindate='2023/01/01', maxdate='2024/12/31' ) # Or in query string term = 'CRISPR AND 2024[pdat]' term = 'CRISPR AND 2023:2024[pdat]'
Wildcards and Phrases
term = 'immun*' # Wildcard term = '"breast cancer"[title]' # Exact phrase
Common Databases
| Database | value | Common Fields |
|---|---|---|
| PubMed | | , , , |
| Nucleotide | | , , , |
| Protein | | , , , |
| Gene | | , , |
| SRA | | , , |
| Taxonomy | | , , |
| Assembly | | , , |
Code Patterns
Basic Search with Pagination
from Bio import Entrez Entrez.email = 'your.email@example.com' def search_ncbi(db, term, max_results=100): handle = Entrez.esearch(db=db, term=term, retmax=max_results) record = Entrez.read(handle) handle.close() return record['IdList'], int(record['Count']) ids, total = search_ncbi('nucleotide', 'human[orgn] AND insulin[gene]') print(f'Retrieved {len(ids)} of {total} total records')
Paginated Search for Large Results
def search_all_ids(db, term, batch_size=10000): all_ids = [] handle = Entrez.esearch(db=db, term=term, retmax=0) record = Entrez.read(handle) handle.close() total = int(record['Count']) for start in range(0, total, batch_size): handle = Entrez.esearch(db=db, term=term, retstart=start, retmax=batch_size) record = Entrez.read(handle) handle.close() all_ids.extend(record['IdList']) return all_ids
Search with History Server (for Large Results)
# Store results on NCBI server for subsequent fetching handle = Entrez.esearch(db='nucleotide', term='human[orgn] AND mRNA[fkey]', usehistory='y') record = Entrez.read(handle) handle.close() webenv = record['WebEnv'] query_key = record['QueryKey'] total = int(record['Count']) # Use webenv and query_key with efetch for batch downloads # See batch-downloads skill for details
Recent Records Only
# Records from last 30 days handle = Entrez.esearch(db='pubmed', term='CRISPR', reldate=30, datetype='pdat') record = Entrez.read(handle) handle.close()
Get Available Fields for a Database
def get_search_fields(db): handle = Entrez.einfo(db=db) record = Entrez.read(handle) handle.close() return [(f['Name'], f['Description']) for f in record['DbInfo']['FieldList']] fields = get_search_fields('nucleotide') for name, desc in fields[:10]: print(f'{name}: {desc}')
Check Query Translation
handle = Entrez.esearch(db='nucleotide', term='human BRCA1') record = Entrez.read(handle) handle.close() # See how NCBI interpreted your query print(f"Your query was translated to: {record['QueryTranslation']}") # e.g., '"homo sapiens"[Organism] AND BRCA1[All Fields]'
Common Errors
| Error | Cause | Solution |
|---|---|---|
| Rate limit exceeded | Add delays or use API key |
| Invalid query syntax | Check field names and operators |
| Empty IdList | No matches or typo | Check QueryTranslation field |
| Missing email | Set |
Decision Tree
Need to search NCBI? ├── Finding records in one database? │ └── Use Entrez.esearch() ├── Search across all databases? │ └── Use Entrez.egquery() ├── Need database field names? │ └── Use Entrez.einfo(db='database') ├── List all available databases? │ └── Use Entrez.einfo() (no db argument) ├── Results > 10,000 records? │ └── Use usehistory='y', then batch fetch └── Need to fetch actual records? └── See entrez-fetch skill
Related Skills
- entrez-fetch - Retrieve full records after searching
- entrez-link - Find related records in other databases
- batch-downloads - Download large result sets efficiently
- geo-data - Search GEO expression datasets (specialized search)
- blast-searches - Search by sequence similarity instead of keywords