Medical-research-skills string-database
Access the STRING database to map identifiers, retrieve protein–protein interaction networks, and run functional/PPI enrichment when you need interaction context for a gene/protein set.
install
source · Clone the upstream repo
git clone https://github.com/aipoch/medical-research-skills
Claude Code · Install into ~/.claude/skills/
T=$(mktemp -d) && git clone --depth=1 https://github.com/aipoch/medical-research-skills "$T" && mkdir -p ~/.claude/skills && cp -r "$T/scientific-skills/Evidence Insight/string-database" ~/.claude/skills/aipoch-medical-research-skills-string-database && rm -rf "$T"
manifest:
scientific-skills/Evidence Insight/string-database/SKILL.md
When to Use
- You have gene symbols (e.g., TP53) and need to resolve them to STRING protein identifiers for downstream analysis.
- You want to retrieve a protein–protein interaction (PPI) network (functional/physical) with confidence scores for one or more proteins.
- You need to find interaction partners for a target protein to expand a candidate list (e.g., add top N neighbors).
- You want to perform functional enrichment (GO/KEGG/Reactome, etc.) for a protein set to interpret biological themes.
- You need a quick static visualization (PNG/SVG) of a STRING network for reports or notebooks.
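The candidate-list expansion mentioned above (adding the top N neighbors) can be sketched as follows. This is a minimal illustration, not the skill's own code. It assumes partner edges are already available as dictionaries with preferredName_A, preferredName_B, and score fields (the column names used in STRING's tabular network output). The helper name top_partners and the sample values are hypothetical.

```python
def top_partners(edges, target, n=10):
    """Return up to n partner names for `target`, highest score first."""
    best = {}
    for edge in edges:
        a, b = edge["preferredName_A"], edge["preferredName_B"]
        if target not in (a, b):
            continue  # edge does not involve the target protein
        partner = b if a == target else a
        if partner == target:
            continue  # ignore self-loops
        # Keep the best score if a partner appears in more than one edge.
        best[partner] = max(best.get(partner, 0.0), float(edge["score"]))
    ranked = sorted(best.items(), key=lambda item: item[1], reverse=True)
    return [name for name, _ in ranked[:n]]


# Placeholder edges for illustration only; these are not real STRING scores.
edges = [
    {"preferredName_A": "TP53", "preferredName_B": "PARTNER_A", "score": 0.95},
    {"preferredName_A": "PARTNER_B", "preferredName_B": "TP53", "score": 0.99},
    {"preferredName_A": "TP53", "preferredName_B": "PARTNER_C", "score": 0.40},
    {"preferredName_A": "PARTNER_A", "preferredName_B": "PARTNER_B", "score": 0.80},
]

# Expand the seed list with the two strongest partners of TP53.
candidates = ["TP53"] + top_partners(edges, "TP53", n=2)
print(candidates)
```

Ranking on the client side keeps the choice of N independent of whatever limit the wrapper passes to STRING.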
Key Features
- ID Mapping: Convert gene/protein names to STRING identifiers for a given organism.
- Network Retrieval: Fetch interaction edges with confidence scores from STRING.
- Interaction Partners: Expand a protein list by retrieving interaction partners.
- Enrichment Analysis:
  - Functional enrichment (e.g., GO, KEGG, Reactome)
  - PPI enrichment statistics
  - Functional annotations (e.g., PFAM/SMART where supported by STRING endpoints)
- Visualization: Download static network images (PNG/SVG).
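As a rough sketch of how these features relate to the STRING REST API, the snippet below builds request URLs without sending them. The method names (get_string_ids, network, interaction_partners, enrichment, ppi_enrichment, functional_annotation) and the output formats (json, image, svg) follow STRING's public API documentation. The feature labels, the ENDPOINTS table, and build_url are hypothetical and are not taken from scripts/string_api.py, whose internals are not shown here.

```python
from urllib.parse import urlencode

STRING_API_BASE = "https://string-db.org/api"

# Hypothetical feature labels mapped to STRING REST method names.
ENDPOINTS = {
    "id_mapping": "get_string_ids",
    "network": "network",
    "partners": "interaction_partners",
    "functional_enrichment": "enrichment",
    "ppi_enrichment": "ppi_enrichment",
    "functional_annotation": "functional_annotation",
}


def build_url(feature, identifiers, species, output_format="json"):
    """Compose a STRING request URL (the request itself is not sent)."""
    method = ENDPOINTS[feature]
    params = {
        # STRING separates multiple identifiers with a carriage return (%0d).
        "identifiers": "\r".join(identifiers),
        "species": species,
    }
    return f"{STRING_API_BASE}/{output_format}/{method}?{urlencode(params)}"


# Network edges as JSON, and the same network as an SVG image.
print(build_url("network", ["TP53", "MDM2"], 9606))
print(build_url("network", ["TP53", "MDM2"], 9606, output_format="svg"))
```

In STRING's URL scheme a static image is the network method requested with an image output format, so the same builder covers both cases.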
Dependencies
- Python >=3.8
- requests (tested with >=2.28)
- pandas (tested with >=1.5)
Install:
pip install requests pandas
Example Usage
    from scripts.string_api import StringClient


    def main():
        # STRING does not require a secret API key, but providing a caller identity is recommended.
        client = StringClient(caller_identity="my_analysis_tool")

        # 1) Map an identifier (e.g., TP53 in Homo sapiens; NCBI taxonomy ID 9606)
        protein_id = client.map_id(identifier="TP53", species=9606)
        print("Mapped ID:", protein_id)

        # 2) Download a network image and expand by adding interaction partners
        client.get_network_image(
            identifiers=[protein_id],
            output_file="tp53_network.png",
            add_color_nodes=10,  # add 10 partners
        )
        print("Saved network image to tp53_network.png")

        # 3) Run PPI enrichment for the set
        ppi_stats = client.get_ppi_enrichment(identifiers=[protein_id])
        print("PPI enrichment:", ppi_stats)


    if __name__ == "__main__":
        main()
Implementation Details
- Client entry point: scripts/string_api.py provides the main wrapper (e.g., StringClient) around the STRING REST API.
- Caller identity:
  - STRING endpoints do not require an API key.
  - A caller_identity string is strongly recommended (project name/email/URL) to support rate/load management.
  - Pass it at initialization (e.g., StringClient(caller_identity="my_email@example.com")) or inject via environment variables in your own wrapper.
- Organism selection:
  - Most operations require a species identifier (commonly NCBI taxonomy ID, e.g., 9606 for human).
- Network retrieval and scoring:
  - Network endpoints return interactions with confidence scores. Filter them downstream by applying a score threshold in your analysis code, or at request time if the wrapper exposes a threshold parameter.
- Visualization:
  - Static images are retrieved directly from STRING image endpoints and written to disk (PNG/SVG depending on the method/parameters).
- Reference documentation:
  - See references/string_reference.md for original API notes and endpoint details included with this skill.
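The caller-identity and score-threshold notes above can be sketched as two small helpers. This is an illustration under stated assumptions rather than the skill's implementation: the environment variable name STRING_CALLER_IDENTITY and both function names are hypothetical, the edge records are assumed to carry a score field on a 0 to 1 scale as in STRING's network output, and the default cutoff of 0.7 corresponds to what STRING labels high confidence.

```python
import os


def build_params(identifiers, species, caller_identity=None):
    """Assemble query parameters, falling back to an environment variable
    (hypothetical name) when no caller identity is passed explicitly."""
    identity = caller_identity or os.environ.get("STRING_CALLER_IDENTITY")
    params = {
        "identifiers": "\r".join(identifiers),  # carriage-return separated
        "species": species,  # NCBI taxonomy ID, e.g. 9606 for human
    }
    if identity:
        params["caller_identity"] = identity
    return params


def filter_by_score(edges, min_score=0.7):
    """Keep only edges whose combined score is at least min_score."""
    return [edge for edge in edges if float(edge["score"]) >= min_score]


# Placeholder edges for illustration only; these are not real STRING scores.
sample_edges = [
    {"preferredName_A": "TP53", "preferredName_B": "PARTNER_A", "score": 0.92},
    {"preferredName_A": "TP53", "preferredName_B": "PARTNER_B", "score": 0.45},
]

params = build_params(["TP53"], 9606, caller_identity="my_analysis_tool")
strong_edges = filter_by_score(sample_edges)
print(params)
print(strong_edges)
```

An explicit argument takes precedence over the environment variable, so a notebook can override a machine-wide default.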