Encode-toolkit setup
Set up the ENCODE Toolkit server connection. Use when the user needs help installing, configuring, or troubleshooting the ENCODE connector.
git clone https://github.com/ammawla/encode-toolkit
T=$(mktemp -d) && git clone --depth=1 https://github.com/ammawla/encode-toolkit "$T" && mkdir -p ~/.claude/skills && cp -r "$T/plugin/skills/setup" ~/.claude/skills/ammawla-encode-toolkit-setup && rm -rf "$T"
plugin/skills/setup/SKILL.mdENCODE Toolkit Setup
When to Use
- User needs help installing or configuring the ENCODE Toolkit MCP server
- User is getting connection errors or server startup failures
- User asks "how do I set up ENCODE?" or "install ENCODE toolkit"
- User needs to configure ENCODE credentials for restricted data access
- User wants to verify their ENCODE server connection is working
- User is setting up a new environment and needs the ENCODE plugin
Help the user set up the ENCODE Toolkit server. The server connects Claude to the ENCODE Project genomics database — the largest public catalog of functional genomic elements with 8,000+ experiments across 50+ assay types.
Installation
The ENCODE Toolkit server is installed via
uvx (recommended) or pip:
For Claude Code (CLI)
claude mcp add encode -- uvx encode-toolkit
For Claude Desktop
Add to
claude_desktop_config.json:
- macOS:
~/Library/Application Support/Claude/claude_desktop_config.json - Windows:
%APPDATA%\Claude\claude_desktop_config.json
{ "mcpServers": { "encode": { "command": "uvx", "args": ["encode-toolkit"] } } }
Then restart Claude Desktop.
For VS Code (Claude Extension)
Add to your VS Code
settings.json (Ctrl/Cmd + Shift + P → "Preferences: Open Settings (JSON)"):
{ "claude.mcpServers": { "encode": { "command": "uvx", "args": ["encode-toolkit"] } } }
For Cursor
Add to
.cursor/mcp.json in your project root:
{ "mcpServers": { "encode": { "command": "uvx", "args": ["encode-toolkit"] } } }
For Windsurf
Add to
~/.codeium/windsurf/mcp_config.json:
{ "mcpServers": { "encode": { "command": "uvx", "args": ["encode-toolkit"] } } }
Alternative: pip install
pip install encode-toolkit encode-toolkit # Run the server
Verify Installation
After setup, test the connection with these verification queries (run them in order):
Step 1: Check metadata access
Ask: "List available ENCODE assay types"
- This calls
encode_get_metadata(metadata_type="assays") - Expected: Returns 50+ assay types including ChIP-seq, ATAC-seq, RNA-seq, WGBS, Hi-C
Step 2: Test search
Ask: "Search for ATAC-seq experiments on human brain"
- This calls
encode_search_experiments(assay_title="ATAC-seq", organ="brain", organism="Homo sapiens") - Expected: Returns experiment accessions (ENCSR...) with assay, biosample, and status info
Step 3: Test facets
Ask: "What organs have the most ENCODE data?"
- This calls
encode_get_facets(facet_field="organ") - Expected: Returns organ counts showing brain, liver, heart, etc. ranked by experiment count
If all three work, your setup is complete.
Authentication
Most ENCODE data is public and needs no authentication. For restricted/unreleased data:
- Get API credentials from https://www.encodeproject.org/profile/ (requires ENCODE account)
- Store them:
Ask: "Store my ENCODE credentials" → Calls encode_manage_credentials(action="store", access_key="...", secret_key="...") - Credentials are encrypted via the OS keyring (macOS Keychain, Windows Credential Manager, or Linux Secret Service)
- To verify:
encode_manage_credentials(action="status") - To remove:
encode_manage_credentials(action="remove")
20 Available Tools
After setup, these tools are available:
| Category | Tools | Purpose |
|---|---|---|
| Search | , , | Find experiments, explore data landscape, get valid filter values |
| Experiment Details | , | Get full experiment metadata, compare two experiments |
| Files | , , | Find files, list files for an experiment, get file details |
| Download | , | Download individual or batch files with MD5 verification |
| Tracking | , , | Local experiment tracking with SQLite |
| Provenance | , | Log analysis outputs with full lineage |
| Citations | , | Publication data, cross-reference to PubMed/GEO |
| Credentials | | Store/remove API credentials |
| Collection | | Summarize tracked experiment portfolio |
First-Run Walkthrough: Pancreatic Islet Epigenomics
This walkthrough demonstrates a complete workflow from installation to data exploration.
1. Explore what's available
"What ENCODE assay types are available for human pancreas?" → encode_get_facets(facet_field="assay_title", organ="pancreas", organism="Homo sapiens")
2. Find specific experiments
"Find all histone ChIP-seq experiments on human pancreas" → encode_search_experiments(assay_title="Histone ChIP-seq", organ="pancreas", organism="Homo sapiens")
3. Examine an experiment
"Get details for ENCSR123ABC" → encode_get_experiment(accession="ENCSR123ABC")
4. Find the right files
"List the preferred BED files for ENCSR123ABC" → encode_list_files(accession="ENCSR123ABC", file_format="bed", assembly="GRCh38")
5. Download data
"Download the IDR-thresholded peaks for ENCSR123ABC" → encode_download_files(accession="ENCSR123ABC", file_format="bed", output_type="IDR thresholded peaks")
6. Track your experiment
"Track ENCSR123ABC in my local database with note 'H3K27ac pancreatic islets'" → encode_track_experiment(accession="ENCSR123ABC", notes="H3K27ac pancreatic islets")
Cross-Database Integration
The ENCODE Toolkit works alongside other MCP servers and REST APIs:
| Database | Access Method | What It Adds |
|---|---|---|
| PubMed | MCP server () | Literature citations for ENCODE experiments |
| bioRxiv | MCP server () | Preprint discovery for latest research |
| ClinicalTrials.gov | MCP server () | Clinical trial cross-reference |
| Open Targets | MCP server () | Drug target identification |
| GTEx | REST API via skill | Tissue-specific expression context |
| ClinVar | REST API via skill | Clinical variant annotation |
| GWAS Catalog | REST API via skill | Trait-associated variant lookups |
| gnomAD | GraphQL via skill | Population allele frequencies |
| Ensembl | REST API via skill | VEP annotation, Regulatory Build |
| UCSC | REST API via skill | Genome browser tracks, cCRE data |
| GEO | E-utilities via skill | Complementary expression datasets |
| JASPAR | REST API via skill | TF binding motif databases |
| CellxGene | REST API via skill | Single-cell expression atlases |
47 Expert Skills
Beyond the 20 tools, the ENCODE Toolkit includes 47 skills providing domain expertise:
- Core (5): setup, search-encode, download-encode, track-experiments, cross-reference
- Analysis (9): quality-assessment, integrative-analysis, regulatory-elements, epigenome-profiling, compare-biosamples, visualization-workflow, motif-analysis, peak-annotation, batch-analysis
- Pipelines (7): pipeline-chipseq, pipeline-atacseq, pipeline-rnaseq, pipeline-wgbs, pipeline-hic, pipeline-dnaseseq, pipeline-cutandrun
- External DBs (9): gtex-expression, clinvar-annotation, cellxgene-context, gwas-catalog, jaspar-motifs, ensembl-annotation, geo-connector, gnomad-variants, ucsc-browser
- Workflows (10): data-provenance, cite-encode, variant-annotation, pipeline-guide, single-cell-encode, disease-research, publication-trust, bioinformatics-installer, scientific-writing, liftover-coordinates
- Data Aggregation (4): histone-aggregation, accessibility-aggregation, hic-aggregation, methylation-aggregation
- Meta-Analysis (2): scrna-meta-analysis, multi-omics-integration
- Functional Genomics (1): functional-screen-analysis
Pitfalls & Troubleshooting
| Problem | Cause | Fix |
|---|---|---|
| "Server not found" | Claude not restarted after config change | Restart Claude Desktop / reload Claude Code |
| "uvx not found" | uv not installed | |
| Timeout errors | Slow connection or ENCODE API load | Retry; rate limit (10 req/sec) is handled automatically |
| 403 on downloads | File requires authentication | |
| No results returned | Filters too narrow | Broaden filters; use to see available data |
| "Invalid accession" | Wrong format | Must be ENCSR/ENCFF/ENCBS format (e.g., ENCSR000AAA) |
| Empty facets | API connectivity issue | Check internet; try |
| Stale results | Cached data | Cache TTL is 1 hour; restart server to clear |
Code Examples
1. Verify server connection with metadata query
encode_get_metadata(metadata_type="assays")
Expected output:
{ "assays": ["ATAC-seq", "ChIP-seq", "CUT&RUN", "CUT&Tag", "DNase-seq", "Hi-C", "MPRA", "RNA-seq", "STARR-seq", "WGBS", "eCLIP", "scATAC-seq", "scRNA-seq"] }
2. Test search functionality
encode_search_experiments(assay_title="ATAC-seq", organ="brain", organism="Homo sapiens", limit=3)
Expected output:
{ "total": 32, "results": [ {"accession": "ENCSR000AAA", "assay_title": "ATAC-seq", "biosample_summary": "brain", "status": "released"} ] }
3. Test facet exploration
encode_get_facets(facet_field="organ", organism="Homo sapiens")
Expected output:
{ "facets": { "organ": {"brain": 450, "blood": 380, "liver": 220, "heart": 180, "lung": 150} } }
Related Skills
| Skill | When to Use |
|---|---|
| First skill to use after setup — find experiments by assay, tissue, target |
| Download ENCODE files (BED, bigWig, FASTQ, BAM) after finding experiments |
| Set up Nextflow pipelines for processing raw ENCODE data |
| Install all bioinformatics tools needed for ENCODE analysis |
| Link ENCODE experiments to PubMed, GEO, ClinicalTrials.gov |
| Evaluate data quality before analysis |
| Verify literature claims backing analytical decisions |
Presenting Results
When reporting setup results:
- Connection status: Confirm the ENCODE Toolkit server is connected and responding. Report the server version if available
- Available tools: List the 20 available ENCODE tools grouped by function (search, download, track, cross-reference, credentials)
- Test query result: Run a simple validation query (e.g.,
) and confirm it returns results successfullyencode_get_metadata(metadata_type="assays") - Authentication status: Note whether credentials are configured (for restricted data) or that public data access requires no authentication
- Troubleshooting: If any issues were encountered during setup, summarize the problem and resolution
- Next steps: Suggest
to find experiments, orsearch-encode
to explore what ENCODE data is available for their research areaencode_get_facets