Create-bindu-agent skills

id: biomni-v1

install
source · Clone the upstream repo
git clone https://github.com/GetBindu/create-bindu-agent
manifest: hooks/skills/biomni-skill.yaml
safety · automated scan (low risk)
This is a pattern-based risk scan, not a security review. Our crawler flagged:
  • pip install
Always read a skill's source content before installing. Patterns alone don't mean the skill is malicious — but they warrant attention.
source content

id: biomni-v1
name: biomni-agent
version: 1.0.0
author: paras@getbindu.com

Description

description: |
  Biomedical AI agent that combines LLM reasoning with four specialized tools: BLAST sequence alignment, scRNA-seq clustering analysis, ChEMBL ADMET queries, and PDF report generation. Designed for computational research workflows, experimental planning, and biomedical hypothesis generation.

Tags and Modes

tags:

  • biomedical-ai
  • sequence-alignment
  • scrna-seq
  • admet-query
  • research-planning
  • bioinformatics

input_modes:

  • text/plain
  • application/json

output_modes:

  • text/plain
  • application/json
  • text/markdown

Example Queries

examples:

  • "Run BLAST alignment for this DNA sequence: ACGTACGTACGTACGTACGT..."
  • "Analyze scRNA-seq data from file /path/to/data.h5ad and identify clusters"
  • "Query ChEMBL for ADMET properties of SMILES: CC(=O)Nc1ccc(O)cc1"
  • "Design a computational workflow for identifying drug targets in TNBC"
  • "Generate a PDF report summarizing my analysis results"

Detailed Capabilities

capabilities_detail:
  sequence_alignment:
    supported: true
    method: ncbi_blast_api
    programs:
      - blastn
      - blastp
      - blastx
      - tblastn
    databases:
      - nr
      - nt
      - refseq_protein
    constraints:
      - minimum_sequence_length: 20
      - timeout_seconds: 180
      - max_hits_returned: 5

  scrna_analysis:
    supported: true
    library: scanpy
    analysis_steps:
      - quality_filtering
      - normalization_log1p
      - pca
      - neighbor_graph
      - leiden_clustering
    outputs:
      - cluster_counts
      - cell_gene_statistics
    limitations:
      - requires_h5ad_or_scanpy_compatible_file
      - no_visualization_export
      - clustering_only

  admet_query:
    supported: true
    data_source: chembl_webresource_client
    query_method: smiles_flexmatch
    returns:
      - molecule_name
      - chembl_id
      - bioactivity_count
    constraints:
      - timeout_seconds: 60
      - max_activities: 20
      - requires_valid_smiles

  pdf_generation:
    supported: true
    library: reportlab
    format: basic_text_layout
    features:
      - title_page
      - timestamp
      - multi_page_support
    limitations:
      - no_figures_or_tables
      - no_complex_formatting
      - text_only

  not_supported:
    crispr_design: "No CRISPR guide RNA design tools"
    differential_expression: "Clustering only, no DE analysis"
    admet_prediction: "ChEMBL lookup only, no de novo prediction"
    pdf_parsing: "Generation only, no extraction"
    visualization_export: "Descriptions only, no figure rendering"
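
The `analysis_steps` listed under `scrna_analysis` correspond to standard Scanpy calls. The following is a minimal sketch of that sequence, not the agent's actual implementation: the function names `cluster_h5ad` and `summarize_clusters` are hypothetical, and the filtering thresholds (`min_genes=200`, `min_cells=3`) and `target_sum=1e4` are assumptions, since the manifest does not state them. Leiden clustering also needs the `leidenalg` package, which the manifest's package list does not mention.

```python
from collections import Counter
from typing import Dict, Iterable


def summarize_clusters(labels: Iterable[str]) -> Dict[str, int]:
    """Count cells per cluster label, largest cluster first."""
    counts = Counter(str(label) for label in labels)
    return dict(counts.most_common())


def cluster_h5ad(path: str) -> dict:
    """Run the manifest's analysis steps on an .h5ad file (hypothetical helper)."""
    import scanpy as sc  # imported lazily so summarize_clusters works without scanpy

    adata = sc.read_h5ad(path)
    # quality_filtering: thresholds are assumptions, not taken from the manifest
    sc.pp.filter_cells(adata, min_genes=200)
    sc.pp.filter_genes(adata, min_cells=3)
    # normalization_log1p
    sc.pp.normalize_total(adata, target_sum=1e4)
    sc.pp.log1p(adata)
    # pca, then neighbor_graph, then leiden_clustering
    sc.pp.pca(adata)
    sc.pp.neighbors(adata)
    sc.tl.leiden(adata)  # writes labels to adata.obs["leiden"]

    cluster_sizes = summarize_clusters(adata.obs["leiden"])
    return {
        "success": True,
        "n_cells": int(adata.n_obs),
        "n_genes": int(adata.n_vars),
        "n_clusters": len(cluster_sizes),
        "cluster_sizes": cluster_sizes,
    }
```

The returned keys mirror the scRNA-seq `output_sample` shown in the manifest's examples; no plots are produced, in line with the `no_visualization_export` limitation.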

Requirements

requirements:
  packages:
    - agno>=0.1.0
    - biopython>=1.80
    - scanpy>=1.9.0
    - chembl-webresource-client>=0.10.0
    - reportlab>=4.0.0
    - bindu
    - python-dotenv>=1.0.1
  system: []
  min_memory_mb: 1024
  external_services:
    - NCBI BLAST API (public, no key required)
    - ChEMBL Web Services (public, no key required)
    - OpenRouter API or OpenAI API (one of the two is required)
  environment_variables:
    - OPENROUTER_API_KEY (required if using OpenRouter)
    - OPENAI_API_KEY (required if using OpenAI directly)
    - MEM0_API_KEY (optional, for memory features)
    - MODEL_NAME (optional, defaults to openai/gpt-4o)
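
The environment variables above imply a small start-up check: one of the two API keys must be present, and `MODEL_NAME` falls back to `openai/gpt-4o`. A minimal sketch of that check follows. The function name `resolve_llm_config` is hypothetical, and preferring OpenRouter when both keys are set is an assumption; the manifest does not say which key wins.

```python
import os
from typing import Mapping, Optional, Tuple

DEFAULT_MODEL = "openai/gpt-4o"  # default stated in the manifest


def resolve_llm_config(env: Optional[Mapping[str, str]] = None) -> Tuple[str, str, str]:
    """Return (provider, api_key, model_name), or raise if no API key is set."""
    env = os.environ if env is None else env
    model = env.get("MODEL_NAME") or DEFAULT_MODEL
    # Assumption: OpenRouter takes precedence when both keys are present.
    if env.get("OPENROUTER_API_KEY"):
        return "openrouter", env["OPENROUTER_API_KEY"], model
    if env.get("OPENAI_API_KEY"):
        return "openai", env["OPENAI_API_KEY"], model
    raise RuntimeError("Set OPENROUTER_API_KEY or OPENAI_API_KEY before starting the agent")
```

Passing a plain dict instead of relying on `os.environ` keeps the check easy to exercise in isolation, for example `resolve_llm_config({"OPENAI_API_KEY": "sk-..."})`.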

Performance Metrics

performance:
  avg_processing_time_seconds: 60-180
  blast_timeout: 180
  chembl_timeout: 60
  async_execution: true
  max_concurrent_tools: 4
  memory_enabled: false
  notes: "BLAST queries can take 1-3 minutes depending on NCBI server load"
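
The timeout and concurrency figures above can be enforced with standard `asyncio` primitives. The sketch below is one possible arrangement, not the agent's code: `run_tool` is a hypothetical wrapper, and the shape of the returned dictionaries is an assumption loosely modeled on the manifest's `success` flag.

```python
import asyncio
from typing import Any, Awaitable, Callable, Dict, Mapping

TOOL_TIMEOUTS = {"blast": 180.0, "chembl": 60.0}  # seconds, from the manifest
MAX_CONCURRENT_TOOLS = 4  # from the manifest


async def run_tool(
    name: str,
    tool: Callable[[], Awaitable[Any]],
    limiter: asyncio.Semaphore,
    timeouts: Mapping[str, float] = TOOL_TIMEOUTS,
) -> Dict[str, Any]:
    """Run one tool coroutine under the concurrency cap and its timeout."""
    async with limiter:  # at most MAX_CONCURRENT_TOOLS tools run at once
        try:
            # Tools without a configured timeout run unbounded (timeout=None).
            result = await asyncio.wait_for(tool(), timeout=timeouts.get(name))
            return {"success": True, "tool": name, "result": result}
        except asyncio.TimeoutError:
            return {"success": False, "tool": name, "error": f"{name} timed out"}
```

Create the semaphore inside the running event loop, for example `limiter = asyncio.Semaphore(MAX_CONCURRENT_TOOLS)` at the top of the coroutine that dispatches the tools, and share it across all `run_tool` calls.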

Agent Configuration

agents:
  llm_capabilities:
    - experimental_design
    - hypothesis_generation
    - workflow_planning
    - result_interpretation
    - literature_synthesis
  tool_capabilities:
    - sequence_homology_search
    - single_cell_clustering
    - compound_bioactivity_lookup
    - report_formatting

Rich Documentation

documentation:
  overview: |
    BioOmni is a biomedical AI agent that combines GPT-4 reasoning with four specialized bioinformatics tools. It accepts natural language requests for sequence analysis, scRNA-seq clustering, compound queries, and research planning. The agent runs tool calls asynchronously and returns structured results with biological interpretations.

    Tools are implemented using Biopython (BLAST), Scanpy (scRNA-seq), the ChEMBL web resource client (ADMET), and ReportLab (PDF). The LLM handles parameter extraction, workflow design, and result interpretation.

use_cases:
  when_to_use:
    - User needs BLAST sequence alignment for gene/protein identification
    - User has scRNA-seq data in h5ad format and needs clustering
    - User wants to query ChEMBL for known compound bioactivity
    - User needs computational research workflow design
    - User wants experimental planning or hypothesis generation
    - User needs a text-based PDF report
  when_not_to_use:
    - User needs CRISPR guide RNA design (no specialized tool)
    - User needs differential expression analysis (clustering only)
    - User needs de novo ADMET prediction (ChEMBL lookup only)
    - User needs PDF parsing or data extraction (generation only)
    - User needs actual visualization files (descriptions only)
    - User needs laboratory execution or wet-lab protocols
    - User needs medical diagnosis or clinical advice

input_structure: |
Accepts natural language text or JSON describing biomedical tasks.

Example inputs:
- "Run BLAST for this sequence: ACGTACGTACGTACGTACGT..."
- "Analyze scRNA-seq file at /data/pbmc.h5ad"
- "Query ChEMBL for aspirin: CC(=O)Oc1ccccc1C(=O)O"
- "Design a workflow to identify drug targets in triple-negative breast cancer"

Tool-specific constraints:
- BLAST: Sequences must be ≥20 characters, valid DNA/RNA/protein alphabet
- scRNA-seq: Requires h5ad or scanpy-compatible file path
- ADMET: Requires valid SMILES string
- PDF: Accepts title and text content

Max input length: No hard limit, but LLM context window applies
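
The BLAST constraint above (at least 20 characters from a valid DNA, RNA, or protein alphabet) can be checked before any network call is made. The sketch below is illustrative only: `validate_sequence` is a hypothetical name, the alphabets are assumptions (nucleotides `ACGTU` plus `N`, and the 20 standard amino acids without ambiguity codes), and the manifest does not say how the agent itself validates input.

```python
import re

MIN_SEQUENCE_LENGTH = 20  # from the manifest

# Assumed alphabets; the manifest only says "valid DNA/RNA/protein alphabet".
NUCLEOTIDE_RE = re.compile(r"[ACGTUN]+")
PROTEIN_RE = re.compile(r"[ACDEFGHIKLMNPQRSTVWY]+")


def validate_sequence(raw: str) -> dict:
    """Strip whitespace, upper-case, then check length and alphabet."""
    seq = "".join(raw.split()).upper()
    if len(seq) < MIN_SEQUENCE_LENGTH:
        return {
            "valid": False,
            "error": f"sequence has {len(seq)} characters; at least {MIN_SEQUENCE_LENGTH} required",
        }
    # Strings made only of A/C/G/T/N are also valid protein text;
    # this sketch classifies them as nucleotide.
    if NUCLEOTIDE_RE.fullmatch(seq):
        return {"valid": True, "kind": "nucleotide", "length": len(seq)}
    if PROTEIN_RE.fullmatch(seq):
        return {"valid": True, "kind": "protein", "length": len(seq)}
    return {
        "valid": False,
        "error": "sequence contains characters outside the nucleotide and protein alphabets",
    }
```

The detected `kind` could then guide the choice between nucleotide programs such as `blastn` and protein programs such as `blastp`, although that routing is not described in the manifest.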

output_format: |
Returns JSON with tool results and LLM interpretation:

{
  "success": true,
  "tool_results": {
    "blast": {"hits": [...], "num_hits": 5},
    "scrna": {"n_clusters": 8, "cluster_sizes": {...}},
    "admet": {"chembl_id": "CHEMBL25", "activities_found": 15},
    "pdf": {"size_bytes": 2048, "summary": "PDF generated"}
  },
  "interpretation": "LLM-generated biological context and next steps"
}

For planning tasks, returns structured markdown workflows.
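
A caller consuming this JSON might flatten it into a few log lines. The following sketch assumes the response arrives as a JSON string with the keys shown above; `summarize_response` is a hypothetical helper, and the `error` key used for failed responses is an assumption, because the manifest does not show the shape of an error payload.

```python
import json
from typing import Any, Dict, List


def summarize_response(raw: str) -> List[str]:
    """Turn an agent response (JSON text) into short human-readable lines."""
    payload: Dict[str, Any] = json.loads(raw)
    if not payload.get("success", False):
        # Assumed failure shape: {"success": false, "error": "..."}
        return [f"error: {payload.get('error', 'unknown error')}"]

    lines: List[str] = []
    for tool, result in payload.get("tool_results", {}).items():
        # Keep scalar fields only; nested hits and cluster maps are skipped.
        fields = ", ".join(
            f"{key}={value}"
            for key, value in result.items()
            if not isinstance(value, (list, dict))
        )
        lines.append(f"{tool}: {fields}")
    if payload.get("interpretation"):
        lines.append(f"interpretation: {payload['interpretation']}")
    return lines
```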

error_handling:
  - "Agent not initialized: Returns error requesting API keys"
  - "BLAST timeout (>180s): Returns timeout error"
  - "ChEMBL timeout (>60s): Returns timeout error"
  - "Invalid sequence: Returns validation error with details"
  - "Invalid SMILES: Returns validation error"
  - "File not found (scRNA-seq): Returns file error"
  - "Tool execution failed: Returns error with exception details"

examples:
- title: "BLAST Sequence Alignment"
  input: "Run BLAST for this DNA sequence: ACGTACGTACGTACGTACGTACGTACGT"
  output_sample: |
    {
      "success": true,
      "query_length": 28,
      "num_hits": 5,
      "hits": [
        {
          "title": "Homo sapiens chromosome 1",
          "e_value": 0.001,
          "identity_percent": 96.4,
          "score": 52.0
        }
      ],
      "summary": "Found 5 BLAST hits."
    }

- title: "scRNA-seq Clustering"
  input: "Analyze scRNA-seq data from /data/pbmc3k.h5ad"
  output_sample: |
    {
      "success": true,
      "n_cells": 2638,
      "n_genes": 1838,
      "n_clusters": 8,
      "cluster_sizes": {"0": 458, "1": 392, "2": 341, ...},
      "summary": "Identified 8 clusters."
    }

- title: "ChEMBL ADMET Query"
  input: "Query ChEMBL for SMILES: CC(=O)Oc1ccccc1C(=O)O"
  output_sample: |
    {
      "success": true,
      "molecule": "ASPIRIN",
      "chembl_id": "CHEMBL25",
      "activities_found": 18
    }

best_practices:
  for_users:
    - "Provide sequences ≥20 characters for BLAST"
    - "Use scanpy-compatible h5ad files for scRNA-seq"
    - "Validate SMILES strings before ADMET queries"
    - "Specify clear research objectives for planning tasks"
    - "Request PDF reports only for text-based summaries"
    - "Be patient with BLAST queries (they can take 1-3 minutes; the timeout is 180 seconds)"
  for_orchestrators:
    - "Route sequence alignment requests to this agent"
    - "Route scRNA-seq clustering requests to this agent"
    - "Route ChEMBL compound queries to this agent"
    - "Route biomedical workflow planning to this agent"
    - "Do NOT route CRISPR design (no tool support)"
    - "Do NOT route differential expression (clustering only)"
    - "Do NOT route de novo ADMET prediction (lookup only)"
    - "Do NOT route PDF parsing (generation only)"
    - "Expect 1-3 minute processing for BLAST queries"
    - "Expect 15-60 seconds for other tools"

installation: |
Required packages:
pip install agno biopython scanpy chembl-webresource-client reportlab bindu python-dotenv

Environment variables (one required):
OPENROUTER_API_KEY=your_openrouter_api_key
OR
OPENAI_API_KEY=your_openai_api_key

Optional:
MEM0_API_KEY=your_mem0_api_key  # For memory features
MODEL_NAME=openai/gpt-4o  # Default model

versioning:
  - version: "1.0.0"
    date: "2026-03-02"
    changes: "Initial release with BLAST, scRNA-seq, ChEMBL, and PDF tools"

Assessment fields for skill negotiation

assessment:
  keywords:
    - blast
    - sequence
    - alignment
    - dna
    - rna
    - protein
    - scrna
    - single-cell
    - clustering
    - scanpy
    - admet
    - chembl
    - smiles
    - compound
    - molecule
    - bioactivity
    - research
    - workflow
    - experimental
    - hypothesis
    - biomedical
    - bioinformatics
    - genomics
    - transcriptomics
    - drug
    - target

  specializations:
    - domain: sequence_alignment
      confidence_boost: 0.5
    - domain: scrna_seq_analysis
      confidence_boost: 0.5
    - domain: compound_queries
      confidence_boost: 0.4
    - domain: research_planning
      confidence_boost: 0.3
    - domain: biomedical_workflows
      confidence_boost: 0.3

  anti_patterns:
    - "crispr guide design"
    - "differential expression"
    - "de novo admet"
    - "admet prediction from structure"
    - "pdf parsing"
    - "pdf extraction"
    - "visualization export"
    - "figure generation"
    - "wet lab"
    - "laboratory execution"
    - "medical diagnosis"
    - "clinical advice"
    - "patient treatment"
    - "drug prescription"
    - "genome assembly"
    - "variant calling"
    - "pathway enrichment"
    - "protein structure prediction"

  complexity_indicators:
    simple:
      - "blast search"
      - "sequence alignment"
      - "chembl query"
      - "single tool use"
    medium:
      - "scrna-seq clustering"
      - "workflow design"
      - "hypothesis generation"
      - "multi-step planning"
    complex:
      - "multi-omics workflow"
      - "drug discovery pipeline"
      - "target identification strategy"
    not_supported:
      - "crispr design"
      - "differential expression"
      - "de novo prediction"
      - "actual bookings or execution"
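
The manifest lists keywords, confidence boosts, and anti-patterns but does not define how an orchestrator combines them. The sketch below shows one plausible scoring rule purely for illustration. Everything about the formula is an assumption: the function name `assess`, the 0.1-per-keyword base score capped at 0.5, the hard veto on any anti-pattern, and the `boosts` mapping from a trigger keyword to its domain's `confidence_boost`.

```python
import re
from typing import Dict, Iterable


def assess(
    query: str,
    keywords: Iterable[str],
    anti_patterns: Iterable[str],
    boosts: Dict[str, float],
) -> float:
    """Return a 0..1 suitability score for routing `query` to this skill."""
    text = query.lower()
    # Assumed policy: any anti-pattern phrase vetoes the skill outright.
    if any(pattern in text for pattern in anti_patterns):
        return 0.0

    words = set(re.findall(r"[a-z0-9-]+", text))
    hits = sum(1 for keyword in keywords if keyword in words)
    if hits == 0:
        return 0.0

    base = min(hits / 10, 0.5)  # assumed: 0.1 per matched keyword, capped at 0.5
    # Assumed: the largest boost among matched trigger keywords applies once.
    boost = max((b for trigger, b in boosts.items() if trigger in words), default=0.0)
    return round(min(base + boost, 1.0), 2)
```

For example, with keywords `["blast", "sequence", "alignment", "dna"]` and boosts `{"blast": 0.5}`, the query "Run a BLAST sequence alignment for this DNA" matches four keywords, giving a base of 0.4 plus the 0.5 boost.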