install
source · Clone the upstream repo
git clone https://github.com/mdbabumiamssm/LLMs-Universal-Life-Science-and-Clinical-Skills-
Claude Code · Install into ~/.claude/skills/
T=$(mktemp -d) && git clone --depth=1 https://github.com/mdbabumiamssm/LLMs-Universal-Life-Science-and-Clinical-Skills- "$T" && mkdir -p ~/.claude/skills && cp -r "$T/Skills/Sequence_Analysis/sequence-io/write-sequences" ~/.claude/skills/mdbabumiamssm-llms-universal-life-science-and-clinical-skills-write-sequences && rm -rf "$T"
manifest: Skills/Sequence_Analysis/sequence-io/write-sequences/SKILL.md
<!--
# COPYRIGHT NOTICE
# This file is part of the "Universal Biomedical Skills" project.
# Copyright (c) 2026 MD BABU MIA, PhD <md.babu.mia@mssm.edu>
# All Rights Reserved.
#
# This code is proprietary and confidential.
# Unauthorized copying of this file, via any medium is strictly prohibited.
#
# Provenance: Authenticated by MD BABU MIA
-->
name: bio-write-sequences
description: Write biological sequences to files (FASTA, FASTQ, GenBank, EMBL) using Biopython Bio.SeqIO. Use when saving sequences, creating new sequence files, or outputting modified records.
tool_type: python
primary_tool: Bio.SeqIO
measurable_outcome: Execute skill workflow successfully with valid output within 15 minutes.
allowed-tools:
- read_file
- run_shell_command
Write Sequences
Write SeqRecord objects to sequence files using Biopython's Bio.SeqIO module.
Required Import
from Bio import SeqIO
from Bio.Seq import Seq
from Bio.SeqRecord import SeqRecord
Core Functions
SeqIO.write() - Write Records to File
Write one or more SeqRecord objects to a file.
SeqIO.write(records, 'output.fasta', 'fasta')
Parameters:
- records - Single SeqRecord, list, or iterator of SeqRecords
- handle - Filename (string) or file handle
- format - Output format string (lowercase, e.g. 'fasta')
Returns: Number of records written (integer)
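To make the parameters and return value concrete, here is a minimal sketch that writes to an in-memory text handle instead of a file. The record contents are hypothetical, and `description=""` is set only so that the FASTA header contains just the ID.

```python
from io import StringIO

from Bio import SeqIO
from Bio.Seq import Seq
from Bio.SeqRecord import SeqRecord

# Hypothetical example records; an empty description keeps the header to the ID only.
records = [
    SeqRecord(Seq("ATGC"), id="seq1", description=""),
    SeqRecord(Seq("GCTA"), id="seq2", description=""),
]

# A StringIO object stands in for a file opened in text mode.
handle = StringIO()
count = SeqIO.write(records, handle, "fasta")

print(count)              # number of records written
print(handle.getvalue())  # the FASTA text that would have gone to disk
```

Passing a handle rather than a filename is also convenient in tests, since nothing touches the file system.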
record.format() - Get Formatted String
Get a string representation without writing to file.
formatted = record.format('fasta')
print(formatted)
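As a quick check of how the two functions relate, the following sketch compares the string from `record.format()` with what `SeqIO.write()` sends to an in-memory handle. The record is a hypothetical example.

```python
from io import StringIO

from Bio import SeqIO
from Bio.Seq import Seq
from Bio.SeqRecord import SeqRecord

# Hypothetical record used only for this comparison.
record = SeqRecord(Seq("ATGCGATCGATCG"), id="seq1", description="demo")

# String representation, no file involved.
formatted = record.format("fasta")

# Same record written through SeqIO.write() to an in-memory handle.
handle = StringIO()
SeqIO.write(record, handle, "fasta")
written = handle.getvalue()

print(formatted == written)
```

For text formats the two routes should give the same output, so `record.format()` is a convenient way to preview a record before writing it.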
Creating SeqRecord Objects
Minimal SeqRecord
record = SeqRecord(Seq('ATGCGATCGATCG'), id='seq1')
Full SeqRecord
record = SeqRecord(
    Seq('ATGCGATCGATCG'),
    id='seq1',
    name='sequence_one',
    description='Example sequence for demonstration'
)
With Annotations (for GenBank output)
from Bio.SeqFeature import SeqFeature, FeatureLocation

record = SeqRecord(
    Seq('ATGCGATCGATCG'),
    id='seq1',
    annotations={'molecule_type': 'DNA'}
)
record.features.append(
    SeqFeature(FeatureLocation(0, 9), type='gene', qualifiers={'gene': ['exampleGene']})
)
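One way to confirm that annotations and features survive the trip to GenBank is to write the record to an in-memory handle and parse it back. This is a minimal sketch under the same assumptions as the snippet above; setting `name='seq1'` is an addition here so that the LOCUS line has an explicit name.

```python
from io import StringIO

from Bio import SeqIO
from Bio.Seq import Seq
from Bio.SeqFeature import FeatureLocation, SeqFeature
from Bio.SeqRecord import SeqRecord

record = SeqRecord(
    Seq("ATGCGATCGATCG"),
    id="seq1",
    name="seq1",  # assumption: explicit name for the LOCUS line
    annotations={"molecule_type": "DNA"},
)
record.features.append(
    SeqFeature(FeatureLocation(0, 9), type="gene", qualifiers={"gene": ["exampleGene"]})
)

# Write to memory, then rewind and parse the same text back.
handle = StringIO()
SeqIO.write(record, handle, "genbank")
handle.seek(0)
parsed = SeqIO.read(handle, "genbank")

feature = parsed.features[0]
print(feature.type, feature.location, feature.qualifiers["gene"])
```

Note that Biopython locations are zero-based and end-exclusive, so `FeatureLocation(0, 9)` appears as `1..9` in the GenBank text.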
Common Formats
| Format | String | Notes |
|---|---|---|
| FASTA | 'fasta' | Most universal, sequence + header only |
| FASTQ | 'fastq' | Requires quality scores in letter_annotations |
| GenBank | 'genbank' | Requires annotations and molecule_type |
| EMBL | 'embl' | Similar requirements to GenBank |
| Tab | 'tab' | Simple ID + sequence tabular format |
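The format strings in the table can be exercised side by side. The sketch below writes one hypothetical record in each format to an in-memory handle, after adding the annotations that FASTQ and GenBank/EMBL need.

```python
from io import StringIO

from Bio import SeqIO
from Bio.Seq import Seq
from Bio.SeqRecord import SeqRecord

# Hypothetical record carrying everything the five formats require.
record = SeqRecord(Seq("ATGC"), id="seq1", name="seq1", description="example")
record.annotations["molecule_type"] = "DNA"            # needed for genbank / embl
record.letter_annotations["phred_quality"] = [30] * 4  # needed for fastq

outputs = {}
for fmt in ("fasta", "fastq", "genbank", "embl", "tab"):
    handle = StringIO()
    SeqIO.write(record, handle, fmt)
    outputs[fmt] = handle.getvalue()

# Show the first line of each output to compare the layouts.
for fmt, text in outputs.items():
    print(fmt, "->", text.splitlines()[0])
```

Only the sequence and ID survive in every format; the description, quality scores, and annotations are written only where the format has a place for them.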
Code Patterns
Write Single Record
record = SeqRecord(Seq('ATGC'), id='my_seq', description='test sequence')
SeqIO.write(record, 'output.fasta', 'fasta')
Write Multiple Records
records = [
    SeqRecord(Seq('ATGC'), id='seq1'),
    SeqRecord(Seq('GCTA'), id='seq2'),
    SeqRecord(Seq('TTAA'), id='seq3')
]
count = SeqIO.write(records, 'output.fasta', 'fasta')
print(f'Wrote {count} records')
Write to File Handle
with open('output.fasta', 'w') as handle:
    SeqIO.write(records, handle, 'fasta')
Write Modified Records
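from Bio import SeqIO
from Bio.Seq import Seq
from Bio.SeqRecord import SeqRecord

def uppercase_record(rec):
    # Build a new record with an uppercased sequence, keeping ID and description
    return SeqRecord(rec.seq.upper(), id=rec.id, description=rec.description)

records = SeqIO.parse('input.fasta', 'fasta')
# Generator expression: records are transformed one at a time as they are written
modified = (uppercase_record(rec) for rec in records)
SeqIO.write(modified, 'output.fasta', 'fasta')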
Append to Existing File
with open('output.fasta', 'a') as handle:
    SeqIO.write(new_records, handle, 'fasta')
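A self-contained sketch of the append pattern, using a temporary directory so that it leaves no files behind. The names `first_batch` and `new_records` are hypothetical. Appending like this is assumed to be safe only for formats whose records are simply concatenated, such as FASTA and FASTQ.

```python
import os
import tempfile

from Bio import SeqIO
from Bio.Seq import Seq
from Bio.SeqRecord import SeqRecord

# Hypothetical batches of records.
first_batch = [SeqRecord(Seq("ATGC"), id="seq1", description="")]
new_records = [SeqRecord(Seq("GCTA"), id="seq2", description="")]

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "output.fasta")

    # Passing a filename creates the file (or overwrites an existing one).
    SeqIO.write(first_batch, path, "fasta")

    # Opening the file in append mode adds records after the existing ones.
    with open(path, "a") as handle:
        SeqIO.write(new_records, handle, "fasta")

    # Read the file back to confirm both batches are present, in order.
    ids = [rec.id for rec in SeqIO.parse(path, "fasta")]

print(ids)
```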
Write FASTQ with Quality Scores
record = SeqRecord(Seq('ATGCGATCG'), id='read1')
record.letter_annotations['phred_quality'] = [30, 30, 28, 25, 30, 30, 28, 25, 30]
SeqIO.write(record, 'output.fastq', 'fastq')
Write GenBank Format
record = SeqRecord(Seq('ATGCGATCGATCG'), id='SEQ001', name='example')
record.annotations['molecule_type'] = 'DNA'
record.annotations['topology'] = 'linear'
record.annotations['organism'] = 'Example organism'
SeqIO.write(record, 'output.gb', 'genbank')
Common Errors
| Error | Cause | Solution |
|---|---|---|
| | Passed raw string/Seq | Wrap in SeqRecord object |
| | GenBank without annotations | Add record.annotations['molecule_type'] |
| | FASTQ without phred_quality | Add quality scores to letter_annotations |
| | PHYLIP with unequal lengths | Pad or trim sequences first |
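Two of the failure modes in the table can be reproduced and fixed in a few lines. This sketch assumes a recent Biopython release, where both cases raise `ValueError`; the exact message text may differ between versions, so it is captured rather than hard-coded.

```python
from io import StringIO

from Bio import SeqIO
from Bio.Seq import Seq
from Bio.SeqRecord import SeqRecord

# Hypothetical record with no quality scores and no molecule_type.
record = SeqRecord(Seq("ATGCGATCG"), id="read1", name="read1")

# Attempt each format and collect the error message, if any.
errors = {}
for fmt in ("fastq", "genbank"):
    try:
        SeqIO.write(record, StringIO(), fmt)
    except ValueError as exc:
        errors[fmt] = str(exc)

# Apply the fixes from the table, then write again.
record.letter_annotations["phred_quality"] = [30] * len(record.seq)
record.annotations["molecule_type"] = "DNA"
written = {fmt: SeqIO.write(record, StringIO(), fmt) for fmt in ("fastq", "genbank")}

print(errors)
print(written)
```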
Format-Specific Requirements
FASTQ
Must have quality scores:
record.letter_annotations['phred_quality'] = [30] * len(record.seq)
GenBank/EMBL
Must have molecule_type:
record.annotations['molecule_type'] = 'DNA' # or 'RNA', 'protein'
PHYLIP
All sequences must be the same length. IDs are truncated to 10 characters.
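Both PHYLIP constraints can be seen in a short sketch. The IDs and sequences are hypothetical, and the exception type for unequal lengths is assumed to be `ValueError`, as in recent Biopython releases.

```python
from io import StringIO

from Bio import SeqIO
from Bio.Seq import Seq
from Bio.SeqRecord import SeqRecord

# Hypothetical equal-length records; the first ID is longer than 10 characters.
records = [
    SeqRecord(Seq("ATGCATGC"), id="a_very_long_identifier"),
    SeqRecord(Seq("ATGCATGG"), id="short"),
]

handle = StringIO()
SeqIO.write(records, handle, "phylip")
phylip_text = handle.getvalue()
print(phylip_text)  # the long ID appears cut to its first 10 characters

# Adding a shorter sequence breaks the equal-length requirement.
unequal_error = None
try:
    SeqIO.write(records + [SeqRecord(Seq("ATG"), id="tooshort")], StringIO(), "phylip")
except ValueError as exc:
    unequal_error = str(exc)

print(unequal_error)
```

Because of the truncation, IDs that share their first 10 characters can collide, so it is worth checking ID uniqueness before writing PHYLIP.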
Related Skills
- read-sequences - Read sequences before modifying and writing
- format-conversion - Direct format conversion without intermediate processing
- filter-sequences - Filter sequences before writing subset
- sequence-manipulation/seq-objects - Create SeqRecord objects to write
- alignment-files - For SAM/BAM output, use samtools/pysam