AutoResearchClaw chemistry-rdkit
Computational chemistry with RDKit for molecular analysis, descriptors, fingerprints, and substructure search. Use when working with SMILES, drug discovery, or cheminformatics tasks.
install
source · Clone the upstream repo
git clone https://github.com/aiming-lab/AutoResearchClaw
Claude Code · Install into ~/.claude/skills/
T=$(mktemp -d) && git clone --depth=1 https://github.com/aiming-lab/AutoResearchClaw "$T" && mkdir -p ~/.claude/skills && cp -r "$T/researchclaw/skills/builtin/domain/chemistry-rdkit" ~/.claude/skills/aiming-lab-autoresearchclaw-chemistry-rdkit-7cae29 && rm -rf "$T"
manifest: researchclaw/skills/builtin/domain/chemistry-rdkit/SKILL.md
source content
RDKit Cheminformatics Best Practice
Molecular I/O
- Create molecules from SMILES: mol = Chem.MolFromSmiles('CCO')
- Always check for None: MolFromSmiles returns None on invalid input
- Convert to canonical SMILES: Chem.MolToSmiles(mol)
- Read SDF files: suppl = Chem.SDMolSupplier('file.sdf')
- Read SMILES files: suppl = Chem.SmilesMolSupplier('file.smi')
- Write molecules: writer = Chem.SDWriter('output.sdf')
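The parse-and-check pattern above can be sketched as follows. The helper name to_canonical and the sample inputs are illustrative assumptions, not part of the skill itself.

```python
from rdkit import Chem


def to_canonical(smiles):
    """Return the canonical SMILES, or None if RDKit cannot parse the input."""
    mol = Chem.MolFromSmiles(smiles)
    if mol is None:  # syntax error or sanitization failure
        return None
    return Chem.MolToSmiles(mol)


# Two spellings of ethanol, an aromatic ring that cannot be kekulized
# (pyrrole written without its [nH]), and a syntactically broken string.
inputs = ["OCC", "C(C)O", "c1cccn1", "CC("]
results = {smi: to_canonical(smi) for smi in inputs}

for smi, canonical in results.items():
    print(f"{smi!r:12} -> {canonical!r}")
```

Both ethanol spellings map to the same canonical string, which is what makes canonical SMILES usable as a deduplication key.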
Molecular Descriptors
- Molecular weight: Descriptors.MolWt(mol)
- LogP (lipophilicity): Descriptors.MolLogP(mol)
- TPSA (polar surface area): Descriptors.TPSA(mol)
- H-bond donors/acceptors: Descriptors.NumHDonors(mol), Descriptors.NumHAcceptors(mol)
- Rotatable bonds: Descriptors.NumRotatableBonds(mol)
- Lipinski Rule of 5: MW <= 500, LogP <= 5, HBD <= 5, HBA <= 10
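As a minimal sketch, the four Rule of 5 thresholds listed above can be checked with the same descriptor calls. The function name lipinski_violations and the two test molecules are assumptions chosen for illustration.

```python
from rdkit import Chem
from rdkit.Chem import Descriptors

# Thresholds exactly as listed above: a value above its limit is a violation.
LIMITS = {"MW": 500, "LogP": 5, "HBD": 5, "HBA": 10}


def lipinski_violations(mol):
    """Return the names of the Rule of 5 criteria that the molecule violates."""
    props = {
        "MW": Descriptors.MolWt(mol),
        "LogP": Descriptors.MolLogP(mol),
        "HBD": Descriptors.NumHDonors(mol),
        "HBA": Descriptors.NumHAcceptors(mol),
    }
    return [name for name, value in props.items() if value > LIMITS[name]]


aspirin = Chem.MolFromSmiles("CC(=O)Oc1ccccc1C(=O)O")
long_alkane = Chem.MolFromSmiles("C" * 40)  # heavy, very lipophilic chain

aspirin_violations = lipinski_violations(aspirin)
alkane_violations = lipinski_violations(long_alkane)
print(aspirin_violations, alkane_violations)
```

Returning the list of violated criteria, rather than a single boolean, leaves it to the caller to decide how many violations to tolerate.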
Fingerprints and Similarity
- Morgan (circular) fingerprints: AllChem.GetMorganFingerprintAsBitVect(mol, radius=2, nBits=2048)
- RDKit fingerprints: Chem.RDKFingerprint(mol)
- MACCS keys: MACCSkeys.GenMACCSKeys(mol)
- Tanimoto similarity: DataStructs.TanimotoSimilarity(fp1, fp2)
- Use radius=2 (ECFP4 equivalent) as default for most applications
- For virtual screening, a Tanimoto similarity above 0.7 is a common rule of thumb for structural similarity; the appropriate cutoff depends on the fingerprint type
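A small similarity screen along these lines might look like the sketch below. The helper morgan_fp, the three-compound library, and the use of 0.7 as a hard cutoff are illustrative assumptions. Recent RDKit releases emit a deprecation warning for GetMorganFingerprintAsBitVect and point to rdFingerprintGenerator instead; the call is kept here because it is the one listed above.

```python
from rdkit import Chem, DataStructs
from rdkit.Chem import AllChem


def morgan_fp(smiles, radius=2, n_bits=2048):
    """Morgan bit-vector fingerprint for a SMILES string (radius 2 ~ ECFP4)."""
    mol = Chem.MolFromSmiles(smiles)
    if mol is None:
        raise ValueError(f"could not parse SMILES: {smiles!r}")
    return AllChem.GetMorganFingerprintAsBitVect(mol, radius, nBits=n_bits)


query = morgan_fp("CC(=O)Oc1ccccc1C(=O)O")  # aspirin

# Hypothetical mini-library; the first entry is aspirin written differently.
library = {
    "aspirin (alternate SMILES)": "O=C(C)Oc1ccccc1C(=O)O",
    "salicylic acid": "OC(=O)c1ccccc1O",
    "ethanol": "CCO",
}

scores = {
    name: DataStructs.TanimotoSimilarity(query, morgan_fp(smi))
    for name, smi in library.items()
}
hits = [name for name, score in scores.items() if score > 0.7]

for name, score in sorted(scores.items(), key=lambda kv: -kv[1]):
    print(f"{name:28} {score:.2f}")
```

Because fingerprints are computed from the parsed molecule rather than the input string, two different SMILES spellings of the same compound score as identical.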
Substructure Search
- SMARTS patterns: pattern = Chem.MolFromSmarts('[OH]')
- Check match: mol.HasSubstructMatch(pattern)
- Get all matches: mol.GetSubstructMatches(pattern)
- Common SMARTS: [#6](=O)[OH] (carboxylic acid), [NH2] (primary amine)
- Filter compound libraries by functional group presence
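Filtering a library by functional group, using the two SMARTS patterns above, can be sketched as follows. The compound names and the dictionary layout are assumptions made for the example.

```python
from rdkit import Chem

# SMARTS patterns taken from the list above, compiled once and reused.
PATTERNS = {
    "carboxylic acid": Chem.MolFromSmarts("[#6](=O)[OH]"),
    "primary amine": Chem.MolFromSmarts("[NH2]"),
}

# Hypothetical mini-library of name -> SMILES.
library = {
    "acetic acid": "CC(=O)O",
    "ethylamine": "CCN",
    "glycine": "NCC(=O)O",
    "ethyl acetate": "CCOC(=O)C",
}

mols = {name: Chem.MolFromSmiles(smi) for name, smi in library.items()}

# Keep only compounds that parsed and contain the requested group.
acids = [
    name for name, mol in mols.items()
    if mol is not None and mol.HasSubstructMatch(PATTERNS["carboxylic acid"])
]
amines = [
    name for name, mol in mols.items()
    if mol is not None and mol.HasSubstructMatch(PATTERNS["primary amine"])
]

# GetSubstructMatches returns one tuple of atom indices per match.
acid_atoms = mols["acetic acid"].GetSubstructMatches(PATTERNS["carboxylic acid"])
print(acids, amines, acid_atoms)
```

Note that [NH2] only tests for a nitrogen carrying two hydrogens, so it also matches primary amides; a stricter pattern would be needed to exclude them.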
Property Calculation Patterns
- Batch processing: iterate over SDMolSupplier, skip None entries
- Use Chem.Descriptors.descList for all available descriptors
- For ADMET filtering, calculate Lipinski, Veber, and PAINS filters
- Generate 3D coordinates: AllChem.EmbedMolecule(mol, AllChem.ETKDG())
- Minimize energy: AllChem.MMFFOptimizeMolecule(mol)
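The batch-processing pattern, combined with descList, might be sketched as below. The temporary demo.sdf file and its three molecules are created only so the example is self-contained; in practice the supplier would point at an existing SDF.

```python
import os
import tempfile

from rdkit import Chem
from rdkit.Chem import Descriptors

rows = []
with tempfile.TemporaryDirectory() as tmp:
    # Write a throwaway SDF so the example does not depend on external files.
    path = os.path.join(tmp, "demo.sdf")
    writer = Chem.SDWriter(path)
    for smi in ["CCO", "c1ccccc1", "CC(=O)O"]:
        writer.write(Chem.MolFromSmiles(smi))
    writer.close()

    # Batch loop: the supplier yields None for records it cannot parse.
    for mol in Chem.SDMolSupplier(path):
        if mol is None:
            continue
        # descList is a list of (name, function) pairs.
        rows.append({name: fn(mol) for name, fn in Descriptors.descList})

print(len(rows), "molecules,", len(rows[0]), "descriptors each")
print("ethanol MolWt:", round(rows[0]["MolWt"], 2))
```

Each row is a plain dict keyed by descriptor name, so the result can be passed directly to a table library if one is available.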
Common Pitfalls
- Always sanitize molecules (default behavior) — disable only when needed
- Add hydrogens explicitly for 3D work: Chem.AddHs(mol)
- Handle stereochemistry: use Chem.AssignStereochemistry(mol)
- Large SDF files: use ForwardSDMolSupplier for memory efficiency
- Kekulization errors usually indicate invalid SMILES input, often an aromatic ring written without a required explicit hydrogen (e.g. c1cccn1 instead of c1ccc[nH]1)
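The hydrogen pitfall matters most when generating 3D coordinates. A minimal sketch of the AddHs, embed, and minimize sequence follows; the fixed random seed is an assumption added only to make the run reproducible.

```python
from rdkit import Chem
from rdkit.Chem import AllChem

mol = Chem.MolFromSmiles("CCO")
mol_h = Chem.AddHs(mol)  # returns a new molecule; the input is left unchanged

params = AllChem.ETKDG()
params.randomSeed = 42  # assumption: fixed seed for reproducibility only

# EmbedMolecule returns the new conformer id, or -1 if embedding failed.
conf_id = AllChem.EmbedMolecule(mol_h, params)
if conf_id == -1:
    raise RuntimeError("3D embedding failed")

# MMFFOptimizeMolecule returns 0 if converged, 1 if more iterations are
# needed, and -1 if MMFF parameters could not be set up for the molecule.
status = AllChem.MMFFOptimizeMolecule(mol_h)

print("heavy atoms:", mol.GetNumAtoms(), "with hydrogens:", mol_h.GetNumAtoms())
print("conformers:", mol_h.GetNumConformers(), "MMFF status:", status)
```

Checking both return codes is cheap and catches the cases where embedding or force-field setup fails without raising an exception.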