SciAgent-Skills fda-database
Query openFDA REST API for drug adverse event reports (FAERS), drug labeling, product information, recalls, and enforcement actions. Search by drug name, active ingredient, adverse event term (MedDRA), or NDC code. No API key needed for 1000 req/day; free key for 120,000 req/day. For clinical trial data use clinicaltrials-database-search; for drug structures use drugbank-database-access or chembl-database-bioactivity.
git clone https://github.com/jaechang-hits/SciAgent-Skills
T=$(mktemp -d) && git clone --depth=1 https://github.com/jaechang-hits/SciAgent-Skills "$T" && mkdir -p ~/.claude/skills && cp -r "$T/skills/structural-biology-drug-discovery/fda-database" ~/.claude/skills/jaechang-hits-sciagent-skills-fda-database && rm -rf "$T"
skills/structural-biology-drug-discovery/fda-database/SKILL.mdopenFDA Drug and Adverse Event Database
Overview
openFDA provides public access to FDA regulatory data through a simple REST API. Key datasets include the FDA Adverse Event Reporting System (FAERS) with 20M+ adverse event reports, drug product labeling (NDC, SPL), drug approvals (Drugs@FDA), medical device reports, and recall enforcement actions. The API supports full-text search and structured queries using Elasticsearch-style syntax.
When to Use
- Retrieving adverse event reports for a drug to assess safety signals and side effect profiles
- Querying FAERS for disproportionality analysis (comparing drug vs. drug adverse event profiles)
- Looking up official drug labeling (indications, contraindications, warnings, dosing) by drug name or NDC
- Searching for drug recalls and enforcement actions by drug name or company
- Identifying all marketed products containing a given active ingredient
- Building pharmacovigilance pipelines that monitor drug safety signals from public regulatory data
- For clinical trial efficacy data use
; for drug structures/targets useclinicaltrials-database-searchdrugbank-database-access
Prerequisites
- Python packages:
,requestspandas - Data requirements: drug names, active ingredients, MedDRA terms, NDC codes
- Environment: internet connection; no authentication required for basic use
- Rate limits: 1000 req/day without API key; 120,000 req/day with free API key from https://open.fda.gov/apis/authentication/
pip install requests pandas
Quick Start
import requests BASE = "https://api.fda.gov/drug" # Optional: add api_key parameter for higher rate limits # Find adverse events for aspirin r = requests.get( f"{BASE}/event.json", params={ "search": 'patient.drug.medicinalproduct:"aspirin"', "count": "patient.reaction.reactionmeddrapt.exact", "limit": 10 } ) r.raise_for_status() data = r.json() print("Top adverse reactions for aspirin:") for item in data["results"][:5]: print(f" {item['term']:40s} count={item['count']}")
Core API
Query 1: Adverse Event Report Search (FAERS)
Search the FDA Adverse Event Reporting System for drug-event associations.
import requests, pandas as pd BASE = "https://api.fda.gov/drug" def faers_search(drug_name, limit=100): """Search FAERS for adverse event reports mentioning a drug.""" r = requests.get(f"{BASE}/event.json", params={"search": f'patient.drug.medicinalproduct:"{drug_name}"', "limit": limit}) r.raise_for_status() return r.json() data = faers_search("warfarin", limit=5) total = data["meta"]["results"]["total"] print(f"Total FAERS reports for warfarin: {total:,}") # Show first report summary report = data["results"][0] print(f"\nReport {report['safetyreportid']}:") print(f" Date : {report.get('receivedate', 'n/a')}") print(f" Serious : {report.get('serious', 'n/a')}") drugs = [d.get("medicinalproduct", "n/a") for d in report.get("patient", {}).get("drug", [])] print(f" Drugs : {drugs[:5]}") reactions = [r.get("reactionmeddrapt", "n/a") for r in report.get("patient", {}).get("reaction", [])] print(f" Reactions: {reactions[:5]}")
Query 2: Count Top Adverse Events for a Drug
Use the
count parameter to aggregate adverse event terms.
import requests, pandas as pd BASE = "https://api.fda.gov/drug" def top_adverse_events(drug_name, limit=20): """Get the most frequently reported adverse events for a drug.""" r = requests.get(f"{BASE}/event.json", params={ "search": f'patient.drug.medicinalproduct:"{drug_name}"', "count": "patient.reaction.reactionmeddrapt.exact", "limit": limit }) r.raise_for_status() results = r.json()["results"] return pd.DataFrame(results).rename(columns={"term": "reaction", "count": "reports"}) df_atorvastatin = top_adverse_events("atorvastatin", limit=15) print("Top adverse events for atorvastatin:") print(df_atorvastatin.head(10).to_string(index=False)) df_atorvastatin.to_csv("atorvastatin_adverse_events.csv", index=False)
# Compare two drugs: adverse event profile overlap df_drug1 = top_adverse_events("simvastatin", limit=20) df_drug2 = top_adverse_events("atorvastatin", limit=20) common = set(df_drug1["reaction"]) & set(df_drug2["reaction"]) print(f"\nCommon adverse events (simvastatin ∩ atorvastatin): {len(common)}") print("Shared reactions:", list(common)[:10])
Query 3: Drug Labeling Search
Retrieve official drug labels (indications, warnings, dosing, contraindications).
import requests BASE = "https://api.fda.gov/drug" def get_label(drug_name): """Retrieve FDA drug label by brand or generic name.""" r = requests.get(f"{BASE}/label.json", params={"search": f'openfda.brand_name:"{drug_name}"', "limit": 1}) if r.status_code == 404: r = requests.get(f"{BASE}/label.json", params={"search": f'openfda.generic_name:"{drug_name}"', "limit": 1}) r.raise_for_status() results = r.json()["results"] return results[0] if results else None label = get_label("Lipitor") if label: print(f"Brand name : {label.get('openfda', {}).get('brand_name', ['n/a'])[0]}") print(f"Generic name: {label.get('openfda', {}).get('generic_name', ['n/a'])[0]}") print(f"Manufacturer: {label.get('openfda', {}).get('manufacturer_name', ['n/a'])[0]}") indications = label.get("indications_and_usage", ["n/a"])[0] print(f"\nIndications (first 300 chars):\n{indications[:300]}...")
Query 4: Drug Product Lookup by NDC
Retrieve marketed product information by National Drug Code.
import requests, pandas as pd BASE = "https://api.fda.gov/drug" def ndc_search(ndc_or_name, limit=10): """Search NDC directory for drug product information.""" # Search by product name or NDC r = requests.get(f"{BASE}/ndc.json", params={"search": f'generic_name:"{ndc_or_name}"', "limit": limit}) r.raise_for_status() return r.json() data = ndc_search("metformin", limit=10) total = data["meta"]["results"]["total"] print(f"Metformin products: {total}") rows = [] for prod in data["results"]: rows.append({ "product_ndc": prod.get("product_ndc"), "brand_name": prod.get("brand_name"), "generic_name": prod.get("generic_name"), "dosage_form": prod.get("dosage_form"), "route": ", ".join(prod.get("route", [])), "labeler": prod.get("labeler_name"), }) df = pd.DataFrame(rows) print(df.to_string(index=False))
Query 5: Drug Recall Search
Search FDA enforcement actions and drug recalls by drug name or company.
import requests, pandas as pd BASE = "https://api.fda.gov/drug" def drug_recalls(drug_name, limit=20): """Find FDA drug recalls for a given drug name.""" r = requests.get(f"{BASE}/enforcement.json", params={ "search": f'product_description:"{drug_name}"', "limit": limit }) if r.status_code == 404: return pd.DataFrame() r.raise_for_status() results = r.json()["results"] return pd.DataFrame([{ "recalling_firm": rec.get("recalling_firm"), "product": rec.get("product_description", "")[:80], "reason": rec.get("reason_for_recall", "")[:100], "classification": rec.get("classification"), "recall_date": rec.get("recall_initiation_date"), "status": rec.get("status"), } for rec in results]) recalls = drug_recalls("metformin", limit=5) print(f"Metformin recalls: {len(recalls)}") if not recalls.empty: print(recalls[["recalling_firm", "classification", "recall_date", "status"]].to_string(index=False))
Query 6: Active Ingredient Search Across Products
Find all drug products containing a specific active ingredient.
import requests, pandas as pd BASE = "https://api.fda.gov/drug" def products_by_ingredient(ingredient, limit=50): """Find all FDA-listed products with a given active ingredient.""" r = requests.get(f"{BASE}/ndc.json", params={ "search": f'active_ingredients.name:"{ingredient}"', "limit": limit }) r.raise_for_status() data = r.json() print(f"Total products with {ingredient}: {data['meta']['results']['total']}") rows = [] for prod in data["results"]: for ai in prod.get("active_ingredients", []): if ingredient.lower() in ai.get("name", "").lower(): rows.append({ "brand": prod.get("brand_name"), "generic": prod.get("generic_name"), "strength": ai.get("strength"), "dosage_form": prod.get("dosage_form"), "route": ", ".join(prod.get("route", [])), }) return pd.DataFrame(rows) df = products_by_ingredient("metformin hydrochloride") print(df.drop_duplicates(subset=["generic", "strength", "dosage_form"]).head(10).to_string(index=False))
Key Concepts
openFDA API Endpoints
| Endpoint | Dataset | Key Use |
|---|---|---|
| FAERS (adverse events) | Pharmacovigilance, safety signals |
| Structured Product Labeling | Indications, warnings, dosing |
| NDC Directory | Marketed products, strengths |
| Recalls & Enforcement | Drug recalls, market withdrawals |
| MAUDE (device events) | Medical device adverse events |
Query Syntax
openFDA uses Elasticsearch-style queries. Use
field:"exact phrase" for exact matching, field:term for fuzzy matching, and +field1:"A" +field2:"B" for AND logic. Use count parameter to aggregate (equivalent to GROUP BY). Use limit (1–1000) for pagination with skip for offset.
Common Workflows
Workflow 1: Drug Safety Signal Analysis
Goal: Compare adverse event frequency for multiple drugs in the same therapeutic class to identify differentiated safety profiles.
import requests, pandas as pd, time BASE = "https://api.fda.gov/drug" drugs = ["atorvastatin", "simvastatin", "rosuvastatin"] def count_reactions(drug, limit=20): r = requests.get(f"{BASE}/event.json", params={"search": f'patient.drug.medicinalproduct:"{drug}"', "count": "patient.reaction.reactionmeddrapt.exact", "limit": limit}) if r.status_code != 200: return pd.Series(dtype=float, name=drug) df = pd.DataFrame(r.json()["results"]) df.columns = ["reaction", drug] return df.set_index("reaction")[drug] series_list = [] for drug in drugs: s = count_reactions(drug, limit=20) series_list.append(s) time.sleep(0.5) comparison = pd.concat(series_list, axis=1).fillna(0) comparison = comparison.sort_values(drugs[0], ascending=False) print("Adverse event count comparison (statins):") print(comparison.head(10).to_string()) comparison.to_csv("statin_safety_comparison.csv")
Workflow 2: Drug Label Information Extractor
Goal: Extract indications, contraindications, and warnings for multiple drugs and save to CSV.
import requests, pandas as pd, time, re BASE = "https://api.fda.gov/drug" def get_label_sections(drug_name): r = requests.get(f"{BASE}/label.json", params={"search": f'openfda.generic_name:"{drug_name}"', "limit": 1}) if r.status_code != 200 or not r.json()["results"]: return None label = r.json()["results"][0] def clean(field): text = " ".join(label.get(field, [""])) return re.sub(r"\s+", " ", text).strip()[:500] return { "drug": drug_name, "indications": clean("indications_and_usage"), "contraindications": clean("contraindications"), "warnings": clean("warnings_and_cautions") or clean("warnings"), } drugs = ["metformin", "atorvastatin", "lisinopril", "omeprazole"] rows = [] for drug in drugs: info = get_label_sections(drug) if info: rows.append(info) time.sleep(0.4) df = pd.DataFrame(rows) df.to_csv("drug_labels.csv", index=False) print(df[["drug", "indications"]].to_string(index=False))
Key Parameters
| Parameter | Module | Default | Range / Options | Effect |
|---|---|---|---|---|
| All endpoints | — | Elasticsearch syntax | Filter query |
| All endpoints | — | field name + | Aggregate/count by field value |
| All endpoints | | – | Results per request |
| All endpoints | | integer | Offset for pagination |
| All endpoints | — | API key string | Increase rate limit to 120K/day |
suffix | count field | — | appended to field name | Exact string matching vs tokenized |
Best Practices
-
Get a free API key: Registration at https://open.fda.gov/apis/authentication/ is instant and raises your limit from 1,000 to 120,000 requests/day — essential for production use.
-
Use
for drug name searches:.exact
(notpatient.drug.medicinalproduct.exact
) gives exact phrase matching, preventing partial matches that inflate counts..medicinalproduct -
Normalize drug names: FAERS reporters use many spellings (e.g., "Lipitor", "atorvastatin calcium", "ATORVASTATIN"). Search multiple name variants or use generic ingredient field for consistent coverage.
-
Interpret FAERS counts carefully: Report counts do not equal incidence rates. FAERS is voluntary and subject to reporting bias; higher counts may reflect market size or media attention, not higher risk.
-
Paginate large result sets: Maximum
is 1000; uselimit
to paginate through large result sets (skip
intotal
).meta.results
Common Recipes
Recipe: Total FAERS Reports Count for a Drug
When to use: Quick check of total adverse event report volume for a drug.
import requests drug = "ibuprofen" r = requests.get("https://api.fda.gov/drug/event.json", params={"search": f'patient.drug.medicinalproduct.exact:"{drug}"', "limit": 1}) total = r.json()["meta"]["results"]["total"] print(f"Total FAERS reports for {drug}: {total:,}")
Recipe: Find Serious Adverse Events Only
When to use: Filter FAERS for reports classified as serious (death, hospitalization, disability).
import requests, pandas as pd r = requests.get("https://api.fda.gov/drug/event.json", params={ "search": 'patient.drug.medicinalproduct:"warfarin" AND serious:1', "count": "patient.reaction.reactionmeddrapt.exact", "limit": 10 }) df = pd.DataFrame(r.json()["results"]) df.columns = ["reaction", "serious_reports"] print(df.to_string(index=False))
Recipe: Check Drug Market Approval Status
When to use: Verify whether a drug has FDA NDA/ANDA approval and find the approval year.
import requests r = requests.get("https://api.fda.gov/drug/label.json", params={"search": 'openfda.generic_name:"metformin"', "limit": 1}) label = r.json()["results"][0] openfda = label.get("openfda", {}) print(f"Application numbers: {openfda.get('application_number', ['n/a'])}") print(f"Product type: {openfda.get('product_type', ['n/a'])}") print(f"NDA sponsor: {openfda.get('manufacturer_name', ['n/a'])}")
Troubleshooting
| Problem | Cause | Solution |
|---|---|---|
with | No results match query | Check drug name spelling; try alternative name formats |
| Rate limit exceeded | Register for API key; add between requests |
| Count results don't match expectations | Drug name tokenization | Use suffix: not |
| Label search returns wrong drug | Ambiguous name | Add to filter |
| Missing fields in FAERS report | Incomplete voluntary report | Check if field exists with |
error | Pagination limit | openFDA caps pagination at 26,000 records; use endpoint for aggregates beyond this |
Related Skills
— Clinical trial data for drugs identified via openFDAclinicaltrials-database-search
— Drug structures, targets, and interactions to contextualize FDA datadrugbank-database-access
— Preclinical bioactivity data for drugs in the FAERS databasechembl-database-bioactivity
— Protein interactions for drug targets found via adverse event analysisstring-database-ppi
References
- openFDA API documentation — Full API reference, endpoints, and query syntax
- openFDA API key registration — Free registration for increased rate limits
- FAERS overview — Understanding FAERS data and limitations
- openFDA GitHub — Source code and data download references