Awesome-omni-skills citation-management
Citation Management workflow skill. Use this skill when the user needs to manage citations systematically throughout the research and writing process and the operator should preserve the upstream workflow, copied support files, and provenance before merging or handing off.
git clone https://github.com/diegosouzapw/awesome-omni-skills
T=$(mktemp -d) && git clone --depth=1 https://github.com/diegosouzapw/awesome-omni-skills "$T" && mkdir -p ~/.claude/skills && cp -r "$T/skills/citation-management" ~/.claude/skills/diegosouzapw-awesome-omni-skills-citation-management && rm -rf "$T"
skills/citation-management/SKILL.md
Citation Management
Overview
This public intake copy packages plugins/antigravity-awesome-skills-claude/skills/citation-management from https://github.com/sickn33/antigravity-awesome-skills into the native Omni Skills editorial shape without hiding its origin.
Use it when the operator needs the upstream workflow, support files, and repository context to stay intact while the public validator and private enhancer continue their normal downstream flow.
This intake keeps the copied upstream files intact and uses metadata.json plus ORIGIN.md as the provenance anchor for review.
Citation Management
Imported source sections that did not map cleanly to the public headings are still preserved below or in the support files. Notable imported sections: Visual Enhancement with Scientific Schematics, Search Strategies, Tools and Scripts, Common Pitfalls to Avoid, Integration with Other Skills, Dependencies.
When to Use This Skill
Use this section as the trigger filter. It should make the activation boundary explicit before the operator loads files, runs commands, or opens a pull request.
- Searching for specific papers on Google Scholar or PubMed
- Converting DOIs, PMIDs, or arXiv IDs to properly formatted BibTeX
- Extracting complete metadata for citations (authors, title, journal, year, etc.)
- Validating existing citations for accuracy
- Cleaning and formatting BibTeX files
- Finding highly cited papers in a specific field
Operating Table
| Situation | Start here | Why it matters |
|---|---|---|
| First-time use | | Confirms repository, branch, commit, and imported path before touching the copied workflow |
| Provenance review | | Gives reviewers a plain-language audit trail for the imported source |
| Workflow execution | | Starts with the smallest copied file that materially changes execution |
| Supporting context | | Adds the next most relevant copied source file without loading the entire package |
| Handoff decision | | Helps the operator switch to a stronger native skill when the task drifts |
Workflow
This workflow is intentionally editorial and operational at the same time. It keeps the imported source useful to the operator while still satisfying the public intake standards that feed the downstream enhancer flow.
- Use quotation marks for exact phrases: "deep learning"
- Search by author: author:LeCun
- Search in title: intitle:"neural networks"
- Exclude terms: machine learning -survey
- Find highly cited papers using sort options
- Filter by date ranges to get recent work
- Use specific, targeted search terms
Imported Workflow Notes
Imported: Core Workflow
Citation management follows a systematic process:
Phase 1: Paper Discovery and Search
Goal: Find relevant papers using academic search engines.
Google Scholar Search
Google Scholar provides broad coverage across disciplines, including journal articles, conference papers, preprints, and theses.
Basic Search:
    # Search for papers on a topic
    python scripts/search_google_scholar.py "CRISPR gene editing" \
      --limit 50 \
      --output results.json

    # Search with year filter
    python scripts/search_google_scholar.py "machine learning protein folding" \
      --year-start 2020 \
      --year-end 2024 \
      --limit 100 \
      --output ml_proteins.json

Advanced Search Strategies (see references/google_scholar_search.md):
- Use quotation marks for exact phrases: "deep learning"
- Search by author: author:LeCun
- Search in title: intitle:"neural networks"
- Exclude terms: machine learning -survey
- Find highly cited papers using sort options
- Filter by date ranges to get recent work
Best Practices:
- Use specific, targeted search terms
- Include key technical terms and acronyms
- Filter by recent years for fast-moving fields
- Check "Cited by" to find seminal papers
- Export top results for further analysis
PubMed Search
PubMed specializes in biomedical and life sciences literature (35+ million citations).
Basic Search:
    # Search PubMed
    python scripts/search_pubmed.py "Alzheimer's disease treatment" \
      --limit 100 \
      --output alzheimers.json

    # Search with MeSH terms and filters
    python scripts/search_pubmed.py \
      --query '"Alzheimer Disease"[MeSH] AND "Drug Therapy"[MeSH]' \
      --date-start 2020 \
      --date-end 2024 \
      --publication-types "Clinical Trial,Review" \
      --output alzheimers_trials.json

Advanced PubMed Queries (see references/pubmed_search.md):
- Use MeSH terms: "Diabetes Mellitus"[MeSH]
- Field tags: "cancer"[Title], "Smith J"[Author]
- Boolean operators: AND, OR, NOT
- Date filters: 2020:2024[Publication Date]
- Publication types: "Review"[Publication Type]
- Combine with E-utilities API for automation
Best Practices:
- Use MeSH Browser to find correct controlled vocabulary
- Construct complex queries in PubMed Advanced Search Builder first
- Include multiple synonyms with OR
- Retrieve PMIDs for easy metadata extraction
- Export to JSON or directly to BibTeX
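The query syntax above (MeSH tags, synonyms joined with OR, publication types, date ranges) can be assembled programmatically before it is pasted into PubMed or passed to a search script. The following is a minimal sketch only; `build_pubmed_query` and its parameters are hypothetical names for illustration and are not part of the bundled scripts.

```python
def build_pubmed_query(mesh_terms, synonyms=(), pub_types=(), years=None):
    """Combine MeSH terms, OR-ed synonyms, publication types, and a year range."""
    clauses = [f'"{term}"[MeSH]' for term in mesh_terms]
    if synonyms:
        # Synonyms are alternatives for one concept, so they are grouped with OR.
        joined = " OR ".join(f'"{s}"[Title/Abstract]' for s in synonyms)
        clauses.append(f"({joined})")
    if pub_types:
        joined = " OR ".join(f'"{p}"[Publication Type]' for p in pub_types)
        clauses.append(f"({joined})")
    if years:
        start, end = years
        clauses.append(f"{start}:{end}[Publication Date]")
    # Independent concepts are combined with AND.
    return " AND ".join(clauses)


query = build_pubmed_query(
    mesh_terms=["Alzheimer Disease", "Drug Therapy"],
    pub_types=["Review"],
    years=(2020, 2024),
)
print(query)
```

Building the string in one place makes it easy to save the exact query alongside the search date, which helps when the search has to be reproduced later.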
Phase 2: Metadata Extraction
Goal: Convert paper identifiers (DOI, PMID, arXiv ID) to complete, accurate metadata.
Quick DOI to BibTeX Conversion
For single DOIs, use the quick conversion tool:
    # Convert single DOI
    python scripts/doi_to_bibtex.py 10.1038/s41586-021-03819-2

    # Convert multiple DOIs from a file
    python scripts/doi_to_bibtex.py --input dois.txt --output references.bib

    # Different output formats
    python scripts/doi_to_bibtex.py 10.1038/nature12345 --format json
Comprehensive Metadata Extraction
For DOIs, PMIDs, arXiv IDs, or URLs:
    # Extract from DOI
    python scripts/extract_metadata.py --doi 10.1038/s41586-021-03819-2

    # Extract from PMID
    python scripts/extract_metadata.py --pmid 34265844

    # Extract from arXiv ID
    python scripts/extract_metadata.py --arxiv 2103.14030

    # Extract from URL
    python scripts/extract_metadata.py --url "https://www.nature.com/articles/s41586-021-03819-2"

    # Batch extraction from file (mixed identifiers)
    python scripts/extract_metadata.py --input identifiers.txt --output citations.bib

Metadata Sources (see references/metadata_extraction.md):
- CrossRef API: Primary source for DOIs
  - Comprehensive metadata for journal articles
  - Publisher-provided information
  - Includes authors, title, journal, volume, pages, dates
  - Free, no API key required
- PubMed E-utilities: Biomedical literature
  - Official NCBI metadata
  - Includes MeSH terms, abstracts
  - PMID and PMCID identifiers
  - Free, API key recommended for high volume
- arXiv API: Preprints in physics, math, CS, q-bio
  - Complete metadata for preprints
  - Version tracking
  - Author affiliations
  - Free, open access
- DataCite API: Research datasets, software, other resources
  - Metadata for non-traditional scholarly outputs
  - DOIs for datasets and code
  - Free access
What Gets Extracted:
- Required fields: author, title, year
- Journal articles: journal, volume, number, pages, DOI
- Books: publisher, ISBN, edition
- Conference papers: booktitle, conference location, pages
- Preprints: repository (arXiv, bioRxiv), preprint ID
- Additional: abstract, keywords, URL
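Batch extraction from a file of mixed identifiers implies that each line has to be recognized as a DOI, PMID, arXiv ID, or URL before the right API is queried. The sketch below shows one way that routing could work. It is an assumption for illustration, not the logic of extract_metadata.py; the function name and the regular expressions are hypothetical, and old-style arXiv IDs (such as hep-th/9901001) are not handled.

```python
import re

# Illustrative patterns only; real identifiers have more edge cases.
DOI_RE = re.compile(r"^10\.\d{4,9}/\S+$")
ARXIV_RE = re.compile(r"^\d{4}\.\d{4,5}(v\d+)?$")
PMID_RE = re.compile(r"^\d{1,8}$")


def classify_identifier(raw):
    """Return (kind, cleaned_text), where kind is doi, arxiv, pmid, url, or unknown."""
    text = raw.strip()
    if DOI_RE.match(text):
        return ("doi", text)
    if ARXIV_RE.match(text):
        return ("arxiv", text)
    if PMID_RE.match(text):
        return ("pmid", text)
    if text.lower().startswith(("http://", "https://")):
        return ("url", text)
    return ("unknown", text)


for line in ["10.1038/s41586-021-03819-2", "34265844", "2103.14030"]:
    print(classify_identifier(line))
```

Lines classified as unknown are best reported back to the user rather than silently dropped, so that a typo in an identifier does not become a missing reference.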
Phase 3: BibTeX Formatting
Goal: Generate clean, properly formatted BibTeX entries.
Understanding BibTeX Entry Types
See references/bibtex_formatting.md for complete guide.
Common Entry Types:
- @article: Journal articles (most common)
- @book: Books
- @inproceedings: Conference papers
- @incollection: Book chapters
- @phdthesis: Dissertations
- @misc: Preprints, software, datasets
Required Fields by Type:
    @article{citationkey,
      author  = {Last1, First1 and Last2, First2},
      title   = {Article Title},
      journal = {Journal Name},
      year    = {2024},
      volume  = {10},
      number  = {3},
      pages   = {123--145},
      doi     = {10.1234/example}
    }

    @inproceedings{citationkey,
      author    = {Last, First},
      title     = {Paper Title},
      booktitle = {Conference Name},
      year      = {2024},
      pages     = {1--10}
    }

    @book{citationkey,
      author    = {Last, First},
      title     = {Book Title},
      publisher = {Publisher Name},
      year      = {2024}
    }
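Entries in the shape shown above can be produced from a plain dictionary of fields. This is a minimal sketch, assuming the metadata has already been extracted; `to_bibtex` is a hypothetical helper, and it does not escape special characters or protect capitalization.

```python
def to_bibtex(entry_type, key, fields):
    """Render one BibTeX entry from a dict of field name -> value."""
    lines = [f"@{entry_type}{{{key},"]
    for name, value in fields.items():
        lines.append(f"  {name} = {{{value}}},")
    # The last field does not need a trailing comma.
    if len(lines) > 1:
        lines[-1] = lines[-1].rstrip(",")
    lines.append("}")
    return "\n".join(lines)


print(to_bibtex(
    "article",
    "Last2024example",  # placeholder key, not a real reference
    {"author": "Last, First", "title": "Article Title", "year": "2024"},
))
```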
Formatting and Cleaning
Use the formatter to standardize BibTeX files:
    # Format and clean BibTeX file
    python scripts/format_bibtex.py references.bib \
      --output formatted_references.bib

    # Sort entries by citation key
    python scripts/format_bibtex.py references.bib \
      --sort key \
      --output sorted_references.bib

    # Sort by year (newest first)
    python scripts/format_bibtex.py references.bib \
      --sort year \
      --descending \
      --output sorted_references.bib

    # Remove duplicates
    python scripts/format_bibtex.py references.bib \
      --deduplicate \
      --output clean_references.bib

    # Validate and report issues
    python scripts/format_bibtex.py references.bib \
      --validate \
      --report validation_report.txt
Formatting Operations:
- Standardize field order
- Consistent indentation and spacing
- Proper capitalization in titles (protected with {})
- Standardized author name format
- Consistent citation key format
- Remove unnecessary fields
- Fix common errors (missing commas, braces)
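One of the simplest formatting operations to illustrate is page-range normalization, where a single hyphen or a Unicode dash is rewritten as the BibTeX double hyphen. The sketch below is an illustration of that single operation under the assumption that the pages field holds at most simple ranges; `normalize_pages` is a hypothetical name, not a function from format_bibtex.py.

```python
import re


def normalize_pages(pages):
    """Rewrite '123-145' or '123–145' as '123--145'; single pages are unchanged."""
    # Matches runs of ASCII hyphens, en dashes, or em dashes with optional spaces.
    return re.sub(r"\s*[-\u2013\u2014]+\s*", "--", pages.strip())


for raw in ["123-145", "123 \u2013 145", "123--145", "e1234"]:
    print(repr(raw), "->", repr(normalize_pages(raw)))
```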
Phase 4: Citation Validation
Goal: Verify all citations are accurate and complete.
Comprehensive Validation
    # Validate BibTeX file
    python scripts/validate_citations.py references.bib

    # Validate and fix common issues
    python scripts/validate_citations.py references.bib \
      --auto-fix \
      --output validated_references.bib

    # Generate detailed validation report
    python scripts/validate_citations.py references.bib \
      --report validation_report.json \
      --verbose

Validation Checks (see references/citation_validation.md):
- DOI Verification:
  - DOI resolves correctly via doi.org
  - Metadata matches between BibTeX and CrossRef
  - No broken or invalid DOIs
- Required Fields:
  - All required fields present for entry type
  - No empty or missing critical information
  - Author names properly formatted
- Data Consistency:
  - Year is valid (4 digits, reasonable range)
  - Volume/number are numeric
  - Pages formatted correctly (e.g., 123--145)
  - URLs are accessible
- Duplicate Detection:
  - Same DOI used multiple times
  - Similar titles (possible duplicates)
  - Same author/year/title combinations
- Format Compliance:
  - Valid BibTeX syntax
  - Proper bracing and quoting
  - Citation keys are unique
  - Special characters handled correctly
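The offline checks in this list (required fields, year validity, page format) can be sketched without any network access. The code below is an illustration under stated assumptions: `check_entry` and the problem labels are hypothetical, the lower year bound of 1500 is an arbitrary choice for "reasonable range", and the required-field sets follow the standard BibTeX entry definitions rather than anything confirmed about validate_citations.py.

```python
import re
from datetime import date

# Standard BibTeX required fields for three common entry types.
REQUIRED = {
    "article": ("author", "title", "journal", "year"),
    "inproceedings": ("author", "title", "booktitle", "year"),
    "book": ("author", "title", "publisher", "year"),
}


def check_entry(entry_type, fields):
    """Return a list of problem labels for one entry (empty list means no issue found)."""
    problems = []
    for name in REQUIRED.get(entry_type, ("author", "title", "year")):
        if not fields.get(name, "").strip():
            problems.append(f"missing_field:{name}")
    year = fields.get("year", "").strip()
    if year:
        plausible = re.fullmatch(r"\d{4}", year) and 1500 <= int(year) <= date.today().year + 1
        if not plausible:
            problems.append("invalid_year")
    pages = fields.get("pages", "").strip()
    if pages and not re.fullmatch(r"\w+(--\w+)?", pages):
        problems.append("invalid_pages")
    return problems


print(check_entry("article", {"author": "Last, First", "title": "Article Title", "year": "2024"}))
```

DOI resolution and URL accessibility are deliberately left out here because they require network requests and rate limiting.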
Validation Output:
    {
      "total_entries": 150,
      "valid_entries": 145,
      "errors": [
        {
          "citation_key": "Smith2023",
          "error_type": "missing_field",
          "field": "journal",
          "severity": "high"
        },
        {
          "citation_key": "Jones2022",
          "error_type": "invalid_doi",
          "doi": "10.1234/broken",
          "severity": "high"
        }
      ],
      "warnings": [
        {
          "citation_key": "Brown2021",
          "warning_type": "possible_duplicate",
          "duplicate_of": "Brown2021a",
          "severity": "medium"
        }
      ]
    }
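A report in this shape is easier to act on once it is summarized. The sketch below assumes the field names from the sample output (total_entries, valid_entries, errors, citation_key, error_type, severity); `summarize_report` is a hypothetical helper and the embedded JSON is an abbreviated copy of the sample.

```python
import json
from collections import Counter


def summarize_report(report):
    """Count errors by type and list the keys flagged with high severity."""
    errors = report.get("errors", [])
    return {
        "invalid_entries": report["total_entries"] - report["valid_entries"],
        "errors_by_type": dict(Counter(e["error_type"] for e in errors)),
        "high_severity_keys": sorted(
            e["citation_key"] for e in errors if e.get("severity") == "high"
        ),
    }


# Abbreviated copy of the sample validation output.
sample = json.loads("""
{
  "total_entries": 150,
  "valid_entries": 145,
  "errors": [
    {"citation_key": "Smith2023", "error_type": "missing_field", "severity": "high"},
    {"citation_key": "Jones2022", "error_type": "invalid_doi", "severity": "high"}
  ],
  "warnings": []
}
""")
print(summarize_report(sample))
```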
Phase 5: Integration with Writing Workflow
Building References for Manuscripts
Complete workflow for creating a bibliography:
    # 1. Search for papers on your topic
    python scripts/search_pubmed.py \
      '"CRISPR-Cas Systems"[MeSH] AND "Gene Editing"[MeSH]' \
      --date-start 2020 \
      --limit 200 \
      --output crispr_papers.json

    # 2. Extract DOIs from search results and convert to BibTeX
    python scripts/extract_metadata.py \
      --input crispr_papers.json \
      --output crispr_refs.bib

    # 3. Add specific papers by DOI
    python scripts/doi_to_bibtex.py 10.1038/nature12345 >> crispr_refs.bib
    python scripts/doi_to_bibtex.py 10.1126/science.abcd1234 >> crispr_refs.bib

    # 4. Format and clean the BibTeX file
    python scripts/format_bibtex.py crispr_refs.bib \
      --deduplicate \
      --sort year \
      --descending \
      --output references.bib

    # 5. Validate all citations
    python scripts/validate_citations.py references.bib \
      --auto-fix \
      --report validation.json \
      --output final_references.bib

    # 6. Review validation report and fix any remaining issues
    cat validation.json

    # 7. Use in your LaTeX document
    # \bibliography{final_references}
Integration with Literature Review Skill
This skill complements the literature-review skill:

    Literature Review Skill → Systematic search and synthesis
    Citation Management Skill → Technical citation handling

Combined Workflow:
- Use literature-review for comprehensive multi-database search
- Use citation-management to extract and validate all citations
- Use literature-review to synthesize findings thematically
- Use citation-management to verify final bibliography accuracy

    # After completing literature review
    # Verify all citations in the review document
    python scripts/validate_citations.py my_review_references.bib --report review_validation.json

    # Format for specific citation style if needed
    python scripts/format_bibtex.py my_review_references.bib \
      --style nature \
      --output formatted_refs.bib
Imported: Example Workflows
Example 1: Building a Bibliography for a Paper
    # Step 1: Find key papers on your topic
    python scripts/search_google_scholar.py "transformer neural networks" \
      --year-start 2017 \
      --limit 50 \
      --output transformers_gs.json

    python scripts/search_pubmed.py "deep learning medical imaging" \
      --date-start 2020 \
      --limit 50 \
      --output medical_dl_pm.json

    # Step 2: Extract metadata from search results
    python scripts/extract_metadata.py \
      --input transformers_gs.json \
      --output transformers.bib

    python scripts/extract_metadata.py \
      --input medical_dl_pm.json \
      --output medical.bib

    # Step 3: Add specific papers you already know
    python scripts/doi_to_bibtex.py 10.1038/s41586-021-03819-2 >> specific.bib
    python scripts/doi_to_bibtex.py 10.1126/science.aam9317 >> specific.bib

    # Step 4: Combine all BibTeX files
    cat transformers.bib medical.bib specific.bib > combined.bib

    # Step 5: Format and deduplicate
    python scripts/format_bibtex.py combined.bib \
      --deduplicate \
      --sort year \
      --descending \
      --output formatted.bib

    # Step 6: Validate
    python scripts/validate_citations.py formatted.bib \
      --auto-fix \
      --report validation.json \
      --output final_references.bib

    # Step 7: Review any issues
    cat validation.json | grep -A 3 '"errors"'

    # Step 8: Use in LaTeX
    # \bibliography{final_references}
Example 2: Converting a List of DOIs
    # You have a text file with DOIs (one per line)
    # dois.txt contains:
    # 10.1038/s41586-021-03819-2
    # 10.1126/science.aam9317
    # 10.1016/j.cell.2023.01.001

    # Convert all to BibTeX
    python scripts/doi_to_bibtex.py --input dois.txt --output references.bib

    # Validate the result
    python scripts/validate_citations.py references.bib --verbose
Example 3: Cleaning an Existing BibTeX File
    # You have a messy BibTeX file from various sources
    # Clean it up systematically

    # Step 1: Format and standardize
    python scripts/format_bibtex.py messy_references.bib \
      --output step1_formatted.bib

    # Step 2: Remove duplicates
    python scripts/format_bibtex.py step1_formatted.bib \
      --deduplicate \
      --output step2_deduplicated.bib

    # Step 3: Validate and auto-fix
    python scripts/validate_citations.py step2_deduplicated.bib \
      --auto-fix \
      --output step3_validated.bib

    # Step 4: Sort by year
    python scripts/format_bibtex.py step3_validated.bib \
      --sort year \
      --descending \
      --output clean_references.bib

    # Step 5: Final validation report
    python scripts/validate_citations.py clean_references.bib \
      --report final_validation.json \
      --verbose

    # Review report
    cat final_validation.json
Example 4: Finding and Citing Seminal Papers
    # Find highly cited papers on a topic
    python scripts/search_google_scholar.py "AlphaFold protein structure" \
      --year-start 2020 \
      --year-end 2024 \
      --sort-by citations \
      --limit 20 \
      --output alphafold_seminal.json

    # Extract the top 10 by citation count
    # (script will have included citation counts in JSON)

    # Convert to BibTeX
    python scripts/extract_metadata.py \
      --input alphafold_seminal.json \
      --output alphafold_refs.bib

    # The BibTeX file now contains the most influential papers
Imported: Overview
Manage citations systematically throughout the research and writing process. This skill provides tools and strategies for searching academic databases (Google Scholar, PubMed), extracting accurate metadata from multiple sources (CrossRef, PubMed, arXiv), validating citation information, and generating properly formatted BibTeX entries.
The skill is critical for maintaining citation accuracy, avoiding reference errors, and ensuring reproducible research. It integrates with the literature-review skill for comprehensive research workflows.
Imported: Summary
The citation-management skill provides:
- Comprehensive search capabilities for Google Scholar and PubMed
- Automated metadata extraction from DOI, PMID, arXiv ID, URLs
- Citation validation with DOI verification and completeness checking
- BibTeX formatting with standardization and cleaning tools
- Quality assurance through validation and reporting
- Integration with scientific writing workflow
- Reproducibility through documented search and extraction methods
Use this skill to maintain accurate, complete citations throughout your research and ensure publication-ready bibliographies.
Imported: Visual Enhancement with Scientific Schematics
When creating documents with this skill, always consider adding scientific diagrams and schematics to enhance visual communication.
If your document does not already contain schematics or diagrams:
- Use the scientific-schematics skill to generate AI-powered publication-quality diagrams
- Simply describe your desired diagram in natural language
- Nano Banana Pro will automatically generate, review, and refine the schematic
For new documents: Scientific schematics should be generated by default to visually represent key concepts, workflows, architectures, or relationships described in the text.
How to generate schematics:
python scripts/generate_schematic.py "your diagram description" -o figures/output.png
The AI will automatically:
- Create publication-quality images with proper formatting
- Review and refine through multiple iterations
- Ensure accessibility (colorblind-friendly, high contrast)
- Save outputs in the figures/ directory
When to add schematics:
- Citation workflow diagrams
- Literature search methodology flowcharts
- Reference management system architectures
- Citation style decision trees
- Database integration diagrams
- Any complex concept that benefits from visualization
For detailed guidance on creating schematics, refer to the scientific-schematics skill documentation.
Examples
Example 1: Ask for the upstream workflow directly
Use @citation-management to handle <task>. Start from the copied upstream workflow, load only the files that change the outcome, and keep provenance visible in the answer.
Explanation: This is the safest starting point when the operator needs the imported workflow, but not the entire repository.
Example 2: Ask for a provenance-grounded review
Review @citation-management against metadata.json and ORIGIN.md, then explain which copied upstream files you would load first and why.
Explanation: Use this before review or troubleshooting when you need a precise, auditable explanation of origin and file selection.
Example 3: Narrow the copied support files before execution
Use @citation-management for <task>. Load only the copied references, examples, or scripts that change the outcome, and name the files explicitly before proceeding.
Explanation: This keeps the skill aligned with progressive disclosure instead of loading the whole copied package by default.
Example 4: Build a reviewer packet
Review @citation-management using the copied upstream files plus provenance, then summarize any gaps before merge.
Explanation: This is useful when the PR is waiting for human review and you want a repeatable audit packet.
Best Practices
Treat the generated public skill as a reviewable packaging layer around the upstream repository. The goal is to keep provenance explicit and load only the copied source material that materially improves execution.
- Start broad, then narrow:
  - Begin with general terms to understand the field
  - Refine with specific keywords and filters
  - Use synonyms and related terms
- Use multiple sources:
  - Google Scholar for comprehensive coverage
  - PubMed for biomedical focus
Imported Operating Notes
Imported: Best Practices
Search Strategy
- Start broad, then narrow:
  - Begin with general terms to understand the field
  - Refine with specific keywords and filters
  - Use synonyms and related terms
- Use multiple sources:
  - Google Scholar for comprehensive coverage
  - PubMed for biomedical focus
  - arXiv for preprints
  - Combine results for completeness
- Leverage citations:
  - Check "Cited by" for seminal papers
  - Review references from key papers
  - Use citation networks to discover related work
- Document your searches:
  - Save search queries and dates
  - Record number of results
  - Note any filters or restrictions applied
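Documenting a search amounts to recording the query, the date, the number of results, and any filters in a form that can be saved next to the bibliography. The following sketch shows one possible record layout; `SearchRecord`, `append_record`, and all example values are hypothetical and exist only to illustrate the idea.

```python
import json
from dataclasses import asdict, dataclass, field
from datetime import date


@dataclass
class SearchRecord:
    database: str
    query: str
    result_count: int
    filters: dict = field(default_factory=dict)
    # Defaults to today's date in ISO format (YYYY-MM-DD).
    run_on: str = field(default_factory=lambda: date.today().isoformat())


def append_record(log, record):
    """Add one search record to an in-memory log (a list of plain dicts)."""
    log.append(asdict(record))
    return log


log = []
append_record(log, SearchRecord(
    database="pubmed",
    query='"CRISPR-Cas Systems"[MeSH] AND "Gene Editing"[MeSH]',
    result_count=0,  # illustrative placeholder, not a real result count
    filters={"date_start": "2020"},
))
print(json.dumps(log, indent=2))
```

Writing the log with json.dump to a file in the project directory keeps the search history under the same version control as the .bib file.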
Metadata Extraction
- Always use DOIs when available:
  - Most reliable identifier
  - Permanent link to the publication
  - Best metadata source via CrossRef
- Verify extracted metadata:
  - Check author names are correct
  - Verify journal/conference names
  - Confirm publication year
  - Validate page numbers and volume
- Handle edge cases:
  - Preprints: Include repository and ID
  - Preprints later published: Use published version
  - Conference papers: Include conference name and location
  - Book chapters: Include book title and editors
- Maintain consistency:
  - Use consistent author name format
  - Standardize journal abbreviations
  - Use same DOI format (URL preferred)
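Keeping DOIs in one format is mostly a matter of stripping whichever prefix an entry arrived with and re-attaching the resolver URL. A minimal sketch, assuming the URL form is the target; `doi_to_url` is a hypothetical helper name.

```python
def doi_to_url(doi):
    """Return the https://doi.org/ form of a DOI given with or without a prefix."""
    text = doi.strip()
    lowered = text.lower()
    prefixes = (
        "https://doi.org/",
        "http://doi.org/",
        "https://dx.doi.org/",
        "http://dx.doi.org/",
        "doi:",
    )
    for prefix in prefixes:
        if lowered.startswith(prefix):
            text = text[len(prefix):].strip()
            break
    return "https://doi.org/" + text


for raw in ["10.1038/s41586-021-03819-2", "doi:10.1126/science.aam9317"]:
    print(doi_to_url(raw))
```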
BibTeX Quality
- Follow conventions:
  - Use meaningful citation keys (FirstAuthor2024keyword)
  - Protect capitalization in titles with {}
  - Use -- for page ranges (not single dash)
  - Include DOI field for all modern publications
- Keep it clean:
  - Remove unnecessary fields
  - No redundant information
  - Consistent formatting
  - Validate syntax regularly
- Organize systematically:
  - Sort by year or topic
  - Group related papers
  - Use separate files for different projects
  - Merge carefully to avoid duplicates
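The FirstAuthor2024keyword convention can be generated mechanically from the first author's surname, the year, and the first significant title word. The sketch below is one possible interpretation of that convention; `make_citation_key`, the stopword list, and the example author and title are hypothetical.

```python
import re
import unicodedata

# Small illustrative stopword list; extend it as needed.
STOPWORDS = {"a", "an", "the", "on", "of", "in", "for", "and", "to", "with"}


def ascii_letters(text):
    """Strip accents and anything that is not an ASCII letter."""
    decomposed = unicodedata.normalize("NFKD", text)
    plain = decomposed.encode("ascii", "ignore").decode("ascii")
    return re.sub(r"[^A-Za-z]", "", plain)


def make_citation_key(first_author_last, year, title):
    """Build a key such as Muller2024role from surname, year, and title."""
    keyword = ""
    for word in re.findall(r"[A-Za-z]+", title):
        if word.lower() not in STOPWORDS:
            keyword = word.lower()
            break
    return f"{ascii_letters(first_author_last)}{year}{keyword}"


print(make_citation_key("Müller", 2024, "The Role of Citations"))
```

Two papers by the same first author in the same year can still collide on the keyword, so uniqueness should be checked after generation.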
Validation
- Validate early and often:
  - Check citations when adding them
  - Validate complete bibliography before submission
  - Re-validate after any manual edits
- Fix issues promptly:
  - Broken DOIs: Find correct identifier
  - Missing fields: Extract from original source
  - Duplicates: Choose best version, remove others
  - Format errors: Use auto-fix when safe
- Manual review for critical citations:
  - Verify key papers cited correctly
  - Check author names match publication
  - Confirm page numbers and volume
  - Ensure URLs are current
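Before choosing the best version of a duplicated entry, the duplicates have to be found. A common approach is to group entries by DOI when one is present and by a normalized title otherwise. The sketch below illustrates that grouping under those assumptions; `find_duplicates`, the entry layout, and the example keys and DOIs are hypothetical, and an entry with a DOI will not be matched against a copy of the same paper that lacks one.

```python
import re
from collections import defaultdict


def title_fingerprint(title):
    """Lowercase the title and collapse braces, punctuation, and spacing."""
    return re.sub(r"[^a-z0-9]+", " ", title.lower()).strip()


def find_duplicates(entries):
    """Group citation keys that share a DOI or, lacking a DOI, a title fingerprint."""
    groups = defaultdict(list)
    for entry in entries:
        doi = entry.get("doi", "").strip().lower()
        if doi:
            marker = ("doi", doi)
        else:
            marker = ("title", title_fingerprint(entry.get("title", "")))
        groups[marker].append(entry["key"])
    return [keys for keys in groups.values() if len(keys) > 1]


entries = [
    {"key": "Brown2021", "doi": "10.1234/abc", "title": "Example Title"},
    {"key": "Brown2021a", "doi": "10.1234/ABC", "title": "Example title"},
    {"key": "Lee2020", "title": "Deep {Learning} Methods"},
    {"key": "Lee2020b", "title": "deep learning methods."},
    {"key": "Kim2019", "title": "Unrelated Work"},
]
print(find_duplicates(entries))
```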
Troubleshooting
Problem: The operator skipped the imported context and answered too generically
Symptoms: The result ignores the upstream workflow in plugins/antigravity-awesome-skills-claude/skills/citation-management, fails to mention provenance, or does not use any copied source files at all.
Solution: Re-open metadata.json, ORIGIN.md, and the most relevant copied upstream files. Load only the files that materially change the answer, then restate the provenance before continuing.
Problem: The imported workflow feels incomplete during review
Symptoms: Reviewers can see the generated SKILL.md, but they cannot quickly tell which references, examples, or scripts matter for the current task.
Solution: Point at the exact copied references, examples, scripts, or assets that justify the path you took. If the gap is still real, record it in the PR instead of hiding it.
Problem: The task drifted into a different specialization
Symptoms: The imported skill starts in the right place, but the work turns into debugging, architecture, design, security, or release orchestration that a native skill handles better.
Solution: Use the related skills section to hand off deliberately. Keep the imported provenance visible so the next skill inherits the right context instead of starting blind.
Related Skills
- @burp-suite-testing: Use when the work is better handled by that native specialization after this imported skill establishes context.
- @burpsuite-project-parser: Use when the work is better handled by that native specialization after this imported skill establishes context.
- @business-analyst: Use when the work is better handled by that native specialization after this imported skill establishes context.
- @busybox-on-windows: Use when the work is better handled by that native specialization after this imported skill establishes context.
Additional Resources
Use this support matrix and the linked files below as the operator packet for this imported skill. They should reflect real copied source material, not generic scaffolding.
| Resource family | What it gives the reviewer | Example path |
|---|---|---|
| | copied reference notes, guides, or background material from upstream | |
| | worked examples or reusable prompts copied from upstream | |
| | upstream helper scripts that change execution or validation | |
| | routing or delegation notes that are genuinely part of the imported package | |
| | supporting assets or schemas copied from the source package | |
Imported Reference Notes
Imported: Resources
Bundled Resources
References (in references/):
- google_scholar_search.md: Complete Google Scholar search guide
- pubmed_search.md: PubMed and E-utilities API documentation
- metadata_extraction.md: Metadata sources and field requirements
- citation_validation.md: Validation criteria and quality checks
- bibtex_formatting.md: BibTeX entry types and formatting rules
Scripts (in scripts/):
- search_google_scholar.py: Google Scholar search automation
- search_pubmed.py: PubMed E-utilities API client
- extract_metadata.py: Universal metadata extractor
- validate_citations.py: Citation validation and verification
- format_bibtex.py: BibTeX formatter and cleaner
- doi_to_bibtex.py: Quick DOI to BibTeX converter
Assets (in assets/):
- bibtex_template.bib: Example BibTeX entries for all types
- citation_checklist.md: Quality assurance checklist
External Resources
Search Engines:
- Google Scholar: https://scholar.google.com/
- PubMed: https://pubmed.ncbi.nlm.nih.gov/
- PubMed Advanced Search: https://pubmed.ncbi.nlm.nih.gov/advanced/
Metadata APIs:
- CrossRef API: https://api.crossref.org/
- PubMed E-utilities: https://www.ncbi.nlm.nih.gov/books/NBK25501/
- arXiv API: https://arxiv.org/help/api/
- DataCite API: https://api.datacite.org/
Tools and Validators:
- MeSH Browser: https://meshb.nlm.nih.gov/search
- DOI Resolver: https://doi.org/
- BibTeX Format: http://www.bibtex.org/Format/
Citation Styles:
- BibTeX documentation: http://www.bibtex.org/
- LaTeX bibliography management: https://www.overleaf.com/learn/latex/Bibliography_management
Imported: Search Strategies
Google Scholar Best Practices
Finding Seminal and High-Impact Papers (CRITICAL):
Always prioritize papers based on citation count, venue quality, and author reputation:
Citation Count Thresholds:
| Paper Age | Citations | Classification |
|---|---|---|
| 0-3 years | 20+ | Noteworthy |
| 0-3 years | 100+ | Highly Influential |
| 3-7 years | 100+ | Significant |
| 3-7 years | 500+ | Landmark Paper |
| 7+ years | 500+ | Seminal Work |
| 7+ years | 1000+ | Foundational |
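The threshold table can be read as a simple lookup from paper age and citation count to a label. The sketch below encodes it directly. The table leaves the boundary ages ambiguous, so this sketch assumes a paper exactly 3 years old falls in the 3-7 bracket and one exactly 7 years old falls in the 7+ bracket; `classify_paper` is a hypothetical name.

```python
def classify_paper(age_years, citations):
    """Return the table's classification, or None if no threshold is met."""
    if age_years < 3:
        tiers = [(100, "Highly Influential"), (20, "Noteworthy")]
    elif age_years < 7:
        tiers = [(500, "Landmark Paper"), (100, "Significant")]
    else:
        tiers = [(1000, "Foundational"), (500, "Seminal Work")]
    # Tiers are ordered from highest threshold to lowest.
    for minimum, label in tiers:
        if citations >= minimum:
            return label
    return None


print(classify_paper(2, 150))
print(classify_paper(10, 700))
```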
Venue Quality Tiers:
- Tier 1 (Prefer): Nature, Science, Cell, NEJM, Lancet, JAMA, PNAS
- Tier 2 (High Priority): Impact Factor >10, top conferences (NeurIPS, ICML, ICLR)
- Tier 3 (Good): Specialized journals (IF 5-10)
- Tier 4 (Sparingly): Lower-impact peer-reviewed venues
Author Reputation Indicators:
- Senior researchers with h-index >40
- Multiple publications in Tier-1 venues
- Leadership at recognized institutions
- Awards and editorial positions
Search Strategies for High-Impact Papers:
- Sort by citation count (most cited first)
- Look for review articles from Tier-1 journals for overview
- Check "Cited by" for impact assessment and recent follow-up work
- Use citation alerts for tracking new citations to key papers
- Filter by top venues using source:Nature or source:Science
- Search for papers by known field leaders using author:LastName
Advanced Operators (full list in references/google_scholar_search.md):

    "exact phrase"     # Exact phrase matching
    author:lastname    # Search by author
    intitle:keyword    # Search in title only
    source:journal     # Search specific journal
    -exclude           # Exclude terms
    OR                 # Alternative terms
    2020..2024         # Year range

Example Searches:

    # Find recent reviews on a topic
    "CRISPR" intitle:review 2023..2024

    # Find papers by specific author on topic
    author:Church "synthetic biology"

    # Find highly cited foundational work
    "deep learning" 2012..2015 sort:citations

    # Exclude surveys and focus on methods
    "protein folding" -survey -review intitle:method
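Search strings that combine these operators can be assembled from their parts, which keeps quoting and exclusion prefixes consistent. A minimal sketch, covering only the phrase, author, intitle, and exclusion operators listed in this section; `build_scholar_query` and its parameters are hypothetical names.

```python
def build_scholar_query(phrases=(), author=None, title_words=(), exclude=()):
    """Join exact phrases, an author filter, title keywords, and exclusions."""
    parts = [f'"{phrase}"' for phrase in phrases]
    if author:
        parts.append(f"author:{author}")
    parts.extend(f"intitle:{word}" for word in title_words)
    parts.extend(f"-{term}" for term in exclude)
    return " ".join(parts)


print(build_scholar_query(
    phrases=["protein folding"],
    title_words=["method"],
    exclude=["survey", "review"],
))
```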
PubMed Best Practices
Using MeSH Terms: MeSH (Medical Subject Headings) provides controlled vocabulary for precise searching.
- Find MeSH terms at https://meshb.nlm.nih.gov/search
- Use in queries: "Diabetes Mellitus, Type 2"[MeSH]
- Combine with keywords for comprehensive coverage
Field Tags:
    [Title]              # Search in title only
    [Title/Abstract]     # Search in title or abstract
    [Author]             # Search by author name
    [Journal]            # Search specific journal
    [Publication Date]   # Date range
    [Publication Type]   # Article type
    [MeSH]               # MeSH term
Building Complex Queries:
    # Clinical trials on diabetes treatment published recently
    "Diabetes Mellitus, Type 2"[MeSH] AND "Drug Therapy"[MeSH]
    AND "Clinical Trial"[Publication Type] AND 2020:2024[Publication Date]

    # Reviews on CRISPR in specific journal
    "CRISPR-Cas Systems"[MeSH] AND "Nature"[Journal] AND "Review"[Publication Type]

    # Specific author's recent work
    "Smith AB"[Author] AND cancer[Title/Abstract] AND 2022:2024[Publication Date]
E-utilities for Automation: The scripts use NCBI E-utilities API for programmatic access:
- ESearch: Search and retrieve PMIDs
- EFetch: Retrieve full metadata
- ESummary: Get summary information
- ELink: Find related articles
See references/pubmed_search.md for complete API documentation.
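For orientation, an ESearch request is an HTTP GET against the public E-utilities endpoint with the query passed as the term parameter. The sketch below only builds the URL and sends nothing over the network. It uses the documented db, term, retmax, retmode, and api_key parameters; `esearch_url` is a hypothetical helper and is not taken from search_pubmed.py.

```python
from urllib.parse import urlencode

ESEARCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def esearch_url(term, retmax=20, api_key=None):
    """Build an ESearch URL that returns PMIDs for a PubMed query as JSON."""
    params = {"db": "pubmed", "term": term, "retmax": retmax, "retmode": "json"}
    if api_key:
        # An API key is optional and raises the allowed request rate.
        params["api_key"] = api_key
    # urlencode percent-encodes quotes and brackets in the query.
    return f"{ESEARCH_URL}?{urlencode(params)}"


print(esearch_url("cancer[Title]", retmax=5))
```

The PMIDs returned by such a request would then be passed to EFetch or ESummary to retrieve the metadata itself.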
Imported: Tools and Scripts
search_google_scholar.py
Search Google Scholar and export results.
Features:
- Automated searching with rate limiting
- Pagination support
- Year range filtering
- Export to JSON or BibTeX
- Citation count information
Usage:
    # Basic search
    python scripts/search_google_scholar.py "quantum computing"

    # Advanced search with filters
    python scripts/search_google_scholar.py "quantum computing" \
      --year-start 2020 \
      --year-end 2024 \
      --limit 100 \
      --sort-by citations \
      --output quantum_papers.json

    # Export directly to BibTeX
    python scripts/search_google_scholar.py "machine learning" \
      --limit 50 \
      --format bibtex \
      --output ml_papers.bib
search_pubmed.py
Search PubMed using E-utilities API.
Features:
- Complex query support (MeSH, field tags, Boolean)
- Date range filtering
- Publication type filtering
- Batch retrieval with metadata
- Export to JSON or BibTeX
Usage:
    # Simple keyword search
    python scripts/search_pubmed.py "CRISPR gene editing"

    # Complex query with filters
    python scripts/search_pubmed.py \
      --query '"CRISPR-Cas Systems"[MeSH] AND "therapeutic"[Title/Abstract]' \
      --date-start 2020-01-01 \
      --date-end 2024-12-31 \
      --publication-types "Clinical Trial,Review" \
      --limit 200 \
      --output crispr_therapeutic.json

    # Export to BibTeX
    python scripts/search_pubmed.py "Alzheimer's disease" \
      --limit 100 \
      --format bibtex \
      --output alzheimers.bib
extract_metadata.py
Extract complete metadata from paper identifiers.
Features:
- Supports DOI, PMID, arXiv ID, URL
- Queries CrossRef, PubMed, arXiv APIs
- Handles multiple identifier types
- Batch processing
- Multiple output formats
Usage:
    # Single DOI
    python scripts/extract_metadata.py --doi 10.1038/s41586-021-03819-2

    # Single PMID
    python scripts/extract_metadata.py --pmid 34265844

    # Single arXiv ID
    python scripts/extract_metadata.py --arxiv 2103.14030

    # From URL
    python scripts/extract_metadata.py \
      --url "https://www.nature.com/articles/s41586-021-03819-2"

    # Batch processing (file with one identifier per line)
    python scripts/extract_metadata.py \
      --input paper_ids.txt \
      --output references.bib

    # Different output formats
    python scripts/extract_metadata.py \
      --doi 10.1038/nature12345 \
      --format json  # or bibtex, yaml
validate_citations.py
Validate BibTeX entries for accuracy and completeness.
Features:
- DOI verification via doi.org and CrossRef
- Required field checking
- Duplicate detection
- Format validation
- Auto-fix common issues
- Detailed reporting
Usage:
# Basic validation
python scripts/validate_citations.py references.bib

# With auto-fix
python scripts/validate_citations.py references.bib \
    --auto-fix \
    --output fixed_references.bib

# Detailed validation report
python scripts/validate_citations.py references.bib \
    --report validation_report.json \
    --verbose

# Only check DOIs
python scripts/validate_citations.py references.bib \
    --check-dois-only
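To make "required field checking" concrete, the sketch below checks entries against the required fields of three standard BibTeX entry types. It is an illustration under assumptions, not the logic of validate_citations.py: entries are plain dictionaries with an `ENTRYTYPE` key, and `REQUIRED_FIELDS` and `missing_fields` are hypothetical names.

```python
# Required fields for a few standard BibTeX entry types.
# Each tuple lists alternatives; at least one of them must be present.
REQUIRED_FIELDS = {
    "article": [("author",), ("title",), ("journal",), ("year",)],
    "book": [("author", "editor"), ("title",), ("publisher",), ("year",)],
    "inproceedings": [("author",), ("title",), ("booktitle",), ("year",)],
}


def missing_fields(entry):
    """Return the required fields that are absent or empty in one entry.

    Entry types not listed in REQUIRED_FIELDS are not checked.
    """
    entry_type = entry.get("ENTRYTYPE", "").lower()
    missing = []
    for alternatives in REQUIRED_FIELDS.get(entry_type, []):
        if not any(entry.get(name, "").strip() for name in alternatives):
            missing.append("/".join(alternatives))
    return missing


if __name__ == "__main__":
    entry = {
        "ENTRYTYPE": "article",
        "ID": "Doe2021",
        "author": "Doe, Jane",
        "title": "An Example Title",
        "year": "2021",
    }
    print(missing_fields(entry))  # the journal field is missing here
```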
format_bibtex.py
Format and clean BibTeX files.
Features:
- Standardize formatting
- Sort entries (by key, year, author)
- Remove duplicates
- Validate syntax
- Fix common errors
- Enforce citation key conventions
Usage:
# Basic formatting
python scripts/format_bibtex.py references.bib

# Sort by year (newest first)
python scripts/format_bibtex.py references.bib \
    --sort year \
    --descending \
    --output sorted_refs.bib

# Remove duplicates
python scripts/format_bibtex.py references.bib \
    --deduplicate \
    --output clean_refs.bib

# Complete cleanup
python scripts/format_bibtex.py references.bib \
    --deduplicate \
    --sort year \
    --validate \
    --auto-fix \
    --output final_refs.bib
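Deduplication and sorting can be pictured with the following sketch. It assumes entries are plain dictionaries of BibTeX fields and treats two entries as duplicates when they share a DOI or, failing that, a normalized title and year. The matching rule and the `deduplicate` and `sort_by_year` helpers are assumptions for illustration, and format_bibtex.py may use different criteria.

```python
import re


def _dedup_key(entry):
    """Build a comparison key: the DOI if present, else normalized title plus year."""
    doi = entry.get("doi", "").strip().lower()
    if doi:
        return ("doi", doi)
    title = re.sub(r"[^a-z0-9]+", " ", entry.get("title", "").lower()).strip()
    return ("title", title, entry.get("year", "").strip())


def deduplicate(entries):
    """Keep the first occurrence of each entry, preserving input order."""
    seen = set()
    unique = []
    for entry in entries:
        key = _dedup_key(entry)
        if key not in seen:
            seen.add(key)
            unique.append(entry)
    return unique


def sort_by_year(entries, descending=True):
    """Sort by the first four-digit year found; entries without one sort as year 0."""
    def year_of(entry):
        match = re.search(r"\d{4}", entry.get("year", ""))
        return int(match.group()) if match else 0
    return sorted(entries, key=year_of, reverse=descending)


if __name__ == "__main__":
    refs = [
        {"ID": "a", "title": "Deep Learning", "year": "2015", "doi": "10.1038/nature14539"},
        {"ID": "b", "title": "Deep learning.", "year": "2015", "doi": "10.1038/NATURE14539"},
        {"ID": "c", "title": "Another Paper", "year": "2021"},
    ]
    for ref in sort_by_year(deduplicate(refs)):
        print(ref["ID"], ref["year"])
```

One limitation of this rule: an entry with a DOI and a copy of the same paper without one are not matched, so a manual spot check is still worthwhile.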
doi_to_bibtex.py
Quick DOI to BibTeX conversion.
Features:
- Fast single DOI conversion
- Batch processing
- Multiple output formats
- Clipboard support
Usage:
# Single DOI
python scripts/doi_to_bibtex.py 10.1038/s41586-021-03819-2

# Multiple DOIs
python scripts/doi_to_bibtex.py \
    10.1038/nature12345 \
    10.1126/science.abc1234 \
    10.1016/j.cell.2023.01.001

# From file (one DOI per line)
python scripts/doi_to_bibtex.py --input dois.txt --output references.bib

# Copy to clipboard
python scripts/doi_to_bibtex.py 10.1038/nature12345 --clipboard
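One common way to turn a DOI into BibTeX is DOI content negotiation: request the DOI's `https://doi.org/` URL with an `Accept: application/x-bibtex` header. Whether doi_to_bibtex.py works this way is not stated here, so treat the following as a sketch under that assumption. `build_bibtex_request` and `fetch_bibtex` are hypothetical helpers, and only the second one touches the network.

```python
from urllib.parse import quote
from urllib.request import Request, urlopen


def build_bibtex_request(doi):
    """Build a content-negotiation request asking doi.org for BibTeX."""
    url = "https://doi.org/" + quote(doi.strip(), safe="/")
    return Request(url, headers={"Accept": "application/x-bibtex"})


def fetch_bibtex(doi, timeout=10):
    """Send the request and return the response body as text.

    Requires network access; raises urllib.error.HTTPError for unknown DOIs.
    """
    with urlopen(build_bibtex_request(doi), timeout=timeout) as response:
        return response.read().decode("utf-8")


if __name__ == "__main__":
    request = build_bibtex_request("10.1038/s41586-021-03819-2")
    print(request.full_url)
    print(request.get_header("Accept"))
```

Registration agencies differ in how complete the returned record is, so the result still benefits from a validation pass.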
Imported: Common Pitfalls to Avoid
- Single source bias: Only using Google Scholar or PubMed
  - Solution: Search multiple databases for comprehensive coverage
- Accepting metadata blindly: Not verifying extracted information
  - Solution: Spot-check extracted metadata against original sources
- Ignoring DOI errors: Broken or incorrect DOIs in the bibliography
  - Solution: Run validation before final submission
- Inconsistent formatting: Mixed citation key styles and formatting
  - Solution: Use format_bibtex.py to standardize
- Duplicate entries: Same paper cited multiple times with different keys
  - Solution: Use duplicate detection in validation
- Missing required fields: Incomplete BibTeX entries
  - Solution: Validate and ensure all required fields are present
- Outdated preprints: Citing a preprint when a published version exists
  - Solution: Check whether preprints have been published and update to the journal version
- Special character issues: LaTeX compilation broken by unescaped characters
  - Solution: Use proper escaping or Unicode in BibTeX
- No validation before submission: Submitting with citation errors
  - Solution: Always run validation as a final check
- Manual BibTeX entry: Typing entries by hand
  - Solution: Always extract from metadata sources using scripts
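As an illustration of the special-character pitfall, the sketch below escapes the characters that most often break LaTeX compilation in text fields such as titles and journal names. `escape_bibtex_text` is a hypothetical helper. It deliberately leaves `$`, braces, and backslashes alone because those are often intentional LaTeX, and it should not be applied to `url` or `doi` fields, where underscores and percent signs are usually meant literally.

```python
import re

# Match &, %, # or _ only when not already preceded by a backslash.
_UNESCAPED = re.compile(r"(?<!\\)([&%#_])")


def escape_bibtex_text(value):
    """Backslash-escape common LaTeX special characters in a text field.

    Already-escaped characters are left untouched, so the function can be
    applied repeatedly without doubling the backslashes.
    """
    return _UNESCAPED.sub(r"\\\1", value)


if __name__ == "__main__":
    print(escape_bibtex_text("Smith & Jones: 50% of C# users"))
```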
Imported: Integration with Other Skills
Literature Review Skill
Citation Management provides the technical infrastructure for Literature Review:
- Literature Review: Multi-database systematic search and synthesis
- Citation Management: Metadata extraction and validation
Combined workflow:
- Use literature-review for systematic search methodology
- Use citation-management to extract and validate citations
- Use literature-review to synthesize findings
- Use citation-management to ensure bibliography accuracy
Scientific Writing Skill
Citation Management ensures accurate references for Scientific Writing:
- Export validated BibTeX for use in LaTeX manuscripts
- Verify citations match publication standards
- Format references according to journal requirements
Venue Templates Skill
Citation Management works with Venue Templates for submission-ready manuscripts:
- Different venues require different citation styles
- Generate properly formatted references
- Validate citations meet venue requirements
Imported: Dependencies
Required Python Packages
# Core dependencies
pip install requests       # HTTP requests for APIs
pip install bibtexparser   # BibTeX parsing and formatting
pip install biopython      # PubMed E-utilities access

# Optional (for Google Scholar, which has no official API)
pip install scholarly      # Unofficial Google Scholar wrapper
# or
pip install selenium       # For more robust Scholar scraping
Optional Tools
# For advanced validation
pip install crossref-commons   # Enhanced CrossRef API access
pip install pylatexenc         # LaTeX special character handling
Imported: Limitations
- Use this skill only when the task clearly matches the scope described above.
- Do not treat the output as a substitute for environment-specific validation, testing, or expert review.
- Stop and ask for clarification if required inputs, permissions, safety boundaries, or success criteria are missing.