Awesome-Agent-Skills-for-Empirical-Research google-scholar-guide
Advanced Google Scholar search techniques for comprehensive literature discovery
install
source · Clone the upstream repo
git clone https://github.com/brycewang-stanford/Awesome-Agent-Skills-for-Empirical-Research
Claude Code · Install into ~/.claude/skills/
T=$(mktemp -d) && git clone --depth=1 https://github.com/brycewang-stanford/Awesome-Agent-Skills-for-Empirical-Research "$T" && mkdir -p ~/.claude/skills && cp -r "$T/skills/43-wentorai-research-plugins/skills/literature/search/google-scholar-guide" ~/.claude/skills/brycewang-stanford-awesome-agent-skills-for-empirical-research-google-scholar-gu && rm -rf "$T"
manifest: skills/43-wentorai-research-plugins/skills/literature/search/google-scholar-guide/SKILL.md
Google Scholar Guide
A skill for leveraging Google Scholar's full capabilities for academic literature search. Covers advanced search operators, citation tracking, alert configuration, and strategies for systematic and comprehensive retrieval.
Advanced Search Operators
Core Operators
| Operator | Syntax | Example | Effect |
|---|---|---|---|
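| Exact phrase | `"..."` | `"transfer learning"` | Matches exact phrase |
| OR | `OR` | `"medical imaging" OR radiology` | Matches either term |
| Exclude | `-term` | `-survey` | Excludes term |
| Author | `author:` | `author:"J Smith"` | Filter by author |
| Source | `source:` | `source:"Nature"` | Filter by journal |
| Title only | `intitle:` | `intitle:"transfer learning"` | Search in title only |
| Date range | Custom range | Via Advanced Search UI | Limit publication years |
| File type | `filetype:` | `filetype:pdf` | Specific file formats |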
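The date range has no query operator, but a custom range set in the UI shows up in the address bar as year parameters. The sketch below builds such a search URL. It assumes the `as_ylo` and `as_yhi` parameter names seen in the browser after applying a custom range; they are not a documented API and may change. The helper name `scholar_search_url` is hypothetical.

```python
from typing import Optional
from urllib.parse import urlencode

SCHOLAR_SEARCH_URL = "https://scholar.google.com/scholar"


def scholar_search_url(query: str,
                       year_from: Optional[int] = None,
                       year_to: Optional[int] = None) -> str:
    """Return a Google Scholar search URL for `query`, optionally limited by year."""
    params = {"q": query}
    # Assumption: as_ylo / as_yhi are the lower / upper publication-year bounds
    # that appear in the URL after choosing "Custom range" in the sidebar.
    if year_from is not None:
        params["as_ylo"] = str(year_from)
    if year_to is not None:
        params["as_yhi"] = str(year_to)
    # urlencode percent-escapes quotes and colons and turns spaces into '+'
    return f"{SCHOLAR_SEARCH_URL}?{urlencode(params)}"


url = scholar_search_url('intitle:"transfer learning" -survey',
                         year_from=2019, year_to=2023)
print(url)
```

Open the resulting URL in a browser. Fetching it from a script is a different matter and is not covered here.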
Constructing Effective Queries
```python
from typing import Optional


def build_scholar_query(
    concepts: list[list[str]],
    exclude: Optional[list[str]] = None,
    title_only: bool = False,
    author: Optional[str] = None,
    source: Optional[str] = None,
) -> str:
    """
    Build a structured Google Scholar query from concept groups.

    Args:
        concepts: List of concept groups, each a list of synonyms.
            Groups are ANDed together, synonyms are ORed.
        exclude: Terms to exclude
        title_only: Search in title only
        author: Author name filter
        source: Journal/source filter

    Returns:
        Formatted Google Scholar query string
    """
    # intitle: binds only to the term that follows it, so prefix every term
    prefix = 'intitle:' if title_only else ''

    # Build concept groups with OR
    groups = []
    for concept_group in concepts:
        if len(concept_group) == 1:
            groups.append(f'{prefix}"{concept_group[0]}"')
        else:
            terms = ' OR '.join(f'{prefix}"{term}"' for term in concept_group)
            groups.append(f'({terms})')

    # AND the concept groups together (a space acts as an implicit AND)
    query = ' '.join(groups)

    # Add exclusions; multi-word terms must be quoted to be excluded as a phrase
    for term in exclude or []:
        query += f' -"{term}"' if ' ' in term else f' -{term}'

    # Add author filter
    if author:
        query += f' author:"{author}"'

    # Add source filter
    if source:
        query += f' source:"{source}"'

    return query


# Example: find papers on transfer learning for medical imaging
query = build_scholar_query(
    concepts=[
        ["transfer learning", "domain adaptation", "fine-tuning"],
        ["medical imaging", "radiology", "pathology images"],
        ["deep learning", "convolutional neural network"],
    ],
    exclude=["survey", "review"],
    title_only=False,
)
print(query)
# Output (a single line, wrapped here for readability):
# ("transfer learning" OR "domain adaptation" OR "fine-tuning")
# ("medical imaging" OR "radiology" OR "pathology images")
# ("deep learning" OR "convolutional neural network") -survey -review
```
Citation Tracking Strategies
Forward and Backward Citation Chaining
```
Seed Paper (a highly relevant paper you already know)
  |
  +--> "Cited by" link -> Forward citation tracking
  |    (who cited this paper? newer related work)
  |
  +--> Reference list -> Backward citation tracking
       (what did this paper cite? foundational work)

Repeat for each highly relevant paper found.
Stop when you reach saturation (no new relevant papers appearing).
```
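The chaining loop can be sketched as a breadth-first traversal that stops once no new relevant papers turn up. Everything here is illustrative: the `REFERENCES` graph is a hypothetical in-memory stand-in for reference lists and "Cited by" pages that you would collect by hand, and `is_relevant` stands in for your own screening judgment.

```python
from collections import deque

# Hypothetical toy citation graph: paper -> papers in its reference list.
REFERENCES = {
    "seed": ["A", "B"],
    "C": ["seed"],
    "D": ["C", "A"],
    "A": [],
    "B": ["A"],
}


def cited_by(paper: str) -> list[str]:
    """Forward step: papers whose reference list contains `paper`."""
    return [p for p, refs in REFERENCES.items() if paper in refs]


def snowball(seed: str, is_relevant) -> list[str]:
    """Chain backward and forward from `seed` until no new relevant papers appear."""
    seen = {seed}
    queue = deque([seed])
    found = []
    while queue:  # an empty queue means saturation has been reached
        paper = queue.popleft()
        # Backward step (reference list) plus forward step ("Cited by")
        neighbours = REFERENCES.get(paper, []) + cited_by(paper)
        for candidate in neighbours:
            if candidate in seen:
                continue
            seen.add(candidate)
            # Only relevant papers are kept and expanded further
            if is_relevant(candidate):
                found.append(candidate)
                queue.append(candidate)
    return found


relevant = snowball("seed", is_relevant=lambda p: p != "B")
print(relevant)  # ['A', 'C', 'D']
```

Expanding only relevant papers keeps the search from drifting into adjacent topics. The cost is that a relevant paper reachable only through an irrelevant one will be missed.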
Identifying Key Papers
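Use citation metrics strategically. Ranking by citations per year, rather than raw counts, keeps recent papers from being penalized for their age: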
```python
import datetime


def identify_key_papers(search_results: list[dict], min_citations: int = 10) -> list[dict]:
    """
    Identify key papers from search results using citation analysis.

    Args:
        search_results: List of papers with 'title', 'year', 'citations'
        min_citations: Minimum citation threshold

    Returns:
        Papers meeting the threshold, sorted by citations per year (highest first).
        Each paper dict is annotated in place with 'citations_per_year' and 'influence'.
    """
    current_year = datetime.datetime.now().year

    for paper in search_results:
        # Count papers from the current year as one year old to avoid dividing by zero
        age = max(1, current_year - paper['year'])
        paper['citations_per_year'] = paper['citations'] / age

        # Classify influence
        if paper['citations_per_year'] > 50:
            paper['influence'] = 'landmark'
        elif paper['citations_per_year'] > 20:
            paper['influence'] = 'highly_influential'
        elif paper['citations_per_year'] > 5:
            paper['influence'] = 'influential'
        else:
            paper['influence'] = 'standard'

    # Filter and sort
    filtered = [p for p in search_results if p['citations'] >= min_citations]
    return sorted(filtered, key=lambda x: x['citations_per_year'], reverse=True)
```
Google Scholar Alerts
Set up alerts to stay current:
- Go to Google Scholar and run your search query
- Click "Create alert" in the left sidebar
- Confirm the alert query, the email address, and the number of results per email (up to 10 or up to 20); Scholar alerts have no frequency setting
- Use the same carefully constructed query from your search strategy
Best practices for alerts:
- Create separate alerts for each major concept group
- Use narrow, specific queries to reduce noise (10-20 results per alert is ideal)
- Review and refine alert queries quarterly
Google Scholar Profiles
Leveraging Author Profiles
- Follow prolific researchers in your field to get notifications of their new publications
- Use the "Follow" option on an author's profile to receive new articles related to that author's research
- Check co-author networks to discover related research groups
- The h-index and i10-index on profiles can help gauge researcher impact, but use with caution across different fields
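For reference, both profile metrics can be recomputed from a list of per-paper citation counts. This is a minimal sketch using the standard definitions; the `counts` list is made-up illustration data, not taken from any real profile.

```python
def h_index(citations: list[int]) -> int:
    """Largest h such that h papers have at least h citations each."""
    ranked = sorted(citations, reverse=True)
    h = 0
    for rank, count in enumerate(ranked, start=1):
        if count >= rank:
            h = rank
        else:
            break  # counts only decrease from here, so h cannot grow
    return h


def i10_index(citations: list[int]) -> int:
    """Number of papers with at least 10 citations."""
    return sum(1 for count in citations if count >= 10)


# Hypothetical citation counts for one author's papers
counts = [48, 33, 12, 10, 9, 4, 0]
print(h_index(counts), i10_index(counts))  # 5 4
```

Both numbers grow with career length and with a field's citation habits, which is why comparisons across fields need caution.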
Limitations and Complementary Databases
Google Scholar has known limitations:
- No controlled vocabulary or MeSH terms (unlike PubMed)
- Cannot filter by study design or methodology
- Includes non-peer-reviewed sources (preprints, theses, slides)
- Citation counts may include self-citations and non-scholarly citations
For systematic reviews, always supplement Google Scholar with structured databases: PubMed/MEDLINE, Web of Science, Scopus, and domain-specific databases (e.g., IEEE Xplore, PsycINFO, EconLit). Document the number of results from each database for your PRISMA flow diagram.
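A minimal sketch of the bookkeeping behind the identification stage of a PRISMA flow diagram: records per database, then duplicates removed. The `records` list is hypothetical illustration data. Matching on a normalized title is an assumption made for brevity; real deduplication would normally also compare DOIs, authors, and years.

```python
from collections import Counter

# Hypothetical export: (database, title) pairs gathered from each search
records = [
    ("Google Scholar", "Transfer Learning for Medical Imaging"),
    ("PubMed", "Transfer learning for medical imaging."),
    ("Scopus", "Domain Adaptation in Radiology"),
    ("Google Scholar", "Domain adaptation in radiology"),
    ("Web of Science", "Deep Learning for Pathology Images"),
]


def normalize(title: str) -> str:
    """Lowercase the title and collapse punctuation and whitespace to single spaces."""
    cleaned = "".join(ch if ch.isalnum() else " " for ch in title.lower())
    return " ".join(cleaned.split())


# Records identified per database (first box of the flow diagram)
identified = Counter(database for database, _ in records)

# Unique records after title-based deduplication
unique_titles = {normalize(title) for _, title in records}
duplicates_removed = len(records) - len(unique_titles)

for database, count in identified.items():
    print(f"{database}: {count}")
print(f"Duplicates removed: {duplicates_removed}")
print(f"Records to screen: {len(unique_titles)}")
```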