Awesome-Agent-Skills-for-Empirical-Research h-index-guide

Understanding and calculating research impact metrics

install
source · Clone the upstream repo
git clone https://github.com/brycewang-stanford/Awesome-Agent-Skills-for-Empirical-Research
Claude Code · Install into ~/.claude/skills/
T=$(mktemp -d) && git clone --depth=1 https://github.com/brycewang-stanford/Awesome-Agent-Skills-for-Empirical-Research "$T" && mkdir -p ~/.claude/skills && cp -r "$T/skills/43-wentorai-research-plugins/skills/literature/metadata/h-index-guide" ~/.claude/skills/brycewang-stanford-awesome-agent-skills-for-empirical-research-h-index-guide && rm -rf "$T"
manifest: skills/43-wentorai-research-plugins/skills/literature/metadata/h-index-guide/SKILL.md

H-Index and Research Impact Metrics Guide

Understand, calculate, and responsibly interpret bibliometric indicators including h-index, impact factor, and related metrics.

Core Bibliometric Indicators

H-Index

The h-index (Hirsch index) is the largest number h such that a researcher has h papers that have each been cited at least h times.

Example: If a researcher has published 20 papers with citation counts [120, 80, 55, 40, 22, 18, 15, 12, 10, 8, 5, 3, 2, 2, 1, 1, 0, 0, 0, 0], their h-index is 9: nine papers have at least 9 citations each, but the tenth-ranked paper has only 8 citations, so the threshold for h = 10 is not met.

def calculate_h_index(citation_counts):
    """Calculate h-index from a list of citation counts."""
    sorted_counts = sorted(citation_counts, reverse=True)
    h = 0
    # The paper at 0-based position i has rank i + 1; it counts toward h
    # only while its citations are at least its rank.
    for i, count in enumerate(sorted_counts):
        if count >= i + 1:
            h = i + 1
        else:
            break
    return h

# Example
citations = [120, 80, 55, 40, 22, 18, 15, 12, 10, 8, 5, 3, 2, 2, 1, 1, 0, 0, 0, 0]
print(f"h-index: {calculate_h_index(citations)}")  # Output: 9

Related Author-Level Metrics

| Metric | Definition | Advantage |
|---|---|---|
| h-index | Largest h such that h papers have >= h citations each | Simple, robust to outliers |
| i10-index | Number of papers with >= 10 citations | Intuitive threshold (Google Scholar uses this) |
| g-index | Largest g such that the top g papers have >= g^2 total citations | Rewards highly cited papers more |
| m-quotient | h-index divided by years since first publication | Normalizes for career length |
| hI-norm | h-index computed after dividing each paper's citations by its number of authors | Adjusts for team size |

def calculate_g_index(citation_counts):
    """Calculate g-index from citation counts."""
    sorted_counts = sorted(citation_counts, reverse=True)
    cumulative = 0
    g = 0
    for i, count in enumerate(sorted_counts):
        cumulative += count
        if cumulative >= (i + 1) ** 2:
            g = i + 1
    return g

def calculate_i10_index(citation_counts):
    """Calculate i10-index."""
    return sum(1 for c in citation_counts if c >= 10)

print(f"g-index: {calculate_g_index(citations)}")    # Output: 19
print(f"i10-index: {calculate_i10_index(citations)}") # Output: 9
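The m-quotient and hI-norm rows in the table can be sketched the same way. This is a minimal illustration, not a reference implementation: the function names, the 2010 and 2025 career years, and the `papers` list are hypothetical, and hI-norm is taken here in one common formulation (divide each paper's citations by its author count, then take the h-index of the result).

```python
def calculate_h_index(citation_counts):
    """Largest h such that h papers have at least h citations each."""
    sorted_counts = sorted(citation_counts, reverse=True)
    h = 0
    for rank, count in enumerate(sorted_counts, start=1):
        if count < rank:
            break
        h = rank
    return h


def calculate_m_quotient(citation_counts, first_pub_year, current_year):
    """h-index divided by career length in years (at least 1)."""
    years_active = max(current_year - first_pub_year, 1)
    return calculate_h_index(citation_counts) / years_active


def calculate_hi_norm(papers):
    """h-index of per-author citation counts.

    `papers` is a list of (citations, n_authors) tuples.
    """
    per_author = [cites / n_authors for cites, n_authors in papers]
    return calculate_h_index(per_author)


citations = [120, 80, 55, 40, 22, 18, 15, 12, 10, 8, 5, 3, 2, 2, 1, 1, 0, 0, 0, 0]
# Hypothetical career span: first paper in 2010, evaluated in 2025
print(f"m-quotient: {calculate_m_quotient(citations, 2010, 2025):.2f}")  # 9 / 15 = 0.60

# Hypothetical (citations, n_authors) pairs
papers = [(10, 2), (8, 4), (6, 1), (4, 2), (3, 3)]
print(f"h-index: {calculate_h_index([c for c, _ in papers])}")  # 4
print(f"hI-norm: {calculate_hi_norm(papers)}")                  # 2
```

In the second example the raw h-index of 4 drops to 2 once citations are shared among co-authors, which is the team-size adjustment the table describes.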

Journal-Level Metrics

Journal Impact Factor (JIF)

The Journal Impact Factor is published annually by Clarivate in the Journal Citation Reports (JCR). The 2-year impact factor for year Y is:

JIF(Y) = (Citations in Y to articles published in Y-1 and Y-2)
         / (Number of citable items published in Y-1 and Y-2)

| Metric | Provider | Window | Notable Features |
|---|---|---|---|
| Impact Factor | Clarivate (JCR) | 2-year or 5-year | Most widely recognized; requires a JCR subscription |
| CiteScore | Scopus (Elsevier) | 4-year | Free to view; counts peer-reviewed document types (articles, reviews, conference papers, book chapters, data papers) |
| SJR (Scimago) | Scopus data | 3-year | Weights citations by journal prestige (PageRank-like) |
| SNIP | Scopus data | 3-year | Normalizes for citation potential of each field |
| h5-index | Google Scholar | 5-year | Free, h-index applied to a journal |
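As a worked illustration of the 2-year JIF formula, the sketch below plugs in invented numbers for a hypothetical journal. The function name and all figures are assumptions for demonstration only; real values come from the JCR.

```python
def journal_impact_factor(citations_in_year, citable_items):
    """2-year JIF: citations received in year Y to items from Y-1 and Y-2,
    divided by the number of citable items published in Y-1 and Y-2."""
    if citable_items <= 0:
        raise ValueError("citable_items must be positive")
    return citations_in_year / citable_items


# Hypothetical journal: 150 and 170 citable items in the two prior years,
# which together received 1,280 citations in the JIF year.
jif = journal_impact_factor(citations_in_year=1280, citable_items=150 + 170)
print(f"JIF: {jif:.3f}")  # 1280 / 320 = 4.000
```

Because the numerator counts citations to every item while the denominator counts only citable items, editorials and letters that attract citations can raise the ratio without adding to the denominator.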

Looking Up Journal Metrics

import requests

# Query the OpenAlex API for journals (sources) whose name matches
journal_name = "Nature"
response = requests.get(
    "https://api.openalex.org/sources",
    params={"filter": f"display_name.search:{journal_name}", "per-page": 5},
    timeout=30,
)
response.raise_for_status()  # fail loudly on HTTP errors
for source in response.json()["results"]:
    stats = source.get("summary_stats") or {}  # may be missing or null
    print(f"Name: {source['display_name']}")
    print(f"  ISSN-L: {source.get('issn_l', 'N/A')}")
    print(f"  Works count: {source.get('works_count', 'N/A')}")
    print(f"  Cited by count: {source.get('cited_by_count', 'N/A')}")
    print(f"  h-index: {stats.get('h_index', 'N/A')}")
    # Computed from OpenAlex data; not Clarivate's official JIF
    print(f"  2-year mean citedness: {stats.get('2yr_mean_citedness', 'N/A')}")

Calculating Your Own H-Index

From Google Scholar

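Google Scholar profiles automatically display the h-index and i10-index, so no calculation is needed. Google Scholar has the broadest coverage of the common sources because it also indexes non-peer-reviewed material such as preprints and theses, so its values are usually higher than those from Scopus or Web of Science.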

From OpenAlex

import requests

# OpenAlex provides h-index directly in author profiles
author_name = "Geoffrey Hinton"
response = requests.get(
    "https://api.openalex.org/authors",
    params={"filter": f"display_name.search:{author_name}", "per-page": 1},
    timeout=30,
)
response.raise_for_status()
results = response.json()["results"]
if not results:
    raise SystemExit(f"No OpenAlex author found for {author_name!r}")
author = results[0]  # top match only; verify author["id"] for common names
stats = author["summary_stats"]
print(f"h-index: {stats['h_index']}")
print(f"i10-index: {stats['i10_index']}")
print(f"2-year mean citedness: {stats['2yr_mean_citedness']}")
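To cross-check a profile value, or to compute an h-index over a subset of papers, the same calculation can be run on individual work records. The sketch below assumes each record is a dict with a `cited_by_count` field, as OpenAlex work objects have; the helper name and the sample records are hypothetical stand-ins for an API response, and fetching the works is not shown.

```python
def h_index_from_works(works):
    """Compute the h-index from work records carrying `cited_by_count`."""
    # Treat a missing or null count as zero citations
    counts = sorted((w.get("cited_by_count") or 0 for w in works), reverse=True)
    h = 0
    for rank, count in enumerate(counts, start=1):
        if count < rank:
            break
        h = rank
    return h


# Hand-made records standing in for an API response (not real data)
sample_works = [
    {"id": "W1", "cited_by_count": 25},
    {"id": "W2", "cited_by_count": 8},
    {"id": "W3", "cited_by_count": 5},
    {"id": "W4", "cited_by_count": 3},
    {"id": "W5", "cited_by_count": None},
]
print(h_index_from_works(sample_works))  # 3
```

Filtering `works` before the call (for example, by publication year or document type) gives a windowed or type-restricted h-index from the same function.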

Responsible Use of Metrics

Known Limitations

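  1. Field dependence: Average citation rates vary dramatically across disciplines, so the same value signals different standing in different fields. An h-index of 20 may be strong in mathematics but modest in the biomedical sciences.
  2. Career stage bias: The h-index never decreases over time, so it favors longer careers. Always compare within career stage (m-quotient helps).
  3. Self-citation: Some databases include self-citations in the h-index calculation, which can raise the value; check whether the source offers a figure that excludes them.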
  4. Database coverage: Google Scholar, Scopus, and Web of Science yield different h-index values for the same author.
  5. Gaming: Metrics can be inflated through citation cartels, salami slicing, and excessive self-citation.
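The self-citation point can be made concrete with a small sketch. The `papers` list of (total citations, self-citations) pairs below is invented for illustration, and how a given database identifies self-citations is not modeled here.

```python
def h_index(counts):
    """Largest h such that h entries are each at least h."""
    h = 0
    for rank, count in enumerate(sorted(counts, reverse=True), start=1):
        if count < rank:
            break
        h = rank
    return h


# Hypothetical (total_citations, self_citations) per paper
papers = [(30, 2), (12, 5), (9, 4), (7, 3), (6, 3), (5, 1), (2, 0)]

with_self = h_index([total for total, _ in papers])
without_self = h_index([total - own for total, own in papers])
print(f"h-index including self-citations: {with_self}")     # 5
print(f"h-index excluding self-citations: {without_self}")  # 4
```

A one-point difference like this is why the source database and its self-citation policy belong next to any reported value.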

DORA Declaration

The San Francisco Declaration on Research Assessment (DORA) recommends:

  • Do not use journal-based metrics (such as impact factor) as a surrogate measure of individual research quality.
  • Assess research on its own merits rather than on the basis of the journal in which it is published.
  • Use article-level metrics alongside qualitative indicators for assessment.

Best Practices for Reporting

  • Always specify the source database and date when reporting h-index
  • Report multiple metrics rather than relying on a single number
  • Provide field-normalized indicators (such as FWCI, the Field-Weighted Citation Impact, for articles and authors, or SNIP for journals) when comparing across disciplines
  • Include qualitative achievements alongside quantitative metrics in CVs and promotion cases
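One way to follow the first two practices is to generate the reported line from structured data so the database and date are never omitted. This is a minimal sketch: the function name, the source label, and the retrieval date are hypothetical, and the metric values are the ones from the worked citation list earlier in this guide.

```python
from datetime import date


def format_metrics_report(metrics, source, retrieved):
    """Render metrics as one line that names the database and retrieval date."""
    parts = ", ".join(f"{name} = {value}" for name, value in metrics.items())
    return f"{parts} (source: {source}, retrieved {retrieved.isoformat()})"


report = format_metrics_report(
    {"h-index": 9, "i10-index": 9, "g-index": 19},
    source="OpenAlex",            # hypothetical choice of database
    retrieved=date(2025, 1, 15),  # hypothetical retrieval date
)
print(report)
# h-index = 9, i10-index = 9, g-index = 19 (source: OpenAlex, retrieved 2025-01-15)
```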