Finance-skills stock-correlation

install
source · Clone the upstream repo
git clone https://github.com/himself65/finance-skills
Claude Code · Install into ~/.claude/skills/
T=$(mktemp -d) && git clone --depth=1 https://github.com/himself65/finance-skills "$T" && mkdir -p ~/.claude/skills && cp -r "$T/plugins/market-analysis/skills/stock-correlation" ~/.claude/skills/himself65-finance-skills-stock-correlation && rm -rf "$T"
manifest: plugins/market-analysis/skills/stock-correlation/SKILL.md

Stock Correlation Analysis Skill

Finds and analyzes correlated stocks using historical price data from Yahoo Finance via yfinance. Routes to specialized sub-skills based on user intent.

Important: This is for research and educational purposes only. Not financial advice. yfinance is not affiliated with Yahoo, Inc.


Step 1: Ensure Dependencies Are Available

Current environment status:

!`python3 -c "import yfinance, pandas, numpy; print(f'yfinance={yfinance.__version__} pandas={pandas.__version__} numpy={numpy.__version__}')" 2>/dev/null || echo "DEPS_MISSING"`

If the check prints `DEPS_MISSING`, install the required packages before running any code:

import subprocess, sys
subprocess.check_call([sys.executable, "-m", "pip", "install", "-q", "yfinance", "pandas", "numpy"])

If all dependencies are already installed, skip the install step and proceed directly.
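The check can also be narrowed so that only the missing packages get installed. The following is a minimal sketch, not part of the skill: `missing_packages` and `ensure_installed` are hypothetical helper names, and the sketch assumes each package's import name matches its pip name (true for the three packages listed above).

```python
import importlib.util
import subprocess
import sys

REQUIRED = ("yfinance", "pandas", "numpy")  # same packages as the check above

def missing_packages(names=REQUIRED):
    """Return the names that cannot be imported in the current interpreter."""
    return [n for n in names if importlib.util.find_spec(n) is None]

def ensure_installed(names=REQUIRED):
    """Install only what is missing; does nothing when everything is present."""
    missing = missing_packages(names)
    if missing:
        subprocess.check_call([sys.executable, "-m", "pip", "install", "-q", *missing])
    return missing
```

Calling `ensure_installed()` returns the list of packages it had to install, which is empty when the environment was already complete.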


Step 2: Route to the Correct Sub-Skill

Classify the user's request and jump to the matching sub-skill section below.

| User Request | Route To | Examples |
|---|---|---|
| Single ticker, wants to find related stocks | Sub-Skill A: Co-movement Discovery | "what correlates with NVDA", "find stocks related to AMD", "sympathy plays for TSLA" |
| Two or more specific tickers, wants relationship details | Sub-Skill B: Return Correlation | "correlation between AMD and NVDA", "how do LITE and COHR move together", "compare AAPL vs MSFT" |
| Group of tickers, wants structure/grouping | Sub-Skill C: Sector Clustering | "correlation matrix for FAANG", "cluster these semiconductor stocks", "sector peers for AMD" |
| Wants time-varying or conditional correlation | Sub-Skill D: Realized Correlation | "rolling correlation AMD NVDA", "when NVDA drops what else drops", "how has correlation changed" |

If ambiguous, default to Sub-Skill A (Co-movement Discovery) for single tickers, or Sub-Skill B (Return Correlation) for two tickers.
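The routing rules can be sketched as a small function. This is an illustration only: `route_request` and both keyword tuples are hypothetical, chosen to match the example phrases in the table, and real requests will need more careful intent classification.

```python
# Hypothetical keyword lists, derived from the example phrases in the routing table.
TIME_VARYING = ("rolling", "over time", "changed", "drops", "regime")
STRUCTURE = ("matrix", "cluster", "sector peers")

def route_request(text, tickers):
    """Pick a sub-skill letter (A-D) from the request text and the tickers found in it."""
    lowered = text.lower()
    if any(k in lowered for k in TIME_VARYING):
        return "D"  # time-varying or conditional correlation
    if any(k in lowered for k in STRUCTURE):
        return "C"  # structure / grouping
    if len(tickers) <= 1:
        return "A"  # single ticker: co-movement discovery (the stated default)
    return "B"      # two or more tickers: return correlation
```

Keyword checks run before the ticker count so that a request such as "rolling correlation AMD NVDA" reaches Sub-Skill D rather than falling through to the two-ticker default.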

Defaults for all sub-skills

| Parameter | Default |
|---|---|
| Lookback period | `1y` (1 year) |
| Data interval | `1d` (daily) |
| Correlation method | Pearson |
| Minimum correlation threshold | 0.60 |
| Number of results | Top 10 |
| Return type | Daily log returns |
| Rolling window | 60 trading days |

Sub-Skill A: Co-movement Discovery

Goal: Given a single ticker, find stocks that move with it.

A1: Build the peer universe

You need 15-30 candidates. Do not use hardcoded ticker lists; build the universe dynamically at runtime. See `references/sector_universes.md` for the full implementation. The approach:

  1. Screen same-industry stocks using `yf.screen()` + `yf.EquityQuery` to find stocks in the same industry as the target
  2. Broaden to sector if the industry screen returns fewer than 10 peers
  3. Add thematic/adjacent industries: read the target's `longBusinessSummary` and screen 1-2 related industries (e.g., a semiconductor company → also screen semiconductor equipment)
  4. Combine, deduplicate, remove target ticker
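Steps 2 to 4 are plain list handling and can be sketched without any network calls. In the sketch below, `build_universe` is a hypothetical helper (the reference file holds the actual implementation), and its three inputs are assumed to be lists of ticker symbols already extracted from the screener responses. The cap of 30 comes from the 15-30 candidate range stated above.

```python
def build_universe(target, industry, sector=(), thematic=(), min_industry=10, max_size=30):
    """Combine pre-fetched screen results into a deduplicated peer list."""
    target = target.upper()
    industry = [s.upper() for s in industry]
    candidates = list(industry)
    # Step 2: broaden to the sector only when the industry screen is thin
    if len({s for s in industry if s != target}) < min_industry:
        candidates += [s.upper() for s in sector]
    # Step 3: add thematic / adjacent industries
    candidates += [s.upper() for s in thematic]
    # Step 4: deduplicate (keeping first-seen order) and drop the target
    seen = {target}
    universe = []
    for symbol in candidates:
        if symbol not in seen:
            seen.add(symbol)
            universe.append(symbol)
    return universe[:max_size]
```

For example, `build_universe("NVDA", ["NVDA", "AMD", "AVGO"], sector=["MSFT", "AMD"], thematic=["ASML"])` yields `["AMD", "AVGO", "MSFT", "ASML"]`: the industry list has fewer than 10 peers, so the sector list is merged in.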

A2: Compute correlations

import yfinance as yf
import pandas as pd
import numpy as np

def discover_comovement(target_ticker, peer_tickers, period="1y"):
    all_tickers = [target_ticker] + [t for t in peer_tickers if t != target_ticker]
    data = yf.download(all_tickers, period=period, auto_adjust=True, progress=False)

    # Extract close prices — yf.download returns MultiIndex (Price, Ticker) columns
    closes = data["Close"].dropna(axis=1, thresh=max(60, len(data) // 2))

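    # Log returns
    returns = np.log(closes / closes.shift(1)).dropna()
    # The target itself can be dropped above if it has too little history
    if target_ticker not in returns.columns:
        raise ValueError(f"Not enough price history for {target_ticker}")
    corr_series = returns.corr()[target_ticker].drop(target_ticker, errors="ignore")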

    # Rank by absolute correlation
    ranked = corr_series.abs().sort_values(ascending=False)

    result = pd.DataFrame({
        "Ticker": ranked.index,
        "Correlation": [round(corr_series[t], 4) for t in ranked.index],
    })
    return result, returns

A3: Present results

Show a ranked table with company names and sectors (fetch via `yf.Ticker(t).info.get("shortName")`):

| Rank | Ticker | Company | Correlation | Why linked |
|---|---|---|---|---|
| 1 | AMD | Advanced Micro Devices | 0.82 | Same industry: GPU/CPU |
| 2 | AVGO | Broadcom | 0.78 | AI infrastructure peer |

Include:

  • Top 10 positively correlated stocks
  • Any notable negatively correlated stocks (potential hedges)
  • Brief explanation of why each might be linked (sector, supply chain, customer overlap)
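One way to separate the positive peers from the potential hedges is sketched below. `split_by_sign` is a hypothetical helper that takes a pandas Series of correlations indexed by ticker. It applies the default 0.60 threshold and Top 10 cut-off from the defaults table; applying the same threshold to negative correlations is an assumption of this sketch.

```python
import pandas as pd

def split_by_sign(corr_series, top_n=10, min_abs=0.60):
    """Split correlations into top positive peers and notable negative ones."""
    # Keep only correlations whose magnitude clears the threshold
    strong = corr_series[corr_series.abs() >= min_abs]
    positive = strong[strong > 0].sort_values(ascending=False).head(top_n)
    negative = strong[strong < 0].sort_values(ascending=True)  # most negative first
    return positive, negative
```

If no negative correlation clears the threshold, the second Series is empty, and that can be reported as "no notable hedges found".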

Sub-Skill B: Return Correlation

Goal: Deep-dive into the relationship between two (or a few) specific tickers.

B1: Download and compute

import yfinance as yf
import pandas as pd
import numpy as np

def return_correlation(ticker_a, ticker_b, period="1y"):
    data = yf.download([ticker_a, ticker_b], period=period, auto_adjust=True, progress=False)
    closes = data["Close"][[ticker_a, ticker_b]].dropna()

    returns = np.log(closes / closes.shift(1)).dropna()
    corr = returns[ticker_a].corr(returns[ticker_b])

    # Beta: how much does B move per unit move of A
    cov_matrix = returns.cov()
    beta = cov_matrix.loc[ticker_b, ticker_a] / cov_matrix.loc[ticker_a, ticker_a]

    # R-squared
    r_squared = corr ** 2

    # Rolling 60-day correlation for stability
    rolling_corr = returns[ticker_a].rolling(60).corr(returns[ticker_b])

    # Spread (log price ratio) for mean-reversion
    spread = np.log(closes[ticker_a] / closes[ticker_b])
    spread_z = (spread - spread.mean()) / spread.std()

    return {
        "correlation": round(corr, 4),
        "beta": round(beta, 4),
        "r_squared": round(r_squared, 4),
        "rolling_corr_mean": round(rolling_corr.mean(), 4),
        "rolling_corr_std": round(rolling_corr.std(), 4),
        "rolling_corr_min": round(rolling_corr.min(), 4),
        "rolling_corr_max": round(rolling_corr.max(), 4),
        "spread_z_current": round(spread_z.iloc[-1], 4),
        "observations": len(returns),
    }
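For reference, the quantities computed by `return_correlation` can be written out as follows. The notation is introduced here for illustration and does not appear in the skill: $P_{i,t}$ is the adjusted close of ticker $i$ on day $t$, $r_{i,t}$ its log return, and $\bar{s}$, $\sigma_s$ the full-sample mean and standard deviation of the spread.

```latex
\begin{aligned}
r_{i,t} &= \ln\frac{P_{i,t}}{P_{i,t-1}}, &
\rho_{AB} &= \frac{\operatorname{Cov}(r_A, r_B)}{\sigma_A\,\sigma_B}, \\
\beta_{B \mid A} &= \frac{\operatorname{Cov}(r_A, r_B)}{\sigma_A^{2}} = \rho_{AB}\,\frac{\sigma_B}{\sigma_A}, &
R^{2} &= \rho_{AB}^{2}, \\
s_t &= \ln\frac{P_{A,t}}{P_{B,t}}, &
z_t &= \frac{s_t - \bar{s}}{\sigma_s}
\end{aligned}
```

Because $\beta_{B \mid A} = \rho_{AB}\,\sigma_B/\sigma_A$, a beta above 1 alongside a correlation below 1 means B is the more volatile of the two. As a check on the arithmetic, a correlation of 0.82 gives $R^2 = 0.82^2 \approx 0.67$.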

B2: Present results

Show a summary card:

| Metric | Value |
|---|---|
| Pearson Correlation | 0.82 |
| Beta (B vs A) | 1.15 |
| R-squared | 0.67 |
| Rolling Corr (60d avg) | 0.80 |
| Rolling Corr Range | [0.55, 0.94] |
| Rolling Corr Std Dev | 0.08 |
| Spread Z-Score (current) | +1.2 |
| Observations | 250 |

Interpretation guide:

  • Correlation > 0.80: Strong co-movement — these stocks are tightly linked
  • Correlation 0.50–0.80: Moderate — shared sector drivers but independent factors too
  • Correlation < 0.50: Weak — limited co-movement despite possible sector overlap
  • High rolling std: Unstable relationship — correlation varies significantly over time
  • |Spread Z| > 2: Unusual divergence from historical relationship
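The guide above can be turned into labels with a few comparisons. This is a sketch under stated assumptions: `interpret` is a hypothetical helper, the guide does not say what counts as a "high" rolling standard deviation (the 0.15 cutoff is a placeholder), and the sketch judges strength by the magnitude of the correlation so that strongly negative pairs are not labeled weak.

```python
def interpret(correlation, rolling_std, spread_z, unstable_std=0.15):
    """Map pair metrics to the labels used in the interpretation guide."""
    strength = abs(correlation)  # assumption: judge strength by magnitude
    if strength > 0.80:
        label = "strong"
    elif strength >= 0.50:
        label = "moderate"
    else:
        label = "weak"
    notes = [f"{label} co-movement"]
    if rolling_std > unstable_std:  # assumed cutoff, not defined in the guide
        notes.append("unstable relationship")
    if abs(spread_z) > 2:
        notes.append("unusual spread divergence")
    return notes
```

With the sample summary card values, `interpret(0.82, 0.08, 1.2)` returns `["strong co-movement"]`.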

Sub-Skill C: Sector Clustering

Goal: Given a group of tickers, show the full correlation structure and identify clusters.

C1: Build the correlation matrix

import yfinance as yf
import pandas as pd
import numpy as np

def sector_clustering(tickers, period="1y"):
    data = yf.download(tickers, period=period, auto_adjust=True, progress=False)

    # yf.download returns MultiIndex (Price, Ticker) columns
    closes = data["Close"].dropna(axis=1, thresh=max(60, len(data) // 2))
    returns = np.log(closes / closes.shift(1)).dropna()
    corr_matrix = returns.corr()

    # Hierarchical clustering order
    from scipy.cluster.hierarchy import linkage, leaves_list
    from scipy.spatial.distance import squareform

    # Work on a writable copy; .values can be read-only under pandas copy-on-write
    dist = 1 - corr_matrix.abs().to_numpy(copy=True)
    np.fill_diagonal(dist, 0)
    condensed = squareform(dist, checks=False)
    # Average linkage: Ward assumes Euclidean distances, which 1 - |corr| is not
    linkage_matrix = linkage(condensed, method="average")
    order = leaves_list(linkage_matrix)
    ordered_tickers = [corr_matrix.columns[i] for i in order]

    # Reorder matrix
    clustered = corr_matrix.loc[ordered_tickers, ordered_tickers]

    return clustered, returns

Note: if `scipy` is not available, fall back to sorting by average correlation instead of hierarchical clustering.
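The fallback mentioned in the note could look like the sketch below. `order_by_average_correlation` is a hypothetical helper; it assumes a square correlation matrix with ones on the diagonal and at least two tickers.

```python
import pandas as pd

def order_by_average_correlation(corr_matrix):
    """Reorder a correlation matrix by each ticker's mean correlation to the others."""
    n = len(corr_matrix)
    # Exclude the diagonal (self-correlation of 1) from the average
    avg = (corr_matrix.sum(axis=1) - 1.0) / (n - 1)
    order = avg.sort_values(ascending=False).index
    return corr_matrix.loc[order, order]
```

This puts the most "central" tickers first and pushes likely outliers to the end, but unlike hierarchical clustering it does not place members of the same cluster next to each other.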

C2: Present results

  1. Full correlation matrix — formatted as a table. For more than 8 tickers, show as a heatmap description or highlight only the strongest/weakest pairs.

  2. Identified clusters — group tickers that have high intra-group correlation:

    • Cluster 1: [NVDA, AMD, AVGO] — avg intra-correlation 0.82
    • Cluster 2: [AAPL, MSFT] — avg intra-correlation 0.75
  3. Outliers — tickers with low average correlation to the group (potential diversifiers).

  4. Strongest pairs — top 5 highest-correlation pairs in the matrix.

  5. Weakest pairs — top 5 lowest/negative-correlation pairs (hedging candidates).
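Items 4 and 5 can be read directly off the matrix. The sketch below uses a hypothetical helper, `rank_pairs`, and assumes the input is a square pandas correlation matrix such as the one returned by `sector_clustering`.

```python
import numpy as np
import pandas as pd

def rank_pairs(corr_matrix, top_n=5):
    """Return (strongest, weakest) unique ticker pairs from a correlation matrix."""
    # Keep only the strict upper triangle so each pair appears exactly once
    mask = np.triu(np.ones(corr_matrix.shape, dtype=bool), k=1)
    pairs = corr_matrix.where(mask).stack().dropna()
    strongest = pairs.sort_values(ascending=False).head(top_n)
    weakest = pairs.sort_values(ascending=True).head(top_n)
    return strongest, weakest
```

Both results are Series indexed by `(ticker_a, ticker_b)` tuples, so they can be printed as two short tables.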


Sub-Skill D: Realized Correlation

Goal: Show how correlation changes over time and under different market conditions.

D1: Rolling correlation

import yfinance as yf
import pandas as pd
import numpy as np

def realized_correlation(ticker_a, ticker_b, period="2y", windows=(20, 60, 120)):
    data = yf.download([ticker_a, ticker_b], period=period, auto_adjust=True, progress=False)
    closes = data["Close"][[ticker_a, ticker_b]].dropna()

    returns = np.log(closes / closes.shift(1)).dropna()

    rolling = {}
    for w in windows:
        rolling[f"{w}d"] = returns[ticker_a].rolling(w).corr(returns[ticker_b])

    return rolling, returns
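The `rolling` dictionary returned above can be condensed into one row of summary statistics per window. This is a sketch with a hypothetical helper name, `summarize_rolling`; it assumes every series has at least one non-NaN value, meaning the lookback is longer than the largest window.

```python
import pandas as pd

def summarize_rolling(rolling):
    """Build a Current/Mean/Min/Max/Std table from {window_label: Series}."""
    rows = {}
    for label, series in rolling.items():
        clean = series.dropna()  # the first w-1 values of each window are NaN
        rows[label] = {
            "Current": clean.iloc[-1],
            "Mean": clean.mean(),
            "Min": clean.min(),
            "Max": clean.max(),
            "Std": clean.std(),
        }
    return pd.DataFrame.from_dict(rows, orient="index").round(4)
```

Typical use is `summarize_rolling(rolling)` right after `rolling, returns = realized_correlation("AMD", "NVDA")`.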

D2: Regime-conditional correlation

import pandas as pd

def regime_correlation(returns, ticker_a, ticker_b, condition_ticker=None):
    """Compare correlation across up/down/volatile regimes."""
    if condition_ticker is None:
        condition_ticker = ticker_a

    ret = returns[condition_ticker]

    regimes = {
        "All Days": pd.Series(True, index=returns.index),
        "Up Days (target > 0)": ret > 0,
        "Down Days (target < 0)": ret < 0,
        "High Vol (top 25%)": ret.abs() > ret.abs().quantile(0.75),
        "Low Vol (bottom 25%)": ret.abs() < ret.abs().quantile(0.25),
        "Large Drawdown (< -2%)": ret < -0.02,
    }

    results = {}
    for name, mask in regimes.items():
        subset = returns[mask]
        if len(subset) >= 20:
            results[name] = {
                "correlation": round(subset[ticker_a].corr(subset[ticker_b]), 4),
                "days": int(mask.sum()),
            }

    return results

D3: Present results

  1. Rolling correlation summary table:

| Window | Current | Mean | Min | Max | Std |
|---|---|---|---|---|---|
| 20-day | 0.88 | 0.76 | 0.32 | 0.95 | 0.12 |
| 60-day | 0.82 | 0.78 | 0.55 | 0.92 | 0.08 |
| 120-day | 0.80 | 0.79 | 0.68 | 0.88 | 0.05 |

  2. Regime correlation table:

| Regime | Correlation | Days |
|---|---|---|
| All Days | 0.82 | 250 |
| Up Days | 0.75 | 132 |
| Down Days | 0.87 | 118 |
| High Vol (top 25%) | 0.90 | 63 |
| Large Drawdown (< -2%) | 0.93 | 28 |

  3. Key insight: Highlight whether correlation increases during sell-offs (very common: "correlations go to 1 in a crisis"). This is critical for risk management.

  4. Trend: Is correlation trending higher or lower recently vs. its historical average?
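The sell-off check can be reduced to one number: down-day correlation minus up-day correlation. The sketch below uses a hypothetical helper, `selloff_uplift`, and reads the dictionary returned by `regime_correlation`. Its default keys match the regime names defined in that function.

```python
def selloff_uplift(results,
                   up_key="Up Days (target > 0)",
                   down_key="Down Days (target < 0)"):
    """Return down-day minus up-day correlation, or None if a regime is missing."""
    if up_key not in results or down_key not in results:
        return None  # regime_correlation skips regimes with fewer than 20 days
    return round(results[down_key]["correlation"] - results[up_key]["correlation"], 4)
```

A positive value means the pair moves together more tightly on down days than on up days. How large the gap must be to count as meaningful is left to judgment and depends on the number of days in each regime.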


Step 3: Respond to the User

After running the appropriate sub-skill, present results clearly:

Always include

  • The lookback period and data interval used
  • The number of observations (trading days)
  • Any tickers dropped due to insufficient data

Always caveat

  • Correlation is not causation — co-movement does not imply a causal link
  • Past correlation does not guarantee future correlation — regimes shift
  • Short lookback windows produce noisy estimates; longer windows smooth but may miss regime changes
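How noisy a short window is can be illustrated with the Fisher z-transform, a standard approximation for the confidence interval of a Pearson correlation. This sketch is not part of the skill: `correlation_interval` is a hypothetical helper, and the approximation assumes independent, roughly bivariate-normal returns. Real return series violate that assumption, so treat the interval as a rough guide only.

```python
import math

def correlation_interval(r, n, z_crit=1.96):
    """Approximate 95% confidence interval for a Pearson correlation (Fisher z)."""
    if n <= 3 or not -1 < r < 1:
        raise ValueError("need n > 3 and -1 < r < 1")
    z = math.atanh(r)                       # Fisher transform of r
    half_width = z_crit / math.sqrt(n - 3)  # standard error is 1 / sqrt(n - 3)
    return math.tanh(z - half_width), math.tanh(z + half_width)
```

For a measured correlation of 0.80, a 20-day window gives an interval of about 0.55 to 0.92, while 250 observations narrow it to about 0.75 to 0.84.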

Practical applications (mention when relevant)

  • Sympathy plays: Stocks likely to follow a peer's earnings/news move
  • Pair trading: High-correlation pairs where the spread has diverged from its mean
  • Portfolio diversification: Finding low-correlation assets to reduce risk
  • Hedging: Identifying inversely correlated instruments
  • Sector rotation: Understanding which sectors move together
  • Risk management: Correlation spikes during stress — diversification may fail when needed most

Important: Never recommend specific trades. Present data and let the user draw conclusions.


Reference Files

  • `references/sector_universes.md`: Dynamic peer universe construction using yfinance Screener API

Read the reference file when you need to build a peer universe for a given ticker.