cc-skills · sharpe-ratio-non-iid-corrections

Install

Source · Clone the upstream repo:

```bash
git clone https://github.com/terrylica/cc-skills
```

Claude Code · Install into ~/.claude/skills/:

```bash
T=$(mktemp -d) && git clone --depth=1 https://github.com/terrylica/cc-skills "$T" && mkdir -p ~/.claude/skills && cp -r "$T/plugins/quant-research/skills/sharpe-ratio-non-iid-corrections" ~/.claude/skills/terrylica-cc-skills-sharpe-ratio-non-iid-corrections && rm -rf "$T"
```

Manifest: plugins/quant-research/skills/sharpe-ratio-non-iid-corrections/SKILL.md

Sharpe Ratio Non-IID Corrections

Generalized Sharpe ratio inference framework for non-Normal, serially correlated returns. Reference implementation of López de Prado, Lipton & Zoonekynd (2026).

Self-Evolving Skill: This skill improves through use. If instructions are wrong, parameters drifted, or a workaround was needed — fix this file immediately, don't defer. Only update for real, reproducible issues.

Quick Start

```bash
# Run all 18 validation tests (paper Exhibit 1 + numerical examples)
uv run --python 3.13 --with numpy --with scipy --with numba \
  python plugins/quant-research/skills/sharpe-ratio-non-iid-corrections/references/sharpe_numba.py
```

Paper Metadata

| Field | Value |
|---|---|
| Title | How to Use the Sharpe Ratio |
| Authors | Marcos López de Prado, Alexander Lipton, Vincent Zoonekynd |
| Series | ADIA Lab Research Paper Series, No. 19 |
| Date | March 7, 2026 |
| SSRN | 5520741 |
| Replication | https://github.com/zoonek/2025-sharpe-ratio |
| Source PDF | lopez-de-prado-sharpe-2026.pdf |

References

| Topic | Reference File |
|---|---|
| Full Paper Transcription (82 equations) | how-to-use-the-sharpe-ratio-2026.md |
| Equation → Implementation Tracker | sharpe-paper-tracker.md |
| Numba JIT Implementation (850+ lines) | sharpe_numba.py |
| Source PDF | lopez-de-prado-sharpe-2026.pdf |
| Paper Figures | media/ (8 figures from pp. 12-45) |

Preflight: Experimentation Context Assessment (CRITICAL)

Before computing DSR or SFDR, Claude Code MUST assess the experimentation context to determine the correct parameters. These inputs require subjective judgment that cannot be derived from code alone.

Decision: DSR vs SFDR

| Situation | Use | Why |
|---|---|---|
| You have return series for all K strategies tested | DSR | K_eff can be computed from the eigenvalues of the correlation matrix (see the sketch below) |
| Strategy is the survivor of a long, unrecoverable experimentation history | SFDR | K is unknowable; SFDR sets a threshold without needing K |
| Evaluating a single strategy with no selection involved | PSR | No multiple-testing correction needed (K=1) |

Default for this project: Use SFDR. Our strategies are survivors of hundreds of regime explorations across WFO configurations, feature sets, model architectures, and metric cutoffs. The full experimentation history is unrecoverable.
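For the DSR branch, K_eff summarizes how many effectively independent trials the K correlated strategies represent. The sketch below uses the participation ratio of the correlation-matrix eigenvalues, one common effective-rank estimator chosen here for illustration; the paper's exact K_eff definition should be taken from sharpe_numba.py and Eqs 29-31.

```python
# Illustrative sketch only: the participation-ratio estimator of the effective
# number of trials. Verify against the paper's own K_eff definition before use.
import numpy as np

def k_eff_participation_ratio(returns: np.ndarray) -> float:
    """returns: (T, K) matrix holding the K tested strategies' return series."""
    corr = np.corrcoef(returns, rowvar=False)   # K x K correlation matrix
    eigvals = np.linalg.eigvalsh(corr)          # real eigenvalues
    # Perfectly correlated trials -> 1; fully independent trials -> K.
    return float(eigvals.sum() ** 2 / (eigvals ** 2).sum())

rng = np.random.default_rng(0)
base = rng.normal(size=(24, 1))
trials = base + 0.5 * rng.normal(size=(24, 10))  # 10 correlated variants
print(k_eff_participation_ratio(trials))         # well below K = 10
```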

Parameters Requiring Judgment

Before calling `sfdr_threshold()`, assess each parameter by examining the project context:

| Parameter | What to assess | How to assess it |
|---|---|---|
| P[H₁] (prior: fraction of genuine strategies) | "Out of all ideas ever tried in this project, what fraction were genuinely profitable?" | Check git history, experiment logs, the signal-archaeology skill, WFO result archives. Conservative default: 0.05 (1 in 20 ideas is real). |
| SR₁ (alternative: what a real strategy's Sharpe looks like) | "When a strategy IS real, what Sharpe does it typically achieve?" | Look at validated production strategies and published benchmarks for the asset class. Crypto range bars: 0.3–0.5 monthly. |
| q (false discovery tolerance) | "What false positive rate can I tolerate?" | Research: 0.05. Production capital allocation: 0.01. Exploratory screening: 0.10. |
| γ₃, γ₄, ρ (return distribution shape) | Skewness, Pearson kurtosis, lag-1 autocorrelation | Compute directly from the return series under evaluation (see the sketch below). These are objective; no judgment needed. |
| T (sample length) | Number of return observations | Count from the data. Objective. |
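The objective row of the table is mechanical. A minimal sketch, assuming the returns sit in a 1-D array and using the Pearson kurtosis convention (γ₄ = 3 for Gaussian) that this skill's formulas expect:

```python
# Compute the objective inputs (gamma3, gamma4, rho, T) from a return series.
import numpy as np
from scipy import stats

def objective_params(r: np.ndarray) -> dict:
    rho = np.corrcoef(r[:-1], r[1:])[0, 1]           # lag-1 autocorrelation
    return {
        "gamma3": float(stats.skew(r)),              # skewness
        "gamma4": float(stats.kurtosis(r, fisher=False)),  # Pearson, not excess
        "rho": float(rho),
        "T": len(r),
    }

r = np.random.default_rng(1).standard_t(df=4, size=24)  # fat-tailed toy series
print(objective_params(r))
```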

Preflight Checklist

When this skill is invoked, Claude Code should:

  1. Identify the evaluation context: Is this a single new strategy, a WFO fold comparison, or a survivor from extensive search?
  2. Determine DSR vs SFDR: If return series for all candidates exist → DSR with K_eff. Otherwise → SFDR.
  3. Elicit P[H₁]: Search for experimentation history (git log, experiment catalogues, the signal-archaeology skill). Count approximate ideas tried vs ideas that worked. If unknowable, use 0.05.
  4. Estimate SR₁: From validated production strategies or asset-class benchmarks.
  5. Set q: Based on the decision's consequence (research paper vs capital deployment).
  6. Compute objective parameters: γ₃, γ₄, ρ, T directly from the return series.
  7. Report the assessment: State all parameter choices and reasoning before computing, so the user can override.

Example Preflight Output

```text
SFDR Preflight Assessment:
  Context:    Survivor of ~300 WFO configs × 5 feature sets × 3 architectures
  Method:     SFDR (experimentation history unrecoverable)
  P[H₁]:      0.05 (conservative — ~15 genuine signals from ~300 ideas)
  SR₁:        0.4 (typical monthly Sharpe for validated crypto range bar strategies)
  q:          0.05 (research-grade threshold)
  γ₃:         -2.448 (computed from return series)
  γ₄:         10.164 (Pearson kurtosis, computed from return series)
  ρ:          0.20 (lag-1 autocorrelation, computed from return series)
  T:          24 months
  → SFDR threshold: SR_c = 0.760
  → Observed SR:    0.456
  → VERDICT: FAIL (observed SR below SFDR threshold)
```

The user may override any parameter. If they disagree with P[H₁] or SR₁, recompute with their values.
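A recomputation template for such overrides. The keyword names below are assumptions made for illustration; consult references/sharpe_numba.py for `sfdr_threshold()`'s actual signature before wiring this up.

```python
# Hypothetical parameter bundle mirroring the preflight report above.
# Swap in the user's overridden values, then re-run the threshold search.
params = dict(
    p_h1=0.05,      # prior P[H1]; replace with the user's value on override
    sr1=0.4,        # alternative-hypothesis Sharpe
    q=0.05,         # false discovery tolerance
    gamma3=-2.448,  # skewness (objective, from returns)
    gamma4=10.164,  # Pearson kurtosis (objective, from returns)
    rho=0.20,       # lag-1 autocorrelation (objective, from returns)
    T=24,           # number of return observations
)
# from sharpe_numba import sfdr_threshold   # illustrative import path
# sr_c = sfdr_threshold(**params)           # then compare observed SR to sr_c
```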

Key Formulas

All equations use the paper's non-IID variance (Eq 2-3) with Pearson kurtosis convention (γ₄=3 for Gaussian).

| Eq | Name | Formula | Function |
|---|---|---|---|
| 2-3 | SR Variance | V[SR̂] = (1/T)·(a − b·γ₃·SR + c·(γ₄−1)/4·SR²) | `sr_variance()` |
| 9 | PSR | Φ((SR̂ − SR₀) / σ[SR₀]) | `psr()` |
| 11 | MinTRL | (a − b·γ₃·SR₀ + c·(γ₄−1)/4·SR₀²) · (z_{1−α}/(SR̂ − SR₀))² | `min_trl()` |
| 13 | Critical SR | SR₀ + σ[SR₀]·z_{1−α} | `critical_sr()` |
| 15 | Power | 1 − Φ((SR_c − SR₁) / σ[SR₁]) | `power()` |
| 17 | β (Type II) | Φ((z_{1−α}·√(a(ρ)) − SR₁·√T) / √(a − b·γ₃·SR₁ + c·(γ₄−1)/4·SR₁²)) | `1 − power()` |
| 21 | pFDR | (1 + (1−β)·P[H₁]/(α·P[H₀]))⁻¹ | `pfdr()` |
| 24 | oFDR | p·P[H₀] / (p·P[H₀] + (1 − Φ[z*(SR₁)])·P[H₁]) | `ofdr()` |
| 28 | E[max SR] | SR₀ + √V · ((1−γ)·Φ⁻¹[1 − 1/K] + γ·Φ⁻¹[1 − 1/(Ke)]) | `expected_max_sr()` |
| 29-31 | DSR | PSR with SR₀ = E[max{SR̂_k}], σ = √V[max{SR̂_k}] | `dsr()` |
| 32-33 | SFDR | Find SR_c such that pFDR(SR_c) = q | `sfdr_threshold()` |

Where a = (1+ρ)/(1−ρ), b = (1+ρ+ρ²)/(1−ρ²), and c = (1+ρ²)/(1−ρ²) are the AR(1) variance coefficients (`ar1_variance_coeffs()`).
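The table transcribes directly into code. A standalone sketch of Eqs 2-3 and 9 using the coefficients above, under the Pearson kurtosis convention; the reference implementation is sharpe_numba.py, this is just the arithmetic:

```python
import math

def ar1_variance_coeffs(rho: float) -> tuple[float, float, float]:
    """AR(1) variance coefficients a, b, c used by Eqs 2-3."""
    a = (1 + rho) / (1 - rho)
    b = (1 + rho + rho**2) / (1 - rho**2)
    c = (1 + rho**2) / (1 - rho**2)
    return a, b, c

def sr_variance(sr, T, gamma3, gamma4, rho):
    """Eq 2-3: V[SR_hat] for non-Normal, AR(1)-correlated returns."""
    a, b, c = ar1_variance_coeffs(rho)
    return (a - b * gamma3 * sr + c * (gamma4 - 1) / 4 * sr**2) / T

def psr(sr_hat, sr0, T, gamma3, gamma4, rho):
    """Eq 9: P[SR > SR0], with the variance evaluated at the benchmark SR0."""
    sigma0 = math.sqrt(sr_variance(sr0, T, gamma3, gamma4, rho))
    z = (sr_hat - sr0) / sigma0
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))  # standard normal CDF
```

Setting ρ=0, γ₃=0, γ₄=3 collapses `sr_variance` to the familiar IID-Gaussian V[SR̂] = (1 + SR²/2)/T.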

Numerical Example (Paper Exhibit 1)

Hedge fund: T=24 months, γ₃=−2.448, γ₄=10.164 (Pearson), SR=0.036/0.079≈0.456, ρ=0.2.

| Quantity | Equation | Value | Notes |
|---|---|---|---|
| σ[SR̂] (non-Gaussian) | Eq 3 | 0.379 | vs 0.214 Gaussian: 77% wider |
| PSR (SR₀=0) | Eq 9 | 0.966 | Still significant despite the wider CI |
| PSR (SR₀=0.1) | Eqs 9+5 | 0.900 | Harder benchmark reduces confidence |
| MinTRL (SR₀=0) | Eq 11 | 19.543 months | T=24 > 19.5 → sufficient |
| MinTRL (SR₀=0.1) | Eq 11 | 39.369 months | More than doubles as SR₀ approaches SR̂ |
| β (Type II error, SR₁=0.5) | Eq 17 | 0.411 | vs 0.224 IID Normal: 84% higher |
| pFDR (P[H₁]=0.1) | Eq 21 | 0.433 | 43.3% of discoveries are false when true strategies are rare |
| oFDR (SR₁=0.5) | Eq 24 | 0.361 | Even a 3.4% p-value → 36% observed FDR |
| Power (1−β) | Eq 15 | 0.589 | Low for a 24-month track record |
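Several rows can be verified by hand. Continuing the Key Formulas sketch above (reusing `ar1_variance_coeffs`, `sr_variance`, and `psr` as defined there):

```python
import math

# Exhibit 1 inputs, exactly as stated above.
sr_hat = 0.036 / 0.079                     # ≈ 0.456
T, g3, g4, rho = 24, -2.448, 10.164, 0.20
z95 = 1.64485                              # z_{1−α} for α = 0.05

sigma = math.sqrt(sr_variance(sr_hat, T, g3, g4, rho))
print(f"sigma[SR_hat] = {sigma:.3f}")                              # 0.379
print(f"PSR(SR0=0)    = {psr(sr_hat, 0.0, T, g3, g4, rho):.3f}")   # 0.966

# Eq 11 with SR0 = 0: only the a(rho) term of the variance survives.
a, b, c = ar1_variance_coeffs(rho)
min_trl = a * (z95 / sr_hat) ** 2
print(f"MinTRL(SR0=0) = {min_trl:.3f} months")                     # 19.543
```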

Implementation Architecture

6-tier dependency hierarchy (Numba JIT, Rust-ready):

```text
Tier 0: norm_cdf, norm_ppf, erfinv, Brent's method
Tier 1: sr_variance (Eqs 3/58)
Tier 2: psr (Eq 9), min_trl (Eq 11), critical_sr (Eq 13)
Tier 3: power (Eq 15), moments_mk (Eqs 62-65), expected_max_sr (Eq 28), var_max_sr
Tier 4: pfdr (Eqs 18-21), ofdr (Eqs 22-24), fwer (Eq 25)
Tier 5: dsr (Eqs 29-31), sfdr_threshold (Eqs 32-33)  ← APEX
```
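Not the repo's code, just an illustration of why the tiering matters for Numba: each layer calls only jitted functions beneath it, so the apex functions compile end to end in nopython mode. A minimal Tier 0 → Tier 2 chain, assuming the same formulas as above:

```python
import math
from numba import njit

@njit(cache=True)
def norm_cdf(x):                                    # Tier 0 primitive
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

@njit(cache=True)
def sr_variance_jit(sr, T, gamma3, gamma4, rho):    # Tier 1 (Eq 3)
    a = (1.0 + rho) / (1.0 - rho)
    b = (1.0 + rho + rho * rho) / (1.0 - rho * rho)
    c = (1.0 + rho * rho) / (1.0 - rho * rho)
    return (a - b * gamma3 * sr + c * (gamma4 - 1.0) / 4.0 * sr * sr) / T

@njit(cache=True)
def psr_jit(sr_hat, sr0, T, gamma3, gamma4, rho):   # Tier 2 (Eq 9)
    sigma0 = math.sqrt(sr_variance_jit(sr0, T, gamma3, gamma4, rho))
    return norm_cdf((sr_hat - sr0) / sigma0)
```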

Related Skills

| Skill | Relationship |
|---|---|
| opendeviation-eval-metrics | Consumes PSR, DSR, MinTRL for range bar evaluation; has quick-ref formulas (ρ=0 case) |
| adaptive-wfo-epoch | Uses DSR for WFE validation across walk-forward folds |
| evolutionary-metric-ranking | Uses DSR as one of the metrics in multi-objective ranking |

Post-Execution Reflection

After this skill completes, check the following before closing:

  1. Did the command succeed? — If not, fix the instruction or error table that caused the failure.
  2. Did parameters or output change? — If the underlying tool's interface drifted, update Usage examples and Parameters table to match.
  3. Was a workaround needed? — If you had to improvise (different flags, extra steps), update this SKILL.md so the next invocation doesn't need the same workaround.

Only update if the issue is real and reproducible — not speculative.