Awesome-omni-skill prediction-tracking

Track and evaluate AI predictions over time to assess accuracy. Use when reviewing past predictions to determine if they came true, failed, or remain uncertain.

Install

Source · Clone the upstream repo:

git clone https://github.com/diegosouzapw/awesome-omni-skill

Claude Code · Install into ~/.claude/skills/:

T=$(mktemp -d) && git clone --depth=1 https://github.com/diegosouzapw/awesome-omni-skill "$T" && mkdir -p ~/.claude/skills && cp -r "$T/skills/development/prediction-tracking-rickoslyder" ~/.claude/skills/diegosouzapw-awesome-omni-skill-prediction-tracking-1d8fdd && rm -rf "$T"

Manifest: skills/development/prediction-tracking-rickoslyder/SKILL.md

Source content

Prediction Tracking Skill

Track predictions made by AI researchers and critics, and evaluate their accuracy over time.

Prediction Recording

When recording a new prediction, capture the fields below; a typed sketch follows the two lists.

Required Fields

  • text: The prediction as stated
  • author: Who made it
  • madeAt: When it was made
  • timeframe: When they expect it to happen
  • topic: What area of AI
  • confidence: How confident they seemed

Optional Fields

  • sourceUrl: Where the prediction was made
  • targetDate: Specific date if mentioned
  • conditions: Any caveats or conditions
  • metrics: How to measure success
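
These fields fit naturally in a typed record. A minimal sketch in TypeScript, assuming ISO-8601 strings for dates; the interface name and the Confidence union are illustrative choices, not part of the skill:

```typescript
// Sketch of a prediction record. Field names follow the lists above;
// the Confidence union and ISO-8601 date strings are assumptions.
type Confidence = "low" | "medium" | "high" | "certain";

interface Prediction {
  // Required
  text: string;        // the prediction as stated
  author: string;      // who made it
  madeAt: string;      // ISO-8601 timestamp of when it was made
  timeframe: string;   // when they expect it to happen, e.g. "by end of 2026"
  topic: string;       // area of AI, e.g. "reasoning", "agents"
  confidence: Confidence;

  // Optional
  sourceUrl?: string;  // where the prediction was made
  targetDate?: string; // specific date if mentioned
  conditions?: string; // caveats or conditions
  metrics?: string;    // how to measure success
}
```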

Evaluation Status

When evaluating predictions, assign one of the statuses below; a type sketch follows the definitions.

verified

Clearly came true as stated.

  • The predicted capability/event occurred
  • Within the stated timeframe
  • Substantially as described

falsified

Clearly did not come true.

  • Timeframe passed without occurrence
  • Contradictory evidence emerged
  • Author retracted or modified claim

partially-verified

Partially accurate.

  • Some aspects came true, others didn't
  • Capability exists but weaker than claimed
  • Timeframe was off but direction correct

too-early

Not enough time has passed.

  • Still within stated timeframe
  • No definitive evidence either way

unfalsifiable

Cannot be objectively assessed.

  • Too vague to measure
  • No clear success criteria
  • Moved goalposts

ambiguous

Can be read in more than one way, so no single claim can be scored.

  • Multiple interpretations possible
  • Success criteria unclear
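
The six statuses form a closed set, so a string union keeps recorded evaluations consistent. A minimal sketch in TypeScript (the type name is an assumption):

```typescript
// The six evaluation statuses defined above, as a closed union so a
// type checker rejects anything outside this set.
type EvaluationStatus =
  | "verified"
  | "falsified"
  | "partially-verified"
  | "too-early"
  | "unfalsifiable"
  | "ambiguous";
```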

Evaluation Process

For each prediction being evaluated, work through the following steps; a helper sketch follows step 6.

1. Restate the prediction

What exactly was claimed?

2. Identify timeframe

Has enough time passed to evaluate?

3. Gather evidence

What has happened since?

  • Relevant releases or announcements
  • Benchmark results
  • Real-world deployments
  • Counter-evidence

4. Assess status

Which evaluation status applies?

5. Score accuracy

If verifiable, rate 0.0-1.0:

  • 1.0: Exactly as predicted
  • 0.7-0.9: Substantially correct
  • 0.4-0.6: Partially correct
  • 0.1-0.3: Mostly wrong
  • 0.0: Completely wrong

6. Note lessons

What does this tell us about:

  • The author's forecasting ability
  • The topic's predictability
  • Common prediction pitfalls
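
A sketch of how steps 1-6 could assemble an evaluation record in the output format below. The function name, signature, and range check are assumptions; the judgment itself (status, score, evidence) still comes from whoever performs the evaluation:

```typescript
// Evaluation record matching the output format shown below.
// EvaluationStatus is the union type sketched earlier.
interface Evaluation {
  predictionId: string;
  status: EvaluationStatus;
  accuracyScore?: number;   // 0.0-1.0, only when the prediction is verifiable
  evidence: string;
  notes?: string;
  evaluatedAt: string;      // ISO-8601 timestamp
}

// Illustrative helper: validates the score and assembles the record.
function recordEvaluation(
  predictionId: string,
  status: EvaluationStatus,
  evidence: string,
  accuracyScore?: number,
  notes?: string
): Evaluation {
  if (accuracyScore !== undefined && (accuracyScore < 0 || accuracyScore > 1)) {
    throw new Error("accuracyScore must be in [0.0, 1.0]");
  }
  return {
    predictionId,
    status,
    accuracyScore,
    evidence,
    notes,
    evaluatedAt: new Date().toISOString(),
  };
}
```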

Output Format

For evaluation:

{
  "evaluations": [
    {
      "predictionId": "id",
      "status": "verified",
      "accuracyScore": 0.85,
      "evidence": "Description of evidence",
      "notes": "Additional context",
      "evaluatedAt": "timestamp"
    }
  ]
}

For accuracy statistics (a sketch of the aggregation follows the example):

{
  "author": "Author name",
  "totalPredictions": 15,
  "verified": 5,
  "falsified": 3,
  "partiallyVerified": 2,
  "pending": 4,
  "unfalsifiable": 1,
  "averageAccuracy": 0.62,
  "topicBreakdown": {
    "reasoning": { "predictions": 5, "accuracy": 0.7 },
    "agents": { "predictions": 3, "accuracy": 0.4 }
  },
  "calibration": "Assessment of how well-calibrated they are"
}
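
One plausible way to derive these statistics from a batch of evaluations, assuming averageAccuracy is the mean of all numeric scores and pending maps to too-early; the aggregation logic is an assumption, not specified by the skill:

```typescript
// Illustrative aggregation over evaluations (Evaluation and
// EvaluationStatus are the types sketched earlier), each joined
// with its prediction's topic.
function accuracyStats(author: string, evals: (Evaluation & { topic: string })[]) {
  const count = (s: EvaluationStatus) => evals.filter(e => e.status === s).length;
  const mean = (xs: number[]) =>
    xs.length ? xs.reduce((a, b) => a + b, 0) / xs.length : 0;
  const scored = evals.filter(e => e.accuracyScore !== undefined);

  // Per-topic breakdown: prediction count and mean score within each topic.
  const topicBreakdown: Record<string, { predictions: number; accuracy: number }> = {};
  for (const e of evals) {
    if (!(e.topic in topicBreakdown)) {
      const inTopic = evals.filter(x => x.topic === e.topic);
      const inTopicScored = inTopic.filter(x => x.accuracyScore !== undefined);
      topicBreakdown[e.topic] = {
        predictions: inTopic.length,
        accuracy: mean(inTopicScored.map(x => x.accuracyScore!)),
      };
    }
  }

  return {
    author,
    totalPredictions: evals.length,
    verified: count("verified"),
    falsified: count("falsified"),
    partiallyVerified: count("partially-verified"),
    pending: count("too-early"),        // too-early predictions are still pending
    unfalsifiable: count("unfalsifiable"),
    averageAccuracy: mean(scored.map(e => e.accuracyScore!)),
    topicBreakdown,
  };
}
```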

Calibration Assessment

Evaluate whether predictors are well-calibrated; a measurement sketch follows the categories.

Well-Calibrated

  • High-confidence predictions usually come true
  • Low-confidence predictions have mixed results
  • Acknowledges uncertainty appropriately

Overconfident

  • High-confidence predictions often fail
  • Rarely expresses uncertainty
  • Doesn't update on evidence

Underconfident

  • Low-confidence predictions often come true
  • Hedges even on likely outcomes
  • Too conservative

Inconsistent

  • Confidence doesn't correlate with accuracy
  • Random relationship between stated and actual accuracy
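
To make the calibration call less impressionistic, bucket resolved predictions by stated confidence and compare hit rates. A rough sketch; the thresholds are arbitrary illustrative choices:

```typescript
// Illustrative calibration check: compare the hit rate of high-confidence
// resolved predictions against low-confidence ones. Thresholds are assumptions.
function assessCalibration(
  resolved: { confidence: "low" | "medium" | "high" | "certain"; hit: boolean }[]
): "well-calibrated" | "overconfident" | "underconfident" | "inconsistent" {
  const rate = (xs: { hit: boolean }[]) =>
    xs.length ? xs.filter(x => x.hit).length / xs.length : NaN;

  const high = rate(
    resolved.filter(p => p.confidence === "high" || p.confidence === "certain")
  );
  const low = rate(resolved.filter(p => p.confidence === "low"));

  // Treat sparse data (an empty bucket) as inconclusive.
  if (Number.isNaN(high) || Number.isNaN(low)) return "inconsistent";
  if (high >= 0.7 && low < high) return "well-calibrated"; // confident claims mostly land
  if (high < 0.5) return "overconfident";                  // confident claims often fail
  if (low >= 0.7) return "underconfident";                 // hedged claims mostly land
  return "inconsistent";
}
```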

Tracking Notable Predictors

Keep running assessments of key voices:

| Predictor | Total | Accuracy | Calibration | Notes |
| --- | --- | --- | --- | --- |
| Sam Altman | 20 | 55% | Overconfident | Timeline optimism |
| Gary Marcus | 15 | 70% | Well-calibrated | Conservative |
| Dario Amodei | 12 | 65% | Slightly overconfident | Safety-focused |

Red Flags

Watch for prediction patterns that suggest bias:

  • Always bullish regardless of topic
  • Never acknowledges failed predictions
  • Moves goalposts when wrong
  • Predictions align suspiciously with financial interests
  • Vague enough to claim credit for anything