Claude-skills autopredict
Wrap the howdymary/autopredict Polymarket trading-agent repo. Use when you need to scan live Polymarket markets, inspect structural event mispricing, evaluate a market with your own fair probability, run reproducible backtests against a JSON dataset, tune strategy parameters safely, or review the repo's paper/live trading scaffolds and failure modes.
git clone https://github.com/ckorhonen/claude-skills
T=$(mktemp -d) && git clone --depth=1 https://github.com/ckorhonen/claude-skills "$T" && mkdir -p ~/.claude/skills && cp -r "$T/skills/autopredict" ~/.claude/skills/ckorhonen-claude-skills-autopredict && rm -rf "$T"
skills/autopredict/SKILL.md

AutoPredict
Quick Start — Simple Examples
New to AutoPredict? Start here before reading the full docs.
1. Scan what's trending on Polymarket right now
python3 predict.py --top 10
Shows the 10 most active markets with spreads, depth, and overround signals.
2. Show me the 5 most liquid markets
python3 predict.py --top 5 --verbose
Lists markets sorted by liquidity with full execution details.
3. Browse multi-outcome events for structural mispricing
python3 predict.py --events --top 10
Checks whether event probabilities sum to more or less than 100%.
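The overround check boils down to simple arithmetic, sketched here with made-up prices (illustrative only, not the repo's implementation):

```python
def overround(outcome_prices):
    """Sum of implied probabilities minus 1.0.

    Positive -> overround: the book sums above 100%, so buying every outcome loses.
    Negative -> underround: the book sums below 100%, a structural clue worth a look.
    """
    return sum(outcome_prices) - 1.0

# Hypothetical three-outcome event quoted at 41c, 35c, and 27c:
ov = overround([0.41, 0.35, 0.27])  # about 0.03, i.e. a 3% overround
```

Underround is only a clue, not free money: fees, fills, and resolution risk can consume a few percent easily.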
4. Evaluate a specific market with your own fair probability
python3 predict.py --fair 0.55 <condition_id>
Replace `<condition_id>` with the Polymarket ID. Provide your own fair probability estimate and AutoPredict evaluates the trade.
Run `python3 predict.py --help` for all flags. No credentials required for live reads.
AutoPredict is an execution framework for prediction-market trading. It is not a forecasting model.
- You provide `fair_prob`.
- The repo evaluates execution quality: side, order type, size, spread, depth, slippage, and risk.
- Live market reads require internet but no credentials.
- Real trading is scaffolded, not production-ready.
This skill was audited against the upstream repository layout and command surface, not just the README.
What Is Real vs Scaffold
Reliable entry points
- `python3 predict.py` scans live Polymarket markets.
- `python3 predict.py --events` inspects multi-outcome event overround / underround.
- `python3 predict.py --fair 0.60 <condition_id>` evaluates one market using your explicit probability.
- `python3 -m autopredict.cli backtest --dataset ...` runs an offline backtest.
- `python3 -m autopredict.cli score-latest` prints the most recent saved metrics JSON.
Partially implemented or scaffold-only
- `python3 -m autopredict.cli learn analyze` only works if you already have JSONL trade logs. Plain CLI backtests do not create those logs.
- `python3 -m autopredict.cli learn tune` and `learn improve` are placeholders that point to a nonexistent `scripts/learn_and_improve.py`.
- `python3 -m autopredict.cli trade-live` is intentionally disabled by config.
- `scripts/run_paper.py` and `scripts/run_live.py` are deployment scaffolds.
- `run_live.py` uses a `MockVenueAdapter`, so it is not a real exchange adapter.
Repo Map
- `predict.py`: live Polymarket scanner and one-off evaluation path.
- `autopredict/cli.py`: packaged CLI used for backtest, score-latest, and learning commands.
- `run_experiment.py`: simple offline backtest harness used by `autopredict.cli backtest`.
- `strategy_configs/*.json`: strategy knobs for offline experiments.
- `autopredict/_defaults/datasets/sample_markets.json`: bundled sample dataset. Use this when you need a known-good backtest input.
- `autopredict/learning/tuner.py`: reusable grid-search API. Better than the stub CLI.
- `scripts/run_paper.py`, `scripts/run_live.py`: paper/live monitoring templates.
Decision Tree
1. Choose the workflow
If the user wants to:
- Find liquid live markets or structural event mispricing: use `predict.py` via `scripts/scan_markets.sh`.
- Evaluate one market with a known fair probability: use `predict.py --fair ...`.
- Compare strategy configs or produce reproducible metrics: use `python3 -m autopredict.cli backtest --dataset ...` via `scripts/run_backtest.sh`.
- Sweep parameters safely: use `scripts/tune_params.sh`. Do not use `python3 -m autopredict.cli learn tune`.
- Inspect trade logs: use `python3 -m autopredict.cli learn analyze --log-dir ...` only if JSONL logs already exist.
- Discuss paper/live deployment scaffolds: read `docs/DEPLOYMENT.md`, `configs/*.yaml`, and the Python runners before claiming the repo can trade live.
2. Choose the command surface
- Use `predict.py` for live reads and one-off agent evaluation.
- Use `python3 -m autopredict.cli ...` for reproducible offline backtests.
- Avoid `python3 -m autopredict.backtest.cli ...`; that submodule has brittle import behavior in the current repo state.
3. Choose the data source
- For a quick smoke test: use `autopredict/_defaults/datasets/sample_markets.json`.
- For real research: require a user-supplied dataset of historical snapshots.
- If the user has no dataset and wants strategy performance claims, stop and say the repo cannot produce a valid backtest without one.
Setup
Preferred helper:
bash skills/autopredict/scripts/setup.sh --dir /tmp/autopredict
Manual setup:
git clone https://github.com/howdymary/autopredict.git /tmp/autopredict
cd /tmp/autopredict
python3 -m pip install -e .
python3 predict.py --help
python3 -m autopredict.cli --help
After setup, keep work inside the cloned repo when invoking upstream commands.
Opinionated Workflows
Workflow A: Fast live market triage
Use this when the user wants ideas, not a PnL claim.
cd /tmp/autopredict
python3 predict.py --top 10 --verbose
python3 predict.py --events --top 10
Interpretation:
- Prefer markets with tight spreads and visible depth.
- Treat event underround as a structural clue, not automatic free money.
- Only move to trade evaluation once you can justify a `fair_prob`.
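To make "tight spreads" concrete during triage, here is a small illustrative sketch of the spread arithmetic (made-up quotes; this is generic math, not code from the repo):

```python
def spread_pct(best_bid, best_ask):
    """Quoted spread as a fraction of the mid price."""
    mid = (best_bid + best_ask) / 2
    return (best_ask - best_bid) / mid

# A 2c-wide book around 50c is 4% of mid; the same 2c around 10c is 20%.
print(round(spread_pct(0.49, 0.51), 3))  # 0.04
print(round(spread_pct(0.09, 0.11), 3))  # 0.2
```

The same absolute spread is far more costly in low-priced markets, which is why a percentage filter beats a fixed-cents one.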
Workflow B: Evaluate a single conviction
Use this when the user already has a thesis on one market.
cd /tmp/autopredict
python3 predict.py --fair 0.60 <condition_id>
Important caveat:
- `predict.py --fair` constructs `AutoPredictAgent(AgentConfig())` directly.
- That means it uses default agent parameters, not `strategy_configs/baseline.json` or your edited JSON config.
- Use it as a default-policy sanity check, not as proof that a tuned config behaves the same way.
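For intuition about what "evaluating a trade" against a fair probability means, here is an illustrative sketch of the edge arithmetic on a binary market (hypothetical quotes; this is not the repo's `AutoPredictAgent` logic):

```python
def best_edge(fair_prob, yes_ask, no_ask):
    """Compare your fair probability against both sides of a binary market.

    Buying YES at yes_ask pays $1 with probability fair_prob;
    buying NO at no_ask pays $1 with probability 1 - fair_prob.
    Returns (side, edge) for the better of the two.
    """
    yes_edge = fair_prob - yes_ask        # expected profit per $1 of YES payout
    no_edge = (1.0 - fair_prob) - no_ask  # expected profit per $1 of NO payout
    return ("YES", yes_edge) if yes_edge >= no_edge else ("NO", no_edge)

# You believe 60%, YES asked at 0.55, NO asked at 0.47:
side, edge = best_edge(0.60, yes_ask=0.55, no_ask=0.47)  # YES, edge about 0.05
```

Note this is only the pre-execution edge; the framework's point is that spread, depth, and slippage can erase it.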
Workflow C: Backtest a strategy config
Use this when the user wants reproducible metrics or config comparisons.
cd /tmp/autopredict
python3 -m autopredict.cli backtest \
  --config strategy_configs/baseline.json \
  --dataset autopredict/_defaults/datasets/sample_markets.json
python3 -m autopredict.cli score-latest
Opinionated rule:
- Always pass `--dataset`.
- The repo default `config.json` sets `"default_dataset": null`.
- Running `python3 -m autopredict.cli backtest` with no dataset currently throws a `TypeError`.
Workflow D: Tune parameters
Use the bundled helper instead of the stub CLI:
bash skills/autopredict/scripts/tune_params.sh \
  --dir /tmp/autopredict \
  --dataset autopredict/_defaults/datasets/sample_markets.json \
  --param min_edge 0.03,0.05,0.08 \
  --param aggressive_edge 0.10,0.12,0.15
Opinionated tuning rules:
- Start with 1-2 parameters, not 6.
- Prefer `sharpe` or `total_pnl` only after sample size is reasonable.
- Reject “best” configs with too few trades.
- Save every run; do not trust memory or terminal output.
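The idea behind such a sweep can be sketched generically (illustrative Python only; this is not the repo's `GridSearchTuner` API, and the toy scoring function stands in for a real backtest run):

```python
from itertools import product

def grid_search(run_backtest, grid):
    """Run every combination in `grid`; return (score, params) sorted best-first.

    run_backtest: callable taking a params dict, returning a numeric score.
    grid: dict mapping parameter name -> list of candidate values.
    """
    names = list(grid)
    results = []
    for values in product(*(grid[n] for n in names)):
        params = dict(zip(names, values))
        results.append((run_backtest(params), params))
    return sorted(results, key=lambda r: r[0], reverse=True)

# Toy score peaking at min_edge=0.05, aggressive_edge=0.12:
toy = lambda p: -abs(p["min_edge"] - 0.05) - abs(p["aggressive_edge"] - 0.12)
best_score, best_params = grid_search(
    toy, {"min_edge": [0.03, 0.05, 0.08], "aggressive_edge": [0.10, 0.12, 0.15]}
)[0]
```

Note the combinatorics: two parameters with three values each is 9 runs; six parameters would be hundreds, which is why the rules above say start with 1-2.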
Workflow E: Review learning / deployment scaffolds
Use this when the user asks about self-improvement, paper trading, or live trading.
- `autopredict.learning.tuner.GridSearchTuner` is real and reusable.
- `python3 -m autopredict.cli learn tune` is just a message, not a tuning engine.
- `scripts/run_paper.py` is a monitoring loop template; it does not fetch real markets or execute the full agent logic.
- `scripts/run_live.py` requires confirmation and safety flags, but still uses `MockVenueAdapter`, so it cannot trade a real venue out of the box.
Strategy Knobs That Matter
Main JSON parameters in `strategy_configs/*.json`:
- `min_edge`: minimum edge before any trade is considered.
- `aggressive_edge`: threshold for using market orders more aggressively.
- `max_risk_fraction`: position sizing as fraction of bankroll.
- `max_position_notional`: hard dollar cap per order.
- `min_book_liquidity`: minimum visible depth required.
- `max_spread_pct`: spread filter.
- `max_depth_fraction`: cap as fraction of visible depth.
- `split_threshold_fraction`: start slicing when an order is too large relative to depth.
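A minimal sketch of how entry-filter knobs like these typically gate a trade (the values and the function are hypothetical illustrations, not the repo's implementation):

```python
# Hypothetical knob values mirroring three of the strategy_configs fields above.
CONFIG = {
    "min_edge": 0.05,          # minimum edge before any trade is considered
    "max_spread_pct": 0.04,    # spread filter
    "min_book_liquidity": 500.0,  # minimum visible depth required, in dollars
}

def passes_filters(edge, spread_pct, book_liquidity, cfg=CONFIG):
    """True only if the opportunity clears every entry filter."""
    return (
        edge >= cfg["min_edge"]
        and spread_pct <= cfg["max_spread_pct"]
        and book_liquidity >= cfg["min_book_liquidity"]
    )

print(passes_filters(edge=0.06, spread_pct=0.02, book_liquidity=1200.0))  # True
print(passes_filters(edge=0.06, spread_pct=0.09, book_liquidity=1200.0))  # False
```

The filters are conjunctive: one loose knob cannot compensate for another, which is why the guidance below warns against loosening two at once.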
Opinionated tuning guidance:
- Lower `min_edge` only if trade count is too low.
- Raise `aggressive_edge` if slippage is the dominant problem.
- Lower `max_depth_fraction` before touching risk caps when market impact is the problem.
- Do not loosen spread and liquidity filters at the same time; you will not know which one caused the regression.
Failure Modes and Edge Cases
Backtest failures
- `TypeError` before the backtest starts: almost always because no `--dataset` was passed and `default_dataset` is `null`.
- `No metrics.json found under state directory`: `score-latest` was run before a successful backtest.
- Malformed JSON errors: invalid strategy config or dataset schema.
Learning workflow failures
- `learn analyze` reports no logs: expected unless you created JSONL logs with `TradeLogger` or a scaffold that writes them.
- `learn tune` / `learn improve` prints advice only: expected. Those subcommands are placeholders.
- Docs mention `scripts/learn_and_improve.py`: that script does not exist in the audited upstream repo.
Live / paper trading confusion
- Paper trading is not the same as live market scanning: `run_paper.py` is a loop scaffold, not an end-to-end paper execution engine over Polymarket.
- Live trading docs sound complete, but the adapter is mock: `run_live.py` cannot place real venue orders without extra implementation.
- The `trade-live` CLI is disabled: `config.json` defaults `live_trading_enabled` to `false`.
Command-path gotchas
- Do not use `python3 -m autopredict.backtest.cli` in this repo state unless you are ready to debug import-path issues.
- Do not assume root docs and the packaged CLI are fully synchronized. The package path under `autopredict/` is the safer source of truth.
- Do not claim config changes affect `predict.py --fair` unless you verified the code path. It currently ignores `strategy_configs/*.json`.
Helper Scripts Bundled With This Skill
- `scripts/setup.sh`: clone, install, verify, and smoke-test the repo.
- `scripts/scan_markets.sh`: wrapper around `predict.py` for live scan / `--events` / `--fair` paths.
- `scripts/run_backtest.sh`: safe backtest wrapper that always provides a dataset or fails with a useful error.
- `scripts/tune_params.sh`: grid-search wrapper that bypasses the upstream stub tuning CLI.
Recommended Agent Behavior
When using this skill:
- Lead with the limitation that AutoPredict optimizes execution, not prediction.
- Ask where `fair_prob` comes from before discussing edges as if they were alpha.
- Require a dataset for any serious backtest claim.
- Separate “works in the repo” from “documented in the repo”.
- Treat paper/live trading as architecture review unless the user is explicitly asking to extend the scaffold.
Autoresearch Pairing
Use this skill with autoresearch when the user wants disciplined tuning.
Recommended setup:
- Define the target metric, usually `sharpe`, `total_pnl`, or `avg_slippage_bps`.
- Use `scripts/run_backtest.sh` or `scripts/tune_params.sh` as the experiment workload.
- Keep one hypothesis per run.
- Store configs and metrics under a dated output directory.
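For reference, the conventional definitions behind those metric names can be sketched as follows (a generic sketch under stated assumptions: per-period Sharpe without annualization, and `(expected_price, fill_price)` pairs for slippage; these are not the repo's metric schema):

```python
from statistics import mean, stdev

def sharpe(pnl_series):
    """Per-period Sharpe ratio: mean PnL divided by its standard deviation."""
    s = stdev(pnl_series)
    return mean(pnl_series) / s if s > 0 else 0.0

def avg_slippage_bps(fills):
    """Average slippage in basis points over (expected_price, fill_price) pairs."""
    return mean((fill - exp) / exp * 10_000 for exp, fill in fills)

# Two hypothetical fills, each about 1% worse than expected on a ~50c/40c market:
print(round(avg_slippage_bps([(0.50, 0.505), (0.40, 0.404)]), 1))  # 100.0
```

Whichever definitions the repo actually uses, pin one metric before tuning so runs stay comparable.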
Good autoresearch prompt framing:
- “Optimize `aggressive_edge` and `max_depth_fraction` for lower slippage without collapsing trade count.”
- “Improve Sharpe on this dataset while keeping max drawdown below 35%.”
Bad framing:
- “Make it profitable” with no dataset.
- “Tune everything” with no metric priority.