Polymarket AI Agent
LIVE
3-layer prediction market oracle. Dual-LLM ensemble (GPT-5.4 + Claude), 8 strategy engines, and a 15-minute autonomous cycle scanning 80+ markets.
Key Numbers
At a Glance
80+ Markets Scanned
8 Strategy Engines
Dual-Model LLM Ensemble
15 min Cycle Time
Overview
About This Project
A fully autonomous prediction market trading agent that operates on a continuous 15-minute decision cycle. It scans 80+ active markets on Polymarket's central limit order book (CLOB), applying a dual-LLM ensemble of GPT-5.4 and Claude Sonnet 4.6 to generate independent probability estimates before passing them through 8 specialized strategy engines.
The system goes beyond naive LLM prompting. Each market is analyzed through eight lenses: news-driven momentum, calendar arbitrage, correlation plays, contrarian signals, liquidity analysis, sentiment parsing, statistical arbitrage, and market microstructure. When the LLMs disagree, the system triggers deeper analysis rather than averaging their estimates, treating disagreement as a signal of genuine uncertainty.
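The disagreement rule described above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the 0.10 threshold, the names EnsembleResult and combine_estimates, and the choice to use a simple mean when the models agree are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical threshold: the source does not say how large a gap
# between the two models counts as "disagreement".
DISAGREEMENT_THRESHOLD = 0.10


@dataclass
class EnsembleResult:
    probability: Optional[float]  # None when the models disagree
    needs_deeper_analysis: bool
    spread: float                 # absolute gap between the two estimates


def combine_estimates(p_a: float, p_b: float,
                      threshold: float = DISAGREEMENT_THRESHOLD) -> EnsembleResult:
    """Combine two independent probability estimates for one market.

    If the estimates differ by more than `threshold`, no blended number is
    returned and the market is flagged for deeper analysis. Otherwise the
    mean is used (an assumption; the source only says consensus
    strengthens conviction).
    """
    for p in (p_a, p_b):
        if not 0.0 <= p <= 1.0:
            raise ValueError(f"probability out of range: {p}")

    spread = abs(p_a - p_b)
    if spread > threshold:
        return EnsembleResult(probability=None,
                              needs_deeper_analysis=True,
                              spread=spread)
    return EnsembleResult(probability=(p_a + p_b) / 2,
                          needs_deeper_analysis=False,
                          spread=spread)
```

Returning no probability at all on disagreement, instead of a midpoint, keeps downstream sizing logic from treating a contested estimate as if it were a confident one.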
A continuous calibration pipeline tracks Brier scores per market category, automatically adjusting confidence scaling based on historical accuracy. Cross-market validation against Kalshi, Metaculus, and Manifold provides an independent reference point, helping keep the agent's probability estimates grounded.
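As a rough illustration of per-category Brier tracking, the sketch below records squared errors for resolved markets and shrinks new forecasts toward 0.5 when a category has scored poorly. The Brier score itself is standard (the mean of squared differences between forecast and outcome); the shrinkage rule, the 0.25 baseline (the score of always forecasting 0.5), and the CategoryCalibrator name are assumptions for the example, since the source does not describe how its confidence scaling is computed.

```python
from collections import defaultdict
from typing import Optional

# Brier score of a forecaster who always says 0.5; used here as a
# hypothetical "no skill" baseline for the scaling rule.
NO_SKILL_BRIER = 0.25


class CategoryCalibrator:
    """Tracks Brier scores per market category (illustrative only)."""

    def __init__(self) -> None:
        self._squared_errors = defaultdict(list)

    def record(self, category: str, forecast: float, outcome: int) -> None:
        """Store the squared error for a resolved market (outcome is 0 or 1)."""
        if outcome not in (0, 1):
            raise ValueError("outcome must be 0 or 1")
        self._squared_errors[category].append((forecast - outcome) ** 2)

    def brier(self, category: str) -> Optional[float]:
        """Mean squared error for the category, or None without history."""
        errors = self._squared_errors.get(category)
        if not errors:
            return None
        return sum(errors) / len(errors)

    def scale(self, category: str, forecast: float) -> float:
        """Shrink a forecast toward 0.5 in proportion to past inaccuracy.

        Assumed rule: weight = 1 - brier / 0.25, floored at 0. A perfect
        history leaves the forecast unchanged; a no-skill history
        collapses it to 0.5.
        """
        score = self.brier(category)
        if score is None:
            return forecast
        weight = max(0.0, 1.0 - score / NO_SKILL_BRIER)
        return 0.5 + weight * (forecast - 0.5)
```

With a Brier score of 0.20 in a category, for example, this rule keeps only a fifth of the distance between a new forecast and 0.5.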
Features
What It Does
Dual-LLM Ensemble Forecasting
Independent probability estimates from GPT-5.4 and Claude Sonnet 4.6. Disagreement triggers deeper investigation; consensus strengthens conviction and supports larger position sizes.
Cross-Market Validation
Real-time probability cross-referencing against Kalshi, Metaculus, and Manifold Markets provides an independent calibration anchor and arbitrage opportunity detection.
8 Strategy Engines
Specialized modules for news momentum, calendar arbitrage, correlation plays, contrarian signals, liquidity analysis, sentiment parsing, stat-arb, and microstructure edge detection.
Continuous Calibration
Brier score tracking per market category with automatic confidence scaling. The system learns its own biases and corrects for systematic over- or under-confidence.
Economic Profiling
Market-level economic analysis including implied probability surfaces, volume-weighted price trends, and order book depth profiling for optimal entry timing.
Autonomous Execution
End-to-end autonomous cycle from market discovery through analysis to CLOB order placement, with Kelly-criterion position sizing and portfolio-level risk limits.
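For the Kelly-criterion sizing mentioned under Autonomous Execution, a minimal sketch follows. It uses the standard Kelly result for a binary contract that costs price and pays 1 if the outcome occurs, which reduces to f = (p - price) / (1 - price). The fractional multiplier of 0.25 and the 5% per-position cap are hypothetical stand-ins for the portfolio-level risk limits, whose actual values the source does not give.

```python
def kelly_fraction(p: float, price: float) -> float:
    """Kelly fraction of bankroll for buying YES at `price`.

    p is the estimated probability that the market resolves YES.
    Net odds are b = (1 - price) / price, so the Kelly formula
    (b*p - (1 - p)) / b simplifies to (p - price) / (1 - price).
    Returns 0.0 when there is no positive edge.
    """
    if not 0.0 < price < 1.0:
        raise ValueError("price must be strictly between 0 and 1")
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be between 0 and 1")
    return max(0.0, (p - price) / (1.0 - price))


def position_size(bankroll: float, p: float, price: float,
                  kelly_multiplier: float = 0.25,   # hypothetical
                  max_fraction: float = 0.05) -> float:  # hypothetical cap
    """Dollar amount to stake: fractional Kelly, capped per position."""
    fraction = kelly_fraction(p, price) * kelly_multiplier
    return bankroll * min(fraction, max_fraction)
```

Scaling the full Kelly fraction down is a common hedge against the estimated probability being wrong, which matters here because the inputs are themselves model forecasts.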
Architecture
How It Works
Challenges
What Made This Hard
LLM probability estimates are notoriously poorly calibrated out of the box. Building a reliable calibration layer that corrects for systematic biases across political, crypto, sports, and economic market categories was the central technical challenge. Rate-limiting API calls to two frontier LLMs while maintaining full scan coverage of 80+ markets required careful batching, caching, and priority scheduling to stay within cost and latency budgets.
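One way the caching and priority scheduling could fit together is sketched below: estimates are cached with a time-to-live, and each cycle spends a fixed budget of fresh LLM calls on the highest-priority markets whose cached estimate is missing or stale. The ScanScheduler class, the per-cycle budget, the priority scores, and the 900-second TTL (chosen to match the 15-minute cycle) are all assumptions for the example rather than details from the project.

```python
import heapq
import time
from typing import Callable, Dict, List, Optional, Tuple


class ScanScheduler:
    """Picks which markets get a fresh LLM call this cycle (illustrative)."""

    def __init__(self, budget_per_cycle: int,
                 cache_ttl_seconds: float = 900.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._budget = budget_per_cycle
        self._ttl = cache_ttl_seconds
        self._clock = clock
        # market_id -> (time stored, estimate)
        self._cache: Dict[str, Tuple[float, float]] = {}

    def store(self, market_id: str, estimate: float) -> None:
        """Cache a fresh estimate with the current timestamp."""
        self._cache[market_id] = (self._clock(), estimate)

    def cached(self, market_id: str) -> Optional[float]:
        """Return the cached estimate, or None if missing or expired."""
        entry = self._cache.get(market_id)
        if entry is None:
            return None
        stored_at, estimate = entry
        if self._clock() - stored_at > self._ttl:
            return None
        return estimate

    def plan_cycle(self, priorities: Dict[str, float]) -> List[str]:
        """Return up to `budget_per_cycle` market ids needing a fresh call.

        Markets with a valid cached estimate are skipped; the rest are
        ordered by descending priority score.
        """
        stale = [(-priority, market_id)
                 for market_id, priority in priorities.items()
                 if self.cached(market_id) is None]
        return [market_id
                for _, market_id in heapq.nsmallest(self._budget, stale)]
```

Markets that miss the budget in one cycle remain stale and compete again in the next, so coverage degrades gradually under rate limits instead of failing outright.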
Stack