How a Sports Betting Model Actually Works
Every bookmaker has a model. Their traders crunch data, assign probabilities, and build in a margin. A sports betting model on our side does the same thing — the goal is simply to produce a more accurate number than theirs. When our probability is higher than what the odds imply, and the difference is large enough, there's a bet. That's the whole loop. The rest is execution and discipline.
What a model actually does
A model converts raw information into a probability estimate for each possible outcome. For a football match it might output: home win 42%, draw 27%, away win 31%. The bookmaker has their own set of numbers. If the bookmaker's odds imply a 38% chance of a home win but our model says 42%, that four-point gap is where value might live — provided the model is well-calibrated and the bookmaker is one of the softer ones.
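The comparison above can be sketched in a few lines. This is a minimal illustration, not our production code: the three prices are hypothetical, the function name is made up for the example, and removing the margin by proportional rescaling is an assumption (other methods exist).

```python
def implied_probabilities(decimal_odds):
    """Convert decimal odds for every outcome of one market into probabilities.

    Returns (raw, fair). raw is 1/odds per outcome and sums to more than 1
    because of the bookmaker's margin. fair rescales raw so it sums to 1
    (proportional rescaling is an assumption made for this sketch).
    """
    raw = [1.0 / o for o in decimal_odds]
    overround = sum(raw)
    fair = [p / overround for p in raw]
    return raw, fair


# Hypothetical home/draw/away prices, for illustration only.
odds = [2.60, 3.40, 2.90]
raw, fair = implied_probabilities(odds)

margin = sum(raw) - 1.0
model_home = 0.42  # the model's home-win probability from the example above
gap = model_home - fair[0]

print(f"raw implied home: {raw[0]:.3f}  margin-free home: {fair[0]:.3f}")
print(f"bookmaker margin: {margin:.2%}  model gap: {gap:+.3f}")
```

Comparing against the margin-free number rather than the raw one avoids understating the gap, since the raw implied probabilities are inflated by the margin.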
The model is not trying to predict who wins a single match. It's trying to be *less wrong on average* across hundreds of matches. That's a very different task — and one where data and structure beat intuition reliably.
The ingredients: data, ratings, and form
Our football model is trained on roughly 105,000 matches from 2006 onwards across the Premier League, La Liga, Serie A, Bundesliga, Eredivisie, Ligue 1, EFL Championship, and more. The 110 features it uses fall into a few categories:
- Elo ratings — a continuously updated strength estimate for every team, adjusted after each result. Elo captures long-run quality; it's slower to react than form but much harder to game with one lucky result.
- Recent form features — goals scored/conceded, shots on target, xG proxy via shots on target, and results over rolling windows of 4–10 games.
- Head-to-head context — historical results between these two teams, adjusted for how long ago they were played.
- Lineup adjustments — when confirmed lineups are available (~90 min before kickoff), missing forwards reduce the expected goals parameter by 5.5%, midfielders 3.5%, defenders 2%, capped at 20% total. Injuries apply a similar but smaller fallback shift.
- Match context — competition type, whether it's a knockout round, neutral venue flag, and international tournament features for World Cup and Euros fixtures.
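Two of the ingredients above are easy to sketch. The Elo functions use the standard logistic formula; the step size `k=20` is an assumed value, not the model's setting. The lineup function uses the percentages quoted above, but treating them as per-player reductions that add up before the cap is an assumption, and all names here are hypothetical.

```python
def elo_expected(rating_a, rating_b):
    """Expected score of team A against team B under the standard Elo curve."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


def elo_update(rating_a, rating_b, score_a, k=20.0):
    """Return both ratings after one match.

    score_a is 1 for a win, 0.5 for a draw, 0 for a loss.
    k=20 is an assumed step size for this sketch.
    """
    delta = k * (score_a - elo_expected(rating_a, rating_b))
    return rating_a + delta, rating_b - delta


# Reductions quoted in the text. Summing them per missing player is an
# assumption about how the adjustment is combined.
REDUCTION = {"forward": 0.055, "midfielder": 0.035, "defender": 0.02}
MAX_REDUCTION = 0.20


def lineup_adjusted_goals(expected_goals, missing_positions):
    """Scale expected goals down for missing players, capped at 20% in total."""
    total = sum(REDUCTION[pos] for pos in missing_positions)
    return expected_goals * (1.0 - min(total, MAX_REDUCTION))


# A 1600-rated favourite loses to a 1500-rated side: it drops more than k/2.
fav, dog = elo_update(1600.0, 1500.0, score_a=0.0)
print(f"favourite: {fav:.1f}  underdog: {dog:.1f}")

# One forward and one midfielder missing: 9% off the goals parameter.
print(f"adjusted goals: {lineup_adjusted_goals(1.50, ['forward', 'midfielder']):.3f}")
```

Because the update is zero-sum, the two ratings always add up to the same total before and after a match, which is part of why a single lucky result moves Elo only modestly.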
For baseball our model estimates runs scored by each team using Poisson regression with pitcher stats, park factors, temperature, wind, and umpire tendencies. For hockey and NBA, the architecture is similar — team efficiency ratings anchored to long-run performance, with recent form layered on top.
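Once a Poisson model has produced an expected score for each side, turning those two means into outcome probabilities is mechanical. The sketch below shows only that last step and skips the regression itself. It assumes the two scores are independent, which is a simplification, and for baseball the tie probability would still need to be split by an extra-innings rule that is not shown. The function names and the example means are hypothetical.

```python
import math


def poisson_pmf(k, lam):
    """Probability of exactly k events when the mean is lam."""
    return math.exp(-lam) * lam ** k / math.factorial(k)


def outcome_probabilities(lam_home, lam_away, max_score=25):
    """Home win, tie, and away win probabilities from two Poisson means.

    Assumes the two scores are independent. Scores above max_score are
    ignored and the result is renormalised to sum to 1.
    """
    home = tie = away = 0.0
    for h in range(max_score + 1):
        p_h = poisson_pmf(h, lam_home)
        for a in range(max_score + 1):
            p = p_h * poisson_pmf(a, lam_away)
            if h > a:
                home += p
            elif h == a:
                tie += p
            else:
                away += p
    total = home + tie + away
    return home / total, tie / total, away / total


# Hypothetical expected scores, for illustration only.
p_home, p_tie, p_away = outcome_probabilities(1.6, 1.1)
print(f"home {p_home:.3f}  tie {p_tie:.3f}  away {p_away:.3f}")
```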
Calibration: why raw predictions aren't enough
A model that says '60% home win' on 100 matches should see the home team win roughly 60 of them. If the home team wins only 50, the model is overconfident: its probabilities are too high and any bets based on them will disappoint. Calibration is the process of correcting this systematic error.
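A calibration check like the one described can be sketched as a reliability table: group predictions into bins, then compare the average predicted probability in each bin with how often the outcome actually happened. The function name and the ten equal-width bins are choices made for this example.

```python
def reliability_table(probs, outcomes, n_bins=10):
    """Compare predicted probabilities with observed frequencies, bin by bin.

    Returns one (mean_predicted, observed_rate, count) tuple per non-empty bin.
    """
    bins = [[] for _ in range(n_bins)]
    for p, y in zip(probs, outcomes):
        idx = min(int(p * n_bins), n_bins - 1)  # p == 1.0 goes in the last bin
        bins[idx].append((p, y))

    rows = []
    for members in bins:
        if members:
            n = len(members)
            mean_p = sum(p for p, _ in members) / n
            rate = sum(y for _, y in members) / n
            rows.append((mean_p, rate, n))
    return rows


# The overconfident case from the text: 100 calls at 60%, only 50 home wins.
probs = [0.60] * 100
outcomes = [1] * 50 + [0] * 50
rows = reliability_table(probs, outcomes)
for mean_p, rate, n in rows:
    print(f"predicted {mean_p:.2f}  observed {rate:.2f}  n={n}")
```

A well-calibrated model produces rows where the two numbers roughly match in every bin; a consistent gap in one direction is the systematic error that calibration corrects.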
We use isotonic regression to calibrate the raw XGBoost outputs. After calibration, our football model achieves a Brier score of 0.177. The Brier score is the mean squared error between predicted probabilities and actual outcomes, so it rewards both calibration and the ability to separate likely results from unlikely ones; lower is better, and 0.177 is within the range of the sharpest bookmakers. We benchmark against Pinnacle specifically because Pinnacle is the consensus sharp-market proxy: they accept professional bettors, move lines fast, and run the lowest margins in the industry. If you can't beat Pinnacle's implied probabilities on average, you haven't found an edge.
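To show what isotonic calibration and the Brier score do, here is a small pure-Python sketch. It is not our pipeline: it uses the textbook pool-adjacent-violators algorithm with a step-function lookup, a binary win/no-win outcome, and ten made-up data points. It also fits and scores on the same data purely for brevity, whereas a real calibrator should be fitted on held-out predictions.

```python
import bisect


def brier_score(probs, outcomes):
    """Mean squared error between predicted probabilities and 0/1 outcomes."""
    return sum((p - y) ** 2 for p, y in zip(probs, outcomes)) / len(probs)


def fit_isotonic(scores, outcomes):
    """Pool-adjacent-violators fit: a non-decreasing map from score to rate.

    Returns (xs, fitted): the distinct scores in ascending order and the
    calibrated value for each.
    """
    # Pool observations sharing a score so each score gets one fitted value.
    grouped = {}
    for x, y in zip(scores, outcomes):
        total, count = grouped.get(x, (0.0, 0))
        grouped[x] = (total + y, count + 1)
    xs = sorted(grouped)

    blocks = []  # each block: [sum of outcomes, observations, distinct scores]
    for x in xs:
        total, count = grouped[x]
        blocks.append([total, count, 1])
        # Merge backwards while an earlier block has a higher mean.
        while (len(blocks) > 1
               and blocks[-2][0] / blocks[-2][1] > blocks[-1][0] / blocks[-1][1]):
            total, count, width = blocks.pop()
            blocks[-1][0] += total
            blocks[-1][1] += count
            blocks[-1][2] += width

    fitted = []
    for total, count, width in blocks:
        fitted.extend([total / count] * width)
    return xs, fitted


def calibrate(score, xs, fitted):
    """Step lookup: use the fitted value of the nearest score at or below."""
    i = bisect.bisect_right(xs, score) - 1
    return fitted[max(i, 0)]


# Made-up raw scores and results, for illustration only.
raw = [0.2, 0.2, 0.4, 0.4, 0.6, 0.6, 0.8, 0.8, 0.8, 0.8]
won = [0, 0, 1, 1, 0, 1, 1, 0, 1, 1]

xs, fitted = fit_isotonic(raw, won)
calibrated = [calibrate(s, xs, fitted) for s in raw]
print(f"Brier before: {brier_score(raw, won):.3f}")
print(f"Brier after:  {brier_score(calibrated, won):.3f}")
```

On the data it was fitted to, the isotonic fit can never score worse than the raw scores, because the raw scores are themselves one of the non-decreasing mappings it chooses among. The honest test is whether the improvement holds out of sample.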
You can see the current model metrics and out-of-sample performance on our live model & track record page.
From probability to bet: expected value and the decision gate
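Once the model outputs a probability, the comparison is simple. Expected value per unit staked is (model probability × decimal odds) − 1. If our model says 45% and the bookmaker's odds imply 40%, which corresponds to decimal odds of 2.50, the expected value is (0.45 × 2.50) − 1 = 0.125, or 12.5% of the stake. Positive EV means the bet is worth taking on average; negative means it isn't. You can test any scenario in our EV calculator.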
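The EV formula and a decision gate can be sketched as follows. The formula is the one above, with odds in decimal form. The gate itself is hypothetical: `min_ev=0.03` is an assumed threshold chosen for illustration, not the model's actual setting.

```python
def expected_value(model_prob, decimal_odds):
    """Expected profit per unit staked.

    A win returns (decimal_odds - 1) with probability model_prob and a loss
    costs 1 otherwise, which simplifies to model_prob * decimal_odds - 1.
    """
    return model_prob * decimal_odds - 1.0


def passes_gate(model_prob, decimal_odds, min_ev=0.03):
    """Hypothetical decision gate: bet only if EV clears an assumed threshold."""
    return expected_value(model_prob, decimal_odds) >= min_ev


odds = 1.0 / 0.40  # an implied probability of 40% means decimal odds of 2.50
ev = expected_value(0.45, odds)
print(f"EV per unit staked: {ev:.3f}")
print(f"bet? {passes_gate(0.45, odds)}")
```

A threshold above zero leaves room for model error: a thin positive EV can disappear if the probability estimate is off by a point or two.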
Staking follows fractional Kelly at 25%, capped at 5% of bankroll per bet. Kelly sizes each bet in proportion to your edge divided by the net odds (decimal odds minus one); the 25% fraction buys down variance so a run of bad luck doesn't end the exercise before the maths can play out. Try sizing a bet yourself with our Kelly calculator.
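A minimal sketch of that staking rule, using the standard Kelly formula with the 25% fraction and 5% cap stated above. The function names and the 1,000-unit bankroll are made up for the example.

```python
def kelly_fraction(model_prob, decimal_odds):
    """Full-Kelly share of bankroll: f = (b*p - q) / b, with b = odds - 1.

    Returns 0 when the edge is negative, since Kelly never stakes on a
    losing proposition.
    """
    if decimal_odds <= 1.0:
        raise ValueError("decimal odds must be greater than 1")
    b = decimal_odds - 1.0          # net odds: profit per unit staked on a win
    q = 1.0 - model_prob            # probability of losing
    return max((b * model_prob - q) / b, 0.0)


def stake(bankroll, model_prob, decimal_odds, fraction=0.25, cap=0.05):
    """Fractional Kelly stake: 25% of full Kelly, capped at 5% of bankroll."""
    share = min(fraction * kelly_fraction(model_prob, decimal_odds), cap)
    return bankroll * share


# The running example: model says 45% at decimal odds of 2.50.
print(f"full Kelly share: {kelly_fraction(0.45, 2.50):.4f}")
print(f"stake on a 1,000 bankroll: {stake(1000.0, 0.45, 2.50):.2f}")
```

The cap matters most when the model claims a very large edge, which is also when it is most likely to be wrong, so the cap doubles as protection against overconfident inputs.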
To understand why any of this matters, see what value betting is — the same logic underpins every bet the model recommends.
Why a model beats gut feeling
Human intuition is fast and pattern-hungry — both great properties in most situations, both liabilities in probability estimation. We overweight recent events (a team's last big win dominates our view of them), underweight regression to the mean, and are systematically biased toward popular outcomes. Bookmakers know this and price accordingly.
A model doesn't care about last weekend's highlight. It weights 105,000 historical outcomes and adjusts each feature by how much it actually predicted results in out-of-sample testing. The model is slower to update and less entertaining — but over 500 bets it will be less wrong than any individual analyst, which is the only number that matters.
The honest limits: what a model cannot do
A model is only as good as its inputs. Garbage in, garbage out — if we mislabel a team's lineup, miscalculate a pitcher's ERA, or train on data that has systematic recording errors, the model will confidently produce wrong probabilities. This is why we invest heavily in data quality and why every new feature goes through a controlled out-of-sample experiment before it ships.
- Variance is real. Even with a 10% EV edge, a bet at around even money loses about 45% of the time. Losing weeks and even losing months are part of the process.
- Market limits. Soft bookmakers limit or ban winning accounts. The edge only pays if you can actually place the bet.
- Model drift. Football tactics, roster construction, and officiating norms change. Models need periodic retraining; ours are re-evaluated at season end each year.
- Small samples lie. Statistically meaningful signal requires roughly 500 bets. Evaluate the model on closing line value (CLV), meaning whether it beat the closing prices, not on a 50-bet win rate.
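Two of the points above reduce to simple arithmetic. The sketch below uses one common definition of CLV, the price taken divided by the closing price minus one, which is an assumption about how it is measured here. The loss-rate function follows from the EV formula: if EV = p × odds − 1, then the win probability is (1 + EV) / odds. The bets listed are hypothetical.

```python
def clv(odds_taken, closing_odds):
    """Closing line value: how much better the price taken was than the close.

    Positive means the bet was placed at a better price than the market
    settled on. Both arguments are decimal odds.
    """
    return odds_taken / closing_odds - 1.0


def average_clv(bets):
    """Mean CLV over a list of (odds_taken, closing_odds) pairs."""
    return sum(clv(taken, close) for taken, close in bets) / len(bets)


def loss_rate(ev, decimal_odds):
    """Share of bets that lose, given an EV per unit staked and the odds."""
    win_prob = (1.0 + ev) / decimal_odds
    return 1.0 - win_prob


# Hypothetical (odds_taken, closing_odds) pairs, for illustration only.
bets = [(2.10, 2.00), (1.90, 1.95), (3.40, 3.20), (2.50, 2.50)]
print(f"average CLV: {average_clv(bets):+.2%}")

# A 10% edge at even money still loses a large share of the time.
print(f"loss rate at 10% EV, odds 2.00: {loss_rate(0.10, 2.00):.0%}")
```

CLV is useful on small samples because every bet contributes a reading, whether it won or lost, so it is far less noisy than a win rate over the same number of bets.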
A realistic long-run yield from a well-calibrated model is 3–8%. That's a genuine edge in a negative-sum game. It compounds meaningfully with proper bankroll management. But it is not a guarantee, and it is not fast money.