How our predictions work
Methodology, in plain English
When we put a probability on a fixture — “73% BTTS”, “65% home win” — we mean it literally. We don't expect you to take that on faith. This page explains how the model works, what it uses, what it doesn't, and where it's weakest. The proof of the calibration claim lives on the track record page.
1. What we predict
We publish probabilities for the markets people follow: both teams to score (BTTS), over/under 1.5, 2.5 and 3.5 goals, home win / draw / away win, and double-chance (home or draw, away or draw).
Coverage runs to 233+ men's competitions globally: the top divisions of every major football country, most second tiers in Europe, the major continental club cups (Champions League, Europa, Libertadores and so on), and senior international fixtures. Cup ties and women's and youth competitions are also predicted, but treated separately because their data dynamics are different (more on that below).
2. How the model works
The prediction stack has three components, each doing one thing well.
Per-team goal expectation. Two Poisson regressors estimate the expected goal count for the home and away team in a given fixture. They're trained on years of historical match data with features describing each team's recent form, Elo rating, days of rest, head-to-head history, recency-weighted shot and xG aggregates, and league standings context. Outputs are two numbers — “lambdas” in Poisson terminology — that represent the model's view of how many goals each side will score on average across infinite replays of this fixture.
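As a rough illustration of what a Poisson regressor outputs, the sketch below assumes a log-linear form, where the lambda is the exponential of a weighted sum of features. The page does not say which kind of Poisson regressor is used, and the coefficient values and feature names here are hypothetical, chosen only to make the arithmetic visible.

```python
import math

# Hypothetical coefficients for the home-side regressor (illustration only).
HOME_COEFS = {"intercept": 0.25, "elo_diff": 0.0015, "rest_days": 0.01}


def expected_goals(coefs, features):
    """Log-link Poisson mean: lambda = exp(intercept + sum(coef * feature))."""
    linear = coefs["intercept"] + sum(
        coefs[name] * value for name, value in features.items()
    )
    return math.exp(linear)


# Hypothetical fixture: home side rated 120 Elo points higher, 6 days of rest.
home_lambda = expected_goals(HOME_COEFS, {"elo_diff": 120.0, "rest_days": 6.0})
print(round(home_lambda, 2))
```

The exponential keeps the lambda positive whatever the feature values are, which is why a log link is the usual choice for count targets such as goals.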
Joint match distribution. From those two lambdas we compute the probability of every scoreline (0-0, 1-0, 0-1, 1-1, 2-0, …) by multiplying the two Poisson distributions together, which treats the two sides' goal counts as independent. We then apply a Dixon-Coles correction, which adjusts the low-scoring corner of the distribution (0-0, 1-0, 0-1, 1-1) where naive Poisson independence misjudges how often those scorelines occur in real football, typically understating 0-0 and 1-1 and overstating 1-0 and 0-1. From the corrected scoreline distribution we sum up market probabilities: BTTS, over 2.5, and so on.
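The step from two lambdas to market probabilities can be sketched in a few lines. This is a minimal version under stated assumptions: the correction factor follows the standard Dixon-Coles form, while the `rho` value, the 10-goal cap, and the example lambdas are hypothetical and not taken from the production model.

```python
import math


def poisson_pmf(k, lam):
    """Probability of exactly k goals when the expected count is lam."""
    return math.exp(-lam) * lam ** k / math.factorial(k)


def dixon_coles_tau(h, a, lam_home, lam_away, rho):
    """Dixon-Coles adjustment factor; only the four low scorelines change."""
    if h == 0 and a == 0:
        return 1.0 - lam_home * lam_away * rho
    if h == 0 and a == 1:
        return 1.0 + lam_home * rho
    if h == 1 and a == 0:
        return 1.0 + lam_away * rho
    if h == 1 and a == 1:
        return 1.0 - rho
    return 1.0


def scoreline_grid(lam_home, lam_away, rho=-0.1, max_goals=10):
    """grid[h][a] = P(home scores h, away scores a), renormalised after capping."""
    grid = [
        [
            poisson_pmf(h, lam_home)
            * poisson_pmf(a, lam_away)
            * dixon_coles_tau(h, a, lam_home, lam_away, rho)
            for a in range(max_goals + 1)
        ]
        for h in range(max_goals + 1)
    ]
    total = sum(sum(row) for row in grid)
    return [[p / total for p in row] for row in grid]


def market_probs(grid):
    """Sum scoreline cells into goal-market probabilities."""
    cells = [(h, a, p) for h, row in enumerate(grid) for a, p in enumerate(row)]
    return {
        "btts": sum(p for h, a, p in cells if h >= 1 and a >= 1),
        "over_1_5": sum(p for h, a, p in cells if h + a >= 2),
        "over_2_5": sum(p for h, a, p in cells if h + a >= 3),
        "over_3_5": sum(p for h, a, p in cells if h + a >= 4),
    }


grid = scoreline_grid(1.6, 1.1)  # hypothetical lambdas
probs = market_probs(grid)
print({name: round(p, 3) for name, p in probs.items()})
```

With a negative `rho`, the 0-0 and 1-1 cells gain probability and the 1-0 and 0-1 cells lose it, while every other scoreline is left as the independent product.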
Result classifier. The three-way result (home win / draw / away win) is harder to derive cleanly from goal expectations alone, so we run a separate multinomial XGBoost model that consumes the Poisson lambdas as features alongside the same form, Elo, and context inputs. Probabilities sum to 1 across home / draw / away.
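To see why the three result probabilities sum to 1, it helps to look at the softmax step that a multinomial classifier typically applies to its per-class scores. The sketch below uses hypothetical raw scores. It also shows double-chance probabilities as simple sums of the three-way result, which is an assumption about how those markets could be derived rather than something this page states.

```python
import math


def softmax(scores):
    """Turn arbitrary per-class scores into probabilities that sum to 1."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def double_chance(p_home, p_draw, p_away):
    """Double-chance markets as sums of mutually exclusive results."""
    return {
        "home_or_draw": p_home + p_draw,
        "away_or_draw": p_away + p_draw,
    }


raw_scores = [0.9, 0.1, -0.4]  # hypothetical scores for home / draw / away
p_home, p_draw, p_away = softmax(raw_scores)
dc = double_chance(p_home, p_draw, p_away)
print(round(p_home, 3), round(p_draw, 3), round(p_away, 3), dc)
```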
All probabilities then pass through one more stage — calibration — before they reach you.
3. Why “calibrated” matters
A model can be accurate in the wrong way. If it predicts BTTS at 70% confidence and BTTS lands 60% of the time, the predictions are systematically overconfident — useless for any decision that depends on the actual probability.
We fit an isotonic regression on a temporal holdout (the model trains on pre-2024 matches, the calibrator fits on 2024+ data the model has never seen). The isotonic step is a monotonic transformation: it pulls overconfident raw probabilities back toward their honest hit rate without ever reversing their ranking, although nearby raw values can end up mapped to the same calibrated value. Every market we publish (BTTS, the three over markets, and the result classifier) gets its own calibrator.
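Isotonic regression is usually fitted with the pool-adjacent-violators algorithm, which the sketch below implements in plain Python. It is a simplified illustration rather than the production calibrator: it returns a step function (library implementations often interpolate between steps), and the holdout predictions and outcomes are made-up numbers.

```python
from bisect import bisect_right
from collections import defaultdict


def fit_isotonic(raw_probs, outcomes):
    """Pool-adjacent-violators fit; returns (starts, values) of a step function."""
    # Pool identical raw probabilities first so each x appears once.
    grouped = defaultdict(lambda: [0.0, 0])
    for x, y in zip(raw_probs, outcomes):
        grouped[x][0] += y
        grouped[x][1] += 1

    blocks = []  # each block: [sum of outcomes, count, smallest raw prob]
    for x in sorted(grouped):
        total, count = grouped[x]
        blocks.append([total, count, x])
        # Merge backwards while the previous block's mean exceeds this one's.
        while (
            len(blocks) > 1
            and blocks[-2][0] / blocks[-2][1] > blocks[-1][0] / blocks[-1][1]
        ):
            last_sum, last_count, _ = blocks.pop()
            blocks[-1][0] += last_sum
            blocks[-1][1] += last_count

    starts = [block[2] for block in blocks]
    values = [block[0] / block[1] for block in blocks]
    return starts, values


def calibrate(raw_prob, starts, values):
    """Look up the calibrated value for one raw probability."""
    index = bisect_right(starts, raw_prob) - 1
    return values[max(index, 0)]


# Hypothetical holdout: raw model probabilities and whether the bet landed.
raw = [0.55, 0.60, 0.65, 0.70, 0.75, 0.80, 0.85, 0.90]
hit = [1, 0, 1, 0, 1, 1, 0, 1]
starts, values = fit_isotonic(raw, hit)
print(starts, [round(v, 3) for v in values])
```

Because blocks are only ever merged, never reordered, the fitted values are non-decreasing in the raw probability, which is the property that preserves the ranking of predictions.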
The result: when we say 73%, fixtures at that confidence land around 73% of the time. The 30-day track record shows the gap between “model said” and “actual hit rate” per market — the smaller the gap, the more honest the calibration.
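The gap between "model said" and "actual hit rate" can be computed by grouping predictions into probability buckets and comparing the two averages in each. The sketch below assumes ten equal-width buckets and uses invented predictions. The track record page may group its figures differently.

```python
from collections import defaultdict


def reliability_table(predictions, outcomes, n_buckets=10):
    """Per-bucket mean predicted probability, observed hit rate, and the gap."""
    buckets = defaultdict(list)
    for p, landed in zip(predictions, outcomes):
        index = min(int(p * n_buckets), n_buckets - 1)  # keep p == 1.0 in the top bucket
        buckets[index].append((p, landed))

    rows = []
    for index in sorted(buckets):
        items = buckets[index]
        predicted = sum(p for p, _ in items) / len(items)
        actual = sum(landed for _, landed in items) / len(items)
        rows.append(
            {
                "bucket": index,
                "n": len(items),
                "predicted": predicted,
                "actual": actual,
                "gap": predicted - actual,  # positive means overconfident
            }
        )
    return rows


# Hypothetical published probabilities and results (1 = landed, 0 = missed).
preds = [0.72, 0.74, 0.71, 0.73, 0.55, 0.58]
results = [1, 1, 0, 1, 1, 0]
table = reliability_table(preds, results)
for row in table:
    print(row)
```

A well-calibrated market shows gaps close to zero in every bucket, given enough fixtures per bucket for the hit rate to be stable.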
4. What the model uses
Features fall into a handful of categories:
- Recent form: venue-specific results over each team's last several matches, with recency weighting so last week matters more than last September.
- Elo ratings: running skill estimate updated after every match, with home advantage and a decay factor for inactive teams.
- Rest days: gap since each side's last fixture, including cup matches.
- Head-to-head: limited to the last three years to avoid stale data.
- Match-stats aggregates: recency-weighted shot volume and expected-goals data from each team's recent venue-matched fixtures, where coverage is available.
- Standings context: league position, points, goal difference, and motivation proxies (relegation pressure, title race, dead-rubber flag).
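For readers unfamiliar with Elo, the update after a single match can be sketched as below. This is the textbook formula with a home-advantage offset. The K-factor of 20 and the 60-point home advantage are hypothetical values, and the decay for inactive teams mentioned above is left out for brevity.

```python
def expected_home_score(home_elo, away_elo, home_advantage=60.0):
    """Expected result for the home side on a 0-to-1 scale."""
    diff = away_elo - home_elo - home_advantage
    return 1.0 / (1.0 + 10 ** (diff / 400.0))


def update_elo(home_elo, away_elo, home_result, k=20.0, home_advantage=60.0):
    """home_result is 1.0 for a home win, 0.5 for a draw, 0.0 for an away win."""
    expected = expected_home_score(home_elo, away_elo, home_advantage)
    delta = k * (home_result - expected)
    # Points gained by one side are lost by the other.
    return home_elo + delta, away_elo - delta


# Hypothetical fixture: evenly rated sides, home win.
new_home, new_away = update_elo(1500.0, 1500.0, home_result=1.0)
print(round(new_home, 1), round(new_away, 1))
```

Because the home side is already expected to do better than 50% once home advantage is counted, a home win moves the ratings by less than it would on neutral ground.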
Bayesian smoothing pulls each team's priors toward the league average where samples are thin (early in a season, or after promotion), so the model doesn't overreact to small samples.
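One common way to implement this kind of smoothing is a weighted average of the team's own figure and the league average, with the league side weighted as if it were a fixed number of extra matches. The sketch below assumes that form. The prior strength of 10 matches and the goal figures are hypothetical.

```python
def smoothed_rate(team_mean, n_matches, league_mean, prior_strength=10.0):
    """Shrink a team's average toward the league average when n_matches is small."""
    weight_sum = n_matches + prior_strength
    return (n_matches * team_mean + prior_strength * league_mean) / weight_sum


# Hypothetical promoted side: 2.5 goals per game over 3 matches, league average 1.4.
early = smoothed_rate(team_mean=2.5, n_matches=3, league_mean=1.4)
late = smoothed_rate(team_mean=2.5, n_matches=30, league_mean=1.4)
print(round(early, 3), round(late, 3))
```

After three matches the estimate stays close to the league average, and after thirty it sits much nearer the team's own record, which is the behaviour the paragraph above describes.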
5. What we deliberately don't use
Some inputs sound useful but aren't worth their cost:
- Confirmed lineups: they're only published an hour before kickoff and last-minute changes happen all the time, so a lineup-aware model isn't actually usable for a prediction workflow that runs earlier in the day.
- Weather: historical coverage for our fixture set is unreliable, and the predicted effect on goals is small enough that we'd be adding noise rather than signal.
- Manager changes: the effect is real but small and the scraping cost is high.
6. Where the model is weakest
We try to be honest about this rather than paper over it.
Lopsided fixtures: The joint-Poisson independence assumption breaks down when a much stronger side comfortably shuts out a weaker one. In practice this hits women's and youth international fixtures hardest: in both, the model overpredicts BTTS by a wide margin. We exclude these competitions from public BTTS predictions for that reason.
Draws: Predicting draws is systematically harder than predicting home or away wins because no recent statistical pattern reliably points at one. Our draw probabilities are honest about their uncertainty (you'll rarely see a draw above 35%) but they're also less useful for high-confidence predictions.
Mismatched cup ties: Top-flight sides drawing third-tier opposition in early cup rounds often field rotated lineups; our public predictions down-weight these fixtures or exclude them entirely depending on the competition.
7. When we retrain
Match results, standings, and Elo ratings update daily as fixtures conclude. The full model — every Poisson regressor, the result classifier, and every calibrator — is retrained at the end of each season, and again whenever a structural data change (new feature, new league) warrants it. Calibrators are refreshed on their own cadence whenever drift opens up between predicted and actual hit rates.
8. See it in action
- Track record: every slip of the day from the last 30 days, with win rate, return, and per-market hit rate vs predicted.
- BTTS predictions: today's calibrated both-teams-to-score predictions.
- Predictions: high-confidence slate across every published market.