Rating Competitors in Games with Strength-Dependent Tie Probabilities
Pub. online: 4 December 2025
Type: Statistical Data Science
Open Access
Received: 13 June 2025
Accepted: 25 November 2025
Published: 4 December 2025
Abstract
Competitor rating systems for head-to-head games are typically used to measure playing strength from game outcomes. Ratings computed from these systems are often used to select top competitors for elite events, to pair players of similar strength in online gaming, and to let players track their own strength over time. Most implemented rating systems assume only win/loss outcomes and treat a tie as equivalent to half a win and half a loss. However, in games such as chess, the probability of a tie (draw) is demonstrably higher for stronger players than for weaker players, so rating systems that ignore this aspect of game results may produce unreliable strength estimates. We develop a new rating system for head-to-head games based on a model that explicitly allows the probability of a tie to depend on the strengths of the competitors. The approach uses a Bayesian dynamic modeling framework. Within each time period, posterior updates are computed in closed form using a single Newton-Raphson iteration evaluated at the prior mean. The approach is demonstrated on a large dataset of chess games played in International Correspondence Chess Federation tournaments.
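To make the idea of a strength-dependent tie probability concrete, the following is a minimal Python sketch under stated assumptions. It is not necessarily the model developed in the paper. It assumes a three-outcome softmax in which the log-weight of a tie is `beta0 + (1 + beta1) * (theta_i + theta_j) / 2`, where `theta_i` and `theta_j` are the players' strengths, and `outcome_probs`, `beta0`, and `beta1` are hypothetical names chosen for illustration. With `beta1 = 0` this form reduces to the classical Davidson extension of the Bradley-Terry model, in which the tie probability depends only on the difference in strengths.

```python
import math


def outcome_probs(theta_i, theta_j, beta0=0.0, beta1=0.0):
    """Return (P(i wins), P(tie), P(j wins)) under an assumed illustrative model.

    Log-weights: win -> theta_i, loss -> theta_j,
    tie -> beta0 + (1 + beta1) * mean strength.
    beta1 = 0 gives a Davidson-type model; beta1 > 0 makes ties
    more likely as the players' average strength increases.
    """
    mean_strength = 0.5 * (theta_i + theta_j)
    log_weights = [theta_i, beta0 + (1.0 + beta1) * mean_strength, theta_j]
    # Subtract the maximum before exponentiating for numerical stability.
    shift = max(log_weights)
    weights = [math.exp(lw - shift) for lw in log_weights]
    total = sum(weights)
    return tuple(w / total for w in weights)


# Two evenly matched pairs, one weak (strength 0) and one strong (strength 2).
weak_pair = outcome_probs(0.0, 0.0, beta0=0.0, beta1=0.5)
strong_pair = outcome_probs(2.0, 2.0, beta0=0.0, beta1=0.5)
print("tie probability, weak pair:  ", round(weak_pair[1], 3))
print("tie probability, strong pair:", round(strong_pair[1], 3))
```

In this sketch the two pairs are equally well matched, yet the stronger pair ties more often whenever `beta1 > 0`. A system that scores every tie as half a win and half a loss cannot represent that difference.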
Supplementary material
• Appendices.pdf: Appendices A and B.
• Code_and_Data.zip: Zip file containing the code and data needed to run the analyses in this manuscript.