Luck is Hard to Beat: The Difficulty of Sports Prediction

Preprint English OPEN
Aoki, Raquel YS ; Assuncao, Renato M ; de Melo, Pedro OS Vaz (2017)

Predicting the outcome of sports events is a hard task. We quantify this difficulty with a coefficient that measures the distance between the observed final results of sports leagues and those of idealized competitions that are perfectly balanced in terms of skill. This coefficient indicates the relative presence of luck and skill. We collected and analyzed all games from 198 sports leagues, comprising 1503 seasons from 84 countries and 4 different sports: basketball, soccer, volleyball and handball. We measured competitiveness by country and by sport. We also identify, in each season, the teams whose removal from their league would leave a completely random tournament. Surprisingly, not many of them are needed. As another contribution of this paper, we propose a probabilistic graphical model to learn about the teams' skills and to decompose the relative weights of luck and skill in each game. We break down the skill component into factors associated with the teams' characteristics. The model also allows us to estimate at 0.36 the probability that an underdog team wins in the NBA, with home advantage adding 0.09 to this probability. As shown in the first part of the paper, luck is substantially present even in the most competitive championships, which partially explains why sophisticated and complex feature-based models hardly beat simple models in the task of forecasting sports outcomes.
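The abstract does not state the formula for the coefficient, so the following is only a minimal sketch of one plausible construction, not the authors' definition. It assumes a league with no draws in which every team plays the same number of games, and it compares the observed variance of win counts with the variance expected if every game were a fair coin flip. The function name `luck_skill_coefficient` and its parameters are hypothetical.

```python
from statistics import pvariance


def luck_skill_coefficient(wins, games_per_team):
    """Hypothetical coefficient: share of the observed spread in wins
    that a pure coin-flip league would not explain."""
    if len(wins) < 2:
        raise ValueError("need at least two teams")
    observed = pvariance(wins)  # population variance of final win counts
    if observed == 0:
        raise ValueError("observed variance is zero; coefficient undefined")
    # Assumption: every game is a fair coin flip with no draws, so a
    # team's win count is Binomial(games_per_team, 0.5) with variance g/4.
    balanced = games_per_team * 0.25
    return (observed - balanced) / observed


# Toy five-team single round robin in which the stronger team always wins.
toy_wins = [4, 3, 2, 1, 0]
print(luck_skill_coefficient(toy_wins, games_per_team=4))
```

Under these assumptions, a value near 0 means the final standings look like those of a balanced, luck-driven league, while a value near 1 means the spread in results is mostly attributable to skill differences. The toy league above, where results are fully determined by skill, evaluates to 0.5 because with only four games per team a coin-flip league would already produce half of that spread.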