Biases in Probabilistic Reasoning

This research examines the winning probabilities across various scoring scenarios in tennis and reveals that the common-sense theoretical model makes accurate probability predictions at the macro level (set level) but inaccurate predictions at the micro level (game level). This suggests that intuitive probabilistic estimates of game-level outcomes are often incorrect, but that these errors may make matches more exciting and engaging for spectators.

Project Structure

📄 `paper/`

Research paper.

Abstract

In sports games, the excitement and suspense felt by the spectators are essential to their entertainment experience. The level of excitement and suspense is linked to the spectators' reasoning about the probability of winning or losing. In tennis, as in many other sports, spectators' predictions of winning probabilities largely hinge on the scores. Given tennis's hierarchical scoring system, its probabilistic reasoning is multifaceted and complex. This research examines the winning probabilities across various scoring scenarios, using data from thousands of professional tennis matches and comparing them with theoretical models generally aligned with spectators' common beliefs. The analysis reveals that the theoretical model makes accurate probability predictions at the macro level but inaccurate predictions at the micro level, pointing to possible biases in micro-level probabilistic reasoning. A recent behavioral economic theory may help explain the causes of such biases. Biases are generally seen as undesirable errors, but this study offers a counterargument that biases in micro-level probabilistic reasoning actually enhance the enjoyment of tennis matches by creating expectations, anxiety, and surprises.

🐍 `scripts/python/`

Python scripts and notebooks for analysis and modeling.

- `Tennis_Analytics.ipynb`: Loads professional tennis match data from tennisabstract.com, cleans the data, and calculates the win probability for all possible game-score and point-score combinations. The goal is to provide empirical win rates for different score combinations and compare them with the win rates generated by a theoretical model.
- `tennis_probabilities.py`: Calculates a player's game and set winning probabilities for different game and point score combinations, based on the player's service and return point winning probabilities. This method is commonly used to predict game and set winners in online sports betting. The win probabilities generated by this model are compared with the empirical win probabilities calculated from the professional match data to check whether the model makes accurate predictions.
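As an illustration of the kind of theoretical model described above, here is a minimal sketch of a game-level win probability calculation. It is not taken from `tennis_probabilities.py`; the function name `game_win_probability` and its parameters are hypothetical, and it assumes the server wins each point independently with a constant probability `p`.

```python
from functools import lru_cache


def game_win_probability(p, server_points=0, returner_points=0):
    """Probability that the server wins the game from a given point score.

    Assumption: the server wins every point independently with probability p.
    Points are counted 0, 1, 2, 3 for love, 15, 30, 40.
    """
    q = 1.0 - p
    # From deuce, D = p^2 + 2pq * D, so D = p^2 / (1 - 2pq).
    deuce = p * p / (1.0 - 2.0 * p * q)

    @lru_cache(maxsize=None)
    def win(a, b):
        if a >= 4 and a - b >= 2:
            return 1.0  # server has won the game
        if b >= 4 and b - a >= 2:
            return 0.0  # returner has won the game
        if a >= 3 and a == b:
            return deuce  # 40-40 and every later tie
        # Otherwise branch on who wins the next point.
        return p * win(a + 1, b) + q * win(a, b + 1)

    return win(server_points, returner_points)


if __name__ == "__main__":
    # Illustrative inputs only: a server who wins 60% of service points.
    print(round(game_win_probability(0.6), 4))        # from love-all
    print(round(game_win_probability(0.6, 1, 3), 4))  # from 15-40
```

The recursion mirrors the hierarchical scoring: the same idea can be repeated one level up, feeding game-win probabilities into a set-level calculation.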

📂 `data/`

Cleaned and processed match datasets.

Contains point-by-point and match-level data for both Men's (ATP) and Women's (WTA) tennis.
See `data/README.md` for detailed file descriptions.
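To show how point-by-point records could be turned into empirical win rates per score, here is a small sketch. The record layout `(server_points, returner_points, server_won_game)` and the function name `empirical_game_win_rates` are assumptions made for illustration; the actual column names and file formats are described in the data README, and the sample records below are made up.

```python
from collections import defaultdict


def empirical_game_win_rates(records):
    """Share of games the server went on to win, grouped by point score.

    records: iterable of (server_points, returner_points, server_won_game)
    tuples. This layout is hypothetical, not the repository's schema.
    """
    tallies = defaultdict(lambda: [0, 0])  # score -> [wins, observations]
    for server_points, returner_points, server_won_game in records:
        tally = tallies[(server_points, returner_points)]
        tally[0] += int(server_won_game)
        tally[1] += 1
    return {score: wins / total for score, (wins, total) in tallies.items()}


if __name__ == "__main__":
    # Made-up records for illustration only, not real match data.
    sample = [
        (0, 0, True),
        (0, 0, False),
        (3, 0, True),
        (0, 3, False),
    ]
    for score, rate in sorted(empirical_game_win_rates(sample).items()):
        print(score, rate)
```

Rates computed this way can then be set side by side with model predictions for the same scores.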

📊 `results/`

Output files and verification benchmarks.

Contains CSV comparisons of win probabilities used to verify the Python model against theoretical baselines.
See `results/README.md` for details.
