A Data Scientist's Ping Pong: Exploring Win Probabilities and Time Series Elo Ranking in Python

My freshman year, I had the pleasure of having both a roommate who was an excellent ping pong player and a table walking distance from our room. We played a couple times per week, all year. Very college.

Three years later, a rather unlikely group of friends (a Business student going to Google, a Mechanical Engineer off to Gilette, and an Industrial Engineer doing banking in the DR) and I started to play every week, recording our scores each time. The goal? To prove, over time, who was the best player. In the past, I would have used R for this mini-project. The elo and PlayerRatings packages would have made for quick implementation and visualization. However, I self-imposed an R restriction as my internship this summer exclusively will be using python.

So, here is an implementation of the Elo ranking mechanism from scratch. An Elo score is for measuring relative skill, even when players have not directly competed. I chose Elo over other ranking algorithms for a couple reasons. One player, Pedro, hadn't played nearly as often as everyone else and the algorithm needed to be able to handle variability in number of games played, and I wanted to implement it without the use of libraries. Elo is the perfect middle ground between being robust and easy to understand. The other popular option would be Glicko, which you can find more about here.

Data Import

I decided to log the data using a google sheet on my personal account because I needed to be able to edit from anywhere and I wanted to let the other players view and edit the sheet as well. Google sheets provides the level of accountability and flexibility necessary for 4 competitive college students to work with.

In [11]:
#importing all (maybe most of) the libraries we'll need
import math
import pandas as pd
import gspread
from oauth2client.service_account import ServiceAccountCredentials
In [12]:
#bring in data from the google sheet
scope = ['https://spreadsheets.google.com/feeds',
         'https://www.googleapis.com/auth/drive']

#establish credentials
credentials = ServiceAccountCredentials.from_json_keyfile_name('service_keys.json', scope)
gc = gspread.authorize(credentials)

#grab worksheet
sh = gc.open('Ping Pong')
worksheet = sh.get_worksheet(0)
worksheet_values = worksheet.get_all_values()
header_values = worksheet_values.pop(0)
#set dataframe
Ping_pong_data = pd.DataFrame(worksheet_values, columns = header_values)

Elo Rating Implementation

The math behind the Elo algorithm can be found here.

In [13]:
# Function to calculate the Probability of player 2 beating player 1
def Probability(rating1, rating2):

    return 1.0 * 1.0 / (1 + 1.0 * math.pow(10, 1.0 * (rating1 - rating2) / 400))
In [14]:
def EloRating(Ra, Rb, K, d):


    # To calculate the Winning
    # Probability of Player B
    Pb = Probability(Ra, Rb)

    # To calculate the Winning
    # Probability of Player A
    Pa = Probability(Rb, Ra)

    # Case -1 When Player A wins
    # Updating the Elo Ratings
    if (d == "TRUE") :
        Ra = Ra + K * (1 - Pa)
        Rb = Rb + K * (0 - Pb)


    # Case -2 When Player B wins
    # Updating the Elo Ratings
    else :
        Ra = Ra + K * (0 - Pa)
        Rb = Rb + K * (1 - Pb)

    return([Ra,Rb])

Now that we have the mathematical implementation, we can record the elo score changes for each of the 65 matches played over the semester.
Because of the way the data was set up, and because there were only 4 people playing in total, I found it a bit easier to do a bit of copy and pasting to get the values I wanted
If I was going to do this again, I would try to find a more clever solution. To be fair, almost any solution would be a more clever solution.

In [15]:
#set up dataframe

#all elo scores start at 1000
Tyler_elo = 1000
Juan_elo = 1000
Kent_elo = 1000
Pedro_elo = 1000
#initialize a history dataframe to help visualization in the future
Elo_history = pd.DataFrame(columns = ("Tyler", "Juan", "Kent", "Pedro", "Date"))
for i in range(0, len(Ping_pong_data)):
    if Ping_pong_data.iloc[i]['Player 1'] == "Tyler":
        if Ping_pong_data.iloc[i]['Player 2'] == "Juan":
            new_ratings = EloRating(Tyler_elo, Juan_elo, 30, Ping_pong_data.iloc[i]['Won_p1'])
            Tyler_elo = new_ratings[0]
            Juan_elo = new_ratings[1]
        if Ping_pong_data.iloc[i]['Player 2'] == "Kent":
            new_ratings = EloRating(Tyler_elo, Kent_elo, 30, Ping_pong_data.iloc[i]['Won_p1'])
            Tyler_elo = new_ratings[0]
            Kent_elo = new_ratings[1]
        if Ping_pong_data.iloc[i]['Player 2'] == "Pedro":
            new_ratings = EloRating(Tyler_elo, Pedro_elo, 30, Ping_pong_data.iloc[i]['Won_p1'])
            Tyler_elo = new_ratings[0]
            Pedro_elo = new_ratings[1]
    if Ping_pong_data.iloc[i]['Player 1'] == "Juan":
        if Ping_pong_data.iloc[i]['Player 2'] == "Tyler":
            new_ratings = EloRating(Juan_elo, Tyler_elo, 30, Ping_pong_data.iloc[i]['Won_p1'])
            Juan_elo = new_ratings[0]
            Tyler_elo = new_ratings[1]
        if Ping_pong_data.iloc[i]['Player 2'] == "Kent":
            new_ratings = EloRating(Juan_elo, Kent_elo, 30, Ping_pong_data.iloc[i]['Won_p1'])
            Juan_elo = new_ratings[0]
            Kent_elo = new_ratings[1]
        if Ping_pong_data.iloc[i]['Player 2'] == "Pedro":
            new_ratings = EloRating(Juan_elo, Pedro_elo, 30, Ping_pong_data.iloc[i]['Won_p1'])
            Juan_elo = new_ratings[0]
            Pedro_elo = new_ratings[1]
    if Ping_pong_data.iloc[i]['Player 1'] == "Kent":
        if Ping_pong_data.iloc[i]['Player 2'] == "Juan":
            new_ratings = EloRating(Kent_elo, Juan_elo, 30, Ping_pong_data.iloc[i]['Won_p1'])
            Kent_elo = new_ratings[0]
            Juan_elo = new_ratings[1]
        if Ping_pong_data.iloc[i]['Player 2'] == "Tyler":
            new_ratings = EloRating(Kent_elo, Tyler_elo, 30, Ping_pong_data.iloc[i]['Won_p1'])
            Kent_elo = new_ratings[0]
            Tyler_elo = new_ratings[1]
        if Ping_pong_data.iloc[i]['Player 2'] == "Pedro":
            new_ratings = EloRating(Kent_elo, Pedro_elo, 30, Ping_pong_data.iloc[i]['Won_p1'])
            Kent_elo = new_ratings[0]
            Pedro_elo = new_ratings[1]
    if Ping_pong_data.iloc[i]['Player 1'] == "Pedro":
        if Ping_pong_data.iloc[i]['Player 2'] == "Tyler":
            new_ratings = EloRating(Pedro_elo, Tyler_elo, 30, Ping_pong_data.iloc[i]['Won_p1'])
            Pedro_elo = new_ratings[0]
            Tyler_elo = new_ratings[1]
        if Ping_pong_data.iloc[i]['Player 2'] == "Kent":
            new_ratings = EloRating(Pedro_elo, Kent_elo, 30, Ping_pong_data.iloc[i]['Won_p1'])
            Pedro_elo = new_ratings[0]
            Kent_elo = new_ratings[1]
        if Ping_pong_data.iloc[i]['Player 2'] == "Juan":
            new_ratings = EloRating(Pedro_elo, Juan_elo, 30, Ping_pong_data.iloc[i]['Won_p1'])
            Pedro_elo = new_ratings[0]
            Juan_elo = new_ratings[1]

    #add match to history dataframe
    new_row = pd.DataFrame({'Tyler':[Tyler_elo], 'Juan':[Juan_elo], 'Kent':[Kent_elo], 'Pedro':[Pedro_elo], 'Date':[Ping_pong_data.iloc[i]['Date']]})
    Elo_history = Elo_history.append(new_row)
In [16]:
#This leaves us with the dataframe
Elo_history.head()
Out[16]:
Date Juan Kent Pedro Tyler
0 2/13/2018 985.000000 1000 1000 1015.000000
0 2/13/2018 1000.647200 984.353 1000 1015.000000
0 2/13/2018 1000.647200 970.673 1000 1028.680275
0 2/13/2018 1016.854866 970.673 1000 1012.472610
0 2/13/2018 1016.854866 957.469 1000 1025.676612

Visualization!!!

Now that we have the Elo history,

In [18]:
import matplotlib.pyplot as plt
%pylab inline

#remove the date column and put it in the index instead, this makes graphing easier (or so I found out after a half hour of frustration)
Elo_history.index = Elo_history['Date']
Elo_history = Elo_history.drop(['Date'], axis = 1);
Populating the interactive namespace from numpy and matplotlib
In [19]:
plt.style.use('fivethirtyeight')
plt.figure()
Elo_history.plot()
plt.legend(loc='best')
plt.xticks(rotation=90)
plt.legend(loc='upper center', bbox_to_anchor=(0.5, 1.2),ncol=4, fancybox=True, shadow=True); plt.ylabel('Elo Score')
Out[19]:
Text(0,0.5,'Elo Score')
<matplotlib.figure.Figure at 0x2751e4db4e0>

As we can see, Pedro didn't play often enough with us to get a good Elo estimation. Throughout most of the semester, Juan and I went back and forth on who had the highest rating. At the very end of the semester, in the last two weeks, I ended up winning quite a few games against Kent and pulled out the highest Elo score recorded.

Also, we can also calculate the probabilities of victory for each potential matchup for the final Elo scores.

In [9]:
player_probs = {"Tyler vs Kent":[Probability(Kent_elo, Tyler_elo)], "Tyler vs Juan":[Probability(Juan_elo, Tyler_elo)], "Tyler vs Pedro":[Probability(Pedro_elo, Tyler_elo)]\
                ,"Kent vs Juan":[Probability(Juan_elo, Kent_elo)], "Kent vs Pedro":[Probability(Pedro_elo, Kent_elo)], "Juan vs Pedro":[Probability(Pedro_elo, Juan_elo)]}
pd.DataFrame(player_probs, index = ["Winning Chance"]).transpose()
Out[9]:
Winning Chance
Juan vs Pedro 0.641594
Kent vs Juan 0.374683
Kent vs Pedro 0.517521
Tyler vs Juan 0.603075
Tyler vs Kent 0.717171
Tyler vs Pedro 0.731173

From this, it looks like a Kent vs Pedro matchup would be the closest to watch. The last time the two played, Kent ended up squeaking out a win 21-19 which (probably) validates the percent we get.

Issues with this approach

Whenever I finish a project like this, I always want to look back and glance at directions I didn't take, but wanted to. The main problem with this effort was that even though the score differential was recorded, the Elo algorithm only cares about who wins. In the future, I would love to write a mathematical alteration to Elo that is specific for Ping Pong's score distributions. But today is not that day.
Another issue is the parameterization of the Elo model, specifically the k parameter. In this case, I used the k value of 30, which means that the scores are highly sensitive to events. A k value of 30 is very common for new players or seasons where there are only 20 or so games (the English Premier League is a perfect example of this). However, there is no good reason for using 30, and I would like to explore alternate values.