Plotting Passes in Python

Before I get into today’s article, I want to give a shout out to Raven Beale (@sbourgenforcer). Raven is my coding and writing partner, and co-creator of @GoalCharts and @Player_Elo. Raven and I learned Python together, and without his help I wouldn’t have learned as much as I have. Having someone to work on a project with while you are learning is invaluable. Your project partner will learn things you have not, and vice versa, and you always have someone looking at your code, showing how it could be improved or learning from what you have written. It’s a great motivator to improve and I’d recommend it to anyone who is looking to work on their coding skills.

Last time, we wrapped our pitch code into a function and learned how to plot shots using xy data and expected goals. We’ll be using plt.scatter again today to show pass origin, so if you haven’t, read that post first.

Setting up the pitch

As we learned last time, we can use draw_pitch to draw a pitch on our plotting space. I’m feeling a little bored, so let’s randomise our hex codes and see what kind of pitch style we get. We have a couple of options here. We can read in the data and append it to a list, or use pandas to import it as a dataframe.

    import csv

    hexCodes = []
    # The file is tab-separated, so the delimiter is '\t'
    with open('hexcolors.csv', 'r', encoding='utf-8') as csvfile:
        reader = csv.reader(csvfile, delimiter='\t')
        for row in reader:
            hexCodes.append(row)
    # No close() needed: the with block closes the file for us
    print(hexCodes)

OR

    import pandas as pd

    hex_df = pd.read_csv("hexcolors.csv", delimiter="\t")
    print(hex_df)

As list

Here we have a list of lists, with the CSV column names (the header row) in the first position. If we want to continue with this option, we can drop the header row by slicing:

    hexCodes = hexCodes[1:]

Now we can find a value by referencing its position in the list.

    hexCodes[0][1]

The 0 is the row position in the outer list. On its own, hexCodes[0] would return the following: ['indian red', '#B0171F', '176', '23', '31']

The 1 refers to the position within that row. Together they return our hex code value: '#B0171F'

As dataframe

Here we have a pandas dataframe. If we want to get the first hex code, all we need to do is find it in the dataframe. There are a few different ways we can do this, but for now let’s just use its index number.

    colour = hex_df.Hex.loc[0] # This returns the value '#B0171F'
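
If you want to try both lookups without the file to hand, here is a minimal sketch that reads a small in-memory sample instead. The sample text and the R, G and B column names are my own stand-ins; only the 'Colour Name' and 'Hex' columns and the indian red row come from the data above.

```python
import csv
import io

import pandas as pd

# Hypothetical stand-in for hexcolors.csv: tab-separated, header row first.
SAMPLE = (
    "Colour Name\tHex\tR\tG\tB\n"
    "indian red\t#B0171F\t176\t23\t31\n"
    "crimson\t#DC143C\t220\t20\t60\n"
)

# Option 1: csv.reader gives a list of rows; slice off the header row.
rows = list(csv.reader(io.StringIO(SAMPLE), delimiter="\t"))[1:]
first_hex_from_list = rows[0][1]

# Option 2: pandas keeps the header as column names.
sample_df = pd.read_csv(io.StringIO(SAMPLE), delimiter="\t")
first_hex_from_df = sample_df.Hex.loc[0]

print(first_hex_from_list, first_hex_from_df)
```

Note that the list version keeps every value as a string, while pandas converts the numeric columns for you.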

Lists and pandas are really outside the scope of this article. However, I am working on a Python course that will cover these topics, so keep an eye out for that if you are looking to learn more. Anyway, back to randomising the hex codes. I will stick with our dataframe, but I don’t want just any colour, I want a shade of grey:

    import random
    import matplotlib.pyplot as plt

    new_hex_df = hex_df[(hex_df['Colour Name'].str.contains("gray"))|
                 (hex_df['Colour Name'].str.contains("grey"))]

    p_colour = new_hex_df.Hex.iloc[random.randint(0,len(new_hex_df)-1)]
    l_colour = hex_df.Hex.iloc[random.randint(0,len(hex_df)-1)]
    draw_pitch(p_colour,l_colour,"h","full")

That looks pretty cool. With a dark pitch and blue lines, though, we will need to be conscious of our palette choice for our passes and colourbars.
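
Going back to the grey filter for a moment, here is a small sketch of the same idea on made-up data. The four colours are hypothetical sample rows, and using one regular expression with random.choice is just an alternative I find tidier, not a change to the approach above.

```python
import random

import pandas as pd

# Hypothetical sample rows standing in for the full colour table.
colours = pd.DataFrame({
    "Colour Name": ["indian red", "slate gray", "dim grey", "cobalt"],
    "Hex": ["#B0171F", "#708090", "#696969", "#3D59AB"],
})

# One pattern covers both spellings: "gray" and "grey".
greys = colours[colours["Colour Name"].str.contains("gr[ae]y", case=False)]

random.seed(0)  # only so the example is repeatable
pitch_colour = random.choice(greys["Hex"].tolist())    # always a grey
line_colour = random.choice(colours["Hex"].tolist())   # any colour at all
print(pitch_colour, line_colour)
```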

I’m going to look at two scenarios today. First we’ll take a look at a match, then we’ll look at a player for the 18/19 season. Again, I will use random to decide both.

    data = df[df['gsm_id'] == gsm_list[random.randint(0,len(gsm_list)-1)]]
    home = data.homeTeam.iloc[0]
    away = data.awayTeam.iloc[0]
    date = data.Date.iloc[0]  # store the date separately so data stays a dataframe
    hG = data.hG.iloc[-1]
    aG = data.aG.iloc[-1]
    print("Your match is {} vs {}, which was played on the {}. The match finished {}-{}.".format(home,away,date,hG,aG))
    # Output: Your match is Man Utd vs Arsenal, which was played on the 05-12-2018. The match finished 2-2.

We filter our dataframe to show only passes. For now we are only interested in passes made in open play, so we filter out all corners, free kicks, goal kicks, and throw-ins. I have created an extra column – isOpenPlay – that returns a 1 if the pass was made in open play, or 0 otherwise. Below are the pass locations for both teams (both shown left to right). Before looking in further detail, we can already see that a lot of play was concentrated on the left wing and that the teams found it difficult to move the ball centrally in the final third.

    passData = data[(data['Action'] == "Pass")&(data['isOpenPlay'] == 1)]

    draw_pitch(p_colour,l_colour,"h","full")

    x = passData.xM.values
    y = passData.yM.values

    zo = 12 #so we don't forget it later
    plt.scatter(x,y,color="red",edgecolors="black",zorder=zo,alpha=1)

Plotting passes

Using plt.plot, we can add our passes to our scatter plot above.
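
It helps to know what plt.plot does when it is handed arrays of start and end points. When x and y are two-dimensional, matplotlib draws a separate line for every column, so one call can draw every pass. A minimal sketch with made-up coordinates (the numbers are hypothetical, not match data):

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np

# Three hypothetical passes: start points (x, y) and end points (xe, ye).
x = np.array([10.0, 20.0, 30.0])
y = np.array([5.0, 15.0, 25.0])
xe = np.array([40.0, 50.0, 60.0])
ye = np.array([8.0, 18.0, 28.0])

fig, ax = plt.subplots()
# [x, xe] becomes a 2 x 3 array: row 0 holds the starts, row 1 the ends.
# Each column is drawn as its own line, so this returns three Line2D objects.
lines = ax.plot([x, xe], [y, ye], color="black")
print(len(lines))
```

The same call shape is applied to the real pass data: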

    draw_pitch(p_colour,l_colour,"h","full")
    x = passData.xM.values
    y = passData.yM.values
    xe = passData.endxM.values
    ye = passData.endyM.values
    plt.scatter(x,y,color="red",edgecolors="black",zorder=zo,alpha=1)
    plt.plot([x,xe],[y,ye],zorder=11,alpha=1,color="black")
    plt.show()

So this looks terrible. All we have done here is draw a line between each pass’s start and end location and colour every line black. We have zero context. Let’s limit our focus a little. I want to separate each team and look at their passing in the final third. Again, this is easy to do with pandas.

    mnu = passData[(passData['teamName'] == home)&(passData['xM'] >= 69.33)]
    ars = passData[(passData['teamName'] == away)&(passData['xM'] >= 69.33)]
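
Here is the same two-step filter on a tiny made-up dataframe, in case you want to check the boolean logic in isolation. The rows are hypothetical; the column names and the 69.33 cut-off are the ones used above.

```python
import pandas as pd

# Hypothetical events; only the column names mirror the real dataset.
events = pd.DataFrame({
    "teamName": ["Man Utd", "Man Utd", "Arsenal", "Arsenal"],
    "Action": ["Pass", "Pass", "Pass", "Shot"],
    "isOpenPlay": [1, 0, 1, 1],
    "xM": [80.0, 75.0, 70.0, 90.0],
})

FINAL_THIRD_X = 69.33  # cut-off copied from the filter above

# Step 1: keep open-play passes only.
open_play = events[(events["Action"] == "Pass") & (events["isOpenPlay"] == 1)]

# Step 2: split by team and keep passes that start in the final third.
in_final_third = open_play["xM"] >= FINAL_THIRD_X
home_passes = open_play[(open_play["teamName"] == "Man Utd") & in_final_third]
away_passes = open_play[(open_play["teamName"] == "Arsenal") & in_final_third]
print(len(home_passes), len(away_passes))
```

Each condition needs its own brackets because & binds more tightly than == in Python.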
 

Using Manchester United first, we can separate and plot passes that were successful and unsuccessful.

 

    mnu0 = mnu[mnu['successful'] == 0]
    mnu1 = mnu[mnu['successful'] == 1]

    draw_pitch(p_colour,l_colour,"h","full")

    pitch = p_colour
    xs = mnu1.xM.values
    ys = mnu1.yM.values
    xes = mnu1.endxM.values
    yes = mnu1.endyM.values
    xu = mnu0.xM.values
    yu = mnu0.yM.values
    xeu = mnu0.endxM.values
    yeu = mnu0.endyM.values

    plt.scatter(xs,ys,color="black",edgecolors="black",zorder=zo+1,alpha=1)
    plt.plot([xs,xes],[ys,yes],zorder=zo,alpha=1,color="green")

    plt.scatter(xu,yu,color=pitch,edgecolors="black",zorder=zo+1,alpha=1) 
    plt.plot([xu,xeu],[yu,yeu],zorder=zo,alpha=1,color="red")
    plt.title(str(home)+" final third open play passes",fontsize=18)
    plt.show()

Successful passes are filled black and their lines are coloured green. Unsuccessful passes are filled with the same colour as the pitch so they appear empty, and their lines are coloured red. That looks pretty cool. Let’s add a legend to highlight this. If we simply added a label to the plt.plot calls above, we would end up with a key for every pass. We don’t want that, so instead we can plot two dummy lines and hide them behind the pitch:

    # One line per call, so each label appears once in the legend
    plt.plot([52,34],[0,0],color="green",label="successful",zorder=0)
    plt.plot([52,34],[0,0],color="red",label="unsuccessful",zorder=0)
    leg = plt.legend(loc=8,ncol=2,frameon=False)
    plt.setp(leg.get_texts(), color='w')

Legends can be tricky. As you can see above, I added a label to each of our dummy lines. I then assign the legend to a variable, set the location to bottom-centre (8), set the number of columns to 2 so our legend is horizontal rather than stacked, and turn off the frame. I could call plt.legend without keeping the result, but then I would have no handle on the text displayed in the legend. In plt.setp(), I get the text objects from the legend and set their colour to white (‘w’).
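
If hiding lines behind the pitch feels like a hack, matplotlib also lets you pass handles to the legend directly. This is a sketch of that alternative, not what I used for the chart above: the Line2D objects are never added to the plot and exist only to describe the legend entries.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
from matplotlib.lines import Line2D

fig, ax = plt.subplots()

# Empty lines that carry only a colour and a label for the legend.
handles = [
    Line2D([], [], color="green", label="successful"),
    Line2D([], [], color="red", label="unsuccessful"),
]

# "lower center" is the named equivalent of loc=8.
leg = ax.legend(handles=handles, loc="lower center", ncol=2, frameon=False)
plt.setp(leg.get_texts(), color="w")

labels = [text.get_text() for text in leg.get_texts()]
print(labels)
```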

I’m actually pretty happy with this but we can take it a step further.

Colouring plot lines by value & adding colorbars

As I said at the start, because of the colour of our pitch and lines we need to be mindful of our colour scheme. Green and red seem like they would work because we instinctively consider red to be wrong/bad and green to be correct/good. But the green blends into our background too much and it’s difficult to distinguish individual passes. This also lumps passes into two categories but says nothing about the quality of the pass. There is also another issue, and this one is more important: audience. While only a small percentage of people are strongly affected by it, some people who are colour blind will struggle to distinguish between red and green.

To deal with this we will colour our passes on a scale and layer them to add some depth and give importance to higher quality passes. 

For this part let’s jump back into our main dataset to pull out a random attacking player.

    player_df = df[(df['playerId'] == pIds[random.randint(0,len(pIds)-1)])&
                   (df['isPass'] == 1)&(df['isOpenPlay'] == 1)]

Our random pick gives us AC Milan’s Spanish right winger, Suso. Again, we are just looking at passes in the final third. As I haven’t posted my pass model yet, we will value the passes using the expected goals model we made. This will give an expected assist value to each pass as if a shot was taken from the point the ball was received. This is wrong, but it will do for example purposes.

First we need to filter the passes and sort them by lowest to highest value.

    plr = player_df[player_df['xM'] >= 69.33]
    # Sort by the column we colour and layer by (xA); reassigning avoids inplace on a slice
    plr = plr.sort_values('xA',ascending=True)

We want to share the fig and ax from our draw_pitch function, so at the end of draw_pitch(), which we made earlier, add

    return fig,ax

Now we can draw the pitch and use the ax for our colourbar:

    fig, ax = draw_pitch(p_colour, l_colour, "v", "half")

Once again we will declare our x, y, endx, and endy variables.

    x = plr.xM.values
    y = 68 - plr.yM.values # reversed for vertical plot
    xe = plr.endxM.values
    ye = 68 - plr.endyM.values #reversed for vertical plot
    z = plr.xA.values

Now we have to set up our color array for our colourbar.

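    import matplotlib.colors
    import matplotlib.colorbar

    cmap = plt.get_cmap('plasma')  # newer Matplotlib releases no longer provide matplotlib.cm.get_cmap
    norm_range = matplotlib.colors.Normalize(vmin=0, vmax=0.10)
    c_vals = [cmap(norm_range(value)) for value in z]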
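
To see what the normalisation is doing, here is a quick check using the same 0 to 0.10 range. The three input values are made up for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
from matplotlib.colors import Normalize

cmap = plt.get_cmap("plasma")
norm = Normalize(vmin=0, vmax=0.10)

# Normalize rescales linearly so that vmin maps to 0 and vmax maps to 1.
halfway = float(norm(0.05))  # the middle of the range
beyond = float(norm(0.25))   # above vmax, so the result is greater than 1

# The colormap gives anything above 1 the same colour as the top of the scale.
colour_at_max = cmap(norm(0.10))
colour_beyond = cmap(norm(0.25))
print(halfway, beyond, colour_at_max == colour_beyond)
```

So a pass worth 0.25 and a pass worth 0.10 end up the same colour, which is why the choice of vmax matters.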

Be careful with vmin and vmax in the Normalize call. vmin is the lowest value on our colour scale and vmax is the highest. Values above vmax will take the highest colour value from our palette (plasma). If this was for shots/expected goals we should increase the maximum, but be sure your values are uniform across players and don’t just take the maximum value of the player you are looking at. The next step is to run through each pass and assign a colour to the line and marker. You’ll notice we use i + zo to layer the passes so those with the higher “xA” value are shown on top.

    i = 0
    for i in range(len(plr)):
        plt.plot([y[i],ye[i]],
                 [x[i],xe[i]],zorder=i+zo,color=c_vals[i])
        plt.scatter(y[i], x[i],zorder=i+zo+1,
                    color=c_vals[i],edgecolor='white',lw=0.25)
    

This is a short for loop which runs through our arrays and gets each value at position i. plt.plot adds the lines to the viz and plt.scatter adds our markers. The lw at the end controls the line width of the marker edge. The higher the value, the thicker the line will be.
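
Here is a stripped-down, runnable version of that loop on three made-up passes, so you can see the layering without the full dataset. The coordinates and xA values are hypothetical, and a plain matplotlib axis stands in for the pitch.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np
from matplotlib.colors import Normalize

# Hypothetical passes, already sorted from lowest to highest value.
z = np.array([0.01, 0.04, 0.09])
x, xe = np.array([70.0, 75.0, 80.0]), np.array([85.0, 90.0, 95.0])
y, ye = np.array([10.0, 30.0, 50.0]), np.array([20.0, 34.0, 40.0])

zo = 12
cmap = plt.get_cmap("plasma")
norm = Normalize(vmin=0, vmax=0.10)
c_vals = [cmap(norm(value)) for value in z]

fig, ax = plt.subplots()
line_orders = []
for i in range(len(z)):
    # Later (higher value) passes get a higher zorder, so they sit on top.
    (line,) = ax.plot([y[i], ye[i]], [x[i], xe[i]],
                      zorder=i + zo, color=c_vals[i])
    ax.scatter(y[i], x[i], zorder=i + zo + 1,
               color=c_vals[i], edgecolor="white", lw=0.25)
    line_orders.append(line.get_zorder())
print(line_orders)
```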

We can now make a dummy scatter to use as our input for the colourbar.

    plot = plt.scatter(y[i], x[i],zorder=0,color=c_vals[i])

zorder is set to 0 to hide it behind our pitch.

    cax, _ = matplotlib.colorbar.make_axes(ax)
    cbar = matplotlib.colorbar.ColorbarBase(cax, cmap=cmap,
                                        norm=norm_range)

    plt.show()
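
As an aside, matplotlib can also build the colourbar from a ScalarMappable, which carries the colormap and the normalisation without needing any plotted data. This is a sketch of that alternative, not the method used above, and the "xA" label is my own choice.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
from matplotlib.cm import ScalarMappable
from matplotlib.colors import Normalize

fig, ax = plt.subplots()

cmap = plt.get_cmap("plasma")
norm = Normalize(vmin=0, vmax=0.10)

# The mappable only describes how values turn into colours.
mappable = ScalarMappable(norm=norm, cmap=cmap)

# fig.colorbar steals a little space from ax and draws the bar beside it.
cbar = fig.colorbar(mappable, ax=ax)
cbar.set_label("xA")
print(len(fig.axes))
```

Either way, the important part is that the colourbar uses the same cmap and norm as the passes, otherwise the scale will not match the colours on the pitch.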

I will be revisiting these charts in my next post on the pass model I’m developing. Until then, I’m happy to take requests for tutorials or articles. You can ask me through my contact page or on Twitter.
