
Getting started in R with StatsBomb Data

As always, I should caveat that I'm not an expert in either football or programming. I started learning R in December and have gradually reached a 'mildly competent' level.

This will go through installing R, loading the StatsBomb data, then plotting a pass map - something like this:



Anyway, away we go.

Thing number 1 - install R. There are two things to install: base R and RStudio.

You can download base R from CRAN (https://cran.r-project.org/) and RStudio here:
https://rstudio.com/products/rstudio/download/

The first 3 minutes of the video below show the process:
https://www.youtube.com/watch?v=BuaTLZyg0xs&list=PL6cDc8Xxld162nSsZ14bQnFn1cYStsrtk&index=2&t=0s

Hopefully R is now installed. Open RStudio and you should be greeted with something like this:



Press the arrow areas to reveal:



Under the 'Packages' tab select 'Install', search for 'devtools' and install the package. Repeat the step, this time searching for 'tidyverse'.
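
If you prefer typing to clicking, the same two installs can be done from the console. This is only a minimal sketch: the CRAN mirror URL is my choice (any mirror works), and the loop skips anything you already have.

```r
# Install devtools and tidyverse from CRAN, but only if they are missing
for (pkg in c("devtools", "tidyverse")) {
  if (!requireNamespace(pkg, quietly = TRUE)) {
    install.packages(pkg, repos = "https://cloud.r-project.org")
  }
}
```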

Next, install the StatsBomb package (StatsBombR) and the FCrSTATS pitch package (SBpitch) from GitHub:

devtools::install_github("statsbomb/StatsBombR")
devtools::install_github("FCrSTATS/SBpitch")




If you check the 'Packages' tab, 'SBpitch' and 'StatsBombR' should now be listed.

Now the fun stuff starts...

Load in the packages 'StatsBombR', 'tidyverse' and 'SBpitch'.
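
In code, loading the three packages is one library() call each. This assumes the installs above completed without errors.

```r
library(StatsBombR)  # StatsBomb data functions, e.g. FreeCompetitions()
library(tidyverse)   # the %>% pipe, filter() and the ggplot2 plotting functions
library(SBpitch)     # create_Pitch()
```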


Once this is typed, hit 'Run' to execute it. You do this after each section of code.

Now to start loading in the data...

Input:

Comp<-FreeCompetitions()

Again, hit 'Run' after typing the 'Comp' line and 'Comp' should appear in the Environment window on the right.

This loads all the available StatsBomb competitions. By clicking 'Comp' you can view the competitions including the Messi data (competition_id = 11) and the FAWSL data (competition_id = 37, season_id = 42). We will use the FAWSL 2019/2020 data.


We now want to filter that competition data by specifying the 'competition_id' and 'season_name':

Comp <- FreeCompetitions() %>% filter(competition_id == 37, season_name == "2019/2020")

Once again, if you click 'Comp' on the right it should now display only the FAWSL 2019/20 season.

We now want to load all the FAWSL matches for that season:

Matches<-FreeMatches(Comp)

Click 'Matches' on the right and have a look around the match data:
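
The match data has a lot of columns, so it can help to pull out just the ones that identify a game. The sketch below is how I would look up a match_id; the two team-name column names are what I expect from the free data rather than something guaranteed, so any_of() is used to skip any that are missing. 'match_list' is just a name I made up.

```r
library(StatsBombR)
library(tidyverse)

# Same two loading steps as above
Comp <- FreeCompetitions() %>%
  filter(competition_id == 37, season_name == "2019/2020")
Matches <- FreeMatches(Comp)

# Keep only the identifying columns; any_of() ignores names that are not present
match_list <- Matches %>%
  select(any_of(c("match_id", "match_date",
                  "home_team.home_team_name",
                  "away_team.away_team_name")))

head(match_list)
```

Scrolling through 'match_list' is how you would find the match_id for a game you want to plot.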


Next, to load the free event data associated with the Matches:

StatsBombData <- StatsBombFreeEvents(MatchesDF = Matches, Parallel = TRUE)

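Click StatsBombData on the right and have a look. Lots of scary numbers, but StatsBomb have created an 'allclean' function that makes it less scary. Among other things, it splits the pitch locations into separate x and y columns (such as location.x and location.y), which we will use for plotting:

StatsBombData <- allclean(StatsBombData)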

At this point, your screen should look like the following:


Now to filter the above for a single match:

d1<-StatsBombData%>%
  filter(match_id == 2275096, type.name == "Pass", team.name == "Arsenal WFC")

This essentially reads as:

 'take StatsBombData, filter it (%>% filter) down to the match (match_id == 2275096), to passes (type.name == "Pass") and to Arsenal (team.name == "Arsenal WFC"), then assign (<-) the result to d1'.
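
If the pipe and the arrow still feel odd, it can help to try them on something tiny. Here is a minimal sketch using a made-up four-row table ('events_demo' and its values are invented for illustration, they are not StatsBomb data):

```r
library(dplyr)

# Made-up stand-in for StatsBombData: four events
events_demo <- data.frame(
  match_id  = c(1, 1, 1, 2),
  type.name = c("Pass", "Shot", "Pass", "Pass"),
  team.name = c("Team A", "Team A", "Team B", "Team A")
)

# Filter events_demo, then store whatever is left in passes_demo
passes_demo <- events_demo %>%
  filter(match_id == 1, type.name == "Pass", team.name == "Team A")

passes_demo
```

Only the first row meets all three conditions, so 'passes_demo' ends up with a single row. Each comma inside filter() works as an "and".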

Click d1 on the right and you should have all the Arsenal WFC passes from Arsenal WFC vs West Ham United LFC.

Yay.

Now to plot!

Let's get a pitch loaded. With the FCrSTATS 'SBpitch' package this is easy. Add:

create_Pitch()


Great - now to add the passes. The best way to think of this is simply as an elaborate scatter plot where you add layers, joining each new layer to the previous one with a '+'. We have the base layer with the pitch, now to add where passes occurred:

geom_point(data = d1, aes(x = location.x, y = location.y))

This reads as:

 "create a point (geom_point) using the filtered match data (data=d1) and plot the x and y coordinate (aes(x = location.x, y = location.y))"
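
To show how a layer sits on top of a base, here is a small sketch with three made-up passes ('passes_demo' is invented, but uses the same column names as d1). It uses the SBpitch pitch if you have it installed and falls back to a blank ggplot canvas if not; that fallback is my addition so the example runs either way.

```r
library(ggplot2)

# Made-up stand-in for d1: three passes with start and end coordinates
passes_demo <- data.frame(
  location.x = c(30, 55, 80),
  location.y = c(20, 40, 60),
  pass.end_location.x = c(50, 75, 100),
  pass.end_location.y = c(25, 35, 50)
)

# Base layer: the pitch if SBpitch is installed, otherwise an empty plot
base_layer <- if (requireNamespace("SBpitch", quietly = TRUE)) {
  SBpitch::create_Pitch()
} else {
  ggplot()
}

# The '+' stacks the points on top of the base layer
p <- base_layer +
  geom_point(data = passes_demo, aes(x = location.x, y = location.y))

p
```

With the real data you would swap 'passes_demo' for 'd1'.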

You should hopefully now have:



Woi oi. You have now plotted the start points of every Arsenal WFC pass vs West Ham United LFC. Go grab a biscuit and treat yourself.

Refreshed and fuelled, let's build on this plot. Let's add some lines so we can see where each pass started and ended:

geom_segment(data = d1, aes(x = location.x, y = location.y, xend = pass.end_location.x, yend = pass.end_location.y))

Again, this calls for a line (geom_segment) to be drawn using the match data (d1) from the x/y start point to the x/y end point.


Great! Final few steps...let's add some arrows so the pass direction is clearer. This goes inside geom_segment(), after the closing bracket of aes():

arrow = arrow(length = unit(0.08,"inches"))


Looks decent, however the arrows are pretty dominant and cover the pitch. You can use the 'alpha' argument to adjust both the point and line transparency:

alpha = 0.5

You need to add this to both the geom_point and geom_segment sections - have a play with differing transparency from 0.1 - 0.9 and see what you like!

Final two steps! You can change the colour of the passes by adding the colour argument:

colour = "red"

You can replace the 'red' with hex codes to customise further. Your full code should now look like this:
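
Putting the steps so far together gives something like the script below. It is pieced together from the fragments above rather than copied from anywhere, and 'pass_map' is just a name I picked so the plot can be stored and printed. Note that alpha, colour and arrow sit outside aes(), because they are fixed settings rather than columns in the data.

```r
library(StatsBombR)
library(tidyverse)
library(SBpitch)

# Load the FAWSL 2019/2020 competition, its matches and the cleaned event data
Comp <- FreeCompetitions() %>%
  filter(competition_id == 37, season_name == "2019/2020")
Matches <- FreeMatches(Comp)
StatsBombData <- StatsBombFreeEvents(MatchesDF = Matches, Parallel = TRUE)
StatsBombData <- allclean(StatsBombData)

# Arsenal WFC passes from the single match
d1 <- StatsBombData %>%
  filter(match_id == 2275096, type.name == "Pass", team.name == "Arsenal WFC")

# Pitch, then pass start points, then a line with an arrow to each end point
pass_map <- create_Pitch() +
  geom_point(data = d1, aes(x = location.x, y = location.y),
             colour = "red", alpha = 0.5) +
  geom_segment(data = d1,
               aes(x = location.x, y = location.y,
                   xend = pass.end_location.x, yend = pass.end_location.y),
               colour = "red", alpha = 0.5,
               arrow = arrow(length = unit(0.08, "inches")))

pass_map
```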




You should now have a lively plot, looking like this:



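The y axis comes out flipped with the create_Pitch function: StatsBomb measures y from the top of the pitch, while the plot draws it from the bottom. Therefore if you plot the passes of a left back they will show up on the right.

To correct this you need to add:

scale_y_reverse()

You can add this after the geom_segment layer, joined with a '+'.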

Now, to finally add a title.

labs(title = "Arsenal WFC",
       subtitle = "vs West Ham United LFC")

I have removed the colour = "red" argument but your final plot should now look like this:


Bang! Final little tinker...you can filter the above further by specifying which player you wish to see the passes of. Click 'd1' and scroll across to the 'player.name' column. From here, you can select a player; I have chosen Leah Williamson. You can add this to your 'd1' filter:

d1<-StatsBombData%>%
  filter(match_id == 2275096, type.name == "Pass", team.name == "Arsenal WFC", player.name == "Leah Williamson")

Run the plot once more and you should have:


Your final code should now be:
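
Again, this is the script as I would piece it together from the steps above, not a copy of a screenshot: the colour argument is gone, the y axis is reversed, the title is added and d1 is filtered to one player. 'pass_map' is a name I chose.

```r
library(StatsBombR)
library(tidyverse)
library(SBpitch)

# Load the FAWSL 2019/2020 competition, its matches and the cleaned event data
Comp <- FreeCompetitions() %>%
  filter(competition_id == 37, season_name == "2019/2020")
Matches <- FreeMatches(Comp)
StatsBombData <- StatsBombFreeEvents(MatchesDF = Matches, Parallel = TRUE)
StatsBombData <- allclean(StatsBombData)

# Leah Williamson's passes from the single match
d1 <- StatsBombData %>%
  filter(match_id == 2275096, type.name == "Pass",
         team.name == "Arsenal WFC", player.name == "Leah Williamson")

pass_map <- create_Pitch() +
  geom_point(data = d1, aes(x = location.x, y = location.y), alpha = 0.5) +
  geom_segment(data = d1,
               aes(x = location.x, y = location.y,
                   xend = pass.end_location.x, yend = pass.end_location.y),
               alpha = 0.5,
               arrow = arrow(length = unit(0.08, "inches"))) +
  scale_y_reverse() +  # flip y so the left side of the pitch is on the left
  labs(title = "Arsenal WFC",
       subtitle = "vs West Ham United LFC")

pass_map
```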




Hopefully this is helpful and has got you started on your R football coding journey! I'm a few months in, but doing small amounts each day, twinned with breaking things and trying to fix them, seems to be how I progress fastest.

StatsBomb created a primer which is a must look to get a great overview: http://statsbomb.com/wp-content/uploads/2019/07/Using-StatsBomb-Data-In-R-English.pdf

If you need any help, let me know...I will do my best!






Comments

  1. Hi Mark!

    I follow you on Twitter and I am fairly new at R too. I like your heatmap you posted on March 31st. Do you mind sharing your code on that one?

    Best,

    Simon from Denmark.

  11. I'm getting this error when running StatsBombData <- StatsBombFreeEvents(MatchesDF = Matches, Parallel = T):

    Error in if (MatchesDF == "ALL") { : the condition has length > 1

    Please help.
