Euro 2020 Stats Tracker
Euro 2020 reached its finale on Sunday, with Italy crowned winners after defeating England (my nation!) in a penalty shootout. It meant the Three Lions’ 55 barren years without a major international trophy continues.
Yesterday was a sombre day, as we mourned, reflected and regretted what might have been the night before. Now though, it’s time to look back on—what still was—a wonderful tournament.
Allegiances aside — perhaps you’re a football statistical anorak who is curious about the numbers behind the tournament, or a spreadsheet fanatic who is interested in seeing the inner workings of a stats-based system.
Or maybe you’re both of those things — like me.
In any case, I wanted to show you my interactive stats tracker I created in Google Sheets.
Not only does it allow you to view the match results in a convenient layout, but it also aggregates the statistics so you can see which countries performed well in different areas.
Almost all the match data has been extracted from Sky Sports match reports. Given the tournament is over, it doesn’t matter so much now, but it does update automatically. I must caveat that though by saying almost, and you’ll find out why in a minute.
I am also giving you unfettered access to the workbook, so you can faff about with it as much as you wish. You can even make a few customisations of your own if you desire. All I would ask is you use it for personal use only.
Sheets Workbook
Download a copy of the Sheets workbook from the following link:
Worksheets
The backstage worksheets are colour-coded red. These are responsible for housing data and carrying out the nitty-gritty work. On the other hand, the green ones are the stage sheets so are visibly easier on the eye.
Let’s go through each one:
🟩 All Matches
The group stage and knockout results are displayed, alongside the groups themselves.
🟩 Individual Matches
Choose a specific match from the tournament to focus on.
🟩 Tournament Statistics
View aggregated data visualisations to understand match and team trends.
🟥 Scraped Data
Houses the live data that’s been extracted from each Sky Sports match report.
🟥 Final Data
Exactly the same as the Scraped Data sheet, except the data is static.
🟥 Rankings
Contains the ranking data that determines how each team is sorted in the group phase, as well as who qualifies for the knockout phase.
🟥 Statistics
Features a flattened variant of the match data that’s found in the Final Data worksheet. This makes it easier to work with, as both teams from each game have separate rows containing data pertaining to them.
🟥 Countries/Stadiums
The countries and stadiums are listed in separate tables, along with their associated data.
Why Is the Match Data Stored in Two Places?
The magic behind the web scraping is IMPORTXML. It is a gem with enormous potential, as it’s essentially a web scraper packed away in a function. It’s worth learning, as it’s well-suited for any task that requires pulling in live data from a website.
However, going back to what I said earlier, unfortunately it can’t be reliably used as intended for a fully automated system. This is due to the erratic nature of how the function processes requests. Whilst most cells in the Scraped Data worksheet do successfully grab content, there are usually some that get stuck on ‘Loading…’.
It’s not clear why this happens. I’ve tried reloading the sheet several times and waiting a minute or two to see if everything loads, but the same thing occurs. Even with a small number of requests, it can struggle. Every now and then though, it will load perfectly.
Despite its shortcomings, it still saves a lot of time. What I’ve been doing after each game is copying the updated row(s) in Scraped Data and pasting it into Final Data.
The only other tinker needed was to divide all the percentage columns by 100, as when they were pasted as values, they got messed up.
Be aware as well that although Sky Sports is a reputable source, you will find some of the match statistics differ slightly to rival platforms. For instance, some companies calculate metrics like possession, shots and passing accuracy using their own independent methods, so the figures produced won’t necessarily be identical.
I also hasten to add that I am not responsible for the data, so you may see occasional inaccuracies, such as with Aerial Duels %, where the percentages wrongly add up to 100.1% for five matches.
Tiebreakers
For the group stage, where two or more teams were level on points, tie-breaker rules applied. These are found on the Euro 2020 Wikipedia page:
All rules except 8, 9 and 10 have been implemented in the Rankings worksheet, although it’s academic anyway as no group required the enforcement of anything beyond rule 2.
Final Words
Hopefully you’ve had fun and a play around whilst absorbing all the insightful statistics you’ve encountered.
I could have locked most of the workbook, so you could only operate it as a user, but I didn’t. By making it open source, it enables you to conduct your own analysis by going beyond what I’ve provided. Nevertheless, I’d like to think I’ve provided a good foundation for doing that anyway.
Shortly, I will be posting another article with some analysis looking at the tournament stats in-depth.
Stay tuned for that.