
Thursday, 25 September 2014

What are we looking at?

What are we trying to find by looking at on-pitch statistics at the individual player level?

Shots vs. Goals

Goals

There is a tendency for coaches, media and fans to judge individuals on goals and assists, for and against. The problem with using goals as a performance indicator is that 'luck', or lack thereof, plays a massive part in scoring and conceding goals. Read these 2 links on PDO. Link1 Link2

It is my experience, from hockey analytics, that players and teams with a high PDO get a lot more 'praise' than players/teams with a low PDO. 

A high or low on-pitch shooting % or save % is not sustainable and will regress toward the average.

Shots

The more repeatable stat for determining performance is total shots. Here is a great article on shot quantity vs. quality. If we can find out which players help increase shot differential, we can increase the team's chance of scoring or saving a goal by playing them more or subbing them at the right times, leading to more points in the league table.

What I am looking for are players who help increase their team's total shot differential while on the pitch. The higher a player's on-pitch TSR%, the more it suggests the game was played in the opposition's part of the pitch while he was on it. Raw on-pitch TSR% is okay but does not provide context: good teams have players with high TSR% and bad teams have players with lower TSR%.
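As a rough sketch of the raw number, on-pitch TSR% can be computed as shots for divided by all shots (for plus against) taken while the player is on the pitch. The function name and the sample figures below are made up for illustration and are not taken from the team tables.

```python
def tsr_pct(shots_for: int, shots_against: int) -> float:
    """On-pitch Total Shot Ratio as a percentage: shots for / all shots."""
    total = shots_for + shots_against
    if total == 0:
        raise ValueError("no shots recorded while the player was on the pitch")
    return 100.0 * shots_for / total


# Hypothetical player: 60 shots for and 40 against while on the pitch.
print(tsr_pct(60, 40))  # 60.0
```

A value above 50 means the team out-shot its opponents while the player was on the pitch.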

The following stat allows us to differentiate players on the same team.


Relative Total Shot Number per 90 (TSR Rel/90) = Total Shot Number per 90 of team when player on pitch - Total Shot Number per 90 of team when player not on pitch

TSR Rel/90 begins to tell us who 'drives play' on his particular team.  Helping create a shot for or stopping a shot against are equally important. Over the course of the season we will be able to see who has impacted shot ratios. Obviously, the higher the number the better.

What does a TSR Rel/90 of 1.25 mean? It says that, comparing every 90 minutes a player plays 11v11 with every 90 minutes he doesn't play, his team is better off by a total shot differential of 1.25 per 90.
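Here is a minimal sketch of that calculation. It assumes the number being compared is the team's shot differential (shots for minus shots against) scaled to 90 minutes, once for the minutes the player was on the pitch and once for the minutes he was not. The function names and all of the sample figures are invented for illustration.

```python
def shot_diff_per90(shots_for: int, shots_against: int, minutes: float) -> float:
    """Team shot differential scaled to a 90-minute rate."""
    if minutes <= 0:
        raise ValueError("minutes must be positive")
    return (shots_for - shots_against) * 90.0 / minutes


def tsr_rel_per90(on_pitch: tuple, off_pitch: tuple) -> float:
    """Per-90 shot differential with the player on, minus with him off.

    Each argument is (shots_for, shots_against, minutes), 11v11 only.
    """
    return shot_diff_per90(*on_pitch) - shot_diff_per90(*off_pitch)


# Hypothetical player: on the pitch for 900 minutes (70 shots for, 50 against),
# off the pitch for 360 minutes (30 shots for, 27 against).
on = (70, 50, 900)
off = (30, 27, 360)
print(shot_diff_per90(*on))    # 2.0
print(shot_diff_per90(*off))   # 0.75
print(tsr_rel_per90(on, off))  # 1.25
```

In this made-up case the team is +2.0 shots per 90 with the player and +0.75 without him, which gives the 1.25 used in the example above.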


In Closing

It is by no means a 'perfect' stat. As always, there are question marks. With 11 players on the pitch there is a lot of noise. The quality of the competition faced by each player while he is on the pitch is not accounted for. Game states or score effects are not taken into account. I just do not have the data and time to integrate these into the analysis.

The general idea is to find a new way of analyzing players by removing goals from the equation. If we find a player who is very high or low we can look into his performances by watching film and looking at the quality of competition, quality of teammates, game states etc.




Clarke Ruehlen






Tuesday, 23 September 2014

Welcome to Football in the Clouds: An Introduction




Thanks for visiting the blog.

The name is inspired by the late, great football manager Brian Clough, who said, "If God had wanted us to play football in the clouds, he'd have put grass up there."  He was referring to his dislike of 'long ball' or 'aerial' football.

This blog is an analytical site that will look primarily at individual 11v11 on-pitch statistics, using Total Shot Ratio (TSR) as a proxy for possession to compare player performance over the course of the season.  As far as I can tell, no one else is doing this in the public domain for the Premier League (or any league).

This idea is taken from the advanced stat tracking used in ice hockey, which tracks total on-ice shots, both for and against.  This is done at the team level and also at an individual player level. Total shot attempts include missed, blocked, on target and goals.

I am tracking each player's individual on-pitch shooting %, save % and PDO.  PDO is just shooting % plus save %.  Simply put, it is the best way of determining the good or bad fortune of a player while on the pitch.  PDO is very erratic but does regress toward 100 over time.
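A small sketch of the PDO arithmetic follows. It assumes shooting % is goals for divided by shots on target for, and save % is the share of opposition shots on target that did not go in, both counted only while the player is on the pitch. The choice of shots on target as the denominator is an assumption, and the function names and figures are made up for illustration.

```python
def shooting_pct(goals_for: int, shots_on_target_for: int) -> float:
    """Share of the team's shots on target that became goals, as a percentage."""
    return 100.0 * goals_for / shots_on_target_for


def save_pct(goals_against: int, shots_on_target_against: int) -> float:
    """Share of the opposition's shots on target that were kept out, as a percentage."""
    return 100.0 * (shots_on_target_against - goals_against) / shots_on_target_against


def pdo(goals_for: int, sot_for: int, goals_against: int, sot_against: int) -> float:
    """PDO = on-pitch shooting % + on-pitch save %; the long-run average is 100."""
    return shooting_pct(goals_for, sot_for) + save_pct(goals_against, sot_against)


# Hypothetical player: 6 goals from 20 shots on target for,
# 4 goals conceded from 16 shots on target against.
print(shooting_pct(6, 20))  # 30.0
print(save_pct(4, 16))      # 75.0
print(pdo(6, 20, 4, 16))    # 105.0
```

A PDO of 105 in this made-up case would point to a player whose on-pitch results have been running a little ahead of the shots, and who should be expected to drift back toward 100.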

I have 12 teams up to date.  The remaining 8 teams will slowly get updated as the season progresses.  I apologize if your team is not ready to go but this is a tedious task.  The 'big' teams will get priority as that is who most will want to see.

There has been a ton of research on TSR and its predictability of future results in ice hockey.  James Grayson (@jameswgrayson or  www.jameswgrayson.wordpress.com) brought the idea to football and he has written a ton of awesome stuff on football analytics.  The stuff he has written about PDO is very, very good.

Ben Pugsley has an excellent site for team data. (@benjaminpugsley or www.objective-football.blogspot.com.es).  He has shots data, passing data, game states and a bunch more.  Ben is also the co-creator of www.statsbomb.com.

The links to the team tables are on the right-hand side of the page.


Notes:

- All tables are sortable. (thanks, Ben)

- For ease of calculations and tracking purposes I have used 90 mins as the max a player is on the pitch. I do not count injury time. If a player comes on as a sub at 80 mins, I only mark him for 10 mins even if the game had 4 mins of injury time.  Yes, it will skew the relative stats but as the sample size grows it will be less and less relevant.

- All stats are 11v11 only. If a player is sent off the pitch with 2 yellows or a red, data is not tracked from that point forward.  Why? Because 11v11 is by far the most common way the game is played. 11v10 skews the numbers against the team down a man.

- As this is done manually, there may be a few keystroke errors. I have done my best to check (and double check) but if you notice any errors, please bring them to my attention.

- Own goals will count as a shot on goal against.

- I am tracking each player's individual shots, shot location (inside/outside box) and setup passes/assists as well. I will post this data at some point in the future, but most of that information is already available on the web.

- Lastly, let's remember that the sample size is small right now.  As the sample increases we can start to reduce the 'noise'.
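The minutes rules in the notes above (90-minute cap, no injury time, 11v11 only) can be sketched as a small helper. The function name and its parameters are hypothetical, and it assumes tracking stops for everyone at the first sending-off in the match.

```python
from typing import Optional


def credited_minutes(on_minute: int, off_minute: int,
                     first_red_minute: Optional[int] = None) -> int:
    """Minutes credited to a player under a 90-minute cap, 11v11 only.

    on_minute / off_minute: match minute the player entered and left the pitch.
    first_red_minute: minute of the first sending-off in the match, if any;
    nothing is tracked from that point forward.
    """
    end = min(off_minute, 90)           # injury time is not counted
    if first_red_minute is not None:
        end = min(end, first_red_minute)
    start = min(on_minute, 90)
    return max(0, end - start)


print(credited_minutes(80, 94))     # 10: sub at 80 mins, 4 mins of injury time ignored
print(credited_minutes(0, 94, 60))  # 60: starter, tracking stops at a 60th-minute red card
print(credited_minutes(70, 94, 60)) # 0: came on after the game was no longer 11v11
```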





Cheers,
Clarke