Saturday 26 May 2012

Deceptive statistics, Szczesny, apples and pears


Statistics is not a simple subject, and judging whether simple footballing statistics are a valid way of rating players is also a highly complex matter. In fact I would go as far as saying that most footballing statistics are not really that useful in rating players. Some are helpful, but many are highly misleading and far more likely to leave fans confusing themselves.

A classic case has always been goalkeeping statistics. Anyone who has looked into the subject will have realised that many of the so-called best goalkeepers have not had the best statistics, and the classic example of this is the '% shots saved' statistic. How can this be? Surely the best goalkeepers will have the best shots saved %? Unfortunately life and football are far, far more complex than this. There are numerous confounding factors that make this statistic highly unreliable and not a valid way of comparing goalkeeper performance.

The way a team defends makes a huge difference. For example, teams that defend deep and with numbers will tend to concede more shots from distance, which will help a goalkeeper's stats. Teams that have a lot of possession will tend to concede fewer shots, but if they defend with a high line and get hit on the counter, many of the shots conceded will be extremely good clear cut chances, and this will make their goalkeeper's statistics look poor.
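To put some numbers on this, here is a rough Python sketch. Every figure in it is invented purely for illustration, and the two shot categories (`long_range`, `clear_cut`) and the keeper names are hypothetical labels, not real data or any official classification. It shows how a goalkeeper can save a higher share of every type of shot and still end up with a worse overall '% shots saved', simply because of the mix of shots his team concedes.

```python
# Hypothetical records per shot type: (shots faced, shots saved).
# All numbers are made up for illustration; they describe no real keeper.
keeper_deep_block = {"long_range": (80, 72), "clear_cut": (20, 8)}
keeper_high_line = {"long_range": (20, 19), "clear_cut": (80, 40)}


def save_pct(record):
    """Overall save percentage across all shot types."""
    faced = sum(f for f, _ in record.values())
    saved = sum(s for _, s in record.values())
    return 100.0 * saved / faced


def save_pct_by_type(record):
    """Save percentage for each shot type separately."""
    return {kind: 100.0 * s / f for kind, (f, s) in record.items()}


for name, record in [("deep block", keeper_deep_block),
                     ("high line", keeper_high_line)]:
    print(name, "overall:", f"{save_pct(record):.1f}%",
          "by type:", save_pct_by_type(record))
```

In this made-up case the high line keeper saves 95% of long range shots and 50% of clear cut chances, against 90% and 40% for the deep block keeper, yet his overall figure is 59% against 80%. The headline number mostly reflects what kind of shots each team allows.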

Goalkeepers who save lots of shots that are going just wide will improve their statistics by doing so, even though this is technically not good goalkeeping. Errors that do not result in goals are often ignored by the statistics charts, so it may just come down to luck as to whether the opponent actually takes advantage of the error; part of this is also related to how good the defence is at helping out the goalkeeper. I could go on and on and on. The point is that many of these hard statistics are not a very valid way of measuring goalkeeper performance; it is the classic comparing 'apples and pears' argument. This is very relevant for Szczesny's statistics: he has been excellent this season, of course he has made a few errors, but 

The same is true for statistics like shooting accuracy and goals/shots ratios: it all depends on where the player tends to shoot from. One could have a great shooting accuracy and never score, and one can also have a mediocre goals/shots ratio because a lot of the shots are from long range. Another classic case is the pass completion rate. We all know that the unambitious midfielder who never tries anything risky or attacking can have a great pass completion rate. That doesn't make them a great passer though; it may just mean that they are a very cautious player who continually slows attacks down.
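As a small worked example of the shooting point, here is a minimal Python sketch with two invented players. The names, the shot counts and the exact definitions used (accuracy as shots on target divided by shots, conversion as goals divided by shots) are assumptions made for illustration only.

```python
# Hypothetical season totals: shots taken, shots on target, goals scored.
# These figures are invented and describe no real player.
players = {
    "long_range_shooter": {"shots": 50, "on_target": 30, "goals": 0},
    "penalty_box_poacher": {"shots": 50, "on_target": 20, "goals": 10},
}


def shooting_accuracy(p):
    """Share of shots that were on target, as a percentage."""
    return 100.0 * p["on_target"] / p["shots"]


def conversion_rate(p):
    """Share of shots that ended up as goals, as a percentage."""
    return 100.0 * p["goals"] / p["shots"]


for name, p in players.items():
    print(name,
          "accuracy:", f"{shooting_accuracy(p):.0f}%",
          "conversion:", f"{conversion_rate(p):.0f}%")
```

Here the long range shooter has the better accuracy (60% against 40%) without scoring a single goal, while the poacher converts 20% of his shots. Ranking the two by accuracy alone would get it exactly backwards.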

There is a great danger in reading too much into these kinds of statistics, and Szczesny's statistics are a great example of this. There is very little correlation between certain numbers and performance; this little table demonstrates this perfectly. I would take Szczesny over De Gea every single day of the week and I suspect many Manchester United fans would too. The moral of the story is that statistics must be interpreted with great caution; confounding factors are the name of the game.

3 comments:

Al said...

I don't agree... in general, for a significant sample size, shot save percentage is a very good metric to compare goalies. It's used in ice hockey all the time. Sure, shots are different, but shot quality evens out when goalies are playing against the same teams and when a significant sample size is used. Isn't it familiar that Arsenal would dominate and then concede on the first shot faced? Also you can't blame Arsenal's defense... they conceded the second lowest number of shots... second to not Man City who were thought to have the best defense. In my eyes the goalie situation has improved at Arsenal but it's not solved. Go back and see what Seaman's or Lehmann's save percentage was, and it was never 67 percent as Szczesny's is. And I wouldn't take Szczesny over De Gea any time.

1979gooner said...

Disagree, it's a poor measure for the many reasons explained.

If you look at the gk stats each year then you would see there is little correlation of performance with shots saved.

matthew_wood said...

As you say, statistics can be used to prove everything or nothing.

In football, goalkeeper is almost unquestionably the position under the most scrutiny because they conform to a very simple mathematical model: goals/saves = %.

Unfortunately this doesn't take into account defending (although some might argue that the organisation of defence is a GK's prime responsibility), team style and the tempo at which they play.

Although the comparison to ice hockey leads to comparisons like save percentage (which can be a handy tool but shouldn't be the only method of comparison/contrast), the fact is that in hockey, goaltenders don't stray any further than a couple of feet from their goal, occupy a much larger area of the goalmouth and have little/no use in lead-up play.

Statistics are a great time-waster and debate-starter. Only with large reference points and multiple angles do they truly form a basis for accurate comparison. One number rarely tells the whole story (but makes for easy story starting!).