Thursday, March 27, 2014

Just putting it out there - random thoughts on the Sports Analytics Innovation Summit in London

  • It's great to see so many people working on so many different sports-related topics, from youth development and health-related issues to on-field and PR strategy
  • The 'driving on the left side' thing is going to kill me one day
  • I think I now know the 'Data Analytics pyramid' by heart: more or less Data -> Information -> Knowledge -> Wisdom. It was probably the most repeated phrase of the weekend, but in my opinion it is not the most important general truth
  • In my opinion, the most important truth was brought to us by Bill Gerrard (imagine the Scottish version of Billy Beane ;) ): big data can give you the correlation, but small data gives you conclusions
  • To put it in my own terms: I think a lot of people believe that big data gives us the answers. At least in sports, big data is much better suited to leading you to more specific questions

Tuesday, March 25, 2014

Who needs to shoot if you can drive - a case for Tyreke Evans

Over the last five years, Tyreke Evans went from Rookie of the Year to being a disappointment. Alongside Josh Smith, he is a punchline for players who shouldn't shoot.
And his shot chart is still as red as a baboon's butt.
But through in-depth video analysis (I saw two quarters of a Pelicans game recently) and thanks to the data, I saw some weeks ago that there is some value to Tyreke. So, after he got #StatLineOfTheNight honors and praise from Zach Lowe, I decided to be an opportunist and post this figure:
(Click to enlarge)
Tyreke drives the most (per minute) of all players & his team scores quite well on those drives

Thursday, March 20, 2014

Three quick points on Kyle Korver and the Splash Brothers

As Korver could finish the season with the best Effective Field Goal Percentage ever, and Steph Curry and Klay Thompson are destroying every 3-point duo statistic there is, here are some 3-point bullets on them:

Player Filter: >40 games, >24 minutes, 5 three-point attempts per game
  • You know what, I think the graph speaks for itself (click on it for a bigger version)
Have a nice day everybody,

Tuesday, March 18, 2014

Data dumps - the problem with low hanging fruits in sports story telling

Disclaimer: The reason I picked recent articles by Kirk Goldsberry and John Schuhmann for this post is not that they are doing a bad job. The reason is that I like reading their work, and thus their articles catch my attention easily. Even my last post ignores some of the things that I'm about to criticize. But, as every up-and-coming rapper would tell you, the best way to make it in the business is by writing a diss track about the big fish. ;) (Note: I hope I won't end up as the Benzino to Kirk's Eminem)
So, here we go:

A tale of two players
Imagine two players taking shots from the right corner of a basketball court. Both of them previously shot 8 of 20 (40%) from beyond the arc at that spot. Now, player A makes the next 5 shots, which raises his shooting percentage to 52% (13 of 25). Player B misses his next 5 shots, which drops his percentage to 32% (8 of 25).
Question 1: How sure are you that player A is a better three-point shooter from the right corner than player B?
The scientific answer: There is only an 85% probability that A truly shoots better than B, or in vague terms 'it could be true, but you would not be able to publish it as a scientific result'.
Question 2: How sure are you, if I tell you that player A is Stephen Curry from last season and player B is Stephen Curry from this season? (Note: Up to now, Curry took 22 shots from that position)
So, it is more than a bit misleading when Kirk uses the term 'Kryptonite' to describe Curry's 33% shooting from that position. This leads to reader comments like 'Any theories on why he's so much worse from the right corner 3?', followed by others who try to find a reason. The true reason is most likely random noise in making or missing a shot.1
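The post does not say how the 85% in the answer to Question 1 was computed. One way to land near it, purely as an assumption, is a pooled two-proportion z-test on 13 of 25 versus 8 of 25 and then reading 1 minus the two-sided p-value as the "probability". The function name below is made up for this sketch.

```python
import math


def two_proportion_z_test(made_a, shots_a, made_b, shots_b):
    """Pooled two-proportion z-test. Returns (z, two-sided p-value).

    Assumption: this is only one plausible way to reach the ~85% quoted
    in the post; the post itself does not name a method.
    """
    p_a = made_a / shots_a
    p_b = made_b / shots_b
    # Pooled make rate under the null hypothesis "both shoot equally well"
    pooled = (made_a + made_b) / (shots_a + shots_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / shots_a + 1 / shots_b))
    z = (p_a - p_b) / se
    # Two-sided tail probability of a standard normal
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Player A: 13 of 25, player B: 8 of 25 (the numbers from the example)
z, p_value = two_proportion_z_test(13, 25, 8, 25)
print(f"z = {z:.2f}, two-sided p = {p_value:.3f}, 1 - p = {1 - p_value:.2f}")
```

With these inputs the two-sided p-value comes out at roughly 0.15, well above the usual 0.05 cut-off, which matches the 'you would not be able to publish it' verdict. Strictly speaking, 1 - p is not the probability that A is the better shooter; it is informal shorthand for how surprising the gap would be if both shot equally well.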

Thursday, March 13, 2014

The 8th man starters - Part I

Nate Who!?
Nate Wolters (22.3 minutes per game). This is my personal answer to the fun game 'go to sort players by minutes per game and name the first one for whom you have no clue which position he plays' (Fun is a loose term here, but it could be a nice game between basketball nerds. Just like limbo it's about who gets the lowest). Well, being a Buck probably doesn't help with getting recognition.
But it is a nice start to my question: 'If you had to pick five 8th men to form a team - who would you pick?' or, to put it differently, 'Which five players could start for the 76ers?'. To find something like an answer, I will use data from and collected on 7th of March 2014. To be 8th man eligible, a player had to play 12 to 24 minutes per game and appear in at least 30 games. I will mostly use percentages or Per36 values and will give the data of Starters (at least 30 minutes per game) as benchmarks.
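The eligibility rules and the Per36 scaling described above can be sketched in a few lines. The class, the field names and the two players are hypothetical, and treating the 12 and 24 minute limits as inclusive is an assumption, since the post does not say.

```python
from dataclasses import dataclass


@dataclass
class PlayerSeason:
    # Hypothetical record layout; not the format of the original data
    name: str
    games: int
    minutes_per_game: float
    points_per_game: float


def is_eighth_man(p: PlayerSeason) -> bool:
    """12 to 24 minutes per game (assumed inclusive) in at least 30 games."""
    return p.games >= 30 and 12 <= p.minutes_per_game <= 24


def is_starter_benchmark(p: PlayerSeason) -> bool:
    """Starters used as benchmarks: at least 30 minutes per game."""
    return p.minutes_per_game >= 30


def per36(per_game_value: float, minutes_per_game: float) -> float:
    """Scale a per-game value to what it would be over 36 minutes."""
    return per_game_value * 36 / minutes_per_game


# Made-up players, only to show the filter at work
bench = PlayerSeason("Bench Guy", games=45, minutes_per_game=20.0, points_per_game=10.0)
star = PlayerSeason("Star Guy", games=60, minutes_per_game=34.0, points_per_game=22.0)

eligible = [p.name for p in (bench, star) if is_eighth_man(p)]
print(eligible, per36(bench.points_per_game, bench.minutes_per_game))
```

Per36 puts a 20-minute bench player and a 34-minute starter on the same scale, which is what makes the starter benchmarks comparable.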
To make my life easier and because positions are becoming more and more vague in any case, I will pick my team as one Point Guard, two Wings and two Bigs. There are two great stats on that - normalized by minutes - can be directly used as a filter to automatically divide my players into those three groups: one is time of ball possession, the other is defended opponent field goal attempts at the rim. Plotting those two stats against each other, we can easily see where we have to set the thresholds for each group. By letting those thresholds overlap a bit, we ensure that we don't miss out on anybody.
(click to enlarge)
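The grouping by two per-minute stats with overlapping thresholds could look roughly like the sketch below. All threshold values, units and names are invented for illustration; the real cut-offs were read off the plot and are not given in the text.

```python
# Hypothetical thresholds on per-minute values (not taken from the post).
PG_MIN_POSSESSION = 0.10    # at or above this: Point Guard candidate
WING_MAX_POSSESSION = 0.12  # set above PG_MIN_POSSESSION so the groups overlap
BIG_MIN_RIM = 0.10          # at or above this: Big candidate
WING_MAX_RIM = 0.12         # set above BIG_MIN_RIM so the groups overlap


def candidate_groups(possession_per_min: float, rim_defended_per_min: float) -> set:
    """Return every group a player qualifies for.

    Because the thresholds overlap, a borderline player can land in two
    groups, so nobody near a boundary gets dropped.
    """
    groups = set()
    if possession_per_min >= PG_MIN_POSSESSION:
        groups.add("PG")
    if rim_defended_per_min >= BIG_MIN_RIM:
        groups.add("Big")
    if possession_per_min <= WING_MAX_POSSESSION and rim_defended_per_min <= WING_MAX_RIM:
        groups.add("Wing")
    return groups


# Made-up stat lines: a borderline ball handler, a rim protector, a pure wing
print(candidate_groups(0.11, 0.02))
print(candidate_groups(0.02, 0.20))
print(candidate_groups(0.05, 0.05))
```

Returning a set instead of a single label is the design choice that implements the overlap: the borderline ball handler shows up in both the Point Guard and the Wing pool.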