Interactive chart for Citi Bike data

I can't get enough of the Citi Bike data. Here is another look at the data via an interactive graph. You can change the x-axis on the scatter plot and see how it modifies the plot and correlation.

Screenshot of rCharts graph.

This chart was made using R and the R packages Shiny, ShinyApps, and rCharts. You can find the code on GitHub.

Many thanks to:

Which NBA franchise drafted best from 1990-2009?

Conventional wisdom says certain NBA front offices (San Antonio, Seattle/OKC, Houston) are better at picking young players in the draft. Can we test this hypothesis with data?

One approach is to look at WS (Win Shares). Basketball WS approximate a player’s value by aggregating points, assists, rebounds, efficiency, turnovers, etc. into a single number. Here is a full breakdown.

WS are not a perfect proxy for value, but they do allow for nice comparisons. Check out the list of all-time career leaders in WS.

See the Pen NBA top career WS by Joe Jansen (@joejansen) on CodePen

The data

We need to figure out:

  1. The expected value in WS for each pick. For example, the first pick turns out to be worth 77 WS on average (explained below).

  2. How much each pick varied from the expected value (Shaq was selected first in 1992 and had 182 WS over his career, so he was a better-than-average first pick).

  3. The cumulative difference between expected and actual value for all the picks each franchise made.

Note that this approach doesn’t necessarily test for good outcomes. Charlotte drafted Kobe Bryant 13th in 1996 (173 WS so far) but didn’t get full value for him since they had agreed to trade him right away (for Vlade Divac who produced 54 WS for the rest of his career).

This analysis will have a narrow focus: with the pick each franchise had, what was the difference between expected and actual WS, regardless of what the franchise did later with that player?

Modified WS for younger players

The data set (1990-2009) includes a lot of players who are still playing and thus still producing WS. Let's use a multiplier to estimate the career length of every player drafted after 2000. Career length depends heavily on how good a player is and on injuries. We can't predict injuries, but we can estimate longevity based on minutes per game.

  • Less than 12 minutes per game for a career = 2.01 seasons.
  • More than 12 minutes per game for a career but less than 20 = 5.01 seasons.
  • More than 20 minutes per game for a career but less than 25 = 7.59 seasons.
  • More than 25 minutes per game for a career but less than 30 = 9.21 seasons.
  • More than 30 minutes per game for a career = 10.88 seasons.

Now, Kevin Durant, who was drafted 2nd in 2007, has a Modified WS of 133 instead of his current 73 WS. Once again, this is not a perfect method, but it will help us make comparisons.
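The bracket lookup above can be sketched in a few lines of Python. Two things here are assumptions rather than details from the post: each bracket is treated as including its lower bound (the list doesn't say where exactly 12, 20, 25, or 30 minutes falls), and Modified WS is assumed to be WS per season so far multiplied by the estimated career length, never dropping below what the player has already produced. The function names are made up for illustration.

```python
# Estimated career length (in seasons) by career minutes per game.
# Assumption: each bracket includes its lower bound, since the text
# does not say which bracket exactly 12, 20, 25, or 30 minutes falls into.
BRACKETS = [
    (12.0, 2.01),   # under 12 minutes per game
    (20.0, 5.01),   # 12 up to 20
    (25.0, 7.59),   # 20 up to 25
    (30.0, 9.21),   # 25 up to 30
]
TOP_BRACKET_SEASONS = 10.88  # 30 minutes per game or more


def estimated_seasons(minutes_per_game):
    """Return the estimated career length for a career minutes-per-game value."""
    for upper_bound, seasons in BRACKETS:
        if minutes_per_game < upper_bound:
            return seasons
    return TOP_BRACKET_SEASONS


def modified_ws(current_ws, seasons_played, minutes_per_game):
    """Assumed formula: WS per season so far times estimated career length."""
    projected = current_ws / seasons_played * estimated_seasons(minutes_per_game)
    # Assumption: never project below what the player has already produced.
    return max(current_ws, projected)
```

For a hypothetical player with 30 WS over 5 seasons at 32 minutes per game, `modified_ws(30.0, 5, 32.0)` scales 6 WS per season up to a 10.88-season career.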

This analysis will underestimate WS for superstar players who will have long careers, but we can't see the future (if LeBron is a cyborg, however, a 30-year career seems like a safe bet).


Expected value

Let's figure out the expected value in WS for each pick. A simple average of 20 years of data produces this table.

See the Pen Expected WS value by draft pick by Joe Jansen (@joejansen) on CodePen

Note that the 13th and 21st picks are in the top 10 for expected value, while the 6th, 8th, and 12th picks have underperformed in this time period. There have been lots of duds at 6 and 8, while Kobe at 13 and Rondo and Michael Finley at 21 boost their respective picks.
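As a rough sketch of that simple average, here is how the per-pick expected value could be computed in Python. The row layout, the function name, and every number below are made up for illustration; none of it comes from the actual data set.

```python
from collections import defaultdict

# Hypothetical rows: (draft_year, pick_number, player, win_shares).
# The numbers are invented for illustration, not taken from the data set.
picks = [
    (1990, 1, "Player A", 90.0),
    (1991, 1, "Player B", 60.0),
    (1990, 2, "Player C", 40.0),
    (1991, 2, "Player D", 20.0),
]


def expected_ws_by_pick(rows):
    """Average the WS of every player taken at each pick number."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for _year, pick, _player, ws in rows:
        totals[pick] += ws
        counts[pick] += 1
    return {pick: totals[pick] / counts[pick] for pick in totals}


expected = expected_ws_by_pick(picks)
```

Because this is a plain mean, a single outlier can move a pick's expected value a lot when there are only 20 players per slot, which is how one great player can lift a late pick into the top 10.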

Difference between expected and actual WS

Now that we have the expected value for each pick, we can subtract it from the actual WS of the player taken with that pick. This creates a fun list of the best and worst picks.

See the Pen Top 10 picks by WS delta by Joe Jansen (@joejansen) on CodePen

See the Pen Worst 10 picks by WS delta by Joe Jansen (@joejansen) on CodePen
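The subtraction and ranking can be sketched like this in Python. The 77 WS expected value for the first pick is the figure quoted earlier; the other expected values, the players, and their WS totals are hypothetical, as are the function and variable names.

```python
# Hypothetical picks: (player, pick_number, actual_ws). Invented numbers.
picks = [
    ("Player A", 1, 120.0),
    ("Player B", 1, 30.0),
    ("Player C", 13, 95.0),
    ("Player D", 6, 5.0),
]

# Expected WS per pick. 77 for the first pick is quoted in the post;
# the values for picks 6 and 13 are placeholders.
expected = {1: 77.0, 6: 35.0, 13: 40.0}


def ws_deltas(rows, expected_by_pick):
    """Return (player, pick, actual WS minus expected WS) for each row."""
    return [
        (player, pick, ws - expected_by_pick[pick])
        for player, pick, ws in rows
    ]


deltas = ws_deltas(picks, expected)
ranked = sorted(deltas, key=lambda row: row[2], reverse=True)
best = ranked[:2]          # largest positive deltas first
worst = ranked[-2:][::-1]  # most negative deltas first
```

With real data, `best` and `worst` would take the first and last 10 rows instead of 2.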

Which franchise was best?

Finally, we can add up the delta WS for each franchise’s picks over the past 20 years.
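Summing the deltas per franchise is a small grouping step. The sketch below uses placeholder franchise codes and invented delta values; it assumes each pick has already been assigned to the franchise that made it, with relocated teams such as Seattle/OKC mapped to a single key.

```python
from collections import defaultdict

# Hypothetical rows: (franchise, player, ws_delta). Invented numbers and
# placeholder franchise codes, not real results.
drafted = [
    ("AAA", "Player A", 40.0),
    ("AAA", "Player B", -10.0),
    ("BBB", "Player C", 25.0),
    ("CCC", "Player D", -5.0),
    ("CCC", "Player E", 20.0),
]


def franchise_totals(rows):
    """Sum WS deltas per franchise and sort from best to worst."""
    totals = defaultdict(float)
    for franchise, _player, delta in rows:
        totals[franchise] += delta
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)


ranking = franchise_totals(drafted)
```

A cumulative total rewards franchises that made more picks above expectation, so a team with many modest hits can outrank one with a single huge hit and several misses.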


Cleveland is first?! Let's look at this more closely. Cleveland had a number of picks that dramatically outperformed their expected WS.

Notably, many of those players have spent large chunks of their careers with other teams.

Cleveland's draft picks that produced negative value weren't as damaging as those of other teams.

See the Pen Negative CLE WS deltas by Joe Jansen (@joejansen) on CodePen

Success in the NBA isn't just about drafting well; it's also about developing talent and making savvy signings and trades. I expected to see San Antonio and Seattle/OKC near the top of this list, but Cleveland and Phoenix are a surprise.

Thanks to Basketball Reference for great data! You can take a look at my analysis here.

Update 12/23/13: I've added this data to a Shiny R web app.