Thursday, July 10, 2014

Syrian Refugee Density in Lebanon

I've done a few posts on Syria and have used data provided by the UNHCR for different analyses and visualizations. There are several links on their Syrian refugee data portal that communicate the breadth of this crisis numerically and visually.

One such link had the locations of settlements in Lebanon and the number of people in each settlement. This information is undoubtedly helpful for coordinating the location of services within camps and, in general, for tracking how they grow. I was interested in seeing the growth of these camps: where, and during what time periods, the most growth occurred.

Below is a map of the country (or most of it) showing where all the tents are located that have been documented by the UNHCR.  Overlaid is a density plot communicating the concentration of structures (tents) with the number of people housed per structure.

The settlements are clearly concentrated just outside of a town called Zahle. If we look more closely at Zahle, we can see the number of people per tent in this settlement more clearly. On average across Lebanon there are over 6 people per tent based on this UNHCR dataset, with some tents housing as many as 12.
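As a rough sketch of how a people-per-tent figure like this can be computed, the snippet below divides total people by total tents and also finds the most crowded site. The records are made-up placeholders, not values from the UNHCR dataset, and the tuple layout is an assumption for illustration only.

```python
# Hypothetical settlement records: (number of tents, number of people).
# Placeholder values only; these are NOT figures from the UNHCR dataset.
settlements = [(10, 55), (4, 30), (6, 41)]

total_tents = sum(tents for tents, _ in settlements)
total_people = sum(people for _, people in settlements)

# Country-wide average: all people divided by all tents.
people_per_tent = total_people / total_tents

# Per-settlement ratios, useful for spotting the most crowded sites.
ratios = [people / tents for tents, people in settlements]
most_crowded = max(ratios)
```

Dividing the totals weights every tent equally, whereas averaging the per-settlement ratios would weight every settlement equally; the two can differ when settlement sizes vary a lot.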

Looking at the data year by year, we can see how this area was settled more heavily during certain years. 2013 was a year in which a significant number of tents were constructed or set up.

No doubt this data is being used to coordinate the location of public facilities such as clinics. Data such as this from the UNHCR gives burgeoning communities much-needed information on how to set up a town, or "plan" how a settlement could be organized or mitigated differently. The code for this data and these graphs, or at least most of it, is available on my GitHub account.

Tuesday, June 10, 2014

NBA Drafting

The NBA draft is quickly approaching. Teams put a lot of effort into selecting players whose strengths complement what the team needs. Draft picks also come in on much cheaper contracts than their veteran counterparts and are therefore desirable from a value standpoint. That makes it increasingly important which pick a team gets and, even more so, how well it uses that pick (a No. 1 or No. 2 pick does not always guarantee a high level of performance). In a recent interview with Bleacher Report, the head of analytics for the Denver Nuggets spoke a little about how they evaluate draft picks and made some interesting comments about the numbers his franchise considers. Specifically, rebounds were an important metric that was treated as a "hustle stat". I looked at the numbers to see whether that has held true over the last few years. It turns out total rebounds per game is an important metric for increasing a player's chance of being chosen as a top five pick.

I pulled draft year stats for the past 4 years for the top 30 picks from the good people at Basketball-Reference to see if there were any metrics that increased a player's chances of being a top five pick each year. Only draft picks who had college stats were used for this analysis. I wanted to see which per game metrics changed the likelihood of a top 5 selection in the draft class. What I found is in the decision tree below. I divided points per game, field goal attempts per game, minutes per game, and total rebounds per game into quartiles. I lumped the bottom two quartiles together as the "Lower Quartile", and labeled the third quartile the "Middle Quartile" and the top quartile the "Upper Quartile".
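The binning described above can be sketched roughly as follows. This is a minimal Python illustration, not the R code used for the post: the function name `quartile_label` is hypothetical, the cut points use the "inclusive" quantile method as an assumption, and the sample values are made up.

```python
from statistics import quantiles

def quartile_label(value, values):
    """Label a value relative to its draft class.

    The bottom two quartiles are lumped together as "Lower Quartile",
    the third quartile is "Middle Quartile", the top is "Upper Quartile".
    """
    # Three cut points: 25th, 50th, and 75th percentiles of the class.
    _q1, q2, q3 = quantiles(values, n=4, method="inclusive")
    if value <= q2:
        return "Lower Quartile"
    if value <= q3:
        return "Middle Quartile"
    return "Upper Quartile"

# Made-up rebounds-per-game values, for illustration only.
rebounds = [2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0]
labels = [quartile_label(r, rebounds) for r in rebounds]
```

The same function would be applied separately to each per game metric, so every player ends up with one label per metric.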

Decision Tree for Top Five Draft Pick for 2009-2013

As you can see from the tree above, players whose rebounds per game are not in the "Upper Quartile" have a top five selection probability of .13. The probability of being a top five pick is .33 if a player's points per game are in the "Lower Quartile" (the lower 50% of the draft class, average or lower) and his rebounds per game are in the "Upper Quartile". By contrast, if a player's rebounds per game are in the "Lower Quartile" (below the 50th percentile) and his points per game are above the 50th percentile (the "Middle Quartile" or "Upper Quartile"), the probability of being a top five pick is only .19. Rebounds, it seems, are even more important than scoring in the "Upper Quartile" of a draft class on a per game basis.
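Each probability at the end of a tree branch is simply the share of players in that group who went in the top five. A minimal sketch of that calculation, assuming a hypothetical list of player records with made-up labels and outcomes (not the Basketball-Reference data):

```python
def top_five_rate(players, **conditions):
    """Share of players matching every condition who were top five picks.

    Returns None when no player matches, since the rate is undefined.
    """
    group = [
        p for p in players
        if all(p[key] == value for key, value in conditions.items())
    ]
    if not group:
        return None
    return sum(p["top_five"] for p in group) / len(group)

# Made-up records for illustration; field names are assumptions.
players = [
    {"rebounds": "Upper Quartile", "points": "Lower Quartile", "top_five": True},
    {"rebounds": "Upper Quartile", "points": "Lower Quartile", "top_five": False},
    {"rebounds": "Upper Quartile", "points": "Lower Quartile", "top_five": False},
    {"rebounds": "Lower Quartile", "points": "Upper Quartile", "top_five": False},
]

rate = top_five_rate(players, rebounds="Upper Quartile", points="Lower Quartile")
```

A fitted decision tree chooses which splits to make; this sketch only shows how the probability for one already-chosen branch is tallied.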

Alternatively, if a draft pick's minutes per game are in the "Upper Quartile" and his points per game are in the "Middle Quartile" or "Lower Quartile", this also yields a probability of .33 of being a top five pick. I interpret this to mean that a player who doesn't show hustle but has had a lot of playing time in college still has a higher top five pick probability. I don't consider minutes per game in a college season to be as meaningful as rebounds, simply because minutes per game is more of a college coaching decision that may or may not be related to player performance. Minutes are primarily a function of performance and not a stand-alone performance metric.

Rebounds matter for draft picks. Players not showing a strong "hustle stat" have a lower probability of being a top five pick within their draft class, unless they happen to have played a lot of minutes, which was also associated with a higher top five pick probability. The competition, shooting distances, and rules are obviously different in the NBA from college, and some of the college stats may not translate into professional performance. That being said, this analysis does indicate that the "hustle stat", or rebounds, is meaningful for teams other than just the Nuggets. High performance in this metric specifically increases the probability of being selected as a top five pick in the NBA draft.

Those interested in the R code can find most of it here.

Wednesday, May 14, 2014

NBA Playoffs 1st Round Comparison

Most people I've talked to feel like they've watched an entire NBA Playoffs series after seeing the first round of this year's playoffs. It's been amazing basketball. I've said before that we are watching players whose numbers resemble those of other "golden eras" of the NBA.

For the most part, the first round over the past few years has meant a few overtime games each year. As far as this tournament goes, the first round isn't necessarily the most competitive because of the matchups. Teams are "seeded" based on regular season performance in each conference, and in general teams that are seeded further apart will not have as competitive a matchup in the first round; hence the advantage of performing well in the regular season to get the 1 seed. The Western Conference has had some amazing games in the first round this year. Several of the games have gone into overtime, in fact more than in the last several years combined.

Below is a simple network of playoff games that went into overtime from 2010 to 2013. Each line indicates a series that was played and each arrow is a game. Blue = Western, Red = Eastern, the width of the line is the margin of victory (you have to look closely), and each arrow points away from the home team toward the visitor.

NBA Playoffs 1st Round 2010-2013

For instance, the margin of victory in the Bulls-Nets game last year was 8 points (it also went to 3OT). Across all these games, the average margin of victory was 6 points. Most of these games, you will notice, involve Eastern Conference teams.

NBA Playoffs 1st Round 2014

2014 has been a different story. In the Memphis-Thunder matchup alone we've seen 4 games go into overtime (it's been a grind). These two teams were very competitive and, in general, did not play out the way their seeding would suggest. The average margin of victory for all these games was 3 points. The West has been very competitive in the first round. More first-round games have gone into overtime than in the past 4 years combined, and the average margin of victory has been half of what it was from 2010 to 2013.
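The comparison above comes down to two numbers per period: how many first-round games went to overtime and their average margin of victory. A minimal sketch of that tally, using placeholder margins that are made up to show the calculation and are not the actual game results:

```python
from statistics import mean

def summarize(margins):
    """Return (number of overtime games, average margin of victory)."""
    return len(margins), mean(margins)

# Placeholder margins of victory in points; NOT the real game data.
earlier_years = [8, 4, 6]
this_year = [1, 3, 5, 2, 4]

earlier_count, earlier_avg = summarize(earlier_years)
current_count, current_avg = summarize(this_year)

# Ratio of average margins between the two periods.
margin_ratio = current_avg / earlier_avg
```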

Stamina. The players on the teams that have advanced are going to need it, and most of the people I've talked to need it just to watch the games. So if you feel like you've watched an entire series, those feelings are legitimate...and we're only out of the first round.