Tuesday, September 30, 2014

Syrian Refugee Settlement Clinic Locations

Previously I posted about the location of refugee settlements and how they have grown in density, as well as in numbers, over time.  Many NGOs and non-profits work in the area, providing much-needed assistance to the people living around Zahle.  I wanted to look at the area again because of the breadth of the Syrian crisis and the potential long-term settlement of Syrians in Lebanon.  Services such as clinics have been established in these camps, but their sites may or may not have been chosen with optimal access for refugees in mind (such planning may not be possible in these circumstances).  For long-term planning, these become more important considerations for whoever ends up governing these settlements.

Below is a map of settlement locations in the Zahle district provided by the UN Syria Data Portal.  Each point represents multiple tents in the settlement.

The main consideration for clinic location is the level of service per person.  Based on a general criterion of 1 clinic per 15,000-20,000 people, we can allocate about 4 clinics to the area.  To determine these locations I used both the kmeans method, which finds the mean point of each cluster, and a location analysis algorithm that takes the weight of each point into account when choosing a location (special thanks to the author(s) of orloca, kmeans, and the always helpful ggplot2 packages in R).
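The clustering step can be illustrated with a short sketch.  This is not the R code behind the maps; it is a minimal Python version of plain (unweighted) k-means, written under a few assumptions: the function name `kmeans_centers` and the sample coordinates are made up for illustration, the starting centers are picked deterministically rather than at random, and longitude/latitude are treated as flat x/y coordinates, which is only a rough approximation over a small area.

```python
import math

def kmeans_centers(points, k, iterations=100):
    """Unweighted Lloyd's k-means on (x, y) tuples; returns (centers, labels)."""
    # Deterministic start (an assumption made for reproducibility):
    # take k points spread evenly through the input list.
    step = len(points) // k
    centers = [points[i * step] for i in range(k)]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each settlement joins its nearest center.
        labels = [
            min(range(k), key=lambda c: math.dist(p, centers[c]))
            for p in points
        ]
        # Update step: each center moves to the mean of its members.
        new_centers = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                mean_x = sum(x for x, _ in members) / len(members)
                mean_y = sum(y for _, y in members) / len(members)
                new_centers.append((mean_x, mean_y))
            else:
                new_centers.append(centers[c])  # keep an empty cluster's center
        if new_centers == centers:
            break  # nothing moved, so the clustering has stabilised
        centers = new_centers
    return centers, labels

# Made-up coordinates for illustration, not the UN settlement data.
sample = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
centers, labels = kmeans_centers(sample, k=2)
```

With four clinics to place, `k` would be 4 and each resulting cluster would become the service area for one clinic.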

For these purposes latitude and longitude of tent settlement locations are the most helpful.  Here the settlements or points are colored according to the population of that settlement.

As you can see, some settlements hold many more people than others, and the average settlement holds about 187 people (again, we're talking many tents per settlement).  Since people are not distributed equally across settlements, we use each point's "weight" (the settlement population) when determining a clinic location.
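To show how population weights can pull a location toward the larger settlements, here is a minimal Python sketch.  It assumes the weighted location step is a min-sum problem (find the point that minimises the population-weighted sum of straight-line distances) and solves it with Weiszfeld's iteration; that is my assumption for illustration and not necessarily what the orloca package does internally.  The function names and sample numbers are hypothetical, and coordinates are again treated as flat x/y.

```python
import math

def weighted_cost(point, points, weights):
    """Population-weighted sum of straight-line distances to `point`."""
    return sum(w * math.dist(point, p) for p, w in zip(points, weights))

def weighted_location(points, weights, iterations=500, tol=1e-10):
    """Approximate the min-sum location with Weiszfeld's iteration."""
    total = sum(weights)
    # Start from the weighted mean of the settlements.
    x = sum(w * px for (px, _), w in zip(points, weights)) / total
    y = sum(w * py for (_, py), w in zip(points, weights)) / total
    for _ in range(iterations):
        num_x = num_y = denom = 0.0
        for (px, py), w in zip(points, weights):
            # Clamp the distance to avoid dividing by zero when the
            # estimate lands exactly on a settlement (a simplification).
            d = max(math.hypot(px - x, py - y), 1e-12)
            num_x += w * px / d
            num_y += w * py / d
            denom += w / d
        new_x, new_y = num_x / denom, num_y / denom
        moved = math.hypot(new_x - x, new_y - y)
        x, y = new_x, new_y
        if moved < tol:
            break  # the estimate has stopped moving
    return x, y

# Made-up settlements: the one at the origin has double the population.
sites = [(0, 0), (4, 0), (0, 4), (4, 4)]
people = [2, 1, 1, 1]
clinic_x, clinic_y = weighted_location(sites, people)
```

In this toy example the unweighted center of the four sites is (2, 2); doubling the population at the origin moves the suggested clinic location along the diagonal toward that settlement.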

The clinics are located closest to the settlements with the highest number of people.  In the central Zahle area, these locations fall roughly in the middle in terms of latitude.  The other locations would be less intuitive if settlement population were not considered.  Obviously these points would change with more clinics, but this is considered a minimum service level.

Using only this method to determine the location of a clinic would be problematic because it ignores what is actually on the ground, such as street access or other local contingencies.  Planning for medical facilities is more of a long-term exercise than emergency or relief medicine, which may have shorter-term goals such as providing any care at all.  Still, accounting for the number of people being served and where they are located is an important starting point as these camps potentially become longer-term obligations.

Those interested in the R code can find it here.

Thursday, July 10, 2014

Syrian Refugee Density in Lebanon

I've done a few posts on Syria and have used data provided by the UNHCR for various analyses and visualizations.  There are several links on their Syrian refugee data portal that communicate the breadth of this crisis numerically and visually.

One such link had the locations of settlements in Lebanon and the number of people in each settlement.  This information is undoubtedly helpful for coordinating the location of services within camps and for tracking how the camps grow in general.  I was interested in seeing the growth of these camps: where, and during what time periods, the most growth occurred.

Below is a map of the country (or most of it) showing where all the tents are located that have been documented by the UNHCR.  Overlaid is a density plot communicating the concentration of structures (tents) with the number of people housed per structure.

The concentration of settlements has clearly been just outside of a town called Zahle.  If we look more closely at Zahle, we can more clearly see the number of people per tent in this settlement.  On average across Lebanon there are over 6 people per tent based on this UNHCR dataset, with some tents housing as many as 12.

If we look year by year, we can see how this area was settled more heavily during particular years.  2013 was a year in which a significant number of tents were constructed or set up.

No doubt this data is being used to coordinate the location of different public facilities such as clinics.  Data such as this provided by UNHCR gives burgeoning communities much-needed information on how to set up a town, or "plan" for how a settlement could be organized or mitigated differently.  The code for this data and these graphs, or at least most of it, is available on my GitHub account.

Tuesday, June 10, 2014

NBA Drafting

The NBA draft is quickly approaching.  Teams put much effort into selecting a player whose assets complement what the team needs.  Draftees also come in on much cheaper contracts than their more veteran counterparts and are therefore desirable from a value standpoint.  It becomes increasingly important, then, which pick a team gets, and even more so how well they use it (a No. 1 or No. 2 pick does not always guarantee a high level of performance).  In a recent interview with Bleacher Report, the head of analytics for the Denver Nuggets spoke a little about how they evaluate draft picks.  He made some interesting comments about the numbers his franchise looks at as they consider their picks.  Specifically, rebounds were an important metric that he described as a "hustle stat".  I looked at the numbers to see if what he was saying has actually held true over the last few years.  It turns out total rebounds per game is an important metric for increasing a draftee's chance of being chosen as a top five pick.

I pulled draft year stats for the past 4 years for the top 30 picks from the good people at Basketball-Reference to see if there were any metrics that increased a player's chances of being a top five pick each year.  Only draftees who had college stats were used for this analysis.  I wanted to see which per game metrics changed the likelihood of a top 5 selection in the draft class.  What I found is in the decision tree below.  I divided points per game, field goal attempts per game, minutes per game, and total rebounds per game into quartiles.  I lumped the bottom two quartiles together as the "Lower Quartile" and labeled the third and fourth quartiles the "Middle Quartile" and "Upper Quartile" respectively.
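The binning described above can be sketched in a few lines.  This is a Python illustration rather than the R code used for the tree; the function name `quartile_labels` and the sample values are hypothetical, and both the cut points (the standard library's default quantile method) and the handling of values that fall exactly on a cut point are assumptions.

```python
import statistics

def quartile_labels(values):
    """Label each value using the three-group scheme from the post."""
    # quantiles(..., n=4) returns the three cut points: Q1, median, Q3.
    _, median, q3 = statistics.quantiles(values, n=4)
    labels = []
    for v in values:
        if v <= median:
            labels.append("Lower Quartile")   # bottom two quartiles lumped
        elif v <= q3:
            labels.append("Middle Quartile")  # third quartile
        else:
            labels.append("Upper Quartile")   # top quartile
    return labels

# Made-up rebounds-per-game values, not the Basketball-Reference data.
rebounds = [1, 2, 3, 4, 5, 6, 7, 8]
groups = quartile_labels(rebounds)
```

Each per game metric would be binned this way within its draft class before being passed to the decision tree.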

Decision Tree for Top Five Draft Pick for 2009-2013

As you can see from the tree above, players whose rebounds per game are not in the "Upper Quartile" have a top five selection probability of .13.  The probability of being a top five pick is .33 if the draftee's points per game are in the "Lower Quartile" (the lower 50%, or average and below, of their draft class) and their rebounds per game are in the "Upper Quartile".  By contrast, if the draftee's rebounds per game are in the "Lower Quartile" (below the 50th percentile) and their points per game are above the 50th percentile (the "Middle Quartile" or "Upper Quartile"), their probability of being a top five pick is only .19.  Rebounds, it seems, are even more important than having someone who can score in the "Upper Quartile" of a draft class on a per game basis.

Alternatively, if a draft pick's playing time per game is in the "Upper Quartile" and their points per game are in the "Middle Quartile" or "Lower Quartile", this also yields a probability of .33 of a top five pick.  I interpret this as: if you don't have hustle but have had a lot of playing time in college, your top five pick probability still increases.  I don't interpret minutes per game in a college season to be as meaningful as rebounds, simply because minutes per game is more of a college coaching decision that may or may not be related to player performance.  Minutes are primarily a function of performance and not a standalone performance metric.

Rebounds matter for draft picks.  Players not showing a strong "hustle stat" have a lower probability of being a top five pick within their draft class, unless they happen to have played a lot of minutes, which also raised the top five pick probability.  The competition, shooting distances, and rules are obviously different in the NBA from college, and some college stats may not translate into professional performance.  That being said, this analysis does indicate that the "hustle stat", or rebounds, is meaningful for teams other than just the Nuggets.  High performance in this metric specifically increases the probability of being selected as a top five pick in the NBA draft.

Those interested in the R code can find most of it here.