Sunday, April 19, 2015

Boston Elite Field 2015

Last year I posted about how the chances of a runner from a non-African country winning the Boston Marathon seemed good because of the widening interval of winning times (recent years had seen some historically "slower" races and some historically "faster" ones), and this actually happened. Meb Keflezighi ran a remarkable race and was widely celebrated as he represented the US in a race more recently dominated by African countries. His winning time was obviously the fastest on the day, but others in the field had faster PRs. Because of the variation in winning times, my conclusion has been that runners representing non-African countries have real opportunities to contest the race well.


The number of participants from Africa in the elite field clearly increases the likelihood that the winner represents an African country. The PRs of the runners in the elite field mostly fall within or below the confidence interval shown in the graph above, with the slight exception of Matt Tegenkamp, whose marathon PR of about 2:12 sits just above the range the interval covers. It is clear that once again the elite field is dominated by African runners who are putting up some really impressive PRs.



And yet, there was a similar difference in PRs last year. Dennis Kimetto came to the race with a 2:03 PR, and Meb Keflezighi won the Boston Marathon having previously run a 2:09 PR. Thus we have another great story this year: incredible athletes, some of whom have in the past run much faster than others. And yet, who can tell what will happen on race day?

But why try? Why did Meb think he could beat someone who, in marathon terms, could go somewhere he could not? More broadly, why do we love these events? Why should Matt Tegenkamp attempt to rival someone who would be 2 miles ahead of him on each of their best days? Variance. Among these elite athletes there is the notion that on any given day, the guy next to you could be at his best or his worst. As spectators, we're drawn to variance...we love the possibility that things will not turn out predictably, or that there is variation in what we assume to be true. Athletes place their hopes in this: that they could run their absolute best and others may not. Confidence intervals tell the story of variance, that statistically we can't know for certain. I think this year, yet again, we could see this same variance play out. The athlete who doesn't have the fastest PR runs their best despite the odds. This is what makes a great race and what we could see again tomorrow.

Monday, January 26, 2015

Presidential Approval and Applause

Some may have seen a Twitter post about spurious correlations that I and others shared. Basically, it was a joke about how correlations can be found between many things that certainly have no influence over each other. I mention this because this post may or may not be in that category ;-)

About this time last year I looked at the two most recent State of the Union speeches and talked about the political priorities ostensibly shown in each. For those who don't know, the State of the Union is the speech that the President of the United States delivers at the beginning of each year to a joint session of Congress (that is, both House and Senate). Traditionally, the aim of this speech is to outline the President's priorities for the coming year and to give a bit of an idea of where the United States stands in general, or the "state of the union".

One of the more nuanced parts of the speech is that there are periods where the President is either interrupted with applause by members who feel what he is saying is good, or where he pauses to allow for applause (typically from his party). The speeches are fairly lengthy. Over the past several years they have averaged about an hour. It turns out applause is definitely a big part of the speech (it's polite, after all). Across President Obama's speeches during his terms, the word "applause" appears more than any other word (articles aside). If applause lasts about 10 seconds on average, we're looking at somewhere around 12-13 minutes of total applause during each of his speeches. This is only an estimate, since the text of the speeches marks each instance simply as "(applause)" and does not record how long it lasted.
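The estimate above can be sketched in a few lines of Python. This is only a rough illustration: the 10-second average is the assumption stated in this post, the function name `estimate_applause_minutes` is made up for this example, and the sample text is a hypothetical stand-in rather than a real transcript. Working backwards, 12-13 minutes at 10 seconds each implies roughly 72 to 78 applause breaks per speech.

```python
import re

def estimate_applause_minutes(transcript, seconds_per_break=10.0):
    """Count "(applause)" markers and estimate total applause time.

    seconds_per_break is an assumed average; transcripts do not record
    how long each round of applause actually lasted.
    """
    # Match "(applause)" in any letter case, with an optional trailing period.
    breaks = len(re.findall(r"\(applause\.?\)", transcript, flags=re.IGNORECASE))
    minutes = breaks * seconds_per_break / 60.0
    return breaks, minutes

# Hypothetical sample text, not taken from an actual speech.
sample = "We will rebuild. (applause) We will recover. (applause) Thank you. (applause)"
breaks, minutes = estimate_applause_minutes(sample)
print(breaks, minutes)  # 3 breaks -> 0.5 minutes

# The arithmetic behind the 12-13 minute range in the post.
for count in (72, 78):
    print(count, "breaks ->", count * 10 / 60, "minutes")
```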

So what's the point, other than that's a lot of clapping? I wanted to look at whether there were any similarities between the applause being given and the President's approval rating. Appropriately, I'll be using a popular graph theme from the political analysis site FiveThirtyEight to display this brief analysis. The theme was actually put together in R here, by Austin Clemens (thanks!).


So just by looking at the two lines, one indicating the number of times applause occurs during the speech and the other indicating the % approval (though the scale on the left is not in % terms), we can see that neither changes a lot, except in two years: 2010 and 2014. In 2010 he received 50% more applause than in the other years, and in 2014 about 30% more.

The question becomes: is the applause tactical, a way for the party to show support for the President in a period of lessening approval? The correlation coefficient was -0.50, but as with spurious correlations, this may well say nothing about the influence of approval on the amount of applause. Just looking at the graph, the percentage change isn't the same for Applause and Approval; however, the year-over-year changes in the two tend to move in opposite directions. For example, the Applause line moves up between 2009 and 2010 while the Approval line moves down over the same period.
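For anyone who wants to see how a correlation coefficient like this is calculated, here is a minimal Python sketch of the Pearson correlation. It is not the code behind this post (that analysis was done separately), and the two lists below are hypothetical placeholder values, not the applause counts or approval ratings used in the graph.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of the same length (at least 2)")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations, and sums of squared deviations.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when a sequence is constant")
    return cov / sqrt(var_x * var_y)

# Hypothetical placeholder values, one entry per speech year.
applause_counts = [60, 90, 65, 62, 60, 80]
approval_ratings = [57, 49, 50, 45, 52, 42]
print(round(pearson_r(applause_counts, approval_ratings), 2))
```

A value near -1 means the two series move in opposite directions, a value near +1 means they move together, and a value near 0 means there is little linear relationship. None of these says anything about whether one causes the other.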

Showing support and unity for a party leader by applauding is certainly reasonable, especially when support may be lacking from the general public. Guessing whether the applause is planned beforehand in response to the approval rating is more difficult. Then again, it's a bit more fun to think that members of Congress would tactically use this:


Code for this will appear on my GitHub page.

Monday, January 5, 2015

How Do You Create a Movement?

Assuming your idea has the potential, that is.

The science behind social networks is intriguing not only for how we look at relationships but also for how we consider the way ideas are spread...and conceivably how a "movement" that has its basis in an idea would be created.

Consider the simple social network below as your own, and suppose you have an idea that you want to see turn into a dynamic movement that perpetuates itself across multiple social groups.

*Click image to open up interactive network

It is simple enough that YOU can relate to having this many friendships, and to the idea that maybe some of your friends know each other.

If your friends are like mine, then they have friends of their own, some of whom you don't know. Although there's a greater chance you will meet them since you are friends with their friend, you don't know them now. This dynamic of sharing a common relationship with someone else is known as a "degree of separation" between you and that person. So in the network above we would say that, because Mike is friends with you, there is one degree of separation between Mike and John or Jane.
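To make the counting concrete, here is a small Python sketch that finds how many friendships separate two people using a breadth-first search. The friendships listed are an assumption based on the description of the picture (You connected to Mike, John, and Jane), and `hops_between` is just a name made up for this example. Note that it counts hops (friendships crossed); the number of people in between, which is how "degree of separation" is used in this post, is one less than that.

```python
from collections import deque

def hops_between(friends, start, goal):
    """Return the fewest friendships linking start to goal, or None if unconnected."""
    if start == goal:
        return 0
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        person, dist = queue.popleft()
        for friend in friends.get(person, ()):
            if friend == goal:
                return dist + 1
            if friend not in seen:
                seen.add(friend)
                queue.append((friend, dist + 1))
    return None

# Assumed friendships for the simple network described above.
friends = {
    "You": {"Mike", "John", "Jane"},
    "Mike": {"You"},
    "John": {"You"},
    "Jane": {"You"},
}

hops = hops_between(friends, "Mike", "John")
print(hops)      # 2 hops: Mike -> You -> John
print(hops - 1)  # 1 person in between (You)
```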

There is a theory in Social Network Science called "Small-World Theory". The theory suggests that human society can be characterized as a network of people linked by short chains of acquaintances. This may be more familiar to some as the theory of "Six degrees of separation", which basically says that any two people in human society are six or fewer degrees of separation apart by way of introduction. Let's look at another graph and eventually show how this would play out in an idea being communicated to people we would have no way of knowing except by the randomness of relationship of someone we know who knows someone, who knows someone....

Let's consider the kind of idea that would create a movement spreading through relationships. With the internet today, I think many would argue that ideas with the potential of creating a movement could be spread electronically via social media, email, etc. Let's just assume it takes personal interaction to spread an idea that's going to be cogent enough to create a movement (which could be argued, just not here, not now). Suppose I share this idea with Jane and John and they don't really care. But I share it with Mike and he tells three of his friends. Mary (one of Mike's friends) shares this idea with many of her friends.

*Click image to open up interactive network

So the idea that you wanted to share has now been heard by a totally different group, all of whom you do not know. Now consider Mary's friends. This opens up an even larger network for the idea to be communicated to their friends. However, one of Mary's friends, who is only an acquaintance to her, has access to even more people than Mary does. Another question arises, which is the strength of these connections. Does the movement of an idea rely on the perceived strength of the connections Mary has across her network, or can the "weak" connections (those she may not value, or people she doesn't think are going to give momentum to this idea) be beneficial to the spread of the idea? Not all relationships are equal, but the communication of an idea is. Consequently, Mary may reach more people who could take hold of this idea and create a movement through a weak connection...say Roselia.

*Click image to open up interactive network

Now Roselia also had a weak connection with whom she shared this idea, and he happened to have just as many connections with whom he shared it in turn. You can see how a movement (at least what we will call one) can begin from YOU having only a few people who share your idea. However unlikely the spread of an idea into a movement is perceived to be, that unlikelihood is countered by the potential relationships of those with whom it is shared. So starting a movement starts with sharing ideas with a few people who are willing to share, who are willing to share, who are willing to share...
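The story above can be played out as a small simulation. This is a rough Python sketch under some loud assumptions: the friendship list is invented for illustration (only You, Jane, John, Mike, Mary, and Roselia come from this post, and the "Friend A" through "Friend E" names are placeholders), and a person either shares the idea with all of their friends or with none of them.

```python
from collections import deque

def build_network(edges):
    """Turn a list of (person, person) friendships into a two-way lookup."""
    friends = {}
    for a, b in edges:
        friends.setdefault(a, set()).add(b)
        friends.setdefault(b, set()).add(a)
    return friends

def spread_idea(friends, origin, sharers):
    """Return everyone who hears the idea when only `sharers` pass it on."""
    heard = {origin}
    queue = deque([origin])
    while queue:
        person = queue.popleft()
        if person != origin and person not in sharers:
            continue  # heard the idea but does not pass it on
        for friend in friends.get(person, ()):
            if friend not in heard:
                heard.add(friend)
                queue.append(friend)
    return heard

# Hypothetical friendships; placeholder names stand in for unnamed friends.
friends = build_network([
    ("You", "Jane"), ("You", "John"), ("You", "Mike"),
    ("Mike", "Mary"), ("Mike", "Friend A"), ("Mike", "Friend B"),
    ("Mary", "Roselia"), ("Mary", "Friend C"),
    ("Roselia", "Friend D"), ("Roselia", "Friend E"),
])

# Jane and John don't care; Mike, Mary, and Roselia pass the idea along.
reached = spread_idea(friends, "You", sharers={"Mike", "Mary", "Roselia"})
print(len(reached))  # 11 people, counting You

# If nobody passes it on, the idea stops with your own friends.
print(len(spread_idea(friends, "You", sharers=set())))  # 4 people, counting You
```

Removing a single sharer such as Mary from the set cuts off everyone beyond her, which is the point about weak connections: one willing person can be the only bridge to a whole new group.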