Wednesday, 7 November 2012

Gotcha!

I should start this with a disclaimer, i.e. that I'm not really claiming any "success" with this post. But I find it quite interesting that the estimates produced by this very, very simple model turned out to be quite good.

The idea was to use the existing polls (that was a few days ago, even before the super-storm), which had been collated and presented as an estimate of the proportion of voters for either party, together with some measure of uncertainty. Based on these, I constructed informative prior distributions, which I then propagated to estimate the election results.
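
Just to fix ideas, here is a very rough sketch of the mechanics in R $-$ this is not the actual model (the full code is in the earlier "Bayes for President!" post), but a simplified stand-in. I'm assuming a hypothetical data frame polls containing, for each state, the poll average for Obama (obama_share), its standard deviation (sd) and the number of electoral votes (ev); each state then gets a Beta prior matched by the method of moments, which is propagated by simulation to the electoral college total.

set.seed(1234)

# method-of-moments Beta parameters with mean mu and sd sigma
# (valid as long as sigma^2 < mu*(1-mu))
beta.par <- function(mu, sigma) {
   k <- mu * (1 - mu) / sigma^2 - 1
   list(a = mu * k, b = (1 - mu) * k)
}

n.sims <- 10000
ev.sim <- numeric(n.sims)
for (i in 1:n.sims) {
   # simulate a vote share for Obama in each state from its Beta prior
   p <- sapply(1:nrow(polls), function(s) {
      bp <- beta.par(polls$obama_share[s], polls$sd[s])
      rbeta(1, bp$a, bp$b)
   })
   # winner-takes-all electoral votes (ignoring the ME/NE splits)
   ev.sim[i] <- sum(polls$ev[p > .5])
}

mean(ev.sim)                        # expected EVs for Obama
quantile(ev.sim, c(.05, .5, .95))   # median and 90% credible interval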

As it turns out, when compared with the projections of the final results, the predictions were accurate, as the following graph shows: the dots and lines indicate the average prediction and the 50% (darker) and 90% (lighter) credible intervals; the crosses are the observed proportions of votes for Obama.

In all states, the prediction was "correct" (in the sense that the right "colour" was estimated). In some cases, the observed results were a bit more extreme than the predicted ones, e.g. in Washington (WA) the actual proportion of votes for Obama was substantially larger than predicted $-$ but this has no real consequence for the final estimation of the election results, as WA was already estimated to be a safe Democratic state; and the same holds for all the other under/over-estimated cases.

Based on the model, my final estimate was that Obama would get 304 EVs. At the moment, the Guardian is reporting 303 $-$ so pretty good!

But, as I said, this is really not to brag, but rather to reflect on the point that, while the race was certainly close, it probably wasn't as close as the media made it out to be. Famously, Nate Silver gave Obama a probability of winning the election exceeding 80%, a prediction which gave rise to some controversy $-$ but he was spot on.

Also, I think it's interesting that, at least in this case, the polls were quite representative of the "true" population and what most people said they would do was in fact very similar to what most people actually did.

14 comments:

  1. Awesome, dear Gianluca! I'll give it a thorough look as soon as I have some free time! Nice job!

    /federico

  2. Can you please post the R code you used to download/process the poll data and build the model(s) that predicted the results? Thanks!

    1. Inkhorn, the code is in this post; I think you will find everything you need, including links to the original files (which I downloaded manually from the sources cited there).

  3. Really Great!!
    I successfully reproduced your code in "Bayes for President!", but I could not get coefplot2 to compile. Are there better sources than R-Forge?

    1. Thanks! Perhaps this can help you $-$ I think there is an issue with coefplot2 in non-updated versions of R.

  4. This comment has been removed by the author.

  5. Great follow-up post. I appreciated both the original and the follow-up. One point... You say ``In all states, the prediction was "correct" (in the sense that the right "colour" was estimated)''; however, I believe the model incorrectly predicted Florida as a red state. I may not be understanding the original depiction or your follow-up statement. In any event, you'll become really popular in 2 more years, and even more so in 4. Thank you for your posts.

    1. This comment has been removed by a blog administrator.

  6. Hi Gianluca, how did you install the coefplot2 package? I am trying to install it on the newest R (2.15.2) on Ubuntu, but it fails.

    Thanks.

    1. Linguist: I don't really remember whether there was any particular trick needed to install coefplot2... Do you also have the library "arm" installed? You probably know this already, but coefplot2 is just a modification of the function coefplot (which is part of arm); so perhaps you need both for it to work?
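
      For what it's worth, something along these lines should work (just a sketch, assuming coefplot2 is still hosted on R-Forge and that you have the usual build tools installed):

      install.packages("arm")   # the dependency mentioned above, from CRAN
      # then try the R-Forge repository; type = "source" may be needed
      # if no binary is available for your platform
      install.packages("coefplot2",
                       repos = "http://R-Forge.R-project.org",
                       type = "source")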

  7. Thanks Gianluca. I am trying to replicate your scripts and I got this error message:

    > a <- b <- numeric()
    > for (s in 1:Nstates) {
    + if (m[s] < .5) {
    + bp <- betaPar2(m[s],.499999,Confidence[s]/100)
    + a[s] <- bp$res1
    + b[i] <- bp$res2
    + }
    + if (m[s] >=.5) {
    + bp <- betaPar2(1-m[s],.499999,Confidence[s]/100)
    + a[s] <- bp$res2
    + b[s] <- bp$res1
    + }
    + }
    Error in b[i] <- bp$res2 : object 'i' not found

  8. Linguist, there's a typo $-$ the code should be b[s] (instead of b[i]). That should do it!
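
    In other words, something like this should do it (with betaPar2, m, Confidence and Nstates defined as in the original script):

    a <- b <- numeric()
    for (s in 1:Nstates) {
      if (m[s] < .5) {
        bp <- betaPar2(m[s], .499999, Confidence[s]/100)
        a[s] <- bp$res1
        b[s] <- bp$res2   # this was the line with the typo (b[i])
      }
      if (m[s] >= .5) {
        bp <- betaPar2(1 - m[s], .499999, Confidence[s]/100)
        a[s] <- bp$res2
        b[s] <- bp$res1
      }
    }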
