Today's Data Exercise: The @fivethirtyeight / Intrade Presidential Election Arbitrage #Analytics
(Nerd alert! You have been warned.)
Unoriginally, I'm a big fan of Nate Silver's fivethirtyeight blog. I've learned a ton from him (currently also reading his book The Signal and the Noise). For a little while now I've been puzzling over the relationship between his "Nowcast" on the presidential election and the price of Obama 2012 contracts at Intrade. Take a look at this chart I made based on the data from each of these sources:
If we look past Obama's disastrous first debate and compare the seven-day moving averages of the 538 Obama win probability and the Intrade Obama 2012 contract price, the difference fluctuates between roughly 10 and 15 points; call it 12. Also, judging by the volumes, the heaviest trading seems to happen around midweek, before Friday. So if you trust Nate's projections, and unless you've got the inside scoop on any big negative surprises to come, the logical thing to do is to buy Obama 2012s tomorrow, with an expected gain of about $1.20 on each contract (roughly 20%).
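For anyone who wants to check that arithmetic, here's a back-of-the-envelope sketch. It assumes the Intrade convention that a contract settles at $10 if the event happens and $0 otherwise, so each point of price is worth $0.10. The 63-point market price and the `expected_edge` helper are made up for illustration; they are not actual quotes or anything from Intrade.

```python
# Assumption: an Intrade contract pays $10.00 if the event happens and $0
# otherwise, so a price quoted in points (0-100) costs points * $0.10.
DOLLARS_PER_POINT = 0.10


def expected_edge(model_prob_pct, market_price_pts):
    """Return (expected gain in dollars per contract, gain as a fraction of cost).

    model_prob_pct: win probability you believe in, in percent (e.g. from 538).
    market_price_pts: contract price in points (hypothetical here).
    """
    cost = market_price_pts * DOLLARS_PER_POINT
    expected_payout = model_prob_pct * DOLLARS_PER_POINT
    gain = expected_payout - cost
    return gain, gain / cost


# Hypothetical numbers: market at 63 points, model 12 points higher.
gain, pct = expected_edge(model_prob_pct=75.0, market_price_pts=63.0)
print(f"expected gain ${gain:.2f} per contract ({pct:.0%} of cost)")
```

Note this is an expected value only if the model's probability is right; any single contract still pays either $10 or nothing.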
Now for the nerdy part:
First, the easy job: Intrade lets you download historical prices on its contracts.
Next, the harder job: Nate doesn't provide a .csv of his data. But if you "view source" on his page, you'll see a file called:
right after the label "Data URL".
If you take a look at this file, you'll notice it's JavaScript-chart-friendly, but not so friendly for the kind of analysis above. The first order of business was to cut out the stuff I didn't want, like the Senate race data and the forecast part of the presidential polls. Then I dropped the data before 10/1, because I thought trends in a more thinly traded market would be less relevant.
For a little while I fiddled with the Stanford Visualization Group's Data Wrangler tool to reshape the remaining data into the .csv I needed. It's a powerful tool, but in this case it turned out to be easier to wrangle the file into the structure I wanted by hand:
"date","obama_votes","romney_votes","obama_win_pct","romney_win_pct","obama_pop_vote","romney_pop_vote"
"2012-10-30",298.8,239.2,79.5,20.5,50.4,48.6
"2012-10-29",294.4,243.6,75.2,24.8,50.2,48.8
etc.
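A file in that shape is easy to read back in. Here's a minimal sketch using only the Python standard library; it embeds just the two rows shown above, and the `load_538` helper name is mine, not anything from 538. It also applies the 10/1 cutoff mentioned earlier.

```python
import csv
import io
from datetime import date

# Only the two rows shown above; the real file would have one row per day.
SAMPLE = '''"date","obama_votes","romney_votes","obama_win_pct","romney_win_pct","obama_pop_vote","romney_pop_vote"
"2012-10-30",298.8,239.2,79.5,20.5,50.4,48.6
"2012-10-29",294.4,243.6,75.2,24.8,50.2,48.8
'''


def load_538(text, start=date(2012, 10, 1)):
    """Parse the CSV text and return {date: obama_win_pct} for dates >= start."""
    rows = {}
    for rec in csv.DictReader(io.StringIO(text)):
        day = date.fromisoformat(rec["date"])
        if day >= start:
            rows[day] = float(rec["obama_win_pct"])
    return rows


win_pct = load_538(SAMPLE)
for day in sorted(win_pct):
    print(day, win_pct[day])
```

To read from disk instead, pass the file's contents (for example `open(path, newline="").read()`) in place of `SAMPLE`.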
Combining the Intrade and 538 data and then plotting the Intrade close and the "Obama win pct" series results in the chart above.
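The combining step, and the seven-day moving-average gap discussed earlier, can be sketched roughly like this. The function names are my own, the window counts rows rather than calendar days (so it assumes one row per day in both series), and the numbers at the bottom are made-up placeholders, NOT real 538 or Intrade data.

```python
from datetime import date, timedelta


def moving_average(series, window=7):
    """Trailing mean over up to `window` most recent rows, keyed by date."""
    days = sorted(series)
    out = {}
    for i, day in enumerate(days):
        recent = [series[d] for d in days[max(0, i - window + 1): i + 1]]
        out[day] = sum(recent) / len(recent)
    return out


def spread(model, market, window=7):
    """Moving-average difference (model minus market) on dates both series share."""
    shared = sorted(set(model) & set(market))
    model_avg = moving_average({d: model[d] for d in shared}, window)
    market_avg = moving_average({d: market[d] for d in shared}, window)
    return {d: model_avg[d] - market_avg[d] for d in shared}


# Made-up placeholder values for illustration only.
start = date(2012, 10, 1)
model = {start + timedelta(days=i): 75.0 + i for i in range(10)}   # "obama_win_pct"
market = {start + timedelta(days=i): 63.0 + i for i in range(10)}  # Intrade close
gaps = spread(model, market)
for day in sorted(gaps):
    print(day, round(gaps[day], 1))
```

Joining on the dates both series share means a day missing from either source is simply dropped rather than filled in.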