In the previous post we discussed the precision of our model’s results, particularly compared to predictions made by other forecasters and pollsters. On a state-by-state level our BASON method was arguably the most precise, having missed only 3 states: Michigan, Wisconsin and New Hampshire. For Wisconsin, however, we did not have enough survey respondents and had to use an average of polls, so our own method actually missed only Michigan (where we gave Hillary a 0.5 point advantage) and New Hampshire (where we gave Trump a 1 point advantage). Our biggest success was correctly predicting Pennsylvania, Florida and North Carolina, the three key swing states that carried this election for Trump.
In this post we will examine our results a bit further and try to find out whether the polls had a problem of underestimating Trump which led to the systematic bias in their predictions.
Let’s start by looking at how close we were in the states we predicted ourselves compared to the ones we did not. Keep in mind that we had enough data to predict only 30 states (29+DC), while for the remaining 21 we had to use the polling average (I explain how we made the average in this post). Most of these 21 were traditional Red and Blue states, so the polls had no trouble calling the winner there (except, obviously, for Wisconsin, which was the big surprise of the election).
The first graph looks at the popular vote the winner got in each state (x-axis) compared with the difference between actual results and our prediction (y-axis), all in percentage points. We separated the 30 states that we predicted (orange dots) versus the 21 states where we used the polling average (green diamonds).
Two immediate conclusions arise: (1) Wherever we used the polling average, it is clear that the polls underestimated the winner’s performance. In fact, across the 21 traditional Red and Blue states there wasn’t a single case in which the polling average gave the winner more than he or she actually got. (2) Our model was within 6 percentage points of the actual result in every state, and in the most important swing states (FL, PA, NC, VA) it was within 1 percentage point, as it was in several others such as AZ, MO, NY, TX and KS. Ohio was the only important state where we underestimated the scope of Trump’s victory, even though we correctly called it in his favor. It was the only key race that the model suggested would be closer than it actually was. Nevertheless, correctly predicting FL, PA and NC was key to getting the overall prediction right, as no pollster or forecaster was giving all three to Trump.
The ‘Shy Trump’ voters?
Let’s examine the underestimation of the polling averages a bit further, and only look at the 30 states for which we did make our prediction. We want to see how the polls performed with respect to each candidate, to see which one was more underestimated.
The second graph therefore compares the error of our method (x-axis) with the error of the polling average (y-axis), where the error is the difference between the predicted and the actual vote share for Donald Trump. In other words, for the polling average any dots above the horizontal line overestimate Trump, while any dots below it underestimate him. For our model, overestimation is to the right of the vertical line and underestimation is to the left of it.
You can see that our model underestimates and overestimates Trump to a roughly equal extent across all states, and is most precise in the most important swing states. The polls, on the other hand, consistently underestimate Trump in almost every state. The only outlier, where they overestimated Trump by almost 6 percentage points, was DC. This implies that the polls systematically and significantly underestimated Donald Trump.
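To make the comparison concrete, here is a minimal sketch of how signed errors of this kind could be tallied. The state labels and vote shares are invented for illustration only and are not our data, and the helper names `signed_errors` and `summarize` are hypothetical.

```python
from statistics import mean

def signed_errors(predicted, actual):
    """Predicted minus actual vote share (percentage points) for each state.

    Positive values are overestimates, negative values are underestimates.
    """
    return {state: predicted[state] - actual[state] for state in predicted}

def summarize(errors):
    """Average signed error plus counts of over- and underestimates."""
    values = list(errors.values())
    return {
        "mean_signed_error": mean(values),
        "overestimated": sum(v > 0 for v in values),
        "underestimated": sum(v < 0 for v in values),
    }

# Hypothetical vote shares for one candidate in four made-up states.
actual = {"A": 48.0, "B": 52.0, "C": 45.0, "D": 60.0}
polls = {"A": 45.0, "B": 48.5, "C": 43.0, "D": 61.0}

poll_summary = summarize(signed_errors(polls, actual))
print(poll_summary)
```

A mean signed error close to zero with a similar number of over- and underestimates is what balanced errors look like; a clearly negative mean with mostly underestimates is the pattern described above for the polling average.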
Looking at the same numbers for Hillary Clinton, we can see that the polls were relatively good at estimating her vote share. For most states they fall within 2 percentage points of the actual result, and for about 10 states the polling average was spot on. Our method once again overestimated and underestimated Clinton to an equal extent, being the most precise where it mattered the most.
Taking all this into account, the key to understanding the pollsters’ underestimation of Trump lies in the undecided voters. In other words, the hypothesis of a ‘Shy Trump’ voter could be true: many Trump voters simply did not want to identify themselves as such in the polls, most likely because they mistrusted the pollsters, or for any other equally likely reason. Or they really were undecided until the very last minute, making the final decision in the polling booth itself.
Finally, let’s examine this systematic bias a bit further by comparing the calibration of our model with that of the polling average (calibration here meaning how closely the predictions track the actual results). The following graph plots the predictions (y-axis) against the actual results (x-axis) for our method (blue dots) and the polling average (orange dots). A good prediction should have a slope close to 1, which is roughly what our method achieved (a slope of 1.1). The polling averages, on the other hand, had a flatter slope of 0.77, which confirms a systematic underestimation of Trump even in states which Clinton easily won.
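As a rough illustration of what a calibration slope measures, the sketch below fits an ordinary least-squares line of predictions against actual results. It assumes a plain least-squares fit, which may differ from how the slopes above were computed, and the numbers are invented so that the answer is easy to check by hand. The function name `fit_line` is hypothetical.

```python
def fit_line(x, y):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of squared deviations of x, and of cross deviations of x and y.
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical actual vote shares (x) and predictions (y), in percent.
actual = [30.0, 40.0, 50.0, 60.0, 70.0]
# Made-up predictions that are "compressed" toward the middle:
# each one equals 0.8 * actual + 8.
predicted = [32.0, 40.0, 48.0, 56.0, 64.0]

slope, intercept = fit_line(actual, predicted)
print(f"slope={slope:.2f}, intercept={intercept:.2f}")
```

In this toy example the slope comes out below 1, meaning the predictions move less than the actual results do from state to state, which is the sense in which a flatter slope signals poor calibration.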
How large is this systematic bias? The following graph can answer this question. It compares the pollsters’ errors for Trump with their errors for Clinton (the error being the difference between the prediction and the actual result). A good poll is expected to show a clear linear trend going through the origin, given that overestimating one candidate in a given state implies underestimating the other (adjusted for the noise from third party candidates). The polling average does exhibit this expected linear trend, but with a big offset of around 4.3 percentage points. This is roughly the size of the systematic bias which led the pollsters to underestimate Trump’s chances.
Our method, on the other hand, has a much lower offset of less than 1 percentage point, which is yet another important reason why it did not make the same mistake of underestimating Trump.
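One simple way to put a number on such an offset is sketched below. It assumes the trend line has a slope of exactly -1 (every point a forecast takes from one candidate goes to the other), in which case the offset is just the average of the two errors added together. This is a simplification and not necessarily how the figures above were obtained. The data are invented and the function name `offset_from_origin` is hypothetical.

```python
from statistics import mean

def offset_from_origin(errors_trump, errors_clinton):
    """Intercept of the line clinton_error = -trump_error + offset.

    Assumes a slope of -1, so the least-squares intercept reduces to the
    mean of the summed errors. Errors are predicted minus actual, in
    percentage points.
    """
    return mean(t + c for t, c in zip(errors_trump, errors_clinton))

# Hypothetical per-state errors for a forecast that underestimates Trump
# without overestimating Clinton by the same amount.
errors_trump = [-6.0, -4.0, -2.0, 0.0]
errors_clinton = [2.5, -0.5, -2.0, -4.0]

offset = offset_from_origin(errors_trump, errors_clinton)
print(f"offset = {offset:.1f} percentage points")
```

An offset near zero means the two candidates' errors cancel out on average, while a large offset in either direction means the forecast is missing votes that did not show up for either candidate in the predictions.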