Over the next month Oraclum I.S. will be engaged in forecasting the outcome of the 2016 US Presidential elections. It has been a hectic electoral year, culminating in the unexpected nomination of Donald Trump by the Republican Party and the unexpectedly difficult path to nomination for the Democratic candidate Hillary Clinton. So far, according to the polls, Hillary Clinton is in the lead; however, the extent of her advantage over Trump has had its ups and downs during the summer.
However, the problem is that not everyone trusts the polls anymore. This is surprising to a certain extent, given that in the US, unlike in Europe, pollsters and particularly polling aggregation sites (like FiveThirtyEight) have on aggregate been quite accurate in their predictions thus far, at least for the Presidential elections (although the polls themselves are not a prediction tool; they are simply representations of preferences at a given point in time). Still, one cannot escape the overall feeling that pollsters are losing their reputation, as they are often accused of complacency, sampling errors, and even deliberate manipulation.
There are legitimate reasons for this, however. With the rise of online polls, proper sampling can be extremely difficult. Online polls are based on self-selection of the respondents, making them non-random and hence biased towards a particular voter group (the young, the better educated, the urban population, etc.), despite the efforts of those behind these polls to adjust them for various socio-demographic biases. On the other hand, the potential sample for traditional telephone (live interview) polls is in sharp decline, making them less and less reliable. Telephone interviews are usually done during the day, biasing the results towards stay-at-home moms, retirees, and the unemployed, while most people, for some reason, do not respond to mobile phone surveys as eagerly as they once did to landline surveys. With all this uncertainty it is hard to gauge which poll(ster) we should trust and to judge the quality of different prediction methods.
Is it possible to have a more accurate prediction by asking people how confident they are that their preferred choice will win?
However, what if the answer to ‘what is the best prediction method’ lies in asking people not only who they will vote for, but also who they think will win (as ‘citizen forecasters’) and more importantly, how they feel about who other people think will win? Sounds convoluted? It is actually quite simple.
There are a number of scientific methods out there that aim to uncover how people form opinions and make choices. Elections are just one of the many choices people make. When deciding who to vote for, people usually succumb to their standard ideological or otherwise embedded preferences. However, they also carry an internal signal which tells them how much chance their preferred choice has. In other words, they think about how other people will vote. This is why, as game theory teaches us, people tend to vote strategically and do not always pick their first choice, but opt for the second or third, only to prevent their least preferred option from winning.
When pollsters conduct surveys they are only interested in figuring out the present state of people’s ideological preferences. They have no idea why someone made the choice they made. And if the polling results are close, the standard saying is: “the undecided will decide the election”. What if we could figure out how the undecided will vote, even if we do not know their ideological preferences?
One such method, focused on uncovering how people think about elections, is the Bayesian Adjusted Social Network (BASON) Survey. The BASON method is first and foremost an Internet poll. It uses the social networks between friends on Facebook and followers and followees on Twitter to conduct a survey among them. The survey asks the participants to express: 1) their vote preference (e.g. Trump or Clinton); 2) how much they think their preferred candidate will get (in percentages); and 3) how they think other people will estimate Trump’s or Clinton’s chances of winning.
Let’s clarify the logic behind this. Each individual holds some prior knowledge as to what he or she thinks the final outcome will be. This knowledge can be based on current polls, or drawn from the information held by their friends and by people they consider better informed about politics. Based on this it is possible to draw upon the wisdom of crowds, where one searches for informed individuals, thus bypassing the need to compile a representative sample.
However, what if the crowd is systematically biased? For example, many in the UK believed that the 2015 election would yield a hung parliament, including Murr’s (2016) citizen forecasters (although in relative terms the citizen forecaster model was the most precise). In other words, information from the polls creates a distorted perception of reality, which is fed back to the crowd, biasing their internal perception. To overcome this, we need to see how much individuals within the crowd diverge from the opinion polls, but also from their internal networks of friends.
Depending on how well they estimate the chances of their preferred choices (compared to what the polls are saying), the BASON method gauges their predictive power and gives a higher weight to the better predictors (e.g. if the polls are predicting a 52%-48% outcome in a given state, a person estimating that one candidate will get, say, 90% is given an insignificant weight). Group predictions can be completely wrong of course, as closed groups tend to suffer from confirmation bias. In the aggregate, however, there is a way to get the most out of people’s individual opinions, no matter how internally biased they are. The Internet makes all of them easily accessible for these kinds of experiments, even if the sampling is non-random.
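To make the weighting idea more concrete, here is a minimal sketch in Python. It is not the actual BASON formula, which is not spelled out here. It simply assumes a two-candidate race and a Gaussian kernel that shrinks a respondent's weight as his or her estimate moves away from the poll figure for the preferred candidate. The names Response, predictor_weight, weighted_share and the bandwidth parameter are all hypothetical, chosen for illustration only.

```python
import math
from dataclasses import dataclass


@dataclass
class Response:
    vote: str            # respondent's preferred candidate
    own_estimate: float  # vote share (%) they expect that candidate to get


def predictor_weight(estimate: float, poll_share: float, bandwidth: float = 10.0) -> float:
    """Assumed Gaussian kernel: the weight shrinks as the estimate diverges from the poll."""
    return math.exp(-((estimate - poll_share) ** 2) / (2 * bandwidth ** 2))


def weighted_share(responses, polls, candidate, bandwidth=10.0):
    """Weighted mean of the share each respondent implies for `candidate` (two-way race)."""
    numerator = 0.0
    denominator = 0.0
    for r in responses:
        # Compare the respondent's estimate with the poll figure for their own candidate.
        w = predictor_weight(r.own_estimate, polls[r.vote], bandwidth)
        # Convert every answer into an implied share for the candidate of interest.
        implied = r.own_estimate if r.vote == candidate else 100.0 - r.own_estimate
        numerator += w * implied
        denominator += w
    return numerator / denominator


# Illustrative numbers only: polls at 52%-48%, and one respondent expecting 90%.
polls = {"Clinton": 52.0, "Trump": 48.0}
responses = [
    Response("Clinton", 53.0),
    Response("Trump", 49.0),
    Response("Trump", 90.0),
]

forecast = weighted_share(responses, polls, "Clinton")
unweighted = sum(
    r.own_estimate if r.vote == "Clinton" else 100.0 - r.own_estimate
    for r in responses
) / len(responses)
print(f"weighted: {forecast:.1f}%, unweighted: {unweighted:.1f}%")
```

With these made-up answers the respondent expecting 90% receives a weight close to zero, so the weighted estimate stays near the 52% poll figure, while a plain average of the same answers would be dragged down to 38%. A real implementation would also have to use the third survey question and the respondent's friend network, both of which this sketch leaves out.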
So, over the coming month Oraclum will be conducting the survey across the United States. The survey will start on October 10th and will run up until Election Day (November 8th), when we will provide our final forecast. Our forecasts will show the electoral votes, the total predicted percentages, and the probability distributions for the two main candidates, all presented on a map of the US states. They will also show the distribution of preferences among the friends of each user (so that the user can see how his or her social network is behaving and who they, as a group, are voting for), and the aggregate predictions given by the survey respondents.