The following blog post was first published at the Oxford University Politics Blog. Here is just the first part, explaining the logic behind our BAFS method.
Is it possible to have a more accurate prediction by asking people how confident they are that their preferred choice will win?
As the Brexit referendum date approaches, the uncertainty regarding its outcome is increasing, and so are concerns about the precision of the polls. The forecasts are, once again, suggesting a very close result. Ever since the general election of May 2015, criticism of pollsters has been rampant. They have been accused of complacency, of herding, of sampling errors, and even of deliberately manipulating their results.
The UK is hardly the only country where pollsters are swiftly losing their reputation. With the rise of online polls, proper sampling can be extremely difficult. Online polls are based on self-selection of the respondents, making them non-random and hence biased towards particular voter groups (the young, the better educated, the urban population, etc.). On the other hand, the potential sample for traditional telephone (live interview) polls is in sharp decline, making them less and less reliable. Telephone interviews are usually conducted during the day, biasing the results towards stay-at-home moms, retirees, and the unemployed, while most people, for some reason, do not respond to mobile phone surveys as eagerly as they once did to landline surveys. With all this uncertainty it is hard to gauge which poll(ster) we should trust and how to judge the quality of different prediction methods.
However, what if the answer to ‘what is the best prediction method’ lies in asking people not only who they will vote for, but also who they think will win (as ‘citizen forecasters’), and more importantly, how they feel about who other people think will win? Sounds convoluted? It is actually quite simple.
There are a number of scientific methods out there that aim to uncover how people form opinions and make choices. Elections are just one of the many choices people make. When deciding who to vote for, people usually succumb to their standard ideological or otherwise embedded preferences. However, they also carry an internal signal which tells them how much chance their preferred choice has. In other words, they think about how other people will vote. This is why, as game theory teaches us, people tend to vote strategically and do not always pick their first choice, but opt for the second or third, only to prevent their least preferred option from winning.
When pollsters conduct surveys they are only interested in figuring out the present state of people’s ideological preferences. They have no idea why someone made the choice they did. And if the polling results are close, the standard saying is: “the undecided will decide the election”. What if we could figure out how the undecided will vote, even if we do not know their ideological preferences?
One such method, focused on uncovering how people think about elections, is the Bayesian Adjusted Facebook Survey, or BAFS for short. The BAFS method is first and foremost an Internet poll. It uses the social networks between friends on Facebook to conduct a survey among them. The survey asks the participants to express: 1) their vote preference (e.g. Leave or Remain); 2) how much they think their preferred choice will get (in percentages); and 3) how they think other people will estimate the chances of Leave or Remain winning the day.
Let’s clarify the logic behind this. Each individual holds some prior knowledge as to what he or she thinks the final outcome will be. This knowledge can be based on current polls, or drawn from the information held by their friends and people they find more informed about politics. Based on this, it is possible to draw upon the wisdom of crowds, searching for informed individuals and thus bypassing the need for a representative sample.
However, what if the crowd is systematically biased? For example, many in the UK believed that the 2015 election would yield a hung parliament – even Murr’s (2016) citizen forecasters (although in relative terms the citizen forecaster model was the most precise). In other words, information from the polls is creating a distorted perception of reality which is returned back to the crowd biasing their internal perception. To overcome this, we need to see how much individuals within the crowd are diverging from the opinion polls, but also from their internal networks of friends.
Depending on how well they estimate the prediction possibilities of their preferred choices (compared to what the polls are saying), BAFS formulates their predictive power and gives a higher weight to the better predictors (e.g. if the polls are predicting a 52%-48% outcome, a person estimating that one choice will get, say, 80% is given an insignificant weight). Group predictions can be completely wrong of course, as closed groups tend to suffer from confirmation bias. On the aggregate however, there is a way to get the most out of people’s individual opinions, no matter how internally biased they are. The Internet makes all of them easily accessible for these kinds of experiments, even if the sampling is non-random.
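The weighting idea in the paragraph above can be sketched in a few lines. This is only an illustration, not Oraclum's actual formula: we assume a simple exponential decay of a respondent's weight in the gap between their estimate and the polling average, and the `scale` parameter is hypothetical.

```python
import math

def predictor_weight(estimate, poll_average, scale=5.0):
    """Illustrative weight: decays with the gap between a respondent's
    win estimate and the current polling average."""
    return math.exp(-abs(estimate - poll_average) / scale)

poll_average = 52.0                              # polls: one side at 52%
w_close = predictor_weight(53.0, poll_average)   # close to the polls
w_far = predictor_weight(80.0, poll_average)     # far off, as in the 80% example
print(round(w_close, 3), round(w_far, 3))        # the outlier gets a near-zero weight
```

Any decreasing function of the gap would serve the same purpose; the point is only that respondents whose estimates sit close to the polling consensus count for more.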
See Murr, A.E. (2016) “The wisdom of crowds: What do citizens forecast for the 2015 British General Election?” Electoral Studies 41: 283–288.
Oraclum Intelligence Systems provides custom designed 3D visualizations of election results suitable for showing during live TV broadcast. Images and video below are snapshots from an example of real-time interactive presentation of election and statistical results on a 3D map of Croatia. The visualization is created in Ventuz – a powerful content creation and playout control tool that is commonly used for high-end interactive presentations and broadcast graphics. More examples of a similar 3D graphical design that we can provide are shown at the Pixelibris website.
Oraclum Intelligence Systems Ltd is a non-partisan start-up interested in the experimental testing of forecasting models on real-life electoral data. We aim to use a Facebook survey of UK voters, along with our unique set of Bayesian forecasting methods, to try to pick out the best and most precise prediction method ahead of the upcoming Brexit referendum. We wish to uncover a successful prediction method using the power of social networks. After the Brexit referendum, we will apply the same methods to the forthcoming US Presidential elections in November 2016.
Opinion pollsters in the UK came under fierce attack from the public and the media following their joint failure to accurately predict the results of the 2015 UK general election. Months and weeks before the May election the polls were predicting a hung parliament and a virtual tie between the Conservatives and Labour, where the outcome would have been another coalition government or even a minority government (a number of combinations were discussed, even a grand coalition between Labour and the Conservatives).
The results showed that the pollsters, on average, missed the difference between the two parties by 6.8%, which translated into about 100 seats. What was supposed to be one of the closest elections in British history turned out to be a landslide victory for the Conservatives.
In fact we wish to vindicate some pollsters by offering, for the first time in the UK, an unbiased ranking of UK pollsters.
And here it is:
Our rankings are based on a somewhat technical but still easy to understand methodological approach summarized in detail in the text below. It has its drawbacks, which is why we welcome all comments, suggestions and criticism. We will periodically update our ranking, our method, and hopefully include even more data (local and national), all with the goal of producing a standardized, unbiased overview into the performance of opinion pollsters in the UK.
We hope that our rankings stir a positive discussion on the quality of opinion pollsters in the UK, and we welcome and encourage the use of our rankings data by other scientists, journalists, forecasters, and forecasting enthusiasts.
Note also that in the ranking list we omit the British Election Study (BES), which uses a far better methodology than other pollsters – a face-to-face random sample survey (the gist of it is that they randomly select eligible voters to get a representative sample of the UK population, and then they repeatedly contact those people to do the survey; you can read more about it here). This enabled them to produce one of the most precise predictions of the 2015 general election (they gave the Conservatives an 8% margin of victory). However, there is a problem – the survey was (and usually is) done after the election, meaning that it cannot be used as a prediction tool. Because of this, instead of grouping it with the others, we use the BES only as a post-election benchmark.
Methodology and motivation
Our main forecasting method to be applied during the course of the Brexit referendum campaign, the Bayesian Adjusted Facebook Survey (BAFS), will additionally be tested against a series of benchmarks. The most precise benchmark that we attempt to use is the Adjusted polling average (APA) method. In fact, the main motivation for our own ranking of pollsters in the UK is to complement this particular method. As emphasized in the previous post, our APA benchmark adjusts all current Brexit referendum polls not only with respect to timing and sample size, but also with respect to each pollster's relative performance and past accuracy. We formulate a joint weight out of timing (the more recent the poll, the greater the weight), sample size (the greater the sample size, the greater the weight), whether the poll was done online or via telephone, and the ranking weight of each pollster, allowing us to calculate the final weighted average across all polls in a given time frame (in this case, since the beginning of 2016).
The weighted average calculation gives us the percentages for Remain (currently around 43%), Leave (currently around 41%), and undecided (around 15%). To get the final numbers which we report in our APA benchmark, we factor in the undecided votes as well.
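As a rough sketch of how such a weighted average can be computed, following the sum-of-weights form described in the footnote below (all numbers and the timing constant are illustrative; the online/telephone adjustment is omitted for brevity):

```python
import math

# Each poll: (remain %, days old, sample size, pollster ranking weight).
# A poll's weight is the sum of a timing weight (exponential decay from
# 4 toward 0), a sample-size weight (N/1000), and the ranking weight.
polls = [
    (44.0, 2, 2000, 0.8),
    (42.0, 10, 1000, 1.0),
]

def adjusted_polling_average(polls, tau=20.0):
    num = den = 0.0
    for remain, days_old, n, rank in polls:
        w = 4.0 * math.exp(-days_old / tau) + n / 1000.0 + rank
        num += remain * w
        den += w
    return num / den   # weighted average of the Remain shares

print(round(adjusted_polling_average(polls), 2))
```

The same calculation is then repeated for Leave and for the undecided, giving the three percentages reported in the APA benchmark.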
How do we produce our rankings?
The rankings are based on the past performance of pollsters in three earlier votes: the 2015 general election, the 2014 Scottish independence referendum, and the 2010 general election. In total we observed 480 polls from 15 pollsters (not all of which participated in all three). We realize the sample could have been bigger had we included local and earlier general elections. However, many pollsters from 10 years ago no longer produce polls (while the majority of those operating in 2015 still produce them now for the Brexit referendum), and local elections are quite specific, so we focus only on these three national votes. We admit that the sample should be bigger and will consider including local polling outcomes, adjusted for their type. There is also the issue of the methodological standards of each pollster, which we do not take into account, as we are only interested in the relative performance each pollster had in the previous elections.
Given that almost all the pollsters failed to predict the outcome of the 2015 general election, we look at the performance between pollsters as well, in order to avoid penalizing them too much for this failure. If no one saw it coming, they are all equally excused, to a certain extent. If however a few did predict correctly, the penalization against all others is more significant. We therefore jointly adjust the within accuracy (the accuracy of an individual pollster with respect to the final outcome) and the between accuracy (the accuracy of an individual pollster with respect to the accuracy of the group).
To calculate the precision of pollsters in earlier elections we again assign weights for timing and sample size, in the same way as described earlier (older polls matter less, larger samples matter more). Both of these factors are then summed up into the total weight for a given poll across all pollsters. We then take each individual pollster and calculate its weighted average (as before, this is the sum of the products of all its polls and their sample and timing weights, divided by the sum of all weights – see the footnote below). By doing so we can calculate the average error each pollster made in a given election. This is done for all three elections in our sample, allowing us to calculate the within accuracy for each election. We calculate the average error for an individual pollster as the simple deviation between the weighted average polling result and the actual result for the margin between the first two parties in the election (e.g. Conservatives and Labour). Or in plain English, how well they guessed the difference between the winner and the runner-up.
After determining our within index, we estimate the accuracy between pollsters (by how much they beat each other) and sum them both into a single accuracy index. To do this we first calculate the average error across all pollsters during a single election. We then subtract each individual error from this joint error. This represents our between index: the greater the value, the better the pollster did against all others (note: the value can be negative).
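Put together, the within and between indices described above can be sketched with a toy example (pollsters A, B, C are invented; the margins echo 2015, where polls saw a near tie but the actual Conservative margin was around 6.6%):

```python
def accuracy_indices(predicted_margins, actual_margin):
    """predicted_margins: pollster -> predicted winner-vs-runner-up margin."""
    errors = {p: abs(x - actual_margin) for p, x in predicted_margins.items()}
    within = {p: 10.0 - z for p, z in errors.items()}          # I_w = 10 - z_i
    mean_error = sum(errors.values()) / len(errors)
    between = {p: mean_error - z for p, z in errors.items()}   # positive = beat the field
    return within, between

margins = {"A": 0.0, "B": 1.0, "C": 6.0}   # C almost nailed the margin
within, between = accuracy_indices(margins, actual_margin=6.6)
print(within["C"], between["C"])           # C scores highest on both indices
```

Because the between index is measured against the group average, A and B are only mildly penalized for a miss everyone shared, while C gets credit for beating the field.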
Joint within-between ranking
To get our joint within-between index we simply sum up the two, thereby lowering the penalization across all pollsters if and when all of them missed. In this case those who missed less than others get a higher value improving their overall performance and ranking them higher on the scale.
We repeat the same procedure across all three elections and produce two final measures of accuracy. The first is the final weighting index (which we use for the ranking itself and whose values we use as an input into the Brexit polls), and the second is the precision index. The difference between the two is that the precision index does not factor in the number of elections, whereas the final index does. The precision index is thus the simple average of the within-between indices, while the final index is the sum of all three divided by the total number of elections we observed, regardless of how many of them the pollster participated in. The two are the same if a pollster participated in all three elections, but they differ if the pollster participated in fewer than three.
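The difference between the two indices can be made concrete with a small sketch (the scores are invented):

```python
def precision_index(scores):
    """Average of within-between indices over elections the pollster entered."""
    return sum(scores) / len(scores)

def final_index(scores, n_elections=3):
    """Same sum, but divided by all elections observed (three here)."""
    return sum(scores) / n_elections

one_off = [9.4]                # entered one election, very accurately
veteran = [7.0, 8.0, 7.5]      # entered all three elections

# The one-off pollster wins on precision but not on the final index;
# for the veteran the two indices coincide.
print(precision_index(one_off), final_index(one_off))
print(precision_index(veteran), final_index(veteran))
```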
For example, consider the fourth-ranked SurveyMonkey. They have the highest precision grade because they were the only ones in the 2015 election to predict the result almost perfectly (a 6% margin Conservative victory). However, since they only participated in a single election, they do not come out on top in the final weighting index. Pollsters that operated across all three elections give us the possibility of measuring their consistency, a luxury we do not have for single-election players.
In other words, perhaps SurveyMonkey was just lucky, particularly since they only conducted a single survey prior to that election. However, given that the survey was done in the week prior to election day (from April 30th to May 6th; election day was May 7th) and given that it had over 18,000 respondents, our guess is that it was not all down to luck. Either way, given that their entry to the race was late and a one-off shot (similar to our BAFS effort, actually), if or when they produce their estimate for Brexit one day prior to the referendum, we will surely factor them in and give them a high weight. Not as high as their precision index suggests, but high enough. The same goes for several other pollsters that were operational over the course of a single election, meaning that they get a lower weight overall, regardless of their single-election accuracy.
To conclude, the numbers reported under the final weighting index column represent the ranking weight that we talked about in the beginning of this text. Combined with the timing and sample size weights, it helps us calculate the final weighted average of all polls thereby helping us configure our strongest benchmark, the adjusted polling average.
Footnotes:
1. The rankings that we report here will not be a part of our BAFS method.
2. Calculated as ∑xiwi / ∑wi, where xi is an individual poll and wi the corresponding weight. wi is calculated as the sum of three weights: one for timing (using an adjusted exponential decay formula, decreasing from 4 to 0, where the half-life is defined by t1/2 = τ ln(2)), one for sample size (N/1000), and the ranking weight (described in the text).
3. Define xi as the difference between the total predicted vote share of party A (vA) and party B (vB) for pollster i, and y as the difference between the actual vote shares of the two parties. Assume A was the winner and B the runner-up. The within accuracy of pollster i (zi) is then defined simply as zi = |xi – y|. The closer the value of zi is to 0, the more accurate the pollster. From this we calculate the within index as Iw = (10 – zi).
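The timing weight in the footnotes can be read as follows. The time constant τ is not given in the text, so the value below is an assumption; only the functional form (decay from 4 toward 0, with half-life τ ln 2) comes from the footnote.

```python
import math

TAU = 20.0                       # assumed time constant, in days
HALF_LIFE = TAU * math.log(2)    # t_1/2 = tau * ln(2), about 13.9 days

def timing_weight(days_old):
    """Exponential decay from 4 toward 0 as a poll ages."""
    return 4.0 * math.exp(-days_old / TAU)

print(round(timing_weight(0), 1))          # 4.0 for a poll from today
print(round(timing_weight(HALF_LIFE), 1))  # 2.0 at the half-life
```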
Our prediction method rests primarily upon our Facebook survey, where we use a variety of Bayesian updating methodologies to filter out internal biases in order to offer the most precise prediction. In essence, we ask our participants to express their preference of who they will vote for (e.g. Leave or Remain in the Brexit referendum), how much they think their preferred choice will get (in percentages), and how much they think other people will estimate that Leave or Remain could get. Depending on how well they estimate the prediction possibilities of their preferred choices, we formulate their predictive power and give a higher weight to the better predictors. We call this method the Bayesian Adjusted Facebook Survey (BAFS).
In our first election prediction attempt, where we predicted the results of the 2015 general elections in Croatia, we found that our adjusted Facebook poll (AFP) beat all other methods (ours and of other pollsters) and by a significant margin. Not only did it correctly predict the rise of a complete outlier in those elections, it also gave the closest estimates of the number of seats each party got. Our standard method, combining bias-adjusted polls and socio-economic data, projected a 9 seat difference between the two main competitors (67 to 58; in reality the result was 59-56), and a rather modest result of the outlier party which was projected to be third with 8 seats – it got 19 instead. Had we used the AFP we would have given 16 seats to the third party, and a much closer relationship between the first two parties (60-57). The remarkable success of the method, particularly given that it operated in a multiparty environment with roughly 10 parties with realistic chances of entering parliament (6 of which competed for the third party status, all newly founded within a year of the elections, with no prior electoral data), encouraged us to improve it further which is why we tweaked it into the BAFS.
In addition to our central prediction method we will use a series of benchmarks for comparison with our BAFS method. We hypothesize (quite modestly) that the BAFS will beat them all.
We will use the following benchmarks:
Adjusted polling average – we examine all the relevant pollster companies in the UK given their past performance in predicting the outcomes of the past two general elections (2015 and 2010), and one recent referendum – the Scottish independence referendum in 2014. We could go further back in time and take into consideration the polls for local elections as well. However, we believe the more recent elections adequately encapsulate the shift in trend regarding polling methods, in addition to their contemporary downsides. As far as local elections are concerned, we fear that they tend to be too specific, and that predicting local outcomes and national outcomes are two different things. More precision at the local level need not translate into more precision at the national level. Given that the vote of our concern is national (the EU referendum), it makes sense to focus only on the past performance of national-level polls. We are, however, open to discussion regarding this assumption.
In total we covered 516 polls from more than 20 different pollsters in the UK across the selected elections. Each pollster has been ranked according to its precision. The precision ranking is determined on a time scale (predictions closer to the election carry a greater weight) and a simple Brier score is calculated to determine the forecasting accuracy of each pollster. Based on this ranking, weights are assigned to all the pollsters. To calculate the final adjusted polling average we take all available national polls; adjust them according to timing, sample size, whether the poll was conducted online or via telephone, and the pollster's pre-determined ranking weight; and take the weighted average across all those polls. We also calculate the probability distribution for our final adjusted polling average.
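For a two-outcome vote, the Brier score of a single forecast is straightforward to compute. A minimal sketch (how a poll's vote shares are converted into an implied probability can be done in several ways, and is not specified in the text):

```python
def brier_score(prob_forecast, outcome):
    """Squared error between a forecast probability and the realized
    outcome (outcome = 1 if the forecast event happened, 0 otherwise)."""
    return (prob_forecast - outcome) ** 2

# A forecast implying a 55% chance of the outcome that actually happened:
print(brier_score(0.55, 1))   # 0.2025 -- lower is better
```

Averaging these scores over a pollster's time-weighted forecasts then gives the accuracy figure used in the ranking.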
Regular polling average – this will be the same as above, except it will not be adjusted for any prior bias of the given pollster, nor for sample size. It is only adjusted for timing (the more recent polls get a greater weight). We look at all the polls done at least two months before the last poll.
What UK Thinks Poll of polls – this is a poll of polls averaging only the six most recent polls, done by a non-partisan website, What UK Thinks, run by the NatCen Social Research agency. The set of polls that goes in changes each week as pollsters share new polling data. The method is simple averaging (it shows moving averages) without weighting anything.
Forecasting polls – these are polls based on asking people to estimate how much one choice will get over another. They are different from regular polls in that they do not ask who you will vote for, but who you think the rest of the country will vote for. The information for this estimate is also gathered via the What UK Thinks website (see sample questions here, here, here, and here).
Prediction markets – we use a total of seven betting markets: PredictIt, PredictWise, Betfair, Pivit, Hypermind, IG, and iPredict. They are also distributed on a time scale where recent predictions are given a greater weight. Each market is also given a weight based on its volume of trading, so that we can calculate and compare one single prediction from all the markets (as we do with the polls). The prediction markets, unlike the regular polls, do not produce estimates of the total percentage one option will have over another. They instead offer probabilities that a given outcome will occur, so the comparison with the BAFS will be done purely on the basis of the probability distributions of an outcome.
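A volume- and recency-weighted combination of market probabilities, as described above, could look like the following (all numbers and the half-life are invented for the sketch):

```python
markets = [
    {"p": 0.72, "volume": 50_000, "days_old": 1},
    {"p": 0.68, "volume": 20_000, "days_old": 5},
]

def combined_market_probability(markets, half_life=7.0):
    """Weight each market's implied probability by trading volume,
    discounted by how old the reading is."""
    num = den = 0.0
    for m in markets:
        w = m["volume"] * 0.5 ** (m["days_old"] / half_life)  # recency decay
        num += m["p"] * w
        den += w
    return num / den

print(round(combined_market_probability(markets), 3))
```

The result sits between the individual market probabilities, pulled towards the larger and more recent market.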
Prediction models – if any. The idea is to examine the results of prediction models such as the ones produced by Nate Silver and FiveThirtyEight. However, so far FiveThirtyEight has not made any predictions on the UK Brexit referendum (presumably they are preoccupied with the US primaries and are staying away from the UK for now, given their poor result at the 2015 general election). One example of such models based purely on socio-economic data (without taking into consideration any polling data, so quite different from Silver) is the one done by UK political science professor Matt Qvortrup, who sums it all up in a simple equation: Support for EU = 54.4 + Word-Dummy*11.5 + Inflation*2. – Years in Office*1.2. Accordingly, his prediction is 53.9% for the UK to Remain. We will try to find more such efforts to compare our method with.
Superforecaster WoC – we utilize the wisdom of the superforecaster crowd. “Superforecasters” is a colloquial term for the top participants in Philip Tetlock’s Good Judgment Project (GJP) (there is even a book about them). The GJP was part of a wider forecasting tournament organized by the US government agency IARPA following the intelligence community fiasco regarding WMDs in Iraq. The government wanted to find out whether there exists a more reliable way of making predictions. The GJP crowd (all volunteers, regular people, seldom experts) significantly outperformed everyone else several years in a row. Hence the title – superforecasters (there are a number of other interesting facts about them – read more here, or buy the book). However, superforecasters are only a subset of the more than 5,000 forecasters who participate in the GJP. Given that we cannot really calculate and average out the performance of the top predictors within that crowd, we take the collective consensus forecast. Finally, similar to the betting markets, the GJP does not ask its participants to predict the actual voting percentage; it only asks them to gauge the probability of an event occurring. We therefore only compare the probability numbers in this benchmark.
Finally, we will calculate the mean of all the given benchmarks. That will be the final robustness test of the BAFS method.
So far, one month before the referendum, here is the rundown of the benchmark methods (these will be updated over time):
The following table expresses it in terms of probabilities:
Notes:
1. This is not the same method as we use now, even though it was quite similar.
2. See his paper(s) for further clarification.
* For the adjusted polling average, the regular polling average, and the forecasting polls, we have factored in the undecided voters as well.
A brief introduction to the political landscape so far. Back in January 2013, in response to mounting pressure from his own party and the surge in popularity of the eurosceptic UK Independence Party (UKIP), British Prime Minister David Cameron promised his voters that if they re-elected his Conservative government he would give the citizens a chance to vote in an in-out EU referendum for the first time since 1975: “It is time for the British people to have their say. It is time to settle this European question in British politics.” The date was set to be in 2017 at the latest.
The campaign strategy worked. The Conservatives were swept in by a landslide electoral victory, much to the complete surprise of almost all UK pundits and almost every pollster. While everyone was predicting a very close election and a virtual tie between Labour and the Conservatives, the Conservatives picked up 100 seats more than Labour, which was enough for them to form a single-party government. Reinvigorated by this success, the party quickly moved the EU Referendum Act forward. It was introduced to the House of Commons as early as May 2015 (a few weeks after the election), approved in December, with the official date (23rd June 2016) announced in February. The referendum question itself was designed to be quite clear, leaving no room for ambiguity:
Should the United Kingdom remain a member of the European Union or leave the European Union?
1. Remain a member of the European Union
2. Leave the European Union
It is hard to say whether or not the pledge of an EU in-out referendum helped the Conservative party (there were certainly other things that led them to such an impressive and unexpected result, primarily the dismal performance of Labour and Liberal Democrats across the country), but it was a gamble the PM was willing to take. He kept his promise, even allowing individual party members to form their own opinions on the Brexit, not necessarily along official party lines.
Cameron’s plan was to renegotiate Britain’s deal with the EU, primarily concerning immigration and welfare policies. In February he did just that, although many would disagree on the extent of his success, calling it a lukewarm deal at most, falling short of many of his promises. The deal is set to grant Britain a “special status” within the EU if they vote Remain. It ensures that Britain will not be a part of the path towards “an ever closer union”, that the financial sector is protected from further EU regulations, and that Britain is exempt from further bailouts of troubled eurozone nations (and is even to be reimbursed for funds used so far). Where it fell short of expectations was the migrant welfare payments and child benefits. Migrant workers will still be able to send child benefits back to their home countries, while new arrivals will gradually be able to get more benefits, the longer they stay. Some compromises were made by both sides of the bargaining table, however this hardly satisfied the eurosceptics back home.
Today, the Conservatives are divided. The party leadership, as well as the majority of government ministers, support the Remain campaign. However, roughly half of Conservative MPs, 5 government ministers, and the former Mayor of London and prominent Conservative figure Boris Johnson all support the Leave campaign. It would seem that Conservative voters accurately mirror their party’s split – YouGov reports a 44%–56% division in favour of Leave.
On the other hand, the Labour party’s new leadership under Jeremy Corbyn has expressed its official position in support of the Remain campaign, although political pundits have noted a slight reservation on Corbyn’s part towards the EU (primarily based on his previous opinions of it). However, Labour voters are much more inclined towards the EU than their current party leader: 75% of them support Remain, while only 25% support Leave. LibDem voters are even more pro-EU (79–21), while at the other end of the spectrum, UKIP voters are perfectly aligned with their party’s position (97% support Leave).
Usually, when a country’s political party leaderships announce their positions on a referendum (particularly on EU membership), the outcome is very often predictable – voters listen to their parties and vote accordingly. The same can actually be said of the current division regarding Brexit – voters do listen to their parties. The Conservative party is split (its official position is neutral) and its voters act accordingly. UKIP, the LibDems, and Labour are all relatively united on the referendum question, so their voters also vote accordingly. It is this interesting dynamic operating within the Conservative party and the electorate in general that makes this referendum a difficult one to predict. After all, the majority of the polls are predicting a very close result, within the margin of error.
The role of Oraclum I.S.
What is our stake in these elections? We are, above all, a non-partisan venture in its start-up R&D phase, interested in experimentally testing our models on real-life electoral data. We aim to use a Facebook survey of UK voters (more on that in the next text), along with our unique set of Bayesian forecasting methods, to try to pick out the best and most precise prediction method. Essentially, our motivation at this initial stage is purely scientific. We wish to uncover a successful prediction method using the power of social networks. After the Brexit referendum, we will apply the same methods to the forthcoming US Presidential elections in November 2016.
In our Facebook survey we will not use any user data from Facebook directly or indirectly, only the data the users provide us in the survey itself. We will have no knowledge of voter preferences of any individual user, nor will we be able to find that out.
The Facebook survey will be kick-started 10 days prior to the referendum, on 13th June, and will run up until the very last day when we will provide our final forecast. Our forecasts will show both the total predicted percentages and the probability distributions for both options. They will also show the distribution of preferences for the friends of each user (so that the user could see how his or her social network is behaving and who they, as a group, are voting for), as well as the aggregate predictions the survey respondents will be giving.
Furthermore we will present our predictions in a map format, based on UK regions, where we will show the actual polling numbers, and our Bayesian adjusted version.
We welcome all suggestions, comments, and criticism.
In the next blog post, we will introduce you briefly to our method and the number of benchmarking methods we attempt to use for comparison.
The most surprising result of the 2015 general election in Croatia was the success of MOST, until recently a small local party, which appeared to have broken the duopoly of the two establishment centre-left and centre-right parties, the SDP and HDZ. MOST was founded in a small coastal Croatian city to compete in its local elections in 2013. After winning those elections it spread out nationally, attracting other local, independent, uncorrupted mayors, and in just 10 months from the beginning of 2015 it turned itself from a local movement into a kingmaker party, winning a total of 19 seats (out of 151) and making it impossible for either establishment party to form a government without it.
By Dejan Vinković: Cartograms are maps with deformed geometries that substitute land area with some other thematic mapping variable, such as election results. Conveying information in this way often produces more informative visualization than ordinary maps, especially when used for online interactive data access.
By Mile Šikić, Dejan Vinković, and Vuk Vuković: As a part of an election coverage project initiated by the biggest daily newspaper in Croatia, Jutarnji list, we were given the opportunity to introduce, for the first time in this part of Europe, a prediction model of general elections which simultaneously uses election polls, previous election results, and a range of socio-economic data for a given electoral district.