The Political Polls Got It Wrong

  • This is how it appeared
  • Polls are imperfect estimates at a single point in time
    • Polls have a known average error, and that error varies from year to year
    • They are pretty good, but sometimes error is high
    • We forget this
  • The error in the estimate of Clinton’s percentage was actually very low (1%)
    • For all polls, the difference from the actual vote percentage is within the margin of error
  • Obama outperformed polls
    • Created an assumption that Democrats should outperform polls
    • the polls were actually worse in that case (off by 3%)
  • The idea is that lots of polls reduce uncertainty (i.e. don’t pay attention to the uncertainty of individual polls)
  • Clinton had a 67% probability of winning according to 538
    • that means there was a real possibility (roughly 1 in 3) that Trump could win
  • Significant differences between models
    • Why? see error in model assumptions below
    • aggregators are taking polls and creating models without considering assumptions properly
  • The thinking was that 8 toss-up (50/50) states all had to go to Trump
    • if these states are independent the probability is 0.5^8 ≈ 0.4%, i.e. <1%
    • treating the states as independent is a bad assumption!
    • BUT error is correlated - tendency for Trump to overperform
      • thus if Trump is underestimated in Wisconsin (and wins it), he likely will be in PA, FL, etc. too
  • In blue states Clinton over- and underperformed the polls about equally
  • There is a correlation between poll error and the % of white non-college voters
    • states we talk about as battleground (e.g. WI, PA, FL, OH) have lots of white non-college voters
    • this population is increasingly leaning Republican
    • GOP has lost white college grads
    • BUT if this group was sampled correctly it wouldn’t create biased polls
    • this population also tends not to turn out to vote
      • if a poll assumes this and is wrong, it introduces error
      • pollsters use a formula to decide whether a respondent will vote and so should be counted in the poll
      • the formula is more likely to drop less educated voters from the poll
      • small errors lead to just enough underestimation to create consistent bias
  • Pollsters take the poll and weight it on the basis of the census
    • weights do not include education, but if education matters then there’s bias
      • the influence of education has changed
  • Error in aggregator models
    • greater uncertainty in 2016 due to a non-incumbent race
    • lots of late deciders influenced by negative Clinton press
  • Were people afraid to admit to voting for Trump?
    • BUT social pressure would be mostly in blue states, and those states were consistent with polls
    • BUT Republicans consistently outperformed polls
  • We will hopefully adjust for error in models, but there may be other issues in the future
    • reweight data to account for education
    • reconsider likely voter formula
  • Polls have been having issues elsewhere
    • e.g. Brexit passed
    • e.g. Colombia voted against FARC peace deal
  • Confirmation bias
    • media believed it couldn’t happen, so they picked out the evidence that fit this belief and trusted it
    • e.g. early voting suggested a Clinton win
  • If polls were wrong about elections what does that say about approval ratings?
    • BUT approval ratings cover all people, so there are no worries about the likely-voter formula
    • BUT popular vote estimate was very good
  • Other cases of correlated error
    • probability of massive subprime loan defaults in 2007
      • spreading out risk of any given default sounds great
      • HOWEVER if one fails then likely everything else will fail
      • SO we think risk is reduced but instead risk is high throughout
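
The point about 8 toss-up states (0.5^8 if independent, much higher if error is correlated) can be checked with a small simulation. This is a rough Python sketch, not the model any aggregator actually used. It assumes each state's polling error is a shared national error plus a state-specific error, both normal with mean 0 and a standard deviation of 3 points. The function name prob_sweep and all parameter values are hypothetical, chosen only for illustration.

```python
import random


def prob_sweep(n_states=8, shared_sd=0.0, state_sd=3.0, trials=100_000, seed=0):
    """Estimate the chance that one candidate wins every toss-up state.

    Assumption (illustrative only): the polled margin is 0 in every state,
    and each state's error = shared national error + state-specific error.
    The candidate wins a state when its total error is positive.
    """
    rng = random.Random(seed)
    sweeps = 0
    for _ in range(trials):
        # One draw of the shared error applies to all states in this trial.
        shared = rng.gauss(0.0, shared_sd)
        if all(shared + rng.gauss(0.0, state_sd) > 0 for _ in range(n_states)):
            sweeps += 1
    return sweeps / trials


# shared_sd=0 means the states are independent coin flips.
independent = prob_sweep(shared_sd=0.0)
# A shared error as large as the state-level error (assumed value).
correlated = prob_sweep(shared_sd=3.0)

print(f"independent errors: {independent:.4f}")
print(f"correlated errors:  {correlated:.4f}")
```

With shared_sd=0 the estimate should land near 0.5^8 (about 0.4%). With a shared error as large as the state-level error, a sweep becomes far more likely, on the order of 10% under these assumed settings. The same logic applies to the subprime example: pooling reduces risk only when the individual failures are independent.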

Factors that were underestimated

  • Clinton had lost before
  • Trump had been winning more than expected
  • People looking for something different
  • Setbacks immediately prior to voting
  • Comey investigation right before election