
Wed Nov 4, 2020, 11:20 AM

Were the polls wrong?

I am the Community Editor for Data Science Central, so looking at statistics is both my passion and my job.

I've been pondering the comments that many people have made here at DU about never trusting the polls or pollsters again, because they were so off. My answer is simple - no, the polls were actually pretty accurate.

Most polling looks at distributions - the variation in how people will behave. The idea is that you can build models that test a lot of different variables, and then combine or simulate them to estimate the likelihood of a particular event occurring.
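To make "simulate the likelihood" concrete, here is a minimal sketch of the idea, not any pollster's actual model. The three states, their electoral votes, their polled leads, and the error sizes are all invented for illustration. Each trial perturbs every polled lead by a shared error plus a state-specific error, then checks whether the candidate collects enough electoral votes.

```python
import random

# Hypothetical toy inputs, NOT real polling data:
# state -> (electoral votes, polled lead in points for candidate X)
TOY_STATES = {"A": (10, 4.0), "B": (20, -1.0), "C": (15, 1.5)}


def win_probability(states, votes_needed, error_sd=4.0, shared_sd=2.0,
                    trials=10_000, seed=0):
    """Estimate how often candidate X reaches votes_needed.

    error_sd and shared_sd are assumed standard deviations (in points)
    for state-specific and nationwide polling error respectively.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    wins = 0
    for _ in range(trials):
        # One error term shared by every state in this trial
        shared = rng.gauss(0.0, shared_sd)
        total = 0
        for votes, lead in states.values():
            # Add an independent error for this state
            simulated_lead = lead + shared + rng.gauss(0.0, error_sd)
            if simulated_lead > 0:
                total += votes
        if total >= votes_needed:
            wins += 1
    return wins / trials


if __name__ == "__main__":
    # 45 electoral votes in the toy map, so 23 is a majority
    print(win_probability(TOY_STATES, votes_needed=23))
```

The result is a probability, not a prediction of the margin: a candidate can be the favorite here and still lose in a large share of the simulated trials.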

This year has been a real test for many pollsters and modelers, such as FiveThirtyEight. Between Covid-19 and the ensuing lockdown, the economic impact of that, racial protests and the like, there were a lot of factors that went into the models that were simply not knowable beforehand, because we've not had a widescale pandemic in modern times (the last one of note was the Spanish Flu in 1918-1920). Several modelers made the same point - Biden will probably win but it could be by 5 electoral votes or 200. Florida was likely a toss-up, and outside of Miami-Dade County the numbers were in line with expectations. The same goes for Ohio (note that Ohio and Florida seem to be increasingly in sync, perhaps because Florida is where Ohioans retire to).

Similarly, Georgia and Arizona (and quite possibly North Carolina) were considered as being slightly leaning towards Biden, in part because all three of them are seeing growing college-educated populations in critical tech and media hubs offsetting retirees, and all three had similar probability distributions because of that (note that this becomes clearer looking at Omaha, which went for Biden; it too is becoming known as a technical hub. If Nebraska had a winner-take-all system, this detail would have been lost, as most of the rest of the state is agricultural and rural).

The same point can be made for most of the rest of the electoral map. The map that's emerging now is remarkably close to what most pollsters were projecting: Biden taking the Pacific West (including Colorado), Texas and the Interior West going to Trump (though it was closer than expected in Texas), and so forth. If you read beyond the headline numbers, however, what also emerges (and what's not always obvious beforehand) is that the likelihood of the Democrats capturing the Senate was above 50%, but not by much. They picked up two and lost one, and there are three more outstanding due to run-offs, so it's still possible.

What data analytics gives you is probabilities, and moreover probabilities after all the dust has settled. With network television, people became conditioned (perhaps too conditioned) to believing that elections would be over by the time midnight rolled around, but in reality, elections take time. The use of mail-in ballots and a relatively long lead time, along with rules about secrecy, means that we're now moving away from that model, for the better I believe. Government is a deliberative process, because it requires that we think about the choices that we make. In the era of quick gratification, the flashy TV graphics and horse-race pacing of coverage may garner more eyeballs, but they rob us of the opportunity to choose the best candidates or propositions.

So, before you condemn pollsters and data scientists as charlatans, wait a bit and get a clearer picture of the whole results, not the results in the moment. In 2020, the results followed the polls quite well, but only if you don't interpret them with additional expectations (a second blue wave, as an example).

12 replies, 1106 views

Replies to this discussion thread

Response to kurtcagle (Original post)

Wed Nov 4, 2020, 11:26 AM

1. I don't think we will ever know until we are sure that the actual voting has not been

messed with in some way. In particular as long as we vote on electronic machines where the votes can be manipulated without being detected we will never really know whether poll results are questionable.


Response to kurtcagle (Original post)

Wed Nov 4, 2020, 11:27 AM

2. There's also a general misunderstanding of margin of error.

Many don't realize that the margin of error applies to each side of the equation. For a simple example, a race listed as 52/48 with a 3% margin of error is not outside the MOE, because the 52 could be 49 and the 48 could be 51.

We tend to look at the polls as a firm number. They're actually a range, of course.
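The arithmetic behind that 52/48 example can be sketched in a few lines. This is only an illustration: the function names are made up, and the sample-size formula is the textbook one for a simple random sample at roughly 95% confidence, which real polls (with weighting and likely-voter screens) only approximate.

```python
import math


def sampling_moe(n, p=0.5, z=1.96):
    """Textbook margin of error for one candidate's share.

    Assumes a simple random sample of size n; z=1.96 is roughly 95% confidence.
    """
    return z * math.sqrt(p * (1 - p) / n)


def share_interval(share, moe):
    """Range for a single candidate's share, in points."""
    return (share - moe, share + moe)


def lead_interval(share_a, share_b, moe):
    """Range for the lead (a minus b) when each share can move by moe.

    The extremes come from the two shares moving in opposite directions,
    so the lead can shift by up to twice the stated margin of error.
    """
    lead = share_a - share_b
    return (lead - 2 * moe, lead + 2 * moe)


if __name__ == "__main__":
    print(share_interval(52, 3))     # (49, 55)
    print(share_interval(48, 3))     # (45, 51)
    print(lead_interval(52, 48, 3))  # (-2, 10)
    # About 1,067 respondents gives roughly a 3-point margin of error
    print(round(100 * sampling_moe(1067), 1))
```

A 4-point lead with a 3-point MOE therefore spans anything from trailing by 2 to leading by 10.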


Response to TwilightZone (Reply #2)

Wed Nov 4, 2020, 11:35 AM

8. Right, but the MOE always seems to swing in Trump's favor.

Which is unsettling to me.


Response to TwilightZone (Reply #2)

Wed Nov 4, 2020, 12:08 PM

10. Yup - most swing state polls were off by 2-3% +Trump, and had MOEs of 3-5% nt


Response to kurtcagle (Original post)

Wed Nov 4, 2020, 11:27 AM

3. Thank you for your perspective.


Response to kurtcagle (Original post)

Wed Nov 4, 2020, 11:28 AM

4. Is any website reporting on total numbers of D & R votes for President?



Response to TwilightZone (Reply #5)

Wed Nov 4, 2020, 11:47 AM

9. Thank you, where's the 8- or 10-point polling average we were supposed to win by?

Russia & republicans hacking voting machines?


Response to kurtcagle (Original post)

Wed Nov 4, 2020, 11:33 AM

6. I think it's also hard for pollsters to capture the effects of voter suppression-- including the

deliberate slow-down of the mail during a pandemic, specifically to prevent the counting of likely Democratic votes-- and all the chaotic, back and forth, last-minute "legal" challenges to a fair and accessible election, and the often spurious, egregiously partisan court decisions. It's not just about who registered or likely voters intend to vote for, but whether they are ultimately able to vote and whether that vote gets counted.


Response to deurbano (Reply #6)

Wed Nov 4, 2020, 06:16 PM

11. It is hard, primarily because there are so many different ways that you can

steal an election, and often multiple strategies may be used that make it hard to determine, a priori, what is malicious doctoring. For instance, how do you model ballots being held hostage by the post office? What about a system that switches every eighth D vote to R internally on a touch screen voting machine, but only when a given sequence is entered? What about both working in concert?

I suspected that when people began requesting VBM and Absentee, the carefully prepared touch screen "glitch" was suddenly rendered obsolete. This in turn turned into panic as the White House tried to figure out how to cheat when their primary fallback couldn't be used for 45% of voters. The post office route was actually pretty ingenious if Trump hadn't inadvertently revealed it. And that's the problem with these kinds of hacks. Trump is verbally incontinent - he can't keep from monologuing.

I suspect in the next three months, we'll hear all kinds of interesting tell-all stories, most of which will pretty much indict him and his cronies in all KINDS of crime.


Response to kurtcagle (Original post)

Wed Nov 4, 2020, 11:33 AM

7. Sorry, if pollsters can't do any better than the past few elections, maybe they need to

make some changes in polling AND how they report that stuff.

I know it is difficult in this environment with so many variables, but this is pitiful performance. I suppose it's statistically possible to get it wrong several times in a row, but by golly I'd present it as nothing more than rectal extraction or best guess.

Nothing personal.


Response to kurtcagle (Original post)

Wed Nov 4, 2020, 06:40 PM

12. You're sort of conflating two separate things

It's true that a model predicting a 99% chance of winning a state does not in and of itself say much about the shape of the probability distribution for the margin of victory.

But the individual polls going into that model do give us rough confidence intervals, and a whole bunch of them very much seem to have been outside those intervals (16 points ahead in WI as of last week?)

It seems pretty inescapable there was some kind of systemic error across much of the state polling.
