2016 Postmortem
Want to know why the LA Times/Dornsife poll has been an outlier all election season?
He is sure he is going to vote for Donald J. Trump.
And he has been held up as proof by conservatives including outlets like Breitbart News and The New York Post that Mr. Trump is excelling among black voters. He has even played a modest role in shifting entire polling aggregates, like the Real Clear Politics average, toward Mr. Trump.
How? He's a panelist on the U.S.C. Dornsife/Los Angeles Times Daybreak poll, which has emerged as the biggest polling outlier of the presidential campaign. Despite falling behind by double digits in some national surveys, Mr. Trump has generally led in the U.S.C./LAT poll. He held the lead for a full month until Wednesday, when Hillary Clinton took a nominal lead.
Our Trump-supporting friend in Illinois is a surprisingly big part of the reason. In some polls, he's weighted as much as 30 times more than the average respondent, and as much as 300 times more than the least-weighted respondent.
http://www.nytimes.com/2016/10/13/upshot/how-one-19-year-old-illinois-man-is-distorting-national-polling-averages.html
Johnny2X2X
(24,207 posts)They are weighting their respondents? What the heck kind of poll is this?
Ace Rothstein
(3,373 posts)Demographics such as gender, race and age.
Happyhippychick
(8,422 posts)"There's my African American"!
UCmeNdc
(9,655 posts)Why design a poll that cannot accurately represent the voting groups. It seems silly and stupid. Once they knew they had a panelist that falsely represented the black vote why keep polling? They had to know their numbers were not accurate.
I do not understand why they kept the whole poll running?
DanTex
(20,709 posts)All polls reweight the sample to fit demographics. The problem with the LAT poll is that, basically, they got unlucky, and one part of the sample that was heavily overweighted ended up being highly unrepresentative.
That doesn't mean the methodology was bad, it's part of sampling error and could happen with any poll.
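As a rough sketch of the reweighting described above (with hypothetical group names and shares, not the actual USC/LAT categories), post-stratification gives each respondent a weight equal to their group's share of the population divided by the group's share of the sample:

```python
# Sketch of post-stratification weighting. Group names and shares are
# made up for illustration, NOT the actual USC/LAT categories.
population_share = {"young_black_male": 0.002, "middle_aged_white_female": 0.12}
sample_counts = {"young_black_male": 1, "middle_aged_white_female": 300}
n = 3000  # total respondents in the sample

weights = {}
for group, pop_share in population_share.items():
    sample_share = sample_counts[group] / n
    weights[group] = pop_share / sample_share

# A group represented by a single person out of 3,000 gets weight
# 0.002 / (1/3000) = 6.0: that one respondent counts as six people.
print(weights)
```

The rarer a group is in the sample relative to the population, the larger the weight, which is how one unusual respondent in a tiny demographic cell can end up with outsized influence.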
The problem is made worse by the fact that this is a panel, so the same people are polled every time. Which means that the unrepresentative sample stays in there the whole time. If they did brand new polls every time with the same methodology, this sample would have been an outlier, but other results they got would be more in line with the poll averages.
The problem is, it is very hard to justify changing a methodology once you decide on it just because the numbers don't come out the way you think they should. This is a problem that comes up in some scientific and medical research: people gather data, do some analysis, it doesn't work out well, so then they do some different analysis on the same data and publish it. That's not kosher; you have to decide on what you are going to do with the data before you gather it in order for the statistical tests to be valid.
So, all in all, I think that the LAT pollsters are doing the right thing. They designed a methodology honestly, and they aren't changing the methodology after the fact, which would result in bias. They decided to run a panel, they designed the panel study soundly, and they are running it like they said. Even if they stopped running the panel because they don't think it's accurate, that would result in a form of "survivor bias" in poll averages, because it would mean that some soundly designed polls were excluded because of the results they gave.
I agree with Nate Silver's comments on the LAT poll a while ago. What poll readers should do is not just dismiss the poll, but instead adjust it for its house effect.
muriel_volestrangler
(106,212 posts)The article goes on to say that on his own, he's made Trump's black support look like it's in double digits, and:
http://www.motherjones.com/kevin-drum/2016/10/lat-poll-finally-makes-it-unanimous-donald-trump-loser
I can see defending weighting in a poll that increases a group from, say, 15% of actual respondents to 25% of the poll results, if you can't get hold of enough people in that demographic. But a weighting of 30 times for an individual is taking the piss.
DanTex
(20,709 posts)The overweighting is factored into the margin of error calculations, which means that, yes, it is defensible (even though it definitely appears strange).
That one person can apparently result in a 1% change in the outcome. Well, if you did a 100-person poll then every single person would account for 1% of the outcome. 100 people is a very small sample, so you'd have a big margin of error. In the LAT case, not many people (only one, it looks like), have that much influence, so the margin of error is smaller than it would be in a 100-person poll. But the margin of error calculations take into account the amount of reweighting.
As Nate Cohn points out in the article, if you cap the amount of overweighting, then you reduce the sampling error, but you end up introducing a bias, because your re-weighted sample will not approximate true demographics correctly.
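One standard way to see how reweighting feeds into the margin of error is the Kish effective sample size, n_eff = (Σw)² / Σw². The article doesn't say this is exactly what USC computes, but it illustrates the tradeoff; the weights below are made up:

```python
import math

# Kish effective sample size: heavily unequal weights shrink the
# effective n, which widens the margin of error. Weights are invented
# for illustration, not taken from the actual USC/LAT panel.
def effective_n(weights):
    return sum(weights) ** 2 / sum(w * w for w in weights)

def margin_of_error(weights, p=0.5):
    """95% margin of error for a proportion near p, using n_eff."""
    n_eff = effective_n(weights)
    return 1.96 * math.sqrt(p * (1 - p) / n_eff)

uniform = [1.0] * 3000            # everyone weighted equally
skewed = [1.0] * 2999 + [30.0]    # one person weighted 30x

print(effective_n(uniform))       # 3000
print(effective_n(skewed))        # noticeably less than 3000
print(margin_of_error(uniform), margin_of_error(skewed))
```

Capping the extreme weight would pull n_eff back up and shrink the margin of error, but, as the post above notes, at the cost of a sample whose demographics no longer match the population.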
muriel_volestrangler
(106,212 posts)It looks like an attempt to claim a level of detail for the results without paying for the number of respondents to justify that, to me. They don't give a margin of error exactly like most polls do; they draw a chart with an "area of uncertainty", currently 4% wide, but which has been 6% at times (when this one guy is in, that's 1% of it, I suppose). But that doesn't get reported in words, just the basic figure which is given to a ridiculous 3 significant figures.
The only times the figures have gone outside the area of uncertainty have been for leads for Trump, which, given the results of so many other polls with the more common methods, seems to show this poll has a fundamental flaw.
DanTex
(20,709 posts)The way it's typically done is to gather the sample before asking about demographics, though I don't know about the LAT poll. If that's the case, extending the sample by first asking people about their demo information would result in two different sampling techniques.
The other thing is, if you went that route, it would make the poll more expensive, because you'd end up contacting a lot more people, only to discard a lot of them because they weren't in the demographics you needed. If you did that, IMO you'd be better off just including everyone, and still having the reweighting, because that would result in a smaller margin of error.
In fact, discarding people because they don't fall into a demo that you need is itself a form of overweighting: in this case it's not that people's responses in other demographics would count for less, it's that if you fell into another demographic and were contacted, you'd have a very high chance of not even being able to respond at all. Which means that, relative to everyone who was contacted, there would still be huge disparities in effective influence. The problem to me isn't really that the overweighting is so severe, it's that one individual ended up with so much sway. If their sample were 10X bigger, then the most influential person would only have 0.1% sway, which would be fine.
I haven't actually seen the way they present the results. Whether they call it "margin of error" or "area of uncertainty" or whatever else their marketing department decided on, it's still basically the same thing: two standard deviations.
It's interesting that their margin of error (or whatever they call it) is that large. The article says the sample size is about 3,000, and without reweighting that would mean a margin of error of about 1.8%. 5% is what you'd get with a sample of size 400, so that is an indication of the serious reweighting they are doing.
I agree that 3 significant digits is absurd. On the other hand, most polls only release 1 significant figure, and I think they should release 2. A typical poll with a margin of error of 4% means standard deviation of 2%. That means that, with one significant figure, rounding error could be a quarter of a standard deviation.
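The arithmetic above can be checked with the textbook formula for a 95% margin of error on a proportion near 50%:

```python
import math

# 95% margin of error for a proportion near 50%: 1.96 * sqrt(0.25 / n).
# This reproduces the figures quoted above: ~1.8% for n = 3000 and
# ~5% for n = 400.
def moe(n):
    return 1.96 * math.sqrt(0.25 / n)

for n in (3000, 400):
    print(n, round(100 * moe(n), 1))  # 3000 -> 1.8, 400 -> 4.9
```

So a 5%-wide band from a 3,000-person sample really does imply that reweighting has cut the effective sample size to something like 400.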
muriel_volestrangler
(106,212 posts)DanTex
(20,709 posts)I guess it means that if the two lines are in that region, then they are separated by less than the traditional margin of error.
Although if they are doing it right, that would be the MoE for the difference between the numbers, which is generally close to twice as large as the MoE for one individual number. So it's possible that they aren't, as a whole, as reweighted as I thought.
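The "close to twice as large" claim follows from the variance of the difference of two multinomial shares. A quick check, assuming a two-way race near 50/50 with no undecideds (a simplification for illustration):

```python
import math

# MoE on a single candidate's share vs. MoE on the lead (p1 - p2).
# For multinomial shares, Cov(p1_hat, p2_hat) = -p1*p2/n, so
# Var(p1_hat - p2_hat) = (p1*(1-p1) + p2*(1-p2) + 2*p1*p2) / n.
def moe_single(p, n):
    return 1.96 * math.sqrt(p * (1 - p) / n)

def moe_difference(p1, p2, n):
    var = (p1 * (1 - p1) + p2 * (1 - p2) + 2 * p1 * p2) / n
    return 1.96 * math.sqrt(var)

n = 3000
print(moe_single(0.5, n))            # MoE on one share
print(moe_difference(0.5, 0.5, n))   # MoE on the lead: exactly double here
```

At exactly 50/50 the lead's margin of error is exactly twice the single-share figure; with undecideds in the mix it comes out somewhat less than double, hence "generally close to twice as large."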
Also, in my last post, I miscounted the number of significant figures (forgot about the first digit). I actually agree with them presenting 3 digits, one after the decimal, for the reasons I said.
Adrahil
(13,340 posts)The basic design might be sound, but this particular sample shows how vulnerable the design is to a bad sample. That's a serious weakness.
DanTex
(20,709 posts)Other pollsters have outliers too. The problem is LAT is conducting a panel, so now they have the same sample for the entire election season, and they got a bad one. Bad luck, but it could happen to any panel.
Adrahil
(13,340 posts)But as this article makes clear, this poll design is DOUBLY vulnerable.
First, it weights demo groups to a fine level of detail, leading to weightings far in excess of what's normally seen, meaning that if you have a bad sample, its effect is magnified. A lot.
Second, if you DO get a bad sample, the design of this poll means you're stuck with it, meaning that sample bias is perpetuated, rather than averaged out.
That's a pretty big flaw, IMO.
DanTex
(20,709 posts)If they did the calculations correctly, which I assume they did, their vulnerability to a bad sample is accurately characterized by their margin of error. Yes, by less finely dicing up the demo groups, they could have reduced their vulnerability, which would have reduced their MoE. On the other hand, they had a relatively large sample, at 3000, which made up for that.
They would have been equally vulnerable if they went with a more typical sample size of 1000, with less fine dicing and reweighting. But in the end, they got unlucky with the sample, which could have happened to any poll.
And the fact that they are stuck with it is unfortunate, but that's not due to the weighting, it's because they are performing a panel and not a bunch of separate independently sampled polls.
Adrahil
(13,340 posts)That's the panel, as you said. And it's also true that the MoE can account for sampling errors, though in this case, it seems they are probably hanging onto the edge of that MoE. It is NOT clear if the poll accounts for weird "previous voting" weighting. I mean, if you look in the article, they show what the poll looks like with usual weighting and without the weird vote reporting weighting.
I don't advocate throwing out data just because one disagrees, but I think it's clear their design choices have led to their extreme outlier status.
DanTex
(20,709 posts)I think that was an error, and a preventable one: after an election people are more likely to retrospectively claim they voted for the winner, so weighting according to previous vote will predictably result in overweighting the GOP. So, yeah, that was a bad choice by them.
But the way they split up the demo in very fine ways, with their large sample size, I think is defensible even if it resulted in extreme overweighting for some respondents.
According to the article, it seems that about half of their systematic error is due to the "previous vote" weighting, and the other half is due to demo splitting. I think the first half can be attributed to poor methodology, but the second half I think is mainly bad luck. Even with standard demo weighting you can end up with a skewed sample, and you can even end up with a sample that is more skewed under standard demo weighting than under their finer-grained demo subgroups.
qdouble
(891 posts)so if it initially had a pro-Trump bias, it will keep it, even if he's looking much worse in every other poll.
Bernardo de La Paz
(60,320 posts)That's what you have to do to make a Tracking Poll: Poll the same people.
So ignore the absolute numbers on it and watch the swings. It will tend to give a more accurate view of swings and a less accurate view of actual numbers.
qdouble
(891 posts)It should be obvious from my first post that I understand the methodology is different than other polls.
Bernardo de La Paz
(60,320 posts)Adrahil
(13,340 posts)Nate Silver has said that.... that the poll is still useful for trends, BUT.... as this article shows, the very unusual weightings and the affect of those weightings means the poll can miss trends. Virtually every other poll showed a significant move to Clinton after the first debate, but this one missed it. It's odd weighting and bad sample explain that. It doesn't mean the poll is useless, but it does significantly lower it's value.
Bernardo de La Paz
(60,320 posts)Dark n Stormy Knight
(10,484 posts)A Times/Dornsife polls.
Adrahil
(13,340 posts)It states very well why that poll is so screwy.
bluestateguy
(44,173 posts)nt
oasis
(53,693 posts)workinclasszero
(28,270 posts)The deplorable freepers have been clinging to it for life, many weeks now. LOL
Yavin4
(37,182 posts)I'm also studying data science.
RAFisher
(466 posts)It's a experimental poll. That I'm fine with. But LA TImes endorsing it makes it seem like a normal poll, which it is not.
Foggyhill
(1,060 posts)Like you had a new motor design and decided to sell it without testing for reliability
