Understanding "margin of error"

Editor's note: Deep in the comments on Kari's post about the SurveyUSA poll, Chris Lowe - who boasts a Ph.D. in history from Yale and is currently working on an MPH at OHSU - explains the concept of "margin of error". He's a supporter of Steve Novick.

I'm sorry, but despite TJ and Miles also being Novick supporters, the MOE point is being spun a bit out of control here.

The first point (so to speak) is that though the data aren't perfect, they are data. If we took the arguments being made here literally, there would be no point in polling. The fact that Jeff is up in this one sample means that it is more likely that he really was up than if the numbers were equal or if Steve were up at the time of the poll.

TJ and Miles are writing as if the probabilities are equally distributed within the MOE / confidence interval. But they are not. That's why a normal curve is higher in the middle than at the ends. Within the interval some results are more likely than others.

TJ points out that what pollsters call MOE represents what other statisticians call a confidence interval. Also that conventionally a standard of 95% confidence is used. Also that the points that bound 95% confidence on a normal curve are just under 2 standard deviations from the midpoint.

(A normal curve is a certain type of shape of curve produced by the distribution of probabilities in a random sample, the famous bell curve; a standard deviation is a measure of spread -- the wider the spread of probabilities, the wider and flatter the curve, and the bigger the standard deviation.)

What he does not say is that at the points marking out 1 standard deviation, i.e. half the distance from the midpoint to the 95% confidence points, the confidence level is 68%.

What this means for a poll with a MOE of 4% is that we can have confidence that 95% of a large number of samples would fall within 4% of the midpoint on either side (midpoint=the poll sample %, in this case 31% for Jeff, 27% for Steve), but also that 68% of such samples would fall within 2% of the midpoint.
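Those 68%/95% figures can be checked with a quick sketch in Python, using only the standard library (the normal curve's cumulative probabilities come from the error function):

```python
from math import erf, sqrt

# Probability that a normally distributed result lands within k standard
# deviations of the midpoint: P(|Z| < k) = erf(k / sqrt(2)).
def within(k_sd):
    return erf(k_sd / sqrt(2))

print(within(1.0))   # about 0.68 -- the 68% confidence level
print(within(2.0))   # about 0.95 -- "just under 2 SD" bounds 95%
print(within(1.96))  # about 0.95 -- the conventional 95% point
```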

So with Jeff 4 points up from Steve, if this were a random sample, it's roughly twice as likely that he was actually ahead of Steve in the population sampled, than that Steve was ahead of him, at the time of the poll.

The boundaries of the 68% confidence interval would be a tie, on one hand, and Jeff up 4 points, on the other.

(This is a bit crude about confidence intervals & wouldn't do as an answer on a test in a statistics class, but is o.k. for present purposes).
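The back-of-envelope arithmetic above can be sketched as below. The numbers are the poll's (Merkley 31, Novick 27, MoE 4); treating the MoE as roughly one standard deviation on the lead is the comment's own crude shorthand, not a rigorous derivation:

```python
merkley, novick, moe = 31, 27, 4

lead = merkley - novick  # Jeff up 4 points in the sample

# Crude 68% band on the lead: one "MoE-as-SD" either side of the estimate.
lo, hi = lead - moe, lead + moe
print((lo, hi))  # (0, 8): a tie on one end, Jeff up 8 on the other
```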

The actual data points in a sample do matter, and it does matter with a 4 pt MOE whether the actual data points are the same or 4 pts apart.

It is still a matter of probabilities, not certainties, and there is a substantial chance that Steve actually was ahead in the voting population at the time of the poll, if the vote had been taken then. But given a choice of betting whether the actual result would fall within the 68% confidence interval or in the 32% range outside it, I'd bet on inside.

Now this gets mucked up in uncertain ways by non-randomness of sample, as Miles points out, particularly if there's a systematic bias that's related to voting preference distributions. But we don't know if such a problem exists or whether it skews the sample toward Steve, or Jeff (or Candy).

  • Randle McMurphy (unverified)

    This post reaffirms my opinion that Chris is the best regular commentator on Blue Oregon.

  • Randy (unverified)

    "if this was a random sample" is the fulcrum on which everything balances. That is the problem with all of the polls. The math is easy; getting a truly random, or representative, sample is the tricky part.

  •

    One correction: "The boundaries of the 68% confidence interval would be a tie, on one hand, and Jeff up 4 points, on the other," should be "Jeff up 8 points, on the other."

    Also, I failed to acknowledge that TJ had said "always better to be up instead of down," which is as good an 8 word summary of my point as could be wished.

    Thanks for the kind words Randle. Randy, I agree.

  • LT (unverified)

    Great post! Reminds me of that statistics class I took several years ago!

    Once I worked with a campaign manager who had an even simpler approach to "likely voter" polls. If she couldn't read the first 3 questions, she didn't believe the poll. Apparently if the likely voter question is not asked early, it skews the result. Wise statement, esp. since the Oregonian published 4 polls on their front page the day before a polling place primary and all POLLS were wrong!

    Prior to that, in 1988, an Oregon pollster gave Kopetski a chance to beat Denny Smith, but the nationally famous DCCC pollster said he didn't have a chance, so the DCCC didn't help him. Then they sent an apology letter after the race ended in a recount (707 votes -- Denny spent that last term talking about his "Boeing" victory margin).

    This is why I don't believe polls are Gospel! Esp. with the high undecided number!

  •

    I second Randle's comment. Totally.

  •

    It's always a good day when someone rationally explains the science of survey research. As long as you understand what the limitations are, they are valuable tools for point-in-time and trend analyses. Thanks Chris!

    One thing I wanted to carry over from that thread was a comment by Paul g that indicated I was wrong to say you couldn't determine a trend from two polls. At least that's what it sounds like he's saying.

    My point was that you cannot derive a trend when there is no measurable movement. From 28 to 31 for Merkley is a within-MoE difference. Same for Novick's 27 and 30 in the last two polls---within the MoE.

    Since neither candidate has shown any significant movement between the two polls, it's not accurate to suggest a surge or a trend towards Merkley. From the April poll to the May poll--now that DID show movement, which I acknowledged in my piece on the second SUSA survey. Both candidates saw significant gains, with Merkley coming from further behind and showing a larger improvement.
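    The "no measurable movement" criterion used here amounts to a very simple check; a sketch (a proper significance test for the difference between two independent polls is more involved than this):

```python
# Treat a change smaller than the poll's margin of error as "no trend".
def outside_moe(old_pct, new_pct, moe):
    return abs(new_pct - old_pct) > moe

print(outside_moe(28, 31, 4))  # False: Merkley's 3-point change is within MoE
print(outside_moe(27, 30, 4))  # False: so is Novick's
```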

    One other thing on undecideds--something you often can't see in the results is how hard "leaners" were pushed, if at all. I don't recall that SUSA does a lot of pushing, but clearly Hibbitts didn't do any at all. If I were running the state survey center this state so desperately needs, I would ask political questions in "definitely will vote for/leaning for" format, in order to bring in leaners and show the depth of support.

  •

    If I were running the state survey center this state so desperately needs, I would ask political questions in "definitely will vote for/leaning for" format, in order to bring in leaners and show the depth of support.

    I agree with Mark on this. I know that Pew does, or at least used to, set up their polls to not only give answer choices but also to give a confidence level in that answer. I've seen other web-based polls do the same thing with a 1 - 5 strength of opinion scale, so that for each question the respondent could also indicate how strongly they feel about the answer they chose. If I remember correctly, that's the usual format for the "presidential selector" type quizzes that pop up every four years.

    As a sometimes poll taker, I will say that the single thing that most frustrates me about most polls, major or minor, is that the available choices often don't allow me to accurately express my true opinion.

  • joel dan walls (unverified)

    Thanks for posting this. Even those of us who have studied statistics need reminders and refreshers. My biggest concern, however, is not the sometimes-willful misinterpretation and spinning of "margin of error", but rather the veracity of the data. We're not dealing with a laboratory experiment here, after all, or a coin flip, or something of the sort, but all the statistical discussion here treats opinion polling as if it were like a laboratory experiment. (Maybe that's what Mr. Lowe is alluding to by "non-randomness" of the polling sample.) But how in the world do you ever assess non-randomness? A voter may change his mind; a flipped coin can't.

    A suggestion: some humility on the part of pollsters, and some caution on the part of those who interpret polls, would go a long way.

  •

    Thanks for posting the original information, Chris, and thanks to BlueO for calling it out as a notable comment.

    (Chris, when were you at Yale? I suspect I'm somewhat older than you -- I was there 1973-78 with a gap in the middle.)

  •

    "But how in the world do you ever assess non-randomness? A voter may change his mind; a flipped coin can't."

    Well, randomness simply suggests that everyone in the polling universe (Democrats in Oregon, in this case) has an equal shot at being sampled. We know at a minimum there is some bias towards landlines, but it's not clear that non-landline people are substantially different from landline folks. My mentor in VA is now Asst. Director at Pew Research, and he's done some excellent work on non-phone bias--people who've not had phone service at all in the last year--and found they ARE different from phone household responders.

    When most homes had landlines (and in most places penetration was 97-98%) it was a lot easier.

  •

    Great post Chris, thanks for clearing that up. I was not aware of this:

    MOE represents what other statisticians call confidence interval

    Randy, the confidence interval is intended to address that. Of course no technical measure can make up for human error, and lots of factors get overlooked in sampling. But the relation of a sample to the total population is not completely taken for granted in statistical analysis.

    Chris, can you comment more on the conflation of MOE and CI in political polling? Why is it like that?

  •

    And to pile on the praise, let me point out that BlueOregon is either seriously advantaged or marginalized by its commenters. This is a community of dialogue, and I think it's safe to say we have the best commenters in Oregon.

    It takes time to compose well-argued, data-rich comments, and sometimes, you wonder if anyone's reading them. Props to Chris and ... we're reading!

  • JTT (unverified)

    Chris, this has to be the most intelligent comment I have ever seen on BO. It took me an entire term of stats in college to really understand what confidence intervals were and how to calculate them. I'm glad you've laid out in a thousand words (gratis) what I paid a thousand dollars to understand. No seriously, I wish there were more intelligent, reasoned, educated posts like yours on BO. Kudos.

    "A voter may change his mind; a flipped coin can't."

    A poll is a snapshot of opinion in time. Pollsters are not forecasters, exactly for the reason that voters change their minds and move from decided for Candidate A to Candidate B, from undecided to decided, and vice versa. An election is, in statistical terms, a census (a counting of all persons in a population), whereas a poll is a sampling (random and non-biased) that is used to extrapolate results to the entire population at a given point in time.

    However, I am really disappointed that neither Hibbitts nor SurveyUSA is asking the lean question. Those undecided numbers are really large, and it would be REALLY interesting and helpful to know how many are leaning (typically combined leaners are 5-10%), which leaves approx 35% of people STILL completely undecided a week out from election day in most of those statewide races. I wonder what size of down-ballot falloff we're going to see from people turning out to vote in the presidential race and being overwhelmed by downballot contests. I bet it's larger than normal.

  •

    "MOE represents what other statisticians call confidence interval"

    This should be clarified. MoE represents HALF of the confidence interval, since the margin extends both directions from the midpoint. The MoE for the last SUSA was 4, which means the confidence interval--the range of plausible results--runs four points to the left and another four to the right.

    So as shorthand, the CI is actually ~2x the MoE.

  • Pat Malach (unverified)

    "This post reaffirms my opinion that Chris is the best regular commentator on Blue Oregon."

    Posts like this tend to back that up:

    The answer all of us may need to worry about is that the results of both candidates vs. Smith more reflect his relatively high negatives than anything either of our guys have done. The question isn't how to get to low 40s against Smith -- Bill Bradbury can tell us that. The question is how to get the next 10%.

    But my hands down favorite Oregon political blogosphere commenter, for pure entertainment value, is far and away THIS GUY.

  • Terry (unverified)

    "This post reaffirms my opinion that Chris is the best regular commentator on Blue Oregon."

    Maybe. But he's so damned long-winded. A typical Chris Lowe comment could be cut in half without any loss of substance whatsoever.

  •

    You know, an awful lot of this discussion about who's ahead of whom and who's tied with whom could have been avoided if Kari and others had just read the note header on the SUSA poll data Kari himself posted:

    In U.S. Senate Primary in Oregon, Merkley Continues to Build Support, Remains Tied With Novick: Eight days until votes are counted in the Democratic Primary for US Senate in Oregon, state House Speaker Jeff Merkley and attorney Steve Novick remain effectively tied, though today Merkley has the nominal advantage, 31% to 27%. ... Both the 05/01/08 results and the 05/12/08 results are within the survey's 4.0 percentage point margin of sampling error. Both sets of results should be characterized as effectively tied.

    It then goes on to mention that Merkley's got the mo.

    Why not just read the pollster's own analysis first?

    And Chris is a good (though lengthy) commenter, particularly when he agrees with me.

  • Miles (unverified)

    I'm hesitant to stick my neck out again, but here goes.

    Pete writes: Randy, the confidence interval is intended to address [the non-randomness of a sample]. . . . The relation of a sample to the total population is not completely taken for granted in statistical analysis.

    I don't think this is quite right. The polling results, including the margin-of-error, are built on the assumption that the sample was in fact random. The MOE suggests that even if your sample is perfectly random, there is still some natural variability when selecting 600 people to call out of over 1 million Democratic voters. You may, randomly, call more Novick supporters in one poll or more Merkley supporters in another. The MOE suggests a range within which that natural variability should occur, and the more people you call the lower your MOE.

    If you don't have a random sample to start with, the poll results become much less useful and even the MOE isn't valid. That's what Chris is getting at when he notes the possibility of "systematic bias that's related to voting preference distributions." For instance, if Merkley is leading among women and women are disproportionately called (and the results aren't weighted to take this into account), the poll is going to show Merkley ahead when in fact he's not.

  •

    Thanks again for the kind words. Terry's right, pretty much. Writing shorter takes longer, though. At least for me.

    I'd like to second Jeff about seeking a community of dialogue as a goal for us.

    Pete,

    I hope the following will get at what you're asking.

    The short version, IMO, is that "margin of error" functions as a kind of shorthand in reporting results that incorporates an idea of level of confidence, without reporting it, usually. A fuller expression would be "margin of error at a given level of confidence."

    The MoE shorthand is one that's "good enough" for journalistic purposes. Editors want people to understand that the poll numbers don't claim to represent reality exactly, but don't want to overburden them with detail.

    To that, then add torridjoe's correction above. He's right and my statement of equivalence isn't.

    <hr/>

    "Margin of error" is calculated using formulae derived from reasoning about confidence levels. So the issue of confidence is intellectually prior to the formulae for resolving it.

    (This gets very complex to explain as applied to sample surveys, because it rests on a whole lot of reasoning first about distributions of phenomena, and the characteristics of certain kinds of distributions, and then about distributions of samples as a particular kind of phenomenon.)

    But for practical purposes confidence intervals are calculated based on margin of error formulae, as TJ says.

    A way to apply what TJ says is:

    CI = estimate ± margin of error.

    (Confidence interval = estimate, e.g. Merkley 31%, plus or minus margin of error. At 4% MoE, CI=27% to 35%.)

    Strictly speaking, a confidence interval is always an actually named range referring to a specific estimate (a poll result, in polling). So the CI for Jeff's 31% is 27%-35%, and the CI for Steve's 27% is 23%-31%. Same spread, but different specific CIs.
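    Applied to both candidates' numbers, the CI = estimate ± MoE relation looks like this (a sketch with the poll's figures):

```python
def confidence_interval(estimate, moe):
    # CI = estimate ± margin of error, so the full interval spans 2 * MoE.
    return estimate - moe, estimate + moe

print(confidence_interval(31, 4))  # (27, 35) -- Jeff's CI
print(confidence_interval(27, 4))  # (23, 31) -- Steve's CI
```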

    <hr/>

    So, while CIs are result-specific, the formula for a margin of error is generalizable across a given sample (poll).

    That generalizability of MoE to any given result (e.g. support for Merkley, Novick, Neville, Loera) makes it more correct for overall description of a poll, which is one reason journalistic reporting uses it.

    <hr/>

    However, journalistic "margin of error" reporting typically does not discuss confidence level. That's why it can be helpful to think about MoEs in relation to confidence intervals.

    One thing we can get out of that thinking is the point I was making in the comment reposted here: that probabilities and thus confidence aren't evenly distributed within margins of error.

    Also, the shorthand may erroneously convey an impression that "the" margin of error is absolute for any given poll result (=sample estimate). Actually the margin of error gets smaller if you accept less confidence, and bigger if you want more.
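    A sketch of that trade-off, using approximate two-sided z-multipliers and the same hypothetical 600-person sample as above:

```python
from math import sqrt

# Approximate z-multipliers for common two-sided confidence levels.
Z = {0.68: 1.0, 0.90: 1.645, 0.95: 1.96, 0.99: 2.576}

def moe(n, conf, p=0.5):
    return Z[conf] * sqrt(p * (1 - p) / n)

for conf in (0.68, 0.90, 0.95, 0.99):
    # Lower confidence -> smaller margin; higher confidence -> bigger.
    print(conf, round(100 * moe(600, conf), 1))
```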

    <hr/>

    The shorthand is o.k. for practical purposes, provided everyone sticks to the convention of a 95% confidence level for journalistic polling, which they do, pretty much. So the shorthand works and margins of error usually are comparable among polls.

    <hr/>

    In addition to TJ's correction, I'd also commend, for those interested, a response to my original comment by Paul Gronke, which adds a bit of precision in statistical terminology in a gentle way.

  • edison (unverified)

    ARRGGHHH! Standard deviation! I haven't thought about σ for a long time. Excellent and, more importantly, cogent stuff, Chris. And please allow me to pile on with praise for your comments here on BO (although I don't agree with the overly lengthy comments). Here's to randomness!

  •

    Stephanie,

    Sounds like you're a little older, but not much -- my undergrad years were '76-'82 in Maine & Oregon, which were remarkably similar at the time, with a gap too. Yale was grad school, '83-'91 with 16 months research in Swaziland, South Africa & a few weeks in England.
