I suspect most people who read this blog know all this already, but I’ve met the same misunderstanding about margins of error at work recently and also in the context of the opinion polls around the POTUS election. So here is a simplified explanation.
Imagine I have a great big jar of jelly beans, which are the favoured confectionery of probability explanations. There are exactly 500 red jelly beans and 500 blue jelly beans and nothing else: no Jill Stein jelly beans or exotic Evan McMulberry flavours. A jelly bean pollster doesn’t know this, though. The pollster wants to estimate the proportion of red and blue jelly beans in the jar BUT is only allowed to look at some of the jelly beans.
The pollster grabs a handful of jelly beans from the jar and looks at the relative proportion of red and blue. Naturally, I don’t want the pollster to do this very often because they’ll put their germ-ridden hands all over my beautiful jelly beans. So the pollster only has this one handful to look at. They have to make a key assumption: that the jelly beans were well mixed, so that their handful is a random pick of the jelly beans in the jar.
The pollster looks at the proportion of red to blue jelly beans. Let’s say they have 5 red and 8 blue jelly beans. The pollster says that the proportion of red to blue is 38% to 62% BUT they also report a margin of error that is quite large. They can’t be sure this figure is right because they know they may have been unlucky. With only 13 jelly beans in their handful, it isn’t wholly impossible that they could pick out nothing but blue jelly beans even if the true proportion was 50-50. Note that if they did pick out nothing but blue, that would not mean they had picked badly: it could happen purely by chance.
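To put rough numbers on that handful, here is a minimal Python sketch. It assumes the common normal-approximation formula for a 95% margin of error (with only 13 beans that approximation is crude, so treat the result as a ballpark), and it works out the chance of an all-blue handful when beans are drawn without replacement from a well-mixed jar of 500 red and 500 blue. The variable names are mine, not anything a real pollster uses.

```python
import math

red, blue = 5, 8          # the pollster's handful
n = red + blue            # 13 beans in total
p_hat = red / n           # observed share of red, about 0.38

# Assumption: 95% margin of error via the normal approximation.
# With n = 13 this is only a rough guide.
z = 1.96
margin = z * math.sqrt(p_hat * (1 - p_hat) / n)

# Chance that a random handful of 13 is all blue when the jar really
# holds 500 red and 500 blue (drawing without replacement).
p_all_blue = math.comb(500, n) / math.comb(1000, n)

print(f"red share: {p_hat:.0%}, margin of error: +/- {margin:.0%}")
print(f"chance of an all-blue handful: {p_all_blue:.6f}")
```

Under these assumptions the margin of error comes out at about 26 percentage points either side of the 38%, which is why the pollster calls it quite large, and the all-blue handful is very unlikely but not impossible.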
Margins of error address only this aspect of errors in polling: that the proportion in the sample was to some extent an ‘unlucky’ pick. The reported figure and the margin of error BOTH assume that the picking was done correctly. In our jelly bean example, that is the assumption that the beans were well mixed together.
Now it so happens that I didn’t mix the jelly beans well (although the pollster can’t tell)*. There are actually MORE red towards the top and fewer red towards the bottom of the jar. So the pollster’s assumption was wrong. A clever pollster might try to find ways to deal with this methodologically (e.g. by grabbing beans from both the top and the bottom) but the principle still applies: the reported estimate and the margin of error assume that the sampling methodology was valid. The margin of error doesn’t (and can’t) account for the probability of what in common parlance would be called an ‘error’ (i.e. a mistake).
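A small simulation can make the difference between bad luck and bad mixing concrete. This is only a sketch under made-up assumptions: I have invented a jar whose top half is 70% red and whose bottom half is 30% red (still 500 of each colour overall), and a pollster who only ever grabs from the top half. The function names and the 70/30 split are hypothetical, chosen purely for illustration.

```python
import random

rng = random.Random(0)  # fixed seed so the run is repeatable

def build_jar():
    # Hypothetical badly mixed jar: 500 red and 500 blue overall,
    # but the top half is 70% red and the bottom half is 30% red.
    top = ["red"] * 350 + ["blue"] * 150
    bottom = ["red"] * 150 + ["blue"] * 350
    rng.shuffle(top)
    rng.shuffle(bottom)
    return top + bottom  # index 0 is the top of the jar

def red_share(handful):
    return handful.count("red") / len(handful)

jar = build_jar()
trials = 10_000
handful_size = 13

# Pollster who can only reach the top half of the jar.
top_grabs = [red_share(rng.sample(jar[:500], handful_size)) for _ in range(trials)]

# Pollster whose handful really is a random pick from the whole jar.
mixed_grabs = [red_share(rng.sample(jar, handful_size)) for _ in range(trials)]

mean_top = sum(top_grabs) / trials
mean_mixed = sum(mixed_grabs) / trials

print(f"average red share, top-only handfuls: {mean_top:.2f}")
print(f"average red share, truly random handfuls: {mean_mixed:.2f}")
```

In this made-up jar the truly random handfuls scatter around the true 50%, which is the scatter a margin of error describes. The top-only handfuls scatter just as much, but around roughly 70%, and nothing in the margin of error calculation would warn the pollster about that.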