Archive

For November, 2012

How to Interpret Polls–Do Not Forget the Margin of Error and the Sample Size


With the U.S. presidential election on November 6, we are presented with an ever-increasing onslaught of political polls and their results. To interpret a poll’s results properly, three variables should be specified in addition to the reported proportions: the poll’s margin of error, the level of confidence, and the sample size. In this brief essay, I will review the math behind the margin of error to help you interpret polls.

Background Information
The purpose of a poll is to estimate the opinion or behavior of a population from a sample. We work with a sample because contacting the entire population is too time-consuming, often too expensive, and can be physically impossible. Several sampling methods exist; simple random sampling, systematic sampling, stratified sampling, and cluster sampling are the most widely used.

After the sample is selected from the population, a statistic computed from sample information estimates a population parameter. The statistic computed from the sample that estimates the population parameter is called a point estimate. As an example, the sample mean, x̄, is the point estimate of the population parameter, μ, the population mean. For polls, the sample proportion, ρ, is the point estimate of the population parameter, π, the population proportion.

How Close is the Point Estimate to the Population Parameter?
We now come to the essence of this essay: the confidence interval estimate (CI). A confidence interval estimate is a range of values constructed from sample data so that the population parameter is likely to occur within that range at a specified probability. The specified probability is called the level of confidence, and in most polls the level of confidence is set at .95 (i.e., the pollster is 95% confident that the true population value lies within the margin of error of the point estimate). Putting all of this together gives us the following equation:

CI = point estimate ± margin of error

Accordingly, the CI is determined from the margin of error. You’ve seen the margin of error in some poll results, e.g., “The poll has a margin of error plus-minus 3.1 percentage points for the sample.” If the poll determined that ρ = .5, then the CI would be 50% ± 3.1% = 46.9% to 53.1%, i.e., a pollster has 95% confidence that the true population proportion is between 46.9% and 53.1%.
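As a minimal sketch, the CI equation can be written as a small Python function. The function name `confidence_interval` is my own choice for illustration and is not part of any polling library.

```python
def confidence_interval(point_estimate, margin_of_error):
    """Return (lower, upper) bounds: point estimate ± margin of error."""
    return (point_estimate - margin_of_error,
            point_estimate + margin_of_error)


# The example from the text: a proportion of .5 with a 3.1-point margin.
low, high = confidence_interval(0.50, 0.031)
print(f"{low:.1%} to {high:.1%}")  # 46.9% to 53.1%
```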

I will now show how the margin of error is used to determine a poll’s CI and sample size.

How Is the CI Determined in a Poll?
To determine the CI in a poll, we will use the following formula to compute the margin of error: z * standard error. Mathematically, this formula is expressed as:

margin of error = z × √(ρ(1 − ρ) / n)

In this formula, z is the z score that corresponds to the level of confidence, ρ is the sample proportion, and n is the sample size. In polls, the 95% level of confidence gives us a z score of 1.96. Also in polls, the standard error is set to its maximum value by setting the proportion at 50% (ρ = .5).

We plug in these numbers to determine the margin of error at the 95% level of confidence:

margin of error = 1.96 × √(.5 × .5 / n) = .98 / √n


Polls that we see in the media use the 95% level of confidence in determining the margin of error. However, statisticians also determine the margin of error using the 90% and 99% levels of confidence, although the 95% level of confidence is the most common. The margin of error for the 90% confidence level is calculated using a z score of 1.65:

margin of error = 1.65 × √(.5 × .5 / n) = .825 / √n

For the 99% confidence level, the margin of error is calculated using a z score of 2.58:

margin of error = 2.58 × √(.5 × .5 / n) = 1.29 / √n
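The margins of error at the three confidence levels can be computed with a short Python sketch. It assumes a simple random sample and the maximum standard error (ρ = .5), as in the text. The helper names `z_score` and `margin_of_error` are hypothetical, and the z scores come from the standard library's `statistics.NormalDist`, so they are slightly more precise than the rounded 1.65, 1.96, and 2.58 used in this essay.

```python
from math import sqrt
from statistics import NormalDist


def z_score(confidence):
    """Two-sided z score for a confidence level such as 0.95."""
    return NormalDist().inv_cdf(1 - (1 - confidence) / 2)


def margin_of_error(n, confidence=0.95, proportion=0.5):
    """z * standard error of a proportion for a sample of size n."""
    return z_score(confidence) * sqrt(proportion * (1 - proportion) / n)


# Compare the three common confidence levels for a sample of 1,000.
for level in (0.90, 0.95, 0.99):
    print(f"{level:.0%}: z = {z_score(level):.3f}, "
          f"margin = {margin_of_error(1000, level):.2%}")
```

For a fixed sample size, a higher level of confidence gives a wider margin of error.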

How Is the Sample Size of a Poll Determined?
I noted above that the purpose of a poll is to estimate the opinion of a population from a sample. As researchers, we are interested in how well the data generalize to the population, which depends on the number of subjects sampled. If a poll has a margin of error of 3.1%, we can rearrange the formula for the margin of error at the 95% level of confidence to estimate the size of the sample:

n = (.98 / margin of error)² = (.98 / .031)² ≈ 999, or about 1,000

A recent poll from NBC News/Wall Street Journal reported the following poll results:

Obama is ahead of Romney by five points, 49 percent to 44 percent. The full poll was conducted Oct. 17-20 among 1,000 registered voters. The poll has a margin of error plus-minus 3.1 percentage points for the sample of registered voters.

According to the formula above, we can see how the margin of error was calculated from the sample size of n = 1,000 registered voters: .98 / √1,000 = .031, or 3.1 percentage points.
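The relationship between sample size and margin of error can be checked in both directions with a brief Python sketch. It assumes the 95% level of confidence (z = 1.96) and ρ = .5, and the function names are my own for illustration.

```python
from math import sqrt

Z_95 = 1.96  # z score for the 95% level of confidence


def margin_from_sample_size(n, z=Z_95, proportion=0.5):
    """Margin of error implied by a sample of size n."""
    return z * sqrt(proportion * (1 - proportion) / n)


def sample_size_from_margin(margin, z=Z_95, proportion=0.5):
    """Sample size implied by a reported margin of error (as a fraction)."""
    return z ** 2 * proportion * (1 - proportion) / margin ** 2


# NBC News/Wall Street Journal poll: n = 1,000, reported margin 3.1 points.
print(round(margin_from_sample_size(1000), 3))  # 0.031
print(round(sample_size_from_margin(0.031)))    # 999
```

Because reported margins are rounded, the recovered sample size is approximate: a margin of 3.1 points gives about 999 rather than exactly 1,000.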

Putting it All Together
A new TIME Poll has Obama holding a 49% to 44% lead over Romney in Ohio. The poll’s margin of error is plus or minus three percentage points. How do we interpret the results of this poll?

First, we estimate the sample size: n = (.98/.03)² = 1,067.

Second, we estimate the CI around each point estimate at the 95% level of confidence. Obama: 46% to 52%. Romney: 41% to 47%.

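Finally, we conclude that, according to the results of this particular poll of roughly 1,067 people in Ohio, the race is a statistical tie at the 95% level of confidence. Because the two CIs overlap (Obama’s lower bound of 46% is below Romney’s upper bound of 47%), the poll cannot rule out that the two candidates have equal support in the population.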
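The interpretation steps for the TIME poll can be sketched in Python as follows. The helper names `interval` and `intervals_overlap` are hypothetical, and the overlap check is only the rule of thumb used in this essay, not a formal test of the difference between two proportions.

```python
def interval(pct, margin):
    """CI in percentage points: point estimate ± margin of error."""
    return (pct - margin, pct + margin)


def intervals_overlap(a, b):
    """True if two (lower, upper) intervals share at least one value."""
    return a[0] <= b[1] and b[0] <= a[1]


margin = 3                     # reported margin of error, in points
obama = interval(49, margin)   # (46, 52)
romney = interval(44, margin)  # (41, 47)

print(obama, romney, intervals_overlap(obama, romney))
# (46, 52) (41, 47) True
```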

Conclusion
I wrote this essay to provide some clarity and perspective on election polls by reviewing the statistics behind polls. I emphasized that the result of a poll must be interpreted along with the poll’s margin of error so that the sample size and CI can be determined.

For more information on the science of polls, check out Nate Silver’s book The Signal and the Noise: Why So Many Predictions Fail-but Some Don’t, and Nate Silver’s blog FiveThirtyEight.
