Confidence Intervals for Proportions

Leslie Major; Amy Goldlist

Learning Objectives

In this section, we will construct confidence intervals to estimate true population proportions as well as determine required sample sizes to reduce the margin of error below a certain limit.

Constructing Proportion Confidence Intervals

When we want to know what percentage or fraction of a population has some characteristic, we estimate the true population proportion ( $p$ ). We will need the following calculations:

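Sample proportion: $\bar{p} = \frac{x}{n}$ , where $x$ is the number of sample members with the characteristic and $n$ is the sample size
Standard error of the sample proportion: $\sigma_{\bar{p}} = \sqrt{\frac{\bar{p} (1 - \bar{p})}{n}}$
Margin of error: $E = z \cdot \sqrt{\frac{\bar{p} (1 - \bar{p})}{n}} = z \cdot \sigma_{\bar{p}}$
$z$ -score: $z = abs (NORM.S.INV (\frac{\alpha}{2}))$ , where $\alpha = 1 -$ confidence level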

We can now construct the confidence interval:

Confidence interval lower limit: $CL_{Lower} = \bar{p} - z \cdot \sqrt{\frac{\bar{p} (1 - \bar{p})}{n}} = \bar{p} - E$
Confidence interval upper limit: $CL_{Upper} = \bar{p} + z \cdot \sqrt{\frac{\bar{p} (1 - \bar{p})}{n}} = \bar{p} + E$

Note: As in the “sigma known” estimation problems, we use $z$ -scores when working with proportions.
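The calculations above can be sketched in a few lines of Python. This is a minimal sketch, not the book's own method: the function name `proportion_confidence_interval` and the sample numbers are hypothetical, and `statistics.NormalDist().inv_cdf` from the standard library is assumed to play the role of Excel's NORM.S.INV.

```python
from math import sqrt
from statistics import NormalDist


def proportion_confidence_interval(x, n, confidence):
    """Return (p_bar, margin_of_error, lower_limit, upper_limit)."""
    p_bar = x / n                                # sample proportion
    std_error = sqrt(p_bar * (1 - p_bar) / n)    # standard error of p_bar
    alpha = 1 - confidence
    z = abs(NormalDist().inv_cdf(alpha / 2))     # like ABS(NORM.S.INV(alpha/2))
    margin = z * std_error                       # margin of error E
    return p_bar, margin, p_bar - margin, p_bar + margin


# Hypothetical illustration: 120 "successes" in a sample of 200, 95% confidence.
p_bar, margin, lower, upper = proportion_confidence_interval(120, 200, 0.95)
print(f"p_bar = {p_bar:.3f}, E = {margin:.4f}, interval = ({lower:.4f}, {upper:.4f})")
```

Returning the margin of error alongside the limits keeps the intermediate quantity visible, which is useful when a question asks for $E$ on its own.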

Calculating Required Sample Sizes

If we are given (or want) a maximum margin of error, we can calculate the sample size required to achieve it at a given confidence level. The formula depends on whether we have a sample proportion or some other estimate of the population proportion. If we have no estimate at all, we use 50%, which gives the largest (most conservative) sample size because $\bar{p} (1 - \bar{p})$ is greatest at $\bar{p} = 0.5$ .

Sample size if $\bar{p}$ is known: $n = {(\frac{z}{E})}^{2} \bar{p} (1 - \bar{p})$
Sample size if $\bar{p}$ is unknown: $n = {(\frac{z}{E})}^{2} (0.5) (1 - 0.5) = 0.25 {(\frac{z}{E})}^{2}$
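Both sample-size formulas can be folded into one helper. The sketch below is illustrative only: `required_sample_size` and its parameter names are hypothetical, the input values are made up for demonstration, and the result is rounded up because a sample must contain a whole number of observations.

```python
from math import ceil
from statistics import NormalDist


def required_sample_size(margin, confidence, p_estimate=None):
    """Smallest whole n whose margin of error does not exceed `margin`."""
    if p_estimate is None:
        p_estimate = 0.5                         # no prior estimate: use 50%
    z = abs(NormalDist().inv_cdf((1 - confidence) / 2))
    n_exact = (z / margin) ** 2 * p_estimate * (1 - p_estimate)
    return ceil(n_exact)                         # always round up


# Hypothetical illustration: margin of error 0.05 at 95% confidence.
print(required_sample_size(0.05, 0.95))          # no estimate of the proportion
print(required_sample_size(0.05, 0.95, 0.3))     # prior estimate of 30%
```

Supplying a prior estimate never increases the required sample size, since any value other than 0.5 makes $\bar{p} (1 - \bar{p})$ smaller.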

Example 54.1

a) With 99% confidence, what can we say about the maximum size of our error (i.e., the margin of error) in estimating the true percentage of people who prefer the new software program?

Two pieces of information in the problem identify the type of question we are presented with. First, the word “confidence” tells us that this is an estimation problem. Second, “true percentage” tells us that we are dealing with the third area of estimation: the true proportion of a population.

Next, ask yourself what the question wants us to solve for. In other words, what does “the maximum size of our error” or the “margin of error” refer to? The margin of error is the value $E$ : the maximum distance from the center of the sampling distribution to either confidence limit at the given confidence level of 99%. Thus we calculate the margin of error:

$E = z \cdot \sqrt{\frac{\bar{p} (1 - \bar{p})}{n}}$

We must first calculate $\bar{p}$ (our point estimate of the true proportion of people who prefer the new software product). The question tells us that out of 40 randomly selected people (i.e., our sample size is $n = 40$ ), 28 preferred the new software product:

$\bar{p} = \frac{28}{40} = 0.70$

Thus, a point estimate of the true proportion of the population who prefer the new software is 70%. We also need the z-score that corresponds to the given confidence level of 99%. With $\alpha = 1 - 0.99 = 0.01$ , each tail has area $\frac{\alpha}{2} = 0.005$ , so we use NORM.S.INV:

$z = abs (NORM.S.INV (0.005)) = 2.576$
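For readers working outside Excel, the same z-score lookup can be sketched with Python's standard library. The helper name `z_score` is hypothetical, and `NormalDist().inv_cdf` is assumed to stand in for NORM.S.INV. The 90% and 95% levels are included only for comparison.

```python
from statistics import NormalDist


def z_score(confidence):
    """Positive z-score that leaves alpha/2 in each tail."""
    alpha = 1 - confidence
    # inv_cdf(alpha / 2) is negative (lower tail), so take the absolute value.
    return abs(NormalDist().inv_cdf(alpha / 2))


for level in (0.90, 0.95, 0.99):
    print(f"{level:.0%} confidence: z = {z_score(level):.3f}")
# 90% confidence: z = 1.645
# 95% confidence: z = 1.960
# 99% confidence: z = 2.576
```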

We can calculate the margin of error:

$\begin{aligned} E & = z \cdot \sqrt{\frac{\bar{p} (1 - \bar{p})}{n}} \\ & = 2.576 \cdot \sqrt{\frac{0.7 (1 - 0.7)}{40}} \\ & \approx 0.18664 \end{aligned}$

Thus we can be 99% confident that the maximum error of our estimate of the true proportion is 0.1866, or 18.66 percentage points. Recall that this value is the distance from the center of the sampling distribution to each confidence limit. In other words, we are 99% confident that the true proportion of people who prefer the new software lies within 18.7 percentage points of our sample proportion of 70%.

c) Using a 99% confidence level, how large a sample should be taken to obtain a margin of error for the estimation of the population proportion of 0.10?

Note that in the concluding statement to part b) the margin of error is quite large: the true proportion of users who like the software may fall anywhere between 51.3% and 88.7% of the population. This question asks us to find the sample size that would reduce our margin of error to 10%. In other words, we would like our estimate of the proportion of users who like the software to be within 10 percentage points of the true proportion of the population.
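For reference, the limits from part b) follow from the part a) results, $\bar{p} = 0.70$ and $E \approx 0.1866$ , using the confidence limit formulas from earlier in this section:

$\begin{aligned} CL_{Lower} & = \bar{p} - E = 0.70 - 0.1866 = 0.5134 \\ CL_{Upper} & = \bar{p} + E = 0.70 + 0.1866 = 0.8866 \end{aligned}$

These agree with the Excel values of 0.513363 and 0.886637.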

To calculate the sample size for a proportion question, we use the formula:

$n = {(\frac{z}{E})}^{2} \bar{p} (1 - \bar{p})$

This formula requires us to know 3 variables:

$E$ : the margin of error, which is typically supplied in the question. In part c) it is given as 0.10 (10%).
$z$ : the z-score, which is derived from the confidence level (again provided in the question). Once again we have a 99% confidence level, so the corresponding z-score is 2.576.
$\bar{p}$ : the point estimate of the true proportion of the population.

There is a bit of a circular argument here: the formula asks for $\bar{p}$ to calculate the sample size, yet we will be using the sample to calculate $\bar{p}$ ! The rule of thumb is to use any prior estimate of the proportion if one is available, for example from a pilot study or a previous survey that relates to the current investigation. Of course, if an estimate does not relate to the current investigation, it should not be used. In this example, we have a prior point estimate of 70% for the proportion of users who prefer the new software product over the older version. Thus we may use $\bar{p} = 0.7$ in our calculation:

$\begin{aligned} n & = {(\frac{z}{E})}^{2} \bar{p} (1 - \bar{p}) \\ & = {(\frac{2.576}{0.10})}^{2} 0.7 (1 - 0.7) \\ & \approx 139.33 \end{aligned}$

As we saw in the large sample case, when calculating the required sample size we always round up to the next integer. Thus, in order to be 99% confident that the true proportion will be within 10 percentage points of our sample proportion, we need a sample of 140 software users in our study.
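A quick check shows why the result is rounded up rather than to the nearest integer. The sketch below recomputes the margin of error for samples of 139 and 140, assuming the prior estimate $\bar{p} = 0.7$ still holds; the helper name `margin_of_error` is hypothetical.

```python
from math import sqrt
from statistics import NormalDist

z = abs(NormalDist().inv_cdf(0.005))    # z-score for 99% confidence
p_bar = 0.7                             # prior estimate used in part c)


def margin_of_error(n):
    """Margin of error for a sample of size n at 99% confidence."""
    return z * sqrt(p_bar * (1 - p_bar) / n)


for n in (139, 140):
    print(n, round(margin_of_error(n), 4))
# 139 0.1001
# 140 0.0998
```

With 139 users the margin of error is still slightly above 0.10, and only at 140 does it drop below the target.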

Example 54.1 Excel Solutions

| Part | Quantity | Value | Excel formula |
| --- | --- | --- | --- |
| a) | conf level | 99% | |
| a) | alpha | 1% | =1-B3 |
| a) | a/2 | 0.005 | =B4/2 |
| a) | z | 2.575829 | =ABS(NORM.S.INV(B5)) |
| a) | x | 28 | |
| a) | n | 40 | |
| a) | p_bar | 0.7 | =B7/B8 |
| a) | 1-p_bar | 0.3 | =1-B9 |
| a) | sigma | 0.072457 | =SQRT(B9*B10/B8) |
| a) | E | 0.186637 | =B11*B6 |
| b) | CL_Lower | 0.513363 | =B9-B12 |
| b) | CL_Upper | 0.886637 | =B9+B12 |
| c) | E_new | 0.1 | |
| c) | n_new | 139.3328 | =(B6/B19)^2*B9*B10 |
| c) | answer | 140 | =ROUNDUP(B20,0) |
| d) | p_new | 0.5 | |
| d) | q_new | 0.5 | =1-B24 |
| d) | E_new | 0.1 | |
| d) | n_new | 165.8724 | =(B6/B26)^2*B24*B25 |
| d) | answer | 166 | =ROUNDUP(B27,0) |

License

An Introduction to Business Statistics for Analytics (1st Edition) Copyright © 2024 by Amy Goldlist; Charles Chan; Leslie Major; Michael Johnson is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License, except where otherwise noted.
