Counts

An introduction to basic statistics and an introduction to confidence intervals are available elsewhere in this guidance.


Calculation of confidence intervals for counts

The method for calculating confidence intervals for a count should be chosen according to whether the count arises from a binomially distributed proportion or a Poisson distributed rate.

Binomial distribution

Where the observed count is based on a proportion, for example, a smoking prevalence indicator, its limits should be calculated from the binomial distribution. Here, the Wilson Score method (Wilson, 1927) should be used, and the \(100(1 - \alpha)\)% confidence limits for the proportion \(p\) are given by:

\[ p _{lower} = \frac{2O + z^2 - z\sqrt{z^2 + 4Oq}}{2(n + z^2)} \]

\[ p _{upper} = \frac{2O + z^2 + z\sqrt{z^2 + 4Oq}}{2(n + z^2)} \]

where:

  • \(O\) is the observed number of individuals in the sample or population who have the specified characteristic

  • \(n\) is the total number of individuals in the sample or population

  • \(q\) is the proportion without the specified characteristic (\(1 - p\), where \(p = O/n\))

  • \(z\) is the \(100\left(1 - \frac{\alpha}{2}\right)\)th percentile value from the standard normal distribution

For example, for a 95% confidence interval, \(\alpha = 0.05\) and \(z \approx 1.96\) (the 97.5th percentile value from the standard normal distribution).

The limits for the proportion should then be multiplied by \(n\) to convert them into limits for the observed count.
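
The calculation above can be sketched in Python as follows. This is a minimal illustration, not a reference implementation: the function name `wilson_count_limits` and its `confidence` parameter are hypothetical, and \(z\) is taken from the standard library's `statistics.NormalDist` rather than from a rounded table value.

```python
from math import sqrt
from statistics import NormalDist


def wilson_count_limits(observed, n, confidence=0.95):
    """Return (lower, upper) Wilson Score limits for a count out of n.

    Hypothetical helper for illustration only.
    """
    if n <= 0 or not 0 <= observed <= n:
        raise ValueError("need n > 0 and 0 <= observed <= n")
    alpha = 1 - confidence
    # z is the 100(1 - alpha/2)th percentile of the standard normal
    z = NormalDist().inv_cdf(1 - alpha / 2)
    q = 1 - observed / n  # proportion without the characteristic
    centre = 2 * observed + z**2
    spread = z * sqrt(z**2 + 4 * observed * q)
    denominator = 2 * (n + z**2)
    p_lower = (centre - spread) / denominator
    p_upper = (centre + spread) / denominator
    # Multiply by n to convert proportion limits into count limits
    return n * p_lower, n * p_upper


lower, upper = wilson_count_limits(30, 100)
print(f"{lower:.1f} to {upper:.1f}")
```

For an observed count of 30 out of 100, this gives 95% limits for the count of roughly 21.9 to 39.6.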

Poisson distribution

Where the count follows a Poisson distribution, for example, a mortality or cancer incidence count, Poisson limits should be used. These can be calculated using either Byar’s method (Breslow and Day, 1987) or the \(\chi^2\) exact method (Armitage and Berry, 2002), depending on the size of the count.

Counts of 10 or greater

Where the count \(O\) is 10 or greater, Byar’s method should be used to calculate the lower and upper confidence limits of the count. These are given by:

\[ O _{lower} = O \left(1-\frac{1}{9O}-\frac{z}{3\sqrt{O}}\right)^3 \]

\[ O _{upper} = (O+1) \left(1-\frac{1}{9(O+1)}+\frac{z}{3\sqrt{O+1}}\right)^3 \]

where:

  • \(O\) is the total observed count of events in the local or subject population

  • \(z\) is the \(100\left(1 - \frac{\alpha}{2}\right)\)th percentile value from the standard normal distribution
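
As a minimal sketch, Byar's limits can be computed in Python as shown below. The function name `byar_count_limits` and the `confidence` parameter are hypothetical names chosen for illustration, and the check that rejects counts below 10 simply mirrors the threshold stated in this section.

```python
from math import sqrt
from statistics import NormalDist


def byar_count_limits(observed, confidence=0.95):
    """Return (lower, upper) limits for a Poisson count using Byar's method.

    Hypothetical helper for illustration only.
    """
    if observed < 10:
        raise ValueError("Byar's method is used here only for counts of 10 or more")
    alpha = 1 - confidence
    # z is the 100(1 - alpha/2)th percentile of the standard normal
    z = NormalDist().inv_cdf(1 - alpha / 2)
    o = observed
    lower = o * (1 - 1 / (9 * o) - z / (3 * sqrt(o))) ** 3
    o1 = o + 1  # the upper limit is evaluated at O + 1
    upper = o1 * (1 - 1 / (9 * o1) + z / (3 * sqrt(o1))) ** 3
    return lower, upper


lower, upper = byar_count_limits(100)
print(f"{lower:.1f} to {upper:.1f}")
```

For an observed count of 100, the 95% limits are approximately 81.4 and 121.6.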

Counts of less than 10

Where the count \(O\) is less than 10, the \(\chi^2\) exact method should be used to calculate the lower and upper confidence limits of the count. These are given by:

\[ O _{lower} = \frac{{\chi}^2_{lower}}{2} \]

\[ O _{upper} = \frac{{\chi}^2_{upper}}{2} \]

where:

  • \(O\) is the total observed count of events in the local or subject population

  • \({\chi}^2_{lower}\) is the \(100\left(\frac{\alpha}{2}\right)\)th percentile value from the \({\chi}^2\) distribution with \(2{O}\) degrees of freedom

  • \({\chi}^2_{upper}\) is the \(100\left(1 - \frac{\alpha}{2}\right)\)th percentile value from the \({\chi}^2\) distribution with \(2{O}+2\) degrees of freedom
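
The sketch below illustrates the exact limits using only the Python standard library, which has no \(\chi^2\) percentile function. Instead of looking up \(\chi^2\) percentiles, it solves the equivalent Poisson tail equations by bisection: the lower limit is the mean \(\mu\) for which \(P(X \ge O) = \frac{\alpha}{2}\), and the upper limit is the mean for which \(P(X \le O) = \frac{\alpha}{2}\). These are the same values as \({\chi}^2_{lower}/2\) and \({\chi}^2_{upper}/2\). The function names, the search bracket and the choice to return a lower limit of 0 when \(O = 0\) (where the \(\chi^2\) distribution would have zero degrees of freedom) are assumptions made for this illustration.

```python
from math import exp, lgamma, log


def poisson_cdf(k, mu):
    """P(X <= k) for X ~ Poisson(mu), with integer k >= 0 and mu > 0."""
    return sum(exp(j * log(mu) - mu - lgamma(j + 1)) for j in range(k + 1))


def _solve_decreasing(f, target, lo, hi, iterations=200):
    """Bisection for f(mu) = target, where f decreases as mu increases."""
    for _ in range(iterations):
        mid = (lo + hi) / 2
        if f(mid) > target:
            lo = mid  # probability still too high, so mu must be larger
        else:
            hi = mid
    return (lo + hi) / 2


def exact_count_limits(observed, confidence=0.95):
    """Return (lower, upper) exact Poisson limits for a small count.

    Hypothetical helper for illustration only.
    """
    if observed < 0:
        raise ValueError("observed must be non-negative")
    alpha = 1 - confidence
    tiny = 1e-12
    bracket = 5 * observed + 20  # assumed wide enough for small counts
    # Upper limit: the mean at which P(X <= O) falls to alpha/2
    upper = _solve_decreasing(
        lambda mu: poisson_cdf(observed, mu), alpha / 2, tiny, bracket
    )
    if observed == 0:
        lower = 0.0  # assumed convention when there are no events
    else:
        # Lower limit: P(X >= O) = alpha/2, i.e. P(X <= O - 1) = 1 - alpha/2
        lower = _solve_decreasing(
            lambda mu: poisson_cdf(observed - 1, mu), 1 - alpha / 2, tiny, bracket
        )
    return lower, upper


lower, upper = exact_count_limits(5)
print(f"{lower:.2f} to {upper:.2f}")
```

For an observed count of 5, the 95% limits are approximately 1.62 and 11.67. Where SciPy is available, the percentiles can instead be taken directly from `scipy.stats.chi2.ppf`, for example `chi2.ppf(alpha / 2, 2 * O) / 2` for the lower limit and `chi2.ppf(1 - alpha / 2, 2 * O + 2) / 2` for the upper limit.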


Page last updated: August 2024