Excess mortality in England: Methodology

Official Statistics in Development

Updated: 17 April 2025

1 Introduction

This methodology document describes the methods used by the Office for Health Improvement and Disparities (OHID) for the Excess mortality in England monthly reports. It details how monthly excess deaths have been estimated. The method is similar to that used by the Office for National Statistics for their weekly reporting of excess deaths, which is described in the ONS Methodology Document.

See Appendix 2 for a list of updates to the model, data sources and documentation since its launch.

Excess deaths are estimated by comparing the number of (observed) registered deaths each month with the number of deaths that would have been expected, based on recent trends. The numbers of expected deaths are estimated using a statistical model, based on trends in mortality rates over the previous five years, with adjustments to take account of the extremely high death rates in peak months of the Covid-19 pandemic.

The report also includes monthly directly standardised mortality rates to complement the excess deaths data: where there is a clear rising or falling trend in the past rates it is helpful to know that as context for the excess deaths.

All the data are presented by calendar month of registration at regional and upper tier local authority level, for subgroups of the population (age groups, sex, deprivation groups) and by cause of death.

During the Covid-19 pandemic, Public Health England, and then OHID, published weekly data which compared the numbers of registered deaths with the number expected, had there been no pandemic. These reports aimed to show the overall impact of the pandemic on mortality. The current monthly reports are not related to any particular cause or event: they are intended to provide ongoing monitoring, to highlight variations in monthly mortality that may arise for any reason.

2 Methods

2.1 Overview

Three separate sets of analyses are carried out:

excess deaths from all causes, broken down by age group, sex, deprivation quintile, region (former government office region) and upper tier local authority (UTLA)
excess deaths from specific causes, identified by underlying cause of death
directly age-standardised mortality rates (DSRs), broken down by age group, sex, deprivation quintile, region, upper tier local authority and underlying cause of death

All of the analyses in the first set involve a single model, which means all the results are entirely consistent with each other.

For each cause of death, a separate model is run, because groups of causes are not necessarily mutually exclusive, and running the model at a more granular level, so that causes can be grouped in different ways, is too computationally intensive. Deaths are selected using International Classification of Diseases, 10th revision (ICD-10) [1] codes assigned by the Office for National Statistics (ONS). The causes of death included in the report are those which account for large numbers of deaths or are of specific policy interest. The cause of death analyses are based on underlying cause of death coded in the death record. Because cause of death coding can be missing from the first registration record received, and added or amended later, the cause of death analysis should be considered provisional for the most recent two months released.

The DSRs are calculated independently (they do not come from the excess deaths models), but are based on the same data.

2.2 Expected deaths – generation of the modelled estimates

2.2.1 Data sources

Baseline estimates of the expected number of death registrations in a given month are modelled using a combination of deaths and population denominator data from the previous five years. The baseline period moves on one month each month, so that every month’s baseline is calculated on the same basis as any other month. For example, the baseline (comparator) for deaths registered in January 2024 is calculated using data from February 2018 to January 2023 inclusive, and that for December 2023 is calculated using data from January 2018 to December 2022 inclusive. However, data for March to May 2020 and November 2020 to February 2021 are excluded from the baseline calculations, as these months were dominated by exceptional numbers of deaths from Covid-19. Specifically, months excluded from the baseline had more than 18% of deaths with Covid-19 as the underlying cause. Those not excluded from the baseline had less than 10% of deaths with Covid-19 as the underlying cause. No months fell between those two cut-offs. March 2020 was also excluded because it is recognised that at the start of the pandemic there was no code for Covid-19, so many Covid-19 deaths were not recorded as such.
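The rolling baseline window described above can be sketched as follows. This is an illustrative sketch only, not the code used for the reports: the function names `shift_month` and `baseline_months` are hypothetical, and the window is inferred from the two worked examples in the text (60 months ending 12 months before the target month).

```python
# Months excluded from every baseline, as listed in the text:
# March to May 2020 and November 2020 to February 2021.
EXCLUDED = {(2020, 3), (2020, 4), (2020, 5),
            (2020, 11), (2020, 12), (2021, 1), (2021, 2)}


def shift_month(year, month, delta):
    """Return (year, month) moved by `delta` calendar months."""
    index = year * 12 + (month - 1) + delta
    return index // 12, index % 12 + 1


def baseline_months(target_year, target_month):
    """List the (year, month) pairs forming the baseline for a target month.

    Assumption inferred from the worked examples: the window is the 60 months
    ending 12 months before the target month, minus the excluded months.
    """
    window = [shift_month(target_year, target_month, -back)
              for back in range(71, 11, -1)]
    return [m for m in window if m not in EXCLUDED]


months = baseline_months(2024, 1)
print(months[0], months[-1], len(months))
```

For January 2024 the window runs from February 2018 to January 2023, and the seven excluded months leave 53 months of baseline data.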

Mortality data

Deaths for the baseline period are drawn from fully coded and cleaned annual extracts supplied to us by ONS, supplemented by daily deaths data supplied by ONS (for recent periods, until the annual extract replaces them). Deaths are aggregated by month of registration, age group, sex, deprivation quintile, underlying cause of death and UTLA.

Denominator data

Where available, population data (for baseline and target periods) are derived from 2021 census based (or rebased) mid-year estimates, published by ONS. Population data match the breakdowns set out for mortality data above. Mid-month populations are estimated by interpolating between the published mid-year estimates.
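As an illustration of the interpolation step, the sketch below assumes simple linear interpolation between consecutive mid-year estimates; the text does not state which interpolation method is used, so this is an assumption. The function name `mid_month_population` and the figures in the example are hypothetical.

```python
def mid_month_population(mid_year, year, month):
    """Linearly interpolate a mid-month population from mid-year estimates.

    `mid_year` maps a year to its mid-year (30 June) estimate. Time is
    measured in months: a mid-year point sits 6.0 months into its year and
    the middle of calendar month m sits at m - 0.5. Linear interpolation is
    an assumption made for this sketch.
    """
    position = month - 0.5
    if position >= 6.0:
        start_year, fraction = year, (position - 6.0) / 12.0
    else:
        start_year, fraction = year - 1, (position + 6.0) / 12.0
    start, end = mid_year[start_year], mid_year[start_year + 1]
    return start + fraction * (end - start)


# Hypothetical mid-year estimates for one population subgroup.
estimates = {2021: 1000.0, 2022: 1120.0}
print(mid_month_population(estimates, 2021, 7))   # just after mid-2021
print(mid_month_population(estimates, 2022, 1))   # January 2022
```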

Where mid-year estimates are not available, ONS population projections are used and, where necessary, proportioned out to subgroups using the nearest available year’s mid-year estimates.

At the time of publication:

At England level, mid-year estimates are available from ONS up to 2023, and projections for 2024 and 2025.
At regional and UTLA level, mid-year estimates are available from ONS up to 2023. For 2024 and 2025, the England projections are proportioned out to regions and UTLAs using the 2023 region and UTLA mid-year estimates.
At LSOA level (used for deprivation quintiles), mid-year estimates are available from ONS up to 2022. For 2023, the UTLA mid-year estimates are proportioned out to LSOAs using the 2022 LSOA mid-year estimates. For 2024 and 2025, the England projections are proportioned out to LSOAs using 2023 lower tier local authority mid-year estimates and 2022 LSOA mid-year estimates.
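The "proportioning out" used in the list above amounts to sharing a higher-level total across lower-level areas according to their shares in the nearest year with estimates. A minimal sketch, with a hypothetical function name and made-up figures:

```python
def apportion(total, reference_counts):
    """Split `total` across areas in proportion to `reference_counts`.

    `reference_counts` holds each area's population in the nearest year for
    which estimates exist (for example 2023 regional mid-year estimates).
    """
    reference_total = sum(reference_counts.values())
    return {area: total * count / reference_total
            for area, count in reference_counts.items()}


# Hypothetical figures: a national projection shared out to two areas.
reference = {"Area A": 300.0, "Area B": 700.0}
print(apportion(2000.0, reference))
```

The shares sum to one, so the apportioned figures always add back up to the higher-level total.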

Cause of death analyses use whole population denominators.

Populations will be updated when they are published by ONS, and models re-run.

2.2.2 Baseline model

Model outcome

The primary model provides estimates of expected deaths by month of registration at national and subnational level, and for subgroups of the population (age group, sex, deprivation group, region and UTLA). Similar models are subsequently run to provide estimates by cause of death.

Data structure and covariates

In line with the ‘rising activity, multi-level mixed effects, indicator emphasis’ (RAMMIE) model [2], independent variables include month of year, allowing for seasonal effects. Covariates were included, allowing for the effect of age, sex, deprivation, and geographical area. Age is grouped in the model into broad age bands for younger age groups (0 to 24, 25 to 49 and 50 to 64) and 5-year age bands for older age groups (65 to 69 through to 85 to 89 and 90 and over). Younger age groups cannot reasonably be modelled in 5-year age bands as the numbers of deaths within the socio-demographic subgroups are small, so the models take an unreasonable amount of processing time to converge and give unreliable estimates. The model standardises (indirectly) for age using the age groups specified.

A linear trend was also included in the model to take into account systematic changes in the rate of death that are not reflected in the changing age structure of the population. The trend was constructed by assuming a constant daily rate of change throughout the baseline period.

Data are presented by:

age group (0 to 24, 25 to 49, 50 to 64, 65 to 74, 75 to 84 and 85 and over) derived from age at the time of death
sex (male or female) based on sex reported in the death record
region and UTLA based on April 2023 UTLA boundaries [3]
deprivation quintile based on lower layer super-output area (LSOA) of residence: LSOAs are grouped into national deprivation quintiles using the 2019 Index of Multiple Deprivation (IMD) [4]

Models are also run for separate underlying causes of death (listed in Appendix 1). The models are identical to the main model described above, with deaths filtered by underlying cause. The data are then presented by region.

The structure of the models used is hierarchical with denominators and counts of death each being fully disaggregated by age group, sex, geographic area, and deprivation.

Statistical modelling

Quasi-Poisson regression models are fitted on the logarithmic scale [5]. Quasi-Poisson models are used because when counts of deaths are independent of one another they theoretically follow a Poisson distribution. This has the characteristic property that as its mean (the expected number of deaths) increases, the variability of the observed count of deaths (its variance) rises in parallel such that the variance always equals the mean.

However, deaths are not completely independent. For example, an epidemic such as a high ’flu season results in outlying high rates of death for a period, which, if not accounted for, would carry an inappropriate amount of weight in the baseline. In these circumstances, the variance increases faster than the mean. This is referred to as overdispersion. Because quasi-Poisson models allow the linear relationship between variance and mean to have a slope other than 1, they are more suitable for analysis of death rates when overdispersion exists.
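The slope of the variance-mean relationship is the dispersion parameter. One standard way to estimate it is the Pearson chi-square statistic divided by the residual degrees of freedom; whether the reports use exactly this estimator is not stated in the text. The sketch below uses a hypothetical function name and made-up counts.

```python
def pearson_dispersion(observed, fitted, n_parameters):
    """Estimate the dispersion as Pearson chi-square / residual df.

    A value near 1 is consistent with a Poisson model; values well above 1
    indicate overdispersion (variance rising faster than the mean).
    """
    residual_df = len(observed) - n_parameters
    chi_square = sum((y - mu) ** 2 / mu for y, mu in zip(observed, fitted))
    return chi_square / residual_df


# Made-up monthly counts scattered more widely than Poisson would suggest.
observed = [12, 8, 15, 5]
fitted = [10.0, 10.0, 10.0, 10.0]
print(pearson_dispersion(observed, fitted, n_parameters=1))
```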

The models contain the set of covariates outlined in the ‘Data structure and covariates’ section above. To allow for effects to vary between groups, interaction terms are included for trend and age, trend and deprivation, age and sex, age and deprivation and age and month of year. When modelling deaths it is fundamental to include a denominator representing the scale of exposure: the exposure includes the size of the population and the number of working days in the month (to take account of the fact that registrations only occur on working days). The exposure is specified in the model as an offset: person-working-days.
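To make the person-working-days exposure concrete, the sketch below counts working days in a month and forms the log offset for a log-link model. It assumes working days are Monday to Friday excluding bank holidays, which the text does not spell out; the function names are hypothetical and the bank holiday set must be supplied by the caller.

```python
import math
from datetime import date, timedelta


def working_days(year, month, bank_holidays=frozenset()):
    """Count Monday-to-Friday days in a month that are not bank holidays.

    Assumption for this sketch: registrations occur on weekdays only.
    """
    day = date(year, month, 1)
    count = 0
    while day.month == month:
        if day.weekday() < 5 and day not in bank_holidays:
            count += 1
        day += timedelta(days=1)
    return count


def log_exposure(population, year, month, bank_holidays=frozenset()):
    """Offset for a log-link model: log of person-working-days."""
    return math.log(population * working_days(year, month, bank_holidays))


new_year = {date(2024, 1, 1)}
print(working_days(2024, 1), working_days(2024, 1, new_year))
```

January 2024 has 23 weekdays, or 22 once the New Year's Day bank holiday is removed, so the same population contributes less exposure than in a month with more working days.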

The model generates expected death rates for each population subgroup for the month specified, which are then applied to relevant population estimates to estimate the expected numbers of deaths for each month in each subgroup. The presence of trend in the model (and the interactions of trend with age and deprivation) means that the expected deaths assume that the trend through the baseline period would continue through to the current month.
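The extrapolation of the trend can be illustrated with a deliberately simplified log-linear rate that has only an intercept and a daily trend term. The real models include many more covariates and interactions; the function name and all coefficient values here are hypothetical.

```python
import math


def expected_deaths(intercept, trend_per_day, days_since_start,
                    person_working_days):
    """Expected deaths for one subgroup-month under a log-linear rate model.

    log(rate) = intercept + trend_per_day * days_since_start, where the rate
    is per person-working-day and the exposure enters as a log offset.
    """
    log_rate = intercept + trend_per_day * days_since_start
    return math.exp(log_rate + math.log(person_working_days))


# Hypothetical coefficients: a falling trend carried forward six years.
at_start = expected_deaths(math.log(1e-6), -0.0001, 0, 2_300_000)
six_years_on = expected_deaths(math.log(1e-6), -0.0001, 2190, 2_300_000)
print(at_start, six_years_on)
```

Because the trend term keeps accumulating with time, the expected count for the current month reflects the baseline trend continued beyond the end of the baseline period.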

The models are run using the generalised linear modelling function in the statistical package R [6].

2.3 Observed deaths

Results of the analyses are presented on a monthly basis. ONS provide a daily feed of registered deaths data, which are provisional and subject to change. For each monthly publication, the previous two months will also be updated using the latest version of the data, resulting in small changes reflecting improvements in cause of death coding or the addition of registrations not previously received. For example, in February, the models and outputs for November and December will be re-run, in addition to the first publication of data for January.

2.4 Calculation of excess deaths

Monthly excess mortality is calculated by taking the observed number of deaths registered in a month and subtracting the expected registered deaths for that month.

Cumulative excess mortality is estimated by summing the monthly excess deaths over the period selected. Excess death ratios, cumulative and monthly, are calculated by dividing observed deaths by expected deaths. This is the same way that a standardised mortality ratio is calculated, with the reference rates generated from the model.
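These calculations can be sketched in a few lines. The function name and figures are hypothetical, and the cumulative ratio is taken here as total observed divided by total expected over the period, which is one reading of the description above.

```python
def excess_summary(observed, expected):
    """Monthly and cumulative excess deaths and observed/expected ratios."""
    monthly_excess = [o - e for o, e in zip(observed, expected)]
    monthly_ratio = [o / e for o, e in zip(observed, expected)]
    return {
        "monthly_excess": monthly_excess,
        "monthly_ratio": monthly_ratio,
        "cumulative_excess": sum(monthly_excess),
        # Assumption: cumulative ratio = total observed / total expected.
        "cumulative_ratio": sum(observed) / sum(expected),
    }


# Made-up counts for two months.
summary = excess_summary(observed=[110, 95], expected=[100.0, 100.0])
print(summary)
```

A ratio above 1 means more deaths were registered than expected, and a ratio below 1 means fewer.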

2.5 Calculation of directly age-standardised rates

The DSRs, with 95% confidence intervals, are calculated using the standard OHID methodology. They are presented as rates per 100,000 person-years, so that the rates for all months are directly comparable with one another.

Rates are presented by age group (the same age groups as the excess deaths data), sex, deprivation quintile, region, upper tier local authority and underlying cause of death.

Numerator data

The deaths data are as described in section 2.2.1 above, aggregated within each of the breakdowns listed above, by calendar month (based on the date of registration), additionally broken down by 5-year age group to facilitate the standardisation calculation.

Denominator data

Monthly population estimates are derived from ONS mid-year estimates of population, exactly as described in section 2.2.1 above.

Deaths are almost exclusively registered on working days, and the number of working days in each calendar month varies depending on how weekends and bank holidays fall. The monthly denominators are therefore adjusted so that monthly rates are comparable: the population estimate for each month is divided by the total number of working days in the calendar year and multiplied by the number of working days in that month. In effect, the denominator is the number of person-working-days, with the results scaled for presentation as rates per 100,000 person-years.
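The denominator adjustment and the standardisation can be sketched together as below. The function names are hypothetical, the two age bands and standard population weights are made up for illustration (the text does not name the standard population used), and confidence intervals are omitted.

```python
def monthly_person_years(population, working_days_in_month,
                         working_days_in_year):
    """Working-day adjusted person-years contributed by one month."""
    return population * working_days_in_month / working_days_in_year


def directly_standardised_rate(deaths, person_years, standard_population):
    """DSR per 100,000 person-years over matching age bands."""
    weighted = sum(weight * d / n for d, n, weight
                   in zip(deaths, person_years, standard_population))
    return 100_000 * weighted / sum(standard_population)


# Made-up example: two age bands, 23 working days out of 253 in the year.
populations = [60_000, 40_000]
person_years = [monthly_person_years(p, 23, 253) for p in populations]
rate = directly_standardised_rate([2, 30], person_years, [70_000, 30_000])
print(round(rate, 2))
```

Because each month's denominator is scaled by its share of the year's working days, a month with fewer working days is not made to look artificially low.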

3 Limitations

The reports published on excess mortality related to the pandemic included ethnicity in the model and reports. Permission to link deaths data to HES data and other sources was only granted for work relating to the management of the Covid-19 pandemic. Hence, we are unable to include ethnicity in the model or outputs for ongoing monitoring.

Deaths that tend to involve a coroner’s inquest, including a large proportion of those in people under 65, are subject to delays in reporting, as it may take months for an inquest to take place and for the death then to be registered. The time between death and registration has gradually increased over the last 20 years, and there was additional disruption to the coroner service during the pandemic, so excess deaths in younger people in particular should be interpreted with caution. Furthermore, for an increasing number of registered deaths the coding of cause of death is delayed: for this reason, the cause of death analyses will be provisional for the first two months of publication.

Deprivation is attributed ecologically based on the LSOA of residence at time of death. Any individual living in an area may not be representative of the area as a whole. In particular, for care home residents, the deprivation level of the location of the care home may not reflect the level of deprivation they experienced prior to entering the home.

The baseline is modelled using five years of historical data. These data include years of relatively high mortality and relatively low mortality. Although trends are detected over this period, and are used to inform the expected deaths, they are not necessarily stable trends: the prediction intervals (where available) reflect the uncertainty around prediction of any one year. Because the trends from the baseline period are extrapolated to the current reporting month, in groups where the death rate has been increasing through the baseline period, the expected deaths will continue this trend. If death rates have risen since the baseline period, but less than expected, this will result in negative excess deaths. Negative excess deaths do not necessarily indicate a downward trend in rates.

Excess deaths ratios, like standardised mortality ratios (SMRs), enable comparison between groups, but because they are indirectly standardised, the comparison is not precise. It is however very unlikely to be misleading.

Directly standardised rates for individual months are inevitably based on quite small numbers of deaths for some breakdowns. For the 0 to 24 age group within regions the number of deaths is frequently fewer than 10, and DSRs cannot reliably be calculated, so they are omitted. For some other breakdowns the DSRs are valid, but confidence intervals are wide and the figures should be interpreted with caution.

4 Comparison with other measures

Excess deaths measures are published by other public bodies, for varying purposes. For an overview of the different measures available see Measuring excess mortality: a guide to the main reports.

References

[1] World Health Organization. ICD-10: international statistical classification of diseases and related health problems. Tenth revision, 2nd ed. 2004. World Health Organization. [Cited: 13 February 2024].
[2] Morbey RA, Elliot AJ, Charlett A, Verlander NQ, Andrews N, Smith GE. The application of a novel ‘rising activity, multi-level mixed effects, indicator emphasis’ (RAMMIE) method for syndromic surveillance in England. Bioinformatics. Volume 31, Issue 22, 15 November 2015, pages 3660 to 3665.
[3] Office for National Statistics. Local Authority District to Region Lookup (May 2023) in England, ONS geography open data. [Cited: 13 February 2024].
[4] Ministry of Housing, Communities & Local Government. English indices of deprivation 2019 (IoD2019). Statistical Release. 26 September 2019. [Cited: 13 February 2024].
[5] Gardner W, Mulvey EP, Shaw EC. Regression analyses of counts and rates: Poisson, overdispersed Poisson, and negative binomial models. Psychological Bulletin. 1995, 118(3), 392 to 404.
[6] R Core Team. The R Project for Statistical Computing. R Foundation for Statistical Computing, Vienna, Austria. 2024. [Cited: 13 February 2024].

Appendix 1: Cause of death ICD-10 reference codes

Cause description	ICD-10 definition
Cancer	C00 to C97
Dementia and Alzheimer’s	F01, F03 and G30
All circulatory diseases	I00 to I99
Ischaemic heart diseases	I20 to I25
Cerebrovascular diseases	I60 to I69
Influenza and pneumonia	J09 to J18
Chronic lower respiratory diseases	J40 to J47
Cirrhosis and other liver diseases	K70 to K76
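The table above can be expressed as a lookup from ICD-10 category to cause group. The sketch below is illustrative only: the names `CAUSE_GROUPS` and `cause_groups` are hypothetical, and matching on the first three characters of the code is an assumption about how the ranges are applied.

```python
# Cause groups and ICD-10 category ranges, copied from the table above.
CAUSE_GROUPS = {
    "Cancer": [("C00", "C97")],
    "Dementia and Alzheimer’s": [("F01", "F01"), ("F03", "F03"),
                                 ("G30", "G30")],
    "All circulatory diseases": [("I00", "I99")],
    "Ischaemic heart diseases": [("I20", "I25")],
    "Cerebrovascular diseases": [("I60", "I69")],
    "Influenza and pneumonia": [("J09", "J18")],
    "Chronic lower respiratory diseases": [("J40", "J47")],
    "Cirrhosis and other liver diseases": [("K70", "K76")],
}


def cause_groups(icd10_code):
    """Return every cause group containing the code's 3-character category.

    Assumption: codes are matched on their first three characters, so
    "I21.9" and "I219" are both treated as category I21.
    """
    category = icd10_code.strip().upper().replace(".", "")[:3]
    return [name for name, ranges in CAUSE_GROUPS.items()
            if any(low <= category <= high for low, high in ranges)]


print(cause_groups("I21.9"))
```

A code such as I21.9 falls into both the circulatory and the ischaemic heart disease groups, which illustrates why the cause groups are not mutually exclusive and are modelled separately.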

Appendix 2: Change log

This section contains details of changes made to the excess mortality reports from their initial release in February 2024.

Release month	Change details
February 2024	First release of tool.
September 2024	Change in deaths data: Replaced provisional 2023 deaths data with final data. These data were used in the update of the tool in September and October but a whole time series revision to incorporate the final 2023 data was not done until November. Methodology document updated to reflect changes.
November 2024	Change in population and deaths data: Moved to using LSOA21 boundaries for assigning data to deprivation quintile.
November 2024	Change in population data: Moved to using ONS Census 21 mid-year population estimates at LSOA level for 2022 to obtain the distribution of population by deprivation. These distributions were applied to LTLA level data for 2013-2021 and 2023. The 2023 LTLA population distribution was applied to national population projections for 2024 and 2025, with the LTLA population projections then distributed into deprivation quintiles using the 2022 distributions.
November 2024	Change in population data: The whole excess mortality time series was revised to use the new populations.
November 2024	Inclusion of DSR time series data in tool. Methodology document updated to reflect all changes.
January 2025	Change in population data: Moved to using ONS Census 21 rebased mid-year population estimates for 2013 to 2022 at LSOA level. ONS revised populations for 2023 were incorporated which also resulted in minor changes to populations for 2023 onwards. The whole DSR and excess mortality time series was revised to use the new populations. Methodology document updated to reflect changes.
March 2025	Change log added to methodology document.
April 2025	Name change. Name changed on report and accompanying documents, to remove reference to pandemic.

Office for Health Improvement and Disparities

This publication is licensed under the terms of the Open Government Licence v3.0 except where otherwise stated. To view this licence, visit nationalarchives.gov.uk/doc/open-government-licence/version/3.

Where we have identified any third-party copyright information you will need to obtain permission from the copyright holders concerned.
