Using comparability ratios for mortality data

Introduction

Mortality statistics are essential tools for monitoring public health, developing public health policy and evaluating the effects of policies and the outcomes of health services. Overall mortality or all-cause rates in particular population groups are sufficient for some of these purposes. However, for many purposes we need to know more. What are the main diseases and injuries leading to death? How do death rates from these causes compare with those of other countries? How do they vary between different areas of the country and between groups of the population? How are they affected by health-related behaviour, exposure to potential hazards and available treatment? Is the burden of disease changing? To understand these issues we need to know whether mortality from particular diseases or injuries is increasing or decreasing over time.

To answer these questions, causes of death have to be classified in a reliable way, which is comparable between places and over time. An agreed system of classification, The International List of Causes of Death, was first developed for this purpose in the late nineteenth century (World Health Organisation). Much of our understanding of the causes and prevention of disease has depended on comparisons between death rates coded and classified in the same way in countries around the world since then. Needless to say, a classification which was developed over a century ago would be unlikely to reflect current knowledge about the nature and causes of diseases, or the relationships between them. The need for regular updating of the classification has been recognised since the beginning of the twentieth century, and revisions produced almost every decade.

The International Classification of Diseases, Ninth Revision (ICD-9) (World Health Organisation, 1977) was in use between 1979 and 2000 for mortality statistics in England and Wales, and was widely recognised as being out of date by the time it was replaced by the Tenth Revision (ICD-10) (World Health Organisation, 1992) in 2001. For example, AIDS and HIV were unknown when ICD-9 was devised. The Tenth Revision represented the largest change in the ICD in over 50 years. The first character of each code became alphabetic rather than numeric, enabling expansion of the number of codes to provide for recently recognised conditions and more detail about common diseases. Some diseases and groups of conditions moved from one chapter to another to reflect ideas of aetiology and pathology current at that time (Ashley, 1990).

However, there is an additional factor in mortality coding which can have a major impact on mortality statistics. This is the application of strict rules for selecting the underlying cause of death from all the conditions written on the death certificate. Changes in these rules usually have much larger effects on the number and proportion of deaths attributed to particular causes than any other change in classification (Rooney and Devis, 1996). Several of these rules changed substantially in ICD-10 (World Health Organisation, 1992).

As well as the need for an up-to-date classification, we need to be able to look at trends in cause-specific mortality over time. In order to do this, we must be able to measure the effect of the change in classification on the proportion of deaths attributed to different causes. This is done using sample ‘bridge coding’, where a sample of deaths is coded to both the old and new classifications, for example ICD-9 and ICD-10, and the results are compared using statistics known as ‘comparability ratios’. A large bridge coding exercise was undertaken by the Office for National Statistics (ONS) for the change from ICD-9 to ICD-10 (Rooney et al., 2002).

Since 2001, ONS has implemented a number of updates to ICD-10 to keep it up to date ahead of eventually introducing ICD-11. Again bridge coding exercises, of samples rather than a full year of death records, were carried out to ensure that trends in mortality could be analysed consistently over time.

This document describes how comparability ratios are calculated, along with their confidence intervals, and explains how to apply them to analyse mortality data over specific time periods. Details on the specific changes at particular time points since 2001 are also provided. References to documentation about the change from ICD-9 to ICD-10 are also provided for those wishing to analyse longer-term trends.

Calculation of comparability ratios

Comparability ratios are calculated as the number of deaths coded to an underlying cause in the new version divided by the number of deaths coded to the same underlying cause in the old version. Using the bridge coded datasets, users can estimate the impact of the software and coding changes over time, by calculating comparability ratios.

Example: Deaths in males over 75 from cerebrovascular diseases (including stroke)

Cerebrovascular diseases defined using ICD-10 codes I60 to I69.

• MUSE 5.5: Number of deaths coded to cerebrovascular diseases = 748

• IRIS: Number of deaths coded to cerebrovascular diseases = 773

The comparability ratio is: \[ \frac {748}{773} = 0.968 \]

In this example, the number of deaths from cerebrovascular diseases has decreased by 3.2% in the new version when compared with the same underlying cause in the old version. If the ratio had been 1, the number of deaths coded to cerebrovascular diseases would have been the same in both versions. That may not necessarily mean that no changes took place, however: some causes of death in the new version may ‘gain’ some deaths from some causes and in turn ‘lose’ some deaths to other causes. If these movements in and out are balanced, the numbers of deaths in both versions will be the same.
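The calculation above can be sketched in a few lines of Python. This is an illustrative sketch only: the function name `comparability_ratio` and its parameter names are assumptions made for this example, not part of any ONS tooling. The counts are those from the cerebrovascular example.

```python
def comparability_ratio(new_count: int, old_count: int) -> float:
    """Deaths coded to a cause in the new version divided by
    deaths coded to the same cause in the old version."""
    if old_count <= 0:
        raise ValueError("old_count must be positive")
    return new_count / old_count


# Males over 75, cerebrovascular diseases (I60 to I69):
# MUSE 5.5 is the new version, IRIS is the old version.
ratio = comparability_ratio(new_count=748, old_count=773)
percent_change = (ratio - 1) * 100

print(f"ratio = {ratio:.3f}, change = {percent_change:.1f}%")
# prints: ratio = 0.968, change = -3.2%
```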

The bridge coded datasets published by ONS cover data for all persons and all ages. These data have been used by analysts within the Department of Health and Social Care (and predecessor organisations) to calculate comparability ratios, split by sex and broad age group. This adjustment is particularly important, as for some causes, the bridge coded data indicate differences in the effects of the coding changes between sexes or age groups.

For the changes introduced in 2011, the age group split was under 65 and 65 and over. For the changes introduced in 2014, 2020 and 2022, the age group splits were under 75 and 75 and over.

The ratios are calculated together with confidence intervals, using the same method as ONS (Rooney et al., 2002). The following policy is then applied:

• If an initially calculated comparability ratio is greater than 0.99 and less than 1.01, the comparability ratio is set to 1, as adjusting for the impact of the coding change would have little effect on mortality rates for this cause group.

• If an initially calculated comparability ratio is less than or equal to 0.99 or greater than or equal to 1.01, the confidence interval is then checked. If the confidence interval includes 1, the comparability ratio is set to 1, as there is insufficient evidence of the impact of the coding change on mortality rates for this cause group.

• Otherwise, the initially calculated comparability ratio is used, rounded to 3 decimal places.
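The policy above can be expressed as a small decision function. The sketch below is hedged: it takes the confidence interval bounds as inputs rather than computing them, because the interval method (Rooney et al., 2002) is not reproduced in this document. The function name `apply_ratio_policy` and the interval values in the usage lines are hypothetical, chosen only to exercise each branch.

```python
def apply_ratio_policy(ratio: float, ci_lower: float, ci_upper: float) -> float:
    """Return the comparability ratio to use after applying the policy.

    ci_lower and ci_upper are assumed to have been calculated elsewhere.
    """
    # Step 1: ratios strictly between 0.99 and 1.01 are set to 1.
    if 0.99 < ratio < 1.01:
        return 1.0
    # Step 2: otherwise, set to 1 if the confidence interval includes 1.
    if ci_lower <= 1.0 <= ci_upper:
        return 1.0
    # Step 3: otherwise use the calculated ratio, rounded to 3 decimal places.
    return round(ratio, 3)


# Hypothetical confidence intervals, for illustration only.
print(apply_ratio_policy(1.005, 1.001, 1.009))      # close to 1, so set to 1
print(apply_ratio_policy(756 / 748, 0.990, 1.030))  # interval includes 1, so set to 1
print(apply_ratio_policy(748 / 773, 0.950, 0.985))  # ratio kept and rounded
```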

The changes to selection rules impact on the underlying cause of death, but do not affect how causes of death are recorded on the death certificate. For example, the number of deaths where dementia was mentioned on death certificates in 2014 will be comparable to the number mentioned in 2013. No adjustment is therefore needed when using multiple cause of death data to look at mentions of causes, provided that equivalent cause code groupings are used.

Application of comparability ratios for specific time periods

It is normally recommended that comparability ratios are applied to historic data, to make them comparable to data coded using the most recent version of the ICD. It is only necessary to use comparability ratios if looking at trends in causes of death or grouping years of mortality data together.

In reporting cause-specific trends, ONS have generally taken the position that they do not report adjusted numbers of deaths. Instead, they have tended to report actual numbers of deaths and added warnings to users to note the likely impact of ICD revisions. Where comparability ratios are applied, it should be made clear that these are ‘adjusted’ numbers.

The comparability ratios can also be used to adjust mortality rates. The ratio can either be applied directly to the rate or the numbers of deaths can be adjusted before the adjusted rate is calculated. It is recommended that the latter option is used, so that the adjusted rate and its confidence interval are both based on adjusted numbers of deaths. Since the ratios are different for different age groups and for males and females, applying an overall single comparability ratio to the calculated rate would require this to be calculated taking into account the age/sex distribution of deaths by cause. If the numbers of deaths are adjusted, however, the appropriate ratio can be applied to each age/sex-specific count.
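As a minimal sketch of the recommended approach, the example below applies a separate ratio to each age/sex-specific count and only then calculates a rate. All counts, populations and ratios are hypothetical. A crude rate per 100,000 is used purely to keep the example short, but the same adjusted counts could feed an age-standardised rate. The name `adjusted_crude_rate` is an assumption for this example.

```python
# All figures below are hypothetical and for illustration only.
strata = [
    # (sex, age group, observed deaths, population, comparability ratio)
    ("male", "under 75", 200, 400_000, 1.0),
    ("male", "75 and over", 800, 50_000, 0.968),
    ("female", "under 75", 150, 410_000, 1.0),
    ("female", "75 and over", 1_000, 70_000, 0.950),
]


def adjusted_crude_rate(strata, per=100_000):
    """Adjust each age/sex-specific count by its own ratio, then form the rate."""
    adjusted_deaths = sum(deaths * ratio for _, _, deaths, _, ratio in strata)
    population = sum(pop for _, _, _, pop, _ in strata)
    return adjusted_deaths, adjusted_deaths / population * per


adjusted_deaths, rate = adjusted_crude_rate(strata)
print(f"adjusted deaths = {adjusted_deaths:.1f}, rate = {rate:.1f} per 100,000")
```

Because the adjustment happens at the level of the counts, no single overall ratio has to be derived from the age/sex distribution of deaths.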

The published ratios for ICD-10 can be applied to mortality data in England and Wales back to 2001, the year that ICD-10 was introduced. If looking at trends across a range of years, Table 1 provides guidance on which version of the comparability ratios should be used for each time period. If analysis needs to go back before 2001, ONS published ratios can only be applied to ICD-9-coded data from 1993 to 2000 since there was a significant change between 1992 and 1993 brought about by the introduction of automated cause coding.

Table 1: Guidance on the version of comparability ratios to use for particular time periods

Year of death registrations	Rules for applying comparability ratios to counts of deaths
2022 onwards	Counts of deaths remain the same
2020 to 2021	Counts of deaths are multiplied by comparability ratios based on bridge coding data between MUSE 5.5 and MUSE 5.8
2014 to 2019	Counts of deaths are multiplied by two sets of comparability ratios, based on bridge coding data between MUSE 5.5 and MUSE 5.8, and IRIS and MUSE 5.5
2011 to 2013	Counts of deaths are multiplied by three sets of comparability ratios, based on the bridge coding data between MUSE 5.5 and MUSE 5.8, IRIS and MUSE 5.5, and ICD-10 v2010 and IRIS
2001 to 2010	Counts of deaths are multiplied by four sets of comparability ratios, based on the bridge coding data between MUSE 5.5 and MUSE 5.8, IRIS and MUSE 5.5, ICD-10 v2010 and IRIS, and ICD-10 v2001.2 and ICD-10 v2010
1993 to 2000	Counts of deaths are multiplied by five sets of comparability ratios, based on the bridge coding data between MUSE 5.5 and MUSE 5.8, IRIS and MUSE 5.5, ICD-10 v2010 and IRIS, ICD-10 v2001.2 and ICD-10 v2010, and ICD-10 v2001.2 and ICD-9

Example: Deaths in males over 75 from cerebrovascular diseases (including stroke) in 2009

In 2009, there were 11,987 deaths recorded in males over 75 from cerebrovascular diseases (including stroke). If producing trends over time in deaths from cerebrovascular disease, the 11,987 deaths would need to be multiplied by four sets of comparability ratios. For this specific cause, sex and age group, the comparability ratios are as follows:

• MUSE 5.8: 1

• MUSE 5.5: 0.968

• IRIS: 1.017

• v2010: 0.880

When comparing with the latest deaths, the count of deaths in 2009 would be:

\[ 11{,}987 \times 1 \times 0.968 \times 1.017 \times 0.880 \approx 10{,}385 \]

Although the comparability ratio for MUSE 5.8 is 1, the numbers of deaths in the bridge coded dataset were not equal: 756 deaths from cerebrovascular diseases were recorded for males over 75 using MUSE 5.8, compared with 748 deaths using MUSE 5.5. The initially calculated ratio (756/748, approximately 1.011) was set to 1 because its confidence interval included 1.
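The rules in Table 1 and the worked example can be combined into a short sketch. The labels for each ratio set follow those used in the example ("MUSE 5.8", "MUSE 5.5", "IRIS", "v2010"). The label "v2001.2" for the ICD-9 to ICD-10 ratios, and the function names `ratio_sets_for_year` and `adjust_count`, are assumptions introduced for this illustration.

```python
import math


def ratio_sets_for_year(year: int) -> list[str]:
    """Return the ratio sets to apply to a registration year, following Table 1."""
    if year >= 2022:
        return []
    if year >= 2020:
        return ["MUSE 5.8"]
    if year >= 2014:
        return ["MUSE 5.8", "MUSE 5.5"]
    if year >= 2011:
        return ["MUSE 5.8", "MUSE 5.5", "IRIS"]
    if year >= 2001:
        return ["MUSE 5.8", "MUSE 5.5", "IRIS", "v2010"]
    if year >= 1993:
        # "v2001.2" is a hypothetical label for the ICD-9 to ICD-10 ratios.
        return ["MUSE 5.8", "MUSE 5.5", "IRIS", "v2010", "v2001.2"]
    raise ValueError("published ratios cannot be applied before 1993")


def adjust_count(count: float, year: int, ratios: dict[str, float]) -> float:
    """Multiply a count by every ratio set that applies to its year."""
    factor = math.prod(ratios[name] for name in ratio_sets_for_year(year))
    return count * factor


# Ratios for males over 75, cerebrovascular diseases, from the example above.
ratios = {"MUSE 5.8": 1.0, "MUSE 5.5": 0.968, "IRIS": 1.017, "v2010": 0.880}

print(round(adjust_count(11_987, 2009, ratios)))  # prints 10385
```

Counts registered from 2022 onwards are returned unchanged, because no ratio sets apply to them.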

Calculation of confidence intervals

When adjusted counts of deaths are used to calculate rates or ratios, there are two potential ways of calculating confidence intervals for the statistics:

• the adjusted counts can be used to determine the confidence intervals as if they were actual observed counts

• the rate or ratio (\(R\)) and its confidence interval (\(R_L\), \(R_U\)) can be calculated using the original counts, and the confidence interval applied to the adjusted rate or ratio (\(R^*\)) as:

\[ R_L^* = \frac{R_L \times R^*}{R}, \quad R_U^* = \frac{R_U \times R^*}{R} \]

While the second method is arguably more accurate, as it uses the actual observed counts to define the confidence interval, it will make little difference if the ratios are fairly close to 1, as they are in the majority of cases. Hence it is recommended, pragmatically, that the adjusted counts are treated as if they were observed counts, and the confidence intervals calculated using the standard methods.
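For readers who do want to use the second method, the rescaling can be sketched as follows. The function name `scale_confidence_interval` and the numeric values are hypothetical. The code simply multiplies each original limit by the adjusted rate and divides by the original rate.

```python
def scale_confidence_interval(rate, lower, upper, adjusted_rate):
    """Rescale a confidence interval from the original rate R to the adjusted rate R*.

    Implements R_L* = R_L x R* / R and R_U* = R_U x R* / R.
    """
    if rate <= 0:
        raise ValueError("rate must be positive")
    return lower * adjusted_rate / rate, upper * adjusted_rate / rate


# Hypothetical values: original rate 100 (90 to 110), adjusted rate 95.
adj_lower, adj_upper = scale_confidence_interval(100.0, 90.0, 110.0, 95.0)
print(adj_lower, adj_upper)
```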

Attributable fractions

Comparability ratios can be used to adjust numbers of deaths for the calculation of mortality indicators which involve the application of attributable fractions, for example to estimate numbers of smoking-related or alcohol-related deaths. In these cases, the comparability ratios should be used to adjust numbers of deaths separately for each cause, or group of causes, within the definition. Attributable fractions can then be applied to these adjusted numbers. Where the attributable fraction is 1 (for example for causes in which all deaths are accepted as alcohol-related) these can be adjusted as a single group of causes.
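The order of operations described above (adjust each cause first, then apply its attributable fraction) can be sketched as below. The cause labels, counts, ratios and fractions are all hypothetical and do not come from any published indicator definition.

```python
# Hypothetical causes within an indicator definition, for illustration only.
causes = [
    # (cause group, observed deaths, comparability ratio, attributable fraction)
    ("cause A", 500, 0.968, 0.30),
    ("cause B", 200, 1.017, 0.50),
    ("wholly attributable causes", 80, 1.0, 1.0),  # adjusted as a single group
]


def attributable_deaths(causes):
    """Adjust each cause by its own ratio, then apply its attributable fraction."""
    return sum(deaths * ratio * fraction for _, deaths, ratio, fraction in causes)


total = attributable_deaths(causes)
print(f"adjusted attributable deaths = {total:.1f}")
```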

Details of software in use from 2001 onwards

From January 2001 until December 2010, ONS used the Mortality Medical Data System (MMDS) ICD-10 v2001.2 software provided by the US National Center for Health Statistics (NCHS) to code cause of death. Since then:

• ONS updated to ICD-10 v2010 in January 2011 and incorporated most of the World Health Organisation (WHO) amendments authorised up to 2009

• IRIS version 2013 was introduced in January 2014, which incorporated all updates to ICD-10 authorised by the WHO up to the end of 2013

• In January 2017, ONS updated to IRIS 4.2.3 software, which incorporated all changes to ICD-10 that had been authorised by the WHO up to the end of 2016, but did not produce a bridge coded dataset to compare differences between IRIS 2013 and IRIS 4.2.3

• In January 2020, ONS updated to a successor of IRIS, Multi-causal and Uni-causal Selection Engine (MUSE) 5.5 (IRIS version 5.5), which incorporated all changes to ICD-10 that had been authorised by the WHO up to the end of 2018

• In January 2022, ONS moved to MUSE 5.8, which incorporates ICD-10 changes authorised up to 2019.

Changes in 2001

Information on changes between ICD-9 and ICD-10 can be found in a number of ONS publications from Health Statistics Quarterly (Griffiths et al., 2004) (Brock et al., 2004) (Brock et al., 2006) (Griffiths and Rooney, 2003).

Changes in 2011

Changes introduced using ICD-10 v2010 are as follows:

• A small number of new codes and some expansions of existing codes, at the fourth digit level of the ICD code.

• Amendments to the modification tables and selection rules. Modification tables and selection rules are used to ascertain a causal sequence and consistently assign underlying cause of death from the conditions recorded on the death certificate. Overall, the impact of these changes is small although some cause groups are affected more than others.

Bridge coding of 55,280 deaths registered in 2009 in England and Wales showed that most deaths (around 95 per cent) remained in the same chapters. However, there were movements in and out of some chapters reflecting the changes in the selection of underlying cause from the combination of codes given on the death certificate. There is a report detailing these changes and a bridge coded data set is available.

Changes in 2014

Changes introduced using the IRIS (2013 and 4.2.3) software are as follows:

• Major updates to the ICD-10 approved by WHO. These include changes to the use of codes in the cancer Chapters C and D (Neoplasms).

• A small number of changes to the coding of other specific conditions to bring previous coding practice into line with international coding rules.

Bridge coding of 38,718 deaths registered in 2012 in England and Wales showed statistically significant percentage increases in the deaths allocated to an underlying cause in seven ICD-10 chapters, and significant decreases for five chapters when coded in ICD-10 v2013 (IRIS). However, 95% of deaths remained in the same chapter. There is a report detailing these changes and a bridge coded dataset is available.

Changes in 2020

Changes introduced using the MUSE 5.5 software are as follows:

• Increased automation of coding compared with previous software. A coding test conducted on 3,844 records (in November 2018) found that IRIS 4.2.3 automatically coded 76.7% (2,947 records) whereas MUSE 5.5 automatically coded 80.4% (3,089 records).

• Deaths allocated an underlying cause of death in the ICD-10 chapter, Certain infectious and parasitic diseases, decreased by 19.8%; this was largely because of a change in the coding rules to count more deaths with an immediate cause of sepsis as resulting from a serious health condition that preceded the infection.

• Changes in the coding of deaths with an immediate respiratory cause but an underlying degenerative condition (such as Parkinson’s disease) contributed to an increase of 5.8% in deaths allocated to the ICD-10 chapter Diseases of the nervous system.

However, when comparing a sample of deaths bridge coded using the existing IRIS 4.2.3 software and the updated MUSE 5.5 version, over 98% of deaths remained in the same ICD-10 chapter in both versions. There is a report detailing these changes and a bridge coded dataset is available.

Changes in 2022

Changes introduced using the MUSE 5.8 software are as follows:

• There are two ICD-10 chapters with strong evidence of change in the frequency of underlying causes of death within them: Chapter I Certain infectious and parasitic diseases decreased by 4% between the two versions and Chapter VI Diseases of the nervous system increased by 1%.

• There are five leading causes of death with strong evidence of change in the frequency of underlying causes within them and a further 11 with evidence of change.

• The most significant change among the leading causes of death was in Disorders of fluid, electrolyte and acid-base balance (dehydration), which had a net loss of 44%; this mostly corresponded with a net gain of 24% to Mental and behavioural disorders due to psychoactive substance use.

• The second most significant change was Parkinson’s disease with a net gain of 3%, which mostly corresponded with a net loss of 4% from Acute respiratory infections other than influenza and pneumonia.

However, when comparing a sample of deaths bridge coded using the previous Multi-causal and Uni-causal Selection Engine (MUSE) 5.5 and the updated MUSE 5.8 version, 99.2% of underlying causes of death remained in the same International Classification of Diseases, Tenth Revision (ICD-10) chapter designation, and 98.6% remained in the same leading cause of death designation. There is a report detailing these changes. ONS have not published a bridge coded dataset, but have published indicative comparability ratios for underlying causes of death by ICD-10 chapter and leading causes of death instead.

Page last updated: August 2025