Death certificates serve as narratives explaining individual deaths. They are rooted in fact, but their significance can shift depending on which details are included, omitted, or misunderstood. This issue has been especially contentious in efforts to estimate COVID-related deaths. Beyond the often debated language of dying “from” COVID versus dying “with” it, there is also the common problem that certifiers might not fully know or understand the chain of events leading to a person’s death.
That uncertainty helps explain why researchers have looked for other ways to count the pandemic’s toll.
Researchers have turned to machine learning, tantalizingly billed as "less biased," for a recount of the dead, as reported in Science Advances. Let's break down the study and its findings, then offer a critique.
Why Counting COVID Deaths Was Never Simple
In the post-mortem on the pandemic, excess mortality, the number of deaths beyond the usual, serves as a metric for assessing the effectiveness of our public health interventions. No excess earns a perfect 10 for prevention, while increasingly higher excess mortality signals a public health failure. A key issue within this metric is how to define the "usual" number of deaths. The go-to baseline is all-cause mortality, which "include[s] deaths from external causes (e.g., intentional or unintentional injuries)," so excess mortality specifically attributable to COVID can be under-reported.
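As a toy illustration, with entirely made-up numbers rather than figures from the study, excess mortality is simply observed deaths minus an expected baseline. Real analyses estimate the baseline with seasonal regression models; a simple average of prior years is used here only to show the arithmetic:

```python
# Toy excess-mortality calculation. All numbers are invented for illustration.
# Real analyses model the expected baseline with seasonal regression, not a
# plain average of prior years.

prior_years = [2_850_000, 2_870_000, 2_900_000]  # hypothetical annual all-cause deaths
observed_2020 = 3_380_000                        # hypothetical pandemic-year total

expected = sum(prior_years) / len(prior_years)   # naive "usual" baseline
excess = observed_2020 - expected                # deaths beyond the usual

print(f"Expected deaths: {expected:,.0f}")
print(f"Excess deaths:   {excess:,.0f}")
```

Because the baseline is all-cause, any unusual dip or spike in non-COVID deaths (injuries, overdoses, deferred care) is folded into the excess, which is exactly the ambiguity the article describes.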
The researchers trained an algorithm on hospital deaths attributed to COVID, since most patients were routinely tested for the virus, making those attributions our most accurate. They then applied the algorithm to more chaotic data, out-of-hospital deaths where COVID testing was unlikely, potentially providing a clearer picture of the pandemic's true impact. Their key findings:
- The analysis indicates that many COVID-19 deaths went uncounted—likely in the hundreds of thousands—meaning the actual death toll from the pandemic was significantly higher than official reports.
- These unrecognized deaths mainly occurred outside hospitals, especially at home, at rates far higher than official records captured, reflecting gaps in testing and diagnosis in those settings.
- The likelihood of missing deaths varied by location and time, with greater undercounting in specific regions (notably parts of the South) and during earlier phases of the pandemic.
- Undercounting was unevenly distributed, impacting socially and economically disadvantaged groups and communities, obscuring the extent of health inequities during the pandemic.
- The undercounted deaths were often attributed to alternative causes in these populations, e.g., cardiovascular disease or diabetes. In short, the algorithm identified deaths with COVID (even if not attributed on the death certificate) as deaths from COVID.
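The train-on-hospital, apply-out-of-hospital setup described above can be sketched roughly as follows. This is a minimal illustration using synthetic data, an invented set of features, and a plain logistic-regression classifier; the paper's actual model, inputs, and estimates are not described in this piece:

```python
# Sketch of training on labeled hospital deaths, then scoring out-of-hospital
# deaths. Features, data, and model choice are all assumptions for illustration.
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical hospital records: three invented features (say, age, a
# comorbidity score, local case rate) and a label indicating whether the
# death was attributed to COVID on the certificate.
X_hosp = rng.normal(size=(500, 3))
true_w = np.array([1.0, 0.5, 2.0])                      # synthetic signal
y_hosp = (X_hosp @ true_w + rng.normal(size=500) > 0).astype(float)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Fit logistic regression by gradient descent on the hospital "ground truth".
w = np.zeros(3)
for _ in range(2000):
    grad = X_hosp.T @ (sigmoid(X_hosp @ w) - y_hosp) / len(y_hosp)
    w -= 0.5 * grad

# Apply the hospital-trained model to out-of-hospital deaths, where testing
# was rare, to score how likely each was an unrecognized COVID death.
X_home = rng.normal(size=(200, 3))
p_covid = sigmoid(X_home @ w)
estimated_missed = p_covid.sum()  # expected count of unrecognized deaths
```

The sketch also makes the study's central gamble concrete: the weights `w` are learned entirely from hospital records, and everything hinges on those weights remaining valid for `X_home`, a population that may differ in ways the features never capture.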
The researchers conclude that COVID-19 deaths were significantly undercounted, unevenly distributed, and disproportionately concentrated among disadvantaged communities. However, their results and conclusions depend on two assumptions.
Where the Model May Go Wrong
The first assumption is that hospital deaths attributed to COVID were accurate enough to serve as the model’s reference standard, or “ground truth.” Routine PCR testing made those hospital cases more reliable than deaths outside medical settings, but it did not resolve the longstanding question of dying “from” COVID versus dying “with” it. Some critics have also argued that reimbursement policies may have influenced how cases were classified, though that remains a matter of debate rather than settled evidence of systematic distortion.
Hospital death certificates remain the story we tell, and writing that story frequently falls to the most junior member of the clinical hierarchy, such as the intern or second-year resident, who has had limited formal training in this medico-legal task and may have only superficial knowledge of the patient. In a study comparing autopsy findings with death certificates completed by more experienced practitioners, attending physicians or fellows, 30% of cases had a Grade III error, defined as "failure to list the underlying condition/disease that initiated the lethal cascade of events."
If the model’s starting point was imperfect, its second challenge was whether those hospital patterns could be generalized beyond the hospital at all.
The second assumption is that patterns learned from hospital deaths can be applied to deaths outside hospitals. The researchers partly based that assumption on their earlier work showing a temporal link between reported COVID cases and deaths assigned to other causes. However, a shared timeline does not necessarily mean the two settings are the same in all respects. Out-of-hospital deaths often involve populations with different circumstances, such as lower income, less education, poorer access to care, and more crowded living conditions. These differences could influence both how people die and how their deaths are recorded.
To their credit, the researchers acknowledge both assumptions and explicitly state that the second is the greatest potential source of bias in their approach. That caveat matters and likely warrants more prominence in the main presentation of the findings rather than being mainly in the supplemental materials.
“Under-counting inequities in COVID-19 deaths can be viewed as both a manifestation of structural racism, ableism, and classism and as a mechanism preventing responsive policy action.”
- Coauthor Dielle Lundberg, research fellow, Department of Global Health at Boston University School of Public Health
The study’s broader social interpretation may still be plausible, but it depends on assumptions that are more difficult to test.
What might we conclude?
What the Study Can—and Cannot—Tell Us
The study makes a persuasive case that official COVID-19 death counts likely overlooked a significant number of deaths, especially outside hospitals. It also indicates that these missed deaths were not random but followed patterns based on location, time, and socioeconomic disadvantage. At the same time, the analysis cannot determine an exact revised death toll, nor can it fully explain why these deaths were missed.
Machine Learning Is Not a Panacea
There is a broader lesson here. Machine learning can uncover patterns in large datasets that would otherwise remain unseen, but it cannot escape the limitations of the data it is trained on. When those inputs are incomplete or when learned patterns do not transfer smoothly across different settings, the resulting estimates become highly dependent on assumptions rather than solid facts.
Source: Applying machine learning to identify unrecognized COVID-19 deaths recorded as other causes of death in the United States. Science Advances. DOI: 10.1126/sciadv.aef5697
