Unregulated Algorithms in Healthcare – EPIC and Sepsis

Sepsis is the body’s overwhelming response to an infection, whether bacterial, viral, or fungal. It requires immediate medical attention and intervention. Epic, the company with the largest share of the electronic medical records market, developed an algorithm to help physicians identify at-risk patients promptly. An independent study shows that it is not helpful. Is this healthcare’s 737 Max moment?

Detecting sepsis early, so that therapy and heightened surveillance can begin, saves lives. One of the core CMS quality measures is how quickly a patient is started on antibiotics, with hospitals penalized for acting too slowly. Algorithms that search hospital electronic health records (EHR) have been developed to speed the identification of these patients, and the one created by Epic Systems Corporation, one of the world’s largest EHR software vendors, has been widely deployed. The algorithm is based on 405,000 patient encounters across three hospital systems between 2013 and 2015. It was internally validated by Epic, but

“owing to the proprietary nature of the ESM [Epic Sepsis Model], only limited information is publicly available about the model’s performance, and no independent validations have been published to date.”

This is much like the software upgrades to the 737 Max’s MCAS, which were not well tested and ended in the deaths of 300+ individuals.

A study reported in JAMA Internal Medicine in 2021 performed an “external” validation at the University of Michigan’s health system. The results were not pretty. The researchers looked at roughly 28,000 patients over about 38,000 hospitalizations, with prediction scores from the ESM algorithm generated every 15 minutes from arrival in the ED through the hospitalization. Sepsis developed in 7% of those hospitalizations (2,552).

The algorithm’s alert threshold can be tuned to identify more cases, but at the cost of more false positives. The researchers set the threshold in a middle range, within the guidance offered by Epic.
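The tuning trade-off can be made concrete with a small sketch. The scores and outcomes below are invented purely for illustration; they are not data from the study or from Epic, and the function name `evaluate` is hypothetical. Lowering the alert threshold catches more true cases but flags more patients who do not have sepsis.

```python
# Hypothetical (risk_score, has_sepsis) pairs, invented for illustration only.
# They are NOT taken from the study or from the Epic Sepsis Model.
TOY_PATIENTS = [
    (0.95, True), (0.80, False), (0.70, True), (0.60, False), (0.55, False),
    (0.40, True), (0.35, False), (0.30, False), (0.20, True), (0.10, False),
]


def evaluate(threshold, patients=TOY_PATIENTS):
    """Return (sensitivity, false_positives, ppv) when alerting at score >= threshold."""
    flagged = [has_sepsis for score, has_sepsis in patients if score >= threshold]
    true_positives = sum(flagged)
    false_positives = len(flagged) - true_positives
    total_sepsis = sum(has_sepsis for _, has_sepsis in patients)
    sensitivity = true_positives / total_sepsis
    # Positive predictive value: share of alerts that are real sepsis cases.
    ppv = true_positives / len(flagged) if flagged else 0.0
    return sensitivity, false_positives, ppv


if __name__ == "__main__":
    for threshold in (0.75, 0.50, 0.25):
        sensitivity, false_positives, ppv = evaluate(threshold)
        print(f"threshold={threshold:.2f}  sensitivity={sensitivity:.2f}  "
              f"false positives={false_positives}  PPV={ppv:.3f}")
```

On this toy data, moving the threshold from 0.75 down to 0.25 raises sensitivity from 0.25 to 0.75 while false positives climb from 1 to 5.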

  • Of the 2,552 patients with sepsis, the ESM algorithm identified 7% whom the clinicians had “missed,” and even those patients did not receive timely antibiotic administration. [1]
  • 67% of patients with sepsis were not identified by the algorithm but were identified by the clinical staff and appropriately treated.
  • False positives by the algorithm meant that eight patients had to be evaluated to identify one patient who truly had sepsis. False positives create needless “busywork” and alarm fatigue, which is what happens when you cry wolf once too often.
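These figures translate into standard screening metrics with a little arithmetic. The sketch below uses only the numbers quoted above; the names are illustrative, and reading “eight evaluated per true case” as a positive predictive value of 1 in 8 is an interpretation, not something the study is quoted as saying. The patient counts are approximate because the percentages are rounded.

```python
# Figures quoted in this article from the University of Michigan validation.
SEPSIS_PATIENTS = 2552            # patients who developed sepsis
MISSED_BY_ALGORITHM = 0.67        # share of sepsis patients the ESM did not flag
CAUGHT_ONLY_BY_ALGORITHM = 0.07   # share flagged by the ESM but "missed" by clinicians
EVALUATED_PER_TRUE_CASE = 8       # patients evaluated to find one true sepsis case


def summarize():
    """Derive approximate screening metrics from the quoted figures."""
    sensitivity = 1 - MISSED_BY_ALGORITHM   # share of sepsis cases flagged
    ppv = 1 / EVALUATED_PER_TRUE_CASE       # assumption: 1 true case per 8 alerts
    return {
        "sensitivity": round(sensitivity, 2),
        "ppv": round(ppv, 3),
        "missed_patients": round(SEPSIS_PATIENTS * MISSED_BY_ALGORITHM),
        "algorithm_only_patients": round(SEPSIS_PATIENTS * CAUGHT_ONLY_BY_ALGORITHM),
        "false_alarms_per_true_case": EVALUATED_PER_TRUE_CASE - 1,
    }


if __name__ == "__main__":
    for metric, value in summarize().items():
        print(f"{metric}: {value}")
```

Put this way, the algorithm flagged about a third of the sepsis cases, and roughly seven of every eight alerts were false alarms.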

It is hard to see the value of such an algorithm.

“The increase and growth in deployment of proprietary models has led to an underbelly of confidential, non–peer-reviewed model performance documents that may not accurately reflect real-world model performance. Owing to the ease of integration within the EHR and loose federal regulations, hundreds of US hospitals have begun using these algorithms.” [emphasis added]

Before looking at those federal regulations, we should note that the researchers pointed out that their findings were limited because they looked at only one institution. Epic has subsequently altered its algorithm to consider the case mix and sepsis presentation at the hospital deploying the ESM. Personalizing the algorithm may improve its performance, but who will be the judge?

The FDA responds – sort of, maybe.

This past week, the FDA provided guidance on evaluating these software algorithms. For years, computers have provided a range of “clinical decision support” (CDS) services. These include reminders to physicians of possible drug-to-drug interactions among the medications they prescribe and recommendations for standard screening or testing. Even those 5 or 6 pages of boilerplate you receive after visiting your doctor are considered CDS.

The FDA is making a distinction between devices like a glucometer and software like the ones identifying possible adverse drug interactions. In bureaucratic language, there are four criteria for what is not a device.

  • It is not a device if it is not intended to further analyze or process a medical image, pattern, or signal from an instrument. For example, a smart glucometer that measures your blood sugar and retains the information to identify a pattern is a device.
  • It is not a device if it simply displays or prints information – those discharge instructions and review of your doctor visit are not devices.
  • It is not a device if it supplies information “to enhance, inform and/or influence a health care decision but is not intended to replace or direct the HCP’s [health care provider] judgment.” In other words, it is not a device if you cannot claim “the algorithm made me do it.” There is some further quibbling here: a single recommendation might serve to direct judgment, making the software a device, so it has to offer more than one recommendation. Additionally, the decision must not be time critical.
  • Finally, it is not a device if physicians can independently review the basis for the recommendations presented by the software, so that they rely primarily on their own judgment, rather than on those recommendations, to make clinical decisions for individual patients. An opaque algorithm is a device.
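The four criteria can be read as a checklist. The sketch below assumes that all four must hold for software to escape classification as a device. The class, field names, and the second example are hypothetical illustrations of this article’s paraphrase, not the FDA’s wording or an official test.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ClinicalSoftware:
    """Hypothetical description of a tool, using this article's four criteria."""
    name: str
    analyzes_image_pattern_or_signal: bool  # criterion 1
    displays_or_prints_information: bool    # criterion 2
    directs_judgment: bool                  # criterion 3 (e.g. a single directive)
    time_critical: bool                     # criterion 3
    basis_reviewable: bool                  # criterion 4


def failed_criteria(sw: ClinicalSoftware) -> List[str]:
    """List the non-device criteria the software fails (empty list: not a device)."""
    failures = []
    if sw.analyzes_image_pattern_or_signal:
        failures.append("1: analyzes a medical image, pattern, or signal")
    if not sw.displays_or_prints_information:
        failures.append("2: does more than display or print information")
    if sw.directs_judgment or sw.time_critical:
        failures.append("3: directs judgment or is time critical")
    if not sw.basis_reviewable:
        failures.append("4: basis for recommendations cannot be reviewed")
    return failures


def is_device(sw: ClinicalSoftware) -> bool:
    # Assumption: failing any one criterion makes the software a device.
    return len(failed_criteria(sw)) > 0


# The article's own example; only the first flag comes from the text,
# the remaining values are assumptions made for illustration.
glucometer = ClinicalSoftware("smart glucometer", True, True, False, False, True)

# Entirely hypothetical tool that satisfies all four criteria.
reminder = ClinicalSoftware("reviewable reminder", False, True, False, False, True)

if __name__ == "__main__":
    for tool in (glucometer, reminder):
        print(tool.name, "-> device" if is_device(tool) else "-> not a device",
              failed_criteria(tool))
```

Listing the failed criteria, rather than returning a bare yes or no, makes it clear which count a given tool fails on.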

The Epic ESM fails on multiple counts here: it looks for a pattern, its alarm is designed to prompt the physician to act in a time-sensitive manner, and its workings are opaque. Should Epic be concerned? No. Here is what the FDA includes in the preamble to this guidance.

“The FDA does not intend at this time to enforce compliance with applicable device requirements of the FD&C Act, including, but not limited to, premarket clearance and approval requirements.” [emphasis added]

What is the value of guidance if compliance is not enforced? Who is to be held responsible should a septic patient go unidentified by both the ESM and physicians? Why are federal regulators knowingly allowing Epic to continue to market a flawed medical device?

Fun Fact: The Apple Watch is an FDA-cleared medical device for detecting atrial fibrillation, a heart rhythm disturbance. But the Apple Watch’s pulse oximeter, which measures your oxygen level, is not FDA-cleared. It is just a “fun” add-on, as reliable as other pulse oximeters, which, parenthetically, are less reliable depending on one’s skin tone. Pulse oximeters like this are allowed because, channeling the loophole for supplements, they do not claim to “diagnose or treat” any disease.


[1] Time to receive antibiotics is a quality indicator, but it is impacted by any number of variables, including whether the antibiotic order is put directly into the computer or given to a nurse as a verbal order and then entered, how busy the pharmacy is in preparing medications, how long it takes to deliver the medicine from the pharmacy to the patient, and how long it takes for the antibiotic infusion to be started by the nurse.


Source: External Validation of a Widely Implemented Proprietary Sepsis Prediction Model in Hospitalized Patients. JAMA Internal Medicine. DOI: 10.1001/jamainternmed.2021.2626