Two Sides of the AI/ML Coin in Health Care
Kathryn Marchesini, Jeff Smith, and Jordan Everson | October 19, 2022
As we’ve previously discussed, algorithms (step-by-step instructions, or rules, to perform a task or solve a problem, especially by a computer) have been widely used in health care for decades. One clear use of these algorithms is in evidence-based, clinical decision support interventions (DSIs). Today, we see rapid growth in data-based, predictive DSIs, which use models created with machine learning (ML) algorithms or other statistical approaches that analyze large volumes of real-world data (called “training data”) to find patterns and make recommendations. While both evidence-based and predictive DSI types (models) could be used to address the same problem, they rely on different logic that’s “baked into” their software.
But before we explore the two approaches, we should first revisit a key challenge posed in our previous post in this series: capitalizing on the potential of artificial intelligence (AI), particularly ML and related technologies, while avoiding risks (such as potential harm to a patient) from these technologies. In this blog post, we’ll dig a little deeper into what some of those risks are and where those risks can originate.
Evidence-Based vs. Predictive Decision Support Interventions
DSIs that use evidence-based guidelines or other expert consensus generate recommendations based on how the world should work. Generally, they represent the implementation of expert consensus emerging from high-quality clinical trials, observational studies, and other research. Evidence-based DSIs are usually “fixed rules”: essentially, a series of “if-then” statements that form an algorithm. For instance, “if a woman is between the ages of 45 and 54 and at average risk of breast cancer, then she should get a mammogram every year.”
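To make the “fixed rules” idea concrete, here is a minimal sketch of that mammogram rule expressed as an if-then statement in code. The function name, inputs, and return strings are our own illustrative assumptions, not part of any clinical specification.

```python
# A minimal sketch of an evidence-based DSI: the guideline logic lives in
# fixed "if-then" rules written by people, not learned from data. The rule
# below simply restates the mammogram example from the text.

def mammogram_recommendation(age: int, sex: str, breast_cancer_risk: str) -> str:
    """Return a screening recommendation from a fixed, expert-authored rule."""
    if sex == "female" and 45 <= age <= 54 and breast_cancer_risk == "average":
        return "Recommend annual screening mammogram."
    return "No recommendation from this rule; consult other guidelines."

print(mammogram_recommendation(age=50, sex="female", breast_cancer_risk="average"))
```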
Predictive DSIs, by contrast, generate recommendations (outputs) to support decision-making based on recognized patterns in the way the world actually works, filling in knowledge gaps with real-world data. It’s then up to humans to determine a recommendation’s relevance in a given context. This makes predictive DSIs powerful tools because they can, at least in theory, be used to predict anything about which the technology collects data: whether an image looks like a tumor, whether a patient is likely to develop a specific disease, or whether a patient is likely to make it to their next appointment, to name a few. In part because expert clinical guidelines have not been established for many topics, predictive DSIs can provide important guidance on a wide range of topics that evidence-based DSIs currently do not touch. At their best, predictive DSIs can identify patterns in data earlier or more precisely than health care professionals, or even uncover patterns not previously known, and recommend decisions across many facets of health care.
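For contrast, here is a minimal sketch of what a predictive DSI might look like under the hood, using the appointment no-show example above. Everything in it, the features, the synthetic training data, and the model choice (a simple logistic regression), is an illustrative assumption; real predictive DSIs are trained on large volumes of real-world data.

```python
# A minimal sketch of a predictive DSI: instead of hand-written rules, a
# model learns patterns from historical ("training") data. The no-show
# task, feature names, and fabricated data here are purely illustrative.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

# Synthetic training data: days between scheduling and the appointment,
# and the patient's count of prior missed appointments.
days_out = rng.integers(0, 60, size=500)
prior_no_shows = rng.integers(0, 5, size=500)
X_train = np.column_stack([days_out, prior_no_shows]).astype(float)

# Fabricated labels for illustration: longer lead times and more prior
# no-shows make a missed appointment more likely.
p = 1 / (1 + np.exp(-(0.03 * days_out + 0.5 * prior_no_shows - 3)))
y_train = rng.random(500) < p

model = LogisticRegression().fit(X_train, y_train)

# The model outputs a probability (a prediction), not a clinical action;
# a human still decides what that output means in context.
new_patient = [[30, 2]]  # scheduled 30 days out, 2 prior no-shows
print(f"Estimated no-show risk: {model.predict_proba(new_patient)[0, 1]:.0%}")
```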
Magnifying Existing Risks Resulting from Emerging Technology
While predictive DSIs have enormous potential to improve many aspects of health care, they also present several potential risks that could lead to adverse impacts or outcomes. These risks may be magnified because of their potential to “learn” rapidly and produce predictions across many hundreds or thousands of patients. In particular, predictive DSIs in health care can:
- Reproduce or amplify implicit and structural biases of society, health, and health care delivery, as they are captured in the underlying training data. This can lead to predictions or recommendations that are unfair or biased. It could also lead to the technology performing differently among certain patients, populations, and communities without the user’s knowledge, potentially leading to patient harm, widening health disparities, discrimination, inefficient resource allocation decisions, or poorly informed clinical decision-making.
- Magnify existing concerns about the ethical, legal, and social implications of underlying data practices (collection, management, and use). Whenever health data are collected, managed, and used, there are information privacy, security, and stewardship concerns, including those pertaining to confidentiality, anonymity, and control over the use of information about an individual (potential misuse of information; unexpected or adversarial use). The potential for predictive DSIs to use health data in novel ways heightens these concerns.
- Reinforce common, non-evidence-based practices. While bias is a high-profile example of how predictive DSIs might learn and reinforce bad practices, more generally, predictive DSIs may reinforce the tendency to do something a certain way because that’s the way it has always been done, even without supporting evidence of benefit. Because predictive DSIs learn from what is commonly done, not necessarily what is best, their use could slow the adoption of innovations and updated best practices by continuing to recommend widespread practices even after they become obsolete. Cognitive psychology shows that recommendations from predictive DSIs can entrench widespread practices by making them the default option (default bias) or through over-reliance on automation (automation bias).
- Bake in existing, inexplicable differences in health care and health outcomes. Given the widely noted, extreme variation in health care, even across small geographic areas, directly inferring what will happen here based on similar patterns in “messy” real-world data from over there is a risky proposition. That risk is even greater when the underlying data are of low quality or integrity. This can lead to DSIs making invalid or unreliable predictions, especially when the underlying model bases its predictions on patterns in training data that differ from patterns in data from the local context where the model is used, a problem often described as a lack of robustness (see the sketch after this list).
- Use “black box” or opaque algorithms, making it impossible to tell exactly how they arrive at a decision, including how input data are combined, counted, or weighted to produce the model’s prediction, classification, or recommendation. They are also based on predictive algorithms and models designed to predict a missing value rather than to state directly what action should be taken. Both facets can reduce the intelligibility of model outputs to end users, making it easy to misinterpret what a model output means and raising the risk that predictive DSIs are used in settings where they are not appropriate.
- Lead to recommendations that are ineffective or unsafe, meaning that the risks described above outweigh any potential benefits.
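As one concrete illustration of the robustness concern in the fourth bullet, here is a minimal sketch of a pre-deployment check that compares the distribution of a single feature in the training data with data from the local site where the model would be used. The feature (patient age), the synthetic data, and the significance threshold are illustrative assumptions, not a prescribed validation method; real local validation would be far more extensive.

```python
# A minimal sketch of one robustness check: compare a feature's
# distribution in the original training data against data from the local
# deployment site. All data and thresholds here are illustrative.
import numpy as np
from scipy.stats import ks_2samp

rng = np.random.default_rng(0)

# Hypothetical patient ages: the training site skews younger than the
# local site, a simple stand-in for "patterns over there vs. here".
training_ages = rng.normal(loc=45, scale=12, size=2000)
local_ages = rng.normal(loc=58, scale=15, size=400)

# Two-sample Kolmogorov-Smirnov test: a small p-value signals the samples
# are unlikely to come from the same distribution (possible shift).
stat, p_value = ks_2samp(training_ages, local_ages)
if p_value < 0.01:  # illustrative threshold
    print(f"Possible distribution shift (KS statistic={stat:.2f}); "
          "validate the model locally before relying on its predictions.")
```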
Given that data play a critical role in predictive DSIs, common data challenges in software development (e.g., quality and integrity) can also directly affect the successful development and use of a predictive DSI. Harm can also stem from absent or inconsistent data governance: the policies and controls for how data are acquired, managed, and used across the lifecycle of the predictive DSI.
At ONC, we’ve taken to calling high-quality predictive DSIs that minimize these risks FAVES: Fair, Appropriate, Valid, Effective, and Safe. We introduced some of these terms in our first blog post; in our next one, we will discuss what we see as a defining challenge inhibiting the optimization of predictive DSIs in health care, along with ways to know and show that a predictive DSI is FAVES.
This is part of the Artificial Intelligence & Machine Learning Blog Series.