Using algorithms to predict death: Lessons learned

Animals are extraordinary beings. Oscar the cat, for example. Adopted as a kitten by the staff at a nursing home in the United States, he grew up surrounded by people with end-stage dementia. At some point staff members noticed something unusual: Oscar did not interact much with people, but when he made himself comfortable next to one of the residents, that person invariably died within a few hours. After a while the nurses started informing family members when they saw Oscar curled up beside a patient. What sounds like an urban legend was in fact documented in an article in the respected New England Journal of Medicine in 2007. Oscar soon made headlines around the world. A book has even been written about the unusual feline.

That makes Oscar very good at something that challenges doctors and nurses every day: predicting when a person will die. Moreover, he’s amazingly accurate. Animals also sense when humans have illnesses such as tumors. Dogs, for example, can be trained to detect lung cancer by sniffing a patient’s breath. And the results are better than those provided by conventional diagnostics.

Researchers are now trying to discover how they do it. Presumably the animals can smell certain molecules in the patient’s breath or on various parts of the body. Although the actual mechanism is not yet known, what is clear is that the animals’ brains absorb and process complex information and then come to the relevant conclusion. That describes how algorithms work as well.

It was only a matter of time before researchers developed algorithmic systems that are capable, like Oscar, of predicting death. Two types are now generally used:

  1. Algorithms that analyze a patient’s vital signs, such as heart rate and blood oxygen levels, to predict the probability of death within a few hours – in order to save the patient’s life.
  2. Algorithms that use a pool of patient data, such as medical diagnoses, treatments or prescribed medications, to predict the probability of death within 3 to 12 months – allowing terminally ill patients to benefit from at-home palliative care.

In both cases, it is generally acknowledged that algorithm-based systems will be able to produce, on average, better predictions than humans can. And even if hospital staff use special scoring systems to assess risk or the probability of death, the scores’ accuracy still leaves a lot to be desired, which is why doctors and nurses generally rely on their own professional experience.

Over the course of their careers, medical practitioners treat thousands of patients. Yet it’s hard to pinpoint the exact knowledge they use as the basis for choosing one therapy over another. That is not the case with software, however, which runs on a dataset that is always accessible and constantly growing.

Whether it’s a matter of predicting avoidable problems early on and thus saving lives or accurately foreseeing the unavoidable and helping the seriously ill die in a dignified manner, algorithms can help doctors make decisions based not only on their subjective experiences, but on objective parameters as well. In addition, using algorithms can help medical staff to systematize decision-making processes, for example in hospitals.

Yet as useful as such systems might be, many people feel uncomfortable when they discover that software is being used to foretell a death. This is particularly true in the second example given above: when death is more or less inevitable and there is little that can be done to save the patient’s life. The oncologist and author Siddhartha Mukherjee describes exactly this feeling of discomfort in one of his essays. In other words, there are many questions that must still be answered and many aspects that require public discussion before such algorithms become widely used.

Algorithmic decision-making processes must be transparent

Two such “death algorithms,” both designed to predict whether seriously ill patients will die within the next 3 to 12 months, are currently the subject of discussion in the United States (see our blog entry Optimizing palliative care: When algorithms predict a patient’s death). One is still being developed and has been published in a study by researchers at Stanford University; the other is the property of private US-based palliative-care provider Aspire Health.

In contrast to the Stanford researchers, Aspire Health is a for-profit enterprise that specializes in home-based end-of-life care. That means it earns money when people with serious illnesses no longer receive curative care. The greater the number of people who use the palliative services provided by Aspire, the greater the company’s profit. The conflict of interest is readily apparent. To make things more problematic, no one outside of the company knows how the algorithm used by Aspire works, since it’s a trade secret. It thus remains unclear how the system decides when the company’s palliative services are the better choice for a patient.

One could easily presume that Aspire Health has optimized the algorithm to identify as many patients as possible who would benefit from palliative care. Yet increasing the sensitivity (the share of dying patients the algorithm correctly flags) generally results in lower specificity (the share of surviving patients it correctly leaves unflagged). That means the algorithm would wrongly flag too many patients who will actually live longer than one year, the upper limit for palliative care.
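The trade-off between sensitivity and specificity can be illustrated with a small sketch. Everything in it is an assumption made for illustration: the risk scores, the outcomes, the thresholds and the function name sensitivity_specificity are invented and have nothing to do with Aspire Health’s undisclosed algorithm. The sketch simply flags every patient whose score reaches a threshold and shows what happens as that threshold is lowered.

```python
# Hypothetical risk scores (0 to 1) and outcomes for ten patients.
# These are invented numbers, not data from any real system.
scores = [0.95, 0.90, 0.80, 0.70, 0.60, 0.50, 0.40, 0.30, 0.20, 0.10]
# True means the patient died within 12 months.
died_within_year = [True, True, True, False, True, False, False, True, False, False]


def sensitivity_specificity(scores, outcomes, threshold):
    """Flag every patient whose score is >= threshold and compare with outcomes."""
    tp = fp = tn = fn = 0
    for score, died in zip(scores, outcomes):
        flagged = score >= threshold
        if flagged and died:
            tp += 1  # correctly flagged
        elif flagged and not died:
            fp += 1  # flagged, but the patient lived longer than a year
        elif not flagged and died:
            fn += 1  # missed patient
        else:
            tn += 1  # correctly left unflagged
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return sensitivity, specificity


# Lowering the threshold flags more patients.
for threshold in (0.75, 0.45, 0.25):
    sens, spec = sensitivity_specificity(scores, died_within_year, threshold)
    print(f"threshold={threshold:.2f}  sensitivity={sens:.2f}  specificity={spec:.2f}")
```

With these invented numbers, lowering the threshold from 0.75 to 0.25 raises sensitivity from 0.60 to 1.00 while specificity falls from 1.00 to 0.40: every dying patient is found, but three of the five survivors are flagged as well.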

An algorithm must be transparent if outsiders are to understand how it has been optimized. And when it comes to systems that predict the probability of death, optimization parameters should not be the purview of commercial businesses alone. The system’s developers must instead publicly disclose which goals are being pursued with the algorithm and under what conditions it is being used. Both of these aspects must be subject to a public social, political and ethical debate. Moreover, it must be possible to verify the algorithm’s performance.

Algorithms must be thoroughly validated on an ongoing basis

It must be possible to validate algorithms if they are to be considered transparent. And an algorithm must be publicly accessible if it is to be publicly validated. Furthermore, a number of questions must be asked when it comes to algorithms of this type, such as:

  • How reliable are their predictions?
  • How often do the results include false positives or false negatives?
  • Are the algorithms truly helpful in achieving the desired goals? What are those goals (e.g. improving access to at-home palliative care or reducing costs resulting from unnecessary treatments and interventions)?
  • Which framework are they embedded in, i.e. which patient groups were they developed for?
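Why the patient group matters for the false-positive question can be shown with a short calculation. The sketch below assumes a hypothetical algorithm with 90 percent sensitivity and 90 percent specificity and applies it to two invented patient groups; the function name and all figures are illustrative assumptions, not results from any study.

```python
def positive_predictive_value(sensitivity, specificity, prevalence):
    """Share of flagged patients who really die within the prediction window."""
    true_positives = sensitivity * prevalence
    false_positives = (1 - specificity) * (1 - prevalence)
    return true_positives / (true_positives + false_positives)


# Hypothetical groups: in one, 30% of patients die within the window;
# in the other, only 5% do. The algorithm itself is unchanged.
for prevalence in (0.30, 0.05):
    ppv = positive_predictive_value(0.90, 0.90, prevalence)
    print(f"prevalence={prevalence:.0%}  PPV={ppv:.0%}")
```

Under these assumptions, about 79 percent of flagged patients in the first group actually die within the window, but only about 32 percent in the second. An algorithm validated on one patient group can therefore produce far more false positives when it is used on another.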

To answer these questions, publicly funded studies are required. Multiple studies have already shown, for example, that algorithms used in monitoring systems to predict the chance of a patient’s dying in the near term are highly accurate and can save lives. Such algorithms have already been approved as medical products by health regulators, such as the Food and Drug Administration (FDA) in the US and oversight bodies in Europe.

Approval procedures must be adapted to new developments

No standard definition yet exists of what constitutes a “clinical decision support system” (CDSS) as used in medical institutions. Algorithmic input for making decisions can vary considerably in complexity, ranging from a simple warning that a patient’s medical record contains duplicate entries to deep-learning processes which predict when an individual will die.

The criteria a medical product must meet for certification are very complex, yet they generally set no concrete benchmarks that a CDSS must adhere to. The US is somewhat further ahead than other countries here, since the FDA laid out regulatory guidelines for CDSSs at the end of 2017, including information on which systems will be considered medical products requiring approval and which will not. At the same time, as the systems become more complex, so will the criteria that must be met for approval. That, in turn, could make implementing such systems unnecessarily complicated or hinder the development of innovations.

A number of other regulatory issues must be clarified for the algorithmic systems that are used to predict death:

  • Which institutions should be permitted to use such algorithms? Which should use them as a matter of course? For example, should they only be used at full-fledged medical institutions? Or should organizations that provide at-home palliative care be able to use them as well? Can primary care doctors use them in their practices? Or only the health insurers and medical associations responsible for deciding whether at-home palliative care is warranted?
  • How can self-learning algorithms be regulated if they continue to develop on their own and it is not completely clear how they generate their output?

Doctors must be trained to communicate predicted deaths

Before the age of artificial intelligence, doctors often had to rely solely on their knowledge and wealth of experience. This was also true when the task at hand was informing a patient that they did not have long to live. Yet even if an algorithm existed which could say with 100-percent certainty that a patient is going to die – which will never be the case – that would not make it any easier for physicians to communicate the news to patients.

On the contrary, algorithms still lack empathy and morality. And recommendations about a situation so personal and emotional as an impending death should never be made by a computer program. Doctors can use artificial intelligence as an aid, but they will always have to consider the entire individual as they reach their decision on what the best way forward is. There are more than a few patients who have a 95-percent chance of dying, yet firmly believe they are one of the 5 percent who will recover.

Research shows that doctors tend to be overly optimistic when they predict whether or not their patients will live. That’s only human. Yet how do doctors deal with the fact that algorithms are better at foretelling medical outcomes? Do they include this more objective information in their deliberations? And how do they then tell the patient that their recommendation is based on a prediction made by a machine?

An open discussion is needed of the discomfort many people feel when they learn that computers are predicting when patients will die. What is also needed is thorough training in this area for medical personnel, including additional certification in statistics and programs that impart a basic understanding of how algorithmic systems work. Programs must also be introduced to help clinicians communicate better with patients, especially when the subject is palliative treatment.

An open discussion is needed on the importance of life’s final phase

Chemotherapy and surgery can be lucrative for hospitals. Yet how much should be spent on treating the terminally ill when it is clear that the patient does not have long to live? Or when it is clear that the patient will die sooner without treatment, but will not have to spend their final weeks or months in a hospital and will experience fewer side effects? There are no easy answers to these questions.

After all, there are patients who indeed survive despite all the statistics suggesting they will not. Moreover, it’s almost impossible to ascertain if a therapy was unnecessary. Was it unnecessary if it gave the patient a few more weeks? A discussion is needed on this topic as well, among both policy makers and society at large.

In the UK, for example, it has been decided that quality of life can only be improved or a life extended if the price of doing so is reasonable – meaning, more concretely, if it costs between £20,000 and £30,000 per additional year of life. Cost-benefit analyses of this sort also make people uneasy. What might seem reasonable to a healthy person who is well off could be completely unacceptable to someone who is ill and lacks any income. Yet this is a discussion that must take place all the same, since if there is no transparent, broad discourse there can be no consensus, and cost-benefit decisions will merely be handed off to hospitals, health insurers and providers of palliative care.

Whether this concept – the quality-adjusted life year (QALY) – is really the right indicator for such calculations remains controversial in Germany. Discussing it is more important than ever, however, given the development of new medical tools such as death-prediction algorithms.
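How such a calculation works can be sketched in a few lines. The example is deliberately simplified and rests on assumptions: a QALY is taken to be the additional time lived multiplied by a quality weight between 0 (death) and 1 (full health), the £20,000 to £30,000 range is the one cited above, and the treatment costs, durations, weights and function names are invented. Real assessments are considerably more involved.

```python
LOWER_LIMIT = 20_000  # pounds per QALY, lower end of the range cited above
UPPER_LIMIT = 30_000  # pounds per QALY, upper end of the range cited above


def cost_per_qaly(extra_cost, extra_life_years, quality_weight):
    """Additional cost divided by the quality-adjusted life years gained."""
    qalys_gained = extra_life_years * quality_weight
    return extra_cost / qalys_gained


def assess(ratio):
    """Place a cost-per-QALY figure relative to the cited range."""
    if ratio < LOWER_LIMIT:
        return "below the range"
    if ratio <= UPPER_LIMIT:
        return "within the range"
    return "above the range"


# Hypothetical treatment A: £9,000 more, half a year gained at quality weight 0.8.
ratio_a = cost_per_qaly(9_000, 0.5, 0.8)
# Hypothetical treatment B: £12,000 more, half a year gained at quality weight 0.6.
ratio_b = cost_per_qaly(12_000, 0.5, 0.6)

print(f"A: £{ratio_a:,.0f} per QALY, {assess(ratio_a)}")
print(f"B: £{ratio_b:,.0f} per QALY, {assess(ratio_b)}")
```

In this invented case both treatments buy the same six months, yet treatment A comes to £22,500 per QALY and falls within the range, while treatment B comes to £40,000 per QALY and exceeds it. The outcome hinges on the quality weight, which is exactly the kind of judgment that makes the indicator controversial.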

Doctor and patient must always make the decision

Regardless of how algorithms for predicting death develop in the future, guidelines must be put in place ensuring that it is ultimately a human being – a doctor – who makes the recommendation or decides together with the patient or family what the best way forward is.

Whether a cat or an algorithm makes a prediction, there is always a chance it will be wrong. That will always be the case. Moreover, such predictions “only represent the model of a person’s life, never the person themselves,” says Kevin Baum, a computer ethics specialist at Saarland University, speaking to German broadcaster ARD. “Machines are not capable of making decisions in individual cases.” What they also can’t do is something Oscar the cat is particularly good at: keeping people company and making them feel better.

 

This is the third post of a three-part series on the use of algorithms in the calculation of death risks.

Part 1 of the series is here: How algorithms can save people from an early death

Part 2 of the series is here: Optimizing palliative care: When algorithms predict a patient’s death

 


