AI predicts disease risks years in advance – a revolution for medicine?

A new artificial intelligence (AI) model can estimate an individual's long-term risk for more than 1,000 diseases, according to researchers. It is a generative pre-trained transformer and is thus similar in design to the large language models behind ChatGPT.
The research team named it Delphi-2M. They trained the model on 400,000 patient records from a large British database (UK Biobank) and were able to apply it to nearly two million Danish patient records with only a slight loss of accuracy. The study, by the group led by Moritz Gerstung of the German Cancer Research Center (DKFZ) and Ewan Birney and Tom Fitzgerald of the European Molecular Biology Laboratory in Hinxton, UK, was published in the journal Nature.

“Our AI model is a proof of concept that shows that it is possible to identify many long-term health patterns and use this information to generate meaningful predictions,” Birney is quoted as saying in a DKFZ statement.
The model works at the level of individual patients. In principle, it can therefore reconstruct individual medical histories and derive prognoses for how disease risks and progression will develop. At the same time, the model can predict health trends in larger population groups and thus provide clues as to how healthcare can be improved.
“Just as large language models can learn the grammar of our language from the sequence of words in texts, this AI model learns the logic of the temporal sequence of events in health data in order to model entire medical histories,” Gerstung explained, according to the DKFZ statement.
The learned patterns enable the AI model to estimate the probability of diseases at the current time and for more than a decade into the future. In addition to diagnoses coded according to the International Classification of Diseases (ICD-10), other characteristics such as age, gender, body mass index, smoking habits, and alcohol consumption are included in the probability calculation.
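The idea of learning from the order of events can be illustrated with a deliberately simple sketch. It is not the Delphi-2M architecture, which is a transformer; it is a count-based toy that estimates which diagnosis tends to follow which. The patient histories are invented, the function names are hypothetical, and the toy ignores age and the other characteristics mentioned above.

```python
from collections import Counter, defaultdict

# Invented toy histories: time-ordered lists of (age, ICD-10 category) events.
# E11 = type 2 diabetes, K86 = other diseases of pancreas,
# C25 = pancreatic cancer, I10 = hypertension, I21 = heart attack.
histories = [
    [(45, "E11"), (52, "K86"), (58, "C25")],
    [(50, "E11"), (55, "I10"), (63, "I21")],
    [(48, "E11"), (51, "K86"), (60, "C25")],
    [(41, "I10"), (59, "I21")],
]

def fit_transitions(histories):
    """Count how often each diagnosis is directly followed by another."""
    counts = defaultdict(Counter)
    for history in histories:
        codes = [code for _, code in history]  # age is ignored in this toy
        for prev, nxt in zip(codes, codes[1:]):
            counts[prev][nxt] += 1
    return counts

def next_event_probs(counts, code):
    """Relative frequency of each diagnosis observed right after `code`."""
    followers = counts.get(code, Counter())
    total = sum(followers.values())
    if total == 0:
        return {}
    return {nxt: n / total for nxt, n in followers.items()}

counts = fit_transitions(histories)
probs = next_event_probs(counts, "E11")
for nxt, p in sorted(probs.items()):
    print(f"after E11 -> {nxt}: {p:.2f}")
```

A transformer replaces these simple counts with a learned function of the entire preceding history, which is what would allow it to take timing and lifestyle factors into account.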
After training Delphi-2M on the 400,000 records from the UK Biobank, the researchers tested it on another 100,000 records from the same database. They then applied the model, without prior adjustments, to 1.93 million records from the Danish National Patient Registry covering the years 1978 to 2018. The researchers were able to show that the predicted diseases did in fact occur about as often as the probabilities calculated by the model indicated.
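Checking whether predicted probabilities match observed frequencies is known as a calibration check. The following is a minimal sketch of such a check, not the evaluation code used in the study: the data are simulated, and the function name and the number of bins are assumptions made for illustration.

```python
import random

def calibration_table(predicted, observed, n_bins=5):
    """Group predictions into equal-width probability bins and compare
    the mean predicted risk with the observed event rate in each bin."""
    bins = [[] for _ in range(n_bins)]
    for p, y in zip(predicted, observed):
        idx = min(int(p * n_bins), n_bins - 1)  # p == 1.0 goes in the last bin
        bins[idx].append((p, y))
    rows = []
    for members in bins:
        if not members:
            continue  # skip bins that received no predictions
        mean_pred = sum(p for p, _ in members) / len(members)
        obs_rate = sum(y for _, y in members) / len(members)
        rows.append((len(members), mean_pred, obs_rate))
    return rows

# Simulated data: each event occurs with exactly the predicted probability,
# so this stand-in "model" is well calibrated by construction.
rng = random.Random(0)
predicted = [rng.random() for _ in range(20000)]
observed = [1 if rng.random() < p else 0 for p in predicted]

table = calibration_table(predicted, observed)
for n, mean_pred, obs_rate in table:
    print(f"n={n:5d}  predicted={mean_pred:.3f}  observed={obs_rate:.3f}")
```

In a well-calibrated model the two columns agree closely in every bin. A systematic gap would mean the model overestimates or underestimates risk for that group.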

“The fact that Delphi-2M can be applied to Danish population data with slightly reduced accuracy suggests that many patterns learned by the model accurately reflect the actual development of multiple disease rates,” write the study authors.
For Fabian Theis, Director of the Institute for Computational Biology at the Helmholtz Zentrum München, the transfer to a cohort from another country is a breakthrough. This demonstrates the robustness of the model. "There have been some medical models with good results, but they usually only worked in one hospital and no longer worked in the next," Theis, who was not involved in the study, told the German Press Agency (dpa). Ewan Birney also said during a press conference that the positive result with the Danish data has greatly strengthened the scientists' confidence in their model.
In a graphic, the study authors show how several diseases affecting the pancreas, liver, and bile ducts, as well as diabetes mellitus and digestive disorders, increase the risk of pancreatic cancer 19-fold. According to the researchers, Delphi-2M is particularly suitable for diseases with clear progression patterns, such as certain types of cancer or heart attacks.
However, it is less reliable for infections or mental illnesses that depend on unforeseeable life events. "The key thing is that this is not a certainty, but an assessment of the potential risks," said Tom Fitzgerald.
Thanks to the large amount of training data, the AI model can detect signs of diseases that are not usually revealed during medical examinations. "By modeling how diseases develop over time, we can investigate when certain risks arise and how early interventions can best be planned," explained Birney. This is a major step toward personalized and more preventative approaches to healthcare. However, there is likely still a long way to go before Delphi-2M or a successor version can be used in everyday clinical practice – for patient and data protection reasons. Gerstung estimates that it will take five to ten years.
Should the AI model be used on individual patients, it should "only be a supplementary component and must always be accompanied by medical judgment," said Markus Herrmann of the Institute for Medical and Data Ethics at Heidelberg University. Patients must be informed about the use of the technology and its significance, and the results must be discussed in detail between doctor and patient.
In order not to restrict patients' freedom of choice, medical ethicist Robert Ranisch of the University of Potsdam advocates the possibility of a waiver: "Therefore, a right not to know remains crucial."
Ranisch also sees the potential of the AI model when applied to larger population groups: "It can be used in the spirit of equitable, proportional prevention to identify gaps in care for disadvantaged groups." For Carsten Marr of the Helmholtz Zentrum München, the most exciting aspect is finding previously unknown connections between diseases. "There's a study that demonstrated that an Epstein-Barr virus infection leads to a 30-fold increased risk of multiple sclerosis. These are the things we're looking for," said Marr.
An important aspect of further developing the AI model will be to consider potential biases. For example, only datasets from patients aged 40 to 70 were used for AI training; other age groups were not represented. Risks could also be over- or underestimated for groups that differ in origin and social status. "A model that predicts hundreds of diseases at once bundles opportunities, but also increases the risk of bias," warned Ranisch.
But the study authors are optimistic: "This is the beginning of a new way of understanding human health and the progression of diseases," predicted Gerstung. Fabian Theis envisions that one day there will be a digital twin of a patient, fed by health and lifestyle data. "This would then allow us to see, for example, how the virtual patient reacts to a change in medication without having to test it on a real patient," explained Theis.
RND/dpa