Interview with Nikos Papanikolaou, principal investigator of the Computational Clinical Imaging Group.
According to official data, in 2020 prostate cancer was the second most frequent cancer in men worldwide and the fifth in terms of mortality. Yet prostate cancer diagnosis still relies on unspecific measures such as PSA (prostate-specific antigen) levels and digital rectal examination, followed by a biopsy, while on imaging, disease aggressiveness is assessed with a qualitative and highly subjective score, PI-RADS (the Prostate Imaging Reporting and Data System), that can vary depending on the radiologist who performs the assessment. Relying on PSA and digital rectal examination may lead to overdiagnosis and overtreatment, and the unreliability of these measures cuts both ways: patients with aggressive disease, who deserve timely treatment, may be missed, while patients who will never die from prostate cancer are treated, with a negative impact on their quality of life.
With this in mind, Nikos Papanikolaou and his team, at the Champalimaud Foundation’s Computational Clinical Imaging Group, are developing Artificial Intelligence (AI) algorithms that they expect will be able to detect prostate cancer, assess its biological characteristics and predict outcomes in individual patients based on magnetic resonance imaging (MRI) exams.
The team is also a leading partner in ProCancer-I, an EU-funded consortium that aims to develop this type of algorithm for prostate cancer. In 2023, it published two papers on AI applied to prostate cancer, in the journals Cancers and Scientific Reports, respectively (for more information, see https://www.mdpi.com/2072-6694/15/5/1467 and https://www.nature.com/articles/s41598-023-33339-0#Abs1). In this interview, Papanikolaou explains the research involved.
Can you tell us about your work?
I'm serving as the scientific director of a European project called ProCancer-I, which focuses on two distinct goals. The first is to build the biggest repository of clinical data and medical imaging from patients suffering from prostate cancer. The core of the data we are collecting so far is MR imaging.
We are collecting multiparametric MRI examinations from patients with prostate cancer, and associated clinical variables such as biopsies, blood tests, PSA levels, and so on. So that's the first objective: to establish and provide a highly curated repository containing this kind of data for prostate cancer patients. The projected size of the repository is about 17,000 individual patients. So far, we have collected almost 80% of that data.
And the second goal?
The second goal, which is what I'm mostly involved in coordinating, is to use that data to try to solve unmet clinical problems in the entire prostate cancer spectrum. These problems are, among others, the detection of prostate cancer and its characterization, the risk for local and distant recurrence, the stratification of patients eligible for different types of treatments, the adverse effects after radiation therapy, and the prediction of quality of life.
At the moment, we are in the process of training the first wave of AI models to harness the data we collected in order to help clinicians, whether they are radiologists or urologists, to treat and diagnose prostate cancer in the best way possible. We are developing AI technology to empower humans. This falls under the so-called human-centric AI approach, rather than autonomous AI, where the idea would be to replace humans.
How are you going to proceed to achieve this second goal?
The methodological approach we have designed is based on three distinct phases. The first phase is to develop so-called master models. Those models are based on a kind of “dirty” data approach, meaning that we don't clean the data before we feed it to the model. We try to simulate real world situations.
What has happened so far is that most AI models are trained with highly curated lab data, and then fail to meet the challenges of real-world data. So we are following an alternative way, exposing the models to the vast heterogeneity of real-world data. That's why we are collecting data from 13 clinical partners, including the Champalimaud Clinical Centre: to expose the model to the necessary diversity and variability, and to simulate real-world situations as closely as possible.
We are currently finalizing the first phase; I recently presented the initial results on the master models at the European Congress of Radiology in Vienna. We still don't have any concrete results as far as model validation is concerned, but we have results related to goal number one, which is the development of the data repository, which at the moment includes about five million images. So the first objective has already been achieved, and now we are working towards achieving the second.
What will the two other phases be?
The second phase will be to use these master models as a backbone to train manufacturer-specific models. In the MRI world, we have three main manufacturers of MRI machines: Philips, Siemens, and General Electric. So our intention is to collect vendor-specific data and then produce vendor-specific models.
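[Editor's note: to make the "backbone" idea concrete, here is a deliberately minimal sketch, in PyTorch, of how a master model's shared layers might be frozen while only a small head is fine-tuned on one vendor's data. The toy network, weights file and random batch below are illustrative placeholders, not ProCancer-I code.]

```python
import torch
from torch import nn

# Toy stand-in for a "master" model: a shared backbone plus a small head.
backbone = nn.Sequential(
    nn.Conv2d(1, 8, kernel_size=3, padding=1), nn.ReLU(),
    nn.AdaptiveAvgPool2d(1), nn.Flatten(),
)
head = nn.Linear(8, 1)
model = nn.Sequential(backbone, head)

# In practice, weights trained on the multi-vendor dataset would be loaded
# here, e.g. model.load_state_dict(torch.load("master_weights.pt")).

for p in backbone.parameters():   # freeze the shared, multi-vendor layers
    p.requires_grad = False

optimizer = torch.optim.Adam(head.parameters(), lr=1e-4)
loss_fn = nn.BCEWithLogitsLoss()

# Fake "vendor-specific" batch: 4 single-channel MR slices, binary labels.
x = torch.randn(4, 1, 64, 64)
y = torch.randint(0, 2, (4, 1)).float()

optimizer.zero_grad()
loss = loss_fn(model(x), y)
loss.backward()
optimizer.step()
```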
One important problem that we are facing so far in healthcare AI is the lack of model generalizability. When a model is applied to somewhat different data, coming from a different institution, its performance collapses: it doesn't really produce what it was expected to produce. One reason behind that is that the model has been trained with rather limited amounts of data and limited diversity.
In particular, such a model is subject to a selection bias, as we say. But now, for the first time, we are exposing models to thousands of MRI examinations and millions of MR images. Therefore, we are anticipating – and we already have preliminary evidence suggesting it – that these models will be far more reproducible, generalizable, and therefore have clinical potential in real-world settings.
During the third phase, the aim will be to integrate all these models into vendor-neutral models. These final models should not be sensitive to the manufacturer of the MRI machine on which the examination was performed. The whole project design is not only very focused on the research part, it also goes all the way to practical implementation through the application of these models to real-world data. That's the only way to create value for patients, for physicians and for healthcare systems.
After you complete these three phases, what will be the next steps?
We are anticipating that, next year around this time, we will have finalized model development and will start testing these models with prospective patients, rather than retrospective data. So far, we have collected about 9,000 MRI examinations from retrospective patients, and now we are just starting to recruit prospective patients in order to validate the models next year. That will take time, since we will have to wait for patient outcomes.
What is the state of the art concerning AI applied to prostate cancer?
One big problem with the detection of prostate cancer is that it is purely based on the visual perception of radiologists looking at MRI exams. But we know very well that there is variability across different radiologists. And sometimes, if you show the same exam or the same images to the same radiologist at two different times, you may also get different diagnoses.
So that lack of reproducibility is the main driving force and the rationale behind wanting AI to give some help to radiologists by looking at the data from a very different perspective. AI looks at the data at the pixel level.
An important thing that we are also tackling is to develop AI models that can improve performance while at the same time giving human experts the chance to understand and explain how the process is done.
Are you referring to so-called responsible AI?
Exactly. In order to solve the so-called “black box” problem of AI, we are developing “responsible” AI. Responsible AI is a methodological approach that takes into consideration not only the performance and accuracy of the model, but also other aspects, such as explainability, robustness, usability and trustworthiness.
All these are important qualities that need to be there as far as the AI model is concerned. Otherwise, you may end up developing the best performing model in the world, but no one will really use it in clinical practice. Therefore, your contribution in terms of clinical value will be zero. So what we are really trying to do here, even from the very early phase of AI model design, is to listen to the end users of such models, including the patients themselves.
In order to develop meaningful models, we need to ask people whether what we are intending to develop makes sense. For example, one thing that we are planning now is to organize dissemination events with patient associations, including conferences or meetings, in order to inform them and to spread the results of the project.
I’m also organizing the next ProCancer-I consortium meeting, inviting more than 50 researchers from the 20 institutions to the Champalimaud Foundation. The consortium meeting will take place during the last week of June 2023, immediately before the big international AI congress that we are co-organizing with the International Cancer Imaging Society [Artificial Intelligence & Machine Learning in Cancer Imaging 3.0 – Enhancing Healthcare through AI, see https://fchampalimaud.org/events/ai-and-machine-learning-cancer-imaging… for more information].
Your team at the Champalimaud Foundation recently published two research papers. Can you briefly explain them?
The first paper is about automatically detecting the location of the prostate within an image. For this, we need to develop AI to automatically “segment” the prostate, that is, to trace the exact borders of the prostate gland or of the lesion across multiple images or slices. This is a very time-consuming and tedious task for the radiologist to do in a purely manual way, so although the total prostate volume is a very important biomarker, it is almost never computed.
That is why one of the models we are developing is an automatic segmentation model. We let the AI do the heavy lifting, so to speak, and then the final result is always validated by a radiologist.
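[Editor's note: once a validated segmentation mask exists, the volume biomarker mentioned above becomes a trivial computation. The sketch below uses the open-source SimpleITK library and an illustrative file name; it is not the project's actual code.]

```python
import numpy as np
import SimpleITK as sitk

mask = sitk.ReadImage("prostate_mask.nii.gz")  # hypothetical AI segmentation
voxels = sitk.GetArrayFromImage(mask)          # label array, ordered (z, y, x)

sx, sy, sz = mask.GetSpacing()                 # voxel size in millimetres
voxel_volume_ml = (sx * sy * sz) / 1000.0      # mm^3 per voxel -> millilitres

# Volume = number of voxels labelled as prostate x volume of one voxel.
prostate_volume_ml = np.count_nonzero(voxels) * voxel_volume_ml
print(f"Total prostate volume: {prostate_volume_ml:.1f} mL")
```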
The second paper was about using these automatic segmentation techniques to compute what we call radiomic features. Let's say we crack an image into several hundreds or thousands of individual features, and those are what the AI algorithms look at in order to build a model.
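[Editor's note: for readers curious about what "cracking an image into features" can look like in code, here is a minimal sketch using the open-source pyradiomics package. The file names are illustrative, and this is not necessarily the exact toolchain used in the paper.]

```python
from radiomics import featureextractor

# With default settings, pyradiomics extracts first-order statistics,
# shape descriptors and several texture feature classes.
extractor = featureextractor.RadiomicsFeatureExtractor()

# Inputs: an MR image and the lesion segmentation mask (paths illustrative).
features = extractor.execute("t2w_image.nii.gz", "lesion_mask.nii.gz")

for name, value in features.items():
    if not name.startswith("diagnostics"):  # skip pipeline metadata entries
        print(name, value)
```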
We have been computing these radiomic features to try to answer a very important question: when you have a patient with a detected prostate lesion, does this lesion fall into the aggressive disease category or the non-aggressive disease category? This makes a huge difference because the treatment approach can be very, very different in aggressive disease.
When the cancer is aggressive, the patient needs immediate intervention. Based on the patient’s preferences, this can be radical prostatectomy – removing the entire gland – or radiation therapy. But when the lesion is non-aggressive, you just put the patient in a very close and strict follow-up, with multiple PSA measures and MRI examinations. And you intervene only if and when the cancer's biological behavior changes and it becomes aggressive. Because sexual dysfunction and urinary incontinence are two very common side effects of treatments, we need to be very careful not to overtreat or over-diagnose patients.
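[Editor's note: as a toy illustration of the classification step described above, the sketch below trains a standard scikit-learn classifier on synthetic "radiomic" feature vectors. Real studies use real features, feature selection and far more careful validation; on the random data here, the score will hover around chance.]

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 50))    # 100 lesions x 50 synthetic features
y = rng.integers(0, 2, size=100)  # 1 = aggressive, 0 = non-aggressive

# Standardize features, then fit a simple linear classifier.
clf = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
scores = cross_val_score(clf, X, y, cv=5, scoring="roc_auc")
print(f"Cross-validated AUC: {scores.mean():.2f}")
```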
And you want AI to make this distinction accurately.
This is one of the main clinical targets we are hoping to achieve with the help of AI: to be more accurate in distinguishing the patients who need aggressive treatment from those who do not.
So far, we have analysed a rather small dataset because, as I said, the data collection and, most importantly, the validation of the AI models, are still in progress. But even though the results we published are only preliminary, the capability of differentiating between aggressive disease and non-aggressive disease, based on the methodology that our lab has developed, is very encouraging.
Now we need to confirm these initial very positive results with bigger prospective patient cohorts. That will happen in the next two years. The last part of the project will be devoted to the prospective validation of these models, and only then will we be able to say to what extent these models can be translated to clinical practice.
The huge dataset that we are collecting – there is no other that compares to it – will give us the opportunity to train AI models that we have high expectations of translating into the clinic. Although the ProCancer-I project will end in 2025, there is now a new project we are participating in: the EUCAIM [EUropean Federation for CAncer IMages] project, launched at the beginning of 2023.
EUCAIM is the cornerstone of the European Cancer Imaging Initiative, and it aims to develop a federated cancer imaging repository, involving the Champalimaud Foundation and 74 other partners in Europe. It's a huge consortium, and it will develop by connecting individual repositories, and also hospitals, to a huge infrastructure. This will be the platform for future AI development at a European level. Our repository, the ProCancer-I repository, will be part of this bigger EUCAIM repository.
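[Editor's note: to give a flavour of what "federated" can mean for AI training, here is a deliberately simplified sketch of the federated-averaging idea: each site trains on data that never leaves the hospital, and only model weights are shared and averaged. EUCAIM's actual platform is far more elaborate; this is purely illustrative.]

```python
import numpy as np

def local_update(weights, site_data, lr=0.1, steps=5):
    """A few gradient-descent steps on one site's private (X, y) data."""
    X, y = site_data
    for _ in range(steps):
        grad = X.T @ (X @ weights - y) / len(y)
        weights = weights - lr * grad
    return weights

rng = np.random.default_rng(1)
# Three "hospitals", each holding a private dataset that never moves.
sites = [(rng.normal(size=(20, 5)), rng.normal(size=20)) for _ in range(3)]

weights = np.zeros(5)
for _ in range(10):
    # Each site refines the shared model locally; the server averages weights.
    weights = np.mean([local_update(weights, d) for d in sites], axis=0)

print("Federated model weights:", np.round(weights, 3))
```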
You are developing AI tools for prostate cancer. Will they be generalizable to other types of cancer?
At the moment we are focusing only on prostate cancer, but our group at the Champalimaud Foundation is also working in other areas. We have finalized a very big proposal on pancreatic cancer, with a slightly different approach, which we are going to submit soon.
Interview by Ana Gerschenfeld, Health & Science Writer of the Champalimaud Foundation.