No one is perfect, including radiologists who read and interpret multiparametric MRI (mpMRI) prostate scans. However, some are better than others. Studies show that experienced readers are generally more accurate than less experienced readers. In fact, it can take years for an individual radiologist to accumulate sound knowledge and judgment through the number of MRIs reviewed over time as well as continuing education (conferences, training workshops, classes, etc.). Such is the human condition.
On the other hand, Artificial Intelligence (AI) is not subject to the human condition in the same way. Although AI is dependent on humans to design and train programs, the process takes months rather than years. And once the program has been properly tested and validated, it does not need to attend future conferences, training workshops, or classes. It can get straight to work. The question is, can human-designed state-of-the-art AI perform at least as well as experienced radiologists?
A new study shows AI outperforms humans
According to a new paper published by an international consortium of experts, a new AI model actually outperforms human readers.[i]
The AI program itself was trained on 9,207 MRI scans and tested on 1,000 MRI scans from four different medical centers. Out of the total pool of scans, 400 cases were chosen for which there was also tissue-based confirmation of clinically significant PCa, defined as Gleason grade group 2 or higher (GG ≥ 2). For comparison purposes in interpreting the 400 MRIs, the performance of the AI program was up against that of 62 radiologist readers from 45 centers in 20 countries. With an average of seven years’ experience reading prostate MRIs, these reviewers were hardly “newbies”. They based their detection of clinically significant PCa on PI-RADS 2.1.
A June 2024 news report in Diagnostic Imaging summarized the findings of the comparison study:
- The AI program produced 50.4% fewer false positive results than the radiologists. (A false positive means PCa was identified on the scan but, according to the case record of tissue samples, was not actually there.) Reducing false positives means less anxiety for the patient as well as fewer follow-up tests and procedures.
- In terms of positive predictive value (PPV = odds that if MRI says PCa is there, it really is) and negative predictive value (NPV = odds that if MRI says it’s not there, it really isn’t), AI outperformed the readers, more accurately determining whether PCa was present or not.
| | AI | Radiologists |
| --- | --- | --- |
| PPV | 68% | 53.2% |
| NPV | 93.8% | 90.2% |
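For readers curious about the arithmetic behind these percentages, PPV and NPV are simple ratios computed from the counts of correct and incorrect calls. The sketch below uses made-up counts, not the study’s actual data, chosen so that 400 cases happen to yield a 68% PPV and 90% NPV, close to the figures in the table:

```python
# Minimal sketch of PPV/NPV arithmetic. The counts are hypothetical
# illustrations, NOT numbers from the study.

def ppv(true_pos: int, false_pos: int) -> float:
    """Positive predictive value: of all positive calls, the fraction correct."""
    return true_pos / (true_pos + false_pos)

def npv(true_neg: int, false_neg: int) -> float:
    """Negative predictive value: of all negative calls, the fraction correct."""
    return true_neg / (true_neg + false_neg)

# Hypothetical reader over 400 cases: 170 true positives, 80 false
# positives, 135 true negatives, 15 false negatives (170+80+135+15 = 400).
print(f"PPV = {ppv(170, 80):.1%}")  # 170 / 250 = 68.0%
print(f"NPV = {npv(135, 15):.1%}")  # 135 / 150 = 90.0%
```

Note that both measures depend on how common the cancer is in the group being scanned, which is one reason results from a curated 400-case pool may not transfer directly to everyday practice.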
One final level of comparison was also included in the research. The 62 participating readers’ results were compared with those of the medical personnel who originally handled each of the 400 cases. On balance, the actual clinicians also outperformed the 62 reviewers. The study authors theorize that the patients’ clinicians were more accurate simply because, at the time of diagnosis and treatment, they also had access to “…patient history (including previous prostate-specific antigen levels and imaging and biopsy outcomes), peer consultation (or multidisciplinary team meetings), and protocol familiarity…”[ii]
Does this mean that AI will someday take the place of human readers? Not likely. As the authors conclude, “Such a system shows the potential to be a supportive tool within a primary diagnostic setting, with several associated benefits for patients and radiologists.”[iii] For the time being, they note that their comparison study is only a beginning, and more testing of their new model is needed to truly validate it.
NOTE: This content is solely for purposes of information and does not substitute for diagnostic or medical advice. Talk to your doctor if you are experiencing pelvic pain, or have any other health concerns or questions of a personal medical nature.
References
[i] Saha A, Bosma JS, Twilt JJ, van Ginneken B et al. Artificial intelligence and radiologists in prostate cancer detection on MRI (PI-CAI): an international, paired, non-inferiority, confirmatory study. Lancet Oncol. 2024 Jul;25(7):879-887.
[ii] Ibid.
[iii] Ibid.