Deconstructing the False Positive Narrative in Whole-Body MRI Screening

Summary

Debate surrounding whole-body MRI (WB-MRI) screening often centers on the claim that it produces excessive false positives. A central issue in this discussion, however, is the absence of a consistent definition of what constitutes a false positive in preventive imaging. In traditional diagnostic research, a false positive refers to a finding that meets a defined suspicion threshold and is later proven not to represent the targeted disease within a specified follow-up period. In WB-MRI screening, the term is frequently applied more broadly to indeterminate findings, benign lesions requiring characterization, opportunistic discoveries, or any abnormality that leads to additional imaging. These scenarios are not equivalent to diagnostic error.

Recent literature evaluating WB-MRI in asymptomatic populations, including a large systematic review and meta-analysis, demonstrates substantial heterogeneity in imaging protocols, follow-up practices, and outcome adjudication. Definitions of suspicious findings and false positives vary across studies, limiting the interpretability of pooled performance metrics. Consensus guidance has emphasized the need for harmonized acquisition standards and structured reporting systems to improve reproducibility and reduce variability.

Structured risk stratification frameworks such as ONCO-RADS, combined with the Prenuvo-developed Clinically Significant Diagnosis (CSD) scale, provide a more transparent and reproducible method of categorizing findings. By separating cancer probability from overall clinical significance, these systems reduce ambiguity and improve communication.

Advancing WB-MRI screening requires clearer definitions, standardized protocols, consistent reporting practices, and prospective longitudinal research. A more rigorous and transparent methodological approach will allow WB-MRI performance to be evaluated more accurately within preventive healthcare.

Key talking points

  • False positives are often inconsistently defined. Indeterminate or benign findings are sometimes mislabeled as errors.
  • Study and interpretation variability affect results. Differences in protocols and expertise influence reported WB-MRI performance.
  • Structured reporting improves consistency. The ONCO-RADS and CSD frameworks can help separate cancer risk from overall clinical importance.
  • More standardized research is needed. Prospective studies with consistent protocols and follow-up are needed for accurate evaluation.

1. The Core Challenge: Defining “False Positives” in WB-MRI Screening

Much of the criticism surrounding whole-body MRI (WB-MRI) screening centers on the claim that it generates excessive “false positives.” However, a foundational problem in this debate is the absence of a consistent definition of what constitutes a false positive in the context of preventive WB-MRI.

In traditional diagnostic testing, a false positive has a specific meaning: a finding that meets predefined criteria for a particular disease and is later proven not to represent that disease within a set follow-up period. That definition depends on three factors: (1) the disease being targeted, (2) the threshold at which a finding is considered “positive,” and (3) the reference standard and duration of follow-up used to confirm or exclude disease. Without these elements, false-positive rates cannot be meaningfully interpreted.[1]
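As a sketch in standard diagnostic-accuracy notation, the dependence on these three factors can be made explicit. The symbols d (targeted disease), τ (positivity threshold), and R (reference standard, including follow-up duration) are introduced here purely for illustration and do not come from the cited sources.

```latex
\mathrm{FPR}(d, \tau, R) =
  \frac{\mathrm{FP}(d, \tau, R)}{\mathrm{FP}(d, \tau, R) + \mathrm{TN}(d, \tau, R)}
```

Here FP counts individuals without disease d (as adjudicated by R) whose findings met threshold τ, and TN counts individuals without disease d whose findings did not. Changing any one of d, τ, or R changes both counts, which is why a false-positive rate reported without all three is not interpretable.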

However, the term “false positive” in WB-MRI is often applied more broadly. It may encompass indeterminate findings requiring short-interval follow-up, benign lesions that may still benefit from additional imaging for characterization or longitudinal monitoring, opportunistic findings (e.g., vascular abnormalities, cysts, metabolic conditions), clinically relevant but non-urgent findings, or even any abnormality that leads to contextually indicated additional imaging. These situations are not the same as diagnostic errors.

Whole-body screening is designed to examine multiple organ systems at once. It is therefore expected to uncover a range of findings. Some findings cannot be definitively classified at first presentation and may warrant short-interval surveillance or targeted imaging. If follow-up imaging confirms that the finding is benign, characterizing the initial interpretation as a “false positive” may be methodologically inaccurate, depending on the clinical context and the specific reason the more definitive workup was pursued.

For example, a liver mass seen on screening WB-MRI may appear most consistent with focal nodular hyperplasia (FNH), a benign condition, yet cannot be definitively confirmed without contrast-enhanced imaging. A follow-up liver MRI with hepatobiliary-specific contrast may be recommended to confirm the diagnosis at baseline. If that study confirms FNH, calling the original finding a “false positive” oversimplifies what actually occurred. The lesion was real, the assessment appropriately reflected probabilities (and correctly anticipated the final diagnosis), and the recommendation for targeted contrast imaging represented careful, evidence-based diagnostic confirmation, not overdiagnosis or unnecessary testing.

When indeterminate findings, benign but clinically meaningful diagnoses, and true diagnostic errors are grouped under the single label of “false positive,” perceived error rates may be inflated and difficult to interpret. Clear operational definitions and structured interpretation frameworks can help separate appropriate caution from genuine diagnostic error. Without these distinctions, discussions about false positives in screening WB-MRI do not accurately reflect the modality’s true performance.
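The inflation effect described above can be illustrated with a minimal Python sketch. The outcome labels and counts below are hypothetical placeholders chosen only to show the arithmetic; they are not data from any study, and the two definitions (`STRICT`, `BROAD`) are assumptions made for this example.

```python
from collections import Counter

# HYPOTHETICAL adjudicated outcomes for findings flagged on screening.
# Labels and counts are illustrative placeholders, not study data.
OUTCOMES = Counter({
    "confirmed_target_disease": 2,        # true positive for the targeted disease
    "suspicious_proven_benign": 3,        # met the suspicion threshold, disease excluded
    "indeterminate_resolved_benign": 10,  # needed characterization, then shown benign
    "benign_clinically_relevant": 15,     # non-oncologic but still actionable
})

# Strict definition: only findings that met the suspicion threshold
# and were later proven not to be the targeted disease.
STRICT = {"suspicious_proven_benign"}

# Broad definition: anything flagged that was not the targeted disease.
BROAD = STRICT | {"indeterminate_resolved_benign", "benign_clinically_relevant"}


def false_positive_share(outcomes, fp_labels):
    """Fraction of flagged findings counted as false positives under a definition."""
    total = sum(outcomes.values())
    if total == 0:
        raise ValueError("no findings to evaluate")
    fp = sum(count for label, count in outcomes.items() if label in fp_labels)
    return fp / total


strict_share = false_positive_share(OUTCOMES, STRICT)  # 3 of 30 flagged findings
broad_share = false_positive_share(OUTCOMES, BROAD)    # 28 of 30 flagged findings
print(f"strict: {strict_share:.2f}, broad: {broad_share:.2f}")
```

With the same underlying findings, the apparent error share differs roughly ninefold depending only on which categories are labeled “false positive.” Note that this is a share of flagged findings, not a false-positive rate in the strict sense, which would also require the count of true negatives.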

2. Reimbursement, Variability, and Misunderstanding Screening Performance

Insurance coverage policies for WB-MRI, especially in screening, are sometimes used as indirect evidence that the modality produces too many false positives or lacks diagnostic value. This is a misunderstanding. Coverage policies are driven by economic modeling, budget constraints, and establishment-based requirements, not by direct measurements of false-positive or false-negative rates.[2, 3] Using reimbursement status to judge diagnostic accuracy confuses financial policy with clinical performance. Coverage reflects whether a service meets payer criteria, not whether the test itself is accurate.

Another important factor is interpretive variability. Inter-reader variability is well documented across radiology, with discrepancy rates increasing when imaging is interpreted outside a radiologist’s subspecialty training or when standardized reporting criteria are absent.[4, 5] Screening whole-body MRI adds another layer of complexity. It surveys multiple organ systems in a single examination and requires familiarity with a broad range of pathologies, as well as clinical decision-making principles specific to screening rather than diagnostic contexts. Accurate interpretation and clinically meaningful reporting therefore require specialized radiologic training and expertise, particularly in preventive or screening settings where the clinical value of reporting is contingent on preventive actionability. Differences in expertise, workflow, and reporting style can influence how findings are clinically categorized and managed. Apparent false-positive or false-negative rates can therefore reflect variability in classification and expertise rather than inherent limitations of WB-MRI.

3. Evidence from WB-MRI Screening in Asymptomatic Populations

Recent literature evaluating WB-MRI in asymptomatic populations reinforces these challenges rather than resolving them.

A meta-analysis reported a pooled cancer detection rate of approximately 1.6% across asymptomatic cohorts.[6] However, the authors identified substantial heterogeneity in imaging protocols, follow-up practices, and outcome adjudication. Criteria for what constituted a “positive” examination varied across studies, and consistent definitions of false positives were not uniformly applied. Follow-up intervals and confirmation standards were also inconsistent, making it difficult to determine how indeterminate findings were ultimately classified.

The review further noted the absence of standardized long-term outcome tracking and formal cost-effectiveness analyses. Without harmonized definitions and adjudication frameworks, pooled estimates of detection rates and incidental findings are inherently difficult to interpret.

These findings align with an earlier systematic review, which emphasized that whole-body MRI requires standardized acquisition parameters and structured reporting systems to ensure reproducibility and reduce interpretive variability.[7]

4. Structured Risk Stratification

4.1 ONCO-RADS: Clarifying Oncologic Probability

ONCO-RADS was developed to provide a standardized framework for the acquisition, interpretation, and reporting of WB-MRI, with the goal of improving reproducibility and reducing inter-reader variability in oncologic imaging-based screening.[8] In the context of preventive or screening WB-MRI, structured risk stratification is important given the low prevalence of malignancy and the wide spectrum of potential findings, including non-malignant pathology.

ONCO-RADS defines minimum standards for screening WB-MRI, including specified anatomic coverage, required pulse sequences (T2-weighted and Dixon-based T1-weighted imaging), and—critically—multi–b-value diffusion-weighted imaging (DWI), typically acquired without intravenous contrast. Standardized protocol design ensures consistent image quality and minimizes variability related to technique rather than interpretation.

ONCO‑RADS reporting employs a five‑tier risk‑stratification scale that categorizes findings as:

Category 1: No oncologic relevance

Category 2: Very unlikely to represent cancer

Category 3: Probably benign but not fully characterized

Category 4: Suspicious for probable malignancy

Category 5: Highly suspicious for malignancy

Each category is linked to a recommended management pathway, ranging from reassurance to individualized risk assessment, interval surveillance, targeted diagnostic imaging, or biopsy. This linkage between likelihood and action is central to the framework. Rather than labeling findings in isolation, ONCO-RADS integrates risk assessment with recommended clinical decision-making.

Importantly, recent data support the construct validity of this categorization approach. Hu et al. evaluated the application of ONCO-RADS scoring in an asymptomatic screening cohort and demonstrated that malignancy rates increased across higher ONCO-RADS categories, providing preliminary validation that the scoring system correlates with cancer risk in preventive populations.[9] These findings suggest that structured likelihood stratification can meaningfully differentiate low-risk from higher-risk findings in screening settings. Large prospective studies with standardized follow-up and long-term outcome tracking are needed to further establish the predictive performance and clinical utility of ONCO-RADS in general population screening.

4.2 ONCO‑RADS and the Clinically Significant Diagnosis (CSD) Scale: Dual‑Axis Clinical Interpretation 

While structured malignancy likelihood frameworks such as ONCO-RADS address the question of cancer probability, preventive WB-MRI frequently also identifies findings that are clinically important but not oncologic. In screening populations, distinguishing between these dimensions is essential.

To address this, Prenuvo implemented the Clinically Significant Diagnosis (CSD) scale, an internally developed system that stratifies overall clinical importance independent of malignancy risk. Rather than focusing solely on oncologic relevance, CSD categorizes findings by preventive health impact and actionability:

  1. The finding is not clinically significant. It represents normal anatomy or a common normal variation. No additional evaluation or follow-up is needed.

  2. The finding is unlikely to be clinically significant and is documented for completeness. No additional evaluation or follow-up is typically needed.

  3. The finding is likely low risk but may have clinical significance depending on your medical history or risk factors. A healthcare clinician should review it to determine whether any additional evaluation or non-urgent follow-up is needed.

  4. The finding is likely clinically significant. Prompt medical evaluation and testing is recommended.

  5. The finding is clinically significant and may represent a serious medical condition. Urgent medical evaluation is recommended.

In preventive WB-MRI, many findings are non-oncologic yet still clinically relevant, such as aneurysms, hormonally active endocrine lesions, reproductive system disorders, or inflammatory conditions. Without separating cancer probability from overall actionability, these findings may be inaccurately grouped under “false positives” merely because they appropriately prompted follow-up tailored to the individual patient.  Because ONCO-RADS and CSD measure different constructs, their scores do not always match. A finding may be ONCO-RADS 1 yet CSD 5 (e.g., a large, high-risk brain aneurysm) or ONCO-RADS 1 and CSD 3 (e.g., mild hepatic steatosis warranting routine clinical/lab correlation).

This dual-axis framework is particularly valuable in low-prevalence preventive settings. By integrating ONCO-RADS and CSD, radiologists can distinguish oncologic from non-oncologic concern, prioritize actionable findings, standardize reporting, and reduce misclassification of clinically relevant non-oncologic findings as “false positives.”
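The dual-axis idea can be sketched as a small data structure in which each finding carries two independent scores. This is an illustrative sketch only: the `Finding` class, the `concern_type` helper, its labels, and its default thresholds are hypothetical names and values chosen for this example, not part of ONCO-RADS or the CSD scale, and not a clinical decision rule.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Finding:
    """A reported finding scored on two independent axes (hypothetical structure)."""
    description: str
    onco_rads: int  # 1-5: likelihood of malignancy
    csd: int        # 1-5: overall clinical significance

    def __post_init__(self):
        # Both scales in the text run from 1 to 5.
        for name in ("onco_rads", "csd"):
            value = getattr(self, name)
            if not 1 <= value <= 5:
                raise ValueError(f"{name} must be between 1 and 5, got {value}")


def concern_type(finding, onco_threshold=4, csd_threshold=3):
    """Report which axis, if any, is elevated.

    The thresholds are ASSUMPTIONS for illustration, not published cut-offs.
    """
    if finding.onco_rads >= onco_threshold:
        return "oncologic axis"
    if finding.csd >= csd_threshold:
        return "clinical-significance axis"
    return "neither axis elevated"


# The two examples given in the text above.
aneurysm = Finding("large, high-risk brain aneurysm", onco_rads=1, csd=5)
steatosis = Finding("mild hepatic steatosis", onco_rads=1, csd=3)

for f in (aneurysm, steatosis):
    print(f"{f.description}: {concern_type(f)}")
```

Keeping the two scores as separate fields, rather than collapsing them into one “positive/negative” flag, is what allows a finding with no oncologic relevance to be recorded as actionable instead of being counted as a false positive for cancer.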

Conclusion

Concerns about false positives in WB-MRI screening appear to stem less from intrinsic limitations of the technology than from inconsistent definitions, interpretive variability, and methodological heterogeneity across studies. Refinements in MRI protocol design, consistent technical execution, and risk-stratified reporting using structured frameworks such as ONCO-RADS and CSD can improve clarity, reduce variability, and ensure that sensitivity/specificity metrics more accurately reflect true test performance. Future evaluation of WB-MRI screening should prioritize prospective study designs incorporating standardized imaging protocols, structured reporting systems, uniform follow-up practices, long-term outcome tracking, and formal cost-effectiveness analyses. Greater methodological consistency and transparency will allow the clinical performance and value of WB-MRI screening to be assessed more rigorously within preventive healthcare.

References

  1. O’Sullivan JW, Muntinga T, Grigg S, Ioannidis JPA. Prevalence and outcomes of incidental imaging findings: umbrella review. BMJ. 2018;361:k2387. doi:10.1136/bmj.k2387

  2. Van Herck P, Annemans L, Sermeus W, Ramaekers D. Evidence-based health care policy in reimbursement decisions: lessons from a series of six equivocal case studies. PLoS One. 2013;8(10):e78662. doi:10.1371/journal.pone.0078662

  3. Kim DD, Basu A. How does cost-effectiveness analysis inform health care decisions? AMA J Ethics. 2021;23(8):E639-E647. doi:10.1001/amajethics.2021.639

  4. Rosenkrantz AB, Duszak R Jr, Babb JS, et al. Discrepancy rates and clinical impact of imaging secondary interpretations: a systematic review and meta-analysis. J Am Coll Radiol. 2018;15(9):1222-1231. doi:10.1016/j.jacr.2018.05.037

  5. Chong S, Hanna TN, Steenburg SD, et al. Interpretations of examinations outside of radiologists’ fellowship training: assessment of discrepancy rates among 5.9 million examinations from a national teleradiology databank. AJR Am J Roentgenol. 2022;218(4):738-745. doi:10.2214/AJR.21.26656

  6. Martins da Fonseca J, Trennepohl T, Pinheiro LG, et al. Whole-body MRI for opportunistic cancer detection in asymptomatic individuals: a systematic review and meta-analysis. Eur Radiol. Published online 2025. doi:10.1007/s00330-025-11976-5

  7. Petralia G, Zugni F, Summers PE, et al; Italian Working Group on Magnetic Resonance. Whole-body magnetic resonance imaging (WB-MRI) for cancer screening: recommendations for use. Radiol Med. 2021;126(11):1434-1450. doi:10.1007/s11547-021-01392-2

  8. Petralia G, Koh DM, Attariwala R, et al. Oncologically relevant findings reporting and data system (ONCO-RADS): guidelines for the acquisition, interpretation, and reporting of whole-body MRI for cancer screening. Radiology. 2021;299(3):494-507. doi:10.1148/radiol.2021201740

  9. Hu YS, et al. Applying ONCO-RADS to whole-body MRI cancer screening in a retrospective cohort of asymptomatic individuals. Cancer Imaging. 2024;24(1):22. doi:10.1186/s40644-024-00665-z
