18.01.2026
Lems E, Koch AH, Delvaux EJLG, Leemans JC, Bongers MY, Lok CAR, Ramaekers BL, Geomini PMAJ
Abstract
Objective: Accurate preoperative classification of ovarian tumors is essential for guiding treatment. There is an increasing body of data evaluating ultrasound-based models for this purpose in diverse clinical settings. The aim of this systematic review and meta-analysis was to generate up-to-date evidence on the diagnostic accuracy of the most relevant ultrasound-based models, including the Risk of Malignancy Index (RMI) versions 1, 2 and 3, Logistic Regression model 2 (LR2), Simple ultrasound-based Rules (SR), the Assessment of Different NEoplasias in the adneXa (ADNEX) model and subjective assessment (SA), for the differentiation between benign and malignant ovarian tumors.
Methods: Ovid/MEDLINE, EMBASE and the Cochrane Library were searched systematically from database inception until 19 June 2025. Eligible studies investigated the diagnostic accuracy of at least one of the preselected models, collected model parameters prospectively and provided sufficient data to construct 2 × 2 tables. The risk of bias of all included studies was assessed using the Quality Assessment of Diagnostic Accuracy Studies (QUADAS)-2 and QUADAS-C extension tools. Pooled summary estimates of sensitivity and specificity for all included models were calculated and bivariate models were fitted into hierarchical summary receiver-operating-characteristics curves. Bivariate random-effects meta-regression analysis was conducted to determine significant differences in sensitivity and specificity between models. Subgroup analyses were conducted according to menopausal status and prevalence of ovarian malignancy.
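As an illustration of the per-study building block named in the Methods, the following minimal sketch derives sensitivity and specificity from a single 2 × 2 table and attaches Wilson score intervals. The counts, function names and the choice of the Wilson interval are assumptions made for this example; they are not taken from the review, and this single-study calculation does not reproduce the bivariate random-effects pooling used by the authors.

```python
from math import sqrt


def wilson_interval(successes, total, z=1.96):
    """Wilson score interval for a binomial proportion (z=1.96 gives ~95%)."""
    if total <= 0:
        raise ValueError("total must be positive")
    p = successes / total
    denom = 1 + z**2 / total
    centre = (p + z**2 / (2 * total)) / denom
    half = z * sqrt(p * (1 - p) / total + z**2 / (4 * total**2)) / denom
    return centre - half, centre + half


def accuracy_from_2x2(tp, fp, fn, tn):
    """Sensitivity and specificity, with intervals, from one 2 x 2 table.

    tp: malignant tumors classified as malignant
    fn: malignant tumors classified as benign
    fp: benign tumors classified as malignant
    tn: benign tumors classified as benign
    """
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "sens_ci": wilson_interval(tp, tp + fn),
        "spec_ci": wilson_interval(tn, tn + fp),
    }


# Hypothetical counts for one study (not data from the review).
result = accuracy_from_2x2(tp=90, fp=20, fn=10, tn=180)
print(result)
```

A study qualifies for this kind of calculation only if all four cells can be reconstructed, which is why the eligibility criteria require sufficient data to construct 2 × 2 tables.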
Results: A total of 99 studies were included, describing 42 496 ovarian tumors, of which 31 371 (74%) were benign and 11 125 (26%) were malignant. SA had both high sensitivity (90.2% (95% CI, 87.8-92.2%)) and high specificity (91.4% (95% CI, 89.3-93.2%)). SR followed by SA of inconclusive cases (SR + SA) showed similar performance to SA (sensitivity, 88.6% (95% CI, 85.7-91.0%); P = 0.397 and specificity, 91.0% (95% CI, 89.0-92.7%); P = 0.811), as did the ADNEX model with a cut-off of 20% (sensitivity, 86.7% (95% CI, 80.6-91.0%); P = 0.095 and specificity, 87.9% (95% CI, 80.1-92.9%); P = 0.119). The ADNEX model with a cut-off of 10% had a similar sensitivity to SA (92.7% (95% CI, 90.8-94.2%); P = 0.130), but lower specificity (78.4% (95% CI, 71.7-83.8%); P < 0.001). Higher cut-offs of the ADNEX model led to a decrease in sensitivity, whereas lower cut-offs resulted in reduced specificity. The LR2 model with a 10% cut-off had a sensitivity of 89.5% (95% CI, 85.8-92.4%) and a specificity of 82.3% (95% CI, 75.0-87.8%). The RMI had the lowest diagnostic accuracy, with a sensitivity of 69.7% (95% CI, 67.0-72.2%) and a specificity of 90.5% (95% CI, 88.3-92.4%) for RMI version 1 with a cut-off of 200. Subgroup analyses showed that both menopausal status and prevalence of malignancy significantly affected sensitivity (P < 0.01) and specificity (P < 0.01). Postmenopausal status and higher disease prevalence were associated with lower specificity, while sensitivity was lower in premenopausal women.
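To make the trade-off between sensitivity and specificity more concrete, the sketch below converts the pooled point estimates quoted in the Results into positive and negative predictive values using Bayes' theorem. The predictive values are not reported in the abstract; they are derived here purely for illustration. The prevalence used (11 125 of 42 496 tumors) is the pooled proportion across all included studies, which is an assumption for this example and need not match any individual clinical setting. Function and variable names are hypothetical.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Return (PPV, NPV) for a test applied at a given disease prevalence."""
    tp = sensitivity * prevalence
    fn = (1 - sensitivity) * prevalence
    tn = specificity * (1 - prevalence)
    fp = (1 - specificity) * (1 - prevalence)
    return tp / (tp + fp), tn / (tn + fn)


# Pooled share of malignant tumors across the included studies (assumed
# here as the prevalence; real settings will differ).
prevalence = 11125 / 42496

# Pooled point estimates (sensitivity, specificity) as quoted in the Results.
models = {
    "SA": (0.902, 0.914),
    "ADNEX, 10% cut-off": (0.927, 0.784),
    "RMI 1, cut-off 200": (0.697, 0.905),
}

predictive = {
    name: predictive_values(sens, spec, prevalence)
    for name, (sens, spec) in models.items()
}

for name, (ppv, npv) in predictive.items():
    print(f"{name}: PPV {ppv:.1%}, NPV {npv:.1%}")
```

Under these assumptions, the lower specificity of the ADNEX model at the 10% cut-off translates into a lower positive predictive value than SA, while its negative predictive value is slightly higher. Point estimates alone ignore the confidence intervals, so small differences should not be over-interpreted.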
Conclusions: All approaches, except for the RMI, performed well and could be used to differentiate between benign and malignant ovarian tumors. Although SA with or without SR had the highest diagnostic performance, it is dependent on operator expertise. If a strategy independent of operator expertise is preferred, the ADNEX model is recommended. Because of the high sensitivity of the ADNEX model, the likelihood of missing malignancies is low. In postmenopausal women, however, the reduced specificity may warrant a higher cut-off, depending on how the impact of a false-positive test result is evaluated.
© 2025 The Author(s). Ultrasound in Obstetrics & Gynecology published by John Wiley & Sons Ltd on behalf of International Society of Ultrasound in Obstetrics and Gynecology.
Ultrasound Obstet Gynecol. 2025 Dec 4. doi: 10.1002/uog.70135