Background: Adnexal masses represent a diagnostic challenge because several benign lesions show complex morphology, while early ovarian malignancy may remain clinically silent. Standardized imaging systems attempt to convert morphology into reproducible risk categories and reduce variability in pre-operative decision-making.
Objectives: To compare O-RADS ultrasound (O-RADS US), Gynecologic Imaging Reporting and Data System (GI-RADS), and O-RADS MRI for characterization of adnexal masses, using histopathological examination as the reference standard.
Methods: This hospital-based diagnostic accuracy study included 110 female patients with adnexal masses who underwent ultrasound evaluation, pelvic MRI and subsequent surgical histopathological confirmation. Lesions were categorized on ultrasound using O-RADS US and GI-RADS, while MRI-based categorization was performed using O-RADS MRI. GI-RADS was applied solely for comparative research analysis and did not influence clinical management decisions. Diagnostic indices were calculated using category >=4 as positive for malignancy risk.
Results: Histopathology confirmed 82 benign lesions and 28 malignant lesions. O-RADS MRI demonstrated the highest overall accuracy (93.6%) and specificity (92.7%), with a sensitivity of 96.4%, PPV of 81.8%, and NPV of 98.7%. O-RADS US showed sensitivity of 96.4%, specificity of 90.2% and accuracy of 91.8%. GI-RADS showed sensitivity of 92.9%, specificity of 87.8% and accuracy of 89.1%. AUC was highest for O-RADS MRI (0.958), followed by O-RADS US (0.924) and GI-RADS (0.901). Interobserver agreement was also highest for O-RADS MRI (kappa=0.87).
Conclusion: O-RADS MRI demonstrated the best overall diagnostic performance, particularly through improved specificity and PPV. O-RADS US remained a robust first-line risk stratification system, while GI-RADS served as a useful supportive comparator. Histopathological examination remains essential for definitive diagnosis; however, incorporating MRI into ultrasound-based assessment improves diagnostic confidence in indeterminate and suspicious adnexal lesions.
INTRODUCTION
Adnexal masses are frequently encountered in gynecologic and radiology practice; however, their interpretation is rarely straightforward. Functional cysts, endometriomas, dermoid cysts, borderline tumors and epithelial ovarian malignancies may all present as ovarian or adnexal lesions on initial imaging, and the consequences of misclassification are clinically significant. Overcalling a benign lesion may generate avoidable anxiety, additional imaging and aggressive surgery. Undercalling malignancy can delay referral to a gynecologic oncology unit, affecting staging, operative planning and prognosis. Globally, ovarian cancer continues to contribute substantially to cancer mortality among women, partly because many patients are diagnosed only after symptoms become persistent or disease becomes advanced.[1]
The Indian setting adds its own practical layer. Women may present late, imaging access is uneven across districts, and referral to gynecologic oncology is often concentrated in tertiary centres. National Cancer Registry Programme estimates show the increasing importance of cancer surveillance and site-specific planning in India, including malignancies of the female genital tract.[2] Here, a standardized, reproducible radiology report is not merely an academic instrument. It can influence whether a patient is observed locally, referred for specialist evaluation, or taken up for planned surgical staging.
Ultrasonography remains the first-line imaging modality for adnexal assessment because it is accessible, low-cost, dynamic and capable of combining morphology with Doppler vascularity. The International Ovarian Tumor Analysis (IOTA) group standardized terms and definitions for describing adnexal tumors, thereby establishing the foundation for more consistent sonographic interpretation across observers and institutions.[3] O-RADS US later formalized this approach into a risk stratification and management system, providing structured categories linked to malignancy risk and clinical action.[4,5]
GI-RADS was another attempt to make gynecologic ultrasound reporting more structured. It classifies adnexal lesions into progressive risk groups and has shown useful diagnostic performance in prior prospective work.[6] However, GI-RADS is not as deeply integrated into current multidisciplinary radiology workflows as O-RADS, nor does it provide a comparable framework linked to MRI-based risk stratification. In the present work, GI-RADS was therefore treated as a supportive comparator rather than as a management-defining system.
MRI has become particularly valuable when ultrasound findings are indeterminate or when lesion complexity raises a risk of false-positive classification. The O-RADS MRI score was developed and validated as a structured method for risk stratification of sonographically indeterminate adnexal masses.[7] The ACR O-RADS MRI Committee subsequently provided detailed guidance on assigning MRI risk categories using T1- and T2-weighted morphology, diffusion restriction and enhancement patterns.[8] In practical terms, MRI may reduce unnecessary oncologic referral for benign complex lesions while preserving sensitivity for clinically important malignancy.
The present study was designed to compare O-RADS US, GI-RADS and O-RADS MRI in women with adnexal masses, using histopathological examination as the reference standard. The main emphasis was the comparative diagnostic value of O-RADS US and O-RADS MRI, with GI-RADS included as a supportive ultrasound-based reference system.
MATERIALS AND METHODS
Study Design and Setting
A hospital-based diagnostic accuracy study was conducted in the Department of Radiodiagnosis of a tertiary-care teaching hospital over a two-year period (July 2023 to July 2025). Female patients presenting with adnexal masses and satisfying the eligibility criteria were included.
Study Population
The study included 110 female patients with adnexal masses who were referred for imaging evaluation and subsequently had histopathological confirmation. The final cohort consisted only of patients in whom ultrasound assessment, MRI categorization and histopathological diagnosis were available.
Inclusion Criteria
Female patients with adnexal masses detected on ultrasonography were included if they underwent ultrasound evaluation for O-RADS US and GI-RADS categorization, pelvic MRI for O-RADS MRI classification, subsequent surgical intervention by laparoscopy or laparotomy, and histopathological diagnosis. Written informed consent was obtained.
Exclusion Criteria
Pregnant patients, patients without MRI assessment or with incomplete imaging information, patients not undergoing surgery or lacking histopathological diagnosis, patients with poor-quality ultrasound or MRI studies that prevented adequate visualization, and patients who did not provide consent were excluded.
Ultrasound Protocol
All patients underwent transabdominal and/or transvaginal ultrasonography using a high-frequency probe as appropriate for clinical circumstances. Each adnexal lesion was assessed for size, morphology, cystic/solid/mixed composition, septations, papillary projections and Doppler vascularity. Lesions were categorized according to O-RADS US and GI-RADS. GI-RADS categorization was applied for comparative research purposes only and was not used for clinical management decisions.
MRI Protocol
Pelvic MRI was performed using a standard pelvic protocol. Sequences included T1-weighted imaging, T2-weighted imaging, fat-suppressed sequences, diffusion-weighted imaging (DWI), and post-contrast sequences where applicable. Lesions were categorized using O-RADS MRI. MRI was used to enhance characterization of lesions requiring further risk stratification after ultrasound assessment, particularly indeterminate and suspicious lesions.
Reference Standard
All included patients subsequently underwent surgical intervention by laparoscopy or laparotomy. Final histopathological examination was considered the reference standard for determining benign or malignant lesions.
Diagnostic Threshold
For all three systems, category >=4 was considered positive for malignancy risk, while categories 1-3 were considered negative for binary diagnostic accuracy analysis.
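The dichotomization described above can be sketched in a few lines of Python. This is a minimal illustration rather than the study's analysis code: the names `is_positive` and `two_by_two` and the four example records are hypothetical, and categories are assumed to run from 1 to 5 as in the study's tables.

```python
from collections import Counter

THRESHOLD = 4  # category >= 4 is treated as test-positive, as stated above


def is_positive(category: int) -> bool:
    """Return True when an imaging category counts as positive for malignancy risk."""
    if category not in range(1, 6):  # assumption: only categories 1-5 are analysed
        raise ValueError(f"expected a category from 1 to 5, got {category}")
    return category >= THRESHOLD


def two_by_two(records):
    """Tally TP, FP, FN, TN from (category, is_malignant) pairs."""
    counts = Counter()
    for category, is_malignant in records:
        if is_positive(category):
            counts["TP" if is_malignant else "FP"] += 1
        else:
            counts["FN" if is_malignant else "TN"] += 1
    return {key: counts[key] for key in ("TP", "FP", "FN", "TN")}


# Hypothetical records, for illustration only:
# (imaging category, histopathology malignant?)
example = [(2, False), (4, True), (4, False), (3, True)]
print(two_by_two(example))
```

The same tally would be run once per system (O-RADS US, GI-RADS, O-RADS MRI) against the shared histopathology labels.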
Interobserver Assessment
Fifty randomly selected cases were reviewed independently by two radiologists. Interobserver agreement was assessed separately for O-RADS US, GI-RADS and O-RADS MRI using kappa statistics.
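For readers unfamiliar with the statistic, a minimal sketch of unweighted Cohen's kappa follows. The text does not state whether a weighted or unweighted kappa was used, so the unweighted form is an assumption, and the 2x2 agreement table is hypothetical and is not study data.

```python
def cohens_kappa(matrix):
    """Unweighted Cohen's kappa from a square agreement matrix.

    matrix[i][j] is the number of cases placed in category i by reader 1
    and in category j by reader 2.
    """
    n = sum(sum(row) for row in matrix)
    k = len(matrix)
    # Observed agreement: share of cases on the diagonal.
    observed = sum(matrix[i][i] for i in range(k)) / n
    row_totals = [sum(row) for row in matrix]
    col_totals = [sum(matrix[i][j] for i in range(k)) for j in range(k)]
    # Agreement expected by chance from the two readers' marginal totals.
    expected = sum(row_totals[i] * col_totals[i] for i in range(k)) / n**2
    return (observed - expected) / (1 - expected)


# Hypothetical two-reader table for 50 cases (negative vs positive call).
example = [[20, 5],
           [10, 15]]
print(round(cohens_kappa(example), 2))
```

In this toy table the readers agree on 35 of 50 cases (0.70) against a chance expectation of 0.50, so kappa is 0.40.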
Statistical Analysis
Imaging categories were compared with histopathological diagnosis. Sensitivity, specificity, positive predictive value, negative predictive value and diagnostic accuracy were calculated for each system. Receiver operating characteristic (ROC) analysis was performed to estimate the area under the curve (AUC). AUC comparisons were reported with O-RADS MRI as the reference. A p-value below 0.05 was considered statistically significant.
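The five indices follow directly from a 2x2 table. The sketch below is illustrative only: the function name `diagnostic_indices` is hypothetical, and the counts are those implied by the O-RADS MRI rows of Table 2 at the category >=4 cut-off (27 true positives, 6 false positives, 1 false negative, 76 true negatives).

```python
def diagnostic_indices(tp: int, fp: int, fn: int, tn: int) -> dict:
    """Return sensitivity, specificity, PPV, NPV and accuracy as percentages."""
    total = tp + fp + fn + tn
    return {
        "sensitivity": 100 * tp / (tp + fn),   # malignant lesions called positive
        "specificity": 100 * tn / (tn + fp),   # benign lesions called negative
        "ppv": 100 * tp / (tp + fp),           # positive calls that are malignant
        "npv": 100 * tn / (tn + fn),           # negative calls that are benign
        "accuracy": 100 * (tp + tn) / total,   # all correct calls
    }


# Counts implied by the O-RADS MRI rows of Table 2 (category >= 4 positive).
mri = diagnostic_indices(tp=27, fp=6, fn=1, tn=76)
for name, value in mri.items():
    print(f"{name}: {value:.1f}%")
```

Rounded to one decimal place, these reproduce the O-RADS MRI values reported in the Results (96.4%, 92.7%, 81.8%, 98.7% and 93.6%).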
RESULTS
The study included 110 female patients with adnexal masses. Age ranged from 26 to 78 years, with a mean age of 52.3 ± 14.3 years. Histopathological examination confirmed 82 benign lesions (74.5%) and 28 malignant lesions (25.5%) (Table 1 and Figure 2). The diagnostic workflow is summarized in Figure 1.
| Variable | Value | Interpretation |
|---|---|---|
| Total patients | 110 | All had ultrasound, MRI and histopathological confirmation |
| Age, years | 52.3 ± 14.3; range 26-78 | Adult female cohort with broad age distribution |
| Benign lesions | 82 (74.5%) | Reference-standard benign diagnosis |
| Malignant lesions | 28 (25.5%) | Reference-standard malignant diagnosis |

Table 1. Baseline cohort profile and reference-standard diagnosis (n=110)
Values are expressed as n (%) unless otherwise specified. O-RADS: Ovarian-Adnexal Reporting and Data System; GI-RADS: Gynecologic Imaging Reporting and Data System.
*Figure 1. All included patients had imaging categorization and histopathological confirmation available for final diagnostic comparison.*
*Figure 2. Donut chart shows the proportion of benign and malignant lesions in the final cohort.*
Category-wise distribution showed increasing malignancy rates with higher risk categories across all three systems (Table 2). In O-RADS US, malignancy was absent in categories 1 and 2, low in category 3 (4.0%), and increased sharply in categories 4 and 5 (70.0% and 86.7%, respectively). GI-RADS demonstrated a similar ordinal trend, although category 3 included two malignant lesions and the malignancy rates in categories 4 and 5 were 60.0% and 87.5%, respectively. O-RADS MRI produced the steepest upper-category gradient, with no malignancy in categories 1 and 2, a 5.9% malignancy rate in category 3, and rates of 66.7% and 94.4% in categories 4 and 5. The overall category distribution across the three systems is shown in Figure 3.
| System | Category | Benign | Malignant | Total | Malignancy rate (%) |
|---|---|---|---|---|---|
| O-RADS US | 1 | 10 | 0 | 10 | 0.0 |
| O-RADS US | 2 | 40 | 0 | 40 | 0.0 |
| O-RADS US | 3 | 24 | 1 | 25 | 4.0 |
| O-RADS US | 4 | 6 | 14 | 20 | 70.0 |
| O-RADS US | 5 | 2 | 13 | 15 | 86.7 |
| GI-RADS | 1 | 8 | 0 | 8 | 0.0 |
| GI-RADS | 2 | 42 | 0 | 42 | 0.0 |
| GI-RADS | 3 | 22 | 2 | 24 | 8.3 |
| GI-RADS | 4 | 8 | 12 | 20 | 60.0 |
| GI-RADS | 5 | 2 | 14 | 16 | 87.5 |
| O-RADS MRI | 1 | 12 | 0 | 12 | 0.0 |
| O-RADS MRI | 2 | 48 | 0 | 48 | 0.0 |
| O-RADS MRI | 3 | 16 | 1 | 17 | 5.9 |
| O-RADS MRI | 4 | 5 | 10 | 15 | 66.7 |
| O-RADS MRI | 5 | 1 | 17 | 18 | 94.4 |

Table 2. Consolidated category-wise comparison of O-RADS US, GI-RADS and O-RADS MRI with histopathology
Category >=4 was considered positive for malignancy-risk analysis across all three systems. GI-RADS was used as a supportive comparative system and not for clinical management decisions.
*Figure 3. Grouped bars show the number of patients assigned to each category by each imaging system.*
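The malignancy rates in Table 2 can be rechecked from the benign and malignant counts alone. The short sketch below does this and also confirms that each system accounts for the same 82 benign and 28 malignant lesions; the names `TABLE_2` and `malignancy_rates` are hypothetical, and the counts are copied from the table.

```python
# (benign, malignant) counts for categories 1-5, copied from Table 2.
TABLE_2 = {
    "O-RADS US": [(10, 0), (40, 0), (24, 1), (6, 14), (2, 13)],
    "GI-RADS": [(8, 0), (42, 0), (22, 2), (8, 12), (2, 14)],
    "O-RADS MRI": [(12, 0), (48, 0), (16, 1), (5, 10), (1, 17)],
}


def malignancy_rates(rows):
    """Percentage of malignant lesions per category, rounded to one decimal."""
    return [round(100 * malignant / (benign + malignant), 1)
            for benign, malignant in rows]


for system, rows in TABLE_2.items():
    benign_total = sum(benign for benign, _ in rows)
    malignant_total = sum(malignant for _, malignant in rows)
    # Every system classifies the same 82 benign and 28 malignant lesions.
    assert (benign_total, malignant_total) == (82, 28)
    print(system, malignancy_rates(rows))
```

A check of this kind is a cheap way to catch transcription errors before the counts are collapsed into a 2x2 table.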
At the category >=4 cut-off, O-RADS MRI achieved the highest specificity, PPV and overall diagnostic accuracy. O-RADS US and O-RADS MRI showed identical sensitivity (96.4%) and NPV (98.7%), while O-RADS MRI produced fewer false-positive classifications (6) than O-RADS US (8) or GI-RADS (10). GI-RADS had slightly lower sensitivity, specificity and accuracy than O-RADS US and O-RADS MRI (Table 3 and Figure 4).
| System | Sensitivity % (95% CI) | Specificity % (95% CI) | PPV % (95% CI) | NPV % (95% CI) | Accuracy % |
|---|---|---|---|---|---|
| O-RADS US | 96.4 (81.7-99.9) | 90.2 (81.7-95.7) | 77.1 (59.9-89.6) | 98.7 (93.0-99.9) | 91.8 |
| GI-RADS | 92.9 (76.5-99.1) | 87.8 (78.7-94.0) | 72.2 (54.8-85.8) | 97.3 (90.6-99.7) | 89.1 |
| O-RADS MRI | 96.4 (81.7-99.9) | 92.7 (84.8-97.3) | 81.8 (64.5-93.0) | 98.7 (93.1-100.0) | 93.6 |

Table 3. Diagnostic performance of O-RADS US, GI-RADS and O-RADS MRI at category >=4 threshold
PPV: positive predictive value; NPV: negative predictive value. Category >=4 was treated as positive for malignancy risk.
*Figure 4. Grouped bars show sensitivity, specificity, PPV, NPV and accuracy percentages for each system.*
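The text does not state how the 95% confidence intervals in Table 3 were obtained. One common choice for a proportion is the exact (Clopper-Pearson) interval, and the sketch below assumes that method purely for illustration; the helper names are hypothetical. For a sensitivity of 27/28 it gives limits close to the tabulated 81.7-99.9%, but other cells may differ slightly if a different method was used.

```python
from math import comb


def binom_cdf(k: int, n: int, p: float) -> float:
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


def solve_increasing(f, target: float, iters: int = 60) -> float:
    """Bisection on [0, 1] for f(p) = target, where f increases with p."""
    lo, hi = 0.0, 1.0
    for _ in range(iters):
        mid = (lo + hi) / 2
        if f(mid) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def clopper_pearson(x: int, n: int, alpha: float = 0.05):
    """Exact two-sided (1 - alpha) interval for a binomial proportion x/n."""
    # Lower limit: the p at which P(X >= x) equals alpha/2.
    lower = 0.0 if x == 0 else solve_increasing(
        lambda p: 1 - binom_cdf(x - 1, n, p), alpha / 2)
    # Upper limit: the p at which P(X <= x) equals alpha/2,
    # equivalently P(X > x) equals 1 - alpha/2.
    upper = 1.0 if x == n else solve_increasing(
        lambda p: 1 - binom_cdf(x, n, p), 1 - alpha / 2)
    return lower, upper


# Sensitivity of O-RADS US and O-RADS MRI: 27 of 28 malignant lesions detected.
low, high = clopper_pearson(27, 28)
print(f"{100 * low:.1f}% to {100 * high:.1f}%")
```

Bisection is used here only to keep the example free of third-party dependencies; a statistics library would normally supply this interval directly.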
ROC analysis showed the highest AUC for O-RADS MRI (0.958; 95% CI 0.918-0.998), followed by O-RADS US (0.924; 95% CI 0.872-0.976) and GI-RADS (0.901; 95% CI 0.841-0.961). Compared with O-RADS MRI, the AUC differences were statistically significant for O-RADS US (p=0.031) and GI-RADS (p=0.008). Interobserver agreement was strongest for O-RADS MRI, with kappa of 0.87 (95% CI 0.80-0.94), followed by O-RADS US with kappa of 0.82 (95% CI 0.74-0.90). GI-RADS showed substantial agreement with kappa of 0.71 (95% CI 0.62-0.80) (Table 4).
| System | AUC | 95% CI (AUC) | p-value vs O-RADS MRI | Kappa value (95% CI) | Agreement |
|---|---|---|---|---|---|
| O-RADS US | 0.924 | 0.872-0.976 | 0.031 | 0.82 (0.74-0.90) | Almost perfect |
| GI-RADS | 0.901 | 0.841-0.961 | 0.008 | 0.71 (0.62-0.80) | Substantial |
| O-RADS MRI | 0.958 | 0.918-0.998 | Reference | 0.87 (0.80-0.94) | Almost perfect |

Table 4. ROC/AUC analysis and interobserver agreement
AUC: area under receiver operating characteristic curve. Kappa values were calculated from 50 randomly selected cases independently reviewed by two radiologists.
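The agreement labels in Table 4 follow the conventional Landis and Koch bands. A minimal sketch of that mapping is given below; the function name `landis_koch_label` is hypothetical.

```python
def landis_koch_label(kappa: float) -> str:
    """Map a kappa value to the conventional Landis and Koch descriptor."""
    if kappa < 0:
        return "Poor"
    # Upper bound of each band and its descriptor.
    bands = [
        (0.20, "Slight"),
        (0.40, "Fair"),
        (0.60, "Moderate"),
        (0.80, "Substantial"),
        (1.00, "Almost perfect"),
    ]
    for upper, label in bands:
        if kappa <= upper:
            return label
    raise ValueError("kappa cannot exceed 1")


# Kappa values reported in Table 4.
for system, kappa in [("O-RADS US", 0.82), ("GI-RADS", 0.71), ("O-RADS MRI", 0.87)]:
    print(system, landis_koch_label(kappa))
```

Applied to the reported values, this labels O-RADS US and O-RADS MRI as almost perfect and GI-RADS as substantial, matching the Agreement column.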
DISCUSSION
The present study compared three structured imaging systems for adnexal mass characterization against histopathology. The main finding was clear: O-RADS MRI provided the best overall diagnostic performance, mainly by improving specificity and PPV while retaining the high sensitivity seen with O-RADS US. O-RADS US also performed strongly and remained suitable as the first-line risk stratification system. GI-RADS was diagnostically useful; however, its performance was modestly lower than that of O-RADS US and O-RADS MRI.
The high sensitivity of O-RADS US in this study is consistent with the broader literature. A systematic review comparing O-RADS US and O-RADS MRI reported that both systems have high sensitivity for detecting ovarian or adnexal malignancy, while MRI contributes higher specificity.[9] That pattern is closely mirrored here. O-RADS US identified 27 of 28 malignant lesions at the category >=4 threshold, but it also produced more false-positive classifications than O-RADS MRI (8 versus 6). This is expected in sonographic triage because ultrasound prioritizes early suspicion and referral rather than final tissue characterization.
Category-wise performance also behaved in a clinically plausible way. In O-RADS US, no malignancy occurred in categories 1 and 2, malignancy was uncommon in category 3, and risk rose markedly in categories 4 and 5. This ordinal gradient supports the internal validity of the classification. It also aligns with meta-analytic evidence showing category-specific increases in malignancy rates across O-RADS US levels.[10] Similar trends were observed for GI-RADS and O-RADS MRI, although O-RADS MRI showed the highest malignancy concentration in category 5, where 94.4% of lesions were malignant.
The difference between O-RADS US and O-RADS MRI becomes most important in indeterminate lesions. Ultrasound is excellent for first-line triage, but complex benign lesions can mimic malignancy: endometriomas with mural irregularity, dermoid variants, cystadenofibromas, hemorrhagic lesions and inflammatory tubo-ovarian pathology may all generate suspicious morphology. External validation studies of O-RADS US have shown strong diagnostic value but have also emphasized the challenge of specificity in intermediate-risk categories.[11,12] MRI adds tissue characterization, diffusion-weighted assessment and contrast-enhancement information, which can help separate true solid enhancing malignant components from benign mimics.
This has immediate relevance in Indian clinical pathways. Many centres still rely heavily on ultrasound because it is widely available and cost-effective. However, when ultrasound yields O-RADS 3 or 4 impressions, a second-stage MRI assessment may prevent excessive referral or overly radical surgery in women with benign disease. The goal is not to replace ultrasound. Rather, it is to use MRI selectively where ultrasound leaves uncertainty, where operative planning will change, or where fertility-preserving decisions are being considered.
GI-RADS performed reasonably well, with sensitivity of 92.9%, specificity of 87.8% and accuracy of 89.1%. Earlier work and meta-analysis have supported the ability of GI-RADS to distinguish benign from malignant adnexal lesions.[6,13] Still, compared with O-RADS US, GI-RADS showed slightly more false negatives (2 versus 1) and false positives (10 versus 8) in this cohort. Its role here should therefore be interpreted as complementary. It helped demonstrate that structured reporting improves clarity, but O-RADS US and O-RADS MRI appear better aligned with contemporary radiology practice and multidisciplinary communication.
Interobserver agreement is not a cosmetic endpoint in imaging research. A system that performs well only in the hands of one expert reader may fail in routine practice. The kappa values observed in this study were encouraging: O-RADS MRI (kappa=0.87) and O-RADS US (kappa=0.82) both showed almost perfect agreement, and GI-RADS (kappa=0.71) showed substantial agreement. These findings fit with emerging evidence that O-RADS MRI can be reproducible when readers follow standardized criteria.[14] The slightly higher agreement for MRI may reflect the additional lesion characterization provided by multiple sequences, particularly diffusion and enhancement patterns. Under Landis and Koch's conventional interpretation of kappa, these values represent substantial to almost perfect agreement.[15]
The study has limitations. First, it was a single-centre diagnostic accuracy study with 110 patients and 28 malignant cases. Second, only patients with surgical histopathological confirmation were included, which creates verification bias and may over-represent lesions considered clinically important enough for surgery. Third, the analysis used a binary category >=4 threshold; while this is practical for diagnostic comparison, management decisions in real practice also require consideration of patient age, symptoms, fertility preference, menopausal status, tumor markers, operative fitness and local oncology availability. Finally, the study did not evaluate cost-effectiveness, which is particularly important in resource-variable Indian settings. Even with these limitations, the findings support a pragmatic pathway: O-RADS US for first-line standardization, selective O-RADS MRI for indeterminate or suspicious lesions, and histopathology as the final diagnostic anchor.
CONCLUSION
O-RADS MRI demonstrated the highest diagnostic accuracy and specificity for characterization of adnexal masses, while preserving the high sensitivity seen with O-RADS US. O-RADS US remains a robust first-line system for standardized ultrasound risk stratification, and GI-RADS may serve as a supportive comparative tool rather than a management-defining system. In women with indeterminate or suspicious adnexal lesions, MRI-based O-RADS assessment can improve diagnostic confidence, reduce false-positive classifications, and support more appropriate surgical planning. Histopathological examination remains the definitive reference standard for diagnosis.
Source of Funding
Nil.
Conflict of Interest
None declared.
REFERENCES