A total of 530 observations from 119 women were potentially eligible for inclusion in the model. Forty-three observations (8.1% of the complete dataset) were dropped during modeling because of missing covariate information, leaving 114 women who contributed complete data on 487 lesions, of which 320 (65.7%) had histologically confirmed endometriosis. Approximately two-thirds of the observations (n = 334) were allocated to the training dataset and the remainder (n = 153) to the test dataset. Because each lesion was allocated separately, the same woman could contribute lesions to both datasets; thus, 104 women contributed to the training dataset and 77 women to the test dataset.
Of the 114 women in the complete dataset, 92 (80.7%) reported their race as white. Forty (35%) had stage I disease, 46 (40%) had stage II disease, 17 (15%) had stage III disease, and 11 (9%) had stage IV disease. The mean age of the study participants was 31.4 ± 7.2 years (range 18–45) and the mean BMI was 25 ± 4.7 kg/m² (range 17.2–44.5). Forty percent of lesions were located in the cul-de-sac or utero-sacral ligaments. Just over half of the lesions (51.6%) were subtle (red or white) (Table 1). The distributions of age, race, BMI, stage, lesion location, color, width, and histologically confirmed endometriosis did not differ significantly between the training and test datasets, suggesting that participant characteristics were reasonably well balanced.
Table 1. Distribution of lesion characteristics, odds ratios for association of lesion characteristics with histologically-confirmed endometriosis, and calculated probability of histologically-confirmed endometriosis of each characteristic (derived from the model ...)
Model calibration was assessed with the Hosmer-Lemeshow goodness-of-fit test. The p value was 0.78 for the training dataset and 0.30 for the test dataset; neither result gave evidence of poor fit, indicating good calibration in both. The area under the ROC curve was 0.70 for the training dataset and 0.69 for the test dataset, indicating only fair discrimination.
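As an illustration of what the area under the ROC curve measures, the sketch below computes it with the rank (Mann-Whitney) formulation: the proportion of positive-negative pairs in which the positive case receives the higher predicted score. The function name `roc_auc` and the toy labels and scores are hypothetical and are not the study data or the authors' software.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison.

    Returns the fraction of (positive, negative) pairs in which the
    positive case has the higher score; tied pairs count as 0.5.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative label")

    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical toy data (1 = confirmed, 0 = not confirmed); not the study's lesions.
toy_labels = [1, 1, 1, 0, 0, 1, 0, 1]
toy_scores = [0.9, 0.8, 0.4, 0.5, 0.3, 0.7, 0.6, 0.2]
toy_auc = roc_auc(toy_labels, toy_scores)  # 10 of 15 pairs ranked correctly
```

An AUC of 0.5 corresponds to chance-level ranking and 1.0 to perfect ranking, which is why values near 0.70 are described as only fair discrimination.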
After validation, the model was applied to the complete dataset to determine characteristics predictive of endometriosis. Lesions located on the ovarian fossa, colon, or appendix were 25% more likely than those on the uterus, ovary, fallopian tubes, cul-de-sac, or utero-sacral ligaments to contain histologically-confirmed endometriosis (Table 1). The odds that a given lesion was confirmed to be endometriosis increased by 5% per millimeter of lesion width (OR=1.05, 95% CI 1.02, 1.09). Lesions from women classified as having Stage I disease were significantly less likely to contain endometriosis than lesions from women with Stage II-IV disease (OR=0.49, 95% CI 0.31, 0.79). The odds of mixed-color lesions containing endometriosis were 87% greater than those of red or white lesions (OR=1.87; 95% CI 1.05, 3.34), yet red or white lesions were as likely as blue, black, or brown lesions or endometriomas to be confirmed as endometriosis. Age and BMI were not associated with changes in the odds of histologically-confirmed endometriosis.
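The odds ratios above can be read as percentage changes in odds, and a per-unit odds ratio compounds multiplicatively over larger differences. The sketch below works through that arithmetic, assuming a standard logistic model in which the per-unit odds ratio is exp(beta); the helper names are hypothetical and the 10 mm difference is an arbitrary example, not a comparison made in the study.

```python
def percent_change_in_odds(odds_ratio):
    """Percentage change in odds implied by an odds ratio (OR 1.87 -> +87%)."""
    return (odds_ratio - 1.0) * 100.0


def odds_ratio_for_change(or_per_unit, delta_units):
    """Odds ratio implied by a change of delta_units in a continuous predictor.

    With OR = exp(beta) per unit, a change of delta_units multiplies the
    odds by exp(beta * delta_units), which equals OR ** delta_units.
    """
    return or_per_unit ** delta_units


OR_WIDTH_PER_MM = 1.05  # per-millimeter odds ratio reported in the text

# Illustrative assumption: two lesions that differ by 10 mm in width.
or_10mm = odds_ratio_for_change(OR_WIDTH_PER_MM, 10)  # about 1.63

mixed_color_change = percent_change_in_odds(1.87)  # about +87%
stage_one_change = percent_change_in_odds(0.49)    # about -51%
```

Note that these are changes in odds, not in probability; the two diverge when the outcome is as common as it is here.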
The model was also used to explore how the probability of confirming endometriosis changed when each variable was varied from its referent (most common) value. The largest change in probability occurred across lesion locations (from −22% to +3%), and the smallest across age (from −2% to +2%). Combining the characteristics with the highest probabilities, a nearly 3 cm wide, mixed-color lesion in the ovarian fossa from a 42-year-old Caucasian woman with a BMI of 15 and at least stage II disease had the highest probability of confirmed endometriosis (92.9%; 95% CI: 73%, 98%). By contrast, a small red lesion on the bladder peritoneum from a 22-year-old non-Caucasian woman with stage I endometriosis had the lowest probability of confirmed endometriosis (22.0%; CI: 9%, 45%).
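The relationship between an odds ratio and the resulting change in probability can be written out as follows. This is a sketch that assumes a standard logistic model; the symbols $p_{\text{ref}}$ (probability at the referent value) and $p_{\text{new}}$ (probability after changing one characteristic) are introduced here for illustration and do not appear in the original.

```latex
\begin{align}
\text{odds}_{\text{ref}} &= \frac{p_{\text{ref}}}{1 - p_{\text{ref}}}, \\
\text{odds}_{\text{new}} &= \mathrm{OR} \times \text{odds}_{\text{ref}}, \\
p_{\text{new}} &= \frac{\text{odds}_{\text{new}}}{1 + \text{odds}_{\text{new}}}
  = \frac{\mathrm{OR}\, p_{\text{ref}}}{1 - p_{\text{ref}} + \mathrm{OR}\, p_{\text{ref}}}.
\end{align}
```

Because the mapping is nonlinear, the same odds ratio produces a different change in probability depending on the referent probability, which is why the changes are reported characteristic by characteristic.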
We also determined the sensitivity, specificity, positive predictive value (PPV), and negative predictive value (NPV) of the final model. The model had a sensitivity of 88.8% for predicting the presence of endometriosis and a specificity of 24.6% for predicting its absence. The PPV was 69.3% and the NPV was 53.3%. Overall, the model correctly classified 66.7% of lesions.
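The sketch below shows how these four measures follow from a 2×2 classification table. The cell counts are not reported in the text; they are back-calculated here, as an assumption, from the 320 confirmed and 167 unconfirmed lesions and the stated sensitivity and specificity, so they should be read as illustrative. The function name `classification_metrics` is hypothetical.

```python
def classification_metrics(tp, fp, tn, fn):
    """Standard accuracy measures from the four cells of a 2x2 table."""
    return {
        "sensitivity": tp / (tp + fn),   # confirmed lesions predicted positive
        "specificity": tn / (tn + fp),   # unconfirmed lesions predicted negative
        "ppv": tp / (tp + fp),           # predicted positives that were confirmed
        "npv": tn / (tn + fn),           # predicted negatives that were unconfirmed
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
    }


# ASSUMPTION: counts back-calculated from the reported percentages
# (320 confirmed lesions, 167 unconfirmed); not given by the authors.
metrics = classification_metrics(tp=284, fp=126, tn=41, fn=36)
```

With these reconstructed counts the measures agree with the reported values to within rounding, and the overall accuracy comes out near 66.7%, the figure quoted with the ROC curve. The low specificity means most lesions without endometriosis were still predicted positive, which limits the NPV.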
ROC curve. The model correctly predicts endometriosis 66.7% of the time. Area under the ROC curve = 0.7026.