The nomenclature of cervical cytology reports has been widely standardized in the USA with the Bethesda system.46
Slides are first classified for specimen adequacy as satisfactory or unsatisfactory for evaluation. When there is no cytological evidence of neoplasia, the slide is designated ‘negative for intraepithelial lesion or malignancy’. Epithelial cell abnormalities fall into two broad categories: squamous and glandular. Atypical squamous cells are reported when the degree of nuclear atypia is not sufficient to warrant a diagnosis of squamous intraepithelial lesion; they are further categorized as ‘of undetermined significance’ (ASC-US) or ‘cannot exclude high-grade squamous intraepithelial lesion’ (ASC-H). More severe squamous abnormalities are categorized as low-grade squamous intraepithelial lesion (LSIL), high-grade squamous intraepithelial lesion (HSIL) or squamous cell carcinoma (SCC). Atypical glandular cells, if present, are reported as glandular epithelial cell abnormality (GECA). Absence of endocervical cells is reported as ‘inadequate transformation zone component’ (DNOEC).
We restricted the free-text rulebase to the rules required to determine the values of the parameters needed to apply the guideline logic described in the previous step. There were four such parameters. First, the screening variable indicated whether the report was the result of a screening evaluation or of a specific diagnostic test; the referenced guidelines are applicable only to screening reports. Second, the cytology was classified as negative (normal reports), ASC-US, abnormal (other than ASC-US) or unsatisfactory for evaluation. Third, the HPV test could be negative, positive or not performed. Finally, the endocervical transformation zone component was adequate or inadequate. Evaluating the four parameters involved recognizing 11 concepts in the Pap report.
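The four parameters can be pictured as a small typed record. The following is a minimal sketch only; the class and field names (`PapParameters`, `Cytology`, `HPV`, `guideline_applies`) are hypothetical and do not come from the system described here.

```python
from dataclasses import dataclass
from enum import Enum


class Cytology(Enum):
    """Cytology classes named in the text."""
    NEGATIVE = "negative"
    ASC_US = "ASC-US"
    ABNORMAL = "abnormal (other than ASC-US)"
    UNSATISFACTORY = "unsatisfactory for evaluation"


class HPV(Enum):
    """HPV test states named in the text."""
    NEGATIVE = "negative"
    POSITIVE = "positive"
    NOT_PERFORMED = "not performed"


@dataclass(frozen=True)
class PapParameters:
    """Hypothetical container for the four extracted parameters."""
    is_screening: bool   # screening evaluation versus diagnostic test
    cytology: Cytology
    hpv: HPV
    tz_adequate: bool    # endocervical transformation zone component


def guideline_applies(params: PapParameters) -> bool:
    # The referenced guidelines are applicable only to screening reports.
    return params.is_screening
```

A record of this kind keeps the text-processing output separate from the guideline logic that consumes it.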
To determine the coverage and specificity of the identified word patterns on the development corpus, we applied the free-text rulebase to all Pap reports in the corpus. We used 49 293 reports for analysis. Of these, 6988 were either diagnostic or were non-cervical samples. Of the remaining 42 305 reports, 41 910 (99.1%) were classified by the system into 21 categories corresponding to the combinations of parameter values extracted by the free-text rulebase. The distribution of reports is summarized in . Ninety-one per cent of the reports showed normal (negative for intraepithelial lesion) cytology, 2% indicated that the specimen was unsatisfactory for evaluation of the cytology, and the remaining 7% indicated an abnormal cytology. We manually verified 10 randomly selected reports from each category, for a total of 210 (=10×21) reports. The rulebase determined the parameter values correctly for all the examined reports.
The free-text rulebase reported errors for 395 (0.9%) of the reports. Manual examination revealed that these were due to missing data or due to the fusion of multiple reports when data from the EMR were dumped into the research database. Invalid data are not expected in the production EMR, which was verified with spot-checking, and these errors were not considered further. However, in one case, the error was due to mutually exclusive diagnoses in the same report (diagnostic/reporting error). This was considered a rare event and was also not expected to have any significant effect on the CDSS performance.
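The counts reported in the two paragraphs above can be cross-checked with a few lines of arithmetic. The variable names are illustrative; the figures are those given in the text.

```python
# Figures taken from the text.
total_reports = 49_293
excluded = 6_988        # diagnostic or non-cervical samples
classified = 41_910     # assigned to one of the 21 categories
errors = 395            # reports for which the rulebase reported an error

# Reports remaining after exclusion.
eligible = total_reports - excluded

# Classified and error reports together should account for every eligible report.
accounted_for = classified + errors

classified_pct = round(100 * classified / eligible, 1)
error_pct = round(100 * errors / eligible, 1)

# Ten reports were manually verified from each of the 21 categories.
verified_sample = 10 * 21

print(eligible, accounted_for, classified_pct, error_pct, verified_sample)
```

The classified and error counts sum to the number of eligible reports, and the two percentages agree with the 99.1% and 0.9% quoted above.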
These results suggest that the free-text rulebase covers all the word patterns in the development corpus and that the patterns are concept specific. These findings are the result of the use of an in-house text template for generating the reports. The identified word patterns closely correspond to the in-house template, barring some spelling variations. As the corpus used for the development represents a very large sample of different Pap reports, the free-text rulebase is expected to perform accurately on nearly all Pap reports in the EMR system. We hope that our study will guide decision-makers to make pathologists' annotations directly available in the EMR. This will facilitate the development of decision support tools by obviating the need for text processing to interpret the free-text reports.
Our choice of a rule-based approach for the text processor was due to: (1) the availability of the in-house template used for generating the reports; (2) the need for representing the physician's logic for report interpretation; (3) the need for providing decision explanations to the physicians that may not have been possible with other approaches; (4) the requirement of high accuracy for the decision support application; and (5) the provision of additional checks (rules) to ensure that the report had logically consistent findings and was not corrupted.
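A rule-based processor of this kind can be sketched as a table of word patterns plus a consistency check. This is an illustrative sketch under stated assumptions: the in-house template wording is not reproduced in the text, so the patterns, the concept names and the `InconsistentReport` exception are all hypothetical, and only three of the 11 concepts are shown.

```python
import re

# Hypothetical word patterns for three concepts (assumed wording).
CONCEPT_PATTERNS = {
    "negative": re.compile(
        r"negative for intraepithelial lesion or malignancy", re.IGNORECASE),
    "asc_us": re.compile(
        r"atypical squamous cells of undetermined significance", re.IGNORECASE),
    "unsatisfactory": re.compile(
        r"unsatisfactory for evaluation", re.IGNORECASE),
}

# Pairs of concepts that should not appear together in one report (assumed).
MUTUALLY_EXCLUSIVE = [
    ("negative", "asc_us"),
    ("negative", "unsatisfactory"),
    ("asc_us", "unsatisfactory"),
]


class InconsistentReport(ValueError):
    """Raised when a report contains mutually exclusive findings."""


def extract_concepts(report_text: str) -> dict:
    """Return {concept: matched text}; the matched text doubles as an explanation."""
    found = {}
    for name, pattern in CONCEPT_PATTERNS.items():
        match = pattern.search(report_text)
        if match:
            found[name] = match.group(0)
    # Additional check that the report is logically consistent.
    for first, second in MUTUALLY_EXCLUSIVE:
        if first in found and second in found:
            raise InconsistentReport(f"both '{first}' and '{second}' present")
    return found
```

Returning the matched text alongside each concept is one simple way to support the decision explanations mentioned in point (3); raising an error on conflicting findings corresponds to point (5).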
The guideline rulebase was abstracted as a flowchart consisting of 22 nodes (11 leaf nodes) and 20 edges, spanning five different parts of the EMR. A brief overview is as follows. The flowchart begins with a check in the registration section of whether the patient is female. Next, the patient-provided information section, which consists of the patient's responses to annually administered questionnaires, is accessed to find whether the response to the question ‘Have your menstrual periods changed in anyway or become abnormal to you?’ matches the option ‘No, I have had a hysterectomy’. This ensures that female patients who have undergone a hysterectomy are not advised to have a routine screening Pap test. For patients with no history of hysterectomy, the list of patient documents is then searched to identify ‘Pap reports’, and the latest report is analyzed to determine the cytology type. If the cytology is of type ASC-US or normal, the ‘HPV test’ and ‘endocervical transformation zone component’ parameters deduced from the report are used.
Distribution of data points in the guideline rulebase/flowchart across EMR sections
If the HPV test is positive, the previous Pap report is analyzed to check its HPV result. The majority (76%) of patients have normal cytology, a negative or unreported HPV test and an adequate transformation zone. For these patients, the recommendation depends on age (accessed from the registration section) and, for those in the 30–65 years age group, on whether any of the high-risk conditions for cervical cancer (see supplementary appendix, available online only) appears in their active problem list.
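The order of checks in the flowchart overview can be sketched as a chain of conditions. This is a partial sketch for illustration, not the guideline rulebase itself and not clinical guidance: the function and argument names are hypothetical, only the branches for normal cytology mentioned in the text are followed, the mapping of HPV results to the two outcomes is inferred from the mismatch cases discussed later, and screening intervals are left as descriptive placeholders because they are not specified here.

```python
def recommend(sex, hysterectomy, cytology, hpv, previous_hpv,
              tz_adequate, age, high_risk):
    """Walk a simplified subset of the flowchart and return a label."""
    # Registration section: the flowchart starts by checking for female patients.
    if sex != "female":
        return "not applicable"
    # Patient-provided information: hysterectomy rules out routine screening.
    if hysterectomy:
        return "no routine screening Pap"
    # Only the normal-cytology branches are sketched here.
    if cytology != "negative":
        return "branch not covered by this sketch"
    # Positive HPV: the previous Pap report is checked for HPV as well.
    if hpv == "positive":
        if previous_hpv == "positive":
            return "refer to gynecology clinic"
        return "repeat Pap/HPV co-test in 1 year"
    # Majority path: normal cytology, negative or no HPV, adequate zone.
    if tz_adequate:
        if 30 <= age <= 65 and high_risk:
            return "shortened interval (high-risk condition in problem list)"
        return "routine interval for age"
    return "branch not covered by this sketch"
```

For example, `recommend("female", False, "negative", "positive", "positive", True, 40, False)` follows the path to gynecology referral, while the same call with `previous_hpv="negative"` returns the repeat co-test label.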
The distribution of cases across possible decision paths is shown in . The recommendations made by the CDSS matched those made by the physician for 66 of the 74 cases. The physician was then unblinded to the CDSS recommendations and asked to review the eight ‘mismatch’ cases. The physician verified that in seven of the eight cases the CDSS recommendation was optimal and that the CDSS had erred in one case.
Distribution of cases across the decision table
Reasons for suboptimal recommendations by the physician
In the first case, the physician missed that the patient was over 65 years of age and recommended a follow-up Pap test. In the second and third cases, the physician failed to check the previous report and advised a repeat HPV/Pap co-test after 1 year, while the CDSS correctly noted that both the current and past reports were positive for HPV and referred the patient to the gynecology clinic. In both of these cases the CDSS identified the need for further evaluation, which is the goal of this computer-aided intervention. In the fourth case, the patient had previously undergone a hysterectomy, as indicated in a response to the patient questionnaire; the physician missed this and inappropriately recommended a Pap test. In the fifth and sixth cases, the physician had looked up the Pap test result for a different date than was required for the evaluation and made a suboptimal recommendation. The physician had reflexively looked up the latest report in the EMR, instead of considering only the reports dated before the particular decision date, as required for the evaluation.
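The date handling that caused the fifth and sixth mismatches can be made explicit in code. The sketch below, with a hypothetical `reports_before` helper and a simplified `(date, text)` report representation, keeps only the reports dated before the decision date and orders them newest first, so that the first element is the current report and the second is the previous one.

```python
from datetime import date
from typing import List, Tuple

# Simplified, assumed representation of a Pap report: (report date, report text).
Report = Tuple[date, str]


def reports_before(reports: List[Report], decision_date: date) -> List[Report]:
    """Return reports dated strictly before decision_date, newest first."""
    eligible = [r for r in reports if r[0] < decision_date]
    return sorted(eligible, key=lambda r: r[0], reverse=True)


history = [
    (date(2008, 3, 1), "older report"),
    (date(2009, 4, 15), "report relevant to the decision"),
    (date(2010, 6, 30), "latest report in the EMR"),
]

# Evaluating a decision made on 1 January 2010 must ignore the 2010 report.
relevant = reports_before(history, date(2010, 1, 1))
```

Here `relevant[0]` is the April 2009 report rather than the latest one in the record, which is the distinction the physician overlooked.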
In the seventh case, the physician had recommended a follow-up Pap test at 3 years, while the CDSS recommended a Pap test in 1 year, as it had identified that the patient had a history of cervical dysplasia and was at high risk of cervical cancer. The physician had closely examined the dates of recent Pap tests and noticed that, subsequent to the recording of the risk factor information, the patient had been evaluated by a gynecologist and prescribed screenings at intervals suggesting that the patient had been returned to routine screening. In such cases, when the gynecologist returns patients with risk factors to routine screening, it is desirable that the CDSS alert the care providers to the presence of risk factors and provide an opportunity to reconsider that decision. Therefore, for this case the CDSS recommendation was considered to be optimal by the physician.
Finally, in the eighth case the CDSS incorrectly referred the patient to the gynecology clinic when the optimal recommendation, which was made by the physician, was to repeat the Pap/HPV co-test in 1 year. The failure occurred because HPV testing is sometimes performed separately from a Pap test and these results are reported in the laboratory system, which was not queried by the CDSS. Although this scenario is expected to occur in only a small percentage of patients who have abnormal HPV results, it could lead to over-referral of patients to the gynecology clinic. To address this issue, the CDSS was extended to include the HPV results from the laboratory system. With these results included, the CDSS generated the optimal recommendation for all test patients.
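One way to picture the correction is to merge HPV results from both sources into a single timeline before checking for repeated positivity. The sketch below is hypothetical throughout: the function names, the tuple layout and the example dates are assumptions, and the scenario shown is only one possible way in which an unqueried laboratory result could change the outcome, not a reconstruction of the actual case.

```python
from datetime import date


def merge_hpv_results(pap_results, lab_results):
    """Combine (date, result) pairs from both sources, newest first."""
    tagged = [(d, r, "pap_report") for d, r in pap_results]
    tagged += [(d, r, "laboratory") for d, r in lab_results]
    return sorted(tagged, key=lambda item: item[0], reverse=True)


def persistently_positive(results):
    """True when the two most recent HPV results are both positive."""
    latest_two = [r for _, r, _ in results[:2]]
    return len(latest_two) == 2 and all(r == "positive" for r in latest_two)


# Invented example: two positive results in Pap reports, with a separately
# ordered negative laboratory result between them.
pap = [(date(2010, 1, 5), "positive"), (date(2009, 1, 7), "positive")]
lab = [(date(2009, 7, 1), "negative")]

without_lab = persistently_positive(merge_hpv_results(pap, []))
with_lab = persistently_positive(merge_hpv_results(pap, lab))
```

In this invented example the Pap reports alone suggest persistent positivity, whereas the merged timeline does not, which illustrates how querying only one source could lead to over-referral.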
The physician reviewing the cases for evaluation had led the development of the guideline flowchart, as she was experienced in women's healthcare issues and was very familiar with the guidelines. Despite this, the physician missed the optimal recommendations for six out of 74 cases. Other healthcare providers are expected to be generally less familiar with the required guidelines and would find the CDSS a valuable resource.
For the construction of the evaluation set, we restricted the number of cases for particular paths in the flowchart. Therefore, the distribution of the test cases differed from the distribution that would be encountered by the CDSS on deployment. However, our method allowed us to evaluate nearly all possible case scenarios that would be encountered by the system and ensured the validity of possible paths in the flowchart. For instance, 76% of Pap reports have a normal cytology and an adequate endocervical zone component and are not positive for HPV. The follow-up recommendations for abnormal Pap results are especially critical to ensure that the patients receive appropriate work-up and referral to prevent cancer.9
Therefore, the restriction on the distribution of test cases facilitated a judicious use of the physician's review effort, a focus on the higher-impact recommendations and near-complete coverage of possible case scenarios.
The results suggest that the guideline rulebase contains the logic required to generate the optimal recommendation for all patients. This was possible because the guidelines were comprehensive and detailed, which allowed us to construct the explicit flowchart representation on which the guideline rulebase is based.
A limitation of the proposed approach is that the developed CDSS depends on the Pap report and may not be readily portable to other institutions that use different word patterns in their Pap reports. The CDSS also depends on the availability of other data elements in the EMR, such as a well-defined problem list and patient-provided information, which may not be present at other institutions. Nevertheless, we expect that the proposed approach can be applied to construct a system tailored to an individual hospital.
Another limitation of our study is that only one physician with expertise in the cervical cancer screening/management guidelines participated. To validate the system further, it is necessary to include other physicians with expertise in this domain to review the guideline flowchart and to evaluate the CDSS.
Nonetheless, the results indicate that the developed free-text processor for the Pap smear report was accurate, as a result of the standardized reporting of the Pap test. Evaluation revealed that the CDSS made the optimal screening recommendations for 73 of 74 patients and identified two cases for gynecology referral that were missed by the physician. The CDSS aided the physician to modify recommendations in six cases. The single failure occurred because HPV testing was sometimes performed separately from the Pap test and these results were reported in the laboratory system, which was not queried by the CDSS. Subsequently, the CDSS was corrected to include the HPV results from the laboratory system, and it generated the optimal recommendation for all patients. Given the high accuracy of the system, the authors consider it a suitable candidate for deployment in clinical practice.