A total of 23 programs were identified during our initial search. After the application of the exclusion criteria, 11 programs were excluded because of specialty-specific focus (see online Appendix 3 for all excluded programs). Another eight programs were excluded after an initial review for reasons that included inability to compare diagnoses, inability to enter two symptoms or characteristics, a static tree structure with cross-linking of internal reference points, and no ranking of the diagnoses. Four programs were reviewed fully with the evaluation criteria listed in Table . The general information for each of the programs is listed in Table . Information regarding data elements available for input and input methods is listed in Table , and information regarding DDX content sources is listed in Table .
Knowledge regarding the mechanism of generating the DDX results is limited to the information shared by the vendors. For DiagnosisPro® the underlying logic was not specified. The diagnoses are presented in disease categories. The results are not rank-ordered by disease prevalence or other criteria, and the program offers no advice on how to further refine the suggestions. These factors limited the program’s usefulness. One differentiating feature is that DiagnosisPro® progressively truncates the list of suggestions as additional findings are entered. In contrast, the other generators re-prioritize their lists as findings are added, but the lists remain large.
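The contrast between truncating and re-prioritizing a list can be illustrated with a minimal sketch. The knowledge base, the function names, and the matching rule below are all hypothetical; none of the vendors disclosed their logic at this level, so this shows only the general behavior described above.

```python
# Hypothetical toy knowledge base: disease -> set of associated findings.
KNOWLEDGE_BASE = {
    "pneumonia": {"fever", "cough", "dyspnea"},
    "pulmonary embolism": {"dyspnea", "chest pain", "tachycardia"},
    "influenza": {"fever", "cough", "myalgia"},
}


def truncate(findings):
    """Keep only diseases consistent with every entered finding.

    Each added finding can only shrink the list (assumed behavior).
    """
    return sorted(d for d, f in KNOWLEDGE_BASE.items() if findings <= f)


def reprioritize(findings):
    """Keep every disease matching at least one finding, best match first.

    Adding findings reorders the list but rarely shortens it.
    """
    scored = [(len(findings & f), d) for d, f in KNOWLEDGE_BASE.items()]
    ordered = sorted(scored, key=lambda pair: (-pair[0], pair[1]))
    return [d for n, d in ordered if n > 0]


if __name__ == "__main__":
    entered = {"fever", "dyspnea"}
    print(truncate(entered))      # shorter list
    print(reprioritize(entered))  # longer, reordered list
```

In this toy example, entering a second finding leaves a single candidate under truncation, whereas re-prioritization still returns all three diseases in a new order.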
DXPlain® rank-ordered results from most to least likely within two categories, common and rare diseases, based on disease prevalence. The mechanism is presumed to be a proprietary algorithm, based on the description that follows. An importance rank is given based on the criticality of the potential diagnosis. Findings are assigned two attributes: one relating to the frequency of the finding in the disorder, and one expressing how strongly it suggests that disease. Ranking is related to findings that are both important and suggestive of a disorder. The rank of a given disease will be lowered if findings commonly seen in the disease are stated to be absent. The attributes are used to generate an ordered list of diagnoses associated with some or all of a given set of findings. Of note, DXPlain® allows occupation as a finding, accepts the input of negative findings such as “no fever,” and has a side-by-side disease comparison feature. The program displays supportive findings and guides the user to other findings which, if present, support or refute the disease.
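A ranking scheme of the kind described, with two attributes per finding and a penalty for absent findings, might be sketched as follows. This is not the vendor's algorithm, which is proprietary: the 1-5 scales, the additive scoring rule, the disease data, and the names `Link`, `score`, and `rank` are all assumptions made for illustration, and the importance (criticality) rank is not modeled.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Link:
    frequency: int  # assumed 1-5 scale: how often the finding occurs in the disease
    evoking: int    # assumed 1-5 scale: how strongly the finding suggests the disease


# Hypothetical disease profiles, invented for this example.
DISEASES = {
    "influenza": {"common": True,
                  "links": {"fever": Link(5, 2), "myalgia": Link(4, 2)}},
    "pneumonia": {"common": True,
                  "links": {"fever": Link(4, 2), "cough": Link(5, 3)}},
    "psittacosis": {"common": False,
                    "links": {"fever": Link(4, 1), "bird exposure": Link(4, 5)}},
}


def score(links, present, absent):
    """Add evoking strength for present findings; subtract frequency for absent ones."""
    support = sum(links[f].evoking for f in present if f in links)
    penalty = sum(links[f].frequency for f in absent if f in links)
    return support - penalty


def rank(present, absent=frozenset()):
    """Return ranked disease names, split into common and rare lists."""
    groups = {"common": [], "rare": []}
    for name, disease in DISEASES.items():
        links = disease["links"]
        if any(f in links for f in present):  # needs at least one supporting finding
            key = "common" if disease["common"] else "rare"
            groups[key].append((score(links, present, absent), name))
    return {key: [n for s, n in sorted(rows, key=lambda r: (-r[0], r[1]))]
            for key, rows in groups.items()}


if __name__ == "__main__":
    print(rank({"fever", "cough"}))
    print(rank({"fever"}, absent={"cough"}))
```

Under these assumed weights, stating that cough is absent drops pneumonia below influenza, which mirrors the described lowering of rank when a commonly seen finding is reported absent.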
Isabel© was the only program to accept natural language queries and the only product allowing the user to input all of the key findings at once. The program uses a “natural language processing” search engine to match entered clinical features with similar terms in the diagnostic data set. Each diagnosis has a complete description of the clinical features with the differential ranked by the strength of the match to the entered clinical features. With each clinical feature addition, the differential diagnostic output reconfigures the list, taking into account all the clinical features entered. Isabel has links to databases, knowledge sources and validation studies.
PEPID™ lists diagnoses based on a proprietary scoring system related to the number of selected signs/symptoms consistent with each potential diagnosis. Additionally, each sign/symptom is assigned a unique score/weight relative to its importance in differentiating among specific diagnoses. Classic or cardinal disorders, in which selections strongly suggest a disease or are pathognomonic, are indicated, as are critical diagnoses with immediate threat to life or limb. Worthy of note is that the overall PEPID™ product, of which the DDX generator is only one piece, incorporates a laboratory testing manual, a drug interactions generator, a drug database covering 7,500 drugs, approximately 400 interactive clinical calculators, an IV compatibility tool, an acute care / life support reference section, and 700 evidence-based topics (primary care module).
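A weighted-count score with cardinal and critical flags could look like the sketch below. The actual scoring system is proprietary, so the weights, the threshold `CARDINAL_WEIGHT`, the profile data, and the function `rank_by_weight` are hypothetical stand-ins for the behavior described above.

```python
# Hypothetical sign/symptom weights per diagnosis, invented for illustration.
PROFILES = {
    "aortic dissection": {
        "critical": True,
        "weights": {"chest pain": 2, "tearing back pain": 5, "pulse deficit": 4},
    },
    "myocardial infarction": {
        "critical": True,
        "weights": {"chest pain": 3, "diaphoresis": 2, "nausea": 1},
    },
    "costochondritis": {
        "critical": False,
        "weights": {"chest pain": 2, "chest wall tenderness": 4},
    },
}

# Assumed cut-off: a matched sign at or above this weight marks a cardinal presentation.
CARDINAL_WEIGHT = 5


def rank_by_weight(selected):
    """Score each diagnosis by the summed weights of the selected signs it matches."""
    rows = []
    for name, profile in PROFILES.items():
        matched = {s: w for s, w in profile["weights"].items() if s in selected}
        if not matched:
            continue  # no selected sign is consistent with this diagnosis
        rows.append({
            "diagnosis": name,
            "score": sum(matched.values()),
            "critical": profile["critical"],
            "cardinal": max(matched.values()) >= CARDINAL_WEIGHT,
        })
    return sorted(rows, key=lambda r: (-r["score"], r["diagnosis"]))


if __name__ == "__main__":
    for row in rank_by_weight({"chest pain", "tearing back pain"}):
        print(row)
```

Keeping the critical and cardinal flags separate from the score means a life-threatening diagnosis can still be highlighted even when it does not rank first.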
None of the vendors provided unfettered access to institutional library resources or PubMed LinkOut for full text from subscribed content, although both Isabel and DxPlain® do provide PubMed searching. DiagnosisPro® and Isabel report that they integrate with major EHR vendor products to some degree, but we did not test the ability to integrate any of the products into an EHR. It is noteworthy that DiagnosisPro® has English, French, Spanish, and Chinese interfaces.
Aggregated results and mean scores (with 95% confidence intervals) from entering published cases into each of the differential diagnosis generators are shown in Table . Isabel© and DxPlain® performed well, each with a mean score of 3.45. Post-hoc analysis with correction for multiple comparisons revealed that only the difference between DxPlain® and PEPID™ reached statistical significance (P = 0.04, mean score difference 1.75, 95% C.I. 0.05 to 3.45). None of the generators included the correct diagnosis for two of the MKSAP cases (acquired von Willebrand’s disease related to aortic stenosis, and metformin-induced peripheral neuropathy). Certain returned suggestions, such as “pancreatitis” for autoimmune pancreatitis and “cardiomyopathy” for methamphetamine-induced cardiomyopathy, were scored only “3” (or “might have been helpful”) because the broad category of diagnosis was clear from the presentation and the DDX generator did not help elucidate the root cause. Compared with the three other generators, which appeared to have large vocabularies, PEPID™ was unable to recognize many of the key findings. The number of exact matches was DiagnosisPro®
9, and PEPID™