Accurate diagnosis and early detection of complex disease has the potential to be of enormous benefit to clinical trialists, patients, and researchers alike. We sought to create a non-invasive, low-cost, and accurate classification model for diagnosing Parkinson’s disease risk to serve as a basis for future disease prediction studies in prospective longitudinal cohorts.
We developed a simple disease classifying model within 367 patients with Parkinson’s disease and phenotypically typical imaging data and 165 controls without neurological disease of the Parkinson’s Progression Marker Initiative (PPMI) study. Olfactory function, genetic risk, family history of PD, age and gender were algorithmically selected as significant contributors to our classifying model. This model was developed using the PPMI study then tested in 825 patients with Parkinson’s disease and 261 controls from five independent studies with varying recruitment strategies and designs including the Parkinson’s Disease Biomarkers Program (PDBP), Parkinson’s Associated Risk Study (PARS), 23andMe, Longitudinal and Biomarker Study in PD (LABS-PD), and Morris K. Udall Parkinson’s Disease Research Center of Excellence (Penn-Udall).
Our initial model correctly distinguished patients with Parkinson’s disease from controls at an area under the curve (AUC) of 0.923 (95% CI = 0.900 – 0.946) with high sensitivity (0.834, 95% CI = 0.711 – 0.883) and specificity (0.903, 95% CI = 0.824 – 0.946) in PPMI at its optimal AUC threshold (0.655). The model is also well-calibrated with all Hosmer-Lemeshow simulations suggesting that when parsed into random subgroups, the actual data mirrors that of the larger expected data, demonstrating that our model is robust and fits well. Likewise external validation shows excellent classification of PD with AUCs of 0.894 in PDBP, 0.998 in PARS, 0.955 in 23andMe, 0.929 in LABS-PD, and 0.939 in Penn-Udall. Additionally, when our model classifies SWEDD as PD, they convert within one year to typical PD more than would be expected by chance, with 4 out of 17 classified as PD converting to PD during brief follow-up; while SWEDD not classified as PD showed one conversion to PD out of 38 participants (test of proportions, p-value = 0.003).
This model may serve as a basis for future investigations into the classification, prediction and treatment of Parkinson’s disease, particularly those planning on attempting to identify prodromal or preclinical etiologically typical PD cases in prospective cohorts for efficient interventional and biomarker studies.
Please see the acknowledgements and funding section at the end of the manuscript.