A major challenge in the management of RA is predicting long-term response to therapy. At present, only limited information is available to determine which factors, if any, predict a good long-term response. The ability to predict achievement of LDA at 1 year would enable physicians to tailor treatment early in the course of therapy, thereby improving outcomes and potentially minimizing patient exposure to ineffective therapies. Furthermore, achieving LDA is a treatment goal supported by the recent EULAR guidelines (24).
Using clinically applicable models, we show that, 12 weeks after the start of CZP therapy, we could accurately classify the vast majority (~88%) of patients from RAPID 1, a study population with mainly high baseline disease activity, as likely to achieve or not achieve LDA at 1 year. Across several prediction models, we also found that patients with an early response to treatment at Weeks 4, 6 and 8 had an even greater likelihood of achieving LDA at 1 year. Approximately 12–25% of patients could not be classified accurately at Week 12 and needed longer treatment before the likelihood of achieving LDA at Week 52 could be determined with a high degree of certainty. At least for the types of patients enrolled in RAPID 1, our results identify, with a relatively high degree of accuracy, which patients can likely be switched at 12 weeks if they are predicted to be nonresponders (i.e., patients having a very low predicted likelihood of achieving LDA). Indeed, these data highlight the possibility of using such models as a negative prediction tool that identifies patients unlikely to achieve LDA; this is perhaps the patient population a treating physician would most like to identify early so that treatment can be altered. In our best-performing model, Model 2, this prediction was made with 90% accuracy. Whether a 90% level of certainty, or a similar level, gives physicians enough confidence to make treatment changes at 12 weeks is a matter of individual judgment. Other factors, including patient and physician preferences and access to alternative therapies, are also likely to play important roles in the decision to switch RA treatments (27).
Results using an alternate model in which DAS was replaced by CDAI (CART Model 3) had somewhat lower discrimination and accuracy than Model 1. This lower performance was likely to be a consequence of only moderate correlation between the CDAI (used for the predictor variables) and the DAS28 (used for the outcome). Performance of this model would likely have been better if we had used CDAI to define LDA, rather than the DAS28. Finally, CART Model 4 (similar to Model 1, but with a composite outcome [LDA and/or ACR50] at 1 year) had the numerically best discrimination of all prediction models.
When comparing results across the models, we see a clear trade-off between the accuracy of prediction and the proportion of patients who could be classified at Week 12. In the a-priori model (), for example, the misclassification rate for patients predicted to be nonresponders was very low (6%). However, only 23% of eligible RAPID 1 patients could be classified with that high level of accuracy. The a-priori model also had suboptimal discrimination and calibration for the entire study population. Using the CART-based, data-driven approach, the misclassification rate for patients predicted to be nonresponders in Model 1 () was somewhat higher at 14%, but more than twice as many patients (54%) could be classified. Results from Model 2 () were even better, with a misclassification rate of only 10% for a group of patients who comprised 46% of the eligible RAPID 1 population. These data illustrate the trade-off between the certainty of classification and the proportion of all patients who can be classified.
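The trade-off described above can be made concrete with a small calculation. The following is a minimal sketch, not the analysis used in this study: it assumes each patient has a model-predicted probability of achieving LDA and a known 1-year outcome, and it treats patients at or below a chosen probability cutoff as predicted nonresponders. The function name `nonresponder_tradeoff`, the cutoffs, and the toy values are all hypothetical and are not RAPID 1 data.

```python
from typing import List, Tuple


def nonresponder_tradeoff(
    probs: List[float], achieved_lda: List[bool], cutoff: float
) -> Tuple[float, float]:
    """Return (proportion classified, misclassification rate).

    Patients whose predicted probability of LDA is <= cutoff are
    classified as predicted nonresponders. The misclassification rate
    is the fraction of those patients who nevertheless achieved LDA.
    """
    flagged = [outcome for p, outcome in zip(probs, achieved_lda) if p <= cutoff]
    if not flagged:
        return 0.0, 0.0
    proportion_classified = len(flagged) / len(probs)
    misclassification_rate = sum(flagged) / len(flagged)
    return proportion_classified, misclassification_rate


# Hypothetical toy values for illustration only (not study data).
toy_probs = [0.02, 0.05, 0.08, 0.10, 0.15, 0.20, 0.30, 0.45, 0.60, 0.80]
toy_outcomes = [False, False, False, False, True, False, False, True, True, True]

for cutoff in (0.10, 0.30):
    share, error = nonresponder_tradeoff(toy_probs, toy_outcomes, cutoff)
    print(f"cutoff={cutoff:.2f} classified={share:.0%} misclassified={error:.0%}")
```

In this toy example, a stricter cutoff classifies fewer patients but with fewer errors, whereas a looser cutoff classifies more patients at the cost of a higher misclassification rate, which mirrors the pattern seen across the models.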
Prior RA data evaluating single predictors at a fixed time point indicate that the level of disease activity at baseline and after the first 3 months of treatment is significantly related to the level of disease activity at 1 year () (31). While probability plots offer important insights, they have limitations: they provide probabilities for only 1 specific time point and 1 variable. To improve upon this, using data collected within the first 12 weeks of CZP therapy, we constructed several models using CART to predict which patients would achieve LDA (DAS28 ≤3.2) at 1 year. The benefit of this approach is that it allows for inclusion of multiple predictors and allows patients to be further classified based upon whether they had a very early response (4–6 weeks). Furthermore, these early time points reflect visit intervals at which a patient could be reasonably assessed in clinical practice; measuring predictor variables of response at shorter time points (e.g., 2 weeks after initiating anti-TNF therapy) may not be realistically feasible outside of controlled trials.
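For readers unfamiliar with CART, the core step is choosing the cut point on a predictor that best separates patients by outcome. The sketch below shows one common way this is done, by minimizing weighted Gini impurity on a single predictor; a full tree repeats this step across several predictors and across the resulting subgroups. It is an illustration under stated assumptions, not the models reported here: the function names `gini` and `best_split`, the variable names, and the toy DAS28 values are hypothetical.

```python
from typing import List, Optional, Tuple


def gini(labels: List[int]) -> float:
    """Gini impurity of a group of binary outcomes (1 = achieved LDA)."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2 * p * (1 - p)


def best_split(values: List[float], labels: List[int]) -> Tuple[Optional[float], float]:
    """Find the cut point on one predictor with the lowest weighted Gini impurity."""
    best_cut: Optional[float] = None
    best_score = float("inf")
    distinct = sorted(set(values))
    n = len(labels)
    # Candidate cut points are midpoints between adjacent distinct values.
    for low, high in zip(distinct, distinct[1:]):
        cut = (low + high) / 2
        left = [y for x, y in zip(values, labels) if x <= cut]
        right = [y for x, y in zip(values, labels) if x > cut]
        score = len(left) / n * gini(left) + len(right) / n * gini(right)
        if score < best_score:
            best_cut, best_score = cut, score
    return best_cut, best_score


# Hypothetical Week 12 DAS28 values and 1-year LDA outcomes (not study data).
das28_week12 = [2.8, 3.0, 3.4, 3.9, 4.5, 5.1, 5.6, 6.2]
lda_at_1_year = [1, 1, 1, 1, 0, 1, 0, 0]

cut, score = best_split(das28_week12, lda_at_1_year)
print(f"best cut on Week 12 DAS28: {cut:.2f} (weighted Gini {score:.4f})")
```

Because each split is a simple threshold on a clinical measure, the resulting tree can be read as a sequence of yes/no questions, which is part of what makes this kind of model practical at the point of care.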
This study has several limitations. Despite the use of a random split-sample methodology, made possible by the large RAPID 1 dataset, with separate training and testing datasets, the potential for overfitting (i.e., the model fails to provide accurate predictions when applied to new subjects/datasets) remains. Furthermore, the model accuracy reported herein may be specific to the biologic or biologic class examined (CZP, or perhaps only anti-TNF therapies), or applicable only to the types of patients recruited to RAPID 1 (i.e., those with high disease activity and established RA). To address this generalizability issue, we replicated a recently published decision tree prediction model, derived from RA patients treated with etanercept in the TEMPO trial, that used clinical assessments at Week 12 and earlier to predict LDA at 1 year (33). The prediction model built from the TEMPO data appeared to perform similarly well using RAPID 1 data. Based upon this empiric example, we would suggest that the prediction models represented in our results (which had much more data than in the previous analysis (33)) will perform well for established RA patients with high disease activity treated with other anti-TNF agents. Additional replication studies, in RA patient populations receiving differing drug treatments, will be useful to confirm whether these models can be applied more broadly. However, we suspect that different prediction models will be required for patients with early RA and for those who start with lower disease activity than RAPID 1 patients.
Despite these caveats, our results suggest that it may be possible to develop predictive tools that are user friendly and could be easily and consistently applied in clinical practice. Using an analytic framework like CART, biomarker and pharmacogenetic information could be added to prediction tools and would likely complement clinical assessments to prospectively guide the management of individual RA patients. Designing a trial around the concept of predicting response at 12 weeks or earlier, and altering therapy for patients predicted to be nonresponders, would be an optimal approach and should be tested. Prediction models similar to the type we have developed also have the potential to improve treat-to-target approaches (24) by identifying groups of patients who should be switched to an alternate strategy more quickly, which ultimately could improve patient outcomes.
SIGNIFICANCE AND INNOVATIONS
- Classification and regression trees (CART) have been successfully utilized in a number of therapeutic areas to categorize disease state and to determine the likelihood of future events.
- The ability to quickly predict achievement of outcomes in patients with rheumatoid arthritis (RA) at 1 year would enable physicians to tailor treatment early during the course of therapy.
- Using CART, it was possible to predict 1-year response within the first 12 weeks of starting therapy with CZP.