Most genetic association studies conclude with a statement about how positive findings could be used to improve clinical outcomes, though the cost-effectiveness of genetic tests has received remarkably little attention in psychiatry (Perlis et al., 2005). With our base-case assumptions, which draw on a large-scale effectiveness study intended to mimic clinical practice, the incremental cost-effectiveness ratio (ICER) of a putative pharmacogenetic test is $93,520/QALY relative to the next best strategy of using an SSRI as first-level treatment for all subjects. While there is no accepted threshold below which interventions should be funded, one widely cited number, based on the cost-effectiveness of dialysis in chronic renal failure patients covered by Medicare, is $50,000/QALY (Winkelmayer et al., 2002). It has been noted that few interventions with cost-effectiveness ratios exceeding $100,000/QALY receive funding (Laupacis et al., 1992). Within psychiatry, a recent cost-effectiveness analysis suggested that a simple depression care program for employees led to an incremental cost-effectiveness ratio of $20,000 per QALY (Simon et al., 2006), consistent with other primary care quality-improvement programs yielding ratios less than $50,000 (Simon et al., 2001). Relative to these numbers, the genetic test, with our base-case assumptions, would not be considered cost-effective. Of course, as genotyping rapidly becomes a commodity, the cost of testing would likely fall substantially. In the extreme, where testing is free, the cost/QALY is ~$1,000, well within the range considered to be cost-effective. Notably, the magnitude of difference between QALYs resulting from the strategies examined is modest, and below the threshold suggested by some authors to represent clinically meaningful differences (Kaplan et al., 1993). On the other hand, given the prevalence and costliness of MDD, even modest differences in outcomes bear consideration by policymakers. In the subset of patients whose treatment is changed by testing, the initial response rate is increased by 5%. With a 0.25 QALY difference between a year of depression and a year of remission, this is arguably a clinically meaningful improvement.
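To make the comparison with funding thresholds concrete, the following minimal Python sketch computes an ICER as incremental cost divided by incremental QALYs and shows how the ratio falls as the price of the test falls. It is an illustration only: the function names and all numeric inputs (`QALY_DELTA`, `DOWNSTREAM_COST_DELTA`, the test prices) are hypothetical placeholders, not parameters of the model discussed here. Only the $50,000 and $100,000/QALY reference values are taken from the text above.

```python
def icer(incremental_cost, incremental_qalys):
    """Incremental cost-effectiveness ratio, in dollars per QALY gained."""
    if incremental_qalys <= 0:
        raise ValueError("this sketch only handles a positive QALY gain")
    return incremental_cost / incremental_qalys


def icer_for_test_price(test_price, downstream_cost_delta, qaly_delta):
    """ICER of test-guided treatment versus treating everyone the same way.

    Incremental cost = price of the test + change in all other costs.
    """
    return icer(test_price + downstream_cost_delta, qaly_delta)


# Reference values cited in the text (dollars per QALY).
THRESHOLDS = {"dialysis benchmark": 50_000, "rarely funded above": 100_000}

# Hypothetical inputs, for illustration only (NOT the model's parameters).
QALY_DELTA = 0.005            # QALYs gained per patient tested
DOWNSTREAM_COST_DELTA = 10.0  # change in non-test costs per patient, dollars

if __name__ == "__main__":
    for price in (500.0, 250.0, 100.0, 0.0):
        ratio = icer_for_test_price(price, DOWNSTREAM_COST_DELTA, QALY_DELTA)
        below = [name for name, limit in THRESHOLDS.items() if ratio < limit]
        print(f"test price ${price:>6.0f}: ICER ${ratio:>9,.0f}/QALY; below: {below}")
```

Because the QALY gain sits in the denominator and is small, the ratio is very sensitive to the test price, which is the pattern described above for the free-test scenario.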
With base-case assumptions, we found that a pharmacogenetic test for antidepressant response could be considered cost-effective only if its odds ratio was ≥1.5. Clinical researchers could apply several strategies to identify such cost-effective tests. First, incorporating multiple informative loci will likely be necessary to achieve this threshold: recent genomewide association studies of antidepressant response indicate that individual loci are likely to exert only modest effects (Hamilton, 2007). Second, more effective tests could incorporate other putative clinical predictors such as those identified in the STAR*D study (Trivedi et al., 2006). Addition of clinical predictors would simply be reflected in better test performance (i.e., greater effect sizes).
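The relationship between an odds ratio and the response rate it implies, and the reason several modest loci may be needed to reach an OR of 1.5, can be sketched as follows. This is a hedged illustration, not the method used in the analysis: it assumes that loci act independently and multiplicatively on the odds scale (a simplifying assumption that may not hold), and the per-locus OR of 1.15 and baseline response rate of 0.5 are hypothetical values chosen for arithmetic convenience.

```python
import math


def response_prob_with_or(baseline_prob, odds_ratio):
    """Response probability after applying an odds ratio to a baseline rate."""
    baseline_odds = baseline_prob / (1.0 - baseline_prob)
    new_odds = baseline_odds * odds_ratio
    return new_odds / (1.0 + new_odds)


def combined_odds_ratio(per_locus_ors):
    """Combined OR under the ASSUMPTION of independent, multiplicative loci."""
    return math.exp(sum(math.log(o) for o in per_locus_ors))


def loci_needed(per_locus_or, target_or):
    """Smallest number of equal-effect loci whose combined OR reaches target."""
    return math.ceil(math.log(target_or) / math.log(per_locus_or))


if __name__ == "__main__":
    target = 1.5          # threshold from the text
    per_locus = 1.15      # hypothetical modest single-locus effect
    n = loci_needed(per_locus, target)
    combined = combined_odds_ratio([per_locus] * n)
    print(f"{n} loci at OR {per_locus} give a combined OR of {combined:.3f}")
    print(f"response rate at OR {target} from a 50% baseline: "
          f"{response_prob_with_or(0.5, target):.2f}")
```

Under these assumptions, clinical predictors could be treated the same way, as additional multiplicative terms that raise the combined effect size.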
An alternate strategy would rely on tests informative about multiple treatment strategies: rather than focusing solely on SSRIs, a test which was also informative about common alternative strategies could be more cost-effective. To date, few antidepressant pharmacogenetic studies include such non-SSRI comparators and describe specific predictors for the alternate strategy. Similarly, the incorporation of predictors of adverse effects could offer another strategy for designing cost-effective tests. While modern antidepressants are quite safe and generally well-tolerated, many patients do discontinue treatment prematurely. A number of recent reports suggest that it may be possible to predict specific adverse effects (Laje et al., 2007; Perlis et al., 2003; Perlis et al., 2007).
Our results underscore the importance of understanding pharmacogenetic test performance in the population in which it is being applied. While this is true in general for any test, it becomes particularly important given the known wide variation in allele frequencies between racial groups (2003). Apart from individual studies in Southeast Asian or Latino populations (Kim et al., 2006; Wong et al., 2006), the vast majority of association studies for antidepressant responsiveness focus on Caucasians. Notably, the ‘beneficial’ allele for the test considered here is less prevalent among African-Americans (McMahon et al., 2006). Our results demonstrate that the cost-effectiveness of such tests is critically dependent upon the effect size, and test probabilities, in the target population, suggesting that more representative cohorts will be required to determine the true utility of pharmacogenetic tests.
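One way to see why test probabilities depend on the population is to convert an allele frequency into expected genotype frequencies. The sketch below does this under the assumption of Hardy-Weinberg equilibrium, which is a textbook simplification and not something established for the cohorts discussed here. The two allele frequencies are hypothetical and are not estimates for any real population.

```python
def genotype_frequencies(allele_freq):
    """Expected genotype frequencies under Hardy-Weinberg equilibrium.

    allele_freq is the population frequency of the allele of interest.
    """
    if not 0.0 <= allele_freq <= 1.0:
        raise ValueError("allele_freq must lie between 0 and 1")
    p = allele_freq
    q = 1.0 - p
    return {"two_copies": p * p, "one_copy": 2.0 * p * q, "no_copies": q * q}


if __name__ == "__main__":
    # Hypothetical allele frequencies, for illustration only.
    for label, freq in (("population A", 0.6), ("population B", 0.4)):
        g = genotype_frequencies(freq)
        print(f"{label}: allele freq {freq:.2f} -> "
              f"two copies {g['two_copies']:.2f}, "
              f"one copy {g['one_copy']:.2f}, "
              f"no copies {g['no_copies']:.2f}")
```

In this illustration, a modest shift in allele frequency changes the share of patients carrying two copies from 36% to 16%, so the fraction of patients whose treatment a test would alter can differ substantially between populations even if the per-genotype effect size were identical.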
We note several caveats in interpreting our base-case results. Most importantly, our estimates rely on numerous assumptions about model parameters which are imprecise and likely to vary across clinical settings. However, a strength of this study is that it closely follows results of one of the largest antidepressant-effectiveness studies completed to date. Not only was that study designed to mirror clinical practice, but it took place in both primary care and specialty psychiatric clinics, suggesting our results can be informative about 'real-world' treatment of MDD (Rush et al., 2004). Many of the parameters not drawn from STAR*D were previously utilized in a cost-effectiveness model of an employer-based depression intervention (Wang et al., 2006) which was later validated in a prospective study (Wang et al., 2007). This model can thus be understood in terms of 'how might STAR*D outcomes have differed if initial treatment assignment had been determined by a genetic test', assuming a standard set of next-step interventions. Of note, our results likely underestimate the ‘true’ cost-effectiveness of the intervention because, as with most such analyses, we do not include the costs to caregivers or other family members (Weinstein et al., 1996).
We emphasize that, although we utilized an existing genetic finding as our base case, the general model can be applied to any pharmacogenetic test of antidepressant response. The code for this model is available at [website]; simply substituting the appropriate test parameters allows cost-effectiveness of that test to be estimated. The first marketed pharmacogenetic test to be advocated for antidepressant prescribing is actually one which examines cytochrome P450 variation (Somanath et al., 2002), though it has previously been suggested that such a test is likely to have little impact on general antidepressant prescribing (Perlis, 2007). Similarly, a serotonin transporter promoter insertion/deletion polymorphism is the genetic variation most often associated with antidepressant responsiveness, albeit in small cohorts (Serretti et al., 2006). However, the specificity of its effect is not well characterized, and the largest cohort to date did not detect an association with treatment response, though incorporating an additional polymorphism did identify some association with overall citalopram tolerability (Hu et al., 2007). Therefore, we focused on the HTR2A variation because it was replicated in a split-sample design and exerts a well-defined impact on response. We note that, even if it represents a true association, its effect size is almost certainly less than that estimated here, based upon the phenomenon of the ‘winner’s curse’, or regression to the mean. Future pharmacogenetic tests will almost certainly incorporate multiple markers drawn from genomewide association studies, but the basic principles of our model can be applied regardless of the type or scale of the genetic test.
We also made the simplifying assumption that the HTR2A-based test is not informative about response to other (non-SSRI) treatments. While this appears to be the case in STAR*D (Laje and McMahon, unpublished data) and other cohorts (Perlis, unpublished data), these effects have not been fully characterized. In general, this assumption would yield an optimistic estimate in the base case, and underscores the need to understand not simply predictors of nonresponse to a given treatment, but the specificity of such predictors, prior to their clinical application. That is, it will be important to characterize not only predictors of differential response to a single treatment, but also the effects of such predictors on response to alternative treatment strategies.
Third, we included two primary types of states in our state-transition model, 'depressed' and 'well', either on- or off-antidepressant. The gradation of treatment responses in depression is well known: based upon STAR*D, roughly one-third of patients improve with treatment but do not reach symptomatic remission (Trivedi et al., 2006). The effect of including individuals who improve but do not remit in the 'depressed' state would be to dilute the disutility of depression, rendering our model overly conservative. On the other hand, the significant impact of continued depressive symptoms even among those who improve with treatment is well-documented (Wells et al., 2007).
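The two-state structure described above can be illustrated with a minimal cohort simulation. This is a sketch under stated assumptions, not the model used in the analysis: it omits the on- and off-antidepressant distinction, costs, and discounting, and every transition probability is hypothetical. The two utility values are also hypothetical and were chosen only so that their difference matches the 0.25 QALY gap per year between depression and remission mentioned earlier.

```python
def cohort_qalys(p_remit, p_relapse, n_cycles, cycle_years,
                 utility_well, utility_depressed):
    """Expected QALYs per patient for a cohort that starts fully depressed.

    p_remit:   per-cycle probability of moving from 'depressed' to 'well'
    p_relapse: per-cycle probability of moving from 'well' to 'depressed'
    """
    for name, prob in (("p_remit", p_remit), ("p_relapse", p_relapse)):
        if not 0.0 <= prob <= 1.0:
            raise ValueError(f"{name} must be a probability")
    depressed, well = 1.0, 0.0
    total = 0.0
    for _ in range(n_cycles):
        # Credit utility for the time spent in each state during this cycle.
        total += (depressed * utility_depressed + well * utility_well) * cycle_years
        # Move the cohort between the two states for the next cycle.
        depressed, well = (
            depressed * (1.0 - p_remit) + well * p_relapse,
            depressed * p_remit + well * (1.0 - p_relapse),
        )
    return total


if __name__ == "__main__":
    # Hypothetical parameters: quarterly cycles over three years.
    common = dict(p_relapse=0.05, n_cycles=12, cycle_years=0.25,
                  utility_well=0.85, utility_depressed=0.60)
    usual = cohort_qalys(p_remit=0.30, **common)
    guided = cohort_qalys(p_remit=0.35, **common)
    print(f"usual care:  {usual:.4f} QALYs")
    print(f"test-guided: {guided:.4f} QALYs")
    print(f"difference:  {guided - usual:.4f} QALYs")
```

Adding an intermediate 'improved but not remitted' state with its own utility would be a natural extension of this structure, at the cost of additional transition parameters.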
A further simplification was the requirement that nonremitters have their treatment switched, rather than augmented (Fava, 2001). In routine outpatient settings, augmentation is generally a much less common strategy, particularly in primary care. A survey of clinicians suggested that substantial variation exists in their preference for treatment sequence (Petersen et al., 2002), perhaps because prior to STAR*D there were few controlled data bearing on the efficacy of augmentation in general (Fava, 2001).
Notwithstanding these caveats, our results suggest a means for evaluating future pharmacogenetic tests in psychiatry. To our knowledge, only one previous report addressed the value of such psychiatric tests in terms of cost-effectiveness; in that study, we found that a test for clozapine responsiveness could be cost-effective under certain conditions (Perlis et al., 2005). As with pharmacotherapy, determining the true cost-effectiveness of a diagnostic intervention will require either large randomized prospective studies or retrospective assessment of large clinical populations. With the substantial public interest in personalized medicine, pressure will be great to quickly translate tests to clinical practice. Our results suggest that the cost-effectiveness of these tests can be modeled in a straightforward fashion, allowing necessary test parameters and integration with treatment algorithms to be carefully considered prior to implementation.