For many disorders, patient heterogeneity requires physicians to customize their treatment to each patient’s needs. We test for the existence of customization in physicians’ prescribing for bipolar disorder, using data from a naturalistic clinical effectiveness trial of bipolar disorder treatment (STEP-BD), which did not constrain physician prescribing. Multinomial logit is used to model the physician’s choice among five combinations of drug classes. We find that our observed measure of the patient’s clinical status played only a limited role in the choice among drug class combinations, even for conditions such as mania that are expected to affect class choice. However, treatment of a patient with given characteristics differed widely depending on which physician was seen. The explanatory power of the model was low. There was variation within each physician’s prescribing, but the results do not suggest a high degree of customization in physicians’ prescribing, based on our measure of clinical status.
In the past two decades, there has been an acceleration in new drug approvals for many illnesses including mental disorders. As a result, many new medications and entire new classes have emerged to treat depression and other mood disorders. This rapid innovation has the potential to benefit patients, if the new drugs are adopted and used appropriately by physicians.
However, the very pace of new drug introductions has challenged physicians, who face considerable cognitive burdens as they struggle to sort out the relative merits of a growing number of medication choices. This complexity is heightened when, as in various drug classes, there is no one medication that is best for all, but rather different medications work better for different patients. In these situations one might expect the physician to ‘customize’ prescribing according to each patient’s needs. In the case of prostate cancer, one recent study estimated that identifying cost-effective treatments at the individual level would be worth $70 million per year in the US, much more than the value of identifying treatments that are cost-effective on average (Basu and Meltzer, 2007).
Successful customization places substantial informational requirements on the physician, who must both learn the attributes of the various drugs (e.g. benefits and risks) and learn to recognize patient characteristics relevant to drug choice (e.g. clinical moderators that support the use of one drug over another; Kraemer et al 2002). Various authors have questioned how likely full customization is to occur, and speculated that physicians are more likely to apply rules of thumb, heuristics and other less informationally demanding approaches to choosing among medications (Smoller and Nierenberg 1999, Frank and Zeckhauser 2007).
The present paper seeks to shed light on these issues by examining prescribing patterns for bipolar disorder, in a naturalistic clinical trial where physicians’ prescribing was not constrained. We examine the physician’s choice of which drug classes to use in treating a new patient. We ask two basic questions. First, does the drug class chosen vary strongly with our observed measure of the patient’s clinical status, as should happen with customization? Second, do patients with the same age, sex and clinical status receive different drug class combinations depending on which physician they see? The aim is to see whether the patterns observed are consistent with customization, or with physician-specific norms applied across patients.
For many diseases (including bipolar disorder), conditions exist which might seem to favor customization in drug prescribing. Often there is heterogeneity of response across patients, making it difficult to predict ex ante which drug will work for which patient. For chronic diseases, the physician has the opportunity to discover over time which drug works best for a given patient, say, by use of a Bayesian learning process. Various observers have described the existence of trial-and-error in drug selection within a given class, including for schizophrenia (Leslie and Rosenheck, 2002) and depression (Berndt et al, 2002).
On the other hand, various authors have noted considerations that work against a fully Bayesian learning process. Denig and Haaijer-Ruskamp (1992) suggest that physicians typically select medications from an ‘evoked set’ of more familiar ones, rather than the full universe, and the selection is often based on habit or rules of thumb, not case-by-case considerations. Smoller and Nierenberg (1999) argue that physicians may seek to avoid regret by comparing the worst possible outcome of each choice and choosing the treatment with the least bad worst-case (this is not the same as maximizing expected utility). Also, physicians may seek to avoid the ambiguity of a large complex choice set by ‘thinking categorically about the risk of adverse events’ (Nierenberg et al, 2008). Smoller and Nierenberg cite a study that found that physicians may become less confident in their choice, the more drugs there are available to choose from. Finally, they argue that physicians may be over-influenced by dramatic types of information, privileging small-sample case reports over larger clinical trials that come later. This could result from ‘status quo bias’, whereby once a physician has adopted a drug, inertia sets in and less attention is paid to subsequent evidence that should induce switching. These attitudes could lead physicians to rely on too few drugs, and to be unwilling to switch despite poor response.
Frank and Zeckhauser (2007) propose a continuum along which to measure the extent of customization. At one extreme is ‘hyper-rationality’, where the physician optimizes care for each patient given the available information. At the other extreme is ‘My Way’ behavior, where a physician will regularly prescribe a therapy that is quite different from what other physicians would choose. This may occur because the physician places excessive weight on her own experience as opposed to research literature. Between those extremes are other possibilities, for example selection of therapies based on what works for an ‘average patient’ rather than tailored to a specific patient. However, those norms may themselves be chosen rationally, and indeed may perform better than full customization, if the latter imposes high costs in terms of cognition, coordination etc. Frank and Zeckhauser label this behavior as ‘Sensible Use of Norms’.
There is little direct evidence in either direction as to how far physicians are customizing their prescriptions to patients’ responses. When surveyed about their choice of antidepressant, psychiatrists reported paying attention to patient-specific factors such as side effects, and prior treatment history (Zimmerman et al., 2004). However, Frank (2007) notes findings that even when an insurer created large copayment differentials among drugs which their physicians viewed as interchangeable, patients did not switch. He interprets this as inertia in prescribing behavior, and suggests status quo bias on the part of patients as a possible explanation. Henke et al (2009) found relatively low rates of treatment adjustment for depression patients who were not responding to the initially selected treatment. They describe this pattern as ‘clinical inertia’.
Bipolar disorder is a severe, chronic and costly disorder that causes critical disruptions in mood, and impairs functioning in multiple life domains, in particular psychosocial and occupational (Judd et al., 2008). It commonly co-occurs with serious psychiatric (Merikangas et al., 2007) and/or medical conditions (McIntyre et al., 2007; Kilbourne et al., 2004), and is associated with increased risk of comorbid substance abuse (Vornik and Brown, 2006; Ostacher and Sachs, 2006) and suicide (McElroy et al., 2006). The Global Burden of Disease Study found bipolar disorder to be the sixth leading cause of years lived with disability worldwide (Murray and Lopez, 1996; Fleishman, 2003). One analysis estimated that the lifetime discounted cost for persons with onset of bipolar disorder in 1998 in the US was $24 billion (Begley et al., 2001). Another study calculated the one-year costs for prevalent cases in the US in 1991 (approximately 2 million individuals). These were estimated at $45 billion, of which $7 billion were for direct costs and the remainder were indirect (Wyatt and Henter, 1995).
Approximately 5.7 million adults in the US have bipolar disorder, which represents about 2.6% of the population age 18 and older (Kessler et al., 2005). The figure is likely higher in light of reports that many individuals remain undiagnosed (Mantere et al., 2004, 2008). In addition, a substantial minority of individuals diagnosed with recurrent depression is found to have disorders falling within the bipolar spectrum, which is a broader definition of bipolarity with reported prevalence rates of at least 5% (Akiskal et al., 2000).
Bipolar disorder is an episodic illness characterized by recurrent periods of mania, depression or mixed states (co-occurring mania and depression), with symptoms typically remitting and relapsing continuously over time. Mania is a condition characterized by periods of persistently and abnormally elevated mood or irritability that include at least three, or four in the case of irritable mood, of the following symptoms most of the day, nearly every day, for one week or longer: decreased need for sleep; increased talkativeness; distractibility; racing thoughts; overly inflated self-esteem; increased goal-directed activity or physical agitation; and engaging in risky behavior (NIMH, 2003).
Patients often enter treatment during an acute episode for which the primary goal is to achieve remission. Once this acute phase is successfully resolved, patients enter a maintenance phase during which the main pharmacotherapy goal is prophylaxis, or recurrence prevention. Acute and maintenance interventions exist for each phase of bipolar disorder. While episodes of depression and mania commonly recur lifelong with bipolar disorder, the majority of bipolar patients experience stabilization of their symptoms between episodes (Sachs and Thase, 2000), though residual symptoms persist in as many as one-third, and a small percentage of treated patients suffer from chronic unremitting symptoms (Hyman and Rudorfer, 2000).
Bipolar I, often referred to as the disorder’s classic form, is a distinct subcategory that includes recurrent episodes of depression and mania. The literature documents differences in clinical features and course, comorbidities, and recovery rates between bipolar I and II, a key difference being that a bipolar II diagnosis depends on the absence of psychosis and the presence or history of at least one depressive and at least one hypomanic episode but no manic or mixed episodes (MacQueen and Young, 2001). Subthreshold mania (hypomania) resembles mania, but is considered markedly less impairing.
Psychotropic prescribing patterns for treating bipolar disorder vary widely (Sachs, 1996) but typically involve medications or combinations of medications from three drug classes: mood stabilizers, antipsychotics, and antidepressants. Mood-stabilizing (MS) drugs are medications with both antimanic and antidepressive actions (APA’s Practice Guideline for Treatment of Patients with Bipolar Disorder, 1994) and generally considered the cornerstone of all bipolar disorder treatment. Lithium and valproate are among the most commonly prescribed mood stabilizers, and may be administered as the core of a treatment regimen that includes other medications or as monotherapy. Antipsychotic (AP) medications, sometimes referred to as “neuroleptics,” are broadly divided into two types: atypical and conventional. Atypical antipsychotics, such as risperidone and olanzapine, are second-generation or newer drugs that tend to be preferred by expert guidelines (Sachs et al, 2000), compared with the first-generation or typical antipsychotics such as haloperidol and chlorpromazine. However, for schizophrenia, recent studies have found relatively strong evidence of variability in efficacy among second-generation antipsychotics, challenging the belief that they were uniformly more efficacious than first-generation drugs for key outcomes (see e.g., Leucht et al. 2009; Tyrer and Kendall 2009; Lieberman et al 2005). Several distinct antidepressant (AD) medication types are used to treat the depressive episodes of bipolar disorder, including the newer selective serotonin reuptake inhibitors (SSRIs) and serotonin and norepinephrine reuptake inhibitors (SNRIs) along with bupropion, and the older tricyclics and monoamine oxidase inhibitors (MAOIs). Bipolar patients are also often prescribed benzodiazepines, but these are to address comorbid anxiety and insomnia, and are not included in this study.
In an effort to promote evidence-based care for bipolar disorder, a number of bipolar practice guidelines and algorithms have emerged and become widely available to provide a framework for physicians’ decision-making at each illness phase, that is, acute or maintenance (Suppes et al., 2001; Sachs et al., 2000; Bauer et al., 1999). These guidelines have substantial concordance and uniformly recommend that treatment be tailored to the patient’s clinical presentation, and vary depending on the episode’s severity and prominent features. They focus on selection among classes more than on selection among medications within a class, as there is less research evidence to guide within-class choices. Mood stabilizer medication is considered the core treatment for patients with bipolar disorder in all phases of treatment. Clinical characteristics, such as the presence of “mixed” features, mania, hypomania, or depression, with or without psychosis, are considered the most salient criteria in choosing amongst possible medications from the three drug classes (mood stabilizers, antipsychotics and antidepressants). Typically, in evaluating medication choices for key clinical situations, bipolar guidelines present initial strategies, both preferred and alternative, and suggest possible next steps after inadequate response, with recommended regimens presented at the drug class level first, and then within a class.
An additional consideration is whether or not the patient is already on long-term treatment for bipolar disorder at the time of a manic/mixed episode. For example, the British Association for Psychopharmacology (Goodwin et al., 2009) recommends different medication classes for patients with acute mania or mixed episodes depending on whether or not they are already on long-term treatment. Furthermore, BAP guidelines recommend consideration of patient preferences established in previous illness episodes. In general, guidelines for this disease emphasize that the physician consider the patient’s prior experience.
Bipolar disorder is a likely candidate for customized prescribing for several reasons. First, most physicians prescribing for bipolar patients are psychiatrists, for whom such patients may be a substantial portion of their caseload. Psychiatrists may find it more worthwhile than general physicians to learn about the medication options available. Second, bipolar disorder is a chronic disorder, so there will be many repeat prescriptions for a given patient, raising both the value of finding the right drug and the likelihood of doing so. On the other hand, the very complexity of the choice set facing physicians may lead them to select a ‘standard’ approach, such as starting most patients on the same mood stabilizer and switching only those who fare poorly on it.
Previous analyses of bipolar disorder prescribing have considered the potential influence of the presence of: psychiatric comorbidities (Blanco et al., 2002; Busch et al., 2007, 2009), medical comorbidities and/or comorbid substance abuse (Busch et al., 2007, 2009; Goldberg et al., 2009), a history of attempted suicide (Goldberg et al., 2009), and a history of ECT use (Busch et al., 2009).
Based on the foregoing review, our analysis focuses on customization in the selection among medication classes, rather than among medications within a class. First, the choice of classes has important clinical implications which are the subject of ongoing debate (e.g. whether or not to accompany a mood stabilizer with an antidepressant). Second, most existing guidelines focus on the choice of classes (e.g. discouraging use of unopposed antidepressants), thereby drawing physicians’ attention to this specific type of choice.
This study uses data from the multisite Systematic Treatment Enhancement Program for Bipolar Disorder (STEP-BD) that was collected between 1998 and 2005 at 20 sites nationwide, which were chosen from a larger number of site applications. Participating sites were required to have a bipolar disorder specialty program treating at least 100 active patients, and preference was given to those also experienced in clinical research with bipolar patients. Sites offered STEP-BD enrollment to all diagnostically eligible patients, both existing and new, who sought outpatient treatment. The only exclusion criteria were subjects’ unwillingness or inability to comply with study assessments or the inability to give informed consent (Sachs et al., 2003). It is likely that many study subjects had established relationships with their study physicians, but the data do not include this information. The full STEP dataset includes 4,361 participants, all aged 15 or older.
STEP-BD was a long-term, naturalistic clinical trial undertaken to examine the effectiveness of treatment options for bipolar disorder (Sachs et al., 2003), and their influence on the disorder’s course. Several dimensions of bipolar disorder have been studied with this dataset. Although the study included a few small randomized trials, most patients were seen in the observational arm, in which clinicians were encouraged but not required to choose certain treatment options based on expert consensus guidelines and other published treatment guidelines (Sachs et al., 2003). However, treatment decisions were based entirely on the preferences of the treating physicians in collaboration with their patients, and were not constrained in any way. This paper uses data from the observational arm only. Details on subject recruitment and diagnostic methods are available in other papers from the study (Sachs et al., 2000; Ghaemi et al., 2006).
Over the course of the STEP-BD, during regularly scheduled visits psychiatrists recorded data on patients’ medication treatment; adverse side effects; self-reported adherence to prescribed medications; clinical mood state (i.e. depressed, manic, mixed or hypomanic); social and role functioning; and self-reported comorbidities (e.g. substance use, eating disorders, medical conditions). These data were recorded on a Clinical Monitoring Form (CMF), a one-page assessment tool that consists of nine parts. Its subscales were constructed as a standardized assessment substitute for the narrative process note routinely used in clinical practice (Sachs et al., 2003). The CMF also records information on the percent time and severity of depressive, anxious and/or elevated mood symptoms experienced in the prior 10 days; laboratory data, stressors and selected mental status items.
For this paper, we selected patients from the observational arm who had Bipolar I and full demographic information. For each patient, we selected the first visit after intake. 25 patients were dropped because their first visit did not yield any prescriptions in our three classes of interest (AD, AP, MS), and another 163 because of missing demographic data. The excluded patients did not differ significantly from our final sample in any observable characteristics except insurance, with uninsured patients more likely to have been excluded. Our final sample included 1,759 individuals, whose care was managed by 183 treating physicians.
The key dependent variable is the combination of medication classes prescribed at the first clinical visit following the study intake visit, based on prescribing data recorded on the Clinical Monitoring Form (CMF). For regression purposes, we simplified the possible class combinations into five: 1) antidepressant + antipsychotic + mood stabilizer (AD AP MS), 2) antidepressant + mood stabilizer (AD MS), 3) antipsychotic + mood stabilizer (AP MS), 4) mood stabilizer only (MS), and 5) no mood stabilizer (No MS).
The principal independent variable is the patient’s clinical status. We created 12 clinical status dummy variables from the standardized assessment information collected on each patient in the Affective Disorder Evaluation (ADE) and CMF. On the form, the clinician may record only one current clinical status, each of which corresponds with a treatment phase. Accordingly, our constructed variables indicate whether a patient’s current mood state represents: 1) a current episode of one of four diagnostic mood states (depression, mania, mixed, or hypomania), according to DSM-IV criteria; 2) continuing symptomatic, 3) recovering from an episode, and if so, from which DSM-IV clinical diagnosis (i.e. depression, mania, mixed or hypomania), 4) roughening, or 5) recovered. Continuation refers to a state in which the patient no longer meets DSM-IV diagnostic criteria but is continuing to exhibit symptoms (greater than or equal to 3 moderate symptoms, or one moderate symptom and one more than moderate symptom). A patient’s status is recorded as “recovering” when DSM-IV criteria are no longer met but up to two moderate symptoms persist. The term roughening is a clinical course indicator for bipolar disorder that refers to a return of symptoms at a subthreshold level that may signal an impending episode (Hirschfeld et al., 2007). On the CMF, the “roughening” classification is used when the patient is experiencing greater than two moderate symptoms after having met criteria for recovery, which would include the occurrence of two new symptoms and/or marked worsening of residual symptoms (Sachs et al, 2003). Recovery is defined as eight consecutive weeks with no more than two moderate symptoms.
For each patient, we consulted the CMF record that reports on the first clinical visit after the ADE, and extracted the assessed clinical status and response to treatment information recorded on the form. If a patient was classified as “recovering” on the CMF, then we consulted the ADE, which comprehensively documents diagnostic information at the time of study intake. Thus, each of our constructed “recovering” variables includes an indicator of the type of episode from which the patient is recovering. If a patient’s clinical status at intake (on the ADE) had a value other than a diagnosis, e.g. “roughening,” we classified the patient as “recovering – other.” Finally, we created an indicator variable for psychosis, equal to 1 if the clinician had noted on the CMF any of these symptoms: paranoid ideation, ideas of reference, hallucinations, or delusions.
Other independent variables include the patient’s age, sex, and several other demographic characteristics collected at the intake visit. Patients were asked about their marital status, employment status, personal income from earnings, and source of health insurance (if any). For various of these variables we collapsed categories, in order to create simpler classifications for regression analysis.
Our multivariate model aims to predict choice among five unordered alternatives, namely the various combinations of drug classes. We estimate this model using multinomial logit, which is a commonly used approach for similar contexts (Greene, 1993; Allison, 1999). The model assumes that the decision-maker chooses the option that yields the most utility, with utility depending on the patient’s characteristics (Xi) and on how each characteristic affects the utility of a given choice (βj). Under standard assumptions, the probability of patient i choosing option j is given by:

$$P(y_i = j) = \frac{\exp(X_i \beta_j)}{\sum_{k=1}^{5} \exp(X_i \beta_k)}, \qquad j = 1, \dots, 5,$$

where the coefficient vector of the reference alternative is normalized to zero for identification.
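As a rough illustration of how multinomial logit choice probabilities are formed, the following sketch evaluates them for one hypothetical patient. The function name `mnl_probabilities`, the three covariates and every coefficient value are assumptions made for illustration only; none of them are estimates from this study.

```python
import numpy as np

def mnl_probabilities(x, betas):
    """Multinomial logit probabilities for one patient.

    x     : covariate vector, shape (n_features,)
    betas : coefficient matrix, shape (n_alternatives, n_features);
            the reference alternative's row is all zeros.
    """
    utilities = betas @ x
    utilities = utilities - utilities.max()  # guard against overflow in exp
    exp_u = np.exp(utilities)
    return exp_u / exp_u.sum()

# Hypothetical patient: intercept, age in years, depressed indicator.
x = np.array([1.0, 40.0, 1.0])

# Hypothetical coefficients for five drug class combinations
# (row 0 is the reference alternative, normalized to zero).
betas = np.array([
    [0.0,  0.00,  0.0],
    [0.2,  0.01,  0.5],
    [-0.1, 0.00, -0.3],
    [0.4, -0.01, -0.6],
    [-0.5, 0.01,  0.2],
])

probs = mnl_probabilities(x, betas)
print(probs.round(3), probs.sum())
```

Subtracting the largest utility before exponentiating leaves the probabilities unchanged and only improves numerical stability.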
One known limitation of this model is that it requires assuming that the error terms are independent across choices. We tested this assumption by running Hausman-McFadden tests for exclusion of each alternative, which yielded a negative test statistic in three cases and a positive nonsignificant one in the other case. Following Hausman and McFadden (1984, footnote 4), we interpret these results as failing to reject the IIA property, and conclude that a non-nested model is appropriate.
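The Hausman-McFadden statistic compares coefficients estimated on the full choice set with those estimated after dropping one alternative. A minimal sketch of the usual quadratic form, (b_r − b_f)′[V_r − V_f]⁻¹(b_r − b_f), is given below. The function name and all inputs are hypothetical, and the pseudo-inverse is an implementation choice of this sketch rather than something reported by the authors. The second call shows how a negative statistic can arise when the difference in covariance matrices is not positive semi-definite.

```python
import numpy as np

def hausman_mcfadden(b_restricted, b_full, v_restricted, v_full):
    """Quadratic-form statistic comparing restricted and full estimates."""
    diff = b_restricted - b_full
    v_diff = v_restricted - v_full
    # Pseudo-inverse used here so that a singular difference does not fail.
    return float(diff @ np.linalg.pinv(v_diff) @ diff)

# Hypothetical coefficient estimates for the same two parameters.
b_full = np.array([0.4, -0.1])
b_restricted = np.array([0.5, -0.2])
v_full = np.diag([0.03, 0.03])

# Case 1: restricted estimates are less precise, statistic is positive.
stat_pos = hausman_mcfadden(b_restricted, b_full, np.diag([0.04, 0.05]), v_full)

# Case 2: variance difference is negative definite, statistic is negative.
stat_neg = hausman_mcfadden(b_restricted, b_full, np.diag([0.02, 0.02]), v_full)

print(stat_pos, stat_neg)
```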
We estimate two models. Model 1 only includes patient explanatory variables: demographics and a set of clinical status dummies. The model includes robust standard errors to correct for clustering at the physician level. Model 2 adds dummy variables to control for physician effects. Although 183 physicians are represented in the data, many see only a few patients. We selected 20 patients as a cutoff, and included a dummy variable for each physician exceeding that cutoff, with the reference category being all other physicians combined. 30 physicians exceeded the cutoff. The statistical consistency of our estimator cannot be proven for Model 2, due to the ‘incidental parameters’ problem created when fixed effects are used in nonlinear models (Neyman and Scott, 1948). As a result, Model 2 is not our preferred specification, but is presented nonetheless in order to see whether results differ much when we control for unobserved physician characteristics. Goodness of fit is evaluated using the pseudo r-squared, which measures a given model’s improvement in log-likelihood relative to a model with constants only.
For ease of interpretation, results are presented in terms of average partial effects (APEs) rather than coefficients. For an explanatory variable that is binary, the partial effect for a single patient measures the change in probability for a given outcome if that explanatory variable changes from zero to 1, while holding all other explanatory variables at their actual values for the patient in question. For age (the only continuous explanatory variable), the partial effect measures the change in probability associated with increasing patient age by one year. For both types of variable, the APE is then computed as the mean of the individual partial effects across sample members. The model was estimated using the mlogit command in Stata 11, and the APEs were generated with Stata’s margins command. Standard errors of the APEs were computed using the delta method (Oehlert, 1992).
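The APE calculation described above can be sketched as follows, under stated assumptions: the data are simulated, the coefficients are hypothetical, only three alternatives are used to keep the example short, and the age effect is implemented as a one-year finite difference, following the description in the text. This is not the authors' Stata code.

```python
import numpy as np

def mnl_probs(X, betas):
    """Row-wise multinomial logit probabilities, shape (n_patients, n_alternatives)."""
    u = X @ betas.T
    u = u - u.max(axis=1, keepdims=True)  # numerical stability
    e = np.exp(u)
    return e / e.sum(axis=1, keepdims=True)

def ape_binary(X, betas, col):
    """Mean change in each probability when column `col` goes from 0 to 1."""
    X1, X0 = X.copy(), X.copy()
    X1[:, col] = 1.0
    X0[:, col] = 0.0
    return (mnl_probs(X1, betas) - mnl_probs(X0, betas)).mean(axis=0)

def ape_one_unit(X, betas, col):
    """Mean change in each probability when column `col` increases by one unit."""
    Xp = X.copy()
    Xp[:, col] += 1.0
    return (mnl_probs(Xp, betas) - mnl_probs(X, betas)).mean(axis=0)

# Simulated (hypothetical) sample: intercept, age in years, depressed indicator.
rng = np.random.default_rng(0)
n = 200
age = rng.integers(18, 70, size=n).astype(float)
depressed = rng.integers(0, 2, size=n).astype(float)
X = np.column_stack([np.ones(n), age, depressed])

# Hypothetical coefficients; row 0 is the reference alternative.
betas = np.array([
    [0.0,  0.00, 0.0],
    [0.3, -0.01, 0.8],
    [-0.2, 0.00, 0.0],
])

ape_dep = ape_binary(X, betas, col=2)
ape_age = ape_one_unit(X, betas, col=1)
print(ape_dep.round(4), ape_age.round(5))
```

Because the probabilities for each patient sum to one, the APEs for any variable sum to zero across alternatives, which is a useful sanity check on output of this kind.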
In addition, we use the logit coefficients from the regression to simulate how choice of regimen is affected by which physician the patient sees. Predicted values for each choice are formed by applying the model coefficients to an individual with specified characteristics (e.g. female, age 40, and depressed), while varying which physician the patient sees. For each drug class combination, we then examine the distribution of the resulting predicted probabilities across the 30 higher-volume physicians. This is done in order to convey more clearly the variation in prescribing across physicians.
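A minimal sketch of this kind of simulation is shown below. It holds a patient profile fixed (female, age 40, depressed), switches on one physician dummy at a time, and summarizes the spread of predicted probabilities across physicians. All coefficients are randomly generated placeholders, not the estimates from Model 2, and the variable names are hypothetical.

```python
import numpy as np

def softmax(u):
    """Convert a vector of utilities into choice probabilities."""
    e = np.exp(u - u.max())
    return e / e.sum()

rng = np.random.default_rng(1)
n_alt, n_phys = 5, 30

# Hypothetical patient coefficients: intercept, female, age in years, depressed.
# The age column is scaled down so that age 40 does not dominate the utilities.
beta_patient = rng.normal(0.0, 0.3, size=(n_alt, 4)) * np.array([1.0, 1.0, 0.02, 1.0])
beta_patient[0] = 0.0  # reference alternative

# Hypothetical physician effects, one column per higher-volume physician.
beta_phys = rng.normal(0.0, 0.6, size=(n_alt, n_phys))
beta_phys[0] = 0.0  # reference alternative

# Fixed patient profile: female, age 40, depressed.
profile = np.array([1.0, 1.0, 40.0, 1.0])
base_utility = beta_patient @ profile

# One row of predicted probabilities per physician.
probs = np.vstack([softmax(base_utility + beta_phys[:, d]) for d in range(n_phys)])

q25, q75 = np.percentile(probs, [25, 75], axis=0)
summary = {
    "min": probs.min(axis=0),
    "q25": q25,
    "q75": q75,
    "max": probs.max(axis=0),
    "sd": probs.std(axis=0, ddof=1),
}
for name, values in summary.items():
    print(name, values.round(3))
```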
The multinomial logit regression involves estimating 96 coefficients (24 variables * 4 non-reference alternatives). As a result, one could be concerned about a multiple-comparison problem. To address this, we applied a Bonferroni correction and divided significance levels by 4. In the regression results table we use notation to indicate which results achieved a significance level of .01 and .05, with and without correction.
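The correction amounts to comparing each p-value with the nominal level divided by 4. A small sketch follows, with a hypothetical helper name and hypothetical p-values:

```python
def significance_flags(p_value, n_comparisons=4, levels=(0.01, 0.05)):
    """Flag significance at each level, with and without Bonferroni correction."""
    flags = {}
    for alpha in levels:
        flags[f"uncorrected_{alpha}"] = p_value < alpha
        flags[f"bonferroni_{alpha}"] = p_value < alpha / n_comparisons
    return flags

# Hypothetical p-values for two average partial effects.
flags_a = significance_flags(0.02)
flags_b = significance_flags(0.002)
print(flags_a)
print(flags_b)
```

With a divisor of 4, the corrected thresholds are 0.0125 and 0.0025, so a p-value of 0.02 is significant at the .05 level only before correction.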
Demographic and clinical characteristics of this study sample of bipolar I patients are presented in Table 1. The mean age was 40.5 years, with 45% of the sample male. Half the patients had annual income below $10,000 and only 30% were employed full time, probably reflecting substantial disability in this population. At the first clinical visit following the intake assessment visit, 18% met criteria for current depression, while rates of current mania, mixed (mania and depression) and hypomania were substantially lower and about equal (less than 2% each). The prevalence of patients with continuing symptoms was 13%. The proportion of patients roughening was 4%, and the proportion recovering from an episode of one of the four main clinical categories ranged from 0.7% (hypomania) to 8% (depression).
Table 2 presents the distribution across the different drug class combinations recorded at the first post-study entry visit, with the first three rows representing regimens that did not include mood stabilizers, which we then collapsed into a No Mood Stabilizer (No MS) category, representing 16% of our sample. The majority of patients (84%) did receive mood stabilizers, and nearly a quarter of the sample received mood stabilizers only. Just over half of the sample (56%) received any antidepressants, and nearly half (44%) was on any antipsychotic medications. The most commonly received combination of agents by class (27%) consisted of medications from both the antidepressant and mood stabilizer classes. The three-drug class combination from the antidepressant, antipsychotic and mood stabilizer was received by 19% of patients, while 15% received medications from the antipsychotic and mood stabilizer classes.
Table 3 shows the relationship of medication class combinations (collapsed) to patient characteristics. Nearly one quarter (23.5%) of hypomanic patients received mood stabilizer only, compared with a much lower proportion of depressed patients (13.9%), while only 8.8% of hypomanic patients did not receive any mood stabilizing medication at all. Rates of receiving the three-class combination were higher (over 20%) for patients diagnosed as depressed or hypomanic, and lower for manic and mixed patients.
In addition to the prevalence of each class combination, it is of interest to know the utilization of unopposed antidepressants in this population. Among patients with current depression, 67% received any antidepressant medication, and 16% received antidepressant medication without a mood stabilizer (Data not shown).
Table 4 presents the average partial effects for two multinomial logit models. The first model includes patient characteristics only. Only some of the clinical status variables affected class choice. Specifically, patients are less likely to receive the MS-only combination if they are depressed, have continuing symptoms, or are in a state of recovery from depression or from ‘other’ conditions (‘recovering-other’). MS-only is also less likely for patients with psychosis. Clinical status variables do not affect the choice of other drug class combinations, after applying the Bonferroni correction. At the same time, two demographic variables also affect class choice, even after controlling for clinical status. Insured patients are more likely to receive the AP/MS combination, while nonwhites are less likely to receive AD/MS or AP/MS.
Model 2 adds dummy variables for the 30 physicians who saw at least 20 patients. The dummies are jointly significant, and the pseudo r-squared indicates that compared to a constant-only model, Model 2 improves the log likelihood by 8.2%. By contrast, Model 1 improved the log likelihood by 4.5%. This suggests that the physician effects substantially improve the model’s explanatory power, albeit not by as much as the patient characteristics (which contribute 55% of the total improvement: 4.5/8.2).
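The arithmetic behind this comparison can be sketched as follows, assuming the pseudo r-squared is McFadden's measure (one minus the ratio of the model log-likelihood to the constants-only log-likelihood). The log-likelihood values below are hypothetical, chosen only so that the resulting figures match the 4.5% and 8.2% reported above; the actual log-likelihoods are not given in the text.

```python
def mcfadden_pseudo_r2(ll_model, ll_null):
    """Proportional improvement in log-likelihood over a constants-only model."""
    return 1.0 - ll_model / ll_null

# Hypothetical log-likelihoods (not reported in the paper).
ll_null = -2800.0     # constants only
ll_model1 = -2674.0   # patient characteristics only
ll_model2 = -2570.4   # patient characteristics plus physician dummies

r2_model1 = mcfadden_pseudo_r2(ll_model1, ll_null)
r2_model2 = mcfadden_pseudo_r2(ll_model2, ll_null)

# Share of the total improvement attributable to patient characteristics.
patient_share = r2_model1 / r2_model2
print(round(r2_model1, 3), round(r2_model2, 3), round(patient_share, 2))
```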
In terms of the actual results, all variables that were statistically significant in Model 1 remain so in Model 2, and two additional variables reach statistical significance (continuing symptoms, for AD/AP/MS, and ‘recovering-other’, for AD/MS). Overall, strikingly few of the average partial effects are statistically significant.
To illustrate the extent of prescribing variation, Table 5 presents information on the distribution of predicted probabilities for each drug class combination across the 30 higher-volume prescribers. The probabilities are derived for hypothetical female 40-year old patients with differing clinical status. For a patient with mania, the probability of receiving an AD/MS combination could be as high as 55% or as low as 5%, depending on which physician she sees. Across these 30 physicians, the standard deviation of the AD/MS share is 0.13. For this patient, 25% of the physicians are predicted to use AD/MS less than 18% of the time, and 25% will use it more than 37% of the time. Variation is less marked for the other four drug class combinations but still present. For a patient with depression (lower panel), similar patterns are observed for predicted use of AD/MS, and in addition there is higher variation in the use of No-MS compared to the other combinations.
Our findings provide mixed support for our hypotheses regarding the extent of customization. On the one hand, most physicians were using a variety of class combinations, not starting every patient on the same combination. This is compatible with some degree of customization, although not proof of it, since use of a variety of regimens could occur for other reasons. On the other hand, patient clinical status did not appear to matter as much of the time as one would expect, if physicians were customizing their prescribing to patients’ clinical presentations. In fact, only four (of 11) clinical status variables (e.g. depression, roughening) ever affected class choice in either model. Other conditions that one would have expected to matter, such as mania, had no apparent effect on the choice among medication class combinations. This accords with an earlier study examining medication use at study intake for this population. Busch et al. (2009) found that neither the use of antimanics nor the choice between AP only and any mood stabilizer was affected by various medical conditions (e.g. thyroid, hepatic, renal) that had been expected to affect prescribing.
The results of our simulation of the effect of assigning hypothetical patients to each psychiatrist highlight the pronounced variation across physicians in which medications are prescribed. It is clearly not the case that all physicians are following the same algorithms, given this finding of substantial differences in class choice between physicians treating observationally similar patients. This is compatible with what Frank and Zeckhauser (2007) call ‘My Way’ prescribing: behavior where physicians use very different algorithms based on their own experience and training. To the extent they underweight broader evidence on what works, the resulting variation would not be customization in the desirable sense. One could argue that the patient characteristics we used are not sensitive enough, and that the variation in prescribing we observe could be a response to unobserved heterogeneity, implying that customization is occurring after all. For example, our data did not include information on patient preferences or patient history of treatment and treatment-related side effects. However, given the importance of the clinical status variable, it is striking how little key clinical conditions appear to affect choice of drug class. Of course, the same conditions could have a stronger effect on choice of medications within a class, which we did not study.
Our finding that nonwhites were less likely than whites to receive the AD/MS combination warrants further analysis, particularly since the preferred treatment for severe depressive episodes without psychosis is the combination of mood stabilizer and antidepressant (Sachs et al., 2000). Recent work has shown that black patients with bipolar disorder are less likely to be prescribed the newer SSRI-type antidepressants (Kilbourne and Pincus, 2006), a prescribing choice which may play a role in the decreased likelihood of nonwhites in our study receiving the AD/MS combination. African-Americans and Latinos with bipolar disorder have been found to be less likely than non-Latino whites to receive mood stabilizers or antipsychotics (Depp et al., 2008), which raises questions about care quality.
Our finding that 16% of patients were prescribed no mood stabilizer is lower than the 41% reported by Blanco et al. (2002) or the 52% in the NCS-R study (Merikangas et al., 2007), but similar to the rate in Unutzer et al. (2000). All these findings are at variance with the recommendation that this class of medication be a vital part of pharmacotherapy with bipolar patients throughout all phases of treatment.
Interestingly, when we stratified patients by clinical status, 10.3% of manic patients were not prescribed any mood stabilizer. Also, although the first-line preference is for mood stabilizing medication alone or, if psychosis is present, mood stabilizer and antipsychotic (Sachs et al., 2000), nearly half (48.2%) of patients with mania were receiving something other than these two combinations.
Several limitations of our study should be kept in mind. First, the data used come from a study administered chiefly in academic medical settings, where physicians may have been more exposed to recent research than physicians elsewhere. Similarly, the physicians in this study received more guidance about prescribing than is probably typical elsewhere. These features limit generalizability of our findings to the wider universe of physicians prescribing for bipolar disorder. A second important limitation is that, despite the richness of clinical detail available, we lack information about some other potentially important influences on medication choice that may have been observable to the physician, such as prior medication and disease history, patient preferences, history of side effects, and other clinical issues such as pregnancy. The available data do not record which patients either had previously experienced an inadequate response to any preferred or first-line medications initially prescribed, or had rejected such treatment recommendations. Ideally, psychiatrists’ decision-making is guided not only by the patient’s current clinical status but also by their knowledge of the frequency, severity and consequences of the patient’s past episodes. In addition, guidelines not only recommend medications but also emphasize the necessity of monitoring patients’ response to medications, and augmenting or switching as needed, so some proportion of prescribing that at first blush seems unaligned with guidelines may not be. Inability to control for such influences could help explain the nonsignificance of the clinical variables we did include. On the other hand, these variables may be less relevant to the choice of medication classes, which is what we study. For example, knowing that a patient once reacted badly to lamotrigine would discourage future prescribing of that drug, but would not discourage future prescribing of the entire mood stabilizer class.
Another omission is what kind of insurance coverage patients had for psychotropic medications (e.g. what prior authorization requirements or cost-sharing tiers applied). These limitations are less serious for a study of choice among drug classes (as opposed to among medications). The reason is that while insurers often use copayments or administrative controls to limit access to specific medications within a class, they rarely use those controls to bar access to an entire class. Thus, insurance features should be less important for the present study than for studies seeking to explain the specific drug chosen. Third, while we used physician fixed effects to control for differing practice styles, we were not able to use these for all physicians, as many were low volume. Fourth, the sample size (1,759 patients) may limit the ability to detect statistically significant effects, given the number of parameters being estimated. Finally, it is unlikely that most of these patients were truly new to treatment, given the chronic nature of the disorder, and physicians may have been influenced by patients’ medication experiences prior to the study, which we do not observe.
The results show substantial variation among physicians in their prescribing for bipolar disorder, even after controlling for our observed measure of clinical status. In addition, our measure of clinical status did not appear to strongly affect prescribing, although our study had limited power to detect such effects. The study does not find evidence for a high degree of customization in physicians’ selection among medication classes, although customization could still have been occurring between medications within each class. Further research is needed, using larger datasets with additional information on variables such as medication history and patient preferences.
The authors thank Lee Panas for programming assistance, and Alisa Busch, Ellen Dennehy, Randy Ellis, Barry Friedman, Tom McGuire, Sharon-Lise Normand, Grant Ritter and David Salkever for helpful discussions. The paper was presented at the NIMH meeting on Economics of Mental Health (2008) and the International Health Economics Association meetings (2009). This research was supported by NIMH grant R01 MH 77727.
Funding: This research was supported by the National Institute of Mental Health, grant R01 MH 77727.
The authors do not have conflicts of interest.