Health Serv Res. 2010 October; 45(5 Pt 1): 1205–1226.
PMCID: PMC2939263

Examining the Relationship between Clinical Monitoring and Suicide Risk among Patients with Depression: Matched Case–Control Study and Instrumental Variable Approaches

Objective



To assess the relationship between closer monitoring of depressed patients during high-risk treatment periods and death from suicide, using two analytic approaches.

Data Source

VA patients receiving depression treatment between 1999 and 2004.

Study Design

First, a case–control design was used, adjusting for age, gender, and high-risk days (1,032 cases and 2,058 controls). Second, an instrumental variable (IV) approach (N=714,106) was used, with IVs of (1) average monitoring rates in the VA facility of most use and (2) monitoring rates of VA facilities weighted inversely by distance from patients' residences.

Principal Findings

The case–control approach indicated a modest increase in suicide risk with each additional visit (odds ratio=1.02; 95 percent confidence interval=1.002, 1.04). The “facility used” IV estimate indicated near zero change in risk (0.0008 percent increase; p=.97) with each additional visit, while the distance-weighted IV estimate indicated a 0.032 percent decrease in risk (p=.29). An alternative analysis assuming a threshold effect of ≥4 visits during high-risk periods also showed a decrease (0.15 percent; p=.08) using the distance IV.

Conclusions


The IV approach appeared to address the selection bias more appropriately than the case–control analysis. Neither analysis clearly indicated that closer monitoring during high-risk periods was significantly associated with reduced suicide risks, but the distance-weighted IV estimate suggested a potentially protective effect.

Keywords: Suicide, HEDIS visit, depression treatment, case–control, instrumental variable

Suicide is the 11th leading cause of death in the United States (National Center for Injury Prevention and Control, Centers for Disease Control and Prevention [CDC]). To reduce suicide-related mortality, the Institute of Medicine (2004) has called for improvements in suicide surveillance, monitoring, and prevention. Close clinical monitoring of individuals during high-risk periods is often suggested as one way to reduce suicide deaths, and the U.S. Food and Drug Administration (FDA) has recommended closer monitoring for children, adolescents, and young adults who are starting antidepressants or changing doses, to reduce suicide risks at these times. The most stringent FDA recommendation for monitoring suggested that seven clinical visits be completed in the 12 weeks following these antidepressant treatment events (U.S. Food and Drug Administration). Close clinical monitoring has also been suggested for patients (of any age) who are recently discharged from psychiatric inpatient stays, another high-risk period for suicide (American Psychiatric Association 2003).

However, intensive monitoring following antidepressant changes does not appear to be the norm. Only 23 percent of patients enrolled in a large managed care organization and 6 percent of patients in the Veterans Health Administration (VHA) system who started new antidepressant treatment received seven or more visits in the first 12 weeks following these antidepressant starts (Stettin et al. 2006; Valenstein et al. 2009). Providing more intensive monitoring during these treatment periods would require substantial reorganization of mental health services. Health systems likely will require solid evidence for the effectiveness of close monitoring in reducing suicide before making these substantial service changes.

To date, no published studies have systematically evaluated the impact of intensive clinical monitoring on suicide deaths. Demonstrating such a link is difficult due to the low base rate of completed suicide and the resulting need for extremely large sample sizes (Gunnell, Saperia, and Ashby 2005; Simon et al. 2006). Sample sizes required to examine meaningful differences in suicide rates with alternative treatment interventions are estimated at 300,000 or more, numbers that preclude using randomized-controlled trials for this purpose (Gunnell, Saperia, and Ashby 2005; Simon et al. 2006). Administrative data from large health care organizations can provide the necessary sample size, but they raise an important methodological issue of confounding by treatment indication. In clinical settings, treatment receipt is influenced by a mixture of patient, provider, and organizational factors. If more severely ill patients are channeled to a particular treatment (e.g., increased monitoring) and are more likely to have adverse outcomes, this may result in spurious associations between increased monitoring and suicide (Prentice et al. 2005). A bias in the opposite direction may also arise if providers and organizations with more intensive monitoring provide higher quality care in other unmeasured dimensions.

In this study, we used data from a large national cohort of VHA patients in depression treatment to investigate the relationship between more intensive monitoring during high-risk treatment periods and suicide. Recognizing the potential for bias, we used two different analytic approaches to address confounding by treatment indication. We first used a case–control design, adjusting for all known characteristics available in the administrative databases that might be associated with both increased monitoring and suicide. The traditional case–control design allows matching of cases with controls in order to assess the relationship between the primary exposure variable and outcome of interest while controlling for the matching variables as well as the undefined variables associated with the matching variables. We subsequently used an instrumental variable (IV) approach to address potential biases due to variables not available in the administrative databases (Newhouse and McClellan 1998; Fortney et al. 2001; Salkever et al. 2004).

These two approaches each have potential advantages and disadvantages. The case–control approach is well suited for studying uncommon events and is intuitively straightforward. However, it is limited by the patient and facility characteristics available in the administrative or study databases, and it can produce biased results when factors unavailable in these databases influence both the level of monitoring and the likelihood of suicide. In theory, the IV approach can account for unmeasured confounders, but its validity depends on the appropriateness of the chosen IV. This approach typically involves a loss of statistical power, because it is based on variation in treatment due to specific factors (the IVs) rather than the full variation in treatment. The loss in statistical power may particularly limit the application of the IV approach where the event of interest has a low base rate. We compared these two methods, with the ultimate goal of assessing whether increased monitoring during high-risk treatment periods reduces suicide deaths.

Methods


Study Sample

Study data were drawn from a cohort of 887,859 patients receiving depression treatment between April 1, 1999 and September 30, 2004 in the VHA health system, but the size and composition of the samples differed slightly for the two approaches. Patients were included in the overall depression treatment cohort if they received both a diagnosis of a depressive disorder and an antidepressant medication fill from VHA providers, or if they had two encounters with diagnoses of depressive disorders. Patients were excluded if they received any diagnosis of bipolar I, schizophrenia, or schizoaffective disorder during the study period. Several prior studies and accrediting agencies have used similar definitions to define cohorts with depressive disorders when examining the quality of care for depression (Kerr et al. 2000; Charbonneau et al. 2003; Spettell et al. 2003; National Committee on Quality Assurance 2006). Because patients who died soon after entering the cohort had limited observation time, making monitoring rates difficult to calculate and important patient characteristics difficult to assess, patients with <84 observation days were excluded from the primary study analyses (although they were included in sensitivity analyses), resulting in a cohort of 835,944 patients.

Study Measures


National Death Index (NDI) Plus includes national data regarding dates and causes of death for all U.S. residents, derived from death certificates filed in state vital statistics offices. As outlined in a prior report based on the same treatment cohort (Zivin et al. 2007), NDI queries were submitted for cohort patients with a date of death in the VA Beneficiary Identification and Records Locator System Death File during the study period and for patients who did not use VHA services in the year following the study period, resulting in a comprehensive assessment of all suicide deaths in the study population.

High-Risk Periods and Intensity of Clinical Monitoring

High-risk periods for suicide were defined as the 12-week (84 day) periods immediately following psychiatric hospitalizations, new antidepressant (AD) starts, other antidepressant starts, or antidepressant dose changes (≥50 percent change in dose between two consecutive fills of a specified AD occurring within 6 months) (Valenstein et al. 2007). Monitoring visits were defined using Healthcare Effectiveness Data and Information Set (HEDIS) criteria modified for the VHA setting (National Committee on Quality Assurance 2006). A HEDIS monitoring visit is an outpatient visit that has a psychiatric current procedural terminology (CPT) code or a nonpsychiatric CPT code accompanied by a mental health diagnosis. On any given high-risk day, only one monitoring visit was counted, even if more than one qualifying visit occurred. The intensity of monitoring was calculated as the HEDIS visit rate during the high-risk periods. Inpatient days were excluded from these high-risk periods, as we were interested in the relationship between outpatient monitoring and suicide deaths.
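
As a rough illustration of the rate defined above, the following Python sketch counts at most one visit per high-risk day and scales the result to visits per 84 high-risk days. The function names are hypothetical, and two details are assumptions rather than statements from the study: each window is taken to start the day after the triggering event, and overlapping windows are merged so that no day is counted twice.

```python
from datetime import date, timedelta

HIGH_RISK_WINDOW_DAYS = 84  # the 12-week window described in the text


def high_risk_days(trigger_dates, inpatient_days=frozenset()):
    """Outpatient high-risk days following the given trigger events.

    Assumption: the window covers the 84 days starting the day after
    each event; overlapping windows are merged by the set union.
    """
    days = set()
    for start in trigger_dates:
        for offset in range(1, HIGH_RISK_WINDOW_DAYS + 1):
            days.add(start + timedelta(days=offset))
    # Inpatient days are excluded from the high-risk periods.
    return days - set(inpatient_days)


def hedis_visit_rate(trigger_dates, visit_dates, inpatient_days=frozenset()):
    """Visits per 84 high-risk days, or None if there are no high-risk days."""
    risk_days = high_risk_days(trigger_dates, inpatient_days)
    if not risk_days:
        return None
    # Building a set collapses several qualifying visits on one day into one.
    visit_days = set(visit_dates) & risk_days
    return len(visit_days) * HIGH_RISK_WINDOW_DAYS / len(risk_days)


if __name__ == "__main__":
    # Hypothetical patient: one antidepressant start, two same-day visits,
    # one later visit, and one visit outside the window.
    starts = [date(2003, 1, 1)]
    visits = [date(2003, 1, 5), date(2003, 1, 5), date(2003, 1, 20), date(2003, 6, 1)]
    print(hedis_visit_rate(starts, visits))
```

Expressing the rate per 84 high-risk days keeps patients with different amounts of high-risk time on a common scale.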


Covariates

Patient covariates included age categories, race, Hispanic ethnicity, sex, diagnoses of substance use disorder, posttraumatic stress disorder (PTSD), major depression, personality disorder, anxiety disorder and bipolar II, service connection (indicating disability from conditions newly occurring or exacerbated by military service), Charlson medical comorbidity index, use of services with Medicare claims, E-code indicating a suicide attempt, VA psychiatric hospitalization, psychiatric inpatient days, number of psychotropic medications, and number of years since cohort entry. The choice of these covariates was based on the prior literature regarding suicide risk factors and patient factors associated with increased monitoring levels, including our team's prior paper on risk factors for suicide within this VA population (Simon 1992; Zivin et al. 2007). Facility-level variables included geographic region of the facility of most use in the year and an indicator of whether the facility was in an urban area. Psychiatric comorbidities were defined based on data from 6 months before cohort entry through the end of the study period. Covariates other than psychiatric comorbidities were defined based on data from the 12 months before cohort entry.

Study Analyses

Case–Control Design and Analyses


For each suicide case, we matched one to two control patients based on demographics and risk factors for suicide. The matching variables were chosen so that cases and controls were at risk in the same time period with a similar duration of depression treatment and potentially similar illness severity. Control patients were matched individually to a case, randomly chosen with replacement from a pool of control patients who were alive on the date when the case died of suicide (index date), with similar numbers of total observation days (±60 days) as the case before this index date, within ±5 years of the case birth year, of the same sex, and with similar numbers of high-risk days within 12 months before the index date (±30 high-risk days). When more than two possible controls satisfied the matching criteria, we randomly selected two controls. A total of 1,537 suicide cases were identified; 5 cases without a matching control were excluded. We then limited the case–control analyses to the 3,090 patients (1,032 cases and 2,058 controls) with at least one high-risk day in the year before their index date.
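
The matching rules above can be sketched in Python as follows. This is only an illustrative sketch: the record fields (`entry_day`, `death_day`, `birth_year`, `sex`, `high_risk_days`) and the function names are hypothetical, dates are represented as integer day numbers for brevity, and "the 12 months before the index date" is approximated as 365 days.

```python
import random


def high_risk_days_in_prior_year(person, index_day):
    """Count the person's high-risk days in the 365 days before index_day."""
    return sum(1 for d in person["high_risk_days"] if index_day - 365 <= d < index_day)


def is_match(case, candidate):
    """Apply the matching criteria described in the text to one candidate."""
    index_day = case["death_day"]  # index date = date the case died of suicide
    if candidate["id"] == case["id"]:
        return False
    alive = candidate["death_day"] is None or candidate["death_day"] > index_day
    if not alive or candidate["entry_day"] > index_day:
        return False
    case_obs = index_day - case["entry_day"]
    cand_obs = index_day - candidate["entry_day"]
    case_hr = high_risk_days_in_prior_year(case, index_day)
    cand_hr = high_risk_days_in_prior_year(candidate, index_day)
    return (
        abs(case_obs - cand_obs) <= 60            # total observation days within 60
        and abs(case["birth_year"] - candidate["birth_year"]) <= 5
        and case["sex"] == candidate["sex"]
        and abs(case_hr - cand_hr) <= 30          # high-risk days within 30
    )


def pick_controls(case, pool, rng, max_controls=2):
    """Randomly choose up to two eligible controls for one case.

    The pool is not depleted, so the same patient may be selected again
    for another case (sampling with replacement across cases).
    """
    eligible = [p for p in pool if is_match(case, p)]
    return rng.sample(eligible, min(max_controls, len(eligible)))


if __name__ == "__main__":
    case = {"id": 1, "entry_day": 0, "death_day": 400, "birth_year": 1950,
            "sex": "M", "high_risk_days": set(range(300, 340))}
    pool = [
        {"id": 2, "entry_day": 10, "death_day": None, "birth_year": 1952,
         "sex": "M", "high_risk_days": set(range(310, 350))},
        {"id": 3, "entry_day": 10, "death_day": None, "birth_year": 1952,
         "sex": "F", "high_risk_days": set(range(310, 350))},
    ]
    print([c["id"] for c in pick_controls(case, pool, random.Random(0))])
```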

Statistical Analyses

Conditional logistic regression models were used to evaluate the relationship between suicide and intensity of monitoring during high-risk days. The primary predictor was the HEDIS visit rate calculated for the high-risk days in the year before the index date. Logistic models included the covariates listed above. The analysis was also conducted using a generalized linear model with case as the response variable with logit link, matched sets as the clusters, and HEDIS visit rate as the primary predictor. A Cox proportional hazard model was also used to assess time from cohort entry until suicide or the index date, with late entry at 1 year before index date for patients having >1 year of observation. In these analyses, we categorized the HEDIS visit rate into five dummy variables to check for a nonlinear relationship, and we stratified by index year to assess whether the effect of monitoring changed across study years.

Sensitivity Analyses

All analyses were repeated using the subset of cases (and their matching controls) who survived at least a year after their depression diagnosis. We also completed analyses using a newly pulled case–control dataset without the restriction of patients having at least 84 observation days after cohort entry.

IV Design and Analysis

Analytic Dataset

All patients in the depression cohort from fiscal year (FY) 2000 to 2004 with at least 84 observation days and one high-risk day were included (N=706,280). FY 1999 data were excluded because the data were available only for part of the year. The analytic dataset included 1,295,321 yearly patient-level observations, including HEDIS monitoring rates defined for each year. For the primary IV analysis, we excluded yearly data with <14 high-risk days because HEDIS monitoring rates may not be reliable when patients have small numbers of high-risk days. As the research question pertains to policies regarding minimal levels of monitoring, the primary analytic dataset also excluded yearly data from those with HEDIS visit rates >7 visits per 84 high-risk days, although this was subjected to sensitivity analysis. This gave 1,168,589 observation years from 682,001 unique patients for the primary IV analysis.
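
The two exclusions for the primary IV dataset amount to a simple filter over patient-year records. The sketch below assumes a hypothetical record layout with `high_risk_days` and `visit_rate` (visits per 84 high-risk days) fields; it is not the study's actual data structure.

```python
def primary_iv_observations(patient_years, min_high_risk_days=14, max_visit_rate=7.0):
    """Keep patient-year records eligible for the primary IV analysis.

    Drops years with fewer than 14 high-risk days (rates may be unreliable)
    and years with more than 7 visits per 84 high-risk days.
    """
    return [
        row for row in patient_years
        if row["high_risk_days"] >= min_high_risk_days
        and row["visit_rate"] <= max_visit_rate
    ]


if __name__ == "__main__":
    rows = [
        {"patient": "a", "fy": 2001, "high_risk_days": 84, "visit_rate": 2.0},
        {"patient": "a", "fy": 2002, "high_risk_days": 10, "visit_rate": 8.4},
        {"patient": "b", "fy": 2001, "high_risk_days": 60, "visit_rate": 9.8},
    ]
    print(primary_iv_observations(rows))
```

Raising `max_visit_rate` to infinity reproduces the sensitivity analysis that retains patient-years with more than 7 visits.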

IV Model

We used a two-stage IV regression model. The first stage was

$$R_{it} = \alpha_0 + \alpha_1 V_{it} + \alpha_2 X_{it} + \varepsilon_{it}$$

where Rit is the monitoring level of patient i at year t (the endogenous variable), Vit is the IV, εit is the effect of the unmeasured factors that affect the monitoring rate, and the α terms are regression coefficients. Xit includes variables known to affect suicide risks and potentially associated with increased monitoring (Zivin et al. 2007) (listed earlier under “Covariates”) and dummy variables for each FY. In the second stage, we estimated the yearly completed suicide linear regression model using the predicted monitoring rate from the first-stage model for each patient, $\hat{R}_{it}$, and the Xit.
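
To make the two-stage logic concrete, here is a minimal Python sketch on simulated data. It is a deliberately simplified illustration, not the study's model: it uses a single instrument, omits the covariates Xit and the year dummies, and uses a continuous outcome instead of a yearly suicide indicator. All coefficients in `simulate` are made up. An unmeasured "severity" variable raises both monitoring and the outcome, so the naive regression is biased upward while the two-stage estimate recovers the assumed effect.

```python
import random


def mean(xs):
    return sum(xs) / len(xs)


def cov(xs, ys):
    mx, my = mean(xs), mean(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (len(xs) - 1)


def ols_slope(x, y):
    """Slope from a simple linear regression of y on x."""
    return cov(x, y) / cov(x, x)


def two_stage_slope(v, r, y):
    """Two-stage least squares with one instrument and no covariates."""
    # Stage 1: regress monitoring r on the instrument v and predict r.
    a1 = ols_slope(v, r)
    a0 = mean(r) - a1 * mean(v)
    r_hat = [a0 + a1 * vi for vi in v]
    # Stage 2: regress the outcome on predicted monitoring.
    return ols_slope(r_hat, y)


def simulate(n=20000, true_effect=-0.2, seed=0):
    """Simulated data in which unmeasured severity confounds r and y."""
    rng = random.Random(seed)
    v, r, y = [], [], []
    for _ in range(n):
        severity = rng.gauss(0, 1)          # unmeasured confounder
        vi = rng.gauss(0, 1)                # instrument, independent of severity
        ri = 0.5 * vi + severity + rng.gauss(0, 1)
        yi = true_effect * ri + severity + rng.gauss(0, 1)
        v.append(vi)
        r.append(ri)
        y.append(yi)
    return v, r, y


if __name__ == "__main__":
    v, r, y = simulate()
    print("naive slope:", ols_slope(r, y))
    print("two-stage slope:", two_stage_slope(v, r, y))
```

Only the part of monitoring that moves with the instrument is used in the second stage, which is why IV estimates are less precise than the naive ones.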

An IV approach requires identifying IVs that predict levels of clinical monitoring but are not correlated with suicide, conditional on the level of monitoring (Newhouse and McClellan 1998). We constructed two alternative IVs, both based on the idea that practice patterns of VA facilities closest to one's residence may affect the type of treatment that one receives but may not be related to unmeasured factors that affect health outcomes. The first IV represented the average intensity of depression monitoring during high-risk periods for all patients at the facility used most often by the individual. For each individual, this IV was calculated as the average number of outpatient depression monitoring visits provided to all depressed VA patients (excluding the individual himself) during high-risk periods in the facility where the individual received the majority of his/her outpatient care during each FY.
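
The first IV is a leave-one-out facility average: each patient's value is the mean monitoring of the other patients at the same facility. A minimal Python sketch for a single fiscal year follows. The tuple layout and function name are hypothetical, and treating the facility average as an unweighted mean of patient-level rates is an assumption.

```python
from collections import defaultdict


def leave_one_out_facility_means(records):
    """Map each patient to the mean visit rate of the other patients
    at the same facility.

    records: iterable of (patient_id, facility_id, visit_rate) for one year.
    Patients who are alone at their facility get None.
    """
    records = list(records)
    totals = defaultdict(float)
    counts = defaultdict(int)
    for _, facility, rate in records:
        totals[facility] += rate
        counts[facility] += 1
    result = {}
    for patient, facility, rate in records:
        n_others = counts[facility] - 1
        # Subtract the patient's own rate so the IV excludes the individual.
        result[patient] = (totals[facility] - rate) / n_others if n_others > 0 else None
    return result


if __name__ == "__main__":
    data = [("a", "F1", 1.0), ("b", "F1", 3.0), ("c", "F1", 5.0), ("d", "F2", 2.0)]
    print(leave_one_out_facility_means(data))
```

Excluding the patient's own rate keeps the instrument from being mechanically correlated with that patient's monitoring.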

The second IV was a distance inverse-weighted average of the monitoring rates of all VA facilities based on the facility's distance from each individual. This IV was calculated for each individual as

$$V_i = \frac{\sum_{k=1}^{n} w_k M_k}{\sum_{k=1}^{n} w_k}$$

where Mk was the average monitoring level of kth facility, the weights (wk) were the inverse of the square root of distance to each facility from the centroid of each individual's zip code, and n was the number of facilities in the VA. IV approaches based on distance to facilities have been used in the health services research literature starting with McClellan, McNeil, and Newhouse (1994).
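
The distance inverse-weighted average described above can be sketched in Python as follows. The function name, the distance units, and the `min_distance` floor (added only to avoid division by zero when a patient's zip-code centroid coincides with a facility) are assumptions for illustration.

```python
import math


def distance_weighted_iv(facility_rates, distances, min_distance=1.0):
    """Weighted average of facility monitoring rates M_k, with weights
    w_k = 1 / sqrt(distance to facility k).

    min_distance is a hypothetical floor that keeps the weights finite.
    """
    weights = [1.0 / math.sqrt(max(d, min_distance)) for d in distances]
    weighted_sum = sum(w * m for w, m in zip(weights, facility_rates))
    return weighted_sum / sum(weights)


if __name__ == "__main__":
    # Two hypothetical facilities: the nearer one gets twice the weight.
    print(distance_weighted_iv([2.0, 4.0], [1.0, 4.0]))
```

Because the weights use the square root of distance, distant facilities are down-weighted only gradually, so every facility contributes something to each patient's value.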

We used these two different IVs because we hypothesized that each would offer distinct advantages. We expected that the first IV would be more strongly predictive of the monitoring intensity that a patient receives (the first condition of a valid IV), because this IV is a function of practice patterns only at the facility that the patient attends most. On the other hand, we expected that the second IV would be less likely to be correlated with unmeasured factors that affect health outcomes (the second condition of a valid IV), because it reflects the distance-weighted practice patterns of surrounding facilities, determined solely by where one lives rather than by the facility one attends the most (which may be affected by unmeasured severity level).

Statistical Analyses

To produce unbiased and consistent estimates, the IV should be (1) associated with the patient-level monitoring and (2) not associated with unmeasured patient or facility characteristics that are likely to affect suicide. To assess the first property for each of the two IVs, we checked the first-stage relationship between patient-level monitoring rate, Rit, and each IV using linear regression, controlling for Xit. To assess the second property, we compared the distribution of individual and facility-level characteristics between individuals with lower versus higher values than the median IV value, and across the individuals of quartiles based on IV values. Finding groups to be similar for most measured characteristics would strengthen the case that the IV was not related to unmeasured differences. Our distance-weighted IV estimate might also be confounded if people move closer to facilities that practice more intensive (or less intensive) monitoring because of unmeasured severity of their condition. To explore this further, we looked at a subset of people who moved their residence (on a yearly basis) to see if those who move closer to facilities that practice more versus less intensive monitoring are different in their severity. Finally, we used a two-stage regression based on a linear probability model to examine the relationship between suicide and the IV, adjusting for the covariates Xit. The model was estimated using a two-stage model with robust standard errors to account for potential heterogeneity of the error terms within multiple years of data from a person. We used the Hausman test to assess whether there was a significant difference between the parameter estimates of the IV and the naïve analysis estimate using monitoring rate as the predictor (Hausman 1978).

Sensitivity Analyses

To confirm that results did not vary by year, we fit the model separately for each FY and assessed whether the direction of the relationship remained consistent across years. IV estimates were also obtained using two-stage least-squares random-effects estimators to account for multiple years of data from each person. Because excluding data with HEDIS visit rates >7 per 84 high-risk days may bias the sample toward less severe cases for suicide risk, a sensitivity analysis was done including data with HEDIS visit rates >7. Because treatment from non-VA providers is less common for those with a service connection, we also did a subgroup analysis of persons with a service connection to see whether the results are more reliable in this subgroup. Lastly, recognizing that an increase in monitoring may affect suicide only if it reaches a certain intensity, we conducted an analysis where a threshold monitoring effect was assumed at ≥4 visits, as the average increase to bring patients to 7 visits during high-risk periods was 3.5 visits.

Results


Table 1 shows the mean HEDIS monitoring rate for the entire depression cohort with ≥84 observation days in the first 12 months following cohort entry (N=835,944) and for the subsets of patients who had ≥1 or ≥14 high-risk days in the first 12 months following entry. Monitoring rates were higher among patients with indications of more severe psychiatric illness, such as having psychiatric hospitalizations or diagnoses of substance use disorder or PTSD. These characteristics are also associated with increased rates of suicide in this population (Zivin et al. 2007), suggesting that confounding of the relationship between monitoring and completed suicide by treatment indication is plausible and may be an important source of bias.

Table 1
Monitoring Levels by Measures of Patient Severity: Rate of HEDIS Visits on High-Risk Days in the First 12 Months Following Cohort Entry*

Case–Control Analysis

For matched cases and controls, the monitoring rates during high-risk periods were similar. Cases had a mean HEDIS visit rate of 2.9 visits (SD=5.3) per 84 high-risk days, while controls had 2.7 visits (SD=5.0). Table 2 shows demographics and other patient characteristics for cases and controls. Despite matching on index date, sex, age, period of observation, and numbers of high-risk days, cases were still more likely than controls to be white and non-Hispanic and to receive larger numbers of psychotropic medications, and less likely to have comorbid PTSD or a service connection. The unadjusted odds ratio (OR) of suicide associated with an increase of one HEDIS visit per 84 high-risk day period was 1.00 (p=.27), and the covariate-adjusted OR was 1.02 (p=.03, Table 3).
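
Because the logistic model is linear in the log odds, a per-visit OR can be rescaled to a larger change in monitoring by raising it to the number of additional visits. The short Python sketch below applies this to the adjusted OR of 1.02; the function name is hypothetical, and the choice of four additional visits is only an example.

```python
import math


def scale_odds_ratio(or_per_visit, extra_visits):
    """Odds ratio implied for `extra_visits` additional visits, assuming
    the log odds are linear in the visit rate."""
    return math.exp(extra_visits * math.log(or_per_visit))


if __name__ == "__main__":
    # Adjusted OR of 1.02 per visit, rescaled to four additional visits.
    print(round(scale_odds_ratio(1.02, 4), 3))
```

The same rescaling applies to confidence limits, since they are computed on the log-odds scale.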

Table 2
Case–Control Analysis—Demographic Characteristics and Other Variables by Suicide for Patients and Controls Who Had At Least One High-Risk Day in Prior Year
Table 3
Case–Control Analysis—From Conditional Logistic Regression Model for the Relationship between Suicide and HEDIS Visit Rate

In sensitivity analyses, the positive relationship between monitoring and suicide persisted (OR=1.03, p=.05) even when the analysis was limited to 1,547 patients (525 cases and 1,022 controls) who lived more than a year following their cohort entry. Similar results were obtained when the HEDIS visit rate was entered as five dummy categorical variables to check for nonlinear relationships. The GENMOD model with logit link also resulted in an adjusted OR of 1.02 (p=.01) when all cases were included and an adjusted OR of 1.03 (p=.03) when the analysis was limited to those who lived more than a year following cohort entry. A Cox proportional hazard model with late entry did not give different results. The OR was also very similar when cases (and their matching controls) with HEDIS visit rate >7 were excluded (OR=1.07, confidence interval [CI]=1.01, 1.13) and when using a newly pulled case–control dataset that did not exclude those with <84 observation days (OR=1.02, CI=1.01, 1.03).

IV Analysis

Table 4 shows that patient-level monitoring is significantly higher for patients in high-intensity monitoring facilities (higher than median) than in low-intensity monitoring facilities, supporting the first criterion for a valid IV. Similarly, when patients were grouped into smaller increments of facility-level monitoring, the mean patient-level HEDIS monitoring rates increased from 1.3 in patients with the lowest facility-level monitoring (between 1.1 and 1.3) to 2.4 in those with the highest facility monitoring level (between 5.7 and 5.9). In addition, the Kleibergen–Paap test of underidentification (Kleibergen and Paap 2006), which tests for the correlation between the IVs and the patient-level monitoring, was significant (p<.001) for both our IVs.

Table 4
Instrumental Variable Analysis—Patient Characteristics by High Versus Low Facility-Level Monitoring (IV; Dichotomized at the Median Rate of 2.22 Visits per 84 High-Risk Days) and by High Versus Low Distance Inverse-Weighted Facility-Level Monitoring ...

The second property for a valid IV cannot be tested directly, but examining how measured covariates correlate with the IV can give a sense of whether unmeasured differences are likely to be associated with suicide and thus consequential. Patient and facility characteristics in Table 4 for low- versus high-monitoring facilities based on each IV show similarity of several patient-level characteristics between these categories, particularly the nearly identical distributions of gender, age, anxiety, and comorbidity status. However, substantial differences between low- versus high-monitoring facilities are seen in the numbers of psychiatric inpatient days, psychiatric and substance use diagnoses, and geographic region, which may reflect idiosyncratic differences in diagnosis and practice patterns across geographic areas. We note that although these differences are substantial in some cases, they are much smaller than the differences between high versus low patient-level monitoring groups (not shown), suggesting that the IV estimate may be less prone to treatment indication bias than a standard single equation approach. We also did not find any patterns in severity, such as more inpatient psychiatric days or a higher percentage of psychiatric illnesses, in patients who moved closer to facilities practicing more intensive monitoring compared with those who moved closer to facilities practicing less intensive monitoring. Lastly, as an indirect test of the second property, we tested for the relationship between suicide and the IVs, conditional on the patient level of monitoring and other covariates, and found that it was not significant (p=.91 for the facility-level IV and p=.24 for the distance-weighted IV).

The first-stage model (Appendix SA2), with the endogenous patient-level monitoring as the dependent variable, indicated that 13.0 percent of variation in monitoring rate is explained by the average facility-level monitoring and other exogenous variables. Similarly, 12.5 percent of the variation in patient-level monitoring is explained by the distance inverse-weighted average facility-level monitoring and other exogenous variables. In particular, each additional facility-level average visit and distance inverse-weighted facility-level visit was associated with an unadjusted estimate of 0.28 (p<.001) and 0.70 (p<.001) more patient visits during high-risk periods, respectively. After adjusting for other patient and facility characteristics variables, the estimates were 0.20 (partial F-test, p<.001) for facility-level average monitoring and 0.99 (partial F-test, p<.001) for distance inverse-weighted facility-level monitoring. Both IVs were strong predictors of patient-level monitoring; however, the IV based on average monitoring at the facility of most use had a standard deviation 3.5 times larger than that of the IV based on distance-weighted monitoring practices, indicating that it provides more variation and therefore more statistical power (though at the possible expense of higher bias, as noted earlier). Other patient characteristics, such as psychiatric hospitalization and comorbid psychiatric conditions, showed the expected positive relationships with monitoring.

An unadjusted “naïve” linear probability model estimate showed that suicide increased by 0.003 percent (p=.03) with each additional monitoring visit. After adjusting for the full set of patient and facility covariates, suicide still increased by 0.003 percent (p=.05) with each additional monitoring visit. The IV estimate (Table 5) using facility-level monitoring indicated a positive but nonsignificant change (0.0008 percent, p=.97) in suicide with each additional monitoring visit. On the other hand, the distance-weighted IV estimate indicated a decrease in suicide of 0.032 percent with each additional visit, although this was still not statistically significant (p=.29). Hausman tests failed to reject the hypothesis of no endogeneity, indicating that the difference in coefficients between the naïve and IV analyses was not significant using either IV.

Table 5
Instrumental Variable (IV) Analyses—The Estimates Are of HEDIS Monitoring Rate from IV Analysis with Facility-Level Monitoring as an IV and with Distance Inverse-Weighted Facility Monitoring as an IV, Based on Linear Probability Model of Suicide ...

When a random-effects model was used instead of robust standard errors, we found the estimates to differ in size, but there was no change in these basic patterns. IV estimates from models fit separately for each year varied by year with both IVs, but with no consistent direction of association; that is, the estimates neither consistently increased nor decreased over the years. IV estimates based on data that included patients with HEDIS rates >7 and estimates based on data that included only patients with a service connection both indicated a decrease in suicide risk with increased monitoring using either IV. Thus, all of these estimates, although not statistically significant, suggested a reduction in suicide (Table 5). Lastly, a model with a threshold effect of ≥4 visits during high-risk periods showed a decrease of 0.011 percent (p=.88) in suicide risks using the facility monitoring IV and a decrease of 0.151 percent (p=.08) using the distance-based IV.

Discussion


We used data from a large national cohort of VHA patients receiving depression treatment to complete the first comprehensive assessment of monitoring visits and suicide deaths. We show that in this observational dataset, patient-level characteristics associated with higher risks of suicide (Spettell et al. 2003) were also associated with higher levels of monitoring, suggesting there are likely substantial treatment selection biases when assessing the relationship between suicide and clinical monitoring.

We used two analytic approaches to examine the link between monitoring and suicide while addressing these treatment selection biases: the traditional case–control approach and an IV approach. We found that these analytic approaches resulted in different conclusions.

Although widely used, traditional case–control analyses with the covariate data available in large administrative databases appeared unsuccessful in addressing treatment selection biases. Case–control analyses indicated that visits with clinicians were significantly associated with a slight increase in suicide deaths among patients during high-risk periods. This would seem an unlikely scenario unless one believes that clinical contacts increase patient distress or impulsivity. Instead, this finding is likely due to residual treatment indication biases because of insufficient patient information in administrative data to adjust for factors associated with both monitoring and suicide.

Although some IV analyses based on facility of most use suggested no impact of monitoring on suicide risk, several IV analyses, particularly those using the distance-weighted IV, produced results more in line with expert opinion, suggesting that increased monitoring may lead to reduced suicide risks. However, none of the IV results were statistically significant.

IV estimates may be less subject to treatment indication biases than either the naïve single equation estimate or the case–control study estimates, and they may more closely address policy questions of whether to increase or decrease monitoring for defined populations during high-risk periods. IV analyses assess the marginal impact of increases in monitoring visits—the impact of increased monitoring for individuals who receive this closer monitoring only because of differences in facility practices (Newhouse and McClellan 1998). Some patients, because of high-risk behaviors, will receive high levels of monitoring regardless of usual facility practices, and IV estimates do not reflect the benefit these individuals receive from close monitoring. However, because IV approaches focus only on treatment variation that can be explained by the IV, these analyses tend to produce larger standard errors than more direct methods. Likely due to this imprecision and low base rate of completed suicide, IV analyses did not provide a definitive answer regarding the relationship between closer monitoring and suicide despite the use of a large administrative database (Sturm 1998).

We note that while nonsignificant findings in IV analyses might be due to residual bias or imprecision, it is also possible that clinical visits as currently practiced are not effective in reducing suicide (though they may be effective in addressing other concerns). Routine clinical visits may fail to include a careful assessment of suicidal ideation, plans, or access to lethal means. Alternatively, suicidal ideation or other clinical indicators of risk may not be present during routinely scheduled visits. Suicide attempts are often impulsive and are frequently planned for fewer than 30 minutes before being enacted (Simon et al. 2001). Therefore, even large increases in monitoring (e.g., more than four contacts) over the course of 84 days may not be sufficient to detect these short periods of acute risk. Finally, even if acute suicide risk is detected during a visit, few clinical interventions have to date been shown to be effective in reducing these risks (Mann et al. 2005). These possibilities, in combination with our lack of a robust finding for a protective effect, highlight a need to further refine and evaluate additional clinical approaches for reducing suicide risks.

We have reported previously that increasing monitoring from current levels to FDA-suggested levels would mean a substantial reorganization of health services along with substantial incremental costs (Valenstein et al. 2009). As noted previously, RCTs of sufficient size to demonstrate a clear link between monitoring and suicide deaths are impractical. We now show that it is also difficult to demonstrate a clear link between higher levels of monitoring and reduced suicide mortality using observational data (in which clinicians are following individuals they deem at greater risk more closely), even when using case–control and IV analyses to address treatment biases. Given the difficulty in demonstrating this link, we believe health care organizations and clinicians will remain unwilling to change current behaviors and press forward with implementing broad policy recommendations regarding monitoring unless a public consensus develops that these activities should proceed without firm evidence. Indeed, because of the lack of evidence for effectiveness and the large investment that would be required to implement a blanket policy of close monitoring during high-risk periods for all patients, an argument could be made that treatment resources might be better used for mental health interventions with stronger evidence for effectiveness. Future research using large datasets with more detailed information on potential confounders, along with the development of new methodologies to address treatment selection biases in observational data, is clearly needed.

Study Limitations

Diagnoses, demographics, and cause of death may not be completely accurate in administrative databases. However, VHA administrative data quality is generally considered good, with high levels of concordance between VHA administrative data and medical record data (Kashner 1998; Cowper et al. 1999). The NDI is also considered the “gold standard” of U.S. mortality databases (Cowper et al. 2002). Findings for VHA patients also may not generalize to non-VHA patient populations, and findings within the VHA may change as greater numbers of younger veterans enter the health care system following their return from Iraq or Afghanistan.

In summary, although expert and governmental recommendations have urged closer monitoring for depressed patients during high-risk periods to prevent suicide, strong evidence for this recommendation may be difficult to generate. When using observational datasets to address this issue, IV analyses appear less biased than case–control approaches; however, even with a very large database, the application of the IV method appears limited by the low rate of suicide and the resulting increased variability of its estimates, making it difficult to arrive at definite answers regarding this relationship.


Joint Acknowledgment/Disclosure Statement: This research was supported by grants from the Department of Veterans Affairs, Health Services Research and Development Service, IIR 04-211-1 and MRP 03-320, and by the National Institute of Mental Health, R01-MH078698-01.

Disclosures: None.

Disclaimers: None.

Supporting Information

Additional supporting information may be found in the online version of this article:

Appendix SA1: Author Matrix.

Appendix SA2: First-Stage Regression Results from IV Analysis.

Please note: Wiley-Blackwell is not responsible for the content or functionality of any supporting materials supplied by the authors. Any queries (other than missing material) should be directed to the corresponding author for the article.


References

  • American Psychiatric Association. Practice Guideline for the Assessment and Treatment of Patients with Suicidal Behaviors. American Journal of Psychiatry. 2003;160(11, suppl):1–60. [PubMed]
  • Charbonneau A, Rosen AK, Ash AS, Owen RR, Kader B, Spiro A, Hankin C, Herz LR, Jo M, Pugh V, Kazis L, Miller DR, Berlowitz DR. Measuring the Quality of Depression Care in a Large Integrated Health System. Medical Care. 2003;41(5):669–80. [PubMed]
  • Cowper DC, Hynes DM, Kubal JD, Murphy PA. Using Administrative Databases for Outcomes Research: Select Examples from VA Health Services Research and Development. Journal of Medical Systems. 1999;23(3):249–59. [PubMed]
  • Cowper DC, Kubal JD, Maynard C, Hynes DM. A Primer and Comparative Review of Major US Mortality Databases. Annals of Epidemiology. 2002;12(7):462–8. [PubMed]
  • Fortney J, Rost K, Zhang M, Pyne J. The Relationship between Quality and Outcomes in Routine Depression Care. Psychiatric Services. 2001;52(1):56–62. [PubMed]
  • Gunnell D, Saperia J, Ashby D. Selective Serotonin Reuptake Inhibitors (SSRIs) and Suicide in Adults: Meta-Analysis of Drug Company Data from Placebo Controlled, Randomised Controlled Trials Submitted to the MHRA's Safety Review. British Medical Journal. 2005;330(7488):385–8. [PMC free article] [PubMed]
  • Hausman JA. Specification Tests in Econometrics. Econometrica. 1978;46(6):1251–71.
  • Institute of Medicine. Reducing Suicide: A National Imperative. Washington, DC: National Academies Press; 2004.
  • Kashner TM. Agreement between Administrative Files and Written Medical Records: A Case of the Department of Veterans Affairs. Medical Care. 1998;36(9):1324–36. [PubMed]
  • Kerr EA, McGlynn EA, Van Vorst KA, Wickstrom SL. Measuring Antidepressant Prescribing Practice in a Health Care System Using Administrative Data: Implications for Quality Measurement and Improvement. Joint Commission Journal of Quality Improvement. 2000;26(4):203–16. [PubMed]
  • Kleibergen F, Paap R. Generalized Reduced Rank Tests Using the Singular Value Decomposition. Journal of Econometrics. 2006;133:97–126.
  • Mann JJ, Apter A, Bertolote J, Beautrais A, Currier D, Haas A, Hegerl U, Lonnqvist J, Malone K, Marusic A, Mehlum L, Patton G, Phillips M, Rutz W, Rihmer Z, Schmidtke A, Shaffer D, Silverman M, Takahashi Y, Varnik A, Wasserman D, Yip P, Hendin H. Suicide Prevention Strategies: A Systematic Review. Journal of American Medical Association. 2005;294(16):2064–74. [PubMed]
  • McClellan M, McNeil BJ, Newhouse JP. Does More Intensive Treatment of Acute Myocardial Infarction in the Elderly Reduce Mortality? Analysis Using Instrumental Variables. Journal of American Medical Association. 1994;272(11):859–66. [PubMed]
  • National Center for Injury Prevention and Control, Center for Disease Control (CDC) 2006. “Web-Based Injury Statistics Query and Reporting System” [accessed on October 18, 2006]. Available at
  • National Committee on Quality Assurance. 2006. “State of Health Care Quality Report, Antidepressant Medication Management” [accessed on October 17, 2006]. Available at
  • Newhouse JP, McClellan M. Econometrics in Outcomes Research: The Use of Instrumental Variables. Annual Review of Public Health. 1998;19:17–34. [PubMed]
  • Prentice RL, Langer R, Stefanick ML, Howard BV, Pettinger M, Anderson G, Barad D, Curb JD, Kotchen J, Kuller L, Limacher M, Wactawski-Wende J, for the Women's Health Initiative Investigators. Combined Postmenopausal Hormone Therapy and Cardiovascular Disease: Toward Resolving the Discrepancy between Observational Studies and the Women's Health Initiative Clinical Trial. American Journal of Epidemiology. 2005;162(5):404–14. [PubMed]
  • Salkever DS, Slade EP, Karakus M, Palmer L, Russo PA. Estimation of Antipsychotic Effects on Hospitalization Risk in a Naturalistic Study with Selection on Unobservables. Journal of Nervous and Mental Disease. 2004;192(2):119–28. [PubMed]
  • Simon GE. Psychiatric Disorder and Functional Somatic Symptoms as Predictors of Health Care Use. Psychiatric Medicine. 1992;10(3):49–59. [PubMed]
  • Simon GE, Savarino J, Operskalski B, Wang PS. Suicide Risk during Antidepressant Treatment. American Journal of Psychiatry. 2006;163(1):41–7. [PubMed]
  • Simon TR, Swann AC, Powell KE, Potter LB, Kresnow M, O'Carroll PW. Characteristics of Impulsive Suicide Attempts and Attempters. Suicide and Life-Threatening Behavior. 2001;32(suppl):49–59. [PubMed]
  • Spettell CM, Wall TC, Allison J, Calhoun J, Kobylinski R, Fargason R, Kiefe CI. Identifying Physician-Recognized Depression from Administrative Data: Consequences for Quality Measurement. Health Services Research. 2003;38(4):1081–102. [PMC free article] [PubMed]
  • Stettin GD, Yao J, Verbrugge RR, Aubert RE. Frequency of Follow-Up Care for Adult and Pediatric Patients during Initiation of Antidepressant Therapy. American Journal of Managed Care. 2006;12(8):453–61. [PubMed]
  • Sturm R. Instrumental Variable Methods for Effectiveness Research. International Journal of Methods in Psychiatric Research. 1998;7(1):17–26.
  • U.S. Food and Drug Administration. 2004. “FDA Public Health Advisory: Suicidality in Children and Adolescents Being Treated with Antidepressant Medications” [accessed on September 10, 2006]. Available at
  • U.S. Food and Drug Administration. 2005. “FDA Public Health Advisory: Suicidality in Adults Being Treated with Antidepressant Medication” [accessed on September 10, 2006]. Available at
  • Valenstein M, Eisenberg D, McCarthy JF, Austin KL, Ganoczy D, Kim HM, Zivin K, Piette JD, Olfson M, Blow FC. Service Implications of Providing Intensive Monitoring during High-Risk Periods for Suicide among VA Patients with Depression. Psychiatric Services. 2009;60(4):439–44. [PMC free article] [PubMed]
  • Valenstein M, Kim HK, Ganoczy D, McCarthy JF, Zivin K, Austin KL, Hoggatt K, Eisenberg D, Piette JD, Blow FC, Olfson M. 2007. Periods of Increased Suicide Risks among VA Patients Receiving Depression Treatment: System and Policy Implications. Nineteenth NIMH Research Conference on Mental Health Services MHSR. Washington, DC. [PMC free article] [PubMed]
  • Zivin K, Kim HM, McCarthy JF, Austin KL, Hoggatt K J, Walters HM, Valenstein M. Suicide Mortality among Individuals Receiving Treatment for Depression in the Veterans Affairs Health System: Associations with Patient and Treatment Setting Characteristics. American Journal of Public Health. 2007;97(12):2193–8. [PubMed]

Articles from Health Services Research are provided here courtesy of Health Research & Educational Trust