HHS Public Access; Author Manuscript; Accepted for publication in peer reviewed journal
Psychol Med. Author manuscript; available in PMC 2012 April 1.
Published in final edited form as:
PMCID: PMC3045479

Including information about comorbidity in estimates of disease burden: Results from the WHO World Mental Health Surveys



The methodology commonly used to estimate disease burden, featuring ratings of severity of individual conditions, has been criticized for ignoring comorbidity. A methodology that addresses this problem is proposed and illustrated here with data from the WHO World Mental Health Surveys. Although the analysis is based on self-reports about one’s own conditions in a community survey, the logic applies equally well to analysis of hypothetical vignettes describing comorbid condition profiles.


Face-to-face interviews in 15 countries (six developing, nine developed; n = 31,067; response rate = 69.6%) assessed 10 classes of chronic physical and 9 of mental conditions. A visual analog scale (VAS) was used to assess overall perceived health. Multiple regression analysis with interactions for comorbidity was used to estimate associations of conditions with VAS. Simulation was used to estimate condition-specific effects.


The best-fitting model included condition main effects and interactions of types by numbers of conditions. Neurological conditions, insomnia, and major depression were rated most severe. Adjustment for comorbidity reduced condition-specific estimates with substantial between-condition variation (.24–.70 ratios of condition-specific estimates with and without adjustment for comorbidity). The societal-level burden rankings were quite different from the individual-level rankings, with the highest societal-level rankings associated with conditions having high prevalence rather than high individual-level severity.


Plausible estimates of disorder-specific effects on VAS can be obtained using methods that adjust for comorbidity. These adjustments substantially influence condition-specific ratings.

Keywords: global burden of disease, visual analog scale (VAS), epidemiology, mental health, comorbidity


It is becoming increasingly clear that no country can afford to provide universal healthcare coverage for all illnesses to all citizens. Triage rules are needed to allocate available healthcare resources to deal with the inevitable shortfall between resources and need. Among the several kinds of information used to help develop these rules, comparative illness burden estimates have been especially valuable as a reference standard for government health policy planners (Lopez & Mathers, 2007; Murray & Lopez, 1996; Murray et al. 2001). A central component of these estimates is the condition-specific severity weight, a statistic obtained by having expert raters evaluate the relative burdens of different conditions using the person tradeoff method (Murray & Lopez, 1996; Murray et al. 2001; World Health Organization, 2004). An important limitation of this approach is that the vignettes represent single conditions rather than more realistic cases where an individual suffers from a number of different conditions (Fortin et al. 2007). This is an important limitation because methodological research has shown that condition-specific severity weights vary as a function of the presence of comorbidity (Moussavi et al. 2007).

Previous attempts to take comorbidity into consideration in estimating condition-specific illness burden have been limited by the fact that simplistic models were used to estimate effects (Maddigan et al. 2005; Verbrugge et al. 1989). The current report presents the results of an analysis aimed at generating condition-specific estimates of disease burden in a more realistic way. The method is illustrated in an analysis of data collected in general population surveys on the joint associations of health conditions reported by respondents in a series of community epidemiologic surveys and overall respondent ratings of perceived health, although the same logic could be applied to the analysis of complex vignettes describing comorbid condition profiles.


The sample

Data come from surveys carried out in 15 countries by the World Health Organization (WHO) World Mental Health (WMH) Survey Initiative (Kessler & Üstün, 2008). Six countries are classified by the World Bank as developing (Colombia, Lebanon, Nigeria, Mexico, People’s Republic of China, Ukraine) and nine as developed (Belgium, France, Germany, Italy, Israel, Japan, Netherlands, Spain, and United States of America). (Table 1) Country-specific response rates ranged from 45.9% (France) to 87.7% (Colombia), with a weighted (by sample size) average response rate across surveys of 69.6%. All surveys were based on probability samples of the adult household populations in the participating countries or regions within the countries. Respondents were ages 18+ other than in Israel, where the minimum age was 21. The upper end of the age range was unbounded in all countries other than Colombia, Mexico, and the People’s Republic of China, where the upper bound was 65. More details about WMH sampling and eligibility are reported elsewhere (Heeringa et al. 2008).

Table 1
Sample characteristics of the World Mental Health Surveys

All WMH interviews were conducted face-to-face by trained lay interviewers. Standardized interviewer training and quality control procedures were used (Pennell et al. 2008). Informed consent was obtained before beginning interviews. Each interview had two parts. All respondents completed Part I, which contained assessments of core mental disorders. The Part II interview, which assessed physical disorders and correlates, was administered to 100% of respondents who met lifetime criteria for any Part I mental disorder plus a probability sub-sample of other Part I respondents. A Part II weight equal to the inverse of the respondent’s probability of selection into Part II was used to adjust for differential selection into Part II.
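The Part II weighting described above can be illustrated with a minimal sketch. The selection probabilities, field names, and respondents below are hypothetical and are not WMH data; the sketch covers only the inverse-probability-of-selection weight mentioned in the text.

```python
def part2_weight(selection_probability):
    """Inverse-probability weight for a respondent selected into Part II."""
    if not 0 < selection_probability <= 1:
        raise ValueError("selection probability must be in (0, 1]")
    return 1.0 / selection_probability


def weighted_prevalence(respondents):
    """Weighted share of Part II respondents who report a condition."""
    total_weight = sum(part2_weight(r["p_select"]) for r in respondents)
    case_weight = sum(
        part2_weight(r["p_select"]) for r in respondents if r["has_condition"]
    )
    return case_weight / total_weight


# Hypothetical Part II respondents. Those meeting Part I criteria enter
# with probability 1; the others are sub-sampled (here at an assumed 25%).
part2_sample = [
    {"p_select": 1.0, "has_condition": True},
    {"p_select": 1.0, "has_condition": True},
    {"p_select": 0.25, "has_condition": False},
    {"p_select": 0.25, "has_condition": True},
]

prevalence = weighted_prevalence(part2_sample)
```

Each sub-sampled respondent counts for four people in this toy example, so the weighted estimate is restored to the scale of the full Part I sample.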


Chronic physical conditions

Physical conditions were assessed with a chronic conditions checklist based on the US National Health Interview Survey list (Center for Disease Control and Prevention, 2004; Schoenborn et al. 2003). Respondents were asked to report whether they ever had a series of symptom-based conditions (e.g., chronic headaches) and whether a health professional ever told them they had a series of silent conditions (e.g., cancer). Information was also obtained on whether reversible conditions were still present in the past 12 months. Checklists like this yield more accurate reports than estimates derived from responses to open-ended questions (Baker et al. 2001; Knight et al. 2001). These reports were grouped into ten categories to maximize comparability with previous studies (Murray et al. 2001). The categories include arthritis, cancer, cardiovascular disorders (heart attack, heart disease, hypertension, stroke), chronic pain conditions (chronic back or neck pain, other chronic pain conditions), diabetes, frequent or severe headaches or migraines, chronic insomnia, neurological disorders (multiple sclerosis, Parkinson’s, epilepsy, seizure disorders), digestive disorders (stomach or intestinal ulcer, irritable bowel disorder), and respiratory disorders (seasonal allergies, asthma, COPD, emphysema).

Mental disorders

Mental disorders were assessed with Version 3.0 of the WHO Composite International Diagnostic Interview (CIDI), a fully structured lay-administered interview designed to generate diagnoses of common mental disorders according to the definitions and criteria of both the ICD-10 and DSM-IV systems (Kessler & Üstün, 2004, 2008). DSM-IV criteria are used here. The nine mental disorders include major depressive episode, bipolar disorder I–II, panic-agoraphobia (panic disorder or agoraphobia without a history of panic disorder), specific phobia, social phobia, generalized anxiety disorder, post-traumatic stress disorder, alcohol abuse with or without dependence, and drug abuse with or without dependence. WMH clinical reappraisal studies have shown that the diagnoses of these disorders based on the CIDI have generally good concordance with diagnoses based on blinded clinician-administered reappraisal interviews (Haro et al. 2006). As with physical conditions, we focus on mental conditions present at some time in the 12 months before interview.

Health valuation

After all physical and mental conditions had been assessed, respondents were asked to make a health valuation using a 0-to-100 visual analog scale (VAS), where 0 represents the worst possible health a person can have and 100 represents perfect health. They used this scale to describe their own overall physical and mental health during the past 30 days, taking into consideration all the physical and mental conditions reviewed in the survey. The recall period for the VAS (30 days) differs from that for the conditions (12 months) because we wanted to include effects not only of active conditions but also of recent conditions that, although not active, might still have an important effect on health valuations (e.g., a heart attack that occurred several months before the interview).

Analysis methods

A series of multiple regression models was used to estimate joint predictive associations of conditions with VAS scores controlling for age, sex, and country. As the sample size was too small to allow each of the 524,288 (2^19) logically possible multivariate condition profiles to be a separate predictor, the models necessarily made simplifying assumptions about effects of comorbidity. The first multivariate model (M1) assumed additivity; that is, a separate predictor for each condition without interactions. M2 included a series of predictors for number of conditions (e.g., one predictor for having exactly one condition, another for exactly two, etc.) without information about type of condition. M3 included the 19 predictors for type of condition along with predictors for number of conditions. The number-of-conditions dummies in this model represent aggregate patterns of comorbidity assumed independent of types. M4 allowed for the effects of type to be a linear function of number of other conditions. More complex models allowed for interactions of type with number using weighted counts based on type coefficients, but these results are not reported because the models did not fit the data as well as the simpler models.
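To make the four specifications concrete, the following sketch builds the condition-related predictors for a single respondent under each model. It is an illustration only: the function names, the top-coding of the number-of-conditions dummies at four or more, and the omission of the one-condition dummy in M3 (treated here as the reference category) are assumptions, and the age, sex, and country controls are left out.

```python
N_CONDITIONS = 19  # 10 physical + 9 mental condition classes

# Each condition is either present or absent, giving 2**19 possible profiles.
N_PROFILES = 2 ** N_CONDITIONS


def count_dummies(k, first, last):
    """Dummies for exactly `first`, ..., `last - 1`, and `last` or more conditions."""
    return [int(k == j or (j == last and k >= last)) for j in range(first, last + 1)]


def design_row(profile, model, top_code=4):
    """Return the condition-related predictors for one respondent.

    profile  : list of 0/1 flags, one per condition type
    model    : "M1", "M2", "M3" or "M4"
    top_code : assumed highest number-of-conditions category ("4 or more")
    """
    k = sum(profile)          # number of conditions the respondent has
    types = list(profile)     # one dummy per condition type
    if model == "M1":         # additive: type dummies only
        return types
    if model == "M2":         # number of conditions only
        return count_dummies(k, 1, top_code)
    if model == "M3":         # type dummies plus number-of-conditions dummies
        return types + count_dummies(k, 2, top_code)
    if model == "M4":         # type dummies plus type x (number of other conditions)
        return types + [x * (k - 1) for x in types]
    raise ValueError(f"unknown model: {model}")


# Hypothetical respondent with the first and fourth conditions.
example = [0] * N_CONDITIONS
example[0] = 1
example[3] = 1
m4_row = design_row(example, "M4")
```

For this respondent each of the two conditions has one other condition alongside it, so the two corresponding M4 interaction terms equal 1 and all others are 0.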

The skewed distribution of the VAS scores made ordinary least squares (OLS) regression analysis both biased and inefficient. This problem was addressed in two ways. First, a two-part modeling approach (Duan et al. 1984) was used where a Part I logistic regression equation (Hosmer & Lemeshow, 2001) predicted having a VAS score of 100 versus less than 100 in the total sample and a Part II linear regression equation predicted scores in the 0–99 range. Individual-level predicted scores were estimated by multiplying predicted values based on the two equations. A problem with this approach is that non-random variance in prediction errors can lead to bias even when sophisticated transformation methods are used (Manning, 1998). A second approach, generalized linear models (GLM), was used to address that problem by pre-specifying nonlinear associations and non-random error structures in one-part models. Such models can sometimes fit highly skewed data better than two-part models (Manning & Mullahy, 2001; McCullagh & Nelder, 1989; Mullahy, 1998). We used a number of different two-part model specifications and a number of standard GLM specifications and then selected the best specification using standard empirical model comparison procedures (Buntin & Zaslavsky, 2004). All models were estimated separately in developed and developing countries in an effort to obtain a rough indication of variation in results by development, but no attempt was made to estimate country-specific models.
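The way the two parts of such a model combine can be sketched as follows. The text does not give the exact parameterization, so this is one common formulation stated as an assumption: the two predictions are multiplied on the scale of the shortfall from a perfect score (100 − VAS), and the result is converted back to the VAS scale. The function names and input values are hypothetical.

```python
import math


def logistic(z):
    """Inverse logit, turning a Part I linear predictor into a probability."""
    return 1.0 / (1.0 + math.exp(-z))


def two_part_prediction(p_perfect, mean_below_100):
    """Combine Part I and Part II predictions into an expected VAS score.

    p_perfect      : Part I predicted probability that VAS equals 100
    mean_below_100 : Part II predicted VAS score given that VAS is below 100
    """
    # Expected shortfall = P(VAS < 100) * E[100 - VAS | VAS < 100]
    expected_shortfall = (1.0 - p_perfect) * (100.0 - mean_below_100)
    return 100.0 - expected_shortfall


# Hypothetical respondent: 20% chance of a perfect score, otherwise about 80.
predicted_vas = two_part_prediction(0.2, 80.0)
```

Under these assumed inputs the expected shortfall is 0.8 × 20 = 16 points, so the combined prediction is 84.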

M4, which allowed the effects of comorbidity to vary by type of condition as a linear function of number of other conditions, was the best-fitting model. This is a model of intermediate complexity in that it allows interactions to vary across conditions but not across particular pairs or higher numbers of disorders. Although this is unlikely to be the optimal interaction model, the fact that it provides the best fit across the range of models considered suggests that it is a useful first approximation. But a complication, as in any interaction model, is that the coefficients have no intuitive interpretation. We addressed this problem by using individual-level simulation to transform coefficients to a scale of average decrement in VAS scores associated with each condition. This was done by generating two estimates of predicted VAS scores for each respondent from each simulation. The first estimate was based on the model parameters in M4, while the second estimate was based on a revision of this model that assumed none of the respondents had one particular focal condition. The first estimate was then subtracted from the second and the sum across respondents was divided by the number of respondents with the focal condition to estimate the average individual-level decrease in VAS scores associated with that condition taking comorbidity into consideration. This estimate was then projected to the societal level (i.e., the effect on the mean VAS score) by multiplying it by condition prevalence.
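The simulation logic in this paragraph can be sketched in a few lines. The coefficients, the three-condition setup, and the four respondents below are hypothetical and chosen only to make the arithmetic easy to follow; sampling weights and the age, sex, and country controls are omitted.

```python
def predict_vas(profile, intercept, main, slope):
    """M4-style prediction: main effect plus slope x number of other conditions."""
    k = sum(profile)
    score = intercept
    for present, b, g in zip(profile, main, slope):
        if present:
            score += b + g * (k - 1)
    return score


def condition_estimates(profiles, focal, intercept, main, slope):
    """Return (individual-level decrement, societal-level decrement) for one condition."""
    total_difference = 0.0
    n_with_condition = 0
    for profile in profiles:
        if not profile[focal]:
            continue  # removing an absent condition changes nothing
        counterfactual = list(profile)
        counterfactual[focal] = 0  # same respondent without the focal condition
        total_difference += (
            predict_vas(counterfactual, intercept, main, slope)
            - predict_vas(profile, intercept, main, slope)
        )
        n_with_condition += 1
    if n_with_condition == 0:
        return 0.0, 0.0
    individual = total_difference / n_with_condition
    prevalence = n_with_condition / len(profiles)
    return individual, individual * prevalence


# Hypothetical coefficients; positive slopes give sub-additive comorbidity effects.
intercept = 90.0
main = [-10.0, -6.0, -4.0]
slope = [2.0, 1.0, 1.0]
profiles = [[1, 0, 0], [1, 1, 0], [0, 1, 0], [0, 0, 0]]

individual, societal = condition_estimates(profiles, 0, intercept, main, slope)
```

In this toy sample the respondent with only the focal condition loses 10 points and the comorbid respondent loses 7, giving an individual-level estimate of 8.5; with a prevalence of 0.5 the societal-level estimate is 4.25 points of mean VAS.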

It is noteworthy that the simulation approach, by virtue of the fact that it works with mean VAS scores, treats the VAS as an interval scale. This assumption has been called into question in some previous studies (Krabbe et al. 2006; Parkin & Devlin, 2006) and nonlinear monotonic transformations have been proposed to approximate interval scale properties (Krabbe, 2008). However, strong linear associations have been found between health state values based on VAS scores and ordinal (Craig et al. 2009) or partially-metric (Krabbe et al. 2007) scaling methods. As a result, and given that we explored a number of different nonlinear transformations of the VAS in the GLM models, we treated the VAS as an interval scale in the current analysis.

Because the WMH sample design featured weighting and clustering, all multiple regression analyses used the Taylor series linearization method (Wolter, 1985) implemented in the SUDAAN software system (Research Triangle Institute, 2002). Standard errors of simulation estimates were obtained using the method of Jackknife Repeated Replications (Wolter, 1985) implemented with a SAS macro (SAS Institute Inc., 2002). Statistical significance was consistently evaluated using two-sided .05 level tests.


Condition prevalence estimates

More than half of all respondents reported having one or more conditions in the 12 months before interview. (Table 2) Of those with any conditions, 54.6% had more than one and 51% of those with more than one had more than two conditions. The majority of conditions were reported to be more prevalent in developed than developing countries.

Table 2
Twelve-month prevalence estimates of chronic physical conditions and mental conditions separately in WMH surveys in developing and developed countries

Distribution of VAS scores

VAS scores are distributed quite similarly in developing and developed countries. Fewer than 10% of respondents in either set of countries have scores below 50, while 20.8% have scores of 100 and an additional 7.4% have scores in the range 91–99. The median (IQR) among respondents with scores less than 100 is 80 (70–90) in both developing and developed countries.

Selecting a functional form and error structure for the models

Seven one-part GLM models and seven two-part models were estimated. We evaluated comparative model fit by plotting associations between predicted mean VAS scores and observed mean scores for each decile of predicted VAS scores and using a number of other model-fitting tests that have been proposed in the econometrics literature (Buntin & Zaslavsky, 2004). (Detailed results are available on request.) The GLM model with a square root functional form and independent error structure and the one-part OLS model were found to be the best-fitting models in terms of all the tests we considered. Based on this result and the simpler interpretation of the OLS model than the GLM model, we chose the OLS model.
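The decile comparison of predicted and observed means can be sketched as below. This is a generic illustration of the check, not the authors' code; the function name and the toy data are assumptions.

```python
def decile_calibration(predicted, observed, n_groups=10):
    """Mean predicted and mean observed score within each decile of predictions."""
    pairs = sorted(zip(predicted, observed))  # order respondents by predicted score
    n = len(pairs)
    rows = []
    for g in range(n_groups):
        chunk = pairs[g * n // n_groups:(g + 1) * n // n_groups]
        if not chunk:
            continue  # fewer respondents than groups
        mean_predicted = sum(p for p, _ in chunk) / len(chunk)
        mean_observed = sum(o for _, o in chunk) / len(chunk)
        rows.append((mean_predicted, mean_observed))
    return rows


# Hypothetical data: 20 predictions with observed scores scattered around them.
predicted = [float(v) for v in range(60, 100, 2)]
observed = [p + (1.0 if i % 2 else -1.0) for i, p in enumerate(predicted)]

calibration = decile_calibration(predicted, observed)
```

A well-fitting specification gives pairs that lie close to the diagonal when plotted, as they do exactly in this constructed example.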

The individual-level predictive associations of conditions with VAS scores

The coefficients in M1 are significant as a set and show each condition to have a negative predictive association with VAS scores. (Table 3) (Only a single illustrative fit statistic is shown in Table 3. More detailed results for each model are available on request.) The coefficients in M2 are also significant as a set and show that VAS scores decrease monotonically with number of conditions. The M3 results show that the individual conditions continue to have generally negative coefficients when controlling for number of conditions and that the coefficients vary significantly across conditions. The coefficients associated with number of conditions in M3 are significantly positive. This indicates sub-additive interactions: that is, the joint adverse associations of comorbid condition clusters with VAS scores are less than the sum of the associations of the individual pure conditions in the clusters taken one at a time. M4 shows that these non-additive associations vary significantly across conditions.

Table 3
Model comparisons for the multivariate associations of conditions on VAS scores separately in WMH surveys in developing and developed countries

Simulated individual-level estimates

Transformation of the M4 coefficients using simulation shows that the condition-specific individual-level estimates are consistently negative. (Table 4) Coefficients for only two conditions (digestive disorders and specific phobia) differ significantly between developing and developed countries (both higher in developed). Magnitude of estimates is also quite similar in developing vs. developed countries, with median (IQR) values on the 0–100 VAS of 5.4 (3.2–5.8) in developing and 4.9 (3.1–7.1) in developed countries. Differences in coefficients across conditions are statistically significant in the total sample and fairly consistent in developing vs. developed countries. The Spearman rank-order correlation among condition estimates between developed and developing countries is .54. The most notable exception is drug abuse, ranked 1st in developing countries and 14th in developed countries.

Table 4
Simulated Individual-level condition-specific severity estimates based on the best-fitting regression model separately in WMH surveys in developing and developed countries

Coefficients based on the bivariate model (i.e., considering only one condition at a time in predicting VAS) are consistently higher than those in the multivariate model, with the condition-specific ratio of the latter to former in the range .24–.70 and a median (IQR) ratio of .42 (.31–.51). (Table 5) Very similar results are found in developing [.53 (.35–.62)] and developed [.41 (.27–.51)] countries. The influence of comorbidity can be seen in the fact that the correlation across conditions between mean number of comorbid conditions and the ratio of the coefficient based on the multivariate model to the coefficient based on the bivariate model is a statistically significant −.46.

Table 5
Individual-level condition-specific estimates based on bivariate and the best-fitting multivariate model in the total sample

Simulated societal-level predictive associations of conditions with mean VAS scores

Societal-level associations are a joint function of prevalence and severity. We derived these estimates by multiplying individual-level estimates by the condition prevalence estimates to arrive at estimated associations of conditions with changes in mean VAS scores in the population. (Table 6) Eight of the coefficients differ significantly between developing and developed countries, all but one higher in developed countries. The median (IQR) value of the coefficients is quite similar in developing [.09 (.03–.23)] and developed [.14 (.07–.40)] countries.

Table 6
Societal-level condition-specific estimates of effects on mean VAS scores based on the best-fitting multivariate model for developed and developing countries

While most societal-level coefficients do not differ significantly by development, 74.8% of the 171 (19×18/2) differences between pairs of the 19 coefficients are statistically significant at the .05 level in the total sample. The Spearman rank-order correlation among these conditions between sets of countries is .80. The top five conditions are the same in developing and developed countries, although the rankings differ somewhat. These top conditions are dominated by high-prevalence conditions with intermediate magnitudes of individual-level effects (6th–13th ranks), with only major depression being in the top five in terms of magnitude of individual-level effects.
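Two of the quantities in this paragraph are easy to reproduce: the number of distinct pairs among 19 coefficients and a Spearman rank-order correlation between two sets of estimates. The sketch below uses only the Python standard library; the five pairs of estimates are hypothetical and are not values from Table 6.

```python
from math import comb


def average_ranks(values):
    """Ranks starting at 1, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2.0 + 1.0
        for t in range(i, j + 1):
            ranks[order[t]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def spearman(x, y):
    """Spearman rank-order correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


n_pairs = comb(19, 2)  # pairwise comparisons among 19 coefficients

# Hypothetical societal-level estimates for five conditions.
developing = [0.9, 0.5, 0.3, 0.2, 0.1]
developed = [1.1, 0.4, 0.6, 0.2, 0.1]
rho = spearman(developing, developed)
```

In the toy data only the second and third conditions swap ranks between the two sets, which gives a correlation of 0.9.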


A number of limitations must be considered in interpreting these results. First, only a restricted set of common conditions was included in the analysis and some were pooled to form larger disorder groups. A number of burdensome conditions, such as dementia and psychosis, were not included. Expansion and disaggregation are clearly needed in future research. Second, diagnoses of chronic physical conditions were based on self-reports that could have been biased. Such bias might account for the generally higher prevalence estimates of these conditions in developed than developing countries. Third, we focused on 12-month prevalence of conditions but 30-day health valuations, as these were the time frames included in the WMH surveys. This difference in recall periods would be expected to lead to an under-estimate of the severity of the active phases of episodic conditions (e.g., migraine), although it should yield an accurate estimate of the average severity of conditions in a typical month (30-day) of the year (12-month). A related limitation is that even a 12-month time frame is relatively short compared to the time frames used in some other health valuation studies (e.g., 10 years or lifetime).

Another limitation is that the highly skewed distribution of VAS scores and non-additive effects of comorbid conditions might have led to instability of results. Even though we explored use of GLM rather than OLS and examined a number of different model specifications to capture effects of comorbidity, it is possible that future research will discover better specifications either of functional form or of joint associations of comorbid conditions with health valuations. In particular, the use of data mining techniques such as regression tree analysis (Breiman, 2001, 2009; Breiman et al. 1984; Friedman, 1991) might provide useful insights into better specification of interaction effects. A related limitation is that we assumed that the VAS is an interval scale. As noted above in the section on analysis methods, this assumption has been called into question in some previous studies (Krabbe et al. 2006; Parkin & Devlin, 2006). Nonlinear monotonic transformations have been proposed to approximate interval scale properties (Krabbe, 2008; Craig et al. 2009). It would be very useful in future methodological research to explore the extent to which these different methods influence results.

Another limitation is that our estimates were based only on the overall adult population in developed and developing countries. The ratings of conditions might be quite different in different population segments (e.g., elderly, women, poor) or in different countries. Future research is needed to investigate these specifications. The use of anchoring vignettes has been shown to help address this problem (Salomon et al. 2004). In addition, a number of statistical methods exist to improve the accuracy of comparisons across sub-samples and populations that could profitably be used in future applications (Tandon et al. 2002).

Another limitation is that our results are based on VAS scores assigned by respondents to their own health states rather than to health states based on hypothetical vignettes. While there is general agreement that perceptions of people in the general population should be taken into consideration in making health valuations (Gudex et al. 1996), concerns have been raised that bias exists in the perceptual ratings of community respondents based on their own illness experiences (Stiggelbout & de Vogel-Voogt, 2008) and their familiarity with the experiences of people close to them (Krabbe et al. 2006), resulting in a general preference for health valuations made by experts (Marquie et al. 2003). Furthermore, bias in self reports in the WMH data might have been greater for mental than physical conditions because so many questions were asked in the survey about mental conditions and the VAS was administered only at the end of the survey. It would be useful to investigate this potential bias in future applications by randomizing the order of presentation of the VAS question in the survey. Methods have been developed to integrate VAS responses with responses based on other valuation methods (e.g., time trade-off, willingness to pay) that might also profitably be used in future studies to evaluate these biases (Salomon & Murray, 2004).

A less obvious limitation, finally, is that the simulation method evaluated marginal effects of individual conditions. This method can be faulted because it implicitly assumes that the presence vs. absence of a single condition can be changed while holding constant all other conditions. This assumption would be plausible if all comorbid conditions were either causes or risk markers (Kraemer et al. 1997) of focal conditions. However, in cases where the comorbid condition is a consequence of the focal condition or where two or more conditions are reciprocally related, the simulation method used here will under-estimate the effect of the focal condition (assuming that comorbidity is positive) by controlling for one or more of the intervening pathways through which that condition influences VAS scores.

This under-estimation could be removed by deleting controls for all conditions that are thought to mediate the total effect of the focal condition. However, in the case where these comorbid conditions are reciprocally related to the focal condition, exclusion of the comorbid conditions from the prediction equation will lead to over-estimation of the effect of the focal condition. The only plausible way to address that issue is to develop a methodology of partial control: that is, to control for the subset of comorbid conditions that have causal effects on the focal conditions but not for the subset that occur as a consequence of the focal condition. An innovative methodology known as g-estimation has been developed to do this (Young et al. 2010), but this method requires access to large-scale longitudinal epidemiological data that monitor onset and course of comorbid conditions over time. As a result of this data requirement, g-estimation has been used only minimally (Taubman et al. 2009) and has never, to our knowledge, been used to study health valuation. This method is nonetheless very promising and deserves to be explored in future studies aimed at sorting out the effects of comorbidity on health valuation.

Within the context of these limitations, our results show clearly that sensible estimates can be obtained of condition-specific effects on VAS while taking comorbidity into consideration. As noted in the introduction, a similar approach could be used to study informant ratings by using a series of hypothetical vignettes of people with comorbid conditions rather than pure conditions. We find that the consideration of comorbidity makes a substantial difference to ratings. In particular, condition-specific ratings are lower when comorbidity is taken into consideration due to a general pattern of sub-additive interactions among comorbid conditions in predicting VAS scores. This sub-additive pattern is consistent with the findings of the one other previous study we know of that carried out a similar type of analysis (Verbrugge et al. 1989). Furthermore, we found substantial between-condition variation in the extent to which adjustment for comorbidity influences estimates.

Although the substantive findings regarding effects of individual conditions on VAS should be interpreted with caution given the limitations enumerated above, it is noteworthy that neurological conditions, insomnia, and major depression were estimated to be the most severe conditions at the individual level. The neurological conditions we considered included epilepsy and seizure disorders, Parkinson’s disease, and multiple sclerosis, all of which have been shown to have high disability in previous studies (Jacoby & Baker, 2008; Singer et al. 1999). The high ranking of insomnia is surprising because previous studies, although documenting a high societal-level burden of insomnia, have generally found this to be due to high prevalence in conjunction with moderate individual-level burden rather than to high individual-level burden (Roth et al. 2006). The explanation for the high individual-level severity of insomnia in our study probably lies in the fact that we required a greater sleep disruption (at least two hours of either delay in sleep onset or disruption in sleep maintenance per night most nights of the week for at least one month in the past year) than previous studies of insomnia (Ohayon, 2002). The high individual-level estimate we found for depression, finally, is consistent with much previous research (Donohue & Pincus, 2007; Gabilondo et al. 2009; Wang et al. 2008).

The rank-ordering of the individual-level VAS estimates was found to be quite similar in developing and developed countries. However, several exceptions were found. These should be investigated in future studies. Digestive conditions (stomach/intestine ulcer and irritable bowel disorder) were rated considerably more severe in developed than developing countries, possibly reflecting a different mix of cases that might explain the differences in estimated severity. The individual-level estimated severity of drug abuse, in comparison, was substantially higher in developing than developed countries. Differential willingness to admit drug problems might have been involved in this result, as reported prevalence of drug abuse was much lower in developing than developed countries, possibly indicating that the cases we learned of in developing countries were more severe than those in developed countries (Schmidt & Room, 1999).

Comparison of our individual-level condition severity estimates with estimates in an earlier WMH analysis of condition-specific role impairment (Ormel et al. 2008) finds that the conditions rated most severe in that earlier study were generally also rated among the most severe in the current investigation. However, a number of differences in relative ratings exist that could be attributed either to differences in the outcome (i.e., a global VAS score versus a measure of condition-specific role impairment) or to our previous analysis not adjusting for comorbidity.

Our results regarding societal-level associations are less innovative because, consistent with previous studies, we merely multiplied the prevalence estimates of the conditions by the individual-level estimates of condition severity to arrive at societal-level estimates of burden. As in previous studies that compared individual-level and societal-level estimates (Andlin-Sobocki et al. 2005; Saarni et al. 2007; Whiteford, 2000), the rank-ordering of conditions differs considerably between the two, with societal-level estimates influenced importantly by variation in prevalence and the conditions estimated to be most burdensome at the societal level dominated by high-prevalence conditions.

While our results argue clearly for the importance of considering comorbidity when estimating disease burden, the best way to do this is not obvious. The approach we took here has the advantage of considering comorbidities in their true distribution in the population rather than requiring hypothetical scenarios to be generated that might or might not adequately characterize the actual distribution of complex comorbidities in the population. However, methods also exist to allow the effects of individual conditions to be estimated using expert ratings of hypothetical patient scenarios that include information about complex profiles of comorbidity (Jasso, 2006; Saarni et al. 2007). Indeed, the actual distributions of comorbidity found in community surveys like the WMH surveys could be used to generate these vignettes so as to guarantee that they represent the distribution and range of patterns in the population. As many health policy researchers favor condition severity ratings made by experts rather than the ratings made by respondents in community surveys for a variety of other reasons (Insinga & Fryback, 2003; Marquie et al. 2003; Ormel et al. 2008; Schnadig et al. 2008), it might be that the best approach would be to build information about comorbidity into conventional expert rating scenarios. However, valuations of the sort presented here based on community samples also would seem to have value in representing the perceptions of actual people with real conditions in the population. It remains a challenge for the field to develop a way of integrating data of these different sorts.
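One way the suggestion of generating vignettes from observed comorbidity distributions might be operationalized is sketched below. This is an illustration under simplifying assumptions, not a procedure used in the present analysis: the respondent records are hypothetical, survey weights are ignored, and the function names are ours.

```python
import random
from collections import Counter

# Hypothetical respondent records: each set holds the conditions one
# respondent reported. These are placeholders, not WMH data.
respondents = [
    {"depression"},
    {"depression", "insomnia"},
    {"arthritis"},
    {"depression", "insomnia"},
    {"arthritis", "insomnia", "depression"},
    {"arthritis"},
]


def profile_distribution(records):
    """Empirical proportion of each distinct comorbidity profile."""
    counts = Counter(frozenset(r) for r in records)
    total = sum(counts.values())
    return {profile: n / total for profile, n in counts.items()}


def sample_vignettes(distribution, k, seed=0):
    """Draw k profiles with probability proportional to observed frequency."""
    rng = random.Random(seed)  # fixed seed so the draw is reproducible
    profiles = list(distribution)
    weights = [distribution[p] for p in profiles]
    return rng.choices(profiles, weights=weights, k=k)


distribution = profile_distribution(respondents)
vignettes = sample_vignettes(distribution, k=5)

for vignette in vignettes:
    print(sorted(vignette))
```

Because profiles are drawn in proportion to how often they occur, common comorbidity patterns appear in more vignettes than rare ones. A real application would presumably also need to apply survey weights and to oversample rare profiles of particular interest.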


The analysis for this paper is carried out in conjunction with the World Health Organization World Mental Health (WMH) Survey Initiative. We thank the WMH staff for assistance with instrumentation, fieldwork, and data analysis. These activities were supported by the United States National Institute of Mental Health (R01MH070884), the Mental Health Burden Study: Contract number HHSN271200700030C, the John D. and Catherine T. MacArthur Foundation, the Pfizer Foundation, the US Public Health Service (R13-MH066849, R01-MH069864, and R01 DA016558), the Fogarty International Center (FIRCA R03-TW006481), the Pan American Health Organization, the Eli Lilly & Company Foundation, Ortho-McNeil Pharmaceutical, Inc., GlaxoSmithKline, Bristol-Myers Squibb, and Shire. A complete list of WMH publications can be found at The Chinese World Mental Health Survey Initiative is supported by the Pfizer Foundation. The Colombian National Study of Mental Health (NSMH) is supported by the Ministry of Social Protection. The ESEMeD project is funded by the European Commission (Contracts QLG5-1999-01042; SANCO 2004123), the Piedmont Region (Italy), Fondo de Investigación Sanitaria, Instituto de Salud Carlos III, Spain (FIS 00/0028), Ministerio de Ciencia y Tecnología, Spain (SAF 2000-158-CE), Departament de Salut, Generalitat de Catalunya, Spain, Instituto de Salud Carlos III (CIBER CB06/02/0046, RETICS RD06/0011 REM-TAP), and other local agencies and by an unrestricted educational grant from GlaxoSmithKline. The Israel National Health Survey is funded by the Ministry of Health with support from the Israel National Institute for Health Policy and Health Services Research and the National Insurance Institute of Israel. The World Mental Health Japan (WMHJ) Survey is supported by the Grant for Research on Psychiatric and Neurological Diseases and Mental Health (H13-SHOGAI-023, H14-TOKUBETSU-026, H16-KOKORO-013) from the Japan Ministry of Health, Labour and Welfare. 
The Lebanese National Mental Health Survey (LEBANON) is supported by the Lebanese Ministry of Public Health, the WHO (Lebanon), Fogarty International, Act for Lebanon, anonymous private donations to IDRAAC, Lebanon, and unrestricted grants from Janssen Cilag, Eli Lilly, GlaxoSmithKline, Roche, and Novartis. The Mexican National Comorbidity Survey (MNCS) is supported by The National Institute of Psychiatry Ramon de la Fuente (INPRFMDIES 4280) and by the National Council on Science and Technology (CONACyT-G30544- H), with supplemental support from the PanAmerican Health Organization (PAHO). The Nigerian Survey of Mental Health and Wellbeing (NSMHW) is supported by the WHO (Geneva), the WHO (Nigeria), and the Federal Ministry of Health, Abuja, Nigeria. The Ukraine Comorbid Mental Disorders during Periods of Social Disruption (CMDPSD) study is funded by the US National Institute of Mental Health (RO1-MH61905). The US National Comorbidity Survey Replication (NCS-R) is supported by the National Institute of Mental Health (NIMH; U01-MH60220) with supplemental support from the National Institute of Drug Abuse (NIDA), the Substance Abuse and Mental Health Services Administration (SAMHSA), the Robert Wood Johnson Foundation (RWJF; Grant 044708), and the John W. Alden Trust.


Declaration of Interest: Dr. Kessler has been a consultant for GlaxoSmithKline Inc., Kaiser Permanente, Pfizer Inc., Sanofi-Aventis, Shire Pharmaceuticals, and Wyeth-Ayerst; has served on advisory boards for Eli Lilly & Company and Wyeth-Ayerst; and has had research support for his epidemiological studies from Bristol-Myers Squibb, Eli Lilly & Company, GlaxoSmithKline, Johnson & Johnson Pharmaceuticals, Ortho-McNeil Pharmaceuticals Inc., Pfizer Inc., and Sanofi-Aventis. The remaining authors declare that no competing interests exist.


  • Andlin-Sobocki P, Jonsson B, Wittchen HU, Olesen J. Cost of disorders of the brain in Europe. European Journal of Neurology. 2005;12(Suppl 1):1–27. [PubMed]
  • Baker M, Stabile M, Deri C. What do Self-Reported, Objective, Measures of Health Measure? Journal of Human Resources. 2004;39:1067–1093.
  • Breiman L. Random forests. Machine Learning. 2001;45:32.
  • Breiman L. Statistical modeling: the two cultures. Statistical Science. 2001;16:199–215.
  • Breiman L, Friedman JH, Olshen RA, Stone CJ. Classification and Regression Trees. Chapman & Hall; New York, NY: 1984.
  • Buntin MB, Zaslavsky AM. Too much ado about two-part models and transformation? Comparing methods of modeling Medicare expenditures. Journal of Health Economics. 2004;23:525–542. [PubMed]
  • Centers for Disease Control and Prevention. Health, United States, 2004. National Center for Health Statistics; Atlanta, GA: 2004.
  • Craig BM, Busschbach JJ, Salomon JA. Modeling ranking, time trade-off, and visual analog scale values for EQ-5D health states: a review and comparison of methods. Medical Care. 2009;47:634–641. [PMC free article] [PubMed]
  • De Wit GA, Busschbach JJ, De Charro FT. Sensitivity and perspective in the valuation of health status: whose values count? Health Economics. 2000;9:109–126. [PubMed]
  • Donohue JM, Pincus HA. Reducing the societal burden of depression: a review of economic costs, quality of care and effects of treatment. Pharmacoeconomics. 2007;25:7–24. [PubMed]
  • Duan N, Manning WG, Morris CN, Newhouse JP. Choosing between the sample-selection model and the multi-part model. Journal of Business and Economic Statistics. 1984;2:289.
  • Fortin M, Soubhi H, Hudon C, Bayliss EA, van den Akker M. Multimorbidity’s many challenges. British Medical Journal. 2007;334:1016–1017. [PMC free article] [PubMed]
  • Friedman JH. Multivariate adaptive regression splines (with discussion) Annals of Statistics. 1991;19:1.
  • Gabilondo A, Rojas-Farreras S, Vilagut G, Haro JM, Fernandez A, Pinto-Meza A, Alonso J. Epidemiology of major depressive episode in a southern European country: Results from the ESEMeD-Spain project. Journal of Affective Disorders 2009 [PubMed]
  • Gudex C, Dolan P, Kind P, Williams A. Health state valuations from the general public using the visual analogue scale. Quality of Life Research. 1996;5:521–531. [PubMed]
  • Haro JM, Arbabzadeh-Bouchez S, Brugha TS, de Girolamo G, Guyer ME, Jin R, Lepine JP, Mazzi F, Reneses B, Vilagut G, Sampson NA, Kessler RC. Concordance of the Composite International Diagnostic Interview Version 3.0 (CIDI 3.0) with standardized clinical assessments in the WHO World Mental Health surveys. International Journal of Methods in Psychiatric Research. 2006;15:167–180. [PubMed]
  • Heeringa SG, Wells JE, Hubbard F, Mneimneh Z, Chiu WT, Sampson N. Sample Designs and Sampling Procedures. In: Kessler RC, Üstün TB, editors. The WHO World Mental Health Surveys: Global Perspectives on the Epidemiology of Mental Disorders. Cambridge University Press; New York, NY: 2008.
  • Hosmer DW, Lemeshow S. Applied Logistic Regression. 2. Wiley & Sons; New York, NY: 2001.
  • Insinga RP, Fryback DG. Understanding differences between self-ratings and population ratings for health in the EuroQOL. Quality of Life Research. 2003;12:611–619. [PubMed]
  • Jacoby A, Baker GA. Quality-of-life trajectories in epilepsy: a review of the literature. Epilepsy & Behavior. 2008;12:557–571. [PubMed]
  • Jasso G. Factorial survey methods for studying beliefs and judgments. Sociological Methods and Research. 2006;34:334–423.
  • Kessler RC, Üstün TB. The World Mental Health (WMH) Survey Initiative Version of the World Health Organization (WHO) Composite International Diagnostic Interview (CIDI) International Journal of Methods in Psychiatric Research. 2004;13:93–121. [PubMed]
  • Kessler RC, Üstün TB, editors. The WHO World Mental Health Surveys: Global Perspectives on the Epidemiology of Mental Disorders. Cambridge University Press; New York, NY: 2008.
  • Knight M, Stewart-Brown S, Fletcher L. Estimating health needs: the impact of a checklist of conditions and quality of life measurement on health information derived from community surveys. Journal of Public Health in Medicine. 2001;23:179–186. [PubMed]
  • Krabbe PF. Thurstone scaling as a measurement method to quantify subjective health outcomes. Medical Care. 2008;46:357–365. [PubMed]
  • Krabbe PF, Salomon JA, Murray CJ. Quantification of health states with rank-based nonmetric multidimensional scaling. Medical Decision Making. 2007;27:395–405. [PubMed]
  • Krabbe PF, Stalmeier PF, Lamers LM, Busschbach JJ. Testing the interval-level measurement property of multi-item visual analogue scales. Quality of Life Research. 2006;15:1651–1661. [PubMed]
  • Kraemer HC, Kazdin AE, Offord DR, Kessler RC, Jensen PS, Kupfer DJ. Coming to terms with the terms of risk. Archives of General Psychiatry. 1997;54:337–343. [PubMed]
  • Lopez AD, Mathers CD. Inequalities in health status: findings from the 2001 Global Burden of Disease study. In: Matlin S, editor. The Global Forum Update on Research for Health. Vol. 4. Pro-Brook Publishing Limited; London: 2007. pp. 163–175.
  • Maddigan SL, Feeny DH, Johnson JA. Health-related quality of life deficits associated with diabetes and comorbidities in a Canadian National Population Health Survey. Quality of Life Research. 2005;14:1311–1320. [PubMed]
  • Manning SC. Configuring compliance: a professional fit. Journal of American Health Information Management Association. 1998;69:36–38. [PubMed]
  • Manning WG, Mullahy J. Estimating log models: to transform or not to transform? Journal of Health Economics. 2001;20:461–494. [PubMed]
  • Marquie L, Raufaste E, Lauque D, Marine C, Ecoiffier M, Sorum P. Pain rating by patients and physicians: evidence of systematic pain miscalibration. Pain. 2003;102:289–296. [PubMed]
  • McCullagh P, Nelder JA. Generalized Linear Models, 2nd Edition. Chapman & Hall; London: 1989.
  • Moussavi S, Chatterji S, Verdes E, Tandon A, Patel V, Ustun B. Depression, chronic diseases, and decrements in health: results from the World Health Surveys. Lancet. 2007;370:851–858. [PubMed]
  • Mullahy J. Much ado about two: reconsidering retransformation and the two-part model in health econometrics. Journal of Health Economics. 1998;17:247–281. [PubMed]
  • Murray CJ, Lopez AD. Evidence-based health policy--lessons from the Global Burden of Disease Study. Science. 1996;274:740–743. [PubMed]
  • Murray CJL, Lopez AD, Mathers CD, Stein C. The Global Burden of Disease 2000 Project: Aims, Methods and Data Sources. World Health Organization; Geneva: 2001.
  • Ohayon MM. Epidemiology of insomnia: what we know and what we still need to learn. Sleep Medicine Reviews. 2002;6:97–111. [PubMed]
  • Ormel J, Petukhova M, Chatterji S, Aguilar-Gaxiola S, Alonso J, Angermeyer MC, Bromet EJ, Burger H, Demyttenaere K, de Girolamo G, Haro JM, Hwang I, Karam E, Kawakami N, Lepine JP, Medina-Mora ME, Posada-Villa J, Sampson N, Scott K, Ustun TB, Von Korff M, Williams DR, Zhang M, Kessler RC. Disability and treatment of specific mental and physical disorders across the world. British Journal of Psychiatry. 2008;192:368–375. [PMC free article] [PubMed]
  • Parkin D, Devlin N. Is there a case for using visual analogue scale valuations in cost-utility analysis? Health Economics. 2006;15:653–664. [PubMed]
  • Pennell B-E, Mneimneh Z, Bowers A, Chardoul S, Wells JE, Viana MC, Dinkelmann K, Gebler N, Florescu S, He Y, Huang Y, Tomov T, Vilagut G. Implementation of the World Mental Health Surveys. In: Kessler RC, Üstün TB, editors. The WHO World Mental Health Surveys: Global Perspectives on the Epidemiology of Mental Disorders. Cambridge University Press; New York, NY: 2008.
  • Research Triangle Institute. SUDAAN: Professional Software for Survey Data Analysis. Research Triangle Institute; Research Triangle Park, NC: 2002.
  • Roth T, Jaeger S, Jin R, Kalsekar A, Stang PE, Kessler RC. Sleep problems, comorbid mental disorders, and role functioning in the national comorbidity survey replication. Biological Psychiatry. 2006;60:1364–1371. [PMC free article] [PubMed]
  • Saarni SI, Suvisaari J, Sintonen H, Pirkola S, Koskinen S, Aromaa A, Lonnqvist J. Impact of psychiatric disorders on health-related quality of life: general population survey. British Journal of Psychiatry. 2007;190:326–332. [PubMed]
  • Salomon JA, Murray CJ. A multi-method approach to measuring health-state valuations. Health Economics. 2004;13:281–290. [PubMed]
  • Salomon JA, Tandon A, Murray CJ. Comparability of self rated health: cross sectional multi-country survey using anchoring vignettes. British Medical Journal. 2004;328:258. [PMC free article] [PubMed]
  • SAS Institute Inc. SAS/STAT® Software, Version 9.1 for Windows. SAS Institute Inc; Cary, NC: 2002.
  • Schmidt L, Room R. Cross-cultural applicability in international classifications and research on alcohol dependence. Journal of Studies on Alcohol. 1999;60:448–462. [PubMed]
  • Schnadig ID, Fromme EK, Loprinzi CL, Sloan JA, Mori M, Li H, Beer TM. Patient-physician disagreement regarding performance status is associated with worse survivorship in patients with advanced cancer. Cancer. 2008;113:2205–2214. [PMC free article] [PubMed]
  • Schoenborn CA, Adams PF, Schiller JS. Summary health statistics for the U.S. population: National Health Interview Survey, 2000. Vital Health and Statistics. 2003;10:1–83. [PubMed]
  • Singer MA, Hopman WM, MacKenzie TA. Physical functioning and mental health in patients with chronic medical conditions. Quality of Life Research. 1999;8:687–691. [PubMed]
  • Stiggelbout AM, de Vogel-Voogt E. Health state utilities: a framework for studying the gap between the imagined and the real. Value in Health. 2008;11:76–87. [PubMed]
  • Tandon A, Murray CJL, Salomon JA, King G. Global Programme on Evidence for Health Policy Discussion. World Health Organization; Geneva: 2002. Statistical models for enhancing cross-population comparability, paper No. 42.
  • Taubman SL, Robins JM, Mittleman MA, Hernan MA. Intervening on risk factors for coronary heart disease: an application of the parametric g-formula. International Journal of Epidemiology. 2009;38:1599–1611. [PMC free article] [PubMed]
  • Verbrugge LM, Lepkowski JM, Imanaka Y. Comorbidity and its impact on disability. Milbank Quarterly. 1989;67:450–484. [PubMed]
  • Wang PS, Simon GE, Kessler RC. Making the business case for enhanced depression care: the National Institute of Mental Health-Harvard Work Outcomes Research and Cost-effectiveness Study. Journal of Occupational and Environmental Medicine. 2008;50:468–475. [PubMed]
  • Whiteford H. Unmet need: a challenge for governments. In: Andrews G, Henderson S, editors. Unmet Need in Psychiatry: Problems, Resources, Responses. Cambridge University Press; Cambridge, UK: 2000. pp. 8–10.
  • Wolter KM. Introduction to Variance Estimation. Springer-Verlag; New York, NY: 1985.
  • World Health Organization. The Global Burden of Disease: 2004 Update. World Health Organization; Geneva: 2004.
  • Young JG, Hernan MA, Picciotto S, Robins JM. Relation between three classes of structural models for the effect of a time-varying exposure on survival. Lifetime Data Analysis. 2010;16:71–84. [PubMed]