Both the 17-item Hamilton Rating Scale for Depression (HRSD17) and 30-item Inventory of Depressive Symptomatology – Clinician-rated (IDS-C30) contain a subscale that assesses anxious symptoms. We used classical test theory and item response theory methods to assess and compare the psychometric properties of the two anxiety subscales (HRSDANX and IDS-CANX) in a large sample (N = 3453) of outpatients with non-psychotic major depressive disorder in the Sequenced Treatment Alternatives to Relieve Depression (STAR*D) study. Approximately 48% of evaluable participants had at least one concurrent anxiety disorder by the self-report Psychiatric Diagnostic Screening Questionnaire (PDSQ). The HRSDANX and IDS-CANX were highly correlated (r = 0.75) and both had moderate internal consistency given their limited number of items (HRSDANX Cronbach’s alpha = 0.48; IDS-CANX Cronbach’s alpha = 0.58). The optimal threshold for ascribing the presence/absence of anxious features was found at a total score of eight or nine for the HRSDANX and seven or eight for the IDS-CANX. It would seem beneficial to delete item 17 (loss of insight) from the HRSDANX as it negatively correlated with the scale’s total score. Both the HRSDANX and IDS-CANX subscales have acceptable psychometric properties and can be used to identify anxious features for clinical or research purposes.
depression; anxiety; rating scales; STAR*D; measurement-based care
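The internal consistency figures quoted above are Cronbach's alpha values. As a minimal sketch of how such a coefficient is computed from item-level scores, the Python below uses the standard formula; the function name `cronbach_alpha` and the row-per-respondent input layout are illustrative assumptions and are not taken from the STAR*D analysis.

```python
from statistics import variance


def cronbach_alpha(rows):
    """Cronbach's alpha for item scores.

    rows: one list of item scores per respondent (hypothetical layout).
    """
    k = len(rows[0])                       # number of items in the subscale
    # Sample variance of each item across respondents.
    item_variances = [variance(col) for col in zip(*rows)]
    # Sample variance of the total (summed) score.
    total_variance = variance([sum(r) for r in rows])
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)
```

Because alpha depends on the number of items `k`, short subscales tend to yield lower values even when the items are reasonably intercorrelated, which is the point the abstract makes about "limited number of items".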
We investigated measurement non-invariance of DSM-IV narcissistic personality disorder (NPD) criteria across age and sex in a population-based cohort sample of 2794 Norwegian twins. Age had a statistically significant effect on the factor mean for NPD. Sex had a statistically significant effect on the factor mean and variance. Controlling for these factor level effects, item-level analysis indicated that the criteria were functioning differently across age and sex. After correcting for measurement differences at the item level, the latent factor mean effect for age was no longer statistically significant. The mean difference for sex remained statistically significant after correcting for item threshold effects. The results indicate that DSM-IV NPD criteria perform differently in males and females and across age. Differences in diagnostic rates across groups may not be valid without correcting for measurement non-invariance.
narcissistic personality disorder; twins; population-based sample; item response theory; measurement non-invariance
Effective screening for emotional and behavioral disorders among youth requires brief screening scales with good validity to identify youth requiring further evaluation and to estimate prevalence of target disorders in populations of interest such as schools or neighborhoods. This paper examines the psychometric properties of a very short (six-item) screening scale, the K6, to assess serious emotional disturbance (SED) among youth. The K6, which is made up of symptoms of depression and anxiety, has been shown in previous research to be a strong predictor of serious mental illness (SMI) in adults, but no information is available on the ability of the scale to screen for SED among youth. The current report examines the K6 as a screen for SED in a national survey of US adolescents, the National Comorbidity Survey Replication Adolescent Supplement (NCS-A). The K6 is shown to provide fairly good prediction of SED [area under curve (AUC) = 0.74] that is somewhat higher for internalizing (AUC = 0.80) than behavior (AUC = 0.75) disorders. Based on this result, we augmented the K6 with questions about symptoms of behavior disorders. This improved prediction of SED (from AUC = 0.74 to AUC = 0.83) as well as of SED associated with pure behavior disorders (from AUC = 0.53 to AUC = 0.78). These results show that although the symptoms of depression and anxiety in the K6 are sufficient to detect SMI among adults, high rates of behavior disorders among adolescents require indicators of behavior disorders to be added to the K6 to screen adequately for adolescent SED.
adolescents; screening; serious emotional disturbance
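The AUC values reported above can be read as the probability that a randomly chosen case scores higher on the screen than a randomly chosen non-case, with ties counted as one half. A minimal sketch of that rank-based definition follows; the function name `auc` and the 0/1 label coding are assumptions made for illustration, not the NCS-A analysis code.

```python
def auc(scores, labels):
    """Area under the ROC curve via pairwise comparison (Mann-Whitney form).

    scores: screening scale totals; labels: 1 = case, 0 = non-case.
    """
    cases = [s for s, y in zip(scores, labels) if y == 1]
    non_cases = [s for s, y in zip(scores, labels) if y == 0]
    # Count case/non-case pairs where the case scores higher; ties count 0.5.
    wins = sum((c > n) + 0.5 * (c == n) for c in cases for n in non_cases)
    return wins / (len(cases) * len(non_cases))
```

On this scale 0.5 corresponds to chance-level discrimination and 1.0 to perfect separation of cases from non-cases.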
The DSM-IV and ICD-10 are both operational diagnostic systems that classify known psychological disorders according to the number of criteria symptoms. Certain discrepancies between the criteria exist and may lead to some inconsistencies in psychiatric research. The purpose of this study was to investigate these differences in the assessment of depression with item response theory (IRT) analyses. The World Mental Health-Japan (WMHJ) Survey is an epidemiological survey of the general population in Japan. We analyzed data from the WMHJ completed by 353 respondents who had either depressive mood or diminished interest. A 2-parameter logistic model was used to evaluate the characteristics of the symptoms of the DSM-IV and ICD-10. IRT analyses revealed that the symptoms about psychomotor activity, worthlessness and self-reproach were more informative and suggestive of greater severity, while the symptoms about dietary habits were less informative. IRT analyses also revealed that the ICD-10 seems more sensitive to the mild range of the depression spectrum compared to the DSM-IV. Although there were some variations in severity among respondents, most of the respondents diagnosed with a severe or moderate depressive episode according to the ICD-10 were also diagnosed with a major depressive episode according to the DSM-IV.
Depression; World Mental Health Japan Survey; DSM-IV; ICD-10; Item Response Theory
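For reference, the 2-parameter logistic model named above has the following standard form, written here in conventional IRT notation rather than notation taken from the study: $\theta_j$ is the latent severity of respondent $j$, $a_i$ the discrimination of symptom $i$, and $b_i$ its severity (location) parameter.

```latex
P(X_{ij} = 1 \mid \theta_j) = \frac{1}{1 + \exp\left[-a_i(\theta_j - b_i)\right]},
\qquad
I_i(\theta) = a_i^{2}\, P_i(\theta)\,\bigl[1 - P_i(\theta)\bigr]
```

Under this model a symptom is "more informative" when $a_i$ is large, and "suggestive of greater severity" when $b_i$ is large, since the item information $I_i(\theta)$ peaks at $\theta = b_i$.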
While item response theory (IRT) research shows a latent severity trait underlying response patterns of substance abuse and dependence symptoms, little is known about IRT-based severity estimates in relation to clinically relevant measures. In response to increased prevalences of marijuana-related treatment admissions, an elevated level of marijuana potency, and the debate on medical marijuana use, we applied dimensional approaches to understand IRT-based severity estimates for marijuana use disorders (MUDs) and their correlates while simultaneously considering gender- and race/ethnicity-related differential item functioning (DIF). Using adult data from the 2008 National Survey on Drug Use and Health (N=37,897), DSM-IV criteria for MUDs among past-year marijuana users were examined by IRT, logistic regression, and multiple indicators–multiple causes (MIMIC) approaches. Among 6,917 marijuana users, 15% met criteria for a MUD; another 24% exhibited subthreshold dependence. Abuse criteria were highly correlated with dependence criteria (correlation=0.90), indicating unidimensionality; item information curves revealed redundancy in multiple criteria. MIMIC analyses showed that MUD criteria were positively associated with weekly marijuana use, early marijuana use, other substance use disorders, substance abuse treatment, and serious psychological distress. African Americans and Hispanics showed higher levels of MUDs than whites, even after adjusting for race/ethnicity-related DIF. The redundancy in multiple criteria suggests an opportunity to improve efficiency in measuring symptom-level manifestations by removing low-informative criteria. Elevated rates of MUDs among African Americans and Hispanics require research to elucidate risk factors and improve assessments of MUDs for different racial/ethnic groups.
Differential item functioning; item response theory; multiple indicators–multiple causes model; marijuana use disorders
The widely-used Kessler K6 nonspecific distress scale screens for severe mental illness defined as a K6 score ≥ 13, estimated to afflict about 6% of US adults. The K6, as currently used, fails to capture individuals struggling with more moderate mental distress that nonetheless warrants mental health intervention. The current study determined a cutoff criterion on the K6 scale indicative of moderate mental distress based on mental health treatment need and assessed the validity of this criterion by comparing participants with identified moderate and severe mental distress on relevant clinical, impairment, and risk behavior measures. Data were analyzed from 50,880 adult participants in the 2007 California Health Interview Survey. Receiver operating characteristic curve analysis identified K6≥5 as the optimal lower threshold cut-point indicative of moderate mental distress. Based on the K6, 8.6% of California adults had serious mental distress and another 27.9% had moderate mental distress. Correlates of moderate and serious mental distress were similar. Respondents with moderate mental distress had rates of mental health care utilization, impairment, substance use and other risks lower than respondents with serious mental distress and greater than respondents with none/low mental distress. The findings support expanded use and analysis of the K6 scale in quantifying and examining correlates of mental distress at a moderate, yet still clinically relevant, level.
mental distress; mental health; psychiatric scale; Kessler
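The abstract above does not state which optimality criterion was applied to the ROC curve. As a hedged illustration only, the sketch below selects a cut-point by maximizing Youden's J (sensitivity + specificity - 1), one common choice; the function name `best_cutpoint` and the rule "score >= cut-point means screen-positive" are assumptions.

```python
def best_cutpoint(scores, labels):
    """Return (cutpoint, J) maximizing Youden's J over observed score values.

    scores: scale totals; labels: 1 = criterion met, 0 = not met.
    Assumes both classes are present in labels.
    """
    best = None
    for c in sorted(set(scores)):
        tp = sum(1 for s, y in zip(scores, labels) if s >= c and y == 1)
        fn = sum(1 for s, y in zip(scores, labels) if s < c and y == 1)
        tn = sum(1 for s, y in zip(scores, labels) if s < c and y == 0)
        fp = sum(1 for s, y in zip(scores, labels) if s >= c and y == 0)
        j = tp / (tp + fn) + tn / (tn + fp) - 1
        # Keep the lowest cut-point among ties by using a strict comparison.
        if best is None or j > best[1]:
            best = (c, j)
    return best
```

Other criteria, such as the point closest to the top-left corner of the ROC plot or a cost-weighted rule, can select a different threshold from the same data.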
Data are reported on the background and performance of the K6 screening scale for serious mental illness (SMI) in the World Health Organization (WHO) World Mental Health (WMH) surveys. The K6 is a 6-item scale developed to provide a brief valid screen for DSM-IV SMI based on the criteria in the US ADAMHA Reorganization Act. Although methodological studies have documented good K6 validity in a number of countries, optimal scoring rules have never been proposed. Such rules are presented here based on analysis of K6 data in nationally or regionally representative WMH surveys in 14 countries (combined n = 41,770 respondents). Twelve-month prevalence of DSM-IV SMI was assessed with the fully-structured WHO Composite International Diagnostic Interview. Nested logistic regression analysis was used to generate estimates of the predicted probability of SMI for each respondent from K6 scores taking into consideration the possibility of variable concordance as a function of respondent age, gender, education, and country. Concordance, assessed by calculating the area under the receiver operating characteristic curve (AUC), was generally substantial (Median .83; Range .76-.89; Inter-quartile range .81-.85). Based on this result, optimal scaling rules are presented for use by investigators working with the K6 scale in the countries studied.
K6 screening scale; psychiatric epidemiology; serious mental illness (SMI)
The 54-item Social Adjustment Scale – Self-report (SAS-SR) is a measure of social functioning used in research studies and clinical practice. Two shortened versions were recently developed: the 24-item SAS-SR: Short and the 14-item SAS-SR: Screener. We briefly describe the development of the shortened scales and then assess their reliability and validity in comparison to the full SAS-SR in new analyses from two separate samples of convenience from a family study and from a primary care clinic.
Compared to the full SAS-SR, the shortened scales performed well, exhibiting high correlations with full SAS-SR scores (r values between 0.81 and 0.95); significant correlations with health-related quality of life as measured by the Short Form 36 Health Survey; the ability to distinguish subjects with major depression versus other psychiatric disorders versus no mental disorders; and sensitivity to change in clinical status as measured longitudinally with the Symptom Checklist-90 and Global Assessment Scale.
The SAS-SR: Short and SAS-SR: Screener retained the areas assessed by the full SAS-SR with fewer items in each area, and appear to be promising replacements for the full scale when a shorter administration time is desired and detailed information on performance in different areas is not required. Further work is needed to test the validity of the shortened measures.
social adjustment scale–self-report (SAS-SR); screening; reliability; validity
Evaluations of assessment instruments using classical test theory typically rely on indices of internal consistency, test-retest reliability, and construct validity. However, the use of models from item response theory (IRT) allows comparison of instruments (and items) in terms of the information they provide and where they provide it along the continuum of severity of the construct being assessed. Such results help to identify the measures most appropriate for specific clinical and research contexts. The present study examined the functioning of the Beck Depression Inventory (BDI), the Center for Epidemiologic Studies – Depression Scale (CES-D), and the nine primary symptoms from the depression module of the Schedule for Affective Disorders and Schizophrenia – Children (K-SADS) using IRT methods. A large sample of adolescents (n = 1,709) completed the BDI, CES-D, and K-SADS. IRT calibration analyses demonstrated that the BDI and CES-D performed well in similar ranges of depressive severity (approximately −1 to +3 SDs), although the BDI provided more information at higher severity levels and the CES-D at lower severity levels. The K-SADS depression items, which are dichotomous and focused on clinical disorder, provided the least information, and that information was restricted to the narrowest range (approximately +1 to +3 SDs). These findings are consistent with the past rationale for using the BDI in clinical samples and the CES-D in epidemiological studies. The results for the K-SADS suggest that interview measures may benefit from increasing the number of items and/or response options to collect more psychometric information.
Research diagnostic interviews need to discriminate between closely related disorders in order to allow comorbidity among mental disorders to be studied reliably. Yet conventional studies of diagnostic validity generally focus on single disorders and do not examine discriminant validity. The current study examines the validity of fully-structured diagnoses of closely-related distress disorders (generalized anxiety disorder, post-traumatic stress disorder, major depressive episode, and dysthymic disorder) in the lay-administered Composite International Diagnostic Interview Version 3.0 (CIDI) with independent clinical diagnoses based on the Schedule for Affective Disorders and Schizophrenia for School-Age Children (K-SADS) in the US National Comorbidity Survey Replication Adolescent Supplement (NCS-A). The NCS-A is a national survey of DSM-IV mental disorders among 10,148 adolescents. A probability subsample of 347 of these adolescents and their parents were administered blinded follow-up K-SADS interviews. Good concordance (AUC; area under the receiver operating characteristic curve) was found between diagnoses based on the CIDI and the K-SADS for generalized anxiety disorder (AUC = .78), post-traumatic stress disorder (AUC = .79), and major depressive episode/dysthymic disorder (AUC = .86). Further, the CIDI was able to effectively discriminate among different types of distress disorders in the sub-sample of respondents with any distress disorder.
Major Depressive Episode; Generalized Anxiety Disorder; Posttraumatic Stress Disorder; WHO Composite International Diagnostic Interview (CIDI); US National Comorbidity Survey Replication Adolescent Supplement (NCS-A)
Accurate information concerning alcohol consumption level and patterns is vital to formulating public health policy. The objective of this paper is to critically assess the extent to which survey design, response rate and alcohol consumption coverage obtained in random digit dialing, telephone-based surveys affect conclusions about alcohol consumption and its patterns in the general population. Our analysis is based on the Canadian Alcohol and Drug Use Monitoring Survey (CADUMS) 2008, a national survey intended to be representative of the general population. The conclusions of this paper are as follows: 1) ignoring people who are homeless, institutionalized and/or do not have a home phone may lead to an underestimation of the prevalence of alcohol consumption and related problems; 2) weighting of observations to population demographics may lead to an increase in the design effect, does not necessarily address the underlying selection bias, and may lead to overly influential observations; and 3) the accurate characterization of alcohol consumption patterns obtained by triangulating the data with the adult per capita consumption estimate is essential for comparative analyses and intervention planning, especially when the alcohol coverage rate is low, as in the CADUMS (34%).
alcohol; average volume of consumption; patterns of drinking; adult per capita consumption; survey; random digit dialing; bias
Information about the prevalence of serious mental illness (SMI) among adults or serious emotional disturbance (SED) among youth in small domains such as counties, states, or schools is valuable for mental health policy planning purposes, but prohibitively expensive to collect with semi-structured surveys. Commonly used synthetic estimation methods yield imprecise estimates. An improved method is described here that combines information about socio-demographic covariates with screening scale scores obtained from a sample of individuals, using a prediction equation derived from a Bayesian multilevel regression model with bivariate outcomes fitted to a larger population survey. This method is illustrated using K6 screening scale scores to predict school-level prevalence of SED in the sample of 282 schools that participated in the National Comorbidity Survey Replication Adolescent Supplement. Respondents completed a diagnostic interview that was used to define DSM-IV SED. SED prevalence varied significantly across schools and was strongly correlated with aggregate K6 scores (ρ = .70). Calculations suggest that near-maximum precision of school-level SED prevalence estimates could be attained with K6 samples of 200 students per school. This modeling approach holds great promise for generating accurate estimates of SMI/SED in small-area planning units based on K6 scores collected in ongoing health tracking surveys.
K6 screening scale; small-area estimation; psychiatric epidemiology; serious emotional disturbance (SED)
Given the enormous influence of classification on the major clinical, research, and administrative activities of mental health professionals, understanding the true number and nature of disorders and the reasons for their comorbidity is an important public health priority. However, while studies of latent structure have yielded valuable information about disorder boundaries, their reliance on nonrepresentative samples and failure to evaluate the practical implications of structural findings has limited their ability to effect nosological change. Conversely, community epidemiology studies, which inform classification by assessing the implications of diagnostic criteria in representative samples, have been limited by their focus on mental disorders as they are currently conceptualized by the field rather than on correlates and consequences of these disorders as they actually exist in nature. I consider the potential value of integrating systematically the methods of structural research with the methods of epidemiological research, exploring five ways in which these largely independent traditions may profitably be combined to inform the next classifications of mental disorders. By capitalizing on the complementary strengths of structural and epidemiological research, an integrated approach has significant potential to advance understanding of the nature of psychopathology and improve the validity and utility of its diagnosis.
classification; continuity; comorbidity; epidemiology; latent structure
Validity of the adolescent version of the World Health Organization Composite International Diagnostic Interview (CIDI) Version 3.0, a fully-structured research diagnostic interview designed to be used by trained lay interviewers, is assessed in comparison to independent clinical diagnoses based on the Schedule for Affective Disorders and Schizophrenia for School-Age Children (K-SADS). This assessment is carried out in the clinical reappraisal sub-sample (n = 347) of the US National Comorbidity Survey Adolescent Supplement (NCS-A), a large (n = 10,148) community epidemiological survey of the prevalence and correlates of adolescent mental disorders in the US. The diagnoses considered are panic disorder and phobic disorders (social phobia, specific phobia, agoraphobia). CIDI diagnoses are found to have good concordance with K-SADS diagnoses (AUC = .81–.94), although the CIDI diagnoses are consistently somewhat higher than the K-SADS diagnoses. Data are also presented on criterion-level concordance in an effort to pinpoint CIDI question series that might be improved in future modifications of the instrument. Finally, data are presented on the factor structure of the fears associated with social phobia, the only disorder in this series where substantial controversy exists about disorder subtypes.
Panic Disorder; Specific Phobia; Social Phobia; Agoraphobia; WHO Composite International Diagnostic Interview (CIDI); US National Comorbidity Survey Replication Adolescent Supplement (NCS-A)
With emergence of new technologies, there has been an explosion of basic and clinical research on the affective and cognitive neuroscience of face processing and emotion perception. Adult emotional face stimuli are commonly used in these studies. For developmental research, there is a need for a validated set of child emotional faces. This paper describes the development of the NIMH Child Emotional Faces Picture Set (NIMH-ChEFS), a relatively large stimulus set with high quality, color images of the emotional faces of children. The set includes 482 photos of fearful, angry, happy, sad and neutral child faces with two gaze conditions: direct and averted gaze. In this paper we describe the development of the NIMH-ChEFS and data on the set’s validity based on ratings by 20 healthy adult raters. Agreement between the a priori emotion designation and the raters’ labels was high and comparable with values reported for commonly used adult picture sets. Intensity, representativeness, and composite “goodness” ratings are also presented to guide researchers in their choice of specific stimuli for their studies. These data should give researchers confidence in the NIMH-ChEFS’s validity for use in affective and social neuroscience research.
face processing; emotion perception; face stimuli sets; developmental psychopathology; methodology
The metric of disability-adjusted life years (DALYs) has become the global standard for measuring burden of disease. DALYs comprise years of life lost due to premature mortality and years of healthy life lost due to living with disability. In order to calculate the second part of the DALY equation, disease-specific disability weights have to be established, i.e., measures of the decline in health associated with these disease states, which vary between 0 for perfect health and 1 for death. Although these disability weights are key for estimating DALYs, there have not been many comprehensive studies with empirical determinations of them. This article describes a systematic review on the state of the art with respect to empirically determining disability weights. Based on this review, a multi-method approach is outlined, which has also been implemented in a U.S. study to measure burden of disease. This approach involves the use of psychometric methodology as well as economic trade-off methods for determining the value of health states. It is conceptualized as a disaggregated approach, where the disability weight of any health state can be calculated if the attributes of this health state are known. The U.S. study received the collaboration of experts from more than 20 institutes of the National Institutes of Health and of the Centers for Disease Control and Prevention. First results will be available by the end of this year.
disability-adjusted life years (DALYs); burden of disease; disability weight; empirical assessment; psychometrics; trade-off methods
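The two-part DALY equation described above can be sketched as follows. This is a deliberately simplified version that assumes no discounting or age weighting and a single health state; the function name `dalys` and all parameter names are hypothetical and are not taken from the U.S. study.

```python
def dalys(deaths, years_lost_per_death, cases, disability_weight, duration_years):
    """Simplified DALY total: years of life lost plus years lived with disability.

    disability_weight must lie between 0 (perfect health) and 1 (death).
    """
    if not 0.0 <= disability_weight <= 1.0:
        raise ValueError("disability weight must be in [0, 1]")
    # Part 1: years of life lost due to premature mortality.
    yll = deaths * years_lost_per_death
    # Part 2: years of healthy life lost due to living with disability.
    yld = cases * disability_weight * duration_years
    return yll + yld
```

The second term shows why the disability weight matters: for a fixed number of cases and duration, the estimated burden scales linearly with the weight assigned to the health state.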
The performance of the short screening scale for DSM-IV posttraumatic stress disorder (PTSD), developed by Breslau et al. (1999), has not been assessed in an independent general population sample, although it has been used in epidemiological as well as clinical research. In this report we evaluate the short screening scale in the National Epidemiologic Survey on Alcohol and Related Conditions (NESARC), a population-based survey of US household and group quarter residents. DSM-IV PTSD was assessed via symptom questions in the Alcohol Use Disorder and Associated Disabilities Interview Schedule–DSM-IV Version (AUDADIS-IV). Sensitivity, specificity, positive and negative predictive value, and percent correctly classified were calculated, using the interview-based diagnosis as the standard. Replicating findings from the initial report, a score of four or more on Breslau’s short screening scale identifies cases of PTSD with sensitivity of 78%, specificity of 97%, positive predictive value of 75%, and negative predictive value of 98%. The percentage of correctly classified respondents was 96%. The findings support the utility of the seven-item scale for screening PTSD in clinical and general population samples.
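The accuracy indices listed above all derive from the 2×2 cross-classification of screen result against the interview-based diagnosis. A minimal sketch follows; the function name `screening_metrics` and the cell names are illustrative assumptions, and any counts passed in are the caller's own rather than NESARC data.

```python
def screening_metrics(tp, fp, fn, tn):
    """Accuracy indices from a 2x2 table of screen result vs. reference diagnosis.

    tp/fn: reference-positive respondents screened positive/negative.
    fp/tn: reference-negative respondents screened positive/negative.
    """
    total = tp + fp + fn + tn
    return {
        "sensitivity": tp / (tp + fn),            # cases correctly flagged
        "specificity": tn / (tn + fp),            # non-cases correctly cleared
        "ppv": tp / (tp + fp),                    # positive predictive value
        "npv": tn / (tn + fn),                    # negative predictive value
        "percent_correct": 100 * (tp + tn) / total,
    }
```

Sensitivity and specificity are properties of the screen at a given cut-off, whereas the predictive values also depend on how common the disorder is in the sample being screened.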
Attrition in longitudinal studies can lead to biased results. The study is motivated by the unexpected observation that alcohol consumption decreased despite increased availability, which may be due to sample attrition of heavy drinkers. Several imputation methods have been proposed, but rarely compared in longitudinal studies of alcohol consumption. The imputation of consumption level measurements is computationally particularly challenging due to alcohol consumption being a semi-continuous variable (dichotomous drinking status and continuous volume among drinkers), and the non-normality of data in the continuous part. Data come from a longitudinal study in Denmark with four waves (2003–2006) and 1771 individuals at baseline. Five techniques for missing data are compared: Last value carried forward (LVCF) was used as a single imputation method, and Hotdeck, Heckman modelling, multivariate imputation by chained equations (MICE), and a Bayesian approach as multiple imputation methods. Predictive mean matching was used to account for non-normality, where instead of imputing regression estimates, “real” observed values from similar cases are imputed. Methods were also compared by means of a simulated dataset. The simulation showed that the Bayesian approach yielded the most unbiased estimates for imputation. The finding of no increase in consumption levels despite a higher availability remained unaltered.
panel surveys; missing data; multiple imputation; Bayesian models; alcohol consumption
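The idea behind predictive mean matching, imputing an observed value from a similar case rather than a regression estimate, can be sketched as below. This is a strongly simplified, deterministic illustration with one predictor and a single nearest donor; production implementations typically draw at random from several close donors and propagate parameter uncertainty across multiple imputations. The function name `pmm_impute` and the use of `None` for missing values are assumptions.

```python
def pmm_impute(x, y):
    """Fill missing y values (None) with the observed y of the nearest donor.

    "Nearest" is judged on the predicted mean from a simple linear
    regression of y on x fitted to the complete cases.
    Assumes at least two complete cases with distinct x values.
    """
    observed = [(xi, yi) for xi, yi in zip(x, y) if yi is not None]
    mean_x = sum(xi for xi, _ in observed) / len(observed)
    mean_y = sum(yi for _, yi in observed) / len(observed)
    slope = (sum((xi - mean_x) * (yi - mean_y) for xi, yi in observed)
             / sum((xi - mean_x) ** 2 for xi, _ in observed))
    intercept = mean_y - slope * mean_x

    def predict(xi):
        return intercept + slope * xi

    completed = []
    for xi, yi in zip(x, y):
        if yi is not None:
            completed.append(yi)
            continue
        # Donor = complete case whose predicted mean is closest to this case's.
        donor = min(observed, key=lambda d: abs(predict(d[0]) - predict(xi)))
        completed.append(donor[1])     # impute the donor's real observed value
    return completed
```

Because every imputed value is one that was actually observed, the imputations stay within the empirical distribution of the variable, which is why the technique suits skewed quantities such as consumption volume.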
Multivariate imputation by chained equations (MICE) has emerged as a principled method of dealing with missing data. Despite properties that make MICE particularly useful for large imputation procedures and advances in software development that now make it accessible to many researchers, many psychiatric researchers have not been trained in these methods and few practical resources exist to guide researchers in the implementation of this technique. This paper provides an introduction to the MICE method with a focus on practical aspects and challenges in using this method. A brief review of software programs available to implement MICE and then analyze multiply imputed data is also provided.
missing data; multiple imputation; analyze
The clinician-rated (QIDS-C16) and self-report (QIDS-SR16) versions of the 16-item Quick Inventory of Depressive Symptomatology have been extensively examined in adult populations. This study evaluated both versions of the QIDS and the 17-item Children’s Depressive Rating Scale-Revised (CDRS-R) in an adolescent outpatient sample.
Both the QIDS-C16 and QIDS-SR16 were completed for the adolescents. Three different methods were used to complete the QIDS-C16: (a) adolescents’ responses to clinician interviews; (b) parents’ responses to clinician interview; and (c) a composite score using the most pathological response from the two interviews. Both classical and item response theory methods were used. Factor analyses evaluated the dimensionality of each scale.
The sample included 140 adolescent outpatients. All versions of the QIDS, save the parent interview, and the CDRS-R were very reliable (α ≥ 0.8). All four versions of the QIDS are reasonably effective and unidimensional. The CDRS-R was clearly at least two-dimensional. The CDRS-R was the most discriminating among low and extremely high levels of depression. The QIDS-SR16 was the most discriminating at moderate levels of depression. There was no relation between the QIDS scores and concurrent Axis III comorbidities.
The QIDS-C16 and the QIDS-SR16 are suitable for use in adolescents.
Adolescent; depression; depressive symptom ratings; psychometrics; Quick Inventory of Depressive Symptomatology–Clinician-rated; Quick Inventory of Depressive Symptomatology–Self-report
We present a case study using a multilevel modeling approach to determine whether depressive symptoms are affected by genetic factors. Existing studies examining this question have focused on twins. The present study built on the literature by conducting a preliminary study of the heritability of depressive symptoms within extended families. At the same time, this study assessed the need for adjustment of a heritability measure in a family study using a multigenerational sample. The sample consisted of 230 community-dwelling extended families that included 431 adult offspring, comprising full siblings, half siblings and cousins that participated in the University of Southern California Longitudinal Study of Generations. All participants filled out the Center for Epidemiologic Studies Depression (CES-D) scale. The multilevel analysis allowed us to model the natural hierarchy of the extended family. Results indicate that the proportion of the phenotypic variance for CES-D that occurs due to genetic differences is not significantly larger than zero among these participants [h2 = 8.6%, 95% confidence interval (CI) = 0–57%, p = 0.71]. Our findings suggest that future studies examining depressive symptoms in this sample can focus on non-genetic explanatory factors without the necessity to control for genetic variation. However, our study may be limited by measurement of prevalent depressive symptoms, which may not generalize to lifetime depressive symptoms.
depressive symptoms; heritability; genetic variance; family study; multilevel model
Although needs assessment surveys are carried out after many large natural and man-made disasters, synthesis of findings across these surveys and disaster situations about patterns and correlates of need is hampered by inconsistencies in study designs and measures. Recognizing this problem, the US Substance Abuse and Mental Health Services Administration (SAMHSA) assembled a task force in 2004 to develop a model study design and interview schedule for use in post-disaster needs assessment surveys. The US National Institute of Mental Health subsequently approved a plan to establish a center to implement post-disaster mental health needs assessment surveys in the future using an integrated series of measures and designs of the sort proposed by the SAMHSA task force. A wide range of measurement, design, and analysis issues will arise in developing this center. Given that the least widely discussed of these issues concerns study design, the current report focuses on the most important sampling and design issues proposed for this center based on our experiences with the SAMHSA task force, subsequent Katrina surveys, and earlier work in other disaster situations.
Disaster; Epidemiology; needs assessment survey; PTSD
This paper evaluates the internal consistency reliability and concurrent validity of the assessment of Diagnostic and Statistical Manual of Mental Disorders (DSM-IV) attention deficit hyperactivity disorder (ADHD) in the adolescent version of the World Health Organization (WHO) Composite International Diagnostic Interview Version 3.0 (CIDI). The CIDI is a lay-administered diagnostic interview that was carried out in conjunction with the US National Comorbidity Survey Adolescent Supplement, a US nationally representative survey of 10,148 adolescents and their parents. Internal consistency reliability was evaluated using factor and item response theory analyses. Concurrent validity was evaluated against diagnoses based on blinded clinician-administered interviews. Inattention and hyperactivity-impulsivity items loaded on separate but correlated factors, with hyperactivity and impulsivity items forming a single factor in parent reports but separate factors in youth reports. We were able to differentiate hyperactivity and impulsivity factors for parents as well by eliminating a subset who endorsed zero ADHD items from the factor analysis. Although concurrent validity was relatively weak, decomposition showed that this was due to low validity of adolescent reports. A modified CIDI diagnosis based exclusively on parent reports generated a diagnosis that had good concordance with clinical diagnoses [area under the curve (AUC) = 0.78]. Implications for assessing ADHD using the CIDI and the effect of different informants on measurement are discussed.
attention deficit hyperactivity disorder; WHO Composite International Diagnostic Interview (CIDI); validity; National Comorbidity Survey Replication Adolescent Supplement (NCS-A)
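As an illustration of the concordance statistic reported above, the following is a minimal sketch of how an AUC can be computed when a dichotomous survey diagnosis is compared with a dichotomous clinical diagnosis. In that special case the ROC curve has a single operating point, so the trapezoidal AUC reduces to the mean of sensitivity and specificity. The function name and the toy data are hypothetical, and this is not necessarily the computation used in the study.

```python
def binary_auc(predicted, reference):
    """AUC for a dichotomous classifier: (sensitivity + specificity) / 2.

    `predicted` and `reference` are equal-length sequences of 0/1 values,
    where `reference` plays the role of the clinical (criterion) diagnosis.
    """
    pairs = list(zip(predicted, reference))
    tp = sum(1 for p, r in pairs if p == 1 and r == 1)  # true positives
    fn = sum(1 for p, r in pairs if p == 0 and r == 1)  # false negatives
    tn = sum(1 for p, r in pairs if p == 0 and r == 0)  # true negatives
    fp = sum(1 for p, r in pairs if p == 1 and r == 0)  # false positives
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return (sensitivity + specificity) / 2


# Toy data (hypothetical): 8 respondents, 3 clinical cases.
survey_dx = [1, 1, 0, 0, 1, 0, 0, 0]
clinical_dx = [1, 1, 1, 0, 0, 0, 0, 0]
auc = binary_auc(survey_dx, clinical_dx)
print(round(auc, 3))
```

An AUC of 0.5 corresponds to chance agreement and 1.0 to perfect agreement, which is the scale on which the reported value of 0.78 should be read.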
A primary challenge in psychiatric genetics is the lack of a completely validated system of classification for mental disorders. Appropriate statistical methods are needed to empirically derive more homogeneous disorder subtypes.
Using the framework of Robins & Guze’s (1970) five phases, latent variable models to derive and validate diagnostic groups are described. A process of iterative validation is proposed through which refined phenotypes would facilitate research on genetics, pathogenesis, and treatment, which would in turn aid further refinement of disorder definitions.
Latent variable methods are useful tools for defining and validating psychiatric phenotypes. Further methodological research should address sample size issues and application to iterative validation.
latent class analysis; phenotype; validation
An overview is presented of the design and field procedures of the US National Comorbidity Survey Replication Adolescent Supplement (NCS-A), a US face-to-face household survey of the prevalence and correlates of DSM-IV mental disorders. The survey was based on a dual-frame design that included 904 adolescent residents of the households that participated in the US National Comorbidity Survey Replication (85.9% response rate) and 9,244 adolescent students selected from a nationally representative sample of 320 schools (74.7% response rate). After explaining the logic of dual-frame designs, comparisons are presented of sample and population distributions on Census socio-demographic variables and, in the school sample, school characteristics. These comparisons document only minor differences between the samples and the population. The results of a statistical analysis of the bias-efficiency trade-off in weight trimming are then presented. These show that modest trimming meaningfully reduces mean squared error. Analysis of comparative sample efficiency shows that the household sample is more efficient than the school sample, so the household sample receives a higher weight relative to its size than the school sample does in the consolidated sample. Taken together, these results show that the NCS-A is an efficient sample of the target population with good representativeness on a range of socio-demographic and geographic variables.
Psychiatric epidemiology; child-adolescent mental disorder; National Comorbidity Survey (NCS)
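The bias-efficiency trade-off in weight trimming mentioned in the NCS-A abstract above can be illustrated with a small sketch. Trimming caps extreme survey weights, which lowers the variance of weighted estimates at the cost of possibly shifting them (bias). The sketch below uses Kish's approximate design effect, n·Σw² / (Σw)², as a stand-in for the variance inflation due to unequal weights. This is an assumption chosen for illustration, not the NCS-A authors' procedure, and all function names, the cap, and the toy data are hypothetical.

```python
def kish_design_effect(weights):
    """Kish's approximate design effect from unequal weighting."""
    n = len(weights)
    total = sum(weights)
    return n * sum(w * w for w in weights) / (total * total)


def trim_weights(weights, cap):
    """Cap each weight at `cap`, then rescale to preserve the weight total.

    Note: after rescaling, the largest weights can end up slightly above
    `cap`; production procedures often iterate until this stabilizes.
    """
    total = sum(weights)
    capped = [min(w, cap) for w in weights]
    scale = total / sum(capped)
    return [w * scale for w in capped]


def weighted_mean(values, weights):
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)


# Toy data (hypothetical): one respondent carries a very large weight.
weights = [1.0, 1.0, 1.0, 1.0, 6.0]
values = [0, 0, 1, 0, 1]

trimmed = trim_weights(weights, cap=3.0)

deff_before = kish_design_effect(weights)   # variance inflation, untrimmed
deff_after = kish_design_effect(trimmed)    # smaller after trimming
shift = weighted_mean(values, trimmed) - weighted_mean(values, weights)

print(deff_before, deff_after, shift)
```

In this toy case trimming lowers the design effect while moving the weighted mean, which is the trade-off that a mean squared error criterion (squared bias plus variance) is meant to balance.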