It is estimated that 10–15% of children under 6 years of age suffer from emotional or behavioral problems [1]. This population is being seen in clinics and treated with psychopharmacology with increasing frequency: one estimate is that 2.3% of 2–4 year-old Medicaid-insured children received one or more psychotherapeutic medications in 2001, more than double the usage rate in 1995 [2]. Legislative incentives [3] and regulations [4] intended to improve the quality of pharmacotherapy for children of all ages have been enacted recently. Special journal issues have been devoted to preschool psychopharmacology [5], clinical treatment algorithms have been devised for young children [6], and comprehensive clinical textbooks have appeared for infant [7] and preschool [8] specialists. Still, few instruments are available that provide developmentally sensitive assessments of young children. Perhaps because of this, there are relatively few diagnostic validity studies with preschool children that can guide research on diagnosis-related psychopathology [9].
Until recently, diagnostic instruments did not exist for youth under 7 years of age. The standardized measures that were available to assess young children were parent-report checklists, such as the commonly used Child Behavior Checklist 1.5–5 years (CBCL) [10] and the newer Infant-Toddler Social and Emotional Assessment [11]. These parent- and teacher-report questionnaires have many advantages for certain research questions, but they do not cover all of the symptoms needed to make the Diagnostic and Statistical Manual, Fourth Edition (DSM-IV) [12] diagnoses required for clinical service and clinical research. In addition, they lack linkage to disorder-specific functional impairment, which is also required for making diagnoses. The checklist format precludes interviewing in which problems can be probed, challenged, and expanded upon to determine whether respondents truly understand the items and are giving accurate information.
The ideal assessment of young children would include both caregivers and children as informants. Regrettably, interviews of children younger than 7 years of age are not feasible because they have not yet mastered multiple types of skills needed for this task. Despite some advances in this area with 5- and 6-year-old children, most notably with the Berkeley Puppet Interview [13], there is little reason to believe that children younger than 5 years would have sufficient skills, and there have been no known studies of the accuracy of self-report in relation to diagnoses in children younger than 7 years. Assessments of disorders in young children with current techniques are therefore practically dependent on interviews of their caregivers.
Even when relying on caregivers’ reports, there is cause for concern that interviews about young children may be less reliable and/or valid compared to older age groups, for which there are more established norms for problem behaviors. Nevertheless, when examined empirically, the first reported psychometrics for an early childhood diagnostic instrument showed promising results. When two clinicians rated videotapes of another clinician’s interviews of 15 parents of 1–3 year-old children, conducted with a standardized instrument for posttraumatic stress disorder (PTSD), inter-rater reliability was substantial, with a Cohen’s kappa of 0.74 [14]. Subsequently, the largest demonstration of feasibility comes from a study with the Preschool Age Psychiatric Assessment (PAPA), the first multi-disorder instrument with published psychometric properties. Two research assistants used the PAPA to interview caregivers of 307 2–5 year-old children recruited from a general pediatric clinic, covering 12 disorders [15]. Categorical agreements were substantial (kappa greater than 0.60) for seven of the disorders, fair to good (between 0.4 and 0.6) for three disorders, and poor for two disorders (generalized anxiety disorder [GAD] and specific phobia). These findings were comparable to those found with older children [16].
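The categorical agreement statistic used throughout these studies, Cohen’s kappa, corrects raw agreement for the agreement expected by chance. A minimal sketch, using hypothetical counts (not data from any study cited here), for two raters making yes/no diagnoses:

```python
def cohens_kappa(a, b, c, d):
    """Cohen's kappa for two raters' yes/no diagnoses.

    a: both yes; b: rater 1 yes, rater 2 no;
    c: rater 1 no, rater 2 yes; d: both no.
    """
    n = a + b + c + d
    p_o = (a + d) / n                      # observed agreement
    p_yes = ((a + b) / n) * ((a + c) / n)  # chance agreement on "yes"
    p_no = ((c + d) / n) * ((b + d) / n)   # chance agreement on "no"
    p_e = p_yes + p_no
    return (p_o - p_e) / (1 - p_e)

# Hypothetical counts for 100 children rated by two interviewers:
kappa = cohens_kappa(a=40, b=10, c=5, d=45)
print(round(kappa, 2))  # → 0.7
```

By the conventions above, this hypothetical value of 0.7 would fall in the "substantial" range (greater than 0.60).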
Despite the advances represented by the PAPA, there are encumbrances to its use in clinical service or research settings, and the Diagnostic Infant and Preschool Assessment (DIPA) was created with several features to fill this gap. The PAPA is quite long to administer. The DIPA assesses 13 disorders with 47 pages and 517 questions that require responses; the hard copy of the PAPA covers the same 13 disorders with 245 pages and 1,591 questions. The DIPA thus represents an 81% reduction in pages and a 68% reduction in number of questions.
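The quoted reductions follow directly from the page and question counts given above:

```python
# Page and question counts quoted above for the same 13 disorders.
dipa_pages, papa_pages = 47, 245
dipa_questions, papa_questions = 517, 1591

page_reduction = 1 - dipa_pages / papa_pages              # ≈ 0.81
question_reduction = 1 - dipa_questions / papa_questions  # ≈ 0.68

print(f"{page_reduction:.0%} fewer pages, "
      f"{question_reduction:.0%} fewer questions")
# → 81% fewer pages, 68% fewer questions
```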
Most PAPA disorder modules are not self-contained, and none of them contains an algorithm for making diagnoses. Questions about five anxiety disorders are intermixed in one module without disorder headings; a clinician cannot tell from the interview whether a child has a particular anxiety disorder without creating an algorithm for items that are dispersed throughout the module. Symptoms of conduct disorder (CD) and oppositional defiant disorder (ODD) are also intermixed. Symptoms of sleep difficulty, appetite disturbance, and fatigue are separated from the major depressive disorder (MDD) module. Also, two of the symptoms needed for PTSD (sense of a foreshortened future and diminished interests) are in the MDD module. In contrast, each disorder in the DIPA is in a self-contained module, and a one-page tally sheet provides diagnostic algorithms for all disorders.
Beyond these practical issues of organization, two theoretical differences distinguish the instruments. First, the PAPA was limited to children 2 through 5 years of age, and its stem questions and coding rules are not applicable to infants or 1-year-old children. The DIPA was worded so that it could be applied to younger children if desired and, in the absence of data, was not based on an a priori assumption that disorders could not be detected in younger children.
A second theoretical distinction is that approximately 40 of the questions in the PAPA were worded to ask if behaviors “ever” happened or “how much” they happened (excluding questions about frequencies). This may be a strength for gathering a range of normative versus non-normative data, but it is problematic in a structured instrument when many children normatively show the relevant behaviors on one or more occasions without having them as recurring symptoms. Wording questions in terms of “ever” or “how much” can mislead respondents in a clinical setting into believing that they are being asked to report on normative behaviors in addition to problem behaviors. In contrast, the DIPA is constructed with probe questions worded specifically to educate respondents that the focus is on behaviors beyond what is normal for children of these ages. DIPA questions ask whether things are “excessive”, “abnormal”, or “more than the average child his/her age.” This approach is believed to be a more direct route to detecting symptoms and saves time, which is not a trivial concern given the length of time that administering diagnostic instruments requires.
In addition to reporting basic reliability and validity data on the DIPA, this report examines, for the first time, the reliability of rating disorder-specific functional impairment in young children with disorders. For all DSM-IV disorders, both symptoms and disorder-specific functional impairment are required. The assessment of functional impairment is an additional challenge in the preschool population because fewer domains of role functioning are available. If ratings of impairment are less reliable than ratings of symptoms, then this will disproportionately affect the ability to make diagnoses even when sufficient numbers of symptoms are present. Yet even among studies of older populations of children, only the Diagnostic Interview Schedule for Children (DISC) has, to our knowledge, examined disorder-specific impairment. When caregivers of 9–18 year-old children were interviewed twice, agreements were acceptable for disorder-specific impairment alone for MDD, attention-deficit/hyperactivity disorder (ADHD), ODD, and CD, but not for social phobia (κ = .33) or avoidant disorder (κ = .34) [17]. Reliabilities did not substantially change whether or not impairment was required for diagnoses, except that disorder reliability decreased when impairment was required for social phobia [16].
Hypotheses: (1) The test–retest reliabilities between two independent interviewers (trained clinicians compared to trained research assistants) at the disorder level will be acceptable for both continuous (intraclass correlation coefficient > 0.50) and categorical indices (Cohen’s kappa fair to good [greater than 0.40]). (2) The concurrent criterion validity will be acceptable (correlations > 0.50, and kappas fair to good) when the DIPA is compared to relevant Child Behavior Checklist scales on both continuous and categorical variables. (3) A more exploratory goal is to descriptively examine the reliabilities of disorder with impairment required, disorder without impairment required, and any impairment alone. This provides the first preliminary data on the reliability of assessing disorder-specific impairment by itself and on how including or excluding impairment for diagnoses affects reliability in young children.
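For the continuous index in hypothesis (1), the intraclass correlation coefficient expresses the share of total score variance attributable to true differences between children rather than between interviewers. A minimal sketch of one common variant, the one-way random-effects ICC(1,1), with entirely hypothetical scores; published studies often use other ICC models (e.g., two-way designs), so this illustrates the idea rather than the exact estimator used here:

```python
def icc_oneway(scores):
    """One-way random-effects ICC(1,1): proportion of total variance
    attributable to differences between targets (here, children)."""
    n = len(scores)        # number of targets
    k = len(scores[0])     # ratings per target
    grand = sum(sum(row) for row in scores) / (n * k)
    row_means = [sum(row) / k for row in scores]
    # Between-target and within-target mean squares from one-way ANOVA.
    msb = k * sum((m - grand) ** 2 for m in row_means) / (n - 1)
    msw = sum((x - row_means[i]) ** 2
              for i, row in enumerate(scores) for x in row) / (n * (k - 1))
    return (msb - msw) / (msb + (k - 1) * msw)

# Hypothetical symptom-count scores: four children, two interviews each.
print(round(icc_oneway([[8, 9], [4, 5], [6, 7], [2, 3]]), 2))  # → 0.93
```

In this hypothetical example the two interviews rank the children consistently, so the ICC comfortably exceeds the 0.50 acceptability threshold stated in hypothesis (1).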