The US Preventive Services Task Force advises primary care clinicians to screen adolescents for depression, provided there is a system of care to confirm diagnosis and initiate treatment.1
To implement this recommendation, providers need screening tools that can be easily used in pediatric primary care settings. Although the PHQ-9 has been extensively tested among adults, this is the first study to examine its test characteristics in an adolescent population. We found that, when compared to a structured diagnostic interview, the PHQ-9 had high sensitivity (89.5%) and good specificity (78.8%) for detecting major depression among adolescents. On ROC analysis, it had an area under the curve of 0.88, which places this screening tool in the “good” range.18
The sensitivity and specificity of the PHQ-9 are in a similar range to those of other depression screening tools that have been tested among adolescents in primary care: the BDI (sensitivity 91%, specificity 91%),3 the PHQ-A (sensitivity 73%, specificity 94%),6 and the Short MFQ19 (sensitivity 80%, specificity 81%).20 The PHQ-9 also performs better than physician interview following targeted training (sensitivity 43%, specificity 87%).5
In adult samples, a PHQ-9 score of 10 or higher is recommended to identify individuals with likely depression. Based on our findings, we would recommend using a cut point of 11 or higher to indicate the need for further evaluation for depression. However, providers may reasonably choose an alternate cut point. For example, even though it may result in a higher rate of false positives, clinics where both adolescents and adults are seen might choose to use a cut point of 10 in order to simplify procedures for providers.
In a prior study using this sample, we found that the PHQ-2, which contains the first two items of the PHQ-9, has a sensitivity of 74% and a specificity of 75% for detecting major depression among adolescents.10
Clinics wishing to minimize respondent burden could start with the PHQ-2 and administer the full PHQ-9 only to those with a score of 3 or higher on the PHQ-2. The benefit of adding the PHQ-9 in this protocol is that it provides more information on individual depressive symptoms, has better specificity for major depression than the PHQ-2, and includes a question about suicide, an important cause of mortality among adolescents.21
It is important to note, however, that youth do not need to be depressed to be suicidal. Any positive indication of suicidality (a score of 1 or higher on item 9 of the PHQ-9) should be taken seriously and followed up by providers regardless of total PHQ-9 score.
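The two-step protocol described above can be sketched in code. The following is a minimal illustration only, not a validated clinical tool. It assumes each PHQ item is scored 0 to 3, as on the standard forms; the function and constant names are hypothetical, while the cut points (3 on the PHQ-2, 11 on the PHQ-9) and the item 9 rule are taken from the text.

```python
from typing import Dict, Sequence, Union

PHQ2_CUT_POINT = 3   # from the text: give the full PHQ-9 if PHQ-2 score >= 3
PHQ9_CUT_POINT = 11  # from the text: recommended adolescent cut point


def _check_items(items: Sequence[int], expected_count: int) -> None:
    """Validate item scores; assumes each item is scored 0-3."""
    if len(items) != expected_count:
        raise ValueError(f"expected {expected_count} items, got {len(items)}")
    if any(score not in (0, 1, 2, 3) for score in items):
        raise ValueError("each item must be scored 0, 1, 2, or 3")


def needs_full_phq9(phq2_items: Sequence[int]) -> bool:
    """Step 1: decide from the two PHQ-2 items whether to give the PHQ-9."""
    _check_items(phq2_items, 2)
    return sum(phq2_items) >= PHQ2_CUT_POINT


def interpret_phq9(phq9_items: Sequence[int]) -> Dict[str, Union[int, bool]]:
    """Step 2: total the nine items and apply both follow-up rules."""
    _check_items(phq9_items, 9)
    total = sum(phq9_items)
    return {
        "total_score": total,
        # Total at or above the cut point: further evaluation for depression.
        "needs_depression_evaluation": total >= PHQ9_CUT_POINT,
        # Item 9 (index 8) is checked independently of the total score.
        "suicidality_follow_up": phq9_items[8] >= 1,
    }


if __name__ == "__main__":
    # Hypothetical responses, for illustration only.
    if needs_full_phq9([2, 1]):
        print(interpret_phq9([2, 1, 1, 1, 0, 1, 0, 0, 1]))
```

The suicidality flag is computed separately from the total so that a low total score cannot mask a positive response on item 9.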
Compared to the findings in adults, the sensitivity of the PHQ-9 is higher but the specificity is lower in the adolescent population. This suggests that, when used as a screening tool, the PHQ-9 is less likely to miss youth with major depression but has a higher false positive rate in adolescent populations. The higher false positive rate may result from a high rate of subthreshold depressive symptoms and adjustment disorders, as well as a significant overlap of symptoms between mental health disorders in this age group. Of the youth in the false positive category, 82% had an indication of a mental health concern, suggesting the need for further monitoring. These indications included meeting criteria for “intermediate depression” on the DISC-IV, having depression in the past year but not in the past month, having high levels of externalizing behavior, and/or having high levels of anxiety symptoms.
An additional difference between the adult and the youth DSM-IV criteria for major depressive disorder is that youth may meet the diagnostic criteria by presenting with irritability rather than depressed mood. The PHQ-9 does not include an item about irritability and, to allow for the use of a single form for settings where both adolescents and adults are seen, we chose not to change the wording of the PHQ-9. As we did not add an irritability item, we are not able to determine how it may have modified the performance of the PHQ-9. The DISC-IV does include an irritability item and some of the discrepancy between these two instruments may relate to this difference.
This study has the following limitations. First, this study was conducted in an insured population of adolescents in the Pacific Northwest and may not be generalizable to all adolescent populations. Second, the response rate to our initial brief screen was 60%, and we may have had some selection bias regarding youth who participated in the study. Although we were very encouraged by the 89% participation rate in the follow-up interview study, it is possible that youth who chose not to participate were different from those who did. Third, since we oversampled youth with elevated PHQ-2 scores, the prevalence of depression in our study sample may be higher than would be seen if conducting screening in the primary care clinic. The positive and negative predictive values are influenced by underlying population prevalence; in a general primary care sample with lower prevalence, the positive predictive value in particular may be lower. Additionally, the PHQ-9 was administered via a phone interview, which may have resulted in different responses than if it had been self-administered. Finally, the DISC-IV asks questions about a one-month time period while the PHQ-9 asks about the prior 2 weeks. Some of the imperfect sensitivity and specificity may be due to these time window differences.
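The dependence of predictive values on prevalence can be made concrete with Bayes' rule. The sketch below uses the sensitivity (89.5%) and specificity (78.8%) reported in this study; the prevalence values are hypothetical and chosen only for illustration, and the function name is an assumption rather than part of the study's analysis.

```python
from typing import Tuple

SENSITIVITY = 0.895  # reported in this study
SPECIFICITY = 0.788  # reported in this study


def predictive_values(sensitivity: float, specificity: float,
                      prevalence: float) -> Tuple[float, float]:
    """Return (PPV, NPV) for a given prevalence using Bayes' rule."""
    if not 0.0 < prevalence < 1.0:
        raise ValueError("prevalence must be strictly between 0 and 1")
    true_pos = sensitivity * prevalence
    false_pos = (1.0 - specificity) * (1.0 - prevalence)
    true_neg = specificity * (1.0 - prevalence)
    false_neg = (1.0 - sensitivity) * prevalence
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv


if __name__ == "__main__":
    # Hypothetical prevalence values, not estimates from the study.
    for prevalence in (0.05, 0.10, 0.20):
        ppv, npv = predictive_values(SENSITIVITY, SPECIFICITY, prevalence)
        print(f"prevalence={prevalence:.2f}  PPV={ppv:.3f}  NPV={npv:.3f}")
```

With sensitivity and specificity held fixed, the positive predictive value falls and the negative predictive value rises as prevalence decreases, which is why values estimated in an oversampled cohort should not be carried over directly to a general clinic population.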
Despite these limitations, the PHQ-9 is a promising screening tool for use among adolescents. It is brief, easy for patients to understand, simple to score, and available without cost. An additional major advantage of the PHQ-9 is that many primary care providers are already using it for the adult population and thus have familiarity with administration and scoring. It performs well in this age group and will be particularly useful for providers or researchers who want to conduct rapid screening in primary care settings or as part of research protocols.