The objective of this study was to assess the usefulness of the Academic Performance Questionnaire (APQ) to identify low reading and math achievement in children who are being evaluated for attention-deficit/hyperactivity disorder (ADHD).
Charts of 997 patients who were seen in a multidisciplinary ADHD evaluation program were reviewed. Patients who were in first through sixth grade and had a complete APQ and complete Wechsler Individual Achievement Test II Basic Reading and Numerical Operations subtest scores were enrolled in this study. The 271 eligible patients were randomly assigned to a score-development group (n = 215) and a validation group (n = 56). By using data from the score-development sample, APQ questions that predicted low academic achievement were identified, and the scores for these questions were entered into a logistic regression to identify the APQ questions that independently predicted low achievement.
Only 2 APQ questions, 1 about reading and 1 about math, independently predicted low achievement. By using these 2 questions, the area under the receiver operating characteristic curve was 0.834, and the optimal combination of sensitivity and specificity occurred when the total score for the 2 items was >4. This cutoff had a sensitivity of 0.86 and a specificity of 0.63 in the score-development group and a sensitivity of 1.0 and a specificity of 0.53 in the validation sample.
The APQ may be a useful screening tool to identify children being evaluated for ADHD who need additional testing for learning problems. Although the predictive value of a negative screen on the APQ is good, the predictive value of a positive test is relatively low.
Attention-deficit/hyperactivity disorder (ADHD) is a neurobehavioral disorder that appears early in childhood and is characterized by inattention and hyperactivity-impulsivity, resulting in functional impairments.1 It has an estimated prevalence of 2% to 10% among children and adolescents.2 Limited access to behavioral health specialists has resulted in increased demands on primary care physicians (PCPs) to diagnose and manage ADHD.3 When left unrecognized or untreated, ADHD and its comorbidities have been associated with strained familial and peer relationships, educational and employment difficulties, substance use, and unintentional injuries.4
Learning disorders co-occur with ADHD in 20% to 30% of children and are typically identified by a psychoeducational assessment.4–6 PCPs who evaluate children for ADHD often find it difficult to identify which children they should refer for additional assessment of academic skills.3,7 Although a number of parent- and teacher-completed questionnaires have been developed to assist in evaluating ADHD symptoms and symptoms of other associated mental health conditions, no validated screening tools are available to assist PCPs in identifying children who have ADHD symptoms and may need additional evaluation for learning problems.8 The ADHD Toolkit includes a teacher rating scale with 3 items that assess academic performance, but the reliability and the predictive validity of these questions are not known.8
Previous research on the accuracy of teacher rating of students’ academic achievement has produced conflicting findings. Kenny and Chekaluk9 found that teachers could classify kindergarten through second-grade students into 3 categories (poor, average, or advanced reader) with excellent concurrent validity to standardized tests that measure phonological, language, reading, and memory skills. In their 1997 study, Gresham and MacMillan10 found that teachers’ ratings of second-, third-, and fourth-grade students could differentiate children who had or were at risk for learning problems from control subjects with 95% accuracy. They additionally found that teachers’ ratings of students’ performance compared with that of their peers were most predictive for children who were classified as having a learning disability11; however, Glascoe12 found that global teacher ratings of students’ academic performance by using a simple 5-point Likert scale were only moderately sensitive for detecting low achievement on standardized tests. None of these studies has investigated the value of teacher rating in detecting poor academic achievement in children who are being evaluated for ADHD.
The Academic Performance Questionnaire (APQ) is a 10-item questionnaire that is completed by teachers.11 It uses 4- and 5-point ordinal scales to identify the child’s performance in reading, mathematics, writing, and homework. This measure has been used to obtain descriptive information about children who are being evaluated in a multidisciplinary ADHD center that specializes in the assessment and treatment of ADHD and is based in a tertiary care pediatric hospital in the mid-Atlantic region of the United States. It is unknown whether the APQ might be a valid and clinically useful screening tool for learning problems that could be used by physicians. The purposes of this study were to (1) examine the test-retest reliability of the APQ and (2) evaluate the validity of the APQ with regard to predicting low achievement in reading and/or math.
The charts of 997 consecutive patients who were evaluated through a multidisciplinary ADHD center between May 2001 and March 2005 were reviewed. Figure 1 indicates how potential participants were included and excluded from the study. Patients were included only when they were enrolled in grades 1 through 6 (n = 558). Children were excluded when they had a history of receiving special education services at the time of their evaluation (n = 236), because a screening instrument would not be required to determine the need for a learning evaluation in these cases. Patients were also excluded when they did not have at least 1 teacher-completed APQ (n = 72) or when they did not have complete Wechsler Individual Achievement Test II (WIAT-II) Basic Reading and Numerical Operations subtest scores (n = 9). The final study population consisted of 271 patients. These patients were randomly assigned to an initial validation sample (the score-development group; n = 215) and a cross-validation sample (the validation group; n = 56) so that ~4 patients were assigned to the score-development group for every 1 patient assigned to the validation group (see Fig 1). This method was used to ensure that there was sufficient power in the initial validation sample to conduct the analyses. All patients had consented to use of their assessment data for research, and institutional review board approval for this study was obtained. Patients did not receive incentives or compensation for participation in this study.
Data to assess test-retest reliability of the APQ were collected on a convenience sample of first- through fourth-grade students at a local suburban elementary school. Teachers were asked to systematically select 4 students from their grade book in the following manner: select the first boy and the first girl in the grade book who required remedial assistance or academic supports, and select the second boy and third girl in the grade book who were receiving regular education services only. This method was used to ensure that roughly an equal number of children with and without learning problems were included in the sample. APQ data for these students were collected anonymously, as recommended by the institutional review board, because it was not feasible to obtain parent consent; therefore, no demographic information about the children in this sample was available. The school population consists of 33% black, 3% Asian, 13% Hispanic, and 51% white children. Thirty percent of the school population is eligible for free or reduced-price lunch.
All charts were reviewed by research team members (Drs Bennett and Eiraldi) and research assistants. The data extracted for this study included demographic data at the time of initial evaluation (age in months, gender, ethnicity, and socioeconomic class determined by the Hollingshead Index), history of grade failure, and the use of special education services. Results of the evaluation were also extracted and included the results of the Diagnostic Interview for Children and Adolescents–Revised, Parent Version,13 APQ,11 and subtest scores on the WIAT-II.14 ADHD status was determined by using clinician diagnosis on the basis of results from the Diagnostic Interview for Children and Adolescents–Revised, Parent Version as well as parent and teacher rating scales.
For test-retest data collection, the study protocol was presented to teachers during a routine faculty meeting. Teachers who chose to participate completed APQs on systematically selected students at baseline and 3 weeks later. A 3-week retest period was estimated to be sufficient to assess reliability; that is, long enough for teachers not to recall their previous responses and brief enough not to be affected by the student’s attainment of new skills and resulting changes in classroom performance.
The APQ is a 10-item questionnaire that is completed by teachers (Table 1) and was designed to assess student progress in the classroom curriculum in relationship to other students.11 It has been used clinically in the ADHD center since 2001 to obtain descriptive data about children’s academic performance in the classroom. Teachers respond to each question on an ordinal 4- or 5-point scale as described in Table 1.
The WIAT-II basic reading and numerical operations subtests were administered to all patients to identify children who were underachieving in math and reading. The subtests were selected on the basis of their excellent psychometric properties. Average stability coefficients for the word reading and numerical operations subtests have been reported to be 0.98 and 0.92, respectively. Each subtest has demonstrated very high correlations with respective composite scores for all age groups (0.91–0.95 for total reading and 0.86–0.93 for total mathematics).14 Furthermore, standard score differences between individuals with learning disabilities and matched control subjects were significantly large for both subtests (P < .01).14 Low achievement was defined as a standard score of <85 on either basic reading or numerical operations subtests. This cut point has been suggested as a useful indicator of low achievement by experts in psychology and education.6
All statistical analyses were performed by using SPSS 16.0 (SPSS, Inc, Chicago, IL). Demographic characteristics of the initial validation sample and the cross-validation sample were compared by using the Mann-Whitney-Wilcoxon test.
Univariate logistic regression was then computed to determine which items on the APQ predicted low achievement in reading and/or math. Items that were determined to be significant predictors of low achievement were entered into a multivariate logistic regression model to assess their unique predictive ability. Items that were determined to be uniquely predictive were combined and used to plot a receiver operating characteristic (ROC) curve to determine scoring cutoff points that provide greatest sensitivity and specificity for the outcome of low achievement. Test-retest reliability of teachers’ categorization of students’ academic skills using the APQ scoring algorithm was assessed using the κ statistic. The sensitivity and the specificity of the identified APQ cut score for low academic achievement were then assessed in the validation sample.
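The cut-point search described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the study used SPSS and does not say which criterion defined the "greatest" sensitivity and specificity, so Youden's index (sensitivity + specificity - 1) is assumed here; the function names and the ten example children are hypothetical and are not study data. A screen is counted as positive when the summed score of the 2 items exceeds the cut point.

```python
from typing import List, Sequence, Tuple


def sens_spec(scores: Sequence[int], low: Sequence[bool], cutoff: int) -> Tuple[float, float]:
    """Sensitivity and specificity when 'score > cutoff' is a positive screen."""
    tp = sum(1 for s, y in zip(scores, low) if y and s > cutoff)
    fn = sum(1 for s, y in zip(scores, low) if y and s <= cutoff)
    tn = sum(1 for s, y in zip(scores, low) if not y and s <= cutoff)
    fp = sum(1 for s, y in zip(scores, low) if not y and s > cutoff)
    return tp / (tp + fn), tn / (tn + fp)


def auc(scores: Sequence[int], low: Sequence[bool]) -> float:
    """Area under the ROC curve via pairwise comparison (ties count one half)."""
    pos = [s for s, y in zip(scores, low) if y]
    neg = [s for s, y in zip(scores, low) if not y]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def best_cutoff(scores: Sequence[int], low: Sequence[bool],
                candidates: Sequence[int] = range(2, 8)) -> int:
    """Cut point maximizing Youden's index (an assumed criterion, see lead-in)."""
    def youden(c: int) -> float:
        se, sp = sens_spec(scores, low, c)
        return se + sp - 1
    return max(candidates, key=youden)


# Hypothetical summed 2-item scores (possible range 2 to 8) and outcomes
# (True = standard score < 85 on either subtest). Invented for illustration.
example_scores: List[int] = [2, 3, 3, 4, 4, 5, 6, 5, 7, 8]
example_low: List[bool] = [False] * 7 + [True] * 3

print(best_cutoff(example_scores, example_low))   # 4 for this toy data
print(sens_spec(example_scores, example_low, 4))  # (1.0, 0.714...)
```

With real data, the same functions would be applied to the score-development sample first and the chosen cut point then checked, unchanged, in the validation sample.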
There were no significant demographic differences between the initial validation sample and the cross-validation sample (Table 2). The majority of patients were male; a high percentage were white; and socioeconomic status of ~70% was in the upper 2 of 5 categories on the Hollingshead (1975) scale. There was no difference in the frequency of ADHD diagnosis in the 2 groups (Table 2), and ADHD subtypes did not vary significantly between groups: ADHD, combined type (39% vs 41%); ADHD, predominantly inattentive type (22% vs 27%); ADHD, predominantly hyperactive-impulsive type (3% vs 4%); and ADHD, not otherwise specified (9% vs 11%).
Approximately 17% of the sample were determined to be low achievers in reading and/or mathematics: math only, 9.5%; reading only, 3.7%; and both reading and math, 3.3%. Logistic regression conducted with each item separately revealed that 7 items were significantly associated with low achievement (see Table 1). Two questions remained uniquely predictive of low achievement when the 7 items were entered simultaneously in the multivariate logistic regression model (Table 3). When these 2 items were combined, total scores ranging from 2 to 8 were possible, with higher scores indicating lower academic achievement. The area under the ROC curve was 0.834, and the optimal combination of sensitivity and specificity occurred when the cut point was set at a total raw score of 4 (Fig 2). Using a summed score of >4 yielded a sensitivity of 0.86, a specificity of 0.63, a positive predictive value (PPV) of 0.32, and a negative predictive value (NPV) of 0.96 in the score-development sample (Table 4). Inclusion of additional APQ questions that showed a trend toward being predictive of low achievement in the regression model did not improve the area under the curve or psychometric properties. Analysis of the validation sample by using this scoring cut point revealed similar values: sensitivity of 1.0, specificity of 0.53, PPV of 0.29, and NPV of 1.0.
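The relationship between these figures can be checked with Bayes' rule. The sketch below derives PPV and NPV from sensitivity, specificity, and prevalence; the function name is hypothetical, and the prevalence of 0.17 is an assumption taken from the approximate overall rate of low achievement reported above (the exact rate within the score-development sample is not given in the text).

```python
from typing import Tuple


def predictive_values(sensitivity: float, specificity: float,
                      prevalence: float) -> Tuple[float, float]:
    """Return (PPV, NPV) for a screen applied at the given prevalence."""
    tp = sensitivity * prevalence                # true positives per child screened
    fn = (1 - sensitivity) * prevalence          # false negatives
    tn = specificity * (1 - prevalence)          # true negatives
    fp = (1 - specificity) * (1 - prevalence)    # false positives
    return tp / (tp + fp), tn / (tn + fn)


# Reported sensitivity and specificity; prevalence of 0.17 is an assumption.
ppv, npv = predictive_values(0.86, 0.63, 0.17)
print(round(ppv, 2), round(npv, 2))  # 0.32 0.96
```

Under that assumed prevalence, the calculation reproduces the reported PPV of 0.32 and NPV of 0.96, which illustrates why a sensitive but only moderately specific screen yields many false-positive results when low achievement is relatively uncommon.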
For assessment of test-retest reliability, teachers completed APQs on 24 students at the initial time point, and an APQ was collected 3 weeks later for 100% of these students. κ for the scoring algorithm was acceptable at 0.743. Responses that were based on categorization by using the scoring algorithm on the APQ were the same at baseline and the 3-week retest for 88% of the students.
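For readers unfamiliar with the κ statistic, the following minimal sketch shows how it corrects observed agreement for the agreement expected by chance. The function name is hypothetical, and the ratings are invented: the study does not report the underlying cross-tabulation, so the 24 baseline and retest classifications below are one arbitrary arrangement chosen only to show how 21 of 24 agreements (about 88%) can correspond to a κ near the reported value. They are not the study's data.

```python
from collections import Counter
from typing import Hashable, Sequence


def cohen_kappa(first: Sequence[Hashable], second: Sequence[Hashable]) -> float:
    """Cohen's kappa for two sets of categorical ratings of the same subjects."""
    if len(first) != len(second) or len(first) == 0:
        raise ValueError("ratings must be non-empty and of equal length")
    n = len(first)
    # Proportion of subjects given the same category on both occasions.
    observed = sum(a == b for a, b in zip(first, second)) / n
    # Agreement expected by chance from the marginal category frequencies.
    c1, c2 = Counter(first), Counter(second)
    expected = sum(c1[k] * c2[k] for k in set(c1) | set(c2)) / (n * n)
    if expected == 1.0:
        return float("nan")  # kappa is undefined when only one category is used
    return (observed - expected) / (1 - expected)


# Invented screen results (True = summed score > 4) for 24 students.
baseline = [True] * 11 + [False] * 13
retest = [True] * 8 + [False] * 3 + [False] * 13

print(round(cohen_kappa(baseline, retest), 3))  # 0.743 for this invented table
```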
This study assessed the ability of a brief teacher-completed questionnaire to detect academic underachievement in children who are being evaluated for ADHD. There are no validated academic rating scales to assist physicians in identifying children who have symptoms of ADHD and may have learning problems. Results of this study demonstrate that using only 2 questions from the APQ, 1 about math and 1 about reading, as a screen produces a test with acceptable test-retest reliability and sensitivity. The sensitivity of this measure in our population is particularly impressive in that we excluded the children whom teachers would most easily detect (those already receiving assistance for learning problems in special education).
The results of our study are consistent with previous research that has suggested that teachers can accurately rate students’ academic performance9,10 but differ somewhat from Glascoe’s study, which found that teacher ratings on a 5-point Likert scale had a sensitivity of 0.49 to 0.61 and a specificity of 0.84 to 0.85 for detecting low math and reading achievement; however, Glascoe’s study evaluated the use of teacher ratings in detecting low academic achievement for all students, whereas we evaluated the ratings in the context of evaluating children with concerns about ADHD. The higher sensitivity found in our study may be partly attributable to teachers’ being more sensitive to academic difficulties in a population already identified as having problematic classroom behaviors. The lower specificity may relate to the difficulty of distinguishing between academic skills deficits and problems with attention and impulsivity as a cause for poor classroom performance. Another reason for the discrepant findings may be the difference in response options between the 2 rating scales. Glascoe’s measure used a 5-point Likert scale (far above average to far below average) to rate students’ academic performance, whereas the APQ used a 4-point scale for the 2 items included in the predictive model.
This study documents the importance of physicians’ screening for low academic achievement in the evaluation of children for ADHD. In this sample, 17% of the children who were not receiving special education assistance in school were performing poorly in reading and/or math. It is interesting that our population had significantly more children with low achievement in math (13%) as compared with reading (7%). Because the frequency of reading and math learning problems generally is similar in children with ADHD,15 the higher prevalence of math problems in this sample, which excluded students who previously were identified by their school districts for special education services, may suggest that schools detect or intervene more quickly for children with reading problems than for children with math problems.
Clinicians who use the APQ as part of the evaluation of children for ADHD must carefully consider the implications of its good sensitivity but only moderate specificity. In its current form, a cut point of >4 on the APQ would be expected to have a good NPV (few false-negative results) but a low PPV (many false-positive results). Thus, many of the children who are referred for additional evaluation would not be found to have low academic achievement. The APQ may function well as an initial screener; for children who screen positive, it may be sensible to administer an additional screener before a complete psychoeducational assessment is performed. This additional assessment could be completed by a school-based prereferral (for special education) intervention team,16 or, in the context of developmental-behavioral pediatric practice, may include administration of brief office-based academic testing.
The results of this study should be considered in the context of the following limitations. First, the children in this study did not have a full psychoeducational assessment. For clinical efficiency, only the 2 subtests of the WIAT-II that best correlate with overall reading and math scores were selected. A more complete academic assessment may have changed the classification of some children who scored near the cutoff. In addition, we did not assess other important skills, such as writing, spelling, and phonics. Thus, the ability of the APQ questions to detect children with low achievement in these areas could not be assessed.
Second, this study reports use of the APQ in a specialized clinic setting. Before the APQ can be recommended for incorporation into primary care practice, its psychometric properties in these settings should be evaluated. Furthermore, study children were generally of white background and from families of moderate to high socioeconomic status. Additional assessment of the APQ in more diverse settings is needed.
Finally, we assessed the ability of the APQ to detect low academic achievement and did not assess its ability to detect learning disabilities by using an IQ achievement discrepancy model. Whereas previous definitions of learning disabilities required a discrepancy between results of IQ and academic achievement testing, reauthorization of the Individuals With Disabilities Education Act in 2004 called for replacement of the traditional intellectual and achievement discrepancy model with use of a process that is based on a child’s response to evidence-based intervention.6,17 Furthermore, the Individuals With Disabilities Education Act of 2004 identifies regular classroom teachers as key members of the group to determine eligibility for a learning disability classification, and achievement below expected age or grade level is an acceptable initial criterion for implementation of early interventions.17 Nevertheless, we recognize that a questionnaire that rates a child’s relative academic performance would not accurately identify children with high IQs and discrepant achievement.
Despite these limitations, we believe the APQ represents a significant step forward in identifying methods to assist physicians in screening for learning problems in children who are being assessed for ADHD. Although the ADHD Toolkit8 includes a teacher-report measure with 3 items for assessing academic skills, the reliability and predictive validity of these items have not been established. This study supports the use of the APQ as a teacher-report screening tool for assessing learning problems among children who present with problems related to ADHD.
The APQ may be a useful initial screening tool for assessing learning problems among children who present with symptoms of ADHD or other school problems. Before the APQ can be implemented as a primary care screening tool, additional research is needed to confirm its predictive validity in a primary care setting assessing children with a diverse range of demographic characteristics.
ADHD is increasingly managed by PCPs and has many comorbidities, including learning disorders. Parent- and teacher-completed rating scales are recommended for diagnosing ADHD, but many PCPs have difficulty screening for learning problems in children with ADHD symptoms.
We developed a scoring algorithm for a brief teacher-completed screening questionnaire and assessed its sensitivity, specificity, predictive value, and test-retest reliability. The questionnaire may improve screening for learning problems in pediatric practice.
This research project was supported by projects 33463 and T77 MC 00012 from the Maternal Child Health Bureau (Title V, Social Security Act), Health Resources and Services Administration, Department of Health and Human Services.
We thank Abbas Jawad, PhD, for suggestions regarding statistical analysis for this study. We also thank the faculty of the University of Pennsylvania Public Health program for feedback. Finally, we thank the teachers and administrators of the Edgewater Park School District for participation in this study.
FINANCIAL DISCLOSURE: The authors have indicated they have no financial relationships relevant to this article to disclose.