Of 487 citations entered into broad screening, 224 were considered potentially relevant; their full reports were retrieved, and these were assessed formally for relevance. Reasons for excluding reports62 (n = 145) from this last phase included nonbehavioural outcome (75), no randomization (27), an unsystematic (e.g., a single scale) diagnostic procedure (17), no placebo control (9), participants received an excluding co-diagnosis (7), some participants received stimulants other than methylphenidate (5), n-of-1 study (2), follow-up report (2) and not all participants received an ADD diagnosis (1).
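As a quick arithmetic check of the screening flow described above, the following minimal sketch tallies the exclusion reasons and subtracts them from the retrieved reports. The counts are taken from the text; the variable names are illustrative only.

```python
# Exclusion counts as reported in the text (reason -> number of reports).
exclusions = {
    "nonbehavioural outcome": 75,
    "no randomization": 27,
    "unsystematic diagnostic procedure": 17,
    "no placebo control": 9,
    "excluding co-diagnosis": 7,
    "stimulants other than methylphenidate": 5,
    "n-of-1 study": 2,
    "follow-up report": 2,
    "not all participants received an ADD diagnosis": 1,
}

retrieved = 224                      # full reports assessed for relevance
excluded = sum(exclusions.values())  # total excluded in the last phase
relevant = retrieved - excluded      # reports carried forward

print(f"excluded = {excluded}, relevant = {relevant}")
```

The totals agree with the 145 excluded and 79 relevant reports stated in the text.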
From the remaining 79 relevant reports, we identified 62 unique trials. All reports were published between January 1981 and December 1999 and exclusively in English-language journals. These reports are listed in Appendix 2 (available on the CMAJ Web site at www.cma.ca/cmaj/vol-165/issue-11/pdf/deficitappendix.pdf). Crossover designs predominated (83.9%), yet only 13.5% of these evaluated a carryover effect.
The trials that we reviewed included 2897 participants. Only 11% of studies randomized more than 80 participants; the mean sample size was 46.7 (range 11–234). The median age of all trial participants was 8.7 years (range 2.4–18 years); these data are taken from 52 trials. The median “percent male” composition for the trials included in the meta-analysis was 88.1% (range 0%–100%); these data are taken from 59 trials.
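The descriptive figures above can be related to one another with simple arithmetic. The sketch below is illustrative only: the trial counts for crossover designs are back-calculated from the reported percentages and are not stated in the text, and `pct_to_count` is a hypothetical helper name.

```python
TRIALS = 62          # unique trials identified
PARTICIPANTS = 2897  # participants across all trials

# Mean sample size per trial, as reported in the text.
mean_sample_size = PARTICIPANTS / TRIALS

def pct_to_count(percent, total):
    """Convert a reported percentage back to an approximate whole count."""
    return round(percent / 100 * total)

# Back-calculated (assumed) counts: crossover trials, and the crossover
# trials that evaluated a carryover effect.
crossover_trials = pct_to_count(83.9, TRIALS)
carryover_evaluated = pct_to_count(13.5, crossover_trials)

print(f"mean sample size = {mean_sample_size:.1f}")
print(f"crossover trials = {crossover_trials}, "
      f"with carryover evaluated = {carryover_evaluated}")
```

Under these assumptions, the mean rounds to the reported 46.7, and the percentages correspond to roughly 52 crossover trials, of which about 7 evaluated a carryover effect.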
The number of trials that included participants with a homogeneous primary diagnosis was 45 (72.6%); they included the following: ADHD (38.7%) and ADDH (33.9%). The DSM-III-R and DSM-III diagnostic systems guided 38.7% and 45.1% of trial methodologies respectively.
In 59.7% of the trials, a comorbid condition was identified. Of these, externalizing disorders were either the most (67.7%) or second most frequently diagnosed comorbidity (19.1%). In 19 trials, patients received a homogeneous primary diagnosis without any comorbidity being reported or evaluated (i.e., 12 and 7 with ADDH and ADHD respectively).
Interventions lasted, on average, 3.3 weeks (range 2 days to 28 weeks). Only 9 RCTs (14.5%) had interventions that lasted longer than 4 weeks, and more than half of the 62 trials lasted no more than 10 days. One study lasted 12 months, yet only a subset of participants was followed beyond 4 months.63,64
The most frequently used dose-related conventions included the following: 5-mg, 10-mg, 15-mg versus 20-mg dose potency contrast (n = 6 trials), 0.3 mg/kg dose potency (irrespective of potency contrasts) (n = 26), and a twice-a-day dosing schedule (n = 39). Across the 62 RCTs, 38 different types of methylphenidate intervention (i.e., combinations of dose potency contrast by schedule: for example, 0.3 mg/kg twice a day versus 0.6 mg/kg twice a day) were employed.
A co-intervention (e.g., a summer treatment program, with behavioural regimens) was provided in 25.8% of the studies, whereas in 11.3% of the RCTs, participants continued to receive a pretrial intervention (e.g., a specific special education service).
No trial report received the maximum Jadad total quality score of 5 points; 5 studies (8.1%) achieved a score of 4, and 38.7% were judged to be of low quality (0–2 points). The mean total quality score was 2.6 (low). Only 5 (8.1%) trials reported adequate allocation concealment, with the remaining descriptions deemed “unclear.”
Analyses of the primary efficacy outcomes
HI-T data (effect size 0.78, 95% CI 0.64–0.91) and HI-P data (effect size 0.54, 95% CI 0.40–0.67) exhibited significant effects in favour of methylphenidate. These effects represent decreases of 6 and 4 points on the respective HI scales from a mean of 14 in patients who received placebo within the largest trial of 161 participants.65
These changes are “interpreted” in terms of the largest trial that provided HI data, because it is reasonable to assume that it would be the least biased.
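To illustrate how the reported confidence intervals relate to statistical significance, the following minimal sketch back-calculates a standard error and z statistic from each interval. It assumes the intervals are symmetric, normal-theory 95% intervals (an assumption, not something the text states), and `se_from_ci` is a hypothetical helper name.

```python
def se_from_ci(lower, upper, z_crit=1.96):
    """Approximate standard error implied by a symmetric 95% CI (assumed)."""
    return (upper - lower) / (2 * z_crit)

# Effect sizes and 95% CIs as reported in the text.
hi_t_effect, hi_t_ci = 0.78, (0.64, 0.91)
hi_p_effect, hi_p_ci = 0.54, (0.40, 0.67)

hi_t_se = se_from_ci(*hi_t_ci)
hi_p_se = se_from_ci(*hi_p_ci)

# z statistic: the effect size divided by its implied standard error.
hi_t_z = hi_t_effect / hi_t_se
hi_p_z = hi_p_effect / hi_p_se

print(f"HI-T: SE = {hi_t_se:.3f}, z = {hi_t_z:.1f}")
print(f"HI-P: SE = {hi_p_se:.3f}, z = {hi_p_z:.1f}")
```

Both z statistics are well above 1.96, which is consistent with the intervals excluding zero.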
Fig. 1: Effect sizes for the hyperactivity index: teacher ratings (95% confidence interval [CI]). MPH = methylphenidate, ACTRS = Abbreviated Conners Teacher Rating Scale, CTRS = Conners Teacher Rating Scale, SSQ = School Situations Questionnaire. p values ...
Fig. 2: Effect sizes for the hyperactivity index: parent ratings (95% CI). ACPRS = Abbreviated Conners Parent Rating Scale, CPRS = Conners Parent Rating Scale, HSQ = Home Situations Questionnaire. p values for statistical heterogeneity: *p < 0.001, ...
Analyses of additional efficacy outcomes: core features, clinical response and related features data
Teacher-reported results similar to those seen for the primary outcome were recorded for clinical response (teacher or staff), global indices, core features and key externalizing features. However, the magnitudes of effect were variable and tended to be smaller than for the primary outcome; for both attention and emotional lability, the effect was not statistically significant.
Methylphenidate exhibited significant, though variable and weaker, effects on parent-reported clinical response, global indices, core features and key externalizing features. Estimates regarding inattention, hyperactivity/impulsivity and oppositional–defiant behaviour were not statistically significant.
Minimal evidence of statistical heterogeneity was observed. Only 4 of 16 teacher outcome analyses and 3 of 13 parent outcome analyses were associated with significant levels of statistical heterogeneity. Insufficient comparability in tic-related outcomes precluded the analysis of teacher and parent data.
We contrasted the results reported by teachers with those reported by parents. Even though the changes in methylphenidate effect were more pronounced for teacher-reported data, trial quality and efficacy were inversely related for both informants. The initial HI-T and HI-P estimates of effect decreased by 9.0% and 5.6% for high-quality trials respectively, yet increased by 26.9% and 24.1% for low-quality studies. Reports with adequate allocation concealment increased the methylphenidate effect by 39.7% and 13% for teacher and parent data respectively. All effects remained statistically significant.
Fig. 3: Sensitivity and subgroup analyses of the hyperactivity index: teacher ratings (95% CI). ADHD = attention-deficit hyperactivity disorder (DSM-III-Revised),58 ADDH = attention-deficit disorder with hyperactivity (DSM-III).48 p values for statistical ...
Fig. 4: Sensitivity and subgroup analyses of the hyperactivity index: parent ratings (95% CI). *Variance from each trial reduced by 20% to compensate for correlation in crossover phases potentially leading to an underestimate of variance.
Our analyses of study design revealed that only the 2 methylphenidate effects for crossover trials were statistically significant. When we reanalyzed both teacher and parent data following the variance adjustment for crossover designs, the estimates remained virtually unchanged. Only the adjusted HI-T estimate displayed evidence for statistical heterogeneity.
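One way to see why a uniform variance adjustment can leave a pooled estimate essentially unchanged is sketched below. This is a minimal illustration under stated assumptions, not the authors' method: it uses fixed-effect inverse-variance pooling, and the trial effects and variances are hypothetical values invented for the example. Only the 20% variance reduction comes from the article (Fig. 4 footnote).

```python
import math

def pool_fixed(effects, variances):
    """Fixed-effect inverse-variance pooled estimate and its standard error."""
    weights = [1.0 / v for v in variances]
    total = sum(weights)
    estimate = sum(w * e for w, e in zip(weights, effects)) / total
    return estimate, math.sqrt(1.0 / total)

# Hypothetical crossover trials (NOT data from the review).
effects = [0.9, 0.6, 0.8, 0.7]
variances = [0.10, 0.05, 0.08, 0.04]

# Reduce every trial's variance by 20%.
adjusted = [0.8 * v for v in variances]

est_raw, se_raw = pool_fixed(effects, variances)
est_adj, se_adj = pool_fixed(effects, adjusted)

print(f"raw:      estimate = {est_raw:.3f}, SE = {se_raw:.3f}")
print(f"adjusted: estimate = {est_adj:.3f}, SE = {se_adj:.3f}")
```

Because every weight is scaled by the same factor, the relative weights and hence the pooled point estimate do not change; only the standard error shrinks, by a factor of the square root of 0.8. If only some trials were adjusted, or a random-effects model were used, the estimate could shift slightly.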
Changes tended to be more pronounced for teacher data, and most remained statistically significant. The original HI-T estimates increased for both age categories, and the HI-P estimates increased only for older participants. For both informant sources, trials exclusively including males exhibited a stronger methylphenidate effect than those in which both sexes were represented. Only the HI-P estimate for mixed-sex trials did not increase relative to the initial methylphenidate estimate.
For all categories of primary diagnosis, and for both informants, we observed statistically significant methylphenidate effects. For ADHD diagnosis studies, both the HI-T and HI-P estimates were lower than the initial methylphenidate effect estimates. The strongest teacher and parent effects were seen for ADDH and mixed diagnosis trials respectively.
For teacher data, methylphenidate effect magnitude and dose potency appeared to be positively related, suggesting a possible trend.
Each of the HI-T and HI-P methylphenidate effect sizes was positively related to intervention length for interventions lasting no more than 4 weeks. In both cases, the estimate for 1-week interventions was lower than the initial estimate, and the estimate for 2–4-week interventions was higher. The trials that lasted more than 4 weeks entailed parallel designs.
For teacher-reported data, the methylphenidate effect was lower than the original estimate in trials with a co-intervention and slightly higher in trials without one. Trials without a co-intervention maintained the original HI-P estimate.
Two instances of HI-T-related statistical heterogeneity were observed in analyses involving dose potency. No HI-P subgroup analyses were associated with evidence for statistical heterogeneity.
The rank correlation test revealed significant inverse relations for HI-T (r = –0.45, p = 0.004) and HI-P (r = –0.36, p = 0.04). The graphical test confirmed substantial funnel plot asymmetry for HI-T (asymmetry 1.07, p < 0.001) and HI-P (asymmetry 0.72, p < 0.001). Regarding teacher data, it was estimated that 8 trials (95% CI lower bound 5) showing a lesser or nonsignificant methylphenidate effect might have been suppressed. The adjustment decreased the initial teacher estimate by 21% (to 0.62). For HI-P data, evidence of publication bias led to a 3.7% decrease in the original estimate (to 0.52). It was estimated that one trial (95% CI lower bound 0) that showed a lesser or nonsignificant methylphenidate effect might have been suppressed.
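The percentage decreases quoted above follow directly from the initial and adjusted effect sizes. The short sketch below reproduces the arithmetic; `pct_decrease` is an illustrative helper name, and the inputs are the estimates reported in the text.

```python
def pct_decrease(initial, adjusted):
    """Relative decrease from the initial estimate, in percent."""
    return 100 * (initial - adjusted) / initial

# Initial and publication-bias-adjusted effect sizes from the text.
hi_t_change = pct_decrease(0.78, 0.62)
hi_p_change = pct_decrease(0.54, 0.52)

print(f"HI-T decrease = {hi_t_change:.1f}%")
print(f"HI-P decrease = {hi_p_change:.1f}%")
```

The results, about 20.5% and 3.7%, agree with the reported 21% and 3.7% once rounding of the published estimates is taken into account.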
Fig. 5: (A) Funnel plot of methylphenidate effect size for the hyperactivity index (T) versus its precision. The trials' symbols are proportional to sample size (median 32, range 11–161). (B) Funnel plot of methylphenidate effect size for the ...
Analyses of adverse events
We observed that almost all the differences between methylphenidate and placebo were in favour of a higher percentage of adverse events while participants were taking methylphenidate. The largest statistically significant divergences were exclusively in favour of methylphenidate and entailed parent/self ratings: all related decreased appetite (30.3%, 95% CI 18.0–42.6), insomnia (17.0%, 95% CI 8.3–25.8) or stomach ache events (9.0%, 95% CI 1.2–16.9) and serious decreased appetite events (8.7%, 95% CI 3.6–13.9). Teacher/staff ratings for serious decreased appetite events (6.1%, 95% CI 0.2–12.0) also showed a statistically significant difference in favour of methylphenidate. Children were more likely to experience anxiety (reported by parent/self) and headache (reported by teacher/staff) events while on placebo,54 although neither of these results achieved statistical significance.
Overall, the number-needed-to-harm data for “all related” adverse events exhibited a wide variation: for example, 4 (decreased appetite) to 22 (headache) for parent/self data compared with 40 (decreased appetite) to 367 (drowsiness) for teacher/staff data. For serious adverse events, a narrower range characterized the parent/self-related number-needed-to-harm estimates (i.e., 14 [decreased appetite] to 26 [insomnia]), and only one teacher/staff-indexed number needed to harm (i.e., decreased appetite) could be derived.
Prominent number-needed-to-harm results were derived exclusively from parent/self ratings of 5 specific adverse events. Only 4 and 7 study participants receiving methylphenidate, respectively, were required for a decreased appetite and insomnia-related event to be identified. Three other important number-needed-to-harm values were observed: all stomach ache events (number needed to harm = 9), all drowsiness events (number needed to harm = 10) and all dizziness events (number needed to harm = 11).
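As general background, a number needed to harm is conventionally the reciprocal of an absolute risk difference, rounded up to a whole person. The sketch below applies that textbook definition to the parent/self-rated decreased appetite difference of 30.3% reported earlier; `nnh` is a hypothetical helper name. The values in the text may have been derived differently (for example, from pooled estimates), so this reciprocal is only a rough guide and does not claim to reproduce every reported figure.

```python
import math

def nnh(risk_difference):
    """Number needed to harm: reciprocal of the absolute risk difference,
    rounded up. Expects a proportion (e.g., 0.303 for 30.3%)."""
    if risk_difference <= 0:
        raise ValueError("risk difference must be positive")
    return math.ceil(1 / risk_difference)

# Parent/self-rated "all related" decreased appetite: 30.3% difference.
appetite_nnh = nnh(0.303)

print(f"decreased appetite NNH = {appetite_nnh}")
```

For this adverse event, the simple reciprocal gives 4, matching the value reported for decreased appetite.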
We found no evidence of statistical heterogeneity associated with any of the methylphenidate effects or prominent number-needed-to-harm values.