To investigate whether aerobic fitness and obesity in school children are associated with standardized test performance.
1,989 ethnically diverse fifth, seventh and ninth graders attending California schools comprised the sample. Aerobic fitness was determined by a one-mile run/walk test; BMI was obtained from state-mandated measurements. California standardized test scores were obtained from the school district.
Students whose mile run/walk times exceeded California Fitnessgram standards or whose BMI exceeded CDC sex- and age-specific body weight standards scored lower on California standardized math, reading and language tests than students with desirable BMI status or fitness level, even after controlling for parent education among other covariates. Ethnic differences in standardized test scores were consistent with ethnic differences in obesity status and aerobic fitness. BMI-for-age was no longer a significant multivariate predictor when covariates included fitness level.
Low aerobic fitness is common among youth and varies among ethnic groups, and aerobic fitness level predicts performance on standardized tests across ethnic groups. More research is needed to uncover the physiological mechanisms by which aerobic fitness may contribute to performance on standardized academic tests.
Schools have been ambivalent about addressing student obesity and lack of physical fitness because these health conditions are thought to be only tangentially related to academic achievement. Optimizing student academic achievement has typically been seen as a primary goal for school boards. The suggestion that physical activity and other lifestyle behaviors may affect brain functions such as learning, memory and decision-making is largely untested. Evidence is beginning to emerge, however, suggesting that childhood obesity and fitness may influence learning and measured academic performance (3, 4). In counseling parents about consequences of their child’s weight status, it may be helpful for pediatricians to be able to address the evidence for a possible link between academic achievement and a child’s body weight. Moreover, the lack of opportunities for students to engage in physical activity in the school system and the lack of measures of fitness as vital sign in pediatric medicine may contribute to this ambivalence and lack of translation of knowledge about childhood health from physician to parent.
An initial step in this process is to investigate associations between objective indices of learning and fitness and/or obesity. Along these lines, Datar et al (5) suggested that first grade children’s standardized test scores may be associated with obesity; however, their results became statistically non-significant after socioeconomic and behavioral characteristics were included as covariates. The study did not include an objective measure of physical fitness (5). The present study investigated relationships between state-mandated measures of aerobic fitness, BMI and academic performance (i.e. standardized test scores) obtained from an ethnically diverse sample of elementary, middle, and high school children from a Southern California school district.
Data were collected from 2,703 youth enrolled in public schools, including 10 elementary, 2 middle and 2 high schools during the spring. These students were evaluated as part of the statewide mandated physical performance testing (Assembly Bill 265, Education Code Section 2, Chapter 6, Section 60800). By law, California public school districts must assess all fifth, seventh, and ninth graders annually for physical fitness and body weight. To facilitate analyses, some students otherwise eligible for inclusion were excluded from analyses because they were missing data on one or more of the study measures (N = 707). The study participants were 749 fifth, 761 seventh and 479 ninth graders, comprising 1,012 males and 977 females attending a middle-to-high income Southern California school district in 2002–03. The demographic characteristics of the 1,989 students included in the analyses resembled the 2,696 who were eligible (Table I). Fifth graders comprised 37.7% of the analytic sample but only 31% of the population, indicating some over-representation of fifth graders. Ninth graders comprised 24.1% of the analytic sample but 32.5% of the population, indicating some under-representation of ninth graders. Departures from representativeness involving ethnicity and sex were negligible, with differences in proportions of ethnic composition between the analytic sample and the population differing by less than one percent for every major ethnic group, except for African Americans, where the difference was 1.4%. The sex composition of the analytic sample was identical to the sex composition of the population. Ethnicity was categorized as African American, Asian/Pacific Islander (including Filipinos, Asians and Pacific Islanders), Hispanic, and Non-Hispanic whites.
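The sample composition figures quoted above can be re-derived with a few lines of arithmetic. The sketch below uses only counts stated in the text; the variable and function names are illustrative. Note also that 2,703 − 707 − 7 = 1,989, which would reconcile the enrolled, eligible (2,696) and analytic counts if the seven American Indian students dropped in the statistical analysis account for the difference; that reading is an inference, not something the text states directly.

```python
# Counts as reported in the text for the analytic sample.
grade_counts = {"fifth": 749, "seventh": 761, "ninth": 479}
sex_counts = {"male": 1012, "female": 977}
n_analytic = 1989


def share(count, total):
    """Return count as a percentage of total, rounded to one decimal place."""
    return round(100.0 * count / total, 1)


# Grade and sex breakdowns should each sum to the analytic sample size.
grades_sum_ok = sum(grade_counts.values()) == n_analytic
sexes_sum_ok = sum(sex_counts.values()) == n_analytic

# Percentage of the analytic sample in each grade.
grade_shares = {grade: share(n, n_analytic) for grade, n in grade_counts.items()}

print(grades_sum_ok, sexes_sum_ok)
print(grade_shares)
```

The fifth and ninth grade shares computed this way agree with the 37.7% and 24.1% given in the text.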
Aerobic fitness, body weight and student demographic data were obtained from existing school records with personal identifiers removed except an arbitrary district identification number used to link Fitnessgram data, school district demographic data and standardized test score data. Parental education, child ethnicity and eligibility for free or reduced price lunch status were determined by parent self-report information collected by the school district. These data were taken from the California Department of Education website (http://www.ed-data.k12.ca.us). Missing data included refusals to provide such information. The Institutional Review Board at the University of California, Los Angeles approved the study protocol. “Fitnessgram” refers to a comprehensive battery of physical fitness assessments devised by the Cooper Institute for Aerobics Research (6) to assess a student’s overall health-related physical fitness. The Fitnessgram has been adopted by the State of California as the required physical performance test to be administered annually to all students in the fifth, seventh and ninth grades. Physical education staff of each school district must follow strictly the measurement protocol stipulated in the Fitnessgram manual for the assessment of body composition and physical fitness.
All testing took place within the school environment and was administered by the physical education staff. For the purposes of this study, only aerobic fitness assessments of the Fitnessgram were included. Aerobic capacity was measured with the state-approved Fitnessgram mile run/walk test. This approach has been validated as a field estimate of maximal oxygen uptake (VO2max) in both adults and children (7). For the Fitnessgram aerobic fitness assessment, groups of students ran around a flat quarter mile track four times and the mile time was recorded up to 15 minutes of run/walk time. Students who had not completed the mile in 15 minutes were assigned the maximum time of 15 minutes. Sex- and age-specific state standards were established for the mile run/walk times that students need to achieve in order to qualify as falling within the “Healthy Fitness Zone” (8).
Height and body weight measurements were taken using a stadiometer and balance beam scale (Detecto, Webb City, Missouri). The school district recalibrates the scales regularly to ensure accurate measures. BMI was calculated as weight (kg) divided by height squared (m²). For the calculation of sex-specific BMI-for-age percentiles, LMS parameters provided by the CDC (9) were used to generate z-scores of BMI values, which were then converted to sex-specific BMI-for-age percentiles. Obesity risk classification was determined for each student. We opted to use the CDC weight status cutpoints (10, 11), where the 85th–94th percentile category is termed “overweight” and the 95th+ percentile is termed “obese.” Recent policy statements now speak of the “obese child” rather than limiting the term “obese” to adults (12). Hence, BMI percentile categories included: <5th, ≥5th–84th, ≥85th–94th, and ≥95th, which correspond to the following classifications: “underweight,” “desirable weight,” “overweight,” and “obese,” respectively.
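The BMI-to-percentile steps described above can be sketched as follows. This is a minimal illustration, not the authors' code: it assumes the standard LMS (Box-Cox) transformation used with the CDC growth charts, z = ((BMI/M)^L − 1)/(L·S), and a normal distribution for the z-score. The L, M and S values in the usage lines are made-up placeholders; real values must be looked up in the CDC tables by sex and age.

```python
import math


def bmi(weight_kg, height_m):
    """Body mass index: weight in kilograms divided by height in meters squared."""
    return weight_kg / height_m ** 2


def lms_zscore(value, l, m, s):
    """Z-score from LMS parameters (Box-Cox power L, median M, coefficient of variation S)."""
    if l == 0:
        return math.log(value / m) / s
    return ((value / m) ** l - 1.0) / (l * s)


def percentile_from_z(z):
    """Convert a z-score to a percentile using the standard normal CDF."""
    return 100.0 * 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def weight_status(percentile):
    """Apply the percentile cutpoints named in the text."""
    if percentile < 5:
        return "underweight"
    if percentile < 85:
        return "desirable weight"
    if percentile < 95:
        return "overweight"
    return "obese"


# Hypothetical child and hypothetical LMS parameters (NOT actual CDC values).
example_bmi = bmi(45.0, 1.5)
example_z = lms_zscore(example_bmi, l=-2.0, m=20.0, s=0.1)
example_pct = percentile_from_z(example_z)
print(example_bmi, example_pct, weight_status(example_pct))
```

In this made-up case the child's BMI equals the reference median M, so the z-score is zero and the child falls at the 50th percentile.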
Four hundred and twelve study participants had also participated, four months earlier, in a health promotion program that also included assessment of BMI. The personnel collecting these measures were nurses hired and trained to collect anthropometric and medical data. This prior assessment of BMI permitted the investigators to gauge the reliability of BMI assessment in children over four months. After controlling for minor differences in ethnicity and socioeconomic characteristics between the Fitnessgram sample and the health promotion subsample, the partial correlation between the two BMI measures was 0.93, which indicated a reasonably high level of repeatability over four months in a pediatric population experiencing natural growth in body weight.
California Department of Education standardized test score data from the California Achievement Tests version 6 (CAT6) and the California Standards Tests (CST) were obtained from the district for both 2002 and 2003, covering math and reading (CAT6) or math and language (CST). CAT6 test scores are used to compare California students with those in other states and are expressed in percentiles ranging from 0% to 99%. The CST was used to categorize students as “far below basic,” “below basic,” “basic,” “proficient,” or “advanced.” The CST scores are specific to California content standards; the resulting categorizations are expressed in a range from 1 to 5. By law, all students attending public schools in California were required to complete the CAT6 and are required to complete the California Standards Tests (CST) by May of each year. These tests assess grade-appropriate achievement in math, reading and language arts and other disciplines. More information is available at: http://star.cde.ca.gov/star2004/aboutSTAR_programbg.asp.
Descriptive statistics are presented as means, standard deviations, and percentages. BMI-for-age z-scores and percentiles were created using the CDC growth chart-derived norms for sex and age (13). Hierarchical linear regression models were estimated using maximum likelihood, regressing 2002 standardized test score performance onto BMI-for-age z-scores and/or mile run/walk times. The initial model was a null model. With the inclusion of additional predictors, subsequent models controlled for the potentially confounding effects of income (reflected by student eligibility for free and/or reduced price school lunches), sex and ethnicity, the last of which was coded using dummy variables with Non-Hispanic Whites as the referent ethnic group. Hierarchical linear modeling was used to account for students being clustered in schools. Because some study measures had unacceptably skewed or kurtotic distributions, study participants were categorized into quintiles for mile time or BMI-for-age and test score performance. However, we also ran the analysis on raw data, and no differences in the relationships were noted. We conducted tests of linear trend by treating the quintile categories as continuous variables and assigning the median score to each category in unconditional regressions (14). For analysis purposes, only the major ethnic groups were compared; the seven students identified as American Indian were dropped from the analyses. All data were analyzed using the STATA 10.0® Statistical software package (College Station, Texas).
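The test of linear trend described above (group into quintiles, assign each group its median, treat that score as continuous) can be illustrated with a small sketch. The data below are invented for illustration, the function names are hypothetical, and the fit is plain least squares on a single predictor; it does not reproduce the authors' Stata models or account for clustering of students in schools.

```python
from statistics import mean, median


def quintile_labels(values):
    """Assign each value a quintile label 0..4 based on its rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    labels = [0] * len(values)
    for rank, index in enumerate(order):
        labels[index] = min(4, rank * 5 // len(values))
    return labels


def trend_slope(exposure, outcome):
    """Slope of outcome regressed on the median exposure of each quintile."""
    labels = quintile_labels(exposure)
    # Median exposure within each quintile, used as that quintile's score.
    medians = {
        q: median([e for e, lab in zip(exposure, labels) if lab == q])
        for q in set(labels)
    }
    x = [medians[lab] for lab in labels]
    x_bar, y_bar = mean(x), mean(outcome)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, outcome))
    return sxy / sxx


# Invented mile times (minutes) and test percentiles for ten students.
mile_times = [7, 8, 9, 10, 11, 12, 13, 14, 15, 15]
test_scores = [80, 78, 75, 70, 68, 65, 60, 58, 55, 52]

labels = quintile_labels(mile_times)
slope = trend_slope(mile_times, test_scores)
print(labels, slope)
```

A negative slope in this toy data corresponds to lower scores in slower quintiles; a formal trend test would additionally assess whether the slope differs from zero.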
Table I describes characteristics of the sample by sex, grade, and ethnicity. The sample included a similar number of female and male subjects. The ethnic distribution of the children was 59% non-Hispanic white, 27% Hispanic, 7% African American and 6% Asian/Pacific Islander. To investigate whether the Fitnessgram participants were demographically representative of the district, the distribution of the sample by ethnicity was compared with the ethnic distribution of the district. The distributions did not differ significantly by ethnicity (P > 0.15), but did vary by free and reduced price meal eligibility (P = .03), suggesting that the Fitnessgram participants were ethnically representative of the students comprising the school district but slightly better off, economically, than the school district as a whole (Table I). The frequency of parental completion of college was 61.5% and only 22.8% of the children were eligible for free or reduced price school lunches, indicating a school district with higher socioeconomic status than the average California public school district. The average California school district in 2002–2003 had 48.3% of students eligible for free or reduced price school lunches (obtained from http://www.cde.ca.gov/ds/sh/cw/filesafdc.asp).
Table II illustrates the mile time by healthy fitness zone standards, as established by the state of California (8). Physical education instructors were instructed to stop the Fitnessgram mile run/walk test at 15 minutes; some physical education instructors permitted students completing the final quarter mile to complete it in 16 minutes. Observed means are therefore underestimates, given that some students who were unable to complete a mile because of the time constraint would have completed it in more time if permitted. Sixty-five percent of students had a fitness level below recommended age-specific, sex-specific standards for mile time performance. Additionally, 64% of study participants had mile times that were slower than the norms recommended by the state of California. There was a similar percentage of girls and boys classified as being in the healthy fitness zone (P =0.48). Table II also depicts the relationship between ethnicity and achievement of California age-specific mile time performance. African American students were less likely to achieve California fitness standards than Asian American and non-Hispanic white students (ORAsians= .49, P = .04; ORWhites= .58, P = .005).
Obesity status and mean BMI-for-age percentiles (converted from z-scores for ease of interpretation) are also included in Table II. The combined prevalence of overweight and obese (BMI ≥85th percentile) was 31.8% for males and 27.7% for females (Table II). The prevalence of overweight or obesity varied by ethnicity: 16.3% of Asians, 22.1% of non-Hispanic whites, 40.6% of African Americans and 47.7% of Hispanics were so classified. The percentage of students classified as overweight or obese was greater among African Americans and Hispanics than among Asians or non-Hispanic whites (all comparisons P < 0.003, after Bonferroni correction). Additionally, Asian/Pacific Islanders had slightly lower BMI-for-age percentiles than non-Hispanic whites (F(1,11) = 10.65, P = .04). Hispanics and African Americans did not differ significantly from each other, nor did boys’ BMI-for-age differ on average from that of girls.
With respect to age-specific fitness standards, students who failed to run the mile in the appropriate time interval established as appropriate for each age and sex scored significantly lower on the CAT6 and CST math, reading and language California standards tests compared with those students who fell in the healthy fitness zone (Table III). Tests for linear trends revealed that decreasing quintiles of aerobic fitness scored progressively lower on CAT6 math and reading (linear trend, Pmath < .0001; Preading = .001) (Figure 1) and on CST math and language tests (linear trend, Pmath < .0001; Planguage < .0001) (Figure 2; available at www.jpeds.com).
Table III depicts test scores by BMI percentiles; as observed for mile run/walk time, those who exceeded both the 85th and 95th percentile for BMI-for-age scored significantly lower on the CAT6 and CST math, reading and language tests than those in the recommended range for BMI. Tests for linear trend showed that increasing quintiles of BMI-for-age percentile scores scored progressively lower on both CAT6 math and reading (linear trend, Pmath = .007; Preading = .028) (Figure 1) as well as CST math and language tests (linear trend, Pmath = .013; Planguage = .073) (Figure 2).
Sequential hierarchical linear regression models (Stata xtreg procedure) were used to regress CAT6 and CST test score performance measures onto the following predictors: 1) null model, 2) age, ethnicity, sex, and eligibility for free/reduced price school lunches, 3) the foregoing covariates and BMI-for-age z-scores, and 4) the foregoing covariates/predictors and mile run/walk time. Mile run/walk time was a significant predictor of standardized CAT6 2002 math test score performance such that the math score dropped 1.9 points (out of a possible 99) for every additional minute required to complete the one mile run/walk (b = −1.94, 95% CI = −2.37, −1.53) even when age, free or reduced price lunch status, sex, and ethnicity and BMI-for-age were included as covariates. Adding the demographic covariates to the null model reduced the intraclass correlation from .09 to .01 and explained 15.9% of the null model variance. Adding BMI-for-age to the demographic variables-augmented model decreased the null model variance only an additional 0.4%, although this change was still statistically significant (model difference likelihood ratio chi square (1) = 8.6, P = .003). The full model, including all of the foregoing covariates/predictors (including BMI-for-age) but also including a measure of the student’s performance on the 1-mile fitness test decreased the null model variance an additional 3.5% (model difference likelihood ratio chi square(1) = 81.6, P < .0001). With the inclusion of the fitness measure, the student’s BMI-for-age was no longer a significant contributor to the CAT6 2002 math test score. All major ethnic groups differed significantly from Whites in the full model, with Asian math scores higher than Whites’ scores (bAsian = 4.6, 95% CI = 1.44, 7.85), and Hispanic and African American math scores lower than Whites’ scores, respectively (bHispanic = −11.44; 95% CI = −13.74, −9.14; bAfrican American = −16.33, 95% CI = −19.70, −12.96).
The standardized CAT6 2002 reading test score dropped 1.1 points for every additional minute required to complete the one mile run/walk (b = −1.13, 95% CI = −1.56, −0.70) even when age, sex, ethnicity, free and reduced price lunch status, and BMI-for-age were included as covariates. Adding the demographic covariates to the null model reduced the intraclass correlation from .09 to .01 and explained 15.6% of the null model variance. Adding BMI-for-age to the demographic variables-augmented model decreased the null model variance only an additional 0.7%, although this change was still statistically significant (model difference likelihood ratio chi square (1) = 6.23, P = .013). The full model, which included all of the foregoing covariates/predictors (including BMI-for-age) plus a measure of the student’s performance on the 1-mile fitness test, decreased the null model variance an additional 1.3% (model difference likelihood ratio chi square (1) = 26.2, P < .0001). With the inclusion of the fitness measure, the student’s BMI-for-age was no longer a significant contributor to the CAT6 2002 reading test score. Asian ethnicity generally explained no additional variance in CAT6 reading test scores relative to Non-Hispanic whites, the referent ethnic group, but Hispanic and African American ethnicity did explain some additional variation: an 11.2 point drop for Hispanics (bHispanic = −11.23; 95% CI = −13.57, −8.89) and a 14.5 point drop for African Americans (bAfrican American = −14.53, 95% CI = −17.96, −11.12).
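The model comparisons above report likelihood ratio chi-square statistics with one degree of freedom. As a rough check on the reported P values, the upper-tail probability of a 1-df chi-square can be computed from the complementary error function, since a 1-df chi-square is the square of a standard normal variable. The sketch below is illustrative only; the function name and dictionary labels are assumptions, and the statistics are copied from the text.

```python
import math


def chi2_sf_1df(statistic):
    """Upper-tail probability for a chi-square statistic with 1 degree of freedom."""
    return math.erfc(math.sqrt(statistic / 2.0))


# Likelihood ratio chi-square statistics as reported in the text.
reported = {
    "math: adding BMI-for-age": 8.6,
    "math: adding mile time": 81.6,
    "reading: adding BMI-for-age": 6.23,
    "reading: adding mile time": 26.2,
}

p_values = {label: chi2_sf_1df(stat) for label, stat in reported.items()}

for label, p in p_values.items():
    print(f"{label}: P = {p:.2g}")
```

The values computed this way are in line with the reported P = .003 and P = .013 for BMI-for-age, and fall well below .0001 for the mile time comparisons.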
Similar findings were noted for CST math and language test performance (data not shown). These analyses also confirmed that the pattern of ethnic differences in standardized test scores (Asians and Non-Hispanic Whites > African Americans and Hispanics) was consistent with the pattern of ethnic differences in percent achieving recommended levels of BMI and aerobic fitness.
The most impressive findings were the consistency of positive associations between aerobic fitness and standardized test score performance, and the consistency of inverse associations between BMI-for-age and standardized test score performance. Even those children who were classified as overweight but not obese (i.e. >85th percentile but <95th) scored significantly lower than did desirable weight children. Because decreased socioeconomic status has been consistently associated with decreased standardized test scores (15), it is an obvious potential confounder of any association between obesity and standardized test performance. Controlling for age, socioeconomic status, sex and ethnicity did attenuate the significance of the relationships between aerobic fitness and standardized test scores and between BMI-for-age percentiles and standardized test scores. Nevertheless, associations remained significant. It appears that both BMI-for-age and performance on the one mile run/walk predict standardized test score performance, above and beyond the large amount of variance predicted by sex, ethnicity and socioeconomic status, but the remaining variance explained is small.
The findings presented here confirm and extend previous findings that aerobic fitness is associated with enhanced performance on standardized achievement tests. The extensions include generalization to ethnically and socioeconomically diverse students varying across primary and secondary school grades, using objective measures of both aerobic fitness and standardized test score performance. Additionally, the data suggest that the association of obesity status and test score performance may be mediated by fitness.
Our data indicate that the fitness level of nearly two-thirds of the students surveyed did not fall in the healthy fitness zone. These data are in agreement with the 2004 California Fitness Test data, where only 27% of more than 1.3 million students tested in grades five, seven and nine met fitness standards in all six examined variables. Decades of declining participation in physical education (16–18) have resulted in decades of declining student fitness levels (19). Sixty-one percent of children aged 9–13 years do not participate in any organized physical activity during their non-school hours (20). The result is that one-third of adolescents fail to achieve a recommended minimum of 30 minutes of moderate to vigorous physical activity three times per week (21, 22).
We also noted ethnic disparities in aerobic fitness, with both Hispanic and African American youth reporting slower mile run/walk performances than Whites and Asians. This finding is consistent with that of Beets et al (23), who reported a similar trend in all California students in 2002. Participation in vigorous activity is higher in Whites (67%) compared with Blacks (54%) and Hispanics (60%), and decreases with advancing grade. The higher BMI-for-age (Table II) presenting in African Americans and Hispanics may also contribute to their lower aerobic fitness levels.
We noted approximately 30% prevalence of overweight/obesity in the current study, with higher prevalence in Hispanic (48%) and African American (41%) students. Our data corroborate prior studies such as the California Children Healthy Eating and Exercise Practices Survey (calCHEEPS), where 32% of 4th and 5th graders were overweight or obese (24). When comparing the prevalence of overweight/obesity, our data for Non-Hispanic Whites and Asians are similar to those reported for all California students in 2002, and our data for Hispanics and African-Americans indicated higher prevalence rates than corresponding estimates for the state (23). This is of interest, given the high socio-economic status of the cohort as a whole, and agrees with prior data that even in higher socio-economic strata, obesity risk is increasing.
The mechanism(s) by which students with higher aerobic fitness and/or lower BMI-for-age might perform better on standardized academic achievement tests is unknown. Some studies have suggested that cognitive function may be impaired by obesity (25), low fitness (26, 27) and metabolic syndrome (28). Lifestyle behaviors such as everyday physical activity and food choices can affect both aerobic fitness and body weight. There may be links between these lifestyle behaviors and learning and objective academic performance. When adding a daily physical activity program to existing primary school curricula, there was no evidence of any loss of academic performance as measured by arithmetic and reading tests in spite of 45–60 minutes loss of formal teaching time each day (29–31). Mechanistic studies of cognitive function suggest a positive effect of physical activity on intellectual performance (32), although the relevance to children is unknown because most studies of mechanisms that might explain how physical activity affects brain function have been performed in adults to date.
There were several limitations in this study. One limitation is the mile run/walk test; the validity of the results assumes that all students exerted the maximum effort possible when completing this assessment. The validity of the results also assumes that the one mile run/walk test is a reliable surrogate of gold standard measures of aerobic fitness, a point that is disputed by some (33, 34). Additionally, given that excess adiposity may affect mile time, we cannot unequivocally state that the effect of fitness is independent of obesity status. Strict adherence to test administration and data collection procedures could not be confirmed and could have affected the reliability of the data; however, mile time and BMI assessment are measures that a state-certified physical education instructor would be qualified to perform. BMI is only a surrogate measure of body composition and cannot be used to differentiate between changes in lean and adipose tissue. For example, resistance training can elicit muscle hypertrophy, resulting in greater lean body mass for height. Future research might consider using waist circumference or body composition testing, rather than BMI-for-age, as a better predictor of obesity-related conditions. Results were limited to grades studied; however, there is reason to believe that what is true of fifth, seventh and ninth graders would also be true of children in other grades. The cross-sectional nature of the data precludes drawing causal inferences from the observed relationship between variations in fitness and variations in standardized test performance. Longitudinal research is needed to confirm whether changes in body composition or physical fitness over time explain variations in test scores and to shed light on the temporal mechanisms that may explain how physical fitness and body composition influence school children’s performance on standardized tests.
The current study suggests that even in a higher socioeconomic status cohort, low fitness and obesity are common. If future studies confirm a causal role for the influence of physical activity on both fitness and academic performance, schools will have to reverse their recent disinvestment in physical education, which was made ostensibly to boost student achievement.
Supported in part by a grant from the National Institute of Diabetes and Digestive and Kidney Diseases (1R01-DK063507). The sponsor had no role in the design of the study, in the collection, analysis, or interpretation of the data, or in the writing or submission of this paper.
We are grateful to the Director, Information Services, of the anonymous school district that participated in this study, for his work in transmitting to the investigators the data used in this report.
The authors declare no conflicts of interest.
Publisher's Disclaimer: This is a PDF file of an unedited manuscript that has been accepted for publication. As a service to our customers we are providing this early version of the manuscript. The manuscript will undergo copyediting, typesetting, and review of the resulting proof before it is published in its final citable form. Please note that during the production process errors may be discovered which could affect the content, and all legal disclaimers that apply to the journal pertain.