HHS Public Access; Author Manuscript; Accepted for publication in peer reviewed journal
Aging Ment Health. Author manuscript; available in PMC 2011 January 1.
Published in final edited form as:
PMCID: PMC2828360

Age and education effects and norms on a cognitive test battery from a population-based cohort: The Monongahela-Youghiogheny Healthy Aging Team (MYHAT)



Performance on cognitive tests can be affected by age, education, and selection bias. We examined the distribution of scores on several cognitive tests by age and educational level in a population-based cohort.


An age-stratified random sample of individuals aged 65+ years was drawn from the electoral rolls of an urban U.S. community. Those obtaining age- and education-corrected scores ≥ 21/30 on the Mini-Mental State Examination were designated as cognitively normal or only mildly impaired, and underwent a full assessment including a battery of neuropsychological tests. Participants were also rated on the Clinical Dementia Rating (CDR) scale. The distribution of neuropsychological test scores within demographic strata, among those receiving a CDR of 0 (no dementia), is reported here as cognitive test norms. After combining individual test scores into cognitive domain composite scores, multiple linear regression models were used to examine associations of cognitive test performance with age and education.


In this cognitively normal sample of older adults, younger age and higher education were associated with better performance in all cognitive domains. Age and education together explained 22% of the variation in memory, and less of the variation in executive function, language, attention, and visuospatial function.


Older age and lesser education are differentially associated with worse neuropsychological test performance in cognitively normal older adults representative of the community at large. The distribution of scores in these participants can serve as population-based norms for these tests, and be especially useful to clinicians and researchers assessing older adults outside specialty clinic settings.

Keywords: Neuropsychological tests, epidemiology, normative, community


As the world's population ages, concern is growing about the individual and public health impact of cognitive aging. Patients and families are increasingly concerned about memory loss and its implications, and clinicians are frequently required to undertake objective assessment of patients' cognitive functions. Researchers, both clinical and epidemiological, also need reliable cognitive assessment of older adults. While there is an abundance of neuropsychological tests available for use, a limiting factor is the absence of adequate norms for the elderly. Further, when tests that were normed on healthy younger adults are administered to older adults, the interpretation of test results can be challenging (Mitrushina, Boone, Razani, & D'Elia, 2005). It is well established that normal aging has an adverse impact on test performance, particularly on aspects related to speed of information processing, and that different cognitive domains are affected differently. Level and quality of education also influence cognitive test performance (Acevedo, Loewenstein, Agron, & Duara, 2007; Ganguli, Snitz, Vander Bilt, & Chang, 2009; Manly et al., 1999; Marcopulos, McLain, & Giuliano, 1997; Unverzagt et al., 1996). Further, there are potential cohort effects: the educational experiences that older adults had in their youth differ materially from those of subsequent generations, and the amount and quality of previous education can affect test performance. Normative data are increasingly available from older adults; however, these are often drawn from individuals attending specialty memory disorder clinics, or from exceptionally healthy older adults volunteering for research studies, and may not be representative of those in the community at large. Without norms based on representative samples of older adults, it becomes difficult to distinguish normal from pathological aging, and thus to embark on the diagnosis of disorders that affect cognition.

Here, we report neuropsychological test data from a cognitively normal sample of older adults participating in an epidemiological cohort study. We examined the distribution of test scores in the overall sample and across age and education subgroups, and also the associations of these demographic indices with performance within each cognitive domain.


Study area, sampling, and recruitment

The small-town area selected for the study surrounds the confluence of the Monongahela and Youghiogheny rivers, in southwestern Pennsylvania in the USA. The study cohort was named the Monongahela-Youghiogheny Healthy Aging Team (MYHAT). The steel industry was formerly the mainstay of the region's economy, which has remained depressed since that industry collapsed in the late 1970s. The older population of the area is stable, with low rates of in- and out-migration.

The electoral rolls are considered comprehensive and have the added advantage of being publicly available. Sampling ratios were derived to accrue a cohort of approximately 2,000 individuals with approximately equal numbers in the age intervals 65-74, 75-84, and 85+ years. To compensate for the relatively small number of individuals aged 85+, we oversampled those who were aged 80-84 and could be expected to age into the 85+ group during the study. Community outreach and recruitment procedures were approved by the University of Pittsburgh Institutional Review Board. Further details have been reported previously (Ganguli et al., 2009). Recruitment took place from 2006 to 2008. Entry criteria were (a) age 65 years or older, (b) living within the selected area, and (c) not already in a long-term care institution. Individuals were considered ineligible if they (d) were too ill to participate, (e) had severe vision impairment, (f) had severe hearing impairment, or (g) were decisionally incapacitated. Over the approximately two-year recruitment period, a total of 2036 individuals were recruited.

Assessment (overview)

A single-stage assessment was employed to avoid both delays and potentially non-random attrition between screening and definitive assessment stages (Prince, 2000). The Mini-Mental State Examination (MMSE) (Folstein, Folstein, & McHugh, 1975) was administered and scored on the spot, applying a standard correction for age and education (Mungas, Marshall, Weldon, Haan, & Reed, 1996). Fifty-four individuals scoring <21/30 (age-education corrected) were classified as having moderate to severe cognitive impairment and were therefore not part of the target population for the MYHAT study. These individuals were not assessed further. The remaining 1982 participants, who scored ≥21 on the age-education corrected MMSE, proceeded to the full assessment, which included several components. This report focuses on the neuropsychological assessment and the Clinical Dementia Rating.

Neuropsychological assessment

Cognitive functioning was assessed by the following test battery, categorized here according to the principal cognitive domain tapped by the tests.


Attention

Trail Making Test A (Reitan, 1955), Digit Span Forward (Wechsler, 1987)

Executive Function

Trail Making Test B (Reitan, 1955), clock drawing (Freedman et al., 1994), verbal fluency for initial letters P and S (Benton & Hamsher, 1989)


Language

Boston Naming Test (Kaplan, Goodglass, & Weintraub, 1978), verbal fluency for categories (animals) (Rosen, 1980), Indiana University Token Test (Unverzagt et al., 1996)


Memory

WMS-R Logical Memory (immediate and delayed recall) (Wechsler, 1987), WMS-R Visual Reproduction (immediate and delayed recall) (Wechsler, 1987), 3-trial Fuld Object Memory Evaluation with Semantic Interference (Fuld, 1981; Loewenstein et al., 2003)

Visuospatial Function

WAIS-III-Block Design (Wechsler, 1997)

Except for the visuospatial function domain, which includes only a single test, statistical modeling used composite scores created for each cognitive domain.

Clinical Dementia Rating

Participants were also rated by the interviewers on the Clinical Dementia Rating (CDR) scale (Hughes, Berg, Danziger, Coben, & Martin, 1982; Morris, McKeel, Fulling, Torack, & Berg, 1988). The rating was based on an assessment protocol composed of standardized questions, as well as observation, regarding the participant's daily functioning in the six areas of memory, orientation, judgment, home and hobbies, community affairs, and personal care. Note that these ratings are based not on neuropsychological test performance but on reports and observation of activities and functioning. Each of the six areas is rated as 0, 0.5, 1, 2, or 3, and a standard algorithm is used to generate a summary CDR rating of 0 (no dementia), 0.5 (possible dementia), 1.0 (mild dementia), 2.0 (moderate dementia), or 3.0 (severe dementia). The current report of normative data is restricted to the 1413 individuals who received CDR scores of 0.

Statistical Methods

Distributions of the test scores were summarized by the mean, standard deviation (SD), 5th percentile, and 50th percentile (median) for the sample as a whole, for age groups categorized as 65-74, 75-84, and ≥85 years, and for educational levels characterized as less than high school, high school graduate, and more than high school. Given that most test scores were not normally distributed, we report thresholds in terms of percentiles rather than standard deviations below the mean. Under a normal distribution, the conventional cutpoint of 1.5 SD below the mean is equivalent to the 6.7th percentile score, which in the skewed distributions reported here was close if not identical to the 5th percentile score. There is precedent for reporting 5th percentile thresholds (Benton, Sivan, Hamsher, Varney, & Spreen, 1994).
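The 6.7th percentile figure can be checked against the standard normal distribution. The following is a minimal sketch in Python (the study itself used SAS and Stata), assuming an exactly normal distribution, which most of these test scores do not follow; the function name is illustrative.

```python
import math

def normal_cdf(z: float) -> float:
    """Cumulative probability of a standard normal variable at z."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

# Percentile rank of a score 1.5 SD below the mean, under normality.
pct_below = 100.0 * normal_cdf(-1.5)
print(round(pct_below, 1))  # 6.7
```

In a skewed distribution the same SD-based cutpoint can land at a different percentile, which is one reason to report percentile thresholds directly.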

Cognitive domain composites were created by combining groups of individual cognitive tests. Each test score was first standardized by subtracting the test's mean and dividing by its standard deviation. For Trail Making Tests A and B, normative data are shown for time taken to complete the task, which is the standard score; however, the distribution of the time scores was markedly skewed. We therefore used the number of correct connections per second when creating the composites that include these tests. The arithmetic mean of the corresponding standardized scores was then calculated as the final composite score for each cognitive domain, except for the visuospatial domain, which comprises a single test (Block Design).
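The composite construction described above can be sketched as follows. This is an illustration only, not the authors' SAS or Stata code: the data are hypothetical, the function names are invented for the example, and the use of the sample (n - 1) standard deviation is an assumption, since the text does not say which SD formula was applied.

```python
from statistics import mean, stdev

def z_scores(values):
    """Standardize raw scores: subtract the mean, divide by the SD."""
    m, s = mean(values), stdev(values)  # sample SD is an assumption
    return [(v - m) / s for v in values]

def connections_per_second(correct_connections, seconds):
    """Trail Making rate score: higher is better, unlike completion time."""
    return correct_connections / seconds

def domain_composite(*standardized_tests):
    """Average the standardized scores across tests, participant by participant."""
    return [mean(person) for person in zip(*standardized_tests)]

# Hypothetical data for four participants (not MYHAT values).
trails_a = [(24, 30), (24, 48), (24, 60), (20, 80)]  # (correct connections, seconds)
trails_a_rate = [connections_per_second(c, t) for c, t in trails_a]
digit_span_forward = [8, 6, 7, 5]

attention = domain_composite(z_scores(trails_a_rate), z_scores(digit_span_forward))
print([round(a, 2) for a in attention])
```

Converting Trail Making times to a rate also aligns the direction of the score with the other tests, so that higher values mean better performance throughout the composite.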

Simple linear regression models were first fit to assess the relationships of each domain with age and education separately. Multiple linear regression models were then fit to assess the independent effects on each cognitive domain of age and education examined simultaneously (i.e., adjusting for each other). For the models, the sample size was restricted to the 1260 cognitively normal participants with complete data on all tests. Analyses were conducted using SAS version 9.1 (SAS Institute Inc, Cary, North Carolina) and Stata version 10 (StataCorp LP, College Station, Texas).


Cohort characteristics

As noted, an age-stratified random sample was recruited with larger sampling fractions in the older age-groups and deliberate oversampling of those aged 80-84. The overall mean (SD) age was 77.6 (7.4) years; 61.1% were women. The median educational level was high school graduate; 13.8%, 45.1%, and 41.1% respectively had less than high school education, were high school graduates, and had more than high school education. As regards race, 94.8% were White and 4.9% were Black, while the remainder were Asian or reported more than one race. The racial/ethnic breakdown of the cohort is largely representative of older adults in this region.

For the current report, the normative sample was restricted to 1413 individuals with age-education-corrected MMSE scores ≥ 21 and a Clinical Dementia Rating of 0. Their mean (SD) age was 76.8 (7.3) years, 63.6% were women, and 95.8% were White. Table 1 shows the distribution of the sample by age and educational level.

Table 1
Age and education distribution in study sample *

Table 2 reports the mean, standard deviation (SD), median (50th percentile), and 5th percentile scores on each test within nine groups (three age groups, each divided into three education groups). These data can serve as population-based norms.

Table 2
Cognitive Norms: Mean, standard deviations (SD), median (50th percentile) and 5th percentile scores on neuropsychological tests in MYHAT battery.
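A table of this kind can be produced by grouping scores by stratum and summarizing each group. The sketch below is illustrative only: the records are hypothetical, the function names are invented, and the percentile interpolation rule (Python's "inclusive" method) is an assumption, because the text does not state which percentile definition was used.

```python
from collections import defaultdict
from statistics import mean, median, quantiles, stdev

def summarize(scores):
    """Mean, SD, median, and 5th percentile of one stratum's scores."""
    return {
        "mean": mean(scores),
        "sd": stdev(scores),
        "median": median(scores),
        # n=20 splits the data into 5% steps; the first cut point is the 5th percentile.
        "p5": quantiles(scores, n=20, method="inclusive")[0],
    }

def norms_by_stratum(records):
    """records: iterable of (age_group, education, score) tuples."""
    strata = defaultdict(list)
    for age_group, education, score in records:
        strata[(age_group, education)].append(score)
    return {key: summarize(values) for key, values in strata.items()}

# Hypothetical scores for a single stratum (not MYHAT data).
records = [("65-74", "HS", s) for s in range(1, 21)]
table = norms_by_stratum(records)
print(table[("65-74", "HS")])
```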

In both univariable (simple linear regression) and multivariable (multiple linear regression) models, the reference group for age is the youngest (age 65-74) group; the two older age groups 75-84 years and 85+ years are each compared to the youngest group. For education, the reference group is the least educated (less than high school); the higher educated groups, of high school graduates and those with greater than high school education, are each compared to the least educated group.
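The reference-group coding described above corresponds to indicator (dummy) variables for each non-reference level. A minimal sketch, with hypothetical level labels and function names, showing how one participant's row of the design matrix would be built:

```python
AGE_LEVELS = ["65-74", "75-84", "85+"]  # first level is the reference group
EDU_LEVELS = ["<HS", "HS", ">HS"]       # first level is the reference group

def dummy_code(value, levels):
    """Return 0/1 indicators for every non-reference level."""
    if value not in levels:
        raise ValueError(f"unknown level: {value!r}")
    return [1 if value == level else 0 for level in levels[1:]]

def design_row(age_group, education):
    """Intercept plus indicators: [1, age 75-84, age 85+, HS, more than HS]."""
    return [1] + dummy_code(age_group, AGE_LEVELS) + dummy_code(education, EDU_LEVELS)

print(design_row("65-74", "<HS"))  # [1, 0, 0, 0, 0]
print(design_row("85+", "HS"))     # [1, 0, 1, 1, 0]
```

With this coding, each regression coefficient estimates the difference in the domain score between the named group and its reference group, holding the other factor fixed.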

Table 3 shows the results of the multivariable analysis. For each level of each covariate in each domain model, it reports the coefficient, standard error, confidence interval for the coefficient, t value, and P value (reflecting the statistical significance of the association), along with the model's adjusted R² (reflecting the percentage of variance explained). Multivariable analyses were restricted to 1260 individuals with complete data on all tests.
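For readers unfamiliar with the adjusted R², the usual definition penalizes the raw R² for the number of predictors. The sketch below assumes that conventional formula and a model with four predictors (two age indicators and two education indicators); the raw R² value is hypothetical and is not taken from Table 3.

```python
def adjusted_r2(r2: float, n: int, p: int) -> float:
    """Adjusted R^2 = 1 - (1 - R^2) * (n - 1) / (n - p - 1)."""
    if n - p - 1 <= 0:
        raise ValueError("need more observations than predictors plus one")
    return 1.0 - (1.0 - r2) * (n - 1) / (n - p - 1)

# Hypothetical raw R^2 with n = 1260 participants and p = 4 predictors.
print(round(adjusted_r2(0.2225, n=1260, p=4), 3))  # 0.22
```

With a sample this large relative to the number of predictors, the adjustment changes R² only in the third decimal place.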

Table 3
Results of multiple linear regression models of associations of each cognitive domain with age and education. N=1260 individuals with complete data on all tests.

In both univariable (simple linear regression, data not shown) and multivariable analyses (Table 3), older age and lesser education were significantly associated with worse performance in all cognitive domains. In the univariable models, age explained 19% of the variance in the memory domain, 16% of executive functioning, 14% of language, 12% of attention, and 9% of visuospatial function. Similarly, in the univariable models, education explained 7% of the variance in the memory domain, 7% of executive functioning, 7% of language, 4% of attention, and 7% of visuospatial function. In the multivariable models, age and education together explained 22% of the variance in memory, 19% of executive function, 18% of language, 13% of attention, and 13% of visuospatial function.


It is well-established that most neuropsychological tests are influenced by age and education (Acevedo et al., 2007; Ganguli et al., 1991; Manly et al., 1999; Marcopulos et al., 1997; Unverzagt et al., 1996), complicating the clinical interpretation of test performance by those at the extremes of age and education. In a large cohort of adults, representative of the economically depressed US community from which it was drawn, we have demonstrated here the significant associations of age and educational level with performance on a battery of standard neuropsychological tests, among individuals free of dementia. Greater age and lesser education were associated with inferior performance in all cognitive domains tapped by our neuropsychological battery. Age and education together explained the largest proportion (over a fifth) of the variance in memory, and somewhat less of the variance in the remaining domains.

We have also provided population-based norms by age and education on each test in our neuropsychological battery. These data should be of value to clinicians who administer neuropsychological tests to elderly patients and need to distinguish between normal and abnormal cognitive aging. For example, test performance at the fifth percentile, which is close to one and a half standard deviations below the mean, for the patient's level of age and education could alert clinicians to the possibility of cognitive impairment. In the absence of appropriate norms, it is difficult to determine whether or not a seemingly low test score is in fact within the expected range for a patient aged 86 years with less than high school education. Similarly, a score that appears “normal” might in fact be below the expected range for a patient aged 66 years with college education. The normative values should also help clinical and population researchers using these tests to select appropriate screening cutpoints for their own research.
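The clinical use described above amounts to comparing a patient's score with the 5th percentile cutoff for his or her own age and education stratum. A minimal sketch follows; the cutoff values are hypothetical placeholders and are NOT the Table 2 norms, and the function names are invented for the example.

```python
# Hypothetical 5th percentile cutoffs keyed by (age group, education).
# Placeholder values only; substitute the published norms for real use.
FIFTH_PERCENTILE = {
    ("65-74", ">HS"): 14,
    ("85+", "<HS"): 6,
}

def age_group(age: int) -> str:
    """Map age in years to the three age strata used for the norms."""
    if age < 65:
        raise ValueError("norms cover ages 65 and older only")
    if age <= 74:
        return "65-74"
    if age <= 84:
        return "75-84"
    return "85+"

def at_or_below_cutoff(score, age, education, cutoffs=FIFTH_PERCENTILE):
    """True when the score is at or below the stratum's 5th percentile.

    Assumes higher scores are better; reverse the comparison for timed tests.
    """
    return score <= cutoffs[(age_group(age), education)]

# The same raw score reads differently in different strata.
print(at_or_below_cutoff(10, 86, "<HS"))  # False
print(at_or_below_cutoff(10, 66, ">HS"))  # True
```

A flag of this kind only signals that further evaluation may be warranted; it is not a diagnosis.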

It can be challenging to directly compare norms across studies because of differences in study design and study sample as well as in the way norms are reported. For example, the Mayo Older Adults Normative Studies (MOANS) have reported norms on the Wechsler Memory Scale-Revised (WMS-R) for ages 56-94. For Immediate Recall of the Logical Memory task, the MOANS report scaled scores, for ages 62-72, with no educational breakdown, as follows: for the 41-59 percentile range, scores range from 24 to 26 (Ivnik, Malec, Smith, Tangalos, & Petersen, 1996). These scores are reported in a very different manner, but seem roughly comparable to MYHAT normative 50th percentile scores for age 65-74 of 21, 23, and 24 for those with less than high school, high school, and more than high school education respectively. Several groups have reported norms on the CERAD neuropsychological battery, including category fluency for animals. An Australian group reported mean (SD) scores ranging as high as 23.1 (5.6) among those with twelve or more years of education, and as low as 18.4 (5.7) among those with less than twelve years (Collie, Shafiq-Antonacci, Maruff, Tyler, & Currie, 1999). MYHAT mean (SD) scores among those aged 65-74 ranged from 17.0 for those with less than high school to 18.35 (4.52) for those with more than high school.

It should of course be recognized that norms from small-town community-based elders in Southwestern Pennsylvania cannot be generalized to all populations everywhere. In particular, this cohort included too few members of ethnic minorities to allow separate norms to be reported for them. Efforts should be made to generate normative values in any environment where a given test is likely to be employed, so that test performance can be interpreted appropriately.


The work reported here was supported by grants R01 AG023651, P50 AG005133, and K24 AG022035 from the National Institute on Aging, National Institutes of Health, and U.S. Department of Health and Human Services. The authors thank all MYHAT project personnel, and all MYHAT study participants, for their contributions to the study.


  • Acevedo A, Loewenstein DA, Agron J, Duara R. Influence of sociodemographic variables on neuropsychological test performance in Spanish-speaking older adults. Journal of Clinical & Experimental Neuropsychology: Official Journal of the International Neuropsychological Society. 2007;29(5):530–544.
  • Benton AL, Hamsher K. Multilingual Aphasia Examination Manual of Instructions. 2nd ed. Iowa City: AJA Associates; 1989.
  • Benton AL, Sivan AB, Hamsher KdeS, Varney NR, Spreen O. Contributions to Neuropsychological Assessment: A Clinical Manual. 2nd ed. Oxford University Press; 1994.
  • Collie A, Shafiq-Antonacci R, Maruff P, Tyler P, Currie J. Norms and the effects of demographic variables on a neuropsychological battery for use in healthy ageing Australian populations. Australian and New Zealand Journal of Psychiatry. 1999;33:568–575.
  • Folstein MF, Folstein SE, McHugh PR. Mini-Mental State: A practical method for grading the cognitive state of patients for the clinician. Journal of Psychiatric Research. 1975;12:189–198.
  • Freedman M, Leach L, Kaplan E, Winocur G, Shulman KI, Delis D. Clock drawing: A neuropsychological analysis. New York: Oxford University Press Inc.; 1994.
  • Fuld PA. Fuld Object-Memory Evaluation. Wood Dale, IL: Stoelting Company; 1981.
  • Ganguli M, Ratcliff G, Huff FJ, Belle S, Kancel MJ, Fischer L, et al. Effects of age, gender, and education on cognitive tests in a rural elderly community sample: norms from the Monongahela Valley Independent Elders Survey. Neuroepidemiology. 1991;10(1):42–52.
  • Ganguli M, Snitz B, Vander Bilt J, Chang CCH. How much do depressive symptoms affect cognition at the population level? The Monongahela-Youghiogheny Healthy Aging Team (MYHAT) Study. International Journal of Geriatric Psychiatry. 2009. doi: 10.1002/gps.2257.
  • Hughes CP, Berg L, Danziger WL, Coben LA, Martin RL. A new clinical scale for the staging of dementia. British Journal of Psychiatry. 1982;140:566–572.
  • Ivnik J, Malec J, Smith G, Tangalos E, Petersen R. Neuropsychological tests' norms above age 55: COWAT, BNT, MAE token, WRAT-R reading, AMNART, Stroop, TMT, and JLO. The Clinical Neuropsychologist. 1996;10:262–278.
  • Kaplan EF, Goodglass H, Weintraub S. The Boston Naming Test. Boston: E Kaplan & H Goodglass; 1978.
  • Loewenstein DA, Acevedo A, Schram L, Ownby R, White G, Mogosky B, et al. Semantic interference in mild Alzheimer disease: preliminary findings. American Journal of Geriatric Psychiatry. 2003;11(2):252–255.
  • Manly JJ, Jacobs DM, Sano M, Bell K, Merchant CA, Small SA, et al. Effect of literacy on neuropsychological test performance in nondemented, education-matched elders. Journal of the International Neuropsychological Society. 1999;5(3):191–202.
  • Marcopulos BA, McLain CA, Giuliano AJ. Cognitive impairment or inadequate norms? A study of healthy, rural, older adults with limited education. Clinical Neuropsychologist. 1997;11(2):111–131.
  • Mitrushina M, Boone KB, Razani J, D'Elia L. Handbook of normative data for neuropsychological assessment. 2nd ed. New York: Oxford University Press; 2005.
  • Morris JC, McKeel DW Jr, Fulling K, Torack RM, Berg L. Validation of clinical diagnostic criteria for Alzheimer's disease. Annals of Neurology. 1988;24(1):17–22.
  • Mungas D, Marshall SC, Weldon M, Haan M, Reed BR. Age and education correction of Mini-Mental State Examination for English and Spanish-speaking elderly. Neurology. 1996;46(3):700–706.
  • Prince M. Methodological issues for population-based research into dementia in developing countries: A position paper from the 10/66 dementia research group. International Journal of Geriatric Psychiatry. 2000;15:21–30.
  • Reitan RM. The relation of the Trailmaking Test to organic brain damage. Journal of Consulting Psychology. 1955;19:393–394.
  • Rosen WG. Verbal fluency in aging and dementia. Journal of Clinical Neuropsychology. 1980;2:135–146.
  • Unverzagt FW, Hall KS, Torke AM, Rediger JD, Mercado N, Gureje O, et al. Effects of age, education, and gender on CERAD neuropsychological test performance in an African American sample. Clinical Neuropsychologist. 1996;10(2):180–190.
  • Wechsler D. Wechsler Memory Scale-Revised. The Psychological Corporation; 1987.
  • Wechsler D. Wechsler Adult Intelligence Scale. 3rd ed. San Antonio, TX: The Psychological Corporation; 1997.