This glossary provides a guide to some concepts, findings and issues of discussion in the new field of research in which intelligence test scores are associated with mortality and morbidity. Intelligence tests are devised and studied by differential psychologists. Some of the major concepts in differential psychology are explained, especially those regarding cognitive ability testing. Some aspects of IQ (intelligence) tests are described and some of the major tests are outlined. A short guide is given to the main statistical techniques used by differential psychologists in the study of human mental abilities. There is a discussion of common epidemiological concepts in the context of cognitive epidemiology.
Since the start of the new millennium, although there have been some earlier studies,1,2,3,4,5 low intelligence test scores have begun to appear in epidemiological reports as a risk factor for total mortality and possibly some disease‐specific outcomes, including coronary heart disease.6,7,8,9,10,11,12,13,14,15,16,17 This new field has been termed cognitive epidemiology.18 The association between cognitive test scores and mortality was probably first demonstrated at the group, rather than individual, level in 1933 by Maller.19
The study of human intelligence has a long and contentious research history, characterised by highly technical debates, multiple controversies and some scandal.20,21,22,23,24 However, there is, in fact, much consensus among researchers in the field regarding the measurement and validity of differences in human intelligence.25,26,27 It is important for those engaged in cognitive epidemiology research to understand this area if they are to use intelligence intelligently as a new risk indicator for health and disease. This glossary aims to provide four services. Firstly, it introduces the reader to major concepts in the field of differential psychology. Secondly, it describes concepts and procedures in mental ability testing. Thirdly, it describes statistical techniques in differential psychology. Fourthly, it sets common epidemiological concepts in the context of cognitive epidemiology. The terms within the first three of these sections are listed in an order which progresses from general to specific information, with the later items typically developing ideas presented in the earlier ones.
Cognitive epidemiology is used here to mean the use of cognitive ability test scores as risk factors for human health and disease outcomes, including mortality. This usage was urged by Lubinski and Humphreys,28 who considered IQ‐type test scores to be underutilised in epidemiology generally. To date, total mortality has been the principal outcome studied. Given the well‐established patterning of total mortality by socioeconomic position (SEP),29,30,31 and the fact that intelligence is significantly associated with indicators of SEP,25 there was a good prima facie case for introducing intelligence differences into the epidemiology of chronic diseases.
Differential psychology is the branch of psychology concerned with the nature, origins and applications of individual differences. Its principal topics are cognitive abilities and personality, but it also addresses attitudes, moods, and other psychological states and traits. It can be traced back to the London School of Psychology and Charles Spearman, the discoverer of general intelligence (g) and pioneer in the statistical field of factor analysis.32 Differential psychologists use the statistical techniques of psychometrics and are, partly, concerned with the devising and validating of mental tests. The statistical techniques include correlation, factor analysis, item response theory and structural equation modelling. An accessible introduction to the topics and techniques of differential psychology is provided by Cooper.33
Those higher mental functions that are not principally sensory, emotional or conative (related to the will) are cognitive, comprising a large range of functions related to the selection, storage, manipulation and organisation of information. In psychology, there are two quite different approaches to the study of cognition, originally associated with Cambridge and London Universities.34 The experimental (Cambridge) approach inquires after the modal structure of the mind. Largely, it studies those functions that are common to us all, and aims to provide a wiring diagram that would apply to most of us, and/or to people with specific disorders of cognition. The differential or correlational (London) approach is interested in individual differences, and studies the nature, number, causes and consequences of variance in people's cognitive performance. There have been repeated calls for combining the two approaches.34
The construct “cognitive abilities” appears under different terms, which might mean the same or slightly different things to the user—for example, intelligence(s), psychometric intelligence, mental abilities, cognitive functions or IQ. People differ in how accurately and quickly they perform mental work. The number of concepts we should use in order to describe these differences has been the topic of a century‐long debate.26 There are those who emphasise people's differences in g,35 and those who focus on people's relative strengths on a number of supposedly different (even independent) abilities.36 There is a growing consensus that it is valid to do both.27 That is, it is correct to acknowledge general and specific cognitive abilities. This concept has arisen because of the pioneering work of researchers such as Gustafsson,37 who used confirmatory factor analysis to test different models of the structure of cognitive ability differences, and Carroll,38 who retrieved and re‐analysed (using a standard methodology) most of the major intelligence datasets from the 20th century. Both of them suggested that a hierarchical model of cognitive abilities best accommodates most data. At the pinnacle of the hierarchy, g accounts for about half of the total test score variance. Next, accounting for relatively small amounts of variance, there are group factors of ability, which describe major correlated domains of cognitive ability, such as verbal, visuospatial, memory and processing speed. There is no absolute agreement on the number and nature of these domains.39 At a still more detailed level, some variance in people's performance is accounted for by very specific cognitive abilities. To date, the work on cognitive epidemiology has largely ignored this hierarchy. 
An exception is the Danish Metropolit study.8 Similar patterns of association with health outcomes were evident for specific cognitive test scores (essentially spatial, inductive and verbal abilities) and for global test scores. Most studies in cognitive epidemiology use omnibus tests of cognition, implicitly assuming that they are assessing largely g. Future research might address in more detail whether g or more specific aspects of cognitive ability are related to morbidity and mortality.
Intelligence is here taken to mean psychometric intelligence, as tested by standardised mental tests. In response to one of the controversies regarding intelligence (The bell curve),40 52 researchers in the field wrote and signed a piece in the Wall Street Journal (page A18, 13 December 1994), defining intelligence as
a very general mental capability that, among other things, involves the ability to reason, plan, solve problems, think abstractly, comprehend complex ideas, learn quickly and learn from experience. It is not merely book learning, a narrow academic skill or test‐taking smarts. Rather, it reflects a broader and deeper capability for comprehending our surroundings—‘catching on', ‘making sense' of things, or ‘figuring out' what to do.
An introduction to the main issues in intelligence testing, and the structure, causes and consequences of intelligence differences is provided by Deary.27
The term g is a widely used shorthand for general intelligence. In the light of the acrimonious debates it has attracted, it is ironic that it was coined as a neutral signifier by Spearman who discovered it in 1904.41,42 He thought the use of g—as opposed to intelligence, which he considered to be loaded with too many meanings—would prevent premature reification of what was a statistical finding. Typically, g is the first unrotated principal component (or factor) from a battery of mental tests administered to a sample of a population. It describes the near‐universal finding that all mental tests tend to correlate positively. The g extracted from different, large mental test batteries correlates at levels that are above r=0.9.43 The g factor tends to have the best correlations with outcomes such as educational and occupational success.35 To date, studies on mental ability, and mortality and physical health have tended to use single mental tests rather than a general factor extracted from a battery of varied tests (see section Cognitive abilities). Theories on human mental abilities that posited some number of uncorrelated multiple intelligences and excluded a general factor36,44,45 are not consistent with datasets gathered for over 80 years.38
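The extraction of g as a first unrotated principal component can be sketched in a few lines. The example below is a minimal illustration, not a reproduction of any published analysis: the four-test correlation matrix is hypothetical, and the function name and the use of power iteration (rather than a full eigendecomposition from a statistics package) are choices made here for brevity. It returns the first eigenvalue and the loading of each test on the first component; dividing the eigenvalue by the number of tests gives the proportion of total variance that the component accounts for.

```python
import math

# Hypothetical correlation matrix for four mental tests (illustrative values
# only). All off-diagonal entries are positive, mirroring the finding that
# mental tests tend to correlate positively.
CORR = [
    [1.00, 0.55, 0.45, 0.40],
    [0.55, 1.00, 0.50, 0.35],
    [0.45, 0.50, 1.00, 0.30],
    [0.40, 0.35, 0.30, 1.00],
]


def first_principal_component(corr, iterations=500):
    """Return (eigenvalue, loadings) of the first unrotated principal component.

    Uses power iteration, which converges to the dominant eigenvector of a
    correlation matrix whose entries are all positive.
    """
    n = len(corr)
    vec = [1.0 / math.sqrt(n)] * n
    for _ in range(iterations):
        product = [sum(corr[i][j] * vec[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(x * x for x in product))
        vec = [x / norm for x in product]
    # The Rayleigh quotient of the unit-length eigenvector gives the eigenvalue.
    eigenvalue = sum(
        vec[i] * sum(corr[i][j] * vec[j] for j in range(n)) for i in range(n)
    )
    # Loadings are the eigenvector scaled by the square root of the eigenvalue.
    loadings = [math.sqrt(eigenvalue) * v for v in vec]
    return eigenvalue, loadings


eigenvalue, loadings = first_principal_component(CORR)
proportion_of_variance = eigenvalue / len(CORR)
print("loadings:", [round(x, 2) for x in loadings])
print("proportion of variance:", round(proportion_of_variance, 2))
```

With positive correlations throughout, every test loads positively on the first component, which is the pattern that the term g summarises.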
The psychologists Cattell and Horn articulated a theory about two (substantially correlated) aspects of g, fluid and crystallised.46 Briefly, fluid intelligence is seen as our basic information‐processing capability. This is best tested using culture‐reduced, novel material, often under time pressure. Crystallised intelligence is regarded as the stored knowledge produced over time. It is best captured by tests like vocabulary and other knowledge‐based assessments. Fluid intelligence is more susceptible to the effects of age and physical insults to the brain, such as somatic illness, head trauma and neurotoxins. Crystallised intelligence is more robust. For example, vocabulary is well maintained with age, and the ability to read irregular words, as captured, for instance, in the National Adult Reading Test (NART), is preserved even during the initial stages of dementia at a time when fluid ability has declined markedly.47 An examination of the predictive significance of fluid and crystallised intelligence for morbidity and mortality would have some value. Baltes' concepts of the mechanics and pragmatics of intelligence are almost identical to the concepts of fluid and crystallised intelligence, respectively.48
Often, when it is applied to mental testing, the term information processing refers to reaction time and cognate procedures and the indices derived from them. Going back to the very start of mental testing, there has been an idea that individual differences in complex cognitive tasks might partly be founded on differences in how the brain copes with the processing of information, even in simple tasks.49 This is partly correct: reaction time tasks are simple, and people's mean reaction times (and their intraindividual variabilities) correlate modestly and significantly with intelligence test scores.50 People with higher intelligence test scores tend to have faster and less variable reaction times. One study has found that, after adjusting for reaction time, there was no longer a significant association between intelligence and mortality.18 This supported the speculation that the general integrity of the body (perhaps indexed by information‐processing efficiency in the form of reaction time) might partly underlie both intelligence and mortality.10
The common cause hypothesis derives from work on human cognitive ageing. Age‐related changes in different mental abilities tend to correlate positively: as one cognitive domain declines with age, others also tend to decline.51,52 Moreover, age‐related changes in mental abilities are correlated with age‐related changes in sensory functions, such as vision, hearing and balance.53,54 In addition, there are associations between age‐related changes in cognition and physical measures, such as lung function (as measured using forced expiratory volume in 1 s) and grip strength.55 This web of associations in ageing trajectories is captured in the common cause suggestion: age‐related deterioration in cognitive function might in part reflect general bodily deterioration, and some of the causes of diverse, age‐related physical and mental changes might be shared. This is important for cognitive epidemiology, because it could mean that intelligence–mortality/morbidity associations might reflect this common cause. Thus, intelligence–mortality gradients in old or even middle‐aged people might occur because intelligence tests are quite sensitive signals of early physical pathology.56 This possibility makes even more valuable those cohorts that have intelligence test scores from early life. Thus, if childhood intelligence is related to mortality, which is typically ascertained some decades later, it is much less likely that the association is caused by age‐related bodily deterioration or pathology (see section Reverse causality).10
The French psychologist Binet invented the first widely used mental tests with his physician coworker Simon.57 The Binet and Simon test appeared in 1905. The key insight, widely accepted today, was that older children, on average, could accomplish mental tasks that younger children could not. There were 30 tasks in the test, based on attention, social interaction, vocabulary, commands, reasoning, judgement and memory. Goddard took Binet's test from continental Europe to the USA, and had it and Binet's papers translated.58 Goddard sent out tens of thousands of copies from his Vineland Training Institute for the feeble‐minded (persons with a learning disability) for other researchers to use. The Binet test was applied far beyond individuals with learning disabilities to try to explain behavioural problems in society such as crime, alcoholism and prostitution. It was also used in testing immigrants to the USA, just one application that Goddard himself later realised was poorly justified.58
Mental tests are also known as intelligence tests, cognitive tests, IQ tests, ability tests and aptitude tests. There are thousands of mental tests available. Many are reviewed by the Buros Institute of Mental Measurement (http://www.unl.edu/buros/). The more established tests often come with detailed norms and validation information. Some mental tests may be administered to groups, others on a one‐to‐one basis. Some are single, others are collected in test batteries. Some take a few minutes, others take hours. Some are designed to give an estimate of g, others to measure relatively specific and diverse abilities. They are often called paper and pencil tests. Some are paper and pencil in format, but many are not: they might involve responding verbally, or interacting with various physical objects, and some are presented using computers. They are widely used in education, in the workplace, in medicine and in research. A remarkable finding is that almost all cognitive tests correlate positively when applied to a population sample with a spread of abilities.
It is not feasible to describe large numbers of mental tests in this glossary. Therefore, some prominent tests are described, covering those administered to adults and children, those administered to individuals and groups, and those that assess fluid and crystallised abilities.
IQ was originally applied only to children. It is derived from their mental age divided by their chronological age, multiplied by 100. Traditionally, and arbitrarily, it had a mean (SD) of 100 (15). It tends to be used for adults too, as a useful, standardised way of describing a person's deviation from a population mean. For example, if someone's IQ is >130 on this scale, they are considered to be in the top 2% of the population. Apart from the scale, the term IQ tends to be used loosely to mean intelligence generally, or to refer to a psychometric cognitive ability test.
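The two senses of IQ described above can be made concrete with a short sketch. It is illustrative only: the function names are hypothetical, and the percentile calculation assumes that scores follow a normal distribution with mean 100 and SD 15, which is a simplifying assumption rather than a property of any particular test.

```python
from statistics import NormalDist

# Conventional IQ scale described in the text: mean 100, SD 15.
# Normality of the score distribution is an assumption of this sketch.
IQ_SCALE = NormalDist(mu=100, sigma=15)


def ratio_iq(mental_age, chronological_age):
    """Original childhood IQ: mental age over chronological age, times 100."""
    return 100 * mental_age / chronological_age


def deviation_iq(z_score):
    """Convert a standardised score (in SD units) to the IQ scale."""
    return IQ_SCALE.mean + IQ_SCALE.stdev * z_score


def proportion_above(iq):
    """Proportion of a normally distributed population scoring above `iq`."""
    return 1 - IQ_SCALE.cdf(iq)


# A child of 8 years performing at the level of a typical 10-year-old.
print(ratio_iq(10, 8))                  # 125.0
# A score two SDs above the mean, and the share of people above it.
print(deviation_iq(2))                  # 130.0
print(round(proportion_above(130), 3))  # 0.023
```

Under these assumptions a score above 130 falls in roughly the top 2% of the population, in line with the figure quoted above.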
The Moray House Tests are a large series of mental tests devised by Thomson (Principal of Moray House College of Education in Edinburgh and Bell Professor of Education at the University of Edinburgh from 1925 to 1951) and his associates.59 The Moray House Test series included a number of so‐called verbal reasoning tests, which, in fact, also included some numerical and spatial reasoning items and other types of question. Those aimed at 11‐year‐olds were used in England for selection from primary to secondary education (the 11‐plus tests, now largely extinct). Of interest here, versions of the Moray House Test number 12 were used in the Scottish Mental Surveys of 1932 and 1947.16,60 There were also picture‐based Moray House Tests. Designed for younger children, these tests were used in the Aberdeen Children of the 1950s cohort.61
One of the most widely used sets of test batteries for assessing intelligence was devised by Wechsler. There are preschool, child and adult versions of these tests. For example, the Wechsler Adult Intelligence Scale III has 13 individual tests, which are administered by a trained tester to an individual subject. Table 1 describes the contents of individual tests and the cognitive domains which they assess. In a validation sample of 2450 American adults, the mean correlation among these 13 tests was 0.49 (range 0.26–0.77). That is, despite the wide range of the tests' contents and thinking and response requirements, people who did well on any single subtest tended to do well on all of the others. Some groups of the subtests correlate more highly among themselves than they do with others. A confirmatory factor analysis (using structural equation modelling) of these data found that there were four identifiable cognitive domains underlying the tests.26 These are named verbal comprehension, perceptual organisation, processing speed and working memory.27,62 The reliability of the test is high: the test–retest reliability of full‐scale IQ is 0.98, and the mean test–retest reliability of the 13 individual tests is 0.85 (range 0.77–0.93). Documentation of these reliability and validity results might be described as industrial: the reliability chapter in the test's technical manual runs to over 20 pages and the validity chapter (covering content validity, criterion validity and construct validity) fills more than 100 pages.62
The most widely used English version of Binet's original test for children was developed by Terman and his colleagues at Stanford University. It first appeared in 1916 and has been revised regularly since then.63 The most recent edition has 15 subtests, may be applied from age 2 years to adulthood, and uses the three‐stratum hierarchical model of human mental abilities as a framework.38 It takes up to about 90 min to administer. Figure 1 shows some of the tests and the mental ability domains into which they combine. Note that individual differences in the domains are correlated positively, and so a general factor is found in the Stanford–Binet test.
Whereas the Wechsler scales and the Stanford–Binet test are administered by a trained tester to a single subject, Raven's Progressive Matrices63a is an example of a group test (as are the Moray House Tests), one which may be administered to several people at the same time. Subjects work on a booklet which contains the items, and they complete their answers. Raven's Matrices are available in children's, standard and advanced forms, reflecting the fact that materials of different difficulty levels are needed for children, normal adults and very able adults (such as university students). Raven's tests involve non‐verbal inductive reasoning. They were devised by the Scottish psychologist Raven, intended to be a relatively pure examination of Spearman's conception of the mental processes most closely related to g. The stimuli are abstract patterns. The person taking the test inspects a pattern with a piece missing and has to choose one of the answer options to complete the pattern. The correct option is arrived at by working out the rules (inductive reasoning) of the pattern and then applying them to arrive at the piece that completes it. An example is given in fig 2. The Raven test is among the best single indicators of g.38 It is a relatively fluid intelligence test, meaning that it requires active mental work involving the effortful manipulation of information. Other group tests of relatively fluid ability are the Alice Heim Series and the Cattell Culture Fair test series.
The crystallised partner to the fluid Raven's Matrices is the Mill Hill Vocabulary test, also devised by Raven. There are junior and senior versions. Whereas the Raven test involves active inductive reasoning, one part of the Mill Hill Vocabulary test requires the subject to underline which of six words is closest in meaning to a target word. There are many other vocabulary and knowledge‐based tests. The NART, mentioned above, requires the subject to read aloud 50 words, none of which follow normal English rules of grapheme–phoneme correspondence and/or stress. These types of test decline much less with age and, therefore, are often used to indicate the person's prior mental ability. As a validation of this, the NART remained stable before and after identification of mild dementia, and the score in old age correlated highly with childhood IQ.47 The partnering of fluid (active thinking with new materials) and crystallised (demonstrating already‐stored knowledge) assessments as is done in the Raven and Mill Hill tests is also found in other group test combinations—for example, in the Wide Range Achievement Test.
Associated with the classic studies by the New Zealand political scientist Flynn, the Flynn effect refers to the finding that mental test scores rose during the 20th century.64 That is, people of a given age scored higher on intelligence tests as the 20th century progressed. The more practical implication of this was that norms became out of date and tests were re‐normed; people got higher IQs when they were tested on older versions of tests (with older norms) than when they were tested on more recently revised versions. This rising IQ effect is found in many countries, and some of the best data come from military conscript testing, where fathers and sons were compared—for example, Dutch military conscript data showed sons outscoring their fathers, who had been tested about 27 years earlier, by about 18 IQ points. The cause of the Flynn effect is not known. Some have suggested cultural effects, and some have suggested nutrition.65 Flynn himself has provided a hypothesis, although few understand it fully.66 The Flynn effect is an enigma, not least because it seems to have left the reliability and validity of intelligence test scores unaffected (and good) within each generation.
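The practical consequence for test norms can be illustrated with simple arithmetic based on the Dutch conscript figures quoted above. The sketch assumes, purely for illustration, that the gain accrued at a constant rate per year; that linearity is an assumption made here and not a claim about the data, and the function names are hypothetical.

```python
def gain_per_year(points_gained, years_apart):
    """Average IQ-point gain per year between two cohorts."""
    return points_gained / years_apart


def score_on_old_norms(score_on_current_norms, norm_age_years, rate_per_year):
    """Expected score if the same performance is scored against older norms.

    Assumes a constant yearly gain (an illustrative simplification).
    """
    return score_on_current_norms + rate_per_year * norm_age_years


# Dutch conscripts: sons scored about 18 points higher than fathers tested
# about 27 years earlier.
dutch_rate = gain_per_year(18, 27)
print(round(dutch_rate, 2), "points per year")
print(round(10 * dutch_rate, 1), "points per decade")

# Someone scoring 100 against current norms, re-scored against norms that
# are 27 years out of date, under the constant-rate assumption.
print(round(score_on_old_norms(100, 27, dutch_rate), 1))
```

This is why a person tends to obtain a higher IQ on an older version of a test than on a recently re-normed one.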
Factor analysis is a set of statistical techniques used to discover the latent traits in a correlation/covariance matrix.67 Often, in psychology, several tests (cognitive or other types of test) are applied to subjects and, in fact, the number of constructs being assessed is considerably smaller than the number of tests. Applied to mental test scores, factor analysis often reveals a large, single general factor (see section General intelligence). However, after factor rotation, it is common to find a number of positively correlated, but separable cognitive domains. A distinction should be made between exploratory factor analysis, which may be used to describe a dataset, and confirmatory factor analysis, which may be used to test a particular model for goodness of fit to a dataset and/or to test between competing models. Confirmatory factor analysis is one of the techniques belonging to the structural equation modelling family.68
Structural equation modelling is a set of statistical techniques that combines aspects of multivariable regression, path analysis and factor analysis.68 It may be used to test hypotheses about the associations among latent and measured (manifest) variables. It is implemented in a number of specialised software packages, such as Lisrel, EQS, AMOS, R, Mx and Mplus. Its attractions include the possibilities of (a) extracting latent traits from a number of measured variables that are correlated, (b) simultaneously analysing more than one outcome variable and (c) conducting formal tests of mediation. As an example of (c), Hart et al11 examined the hypothesis that occupational social class at midlife substantially mediated the association between childhood IQ and mortality in a Scottish cohort, and found that it did not.
If psychological test scores can account for reliable variance in some external criterion, they are said to have predictive validity. Thus, intelligence tests are used because they correlate with educational and occupational outcomes.25,27 Substantial evidence for their predictive validity for health outcomes, in the field of cognitive epidemiology, is relatively new.6,69
Researchers of behaviour genetics use twin data to examine the contribution of genes and environment (shared and non‐shared) to human quantitative traits and illness syndromes. However, genetic covariance takes this further by studying the effects of genes and environment on the correlations between phenotypic traits. This multivariate genetic research has shown that, among a battery of diverse cognitive tests, g accounts for much of the genetic variance, and that information processing tasks are genetically linked to the g from intelligence tests.70 With regard to cognitive epidemiology, it is known that both mortality and intelligence have environmental and genetic influences. Moreover, they are correlated. Therefore, if a study of twins contained both cognitive ability and mortality data, it would be possible to study the relative contributions of genes and environment (shared and non‐shared) to the association between intelligence and mortality.
Epidemiologists commonly cite childhood SEP as a candidate confounder in the association between intelligence and mortality.7 Childhood socioeconomic factors (indexed most typically by parental occupational social class, income or education) are related to both intelligence test scores and health. It has been posited that, after childhood socioeconomic factors are taken into account, there will be no remaining association between intelligence and mortality. This would be complete confounding. In fact, adjusting for childhood SEP does not seem to substantially attenuate the intelligence–mortality association.6 In this context, confounders are often confused with mediators.
With regard to mediation, people with higher intelligence are more likely to work in more professional jobs, and these jobs in turn may provide safer environments, so that people with higher intelligence tend to live longer and be healthier. Thus, adult SEP could mediate (fully or partly) the association between intelligence and mortality, or persons with high scores on intelligence tests might differentially interpret health promotion advice in comparison with lower scorers, and so smoke less, take more exercise and eat a more healthy diet. In the Scottish Mental Survey of 1932–Midspan collaboration, using structural equation modelling, it was found that adult SEP only partially mediated the association between intelligence and all‐cause mortality.11 Other studies using survival analyses indicate about 50% attenuation.13
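The degree of attenuation referred to in the discussion of confounding and mediation is usually a simple comparison of an effect estimate before and after adjustment. The sketch below uses one common convention on the ratio scale; the cited studies may have calculated attenuation differently, and the hazard ratios shown are hypothetical values chosen for illustration, not results from any cohort.

```python
def percent_attenuation(unadjusted, adjusted):
    """Percentage of the excess (or reduced) risk removed by adjustment.

    Both arguments are ratio measures (for example hazard ratios), where 1
    means no association. The formula works for ratios above or below 1.
    """
    if unadjusted == 1:
        raise ValueError("An unadjusted ratio of 1 leaves nothing to attenuate.")
    return 100 * (unadjusted - adjusted) / (unadjusted - 1)


# Hypothetical hazard ratios for mortality per SD disadvantage in IQ.
print(round(percent_attenuation(1.40, 1.20)))  # half of the excess risk removed
print(round(percent_attenuation(1.40, 1.00)))  # association fully removed
print(round(percent_attenuation(1.40, 1.38)))  # little attenuation
```

The arithmetic is the same whether the adjustment variable is treated as a confounder (such as childhood SEP) or as a mediator (such as adult SEP); the interpretation of the attenuation depends on which causal role the variable is assumed to play.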
In addition to true confounding and mediation, Singh‐Manoux71 has discussed situations in which supposed confounders might in fact be moderators or antecedent variables. Hart et al11 found that the deprivation of a person's residential area moderated the effect of childhood IQ on mortality. An example of a potential antecedent variable in the context of cognitive epidemiology is birth weight, which, even in the normal range, is associated with later health outcomes72 and with childhood intelligence.73 Therefore, it has been suggested that birth weight might explain some of the association between childhood intelligence and health outcomes. However, adjusting for birth weight does not substantially diminish the inverse association between childhood IQ and subsequent total mortality7 or coronary heart disease.8
Explaining the association between childhood intelligence and mortality is a key task in cognitive epidemiology.74 One explanation has been that intelligence is associated with health outcomes because preventing and managing disease is a set of tasks, some of which are cognitively complex and demanding. This hypothesis is most closely associated with Gottfredson,69 who has gone further to suggest that the social patterning of health outcomes might largely be accounted for by the influence of intelligence on health literacy, a suggestion not fully supported by the limited data yet available.75
One suggested explanation for the association between intelligence and health is that people with higher intelligence might have healthier behaviours, including smoking less, not drinking excessively, avoiding accidents, exercising more and maintaining a prudent diet. Although there is some evidence that people with higher intelligence responded to the health warnings on smoking,12,76 it has not been established to date that health behaviours account for a substantial part of the association.13
The association of childhood and early adult intelligence with total mortality has been found in samples from Australia, Denmark, two areas of Scotland, Sweden, the UK and the US. At least four possible mechanisms were suggested, none of which was exclusive of the others,10 and additions have been made.6 Here we state them briefly, and offer, in each case, an example of a study that has tried to test an aspect of these hypotheses.
To date, little is known about the mechanisms of this association, and this should be the main thrust of new research.
Intelligence is associated with morbidity and mortality, but it is also plausible that illness lowers cognitive test scores rather than the reverse. It is well known that, among older people, illness is associated with lower intelligence, and there is a terminal decline in cognition that is apparent from >3 years prior to death.77 Therefore, the association between cognition and death among older people56 did not attract the same speculation as the association between childhood IQ and adult mortality up to several decades later.10 As indicated, among studies of the latter type, the problem of reverse causality is far less acute.
Intelligence is correlated significantly with childhood and adult SEP.78,79,80 Childhood SEP might be a confounder of any childhood IQ–mortality/morbidity association, and adult SEP might be a mediator.71 To date, it seems that childhood SEP does not substantially attenuate the association, and that intelligence and adult SEP may have some shared and independent influences on health outcomes.75
IJD is the recipient of a Royal Society‐Wolfson Research Merit Award. GDB holds a Wellcome Advanced Training Fellowship (number 071954/Z/03/Z). We thank Linda Gottfredson for drawing our attention to the paper by Maller.19
g - general intelligence
NART - National Adult Reading Test
SEP - socioeconomic position
Competing interests: None declared.
Note added in proof. In the period between this glossary being accepted for publication and the authors receiving the proofs there have been many publications in the new field of cognitive epidemiology. It is not possible to incorporate them here. However, readers might find especially useful the systematic review of studies linking early life IQ and later mortality risk.81