So far we have considered examples for which we judge there to be sufficient genetic
data to arrive at an estimate of the effect size. Our conclusion, that
the effect sizes found in genetic studies of endophenotypes are not considerably
greater than those found in psychiatric diseases, may be biased because we have been
limited to studying the COMT locus, or because no large
effects have been found so far. The first issue is difficult to tackle. Significant
associations reported for other genetic loci are based on either a single study or a very
small number of studies, so that we do not yet know how robust these findings are.
Two examples illustrate the point. The first concerns fMRI measures, which are
difficult to obtain in large numbers of people. Hariri and colleagues reported that
a polymorphism in the serotonin transporter gene (5-HTT) was
associated with the response of the amygdala to fearful stimuli (Hariri et al.).
In a comparison of two groups of 14 individuals, carriers of the s allele at the 5-HTT
gene promoter were found to exhibit an increased amygdala response to fearful stimuli
compared with those homozygous for the l allele (Hariri et al.): the means of the two
groups were
=0·22) and 0·03
=0·19) with respect to % blood-oxygen-level-dependent (BOLD)
signal change for fearful stimuli compared to neutral stimuli. In comparison with
the data we have reviewed, this represents an enormous effect size (equivalent to
~40% of phenotypic variance) that could be detected at a significance
threshold of 0·05 with just 18 subjects. This finding could be due to
chance statistical fluctuation. Indeed, a subsequent study of 92 individuals
(including 19 from the first study) carried out by the same group again showed a
significant effect, but with a reduction in the effect size. The reported mean
values were this time 0·16 and 0·03 for the two groups (Hariri et al.),
equivalent to an effect size of
just over 10% of the phenotypic variance. Additional studies are needed to confirm
whether the effect is indeed this large.
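As a rough check on the sample sizes implied by effects of this magnitude, the sketch below estimates how many subjects would be needed to detect a locus explaining a given share of phenotypic variance. It assumes a simple correlation test with Fisher's z approximation, a two-sided threshold of 0·05 and 80% power. The 80% power level, the function names and the choice of approximation are assumptions made here for illustration, not the method used in the studies cited.

```python
import math
from statistics import NormalDist


def variance_explained_from_d(d, p=0.5):
    """Approximate proportion of variance explained (squared point-biserial
    correlation) for a standardized mean difference d between two groups.
    p is the fraction of subjects in the first group (assumed parameter)."""
    return d * d / (d * d + 1.0 / (p * (1.0 - p)))


def required_n(variance_explained, alpha=0.05, power=0.80):
    """Approximate total sample size needed to detect a correlation whose
    square equals variance_explained, via Fisher's z transformation."""
    r = math.sqrt(variance_explained)
    z_alpha = NormalDist().inv_cdf(1.0 - alpha / 2.0)  # two-sided threshold
    z_power = NormalDist().inv_cdf(power)
    n = ((z_alpha + z_power) / math.atanh(r)) ** 2 + 3.0
    return math.ceil(n)


if __name__ == "__main__":
    # Shares of variance discussed in the text: ~40%, ~10%, and under 1%.
    for share in (0.40, 0.10, 0.01):
        print(f"{share:.0%} of variance: about {required_n(share)} subjects")
```

Under these assumptions a locus explaining 40% of the variance needs about 18 subjects and one explaining 10% needs about 77, whereas one explaining 1% needs several hundred.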
The second example is the investigation into the genetic basis of an event-related
potential, the P50. This is another endophenotype for schizophrenia, in which two
stimuli are presented. At interstimulus intervals exceeding 8 s, two event-related
potentials are detected. If the second stimulus occurs within 8 s of the
first, normal subjects inhibit the response to the second stimulus. Schizophrenia
subjects have a deficit in such inhibition, a finding supported by a recent
meta-analysis (Bramon et al.). The difference in inhibition is
maximal when the second stimulus follows the first by 500 ms (Adler et al.).
Interest in the molecular basis of the P50 phenotype was spurred by a 1997 report
that a locus on chromosome 15 had been identified by linkage mapping (Freedman et al.).
The report was important because the locus contained the gene for the alpha 7-nicotinic
acetylcholine receptor, which was later reported by the same group to be significantly
associated with failure to inhibit the P50 auditory-evoked potential (Leonard et al.).
Furthermore, the group has presented evidence that functional variants in the promoter
of the gene are present at significantly greater frequencies in schizophrenia subjects
than in controls (Leonard et al.). A second group has independently
reported that a 2 bp deletion in exon 6 of the gene and a polymorphism in
the promoter are associated with the P50 phenotype (Raux et al.; Houy et al.) but not
with schizophrenia. Clearly,
given the findings for other complex traits, these results must be replicated in
large samples if they are to be regarded as genuine.
The two examples encourage the view that some endophenotypes may be more genetically
tractable than the ones we have discussed. Our concern is that as additional data
accumulate these findings may turn out to be false positives, or at least the effect
sizes will be much smaller than initially reported, as has often been found with
genetic association studies (Trikalinos et al.). Indeed, removing samples from the
first published study in our analysis of WCST data rendered the association with
genotype non-significant and reduced the pooled effect
size estimate by 20%.
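The tendency for initial reports to overstate effect sizes can be illustrated with a small simulation. The sketch below is purely hypothetical: it assumes a true standardized difference of 0·3 between two genotype groups of 14 subjects each, normally distributed phenotypes, and a two-sided t-test at 0·05. None of the simulated values are taken from the studies cited, and the function names and parameter values are illustrative assumptions.

```python
import math
import random
import statistics

# Two-sided 5% critical value of Student's t with 26 degrees of freedom
# (two groups of 14). Hard-coded because the standard library has no t table,
# so it is only valid for the default group size used below.
T_CRIT_DF26 = 2.056


def simulate_study(true_d, n_per_group, rng):
    """Simulate one two-group study and return the estimated standardized
    difference together with its t statistic."""
    carriers = [rng.gauss(true_d, 1.0) for _ in range(n_per_group)]
    others = [rng.gauss(0.0, 1.0) for _ in range(n_per_group)]
    pooled_sd = math.sqrt(
        (statistics.variance(carriers) + statistics.variance(others)) / 2.0
    )
    d_hat = (statistics.mean(carriers) - statistics.mean(others)) / pooled_sd
    t_stat = d_hat * math.sqrt(n_per_group / 2.0)
    return d_hat, t_stat


def winners_curse(true_d=0.3, n_per_group=14, n_studies=2000, seed=1):
    """Return the mean estimate over all simulated studies, the mean estimate
    over the significant ones, and the number of significant studies."""
    rng = random.Random(seed)
    results = [simulate_study(true_d, n_per_group, rng) for _ in range(n_studies)]
    all_d = [d for d, _ in results]
    significant_d = [d for d, t in results if abs(t) > T_CRIT_DF26]
    return statistics.mean(all_d), statistics.mean(significant_d), len(significant_d)


if __name__ == "__main__":
    mean_all, mean_sig, n_sig = winners_curse()
    print(f"mean estimate, all studies:         {mean_all:.2f}")
    print(f"mean estimate, significant studies: {mean_sig:.2f} (n = {n_sig})")
```

In simulations of this kind the average estimate across all studies stays close to the true value, whereas the average among studies that reach significance is much larger: with 14 subjects per group, only estimates beyond roughly 0·78 standard deviations can be significant.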
The second issue is whether the current failure to find large genetic effects in
endophenotypes can be read as a general indication of the complexity of genetic
architecture for all phenotypes. Endophenotypes are assumed to have a relatively
simple genetic architecture because there are relatively few pathways from gene to
phenotype. The consequence is that sequence variants interact relatively directly
with the phenotype so the correlation should be easier to detect. We have so far
examined this assumption by investigating what is known about the genetic
architecture of commonly investigated endophenotypes, and have shown that there is
little evidence that it is considerably simpler than that of psychiatric disease.
Perhaps we happen to have selected those phenotypes that have a complex genetic
architecture. In the absence of detailed genetic analyses of multiple endophenotypes
we cannot gainsay this point. However, we are able to approach this question from
another point of view. We can ask what is known about the genetic architecture of
phenotypes which have a much closer relationship to their genetic basis than
endophenotypes for psychiatric disease.
Analyses of model organisms provide the relevant information. We will discuss two
examples. The first is genetic analysis of phenotypes in the mouse, from which we
have robust genome-wide association data for multiple phenotypes, behavioural and
physiological, and associated estimates of locus effect sizes. These data allow us
to compare the genetic architecture of behavioural phenotypes with those that would
qualify as endophenotypes: for instance measures of electrophysiology, biochemistry,
haematology and immunology. The drawback is that mouse models of psychiatric disease
are imperfect, so that inferences drawn from the mouse data may be misleading.
Nevertheless, we have no reason to expect the relationship between endophenotypes
and behavioural phenotypes to be different in rodents and humans.
Reviews of the distribution of locus effect sizes show no difference between
physiological and behavioural phenotypes (Flint et al.). Moreover, in the most detailed
analysis to date of the genetic architecture of complex traits in the mouse, there
was no significant difference among phenotypes in the number or effect size of loci
detected (Valdar et al.). Intriguingly, regardless of the
phenotype, the genetic effects that were detected explained about the same
proportion of the additive variance, suggesting considerable similarity in the
genetic architecture of many phenotypes (Valdar et al.). The figure shows the effect
sizes of 843 quantitative trait loci (QTL). The
phenotypes include measures of anxiety and learning and memory, as well as
haematology, immunology, biochemistry, physiology and anatomy. A full description is
given in Valdar et al. and on a website (http://gscan.well.ox.ac.uk).
The figure shows the preponderance of small effects. For each of the 100 phenotypes
analysed, many loci contribute a small proportion to the variance. Large effect QTL
are rare: only ten account for greater than 5% of phenotypic variance, and the mean
Distribution of effect sizes of 843 mouse quantitative trait loci (QTL).
The mouse data indicate that we would not have obtained a simpler genetic
architecture by working with any of the physiological, immunological, biochemical or
haematological phenotypes in place of the behavioural phenotypes. We would still
face the currently demanding challenge of having to identify the molecular basis of
many small genetic effects.
The second example addresses the question of the relationship between genetic
variants and phenotypes at an extremely immediate level, namely the correlation
between DNA sequence variants and variation in the relative abundance of mRNA.
Variation in transcript abundance is heritable and it is reasonable to expect that
the variation in expression of some genes may correlate with psychiatric disease.
Thus a gene expression profile could act as an endophenotype, although currently we
do not know which genes show expression patterns correlated with psychiatric
disease. Compared to any of the endophenotypes so far analysed, variation in the amount of
transcript is more proximal to DNA sequence variation and, if the assumptions about
the nature of an endophenotype are correct, the genetic architecture of transcript
variation should be relatively simple.
Analyses of gene expression variation in yeast, rodents and humans concur in finding
that the genetic architecture of gene expression is polygenic and that the genetic
effects are relatively small (Morley et al.; Brem & Kruglyak, 2005; Chesler et al.;
Hubner et al.). In yeast, where we have the most
reliable estimates, relatively few transcripts have large effects: the median effect
size was equivalent to 27% of variation in transcript level, only 16% of loci
accounted for more than 60% of variation, and a quarter explained less than 10%
(Brem & Kruglyak, 2005). Although
estimates of effect size are not as robust in rodents, the available data indicate
that the effect sizes are comparable to those found in yeast (Morley et al.;
Chesler et al.; Hubner et al.).
Although the effect sizes of loci contributing to variation in mRNA transcript
abundance are larger than the effects found in complex traits (which explain less
than 1% of the phenotypic variance), effect sizes are relative to the population in
which they are measured. A reasonable comparison for the rodent mRNA data is with
the effect sizes of QTL found in crosses between inbred strains of mice. Remarkably,
the median effect size of such QTL is 12% (Flint et al.), just under half that of the
expression phenotypes. Therefore, even when we consider a phenotype that is directly
linked to the genetic constitution of the organism, genetic architecture is not
radically different from complex phenotypes.
We have reviewed what is known about the genetic basis of endophenotypes and shown
that their genetic architecture may be as complex as that of psychiatric disease.
This does not mean there is no advantage to the use of endophenotypes for genetic
studies. We have pointed out that the robust, quantitative measures that are typical
of many endophenotypes mean that they may be suitable for collecting the large
samples needed for genetic analysis of complex traits, and may afford more
statistically reliable data. We suggest that, along with the frequency and
penetrance of a disease-causing allele given in , the ease and reliability of phenotyping should be factored into
There are important limitations to our analysis. First, we have little reliable data
about the genetic basis of complex traits in general and psychiatric endophenotypes
in particular. Assumptions about the genetic architecture of complex traits depend
so far largely on negative findings: we infer that our inability to detect robust linkage and
association signals is due to lack of power and that we have not sufficiently appreciated
the genetic complexity. It is possible that, with the completion of the first
whole-genome association studies, when estimates of effect size across the genome are
available, a different picture will emerge. A second important limitation is that
our review has concentrated on the effect of COMT on
endophenotypes. Unfortunately there are no other examples where sufficient data have
accumulated to be included in meta-analyses. Again, the availability of additional
data might alter our results.
However, we think that our conclusions are unlikely to change much: first, studies in
genetically more tractable organisms, such as yeast, flies and rodents, confirm the
finding of genetic complexity for all phenotypes. Here the conclusions are not based on
negative results: we have definite evidence of complexity. Second, as we have shown
in the example of the genetic analysis of transcript abundance, there is no
indication that alternative phenotypes will be any easier to deal with. Thus, while
endophenotypes may be useful for many reasons, such as providing trait markers of
susceptibility to psychiatric illness, biological markers of disease
and models for investigating disease processes, we do not think they are likely to be
any easier to dissect at a genetic level than the disorders to which they are