Alzheimer’s disease (AD) genetics may be one of the most prolifically published areas in medicine and biology. Nearly 200 reviews on this topic have appeared since 1991, when the first report of an autosomal dominant mutation in the amyloid precursor protein gene (APP) was published. Three early-onset AD genes with causative mutations (APP, PS1, PS2) and one late-onset AD susceptibility gene (ApoE) are supported by ample biological, genetic and epidemiological data, and their roles in AD are essentially universally accepted. Evidence from family and twin studies suggests a significant genetic component underlying AD that is not explained by the known genetic risk factors. The past 10 years of AD genetics research have produced ten independent whole genome linkage and association studies that implicate multiple genomic regions as harboring AD susceptibility genes. To date, there are about 900 papers reporting associations between variations in more than 350 genes spread over 23 autosomes. One hundred years after the first published article on Alzheimer’s disease, much is known about the pathophysiology of this disease; however, much more remains to be discovered about its etiology. This review summarizes the evidence for the genetic component in AD, the identification of the early-onset familial AD genes and ApoE, and the current state of knowledge on additional AD susceptibility loci and alleles. Future directions for genetic research in Alzheimer’s disease as a common and complex condition are also discussed.
AD is the most common adult neurodegenerative disease, with an estimated prevalence of 10–30% by age 85 or older and an incidence of 6–8% in the same age group [reviewed in 1]. With about 3–4 million affected individuals in the United States, 350,000 new cases per year (range: 200,000–600,000) and an average annual cost per patient between $10,400 and $34,517 [2,3], AD is an epidemic with major health, social, and economic impact. In the absence of any intervention to delay its onset, which is the current status, the prevalence of AD is expected to reach 8.64 million (range: 4.37 to 15.4 million) by the year 2047. Brookmeyer et al. estimated that the 50-year projected prevalence of AD could decrease by 380,000 individuals if there were an intervention to delay disease onset by a mere 6 months, corresponding to annual savings of nearly $18 billion after 50 years. Although this study was restricted to epidemiological data from four studies in the United States, with cost estimates from a single US region, the message is clear: the personal, social and economic impact of AD will continue to expand under the status quo of management approaches.
A multitude of environmental factors, concomitant health conditions and life events have been proposed to be associated with risk for developing AD [1,3–5]. These proposed risk factors can be arbitrarily divided into the following categories: 1) non-modifiable (e.g. age, ethnicity, perinatal conditions, early-life development and growth), 2) socio-economically modifiable (e.g. socio-economic conditions, environmental enrichment, cognitive reserve, diet), 3) medically modifiable (e.g. obesity, hyperlipidemia, hypertension, diabetes). While epidemiologic studies assessing non-genetic risk factors can provide useful insights into the origins of AD, establishing these factors as causative rather than merely associated is usually problematic, given the natural lack of an experimental setting, the difficulty of assessing early-life events for a disease of the elderly population (retrospective rather than prospective studies) and the concomitant existence of multiple common conditions (such as hypertension, diabetes and AD). There are exceptions, such as the Lazarov et al. study, which showed that transgenic mice co-expressing two familial AD genes with mutations (APPswe and PS1ΔE9) and raised in an “enriched environment” had less AD pathology, reflected as reduced brain Aβ levels, compared to transgenic mice raised in a “standard environment”. Interestingly, of the transgenic mice in the “enriched environment”, those that were more physically active had the most significant reductions in brain Aβ levels. The authors also identified elevated activity of neprilysin, an Aβ-degrading protease, in the environmentally enriched transgenic mice, as well as differential gene expression, suggesting gene-environment interactions and a possible role of exercise and environmental enrichment in protection against AD. Others determined that environmental enrichment improved cognition in the same transgenic mouse model of AD.
Though there is a discrepancy between these studies regarding the effects of environmental enrichment on brain Aβ levels [6,7], which is beyond the scope of this review, both studies are important because they provide a paradigm for testing the effects of environment and gene-environment interactions in an experimental setting. Despite such advances, exploring the effects of environment on a common, complex condition such as AD remains a challenging task, surpassed in difficulty only by the prospect of modifying environmental factors to decrease disease risk.
Given the public impact of common, complex disorders such as AD and the challenges associated with studying and more importantly modifying environmental factors, much focus has been placed on genetic investigations. The rationale for studying genetics of any disease is multiple-fold: 1) Uncovering the underlying genetics leads to an understanding about disease pathophysiology. 2) Genetic risk factors may be modifiable, unlike many environmental risk factors. 3) Molecules encoded by genetic risk factors can be drug targets. 4) Genetic risk factors may provide clues into modifiable environmental risk factors. 5) Genetic variants and molecules identified through genetic studies can serve as biomarkers to identify at risk populations for disease prevention. 6) Knowledge about genetic risk factors in individuals may allow for personalized medicine in the future.
The first reports suggesting a genetic component for AD were published merely 2–3 decades ago, about 70–80 years after Dr. Alois Alzheimer’s publication on the very first case of this disease . Early reports focused on the invariable development of AD-like disease in Down’s syndrome patients after age 40, the increased risk of disease in family members of AD patients with disease onset <65 years and the autosomal dominant inheritance pattern in the rare families with early-onset forms of this disease . Systematic analyses geared towards estimation of the genetic component of AD can be grouped into the following categories: 1) Familial aggregation; 2) Transmission pattern and 3) Twin studies. These studies are discussed below:
Familial aggregation studies have revealed that having a first-degree relative with AD increased one’s risk for developing AD significantly. Breitner et al.  investigated 379 first-degree relatives of 79 probands in a longitudinal study of AD and found that the cumulative incidence of AD among relatives increased strikingly with age to 49% by age 87 in comparison to an incidence of < 10% for controls. Importantly, they did not observe an appreciable difference between the risks to relatives of presenile-onset (AD age of onset < 65 years) versus senile-onset AD probands (AD age of onset > 65 years), suggesting the existence of a genetic origin for both early-onset AD (EOAD) and late-onset AD (LOAD). They emphasized that their results should not be interpreted as evidence for autosomal dominant transmission in LOAD, but as a rationale for pursuing genetic studies in LOAD as well as EOAD. Farrer et al.  independently assessed 70 families with one or more AD subjects using survival analysis and determined evidence of different transmission patterns for EOAD (defined as mean onset age <58 years in their study) and LOAD (>58 years). Offspring of AD subjects had an estimated lifetime risk of 53% in EOAD families whereas this was 86% in LOAD families. These results supported an autosomal dominant transmission pattern for EOAD, whereas LOAD likely had a more heterogeneous transmission possibly with both genetic and environmental contributions. Thus, despite the differences in the risk-to-relative estimates, it was evident from these early studies that both EOAD and LOAD had a transmissible component.
More recent longitudinal familial aggregation studies based on much larger datasets supported the earlier findings of familial aggregation and inheritance of AD. As part of the Multi-Institutional Research in Alzheimer Genetic Epidemiology (MIRAGE) project, Lautenschlager et al. estimated the risk to 12,971 first-degree relatives of 1,694 AD probands (mean age at onset 69.8) using survival analysis procedures. They found this risk to be 39.0% ± 2.1% by age 96 years, which is approximately twice the estimated cumulative incidence of AD in the general population, establishing the substantial genetic component affecting this disorder. Furthermore, the lifetime risk for LOAD (≥65 years) estimated separately was still high at 38.7±2.4%, though somewhat less than that for EOAD (39.5±4.1%). Given that the cumulative risk to first-degree relatives in autosomal dominant disorders is expected to be 50%, it was deduced from this study that the lifetime risk among relatives did not support a simple autosomal dominant inheritance pattern of disease, unlike some of the earlier familial aggregation studies, and that AD likely had a more complicated transmission pattern such as additive, multifactorial or polygenic inheritance. This study found a lower lifetime risk estimate compared to other studies [10, 11]. The difference was attributed to inflated estimates in the other studies resulting from missing information, especially in older age groups. The MIRAGE study, which included ~400 subjects aged ≥90, determined a decreased risk of AD in this oldest age group, raising the interesting hypothesis that not only genetic risk factors for AD but also protective genes are at play in the healthy, elderly population.
Recent analysis of the MIRAGE data comparing risk to first-degree relatives in the African-American (255 probands) vs. the white population (2339 probands) indicated a higher cumulative risk by age 85 for the former (43.7±3.1% vs. 26.9±0.8%, p<0.001). However, risk to the spouses was also found to be higher in the African-American population, indicating that the risk attributable to familial aggregation was similar in the two ethnic groups. Given that the risk to first-degree relatives was higher than that for spouses in all ApoE4 strata, this study concluded that there exists a substantial heritable component to AD in both African-American and white populations that is distinct from ApoE. Other familial aggregation studies controlling for ApoE genotype also found similar evidence supporting the presence of factors other than ApoE accounting for risk of AD [14–16]. In these studies, presence of an ApoE ε4 allele in the proband increased AD susceptibility in the relatives. However, there was also increased AD risk to relatives of patients without an ApoE ε4 allele, and the estimated risk to relatives of AD patients was higher than predicted from their ApoE ε4 carrier status, implicating familial factors other than ApoE in the risk of developing AD.
Although familial aggregation studies using survival analyses methods provided evidence of a transmissible factor accounting for risk of AD, these approaches did not necessarily distinguish a genetic factor from a transmissible environmental factor. To address this issue and to determine the mode of inheritance of genetic factors, segregation studies were pursued in AD families. Using segregation analysis of 232 AD families (age-at-onset 42–86 years), Farrer et al.  determined that the model of inheritance with the best fit was autosomal dominant genetic factor plus a multifactorial component. In this study, the models for a single major locus, multifactorial component only, no genetic susceptibility and autosomal recessive transmission were all rejected. These findings could be explained by the existence of multiple genetic and environmental risk factors for AD as well as heterogeneity with some families harboring a major risk gene and others multifactorial transmissible factors. Segregation analysis of Dutch AD families of early-onset (<65 years) yielded similar results implicating heterogeneity in EOAD . Rao et al.  performed segregation analysis on a total of 401 AD families, 68 of which were early-onset (age at onset of AD cutoff = 65). They determined that stratification of the families according to the age of-onset always yielded a better model fit, an indication of the etiological heterogeneity between EOAD and LOAD families. AD in early-onset families was found to have an autosomal dominant transmission pattern. However all transmission models tested were rejected in the LOAD families, suggesting the possibility of heterogeneity or a complex genetic mechanism such as oligogenic inheritance. 
Thus, results from independent segregation analyses provided further support for genetic risk factors for both EOAD and LOAD; implicated multiple genetic and environmental factors especially for LOAD; and suggested a role for autosomal dominant transmission for at least some EOAD families.
More recently, Warwick Daw et al.  utilized oligogenic segregation analysis to study 75 AD families ascertained through a proband with LOAD and composed of 742 subjects, to estimate the number of trait loci affecting age-at-onset in LOAD. They accounted for effects of ApoE and sex in their study and determined that there exist four or more quantitative trait loci (QTLs) in addition to ApoE which contributed to age-at-onset in LOAD. They estimated that these loci had an effect size equal to or greater than that of ApoE, which itself accounts for 7–9% of the total variance for age-at-onset. These studies clearly provided evidence for a significant genetic component for AD, with a complex mode of inheritance for LOAD and provided a clear rationale for search of novel LOAD genes.
Twin studies particularly from Scandinavian twin registries have been instrumental in establishing the genetic component for AD [21–23]. These studies compare the concordance of disease in monozygotic (MZ) twins, who share 100% of their genetic material, and dizygotic (DZ) twins, who on average share 50% thereof. For a disease that is entirely due to genes, the lifetime concordance for MZ twins is expected to approach 100%, allowing for lesser estimates due to diagnostic inaccuracies and inherent late-onset nature of certain diseases (such as AD). The same estimate for DZ twins would be expected to approach 50%, the same estimate for sibpairs, or slightly more, secondary to possibly increased environmental sharing in DZ twins compared to other sibpairs. Given that MZ and DZ twins are assumed to share similar intrauterine and rearing environments , concordance rates significantly higher for MZ than DZ twins, can be attributed to the effect of shared genes. The uses and potential pitfalls of twin studies in AD have previously been reviewed [24, 25]. In a study of the Finnish Twin Cohort , monozygotic twins (MZ, n=51) had significantly higher cumulative incidence of AD than did dizygotic twins (DZ, n=43), whereas there was no difference between the incidence of vascular or mixed dementia between the MZ and DZ twins. There were higher pairwise disease concordance rates between the MZ than the DZ twins both for AD (18.6% vs. 4.7%) and vascular dementia (18.2% vs. 6.7%). Bergem et al.  found a significantly higher pairwise concordance rate for AD in MZ (78%) vs. DZ twins (39%) but no difference for vascular dementia (MZ 17% vs. DZ 25%). By using tetrachoric correlations, they estimated the heritability (proportion of total variance due to genes) for AD to be 60%. The largest AD twin study to date was based on the Swedish Twin Registry analyzing 392 twin pairs who had clinical diagnoses . 
The MZ twins’ concordance of AD in this study was also higher than that of DZ twins, confirming the general conclusions of earlier twin studies. This study, unlike earlier twin studies, adjusted its findings for age and also addressed the issue of sex differences in genetic factors by utilizing both same-sex and opposite-sex twins. The authors estimated the age-adjusted heritability of AD to be 58–79% (varying with the model utilized) and found no significant gender difference for either the prevalence or heritability of AD after controlling for age. Gatz et al. also discussed the discrepancy in the concordance and heritability estimates between the different twin studies and highlighted differences in ascertainment, sample size, follow-up period, diagnostic methods, and age of subjects at analysis. Regardless of the individual parameter estimates obtained from these studies, the ultimate conclusion is unifying and supports a significant genetic component for AD, estimated at 60–80%.
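As a rough illustration of how concordance rates and heritability figures of this kind are computed, the Python sketch below implements the two common concordance definitions and Falconer's textbook approximation, h² = 2(r_MZ − r_DZ). This is a simplification swapped in for illustration: the studies cited above used tetrachoric correlations and model fitting, not this shortcut. The function names and all input values are hypothetical and are not taken from any of the cited twin registries.

```python
def pairwise_concordance(concordant_pairs: int, discordant_pairs: int) -> float:
    """Fraction of affected pairs in which both twins are affected."""
    return concordant_pairs / (concordant_pairs + discordant_pairs)


def probandwise_concordance(concordant_pairs: int, discordant_pairs: int) -> float:
    """Probability that the co-twin of an affected twin is also affected."""
    return 2 * concordant_pairs / (2 * concordant_pairs + discordant_pairs)


def falconer_heritability(r_mz: float, r_dz: float) -> float:
    """Falconer's approximation h^2 = 2 * (r_MZ - r_DZ), clipped to [0, 1].

    r_mz and r_dz are twin correlations in disease liability
    (for example tetrachoric correlations), not raw concordance rates.
    """
    return min(1.0, max(0.0, 2.0 * (r_mz - r_dz)))


# Hypothetical inputs, chosen only to show the arithmetic.
print(f"pairwise:    {pairwise_concordance(10, 30):.2f}")
print(f"probandwise: {probandwise_concordance(10, 30):.2f}")
print(f"h^2:         {falconer_heritability(0.70, 0.40):.2f}")
```

Pairwise and probandwise concordance differ for the same data, which is one reason concordance figures from different twin studies are not directly comparable.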
Twin studies have also been utilized to study familial aggregation and the influence of ApoE on the genetic component of AD. In a study of 94 twin pairs who were either members of the National Academy of Sciences Registry of Aging Twin Veterans (44 pairs) or were volunteers (50 pairs), Steffens et al.  determined that concordance for AD among twins was significantly associated with higher rate of AD among first degree relatives (21%) compared with first degree relatives of discordant twins (9.5%). They also assessed the effect of ApoE genotype in the twins for development of AD and determined that the presence of ApoE ε4 genotype in the twin pair was associated with increased likelihood of AD in the first degree relatives. There was one family in this study concordant for AD and without ApoE ε4 with highest degree of AD in the first degree relatives (4/6 relatives), suggesting presence of additional AD genes instrumental at least in this family. Given that the ApoE status was not determined or inferred for the AD relatives in this study, and the small numbers of non-ε4 positive concordant twins, it is not possible to accurately estimate the contribution of genes other than ApoE for development of AD in first degree relatives of AD twins. Bergem and Lannfelt  determined that in a collection of Norwegian twin pairs, the frequency of the ApoE ε4 allele did not differ between concordant and discordant dizygotic twin pairs, though the presence of this allele was associated with an earlier age at onset. These findings from twin studies suggest that while ApoE accounts for risk and earlier age at onset of AD (the two phenotypes are interdependent for late-onset diseases), this gene does not explain all of the genetic risk for AD.
The collective evidence from familial aggregation, segregation and twin studies of AD suggests a significant genetic component for this disease. The next section discusses the identification of the three known early-onset familial AD (EOFAD) genes.
Genetic linkage studies followed by candidate gene analysis or positional cloning have historically been successful in EOFAD and led to the identification of the three known EOFAD genes. The amyloid precursor protein (APP) gene on chromosome 21 was the first gene found to harbor a mutation that causes EOFAD. It is a good example of how linkage analysis coupled with a candidate gene approach led to the successful identification of one of the EOFAD genes.
The identification of an association between Trisomy 21 (Down’s Syndrome = DS) and AD led researchers to focus on chromosome 21 as the possible locus for an AD gene. The first report of a link between DS and AD dates back to the 19th century, when Fraser and Mitchell reported “precipitated senility” as a cause of death in DS patients. In studies conducted half a century later, autopsies of DS patients who died after the age of 40 revealed AD-like histopathology, namely senile plaques and neurofibrillary tangles [30, 31, reviewed in 32]. During the early 1980s, researchers discovered that the “amyloid protein” isolated from the brains of AD patients had the same biochemical properties and amino acid sequence as that isolated from DS brains [33, 34]. Because of the clinical, histopathological, and biochemical association of DS with AD and the existence of an extra copy of chromosome 21 in DS, researchers started to focus on chromosome 21 as the possible location for an AD susceptibility gene. In 1987, St George-Hyslop et al. found linkage to a locus on chromosome 21 in extended families with the autosomal dominant form of EOAD. In two related articles, Goldgaber et al. and Tanzi et al. reported the mapping of the gene encoding the amyloid beta protein found in the brains of both AD and DS patients to a region on chromosome 21 near the newly identified AD locus. After several reports that refuted linkage of autosomal dominant EOFAD to Aβ or to chromosome 21 in general [38–40], Goate et al. first found evidence of linkage to chromosome 21 in EOFAD families, and then identified a missense mutation in the APP gene segregating with AD in some of the families they analyzed. The mutation occurred in exon 17 of the APP gene, which partially encodes the Aβ peptide, and led to a valine to isoleucine change at amino acid 717 (Val717Ile), corresponding to the transmembrane domain of the protein. It was thus predicted that this first EOFAD mutation (a.k.a. the London mutation) would lead to AD through its effects on Aβ. These predictions were subsequently proven to be accurate through functional analyses discussed below. Since then, according to the Alzheimer Disease & Frontotemporal Dementia Mutation Database (http://www.molgen.ua.ac.be/ADMutations), 26 other mutations have been identified within APP from 74 EOFAD families. Many of these mutations were functionally assessed for their effects on APP processing. These studies provided strong support for the “amyloid hypothesis” discussed in detail in the related section of this issue. The prevalence and clinical phenotype of the EOFAD APP mutations are discussed in subsequent sections.
Linkage analysis and positional cloning studies led to the identification of the second EOFAD gene, PSEN1 on chromosome 14. In 1992, Schellenberg et al. reported the results from a genome search they conducted in non-Volga-German kindreds with EOFAD where they found significant linkage to a locus on chromosome 14 . This finding was confirmed by the positive linkage results obtained by other groups to the same region of chromosome 14 in EOFAD families [45–47]. Sherrington et al.  cloned AD3 in 1995, later known as PSEN1, by identifying a minimal cosegregating region on the chromosome 14 linkage area and determining the transcripts within this region. They found that one of the transcripts corresponded to a novel gene harboring mutations that segregated with EOFAD in the families they studied. Subsequent studies described the structure of the PSEN1 gene and identified new PSEN1 mutations segregating with EOFAD [49–51]. To date 157 pathogenic PSEN1 mutations have been identified in 347 EOFAD families (http://www.molgen.ua.ac.be/ADMutations), making PSEN1 mutations the most common known genetic cause of EOFAD.
The third EOFAD gene, PSEN2, was identified as a result of linkage studies accompanied by homology analysis of the transcripts in the linkage region. In 1995, Levy-Lahad et al. reported significant linkage to a region on chromosome 1 in EOFAD kindreds of Volga-German origin, in which linkage to chromosomes 14 and 21 had previously been excluded [40, 44]. They identified a candidate gene in this region that showed sequence homology to PSEN1 and that had a segregating mutation resulting in an asparagine to isoleucine substitution (Asn141Ile). Concurrently, Rogaev et al. independently identified the same PSEN2 N141I mutation in the affected probands of Volga-German families. They also identified a second missense mutation in this gene (Met239Val) segregating in an extended EOFAD pedigree of Italian origin. Unlike PSEN1 mutations, PSEN2 mutations are a rare cause of EOFAD. With 11 mutations identified from 19 pedigrees, most PSEN2 mutations are thought to be private to pedigrees [http://www.molgen.ua.ac.be/ADMutations, 51, 55, 56].
Compared with late-onset form of AD (LOAD), EOAD is exceedingly rare. Campion et al.  defined EOAD as age of onset <61 and assessed the population based prevalence of this disease in Rouen, France. They determined that the prevalence of EOAD was 41.2 per 100,000 people at risk (defined as ages 41–60). Their estimations were similar to those obtained from earlier population based prevalence estimates of EOAD which ranged between 18.2–45.2 per 100,000 inhabitants [58–60]. Campion et al. further defined autosomal dominant EOAD (ADEOAD) as having 3 EOAD cases in three generations and determined the prevalence of ADEOAD to be 5.3 people/100,000 at risk. They then performed mutational analysis of the three known EOAD genes in the ADEOAD families they identified in the prevalence study combined with ADEOAD families referred to their laboratory. In their screen of 34 ADEOAD families, PSEN1 mutations were identified in 56%, APP mutations in 15% and PSEN2 mutations in none of the families. Other screens of EOAD identified PSEN1 mutations at a frequency of 18–56% [50, 62], with different estimates between the studies attributed to the different selection criteria for the EOAD families analyzed. Others identified APP variations at a frequency of 13% in EOAD . According to the Alzheimer Disease & Frontotemporal Dementia Mutation Database (http://www.molgen.ua.ac.be/ADMutations), of the EOFAD mutations identified, PSEN1 mutations account for the majority (81%), followed by APP (14%) and with PSEN2 mutations identified only in a handful of families (6%).
In summary, EOAD accounts for 6–7% of all AD [57, 64] and EOFAD (or ADEOAD) accounts for 13% of EOAD. Given that mutations in APP, PSEN1 or PSEN2 can explain up to 71% of the autosomal dominant transmission pattern in EOFAD, these genes account for about 0.5% of all AD. Although these figures are essentially negligible in terms of the population attributable risk posed by the known EOFAD mutations, the discoveries were nonetheless monumental for understanding the pathophysiology of AD, discussed briefly below and in detail elsewhere in this issue. Importantly, 50% or more of EOAD is not explained by the known EOFAD mutations, indicating that this aggressive form of AD could harbor as yet unknown genetic factors. Furthermore, the existence of LOAD families with an apparent autosomal dominant pattern of transmission suggests the presence of other Mendelian mutations with less aggressive phenotypes [65, 66]. In a recent whole genome scan of 12 AD families with an autosomal dominant pattern of inheritance, Giedraitis et al. found evidence of a shared 40 cM haplotypic region on chromosome 8p in affected individuals from two families with age at onset 54–75. Analyses of single, extended families with autosomal dominant transmission patterns and members with either EOAD or LOAD in multiple generations have shown linkage to numerous chromosomal regions, some of which have not been identified in prior whole genome linkage scans of AD. Thus, analyses of EOAD families and large LOAD families with Mendelian transmission patterns may emerge as a way to identify other genetic causes of AD. Though the population-based impact of such discoveries is expected to be small, they can provide new insights into the pathogenesis of AD and be instrumental in the development of drugs, novel quantitative phenotypes and biomarkers for prevention and treatment of more common forms of this devastating condition.
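The estimate near the start of this summary, that the three known genes account for about 0.5% of all AD, is simply the product of the three proportions quoted in the text. The short Python sketch below reproduces that arithmetic; the variable and function names are illustrative only, and because the 71% figure is an upper bound ("up to"), the result should be read as a ceiling rather than a point estimate.

```python
# Proportions quoted in the text.
eoad_share_of_ad = (0.06, 0.07)   # EOAD as a fraction of all AD (6-7%)
eofad_share_of_eoad = 0.13        # autosomal dominant EOAD as a fraction of EOAD
explained_by_known_genes = 0.71   # upper bound for APP, PSEN1 and PSEN2 in EOFAD


def share_of_all_ad(eoad_fraction: float) -> float:
    """Fraction of all AD attributable to known EOFAD mutations."""
    return eoad_fraction * eofad_share_of_eoad * explained_by_known_genes


low, high = (share_of_all_ad(f) for f in eoad_share_of_ad)
print(f"{low:.2%} to {high:.2%}")  # prints "0.55% to 0.65%"
```

The product comes out at roughly half a percent, in line with the rounded figure given in the text.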
Aβ, the major proteinaceous component of senile plaques (SPs) in AD brains, is generated from APP through cleavage by two secretases: β-secretase and γ-secretase [70–73]. Processing of APP by α-secretase and γ-secretase precludes the formation of the Aβ peptide and instead leads to the production of the shorter, non-fibrillogenic and non-amyloidogenic P3 peptide. The EOFAD mutations identified in the APP gene all reside close to the secretase cleavage sites. Analyses of plasma Aβ from affected families and of secreted Aβ from cell lines transfected with mutant forms of APP showed an increase in the production of Aβ42 ± an increase in Aβ40, increased Aβ42/Aβ40 ratios, or enhanced Aβ protofibril formation for all functionally assessed APP mutations [42, 76–80]. Finally, transgenic mice carrying mutant APP were shown to have elevated Aβ, amyloid plaques and correlative memory deficits, abnormalities that are reminiscent of AD. PSEN mutations have similarly been shown to increase the production of Aβ. Moreover, transgenic mice overexpressing a mutant form of PSEN1 have been shown to have increased Aβ42(43) in their brains, providing further support for the amyloid cascade hypothesis as the mechanism of AD. Knock-out studies have determined that endogenous wild-type PSEN1 and PSEN2 are required for the γ-secretase cleavage of APP. The finding that all of the functionally tested EOFAD mutations affect Aβ metabolism, as reflected in plasma and/or fibroblasts from EOFAD patients, presymptomatic mutation carriers and transgenic mice, together with the evidence that PSENs are a key component of the γ-secretase complex, suggests a common pathological pathway for AD involving Aβ [42, 76–84].
Thus, although such mutations are exceedingly rare, the identification of the autosomal dominant mutations in the three known EOFAD genes and the subsequent functional studies led to the development of the amyloid cascade hypothesis. Though the exact steps in this pathophysiologic cascade are as yet unknown and debated, the hypothesis has clearly set the stage for the ongoing drug discovery efforts for Alzheimer’s disease [reviewed in 85–87].
The failure of multiple research groups to replicate the initial reports of linkage to chromosome 21 in various independent collections of AD families led to the hypothesis that AD may be a heterogeneous disorder [39, 40, 88]. In 1991, Pericak-Vance et al. reported the results of a genome search conducted in both EOAD and LOAD families, in which they found linkage to chromosome 21 in their EOAD families, as well as a novel locus on chromosome 19. The novel locus seemed to have an especially strong effect in the LOAD families (defined as age at onset >60). Given the previous association between an ApoCII allele and AD reported by Schellenberg et al., this gene on chromosome 19 initially emerged as a candidate for LOAD.
Concurrently, a group studying the change of lipids in AD brains by using an antibody to ApoE, a protein having a special relevance to nervous tissue, unexpectedly found that ApoE immunoreactivity was associated with amyloid in both senile plaques and cerebral vessels and neurofibrillary tangles . Given that the gene for ApoE mapped to the same locus identified in the Pericak-Vance et al. linkage study , researchers started investigating the genetic and biological association of ApoE with AD. In 1993, Strittmatter et al.  showed in an in-vitro assay that ApoE binds Aβ with high avidity and that ApoE ε4, a particular allelic form of ApoE, is found at a higher frequency in LOAD when compared to unrelated age-matched controls. Subsequently Corder et al.  demonstrated a dose-dependent increase in risk and a decrease in age-at-onset for LOAD, in ApoE ε4 carriers. This study which assessed 42 LOAD families found a hazard ratio of 2.84 for carriers of a single ApoE ε4 allele and 8.07 for those with the ApoE ε4/ε4 genotype.
Since then, multiple large, population or clinic-based studies have established ApoE ε4 as a genetic risk factor for LOAD [94–100]. Table 1 lists studies arbitrarily defined as having a large number of participants and performed on either population-based samples or multi-center collections. This table is not meant to be an exhaustive list of all studies testing for association between Alzheimer’s disease risk and ApoE. Clearly, there are a large number of such studies in the literature, as a simple search on PubMed (www.pubmed.org) using the keywords “ApoE, association, Alzheimer, risk” yields an impressive 647 articles to date (January 18, 2007). Instead, Table 1 shows a collection of the studies with the largest sample sizes, which led to the establishment of ApoE as a susceptibility gene for AD. Several conclusions can immediately be drawn from this table: 1) The ApoE ε4 allele increases risk for AD in a dose-dependent manner; 2) The risk conferred by ApoE ε4 in predominantly or entirely Caucasian populations is unequivocal, with largely overlapping odds ratios (ORs) and confidence intervals; 3) Studies in the African-American and Hispanic populations yield conflicting results.
Using these studies as a starting point, the population impact of ApoE, possible ethnicity differences, role of ApoE in the diagnosis of AD and as a premorbid predictor for its development are discussed in the following sections.
In a population-based prospective study of the predominantly Caucasian Framingham cohort, Myers et al. determined that the cumulative incidence of AD increased in a dose-dependent fashion: 55% of the ApoE ε4/ε4 group developed AD by age 80, compared with 27% of the ApoE ε3/ε4 and 9% of the ApoE ε3/ε3 groups by age 85 years. Despite the small number of ApoE ε4/ε4 individuals (n=16), this study clearly demonstrated that, unlike the autosomal dominant EOFAD mutations, ApoE ε4 is a susceptibility factor and not a deterministic factor for the development of AD. Furthermore, since only 10% of the individuals with one or more copies of ApoE ε4 developed AD (positive predictive value 0.1), with specificity 0.81 and sensitivity 0.49, ApoE ε4 is unlikely to be useful for the diagnosis of AD. Similar dose-dependent increases in the incidence or prevalence of AD were determined in other population-based prospective cohort [97, 98] or case-control studies [95, 96, 99].
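The relationship between the sensitivity, specificity and positive predictive value quoted above can be illustrated with Bayes' rule. The following is a minimal sketch, not the calculation used in the cited study: the function name and the proportions of affected individuals (labelled "prevalence" here) are hypothetical and chosen only to show how strongly the predictive value depends on how common AD is in the group tested. Under these assumptions, a proportion affected of roughly 4% gives a positive predictive value close to 0.1.

```python
def positive_predictive_value(sensitivity: float, specificity: float,
                              prevalence: float) -> float:
    """Bayes' rule: probability of disease given a positive test."""
    true_pos = sensitivity * prevalence
    false_pos = (1.0 - specificity) * (1.0 - prevalence)
    return true_pos / (true_pos + false_pos)


# Sensitivity (0.49) and specificity (0.81) are the values quoted in the text.
# The prevalence values below are hypothetical, for illustration only.
for prevalence in (0.02, 0.04, 0.10):
    ppv = positive_predictive_value(0.49, 0.81, prevalence)
    print(f"prevalence={prevalence:.2f}  PPV={ppv:.2f}")
```

Because the false positives scale with the much larger unaffected group, even a moderately specific marker yields a low predictive value when the condition is uncommon.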
Farrer et al. performed a meta-analysis of 40 different clinic-, population- or autopsy-based studies and assessed the effects of sex, age and ethnicity on the ApoE association with AD. They reported results from population-based and clinic-based Caucasian samples separately (Table 1 depicts their population-based results only), which showed largely overlapping ORs for the ApoE genotypes in these two different study types. Importantly, this study showed an age-dependent effect for the risk conferred by ApoE ε4, such that this allele was associated with risk of AD across all age groups between 40 and 90 years, but the effect was strongest for ages 60–75, with a decline after age 70. Similar age-specific effects were also detected by others [96, 99, 100]. This age specificity, with a declining risk effect of ApoE ε4 in the older age groups, could be due to a number of reasons: 1) ApoE ε4 may decrease age at onset of AD and lead to a more aggressive form of disease and shorter survival; 2) There may be genetic and non-genetic factors which render certain individuals relatively invulnerable to AD and promote longevity; 3) There may be genetic and non-genetic factors which interact with ApoE ε4 in an age-dependent manner.
The studies in non-Caucasian populations did not yield as consistent results for the association between AD and ApoE [95, 98, 100]. The meta-analysis by Farrer et al. determined that the ApoE ε3/ε4 association was non-significant and that the ApoE ε4/ε4 genotype had lower, though overlapping, OR estimates in the African-American population compared with the Caucasian population from that study. Farrer et al. emphasized the presence of heterogeneity between these two ethnic groups in terms of ApoE. Tang et al. found that the presence of an ApoE ε4 allele did not pose significant risk to the African-Americans or Hispanics in their longitudinal cohort from New York City. The African-Americans, Hispanics and Caucasians in their study had similar cumulative risk for AD in the presence of ApoE ε4, whereas in its absence the cumulative risks to the African-Americans and Hispanics were 4 and 2 times higher, respectively, even after adjusting for education and family history of AD. The authors concluded that these results provided evidence for the presence of other genetic and non-genetic risk factors for AD in these two ethnic groups. Another longitudinal study on an African-American cohort from Chicago found similar non-significance for ApoE ε4 association with AD, whereas a longitudinal cohort from Indianapolis and a multi-center case-control study from the southeastern United States determined significant ApoE ε4 risk for AD, in clear contrast. A recent review discusses these opposing results in the African-American population as well as potential reasons for this discrepancy in the literature. The ApoE ε4 allele frequencies are higher in controls from those studies which fail to show significant association [98, 101], compared to those which do [100, 102], suggesting different underlying population structure.
Indeed, the ApoE ε4 allele control frequencies from the non-significant studies are similar to those seen in the Yoruban-African population from Nigeria, where AD association with ApoE is also lacking. The incidence of AD is also lower in the Yoruban-African population, in comparison to the African-American populations from studies both confirming and refuting ApoE-AD association [98, 101–103]. Thus, the lack of association despite high incidence rates in the cohorts from Chicago and New York suggests a) the presence of factors which prevent development of AD in the controls from these cohorts with the ApoE ε4 allele; b) the presence of risk factors other than ApoE which lead to the development of AD in non-ApoE ε4 carriers; c) the presence of gene x gene or gene x environment interactions specific to or more prominent in certain populations; d) the presence of population admixture which leads to false negative results; e) a combination of these.
The association studies in Hispanics have similarly been equivocal [95, 98]. More recent studies have demonstrated significant ApoE association and age-at-onset effect in Caribbean Hispanics with familial (but not sporadic) LOAD [105, 106], suggesting a role for additional genes and other gene x ApoE effects in this population. The discrepancies in the ApoE ε4 frequencies and AD associations between Hispanic populations of different source and geography imply the effects of genetic and possibly environmental heterogeneity in this ethnic group as well.
These findings highlight several important issues regarding AD and genetic studies of complex diseases in general: 1) Large studies from multiple centers are required to dissect complex genetics; 2) These studies need to control for potential population substructure arising from genetic heterogeneity; 3) There likely exist multiple genetic and non-genetic factors accounting both for risk of and protection from AD. Analysis of a variety of ethnic groups and utilizing their genetic and environmental differences may prove to be highly useful in the detection of these factors.
The population attributable risk for AD or dementia due to the ApoE ε4 allele was estimated to be between 20–70% [97, 99]. These estimates provide further evidence for the existence of additional genetic factors underlying the risk of AD.
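For readers unfamiliar with the quantity, population attributable risk is often computed with Levin's formula, which combines the frequency of the exposure with the relative risk it confers. The sketch below is illustrative only: the cited studies may have used a different estimator, and the carrier frequency and relative risk shown are hypothetical values, not figures from those studies.

```python
def population_attributable_risk(exposure_freq: float, relative_risk: float) -> float:
    """Levin's formula: fraction of cases attributable to the exposure."""
    excess = exposure_freq * (relative_risk - 1.0)
    return excess / (1.0 + excess)


# Hypothetical inputs: 25% of the population carries the risk allele and
# carriers have three times the risk of non-carriers.
par = population_attributable_risk(exposure_freq=0.25, relative_risk=3.0)
print(f"PAR = {par:.0%}")
```

The formula makes clear why estimates can span a wide range: the result is sensitive both to the allele frequency in the population studied and to the size of the risk estimate.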
Clinicopathologic series that estimated the role of ApoE in diagnosing AD did not support a rationale for its genotyping in the clinical setting [108, 109]. Mayeux et al. assessed the accuracy of clinical diagnosis alone, ApoE genotyping alone and their combination in a clinicopathologic series of 2188 patients from 26 AD centers, 1833 of whom had clinical and 1770 of whom had pathologic diagnosis of AD, with 418 patients who had pathologic diagnosis of other causes of dementia. The sensitivity and specificity of clinical diagnosis alone were 93% and 55%, respectively, whereas ApoE genotyping alone had 65% sensitivity and 68% specificity. The addition of ApoE genotyping to the clinical diagnosis of AD decreased the sensitivity to 61% but increased specificity to 84%, thus reducing the false positive rate. Tsuang et al. reported similar findings in a community-based longitudinal series of 132 patients with cognitive complaints and subsequent autopsy. The sensitivity and specificity of clinical diagnosis alone were 84% and 50% in this study, whereas those for ApoE alone were 59% and 71%, respectively. In the presence of clinical diagnosis of AD, ApoE decreased sensitivity to 49% and increased specificity to 84%. In the absence of clinical diagnosis of AD, ApoE increased sensitivity to 94% and decreased specificity to 37%, indicating a high false positive rate of diagnosing AD in individuals with an ApoE ε4 allele but no pathologic evidence of AD. Although ApoE genotyping increased specificity (decreased the false positive rate) when a clinical diagnosis of AD is made, this association is not absolute: there are individuals who lack the ApoE ε4 allele and have pathologically confirmed AD, as well as those with ApoE ε4 and cognitive impairment but non-AD pathology. Thus, ApoE genotyping is not advocated as part of the diagnostic work-up for AD.
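The direction of these changes is what one expects when two imperfect tests are combined. As a rough sketch, assuming (our assumption, not a claim made by either study) that clinical diagnosis and ApoE ε4 status are conditionally independent given the true pathology, requiring both tests to be positive multiplies the sensitivities, while accepting either test as positive multiplies the specificities. The function names are illustrative. With the single-test figures quoted above, this simple model lands within about two percentage points of the reported combined values.

```python
from typing import Tuple


def both_positive(sens_a: float, spec_a: float,
                  sens_b: float, spec_b: float) -> Tuple[float, float]:
    """Call disease only if BOTH tests are positive (independence assumed)."""
    sensitivity = sens_a * sens_b
    specificity = 1.0 - (1.0 - spec_a) * (1.0 - spec_b)
    return sensitivity, specificity


def either_positive(sens_a: float, spec_a: float,
                    sens_b: float, spec_b: float) -> Tuple[float, float]:
    """Call disease if EITHER test is positive (independence assumed)."""
    sensitivity = 1.0 - (1.0 - sens_a) * (1.0 - sens_b)
    specificity = spec_a * spec_b
    return sensitivity, specificity


# Single-test values quoted in the text (clinical diagnosis, then ApoE alone).
mayeux = (0.93, 0.55, 0.65, 0.68)
tsuang = (0.84, 0.50, 0.59, 0.71)

print("Mayeux, both positive:  ", both_positive(*mayeux))
print("Tsuang, both positive:  ", both_positive(*tsuang))
print("Tsuang, either positive:", either_positive(*tsuang))
```

The trade-off is structural: any rule that requires agreement between tests gains specificity at the cost of sensitivity, and the reverse holds for a rule that accepts either test.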
Given the importance of early detection of AD developing from prodromal cognitive impairment states, multiple studies focused on the use of ApoE genotyping as a predictor for the development of AD. Most population- or clinic-based longitudinal prospective studies which utilized mild cognitive impairment (MCI) or other clinical criteria for pre-dementia cognitive impairment found ApoE ε4 to be a significant predictor for the development of AD [111–114]. In two longitudinal studies assessing cognitive testing, ApoE ε4 was found to be a predictor for development of AD when used independently but not in combination with cognitive assessment [115, 116]. This is in contrast to the findings by Petersen et al., where ApoE ε4 was a strong predictor for conversion from MCI to AD, even in the presence of cognitive testing. The discrepancy may arise from the definition of cognitive state in the starting populations in these studies. Cervilla et al. assessed conversion from a cognitively normal state, and Lee et al. studied subjects with questionable dementia. Both of these populations are likely more heterogeneous than the MCI subjects studied by Petersen et al. In conclusion, although ApoE ε4 appears to be an important predictor of conversion to AD, especially from well-defined cognitive impairment states, its use in the clinical setting is not substantiated. Nonetheless, future studies of other potential premorbid biomarkers of AD need to include this genetic factor for increased accuracy.
The extensive data from prior twin, segregation, family aggregation, EOFAD autosomal dominant families and ApoE association studies discussed above have led to the conclusion that AD has a substantial genetic component not explained by the known EOFAD mutations or ApoE. This, along with the development of high-throughput techniques for genotyping an ever-expanding number of markers (microsatellites and single nucleotide polymorphisms, SNPs), statistical methodologies geared towards identification of genetic loci in the absence of known Mendelian patterns of inheritance, and collective efforts to obtain collections of LOAD families, relative pairs and case-control series, led to the pursuit of a number of whole genome linkage and association studies in the literature [117–136]. A summary of the key characteristics of these studies is presented in Table 2. As seen here, there are essentially three groups of studies: “A” denotes those whole genome linkage studies on AD families or sibpairs [117, 121, 126, 127, 129, 132, 135, 136]. These studies are independent in that they were either performed entirely on ethnically distinct datasets of Caribbean Hispanic [126, 132], Amish or Swedish subjects; or, even if they did have overlapping data (NIMH or NIA datasets), there was considerable non-overlapping data as well [117, 121, 127, 129]. The initial linkage analysis by Pericak-Vance et al., performed on a total of 54 LOAD families, may be a subset of their subsequent work on a significantly larger dataset, based on the similar sources of the study subjects, although the extent of overlap is not explicitly mentioned in their latter publication. Nonetheless, the results from their initial study are presented here separately (Tables 3–8), because this was the first whole genome linkage analysis conducted in LOAD and also because it led to the identification of the chromosome 12 locus, subsequently replicated in other studies [126, 127, 130].
Although the extent of overlap is not clearly evident between the three predominantly US sample-based studies utilizing the NIMH and NIA datasets [121, 127, 129], Blacker et al.  tried to address this issue by reporting results from both their entire dataset, as well as the subset of subjects non-overlapping with the first  and second stages  of the Washington University study. Here, their results from the non-overlapping (non-Kehoe) subset are depicted.
The second type of study denoted by “B” identifies whole genome association studies conducted on case-control series [119, 123, 130]. These three studies are entirely independent and assess subjects within the US , Finland  or an inbred Arab community . Their results are shown in Tables 3–8. Studies in group “C” are performed on datasets which have complete or major overlaps with other published genome scans [122, 124, 125, 128, 131, 133, 134]. Group C studies re-assess these datasets either using alternative analytical strategies, such as conditional linkage analysis , ordered subsets analyses , linkage analysis with covariates [128, 133, 134]; employing alternative phenotypes such as age at onset  or defined subsets within the data such as LOAD with psychosis . The summary results from these alternative approaches are depicted in the text rather than Tables 3–8. Finally, the first stage of the Washington University genome scan, which is a subset of their subsequent study  and the Pericak-Vance et al. study from 1998 , which appears to be the same study as ref. 117, are denoted as a separate category “D”. Their results are not discussed separately.
As seen in Table 2, most of the studies are based on AD families or sibpairs with age at onset greater than 60 [117, 118, 119, 120, 121, 123, 124, 127, 128, 130, 134, 136]. Most studies with both late and early/mixed age at onset datasets showed separate results for these different age groups [126, 129, 132, 133]. Where applicable, the age subsets for the results are shown in Tables 3–8. Many of the studies performed subset analyses based on ApoE genotypes [117, 124, 126, 127, 136], though their definitions of E4+ (ApoE ε4 +) or E4− subsets varied between the studies. Again, results for the various ApoE subsets are shown in Tables 3–8.
Tables 3–8 summarize the results of the whole genome linkage (group A studies from Table 2) and association (group B studies from Table 2) studies in the literature. Different analytical approaches have been used between and within the studies. Where there is more than one analytical approach for a given study, findings from the approach yielding the most significant result are depicted. Likewise, when there are subset analyses, the results of the subset with the most significant result are used, and this is depicted in the table. For multi-stage genome scans, the results from the second stage or follow-up study are used where available. When both twopoint and multipoint linkage results are available, the latter result is shown. LOD scores (parametric or non-parametric linkage results) above 1.0 or p values below 0.05 are shown, unless study authors determined a more significant cutoff in their paper. When ≥2 “interesting” results exist on a single chromosome separated by an arbitrary distance cutoff of >15 cM, each one is reported separately. Otherwise, the most significant result is reported. When a multipoint linkage curve spans >15 cM with ≥1 peaks separated only by sharp “dips” (as opposed to a plateau of “no suggestive linkage” with LOD scores consistently <1 in-between), the highest multipoint result is reported. Otherwise, each multipoint result is reported separately. For locations of the multipoint LOD score results, unless given explicitly in the paper, the nearest marker location is utilized.
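The distance rule described above can be sketched in a few lines. This is one possible reading of the rule rather than the exact procedure used to build Tables 3–8: results on a chromosome are kept if their LOD score exceeds the cutoff, neighbouring results no more than 15 cM apart are grouped, and only the strongest result in each group is reported. The function name and the example positions and LOD scores are hypothetical, and the special handling of multipoint curves with sharp dips is not modelled.

```python
from typing import List, Tuple

Hit = Tuple[float, float]  # (map position in cM, LOD score)


def summarize_chromosome(hits: List[Hit], lod_cutoff: float = 1.0,
                         gap_cm: float = 15.0) -> List[Hit]:
    """Report the strongest hit within each group of nearby hits."""
    kept = sorted(h for h in hits if h[1] > lod_cutoff)  # sort by position
    reported: List[Hit] = []
    cluster: List[Hit] = []
    for hit in kept:
        # A gap larger than the cutoff closes the current group.
        if cluster and hit[0] - cluster[-1][0] > gap_cm:
            reported.append(max(cluster, key=lambda h: h[1]))
            cluster = []
        cluster.append(hit)
    if cluster:
        reported.append(max(cluster, key=lambda h: h[1]))
    return reported


# Hypothetical results for one chromosome from one study.
example = [(10.0, 1.4), (18.0, 2.1), (22.0, 0.8), (60.0, 1.2), (64.0, 1.9)]
print(summarize_chromosome(example))  # [(18.0, 2.1), (64.0, 1.9)]
```

Grouping by the gap between neighbouring results means a long chain of closely spaced hits is treated as one region, which is a design choice of this sketch.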
Figure 1 is a compilation of all chromosomes where >1 suggestive/significant result exists in >1 study from Tables 3–8. The plain karyotype figures (not showing the study results) are taken from the Ensembl website (http://www.ensembl.org/info/software/website/installation/build.html). The chromosomal locations for the markers are as per the March 2006 human genome assembly on the UCSC Genome Browser (http://genome.ucsc.edu/).
Despite the differences in the study subjects, study designs, and analytical approaches between the various studies, inspection of Tables 3–8 and Figure 1 reveals a number of chromosomal regions where evidence for linkage or association exists based on a number of studies. This is evident for chromosome 19, where all but two studies [130, 132] showed significant/suggestive evidence of linkage or association. One of these studies was performed on a specific ethnic group of an inbred Arab community where ApoE frequency is low both in controls and AD cases. The other study was on a collection of Caribbean-Hispanic families where there was evidence of association but no significant linkage. These results strongly suggest the existence of genetic risk factors for AD other than ApoE, especially in ethnic groups of non-European or non-North American origin. Furthermore, there are three studies where one or more loci yield greater significance for linkage or association compared with the ApoE locus on chromosome 19 [123, 127, 135]. Hiltunen et al. identified loci on chromosomes 1–6, 10, 13, and 18 in their study of a Finnish case-control series where evidence for association was greater than that for ApoE. They reported simulated p values. Myers et al. found the strongest evidence of linkage on chromosome 10q, with a LOD score of 3.9 in their whole dataset vs. 1.3 for chromosome 19q. Finally, Hahs et al., in their study of extended Amish families from the US, where ApoE was previously shown not to be an important risk factor for AD, identified both novel (Ch 4–7, 11, 18–20) and previously reported (Ch 4–7, 11, 18–20) regions with evidence for linkage. Thus, there is ample evidence from these whole genome linkage and association studies for the existence of novel LOAD risk loci.
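To put LOD scores such as 3.9 and 1.3 on a more familiar scale, a LOD score can be converted to an approximate pointwise p value. The sketch below uses the standard asymptotic relation in which 2·ln(10)·LOD is treated as a chi-square statistic with one degree of freedom, tested one-sided. This is an approximation that we assume for illustration: it ignores genome-wide multiple testing, and multipoint or non-parametric statistics from the individual studies may follow different distributions. The function name is hypothetical.

```python
import math


def lod_to_pointwise_p(lod: float) -> float:
    """Approximate one-sided pointwise p value for a LOD score (1 df)."""
    chi2 = 2.0 * math.log(10.0) * lod
    # For 1 df, the chi-square upper tail equals erfc(sqrt(x / 2));
    # halve it because linkage is tested in one direction only.
    return 0.5 * math.erfc(math.sqrt(chi2 / 2.0))


for lod in (1.0, 1.3, 3.0, 3.9):
    print(f"LOD {lod:.1f} -> pointwise p ~ {lod_to_pointwise_p(lod):.1e}")
```

Under this approximation a LOD score of 3.0 corresponds to a pointwise p value of about 1 in 10,000, which is why much higher thresholds than p<0.05 are needed before a result is considered significant across a whole genome scan.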
Given the plethora of positive findings, it is important to be able to discern the true positive from the false positive results. Evidence of linkage or association from multiple independent studies is an important indicator that a result is a true positive. Although studies with overlapping datasets [117, 121, 127, 129] render such a comparison somewhat difficult, attempts are being made to address this analytical difficulty. Blacker et al. reported results of their linkage analyses on both overlapping and non-overlapping (non-Kehoe) datasets separately, where the latter results can be seen as an independent assessment of findings by Pericak-Vance et al. [117, 121] and Myers et al. Despite these confounders, several chromosomal loci emerge as potential risk loci for AD. Chromosomal loci 1p31.1 [129, 132, 135], 2p24.1-23.2 [123, 130], 3p14.1-2 [129, 135], 3q27.2-28 [123, 132, 135], 4p13-15.1 [123, 132, 135], 4q31.3-32.1 [117, 121, 129, 135], 4q34.3-35.2 [129, 132, 135], 13q12.12 [121, 123], 14q22.2 [129, 132], 14q24.1-2 [123, 135], and 18q22.3-23 [121, 132] have been implicated in at least two completely independent whole genome linkage and/or association studies. Of the chromosomal loci with independent replication, those on chromosomes 6 [121, 123, 127, 129, 132], 9 [121, 127, 129, 130], 10 [119, 121, 123, 127, 130, 135] and 12 [117, 126, 127] received the greatest interest due to the large number of confirmatory studies, the strength of these results and the existence of multiple strong candidate genes in these regions with positive association results from one or more studies.
Chromosome 6p21.2-12.3 [117, 123, 127, 132] and 6q27 [121, 129] regions have been implicated in whole genome linkage and association studies. Initially, Pericak-Vance et al. identified linkage to chromosome 6p21.2 and 6q15 in a follow-up analysis of 38 LOAD families, where evidence for linkage was also replicated for chromosomes 4, 12 and 20. In their larger subsequent analyses of 466 LOAD families, continued linkage signal was observed for the 6q27 region. The chromosome 6p21.1-12.1 region showed significant association in a whole genome association study from Finland. Myers et al., in a sample with “as much as 80%” overlap with Pericak-Vance et al., also detected linkage to the chromosome 6p21.1-12.3 region in their ApoE ε4+ sample. Interestingly, their maximum multipoint LOD score is near marker D6S1017, which showed evidence of association in the Finnish association study. Blacker et al. confirmed the linkage in the 6p21 and 6q27 regions, which is perhaps not surprising given their sample overlap with the two prior linkage studies [121, 127]. Nonetheless, the D6S1017 marker also yielded a twopoint LOD score of 1.2 in their non-Kehoe families, which do not overlap with the Myers et al. study. Furthermore, this subset continued to have suggestive multipoint linkage to the 6q27 region. Finally, in a Caribbean-Hispanic series, evidence for linkage was identified to chromosome 6p22.3 and 12.3 regions. These studies provide independent evidence for the existence of two AD risk regions of interest on chromosome 6, on the p and q arms. A number of candidate genes have been analyzed in these regions, which are depicted graphically on the AlzGene website (http://www.alzforum.org/res/com/gen/alzgene/) and reviewed previously [141, 142]. In a recent meta-analysis of the available candidate gene studies compiled by the AlzGene database, tumor necrosis factor (TNF) emerged as a promising candidate gene. 
The rs4647198 (-1031) SNP in a transcriptional regulatory site for TNF showed significant association with AD in a meta-analysis of three studies .
Several chromosome 10 regions were identified as candidate AD risk loci in numerous linkage and association studies [119, 121, 123, 127, 129, 130, 132, 135]. Zubenko et al. identified significant AD association with D10S1423 on 10p12.33 in their 10 cM whole genome scan of 100 AD cases and 100 controls. Chromosome 10p11.23, ~13 cM downstream of this region, showed suggestive linkage in 199 autopsy-confirmed LOAD families studied by Pericak-Vance et al. Two studies implicated the more upstream chromosome 10p region [123, 129], while Farrer et al. confirmed the findings in the 10p11.23 region. The 10p13 region implicated in the Finnish association study and the 10p14 region identified with linkage in early/mixed onset non-Kehoe LOAD families, along with the initial association region by Zubenko et al., span ~17 cM on the short arm of chromosome 10. Whether this region and the more downstream 10p11.23 region, identified independently in a US-based linkage study and an inbred Arab-based association study, represent the same locus is not as yet known.
The most significant linkage evidence for chromosome 10 identified 10q21.3 as a potential AD risk region, with a maximum multipoint LOD score of 3.9 at ~82.2 cM, which is significant at the whole genome level . In this study of 451 affected sibpairs by Myers et al., the ApoE ε4+ group yielded the strongest linkage signal. Importantly, in an independent study of 10 extended LOAD families, using plasma Aβ42 levels as a quantitative phenotype, Ertekin-Taner et al. found evidence of linkage to the same region on chromosome 10 with greatest evidence of linkage in those 5 extended LOAD families whose AD proband had extremely high plasma Aβ levels . This study was the first one in AD genetics using a quantitative biologically relevant phenotype for mapping an AD risk locus. Jointly, these two independent studies provided evidence for a novel AD risk locus on 10q21.3 which acts via the Aβ pathway [143, 144]. The linkage peak in the Myers et al. study covers a broad region (59 cM–103 cM). In a study of Amish pedigrees, Hahs et al. found linkage to ~101 cM  on 10q22.3. Similarly, Farrer et al. found evidence of association in their inbred Arab sample to a number of chromosome 10 markers spanning 81–115 cM, with greatest single marker association at 10q22.3 (97 cM) and four contiguous markers showing association between 105–115 cM . Not surprisingly, Blacker et al. found linkage in their overlapping NIMH sample at 10q22 and 10q24, but not in the non-Kehoe sample. In another analysis of the same dataset using a different set of markers, this group reported significant linkage and association at 115–127 cM near the IDE gene . Finally, using age at onset as a quantitative phenotype Li et al.  identified linkage to 10q25.3 at 139 cM. These studies provide evidence for the existence of one or more AD risk loci on chromosome 10, some/all of which appear to affect both Aβ levels and age at onset. 
According to the AlzGene website (www.alzgene.org), 69 candidate genes have been studied on chromosome 10 . Some of these candidate gene studies will be discussed briefly below.
The first evidence of linkage to chromosome 12 came from Pericak-Vance et al., who detected the strongest linkage signal with a maximum multipoint LOD score of 3.5 at 12p11.23 in their 54 LOAD families. The evidence for linkage was strongest in their subset of ApoE ε4− families, defined as those families with at least one patient who lacked the ApoE ε4 allele. This group subsequently genotyped additional markers and re-analyzed the same dataset using different subgroups defined by their ApoE genotypes and underlying pathology. Their analyses of all affected sibpairs confirmed the initial linkage findings near 12p11.23 but also identified a second, smaller peak ~18 cM further downstream at 12q13.13. When the analyses were weighted by ApoE ε4− individuals, the peak linkage occurred at 12q13.2 and was significantly stronger than the ApoE ε4+ weighted linkage. Furthermore, subset analyses of the 8 families with Lewy body pathology yielded stronger linkage than that in the remaining 46 families. The families with Lewy body pathology had linkage to the 12p region whereas the other families linked to the 12q region. These results suggest the presence of one or more loci in the ~30 cM region encompassing the linkage peaks in the various subgroups at 12p and 12q. Myers et al. found suggestive linkage to chromosome 12 at 12p13.2, also with stronger results in the ApoE ε4− subset. Finally, Rogaeva et al. found the strongest evidence of linkage near marker D12S96 at 12q13.13, though their linkage region extended to 12p13.2, spanning a ~42 cM region (according to the marker locations from the Marshfield map). An independent analysis of Caribbean-Hispanic LOAD families found evidence of linkage at the 12p13.31 region, as well as the more downstream 12p12.1 regions, close to the Myers et al. and Pericak-Vance et al. linkage regions, respectively. The evidence of linkage in this study was stronger in the ApoE ε4− subset as well. 
Two independent association studies also found significant results on chromosome 12 [119, 130]; however, the loci identified in these reports were ~15 cM to 93 cM distal to the nearest linkage region.
Thus, similar to the findings on chromosomes 10 and 6, multiple potential LOAD risk regions have been identified on chromosome 12. More than 20 candidate genes have been analyzed on this chromosome, of which α2-macroglobulin (A2M), mapping to 12p13.31, and low-density lipoprotein receptor-related protein 1 (LRP1), at 12q13.3, received the greatest interest due to their potential involvement in the Aβ clearance pathway [reviewed in 148]. Both genes are excellent functional and positional candidates; however, their role as Aβ-susceptibility genes has not yet been established unequivocally due to inconsistent results from association studies [reviewed in 149]. Recently, LOAD association has been detected for glyceraldehyde-3-phosphate dehydrogenase (GAPD) variants in multiple series [150, 151].
Although chromosome 9 has received interest due to the evidence of linkage or association from several studies [121, 127, 129, 130], it should be noted that there is overlap between the datasets assessed by three of these studies [121, 127, 129], which therefore cannot be considered to provide independent evidence of linkage. Chromosome 9 is therefore considered last in this section. In a whole genome scan of 466 LOAD families, Pericak-Vance et al. identified a region on chromosome 9p with strong evidence for linkage, with a multipoint LOD score of 2.97. The evidence was even stronger in the autopsy-confirmed subset of 199 families, with a multipoint LOD score (MLS=4.31) even higher than that found in the ApoE region on chromosome 19 (MLS=3.42). The autopsy-confirmed subset also had another downstream region on chromosome 9q with evidence of linkage. In the substantially overlapping dataset analyzed by Myers et al., evidence for linkage was identified to regions on 9p and 9q. The 9p linkage coincides with that in the Pericak-Vance et al. study, although the 9q linkage findings for the two studies are ~46 cM apart. Blacker et al. also identified two suggestive linkage peaks on 9p21.1 and 9q22.2 in their overall dataset; however, they could not detect a suggestive signal in their non-overlapping, non-Kehoe families. Finally, Farrer et al., in an association analysis of an independent Arab case-control series, identified four markers on chromosome 9 spanning 32–47 cM (9p22.2-21.2) with significant LOAD association. Their most significant marker (AFM220XF2, p=2.3×10⁻⁷) resides in a region coinciding with two other linkage signals [121, 127] from overlapping datasets.
Several functional and positional candidate genes reside on chromosome 9, including DAPK1 (death-associated protein kinase) , ABCA1-2 (ATP binding cassette A1-2 transporters) [153, 154], VLDLR (very low density lipoprotein receptor) , and UBQLN1 (ubiquilin 1) , with roles in apoptosis , lipid metabolism [153–155] and protein degradation . Recent meta-analyses of genetic association studies on LOAD from the AlzGene database determined that variants in DAPK1 showed significant association in meta-analyses of 3–6 independent series  and also significantly altered expression of this gene .
In addition to the 9 whole genome linkage or association analyses in predominantly non-overlapping samples [117, 119, 121, 123, 126, 127, 129, 130, 132, 135, 136], seven linkage studies are presented here, which utilized alternative analytical strategies to re-assess the data in overlapping datasets [122, 124, 125, 128, 131, 133, 134]. We will briefly discuss their approaches and results, as they bring potential new understanding and evidence to the LOAD risk loci. Curtis et al.  re-analyzed the NIMH families, which have significant overlap with the genome-wide scans by Kehoe et al.  and Myers et al. , using ApoE genotypes as liability classes, thus introducing a novel method for simultaneous analysis of any loci conditional on a known risk locus. They were able to confirm the 10q21.3 linkage at D10S1211 and 12p13.2-1 at D12S358. Scott et al.  utilized “ordered subsets analysis” developed by Hauser et al.  to identify subsets of families ordered by their age-at-onset and which yielded stronger evidence of linkage to certain loci, compared to the analyses using the whole dataset. This approach was applied to 437 families with complete age-at-onset information, from the original set of 466 families in the Pericak-Vance et al. whole genome screen . Two novel loci were detected at chromosomes 2q34 in early onset AD (50–60 years) and 15q22 in very late onset AD (≥79 years), in addition to increasing the previously detected linkage signal at 9p from 3.0 to 4.6 (n=334 families, 60–75 years). The chromosome 2q34 locus also showed evidence of linkage in Caribbean Hispanics , along with two other regions encompassing 2q34 detected in Amish families . The 15q region is at least 17 cM away from the nearest risk region identified in another study .
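The core idea of ordered subsets analysis can be sketched as follows. This is a simplified illustration under our own assumptions, not the published implementation: families are ranked by a covariate such as mean age at onset, family-specific LOD scores at one fixed map position are summed as families are added one at a time, and the subset giving the largest summed LOD is retained. As we understand the full method, it also scans across map positions, considers both ranking directions and assesses significance by permutation, none of which is shown here. The function name and the example data are hypothetical.

```python
from typing import List, Tuple


def ordered_subset_max(families: List[Tuple[float, float]]) -> Tuple[int, float, float]:
    """families: (covariate value, family-specific LOD at a fixed position).

    Returns (number of families in the best subset, covariate cutoff,
    summed LOD of that subset), ranking families by ascending covariate.
    """
    ranked = sorted(families)
    best_n, best_cutoff, best_lod = 0, float("nan"), float("-inf")
    running = 0.0
    for n, (covariate, lod) in enumerate(ranked, start=1):
        running += lod  # LOD scores are additive across independent families
        if running > best_lod:
            best_n, best_cutoff, best_lod = n, covariate, running
    return best_n, best_cutoff, best_lod


# Hypothetical families: (mean age at onset in years, family LOD score).
example = [(71.0, -0.5), (58.0, 0.6), (80.0, -0.7), (62.0, 0.4), (66.0, 0.3)]
n, cutoff, lod = ordered_subset_max(example)
print(f"best subset: first {n} families (onset <= {cutoff}), LOD = {lod:.2f}")
```

In this toy example the linkage signal is concentrated in the earlier-onset families, so the summed LOD peaks before the later-onset families, which contribute negative scores, are added.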
Several groups included a number of variables as covariates in linkage analyses [128, 133, 134]. Olson et al. studied 272 sibpairs from the NIMH Genetics Initiative with complete covariate information on age-at-onset, current age and ApoE genotype, to determine the effects of these covariates on linkage. They were able to obtain stronger evidence of linkage to regions on chromosomes 5, 6, 9 and 21, previously identified in genome scans using the NIMH data [121, 127]. More interestingly, by including covariates they detected signals at chromosomes 14 and 20 in regions not previously identified. The most significant evidence came from chromosomes 20p and 21p near the APP region, where linkage was strongest in the oldest age group, lacking the ApoE ε4 allele. There was also significant epistasis between these two regions, suggesting a biological link. Avramopoulos et al. found linkage to the chromosome 14q24 region near the PSEN1 locus using presence of psychotic features as a covariate in a subset of the NIMH sibpairs. Their strongest linkage was detected in earlier onset AD families without comorbid hallucinations. Two other studies identified 14q24 as a risk region [123, 135]. In recent meta-analyses of 34 different studies, an intronic SNP in PSEN1 was found to have significant association with AD, drawing attention to the potential involvement of this established EOAD gene in LOAD as well. Bacanu et al. analyzed the subset of the NIMH families with psychotic symptoms and the ApoE ε4 allele and determined linkage to chromosomes 2p, 6q and 21q. These regions do not overlap with those from the Avramopoulos et al. study. Although both studies showed linkage to 2p in LOAD with psychotic features, their locations are ~40 cM apart. Holmans et al. analyzed the effects of multiple covariates including age-at-onset, rate of decline and ApoE on linkage in a LOAD series with major overlap with the Myers et al. study. 
Their most significant finding was on chromosome 21 in the NIMH dataset, where linkage was strongest in the sibpairs with high age-at-onset, thus confirming the results of Olson et al. . Their approach led to improved linkage findings in several other chromosomal loci.
Quantitative trait analyses in LOAD were first performed by Ertekin-Taner et al., who used plasma Aβ levels as the phenotype and found evidence of linkage to chromosome 10 at a locus overlapping with that from the Myers et al. study. Li et al. used age-at-onset as a quantitative phenotype in AD and PD families. Their strongest evidence for linkage in AD families came from chromosome 10q, at a locus >50 cM downstream of that of the Myers et al. and Ertekin-Taner et al. linkage.
These linkage studies with alternative analytical approaches have several important uses. They can: 1) lead to identification of novel loci by reducing heterogeneity and increasing power; 2) help confirm prior findings; 3) identify the subset of individuals in whom the effect of the risk locus is strongest, and thus potentially help guide future mapping/association studies; 4) shed some light on the pathophysiologic effects of the risk locus (e.g. decreases age at onset, acts via the amyloid pathway etc.).
There have been 968 association studies on 398 AD candidate genes to date, according to the AlzGene database as updated on February 8, 2007 (http://www.alzforum.org/res/com/gen/alzgene/). It is neither possible nor informative to provide a thorough review of these studies here. Instead, this section will focus on some general pitfalls and guidelines for association studies of complex diseases, highlighting a few AD candidate genes as examples. The lack of a universally accepted genetic risk factor(s) for LOAD besides ApoE, despite the substantial genetic component underlying this disorder and the large numbers of genes studied, is testimony in part to the complexity of the genetic underpinnings of AD as well as the inadequacy of some of the approaches utilized. Several reviews have highlighted potential reasons for the failure to identify the “next ApoE” gene for AD [141, 158–160]. These include 1) initial false positive results followed by lack of replication; 2) lack of power due to small sample size (false negatives); 3) lack of informative markers; 4) genetic heterogeneity (different sets of genes underlying the same AD phenotype); and 5) clinical heterogeneity (multiple clinical subtypes with different sets of susceptibility genes).
Although initial false positive associations could account for some of the lack of replication in LOAD association studies, they are unlikely to account for all cases of failure to replicate. Lohmueller et al. assessed 301 published studies on 25 single variant associations and found a substantial excess of studies that significantly replicated the original association results. They determined that of the 301 follow-up studies to an original significant association, 59, 26 and 10 studies showed significant associations with p<0.05, <0.01 and <0.001, respectively. These numbers were significantly higher than the numbers of significant association studies expected by chance alone (15, 3 and <1, respectively). Ioannidis et al. systematically assessed 370 published studies on 36 genetic associations by meta-analysis. Both studies concluded that the initial estimate of the genetic effect size from the original positive association study tended to be inflated [161, 162], and that true effect sizes were modest, with ORs of 0.5–2.0 in most cases. Given this, many association studies assessed by these authors, as well as many from the LOAD literature, will be sorely underpowered to detect true but modest effects. Two groups systematically analyzed AD association studies published within a given period of time [141, 160]. Bertram and Tanzi assessed 105 (38 positive, 67 negative) AD association studies published in 2003, whereas Blomqvist et al. analyzed 138 (86 positive, 52 negative) articles published between January 2004 and April 2005. Both studies excluded reports on ApoE. They both observed insufficient sample sizes. Blomqvist et al. found that 59% of the studies (82/138) had <500 subjects, whereas Bertram and Tanzi determined that ~20% of the studies had <200 individuals.
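The chance expectations quoted above for the 301 follow-up studies follow from multiplying the number of studies by each significance threshold. The Python sketch below reproduces that arithmetic and also computes a binomial tail probability for the observed counts. The tail calculation assumes that the 301 studies are independent tests of true null hypotheses, which is a simplification made here for illustration and not necessarily the test used by the cited authors; the function names are likewise hypothetical.

```python
from math import comb


def expected_by_chance(n_studies, alpha):
    """Expected number of studies significant at level alpha under the null."""
    return n_studies * alpha


def binomial_upper_tail(n, k, p):
    """P(X >= k) for X ~ Binomial(n, p), by direct summation."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


N_FOLLOW_UP = 301  # follow-up studies, as stated in the text
# Observed counts of significant studies at each threshold, from the text.
observed = {0.05: 59, 0.01: 26, 0.001: 10}

for alpha, count in observed.items():
    expected = expected_by_chance(N_FOLLOW_UP, alpha)
    tail = binomial_upper_tail(N_FOLLOW_UP, count, alpha)
    print(f"alpha={alpha}: expected {expected:.2f}, observed {count}, tail P={tail:.3g}")
```

Under these assumptions the expected counts round to the 15, 3 and <1 given in the text, and the observed counts lie far out in the upper tail of the chance distribution.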
Since sample sizes on the order of 1000–10,000 will often be required to achieve sufficient power to detect the expected modest effect sizes for most complex disease susceptibility genes, a large number of the AD association studies are underpowered. It should also be noted that small sample size not only reduces the power of follow-up studies, but is also an important determinant of an initial false positive association. Initial associations from studies with sample sizes of <150 were significantly less likely to be replicated than those from larger studies, suggesting a higher false positive rate for initial studies of smaller size.
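To make the sample size argument concrete, the following Python sketch approximates the power of a case-control comparison of allele frequencies for a given odds ratio. It is a minimal sketch under stated assumptions: a two-sided two-proportion z-test with a normal approximation, alleles treated as independent observations (two per subject), and a hypothetical control allele frequency of 0.3. The function names and all input values are illustrative and are not taken from the studies discussed.

```python
from math import sqrt
from statistics import NormalDist


def case_allele_freq(control_freq, odds_ratio):
    """Allele frequency in cases implied by the control frequency and allelic OR."""
    return odds_ratio * control_freq / (1 - control_freq + odds_ratio * control_freq)


def allelic_power(n_cases, n_controls, control_freq, odds_ratio, alpha=0.05):
    """Approximate power of a two-sided two-proportion z-test on allele counts."""
    p1 = case_allele_freq(control_freq, odds_ratio)
    p0 = control_freq
    a1, a0 = 2 * n_cases, 2 * n_controls  # two alleles per subject
    p_bar = (a1 * p1 + a0 * p0) / (a1 + a0)
    se_null = sqrt(p_bar * (1 - p_bar) * (1 / a1 + 1 / a0))
    se_alt = sqrt(p1 * (1 - p1) / a1 + p0 * (1 - p0) / a0)
    z = NormalDist()
    z_crit = z.inv_cdf(1 - alpha / 2)
    delta = abs(p1 - p0)
    # Probability of rejecting in either tail under the alternative.
    return z.cdf((delta - z_crit * se_null) / se_alt) + z.cdf(
        (-delta - z_crit * se_null) / se_alt
    )


# Hypothetical scenario: risk allele frequency 0.3 in controls, OR = 1.25.
for n in (100, 500, 1000, 5000):
    print(n, round(allelic_power(n, n, 0.3, 1.25), 3))
```

With these illustrative inputs, power is low for a few hundred subjects per group and approaches 1 only in the thousands, in line with the sample sizes discussed above.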
Association of a disease with a genetic variant does not imply causation and may simply be due to linkage disequilibrium (LD) between the tested marker and an untested susceptibility variant. Furthermore, more than one susceptibility variant could be present in a single gene, acting in an additive or multiplicative fashion. Both of these issues complicate association studies and could potentially lead to false negative results due to a) differing extents of LD between the tested marker and the untested susceptibility variant across populations or b) incomplete assessment of the true susceptibility conferred by a genic region. To address these issues, choosing potentially functional variants (e.g. missense SNPs, insertions/deletions, variants in highly conserved regions) and assessing multiple variants and haplotypes have been advocated [141, 163]. The two recent surveys of AD association studies have revealed that the majority (>60%) were performed on a single variant [141, 160] and fewer than 20% pursued haplotype analysis.
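The extent of LD between a tested marker and an untested variant is commonly summarized by the coefficients D, |D′| and r². The Python sketch below computes them from a haplotype frequency and the two allele frequencies using the standard definitions; the function name and the example frequencies are hypothetical and chosen only for illustration.

```python
def ld_stats(hap_freq_ab, freq_a, freq_b):
    """Return (D, |D'|, r^2) for two biallelic loci.

    hap_freq_ab: frequency of the haplotype carrying allele A and allele B.
    freq_a, freq_b: frequencies of allele A (locus 1) and allele B (locus 2).
    """
    d = hap_freq_ab - freq_a * freq_b
    # Largest |D| attainable given the allele frequencies and the sign of D.
    if d >= 0:
        d_max = min(freq_a * (1 - freq_b), (1 - freq_a) * freq_b)
    else:
        d_max = min(freq_a * freq_b, (1 - freq_a) * (1 - freq_b))
    d_prime = abs(d) / d_max if d_max > 0 else 0.0
    variance_product = freq_a * (1 - freq_a) * freq_b * (1 - freq_b)
    r_squared = d * d / variance_product if variance_product > 0 else 0.0
    return d, d_prime, r_squared


# Hypothetical case: a rare untested variant (10%) found only on haplotypes
# that carry a common tested marker allele (50%).
d, d_prime, r_squared = ld_stats(0.10, 0.10, 0.50)
print(d, d_prime, r_squared)
```

In this hypothetical case |D′| is 1 while r² is only about 0.11, so the common marker would be a weak proxy for the rare variant despite "complete" LD. This is one way a true susceptibility variant can go undetected when only a single marker is typed.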
Genetic heterogeneity due to ethnic or environmental differences leading to false negative results, as well as false positive results secondary to population substructure, have been raised as potential problems affecting association studies [reviewed in 163], though their true impact on AD genetics is yet to be determined. Finally, use of biologically relevant, quantitative phenotypes has been proposed as a means to overcome potential clinical heterogeneity, as such quantitative traits may a) be a less heterogeneous and more direct correlate of the genetic variation(s); b) be less prone to the imprecisions of a clinical phenotype; c) be amenable to inclusion of subjects with pre-clinical disease; and d) allow for downstream functional testing of putative susceptibility variants [158, 160, 163]. More recently, the use of quantitative traits such as Aβ levels, mini-mental state examination scores, CSF tau levels, age-at-onset, pathological disease burden and CSF cholesterol levels [164–168] has emerged as a novel approach to studying candidate AD genes.
Despite the pitfalls and shortcomings of association studies on AD, a recent meta-analysis of 127 polymorphisms in 69 AD candidate genes identified 20 polymorphisms in 13 genes (other than ApoE and related genes) with significant results. It should be noted that the effects of these variants were modest, with average ORs of 0.82–1.25. Additionally, several recent reports on AD candidate genes have applied many of the suggested guidelines for complex disease association studies and found significant results [164–167, 174, 176, 178, 179]. While the role of these genes in AD remains to be established by further confirmatory studies, and despite the presence of negative results, these findings are nonetheless encouraging.
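To illustrate how pooling can make a modest effect detectable, the Python sketch below combines odds ratios by fixed-effect inverse-variance weighting on the log scale. This is a generic textbook approach and not necessarily the model used in the meta-analysis cited above. The function name and the input values are hypothetical, and standard errors are recovered from 95% confidence intervals assuming they are symmetric on the log scale.

```python
from math import exp, log, sqrt

Z_95 = 1.96  # normal quantile for a two-sided 95% confidence interval


def pooled_odds_ratio(studies):
    """Fixed-effect inverse-variance pooling of odds ratios.

    studies: list of (odds_ratio, ci_low, ci_high) tuples with 95% CIs.
    Returns (pooled_or, pooled_ci_low, pooled_ci_high).
    """
    total_weight = 0.0
    weighted_sum = 0.0
    for odds_ratio, ci_low, ci_high in studies:
        # Standard error of log(OR), recovered from the CI width.
        se = (log(ci_high) - log(ci_low)) / (2 * Z_95)
        weight = 1.0 / (se * se)
        total_weight += weight
        weighted_sum += weight * log(odds_ratio)
    pooled_log = weighted_sum / total_weight
    pooled_se = sqrt(1.0 / total_weight)
    return (
        exp(pooled_log),
        exp(pooled_log - Z_95 * pooled_se),
        exp(pooled_log + Z_95 * pooled_se),
    )


# Four hypothetical studies, each with OR = 1.2 and a CI that includes 1.
toy_studies = [(1.2, 0.9, 1.6)] * 4
pooled, low, high = pooled_odds_ratio(toy_studies)
print(round(pooled, 3), round(low, 3), round(high, 3))
```

In this toy input no single study excludes an OR of 1, but the pooled interval does, because the pooled standard error shrinks as studies are added.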
Chromosome 10 has been an area of great interest in AD due to the convergence of multiple linkage and association signals as well as the presence of multiple excellent positional and functional candidate genes. We will thus give examples of several AD candidate gene association studies from this chromosome highlighting study designs as well as briefly summarizing the conclusions.
Ertekin-Taner et al. analyzed alpha-T-catenin (VR22) as a candidate gene on chromosome 10 due to its location and indirect interaction with presenilin 1 via β-catenin. They identified two intronic SNPs that showed association with plasma Aβ42 levels in two independent sets of families, and showed that these SNPs accounted for their plasma Aβ linkage on chromosome 10. They also bounded the association within the large VR22 gene by analyzing an additional 49 SNPs within and around this gene, none of which showed as strong an association. The strengths of this study were the use of a quantitative phenotype, multiple variants, internal replication, bounding of the association within a gene and assessment of linkage conditional on association. Subsequently, three affirmative [169–171] and four negative [172–175] association studies were published on this genic region.
In 2003, Prince et al.  identified a haplotype block spanning insulin degrading enzyme (IDE), which showed significant association with AD and/or its quantitative traits (mini-mental state examination scores, CSF tau levels, age at onset) in an analysis of 5 independent case-control series. The strengths of this study were the use of multiple variants, haplotype analysis, multiple independent series from different populations and both clinical and quantitative phenotypes. Assessing the same haplotypes in two case-control and one family series, Ertekin-Taner et al.  found evidence of association with AD and plasma Aβ42 levels with directions of effect similar to those from the Prince et al. study. Sixteen additional association studies exist on the IDE gene with five confirmatory reports, all of which are summarized on the AlzGene website .
Li et al.  screened 22,000 genes by an expression assay and identified 52 genes with significant expression difference between ADs and controls, four of which resided on chromosome 10. They assessed these candidate genes in their AD and PD series, and determined variants in glutathione S-transferase omega-1 and -2 genes (GSTO1, GSTO2) to be significantly associated with age-at-onset of AD and PD. They subsequently determined that this association contributed significantly to their linkage results on chromosome 10 . The main strength of this study is the selection of candidate genes based on a functional screen, in addition to use of multiple markers, multiple series, quantitative phenotypes and assessing effect of association on linkage. Of the four follow-up studies on GSTO1-2, two show a positive trend [summarized in AlzGene, 140].
Grupe et al. genotyped ~1,400 gene-based SNPs in the chromosome 10q region and identified 69 with significant association in their exploratory series. Of these, one SNP in LOC439999 showed significant association in four of their six case-control series. The strengths of this study were the use of a high-throughput screening strategy to survey a large number of genes in a non-hypothesis driven fashion and the presence of internal replication using >1000 cases and controls. Their association did not account for the linkage on chromosome 10. Only two follow-up studies, both negative, exist for this association to date. Using a similar screening approach with 1206 chromosome 10 SNPs in a Japanese case-control series with no ApoE ε4 allele, Kuwano et al. identified six SNPs in a two-stage association study, three of which resided in the dynamin binding protein gene (DNMBP). The strengths of this study are similar to those of Grupe et al. Kuwano et al. also determined decreased expression levels for DNMBP in AD vs. control brains, thus providing some functional correlate for this gene.
Finally, though not on chromosome 10, recent findings on the neuronal sortilin related receptor SORL1 on chromosome 11 will be briefly discussed to highlight the study approach and the challenges of complex disease association studies. Rogaeva et al. assessed SNPs in several genes from the vacuolar protein sorting family and identified SORL1 as a candidate for further analyses. Utilizing a two-stage approach with a total of 6 series from different ethnic groups, with two family-based series in the exploratory set and four combined case-control and family series in the replication set, they assessed SORL1 variants in >1500 cases and controls. In addition, they analyzed three more independent series of 1405 ADs and 2124 controls. The authors found significant association with SORL1 variants in multiple independent data sets, though there was no single variant with significant association across all data sets. There were also considerable functional data demonstrating reduced expression of SORL1 in subjects who are carriers of a SORL1 risk haplotype. Finally, the authors showed experimental results depicting the role of SORL1 in differential sorting of the APP holoprotein. The strengths of this study are the use of large numbers of independent series of both family and case-control type, with multiple different ethnic groups, hypothesis-driven assessment of candidate genes in a specific pathway, multiple single-SNP and haplotype variants assessed, and the presence of expression data in support of the association findings.
Alzheimer’s disease is one of the most challenging disorders of the century due to its personal and public impact. Discovering the underlying genetics of this common, complex disorder holds out the hope of its early detection, prevention and treatment. With the identification of exceedingly rare, early-onset autosomal dominant familial AD gene mutations over the past two decades, substantial progress has been made in understanding the disease pathophysiology. The most common, late-onset form of AD has a substantial genetic component which remains unexplained despite the identification of the susceptibility allele ApoE ε4. Evidence from whole genome linkage, association and candidate gene studies strongly implicates multiple different genetic loci in susceptibility to AD. The ever-growing numbers of variants in the human genome with systematic cataloguing efforts, development of high-throughput genotyping and phenotyping assays, generation of well-characterized and large study series, and advent of novel analytical strategies will provide the tools essential for studying AD genetics. Though significant progress is as yet lacking in this area, recent studies hold promise for illuminating important genetic aspects of this multifaceted disorder.