We summarize the methodological contributions from Group 3 of Genetic Analysis Workshop 17 (GAW17). The overarching goal of these methods was the evaluation and enhancement of state-of-the-art approaches in integration of biological knowledge into association studies of rare variants. We found that methods loosely fell into three major categories: (1) hypothesis testing of index scores based on aggregating rare variants at the gene level, (2) variable selection techniques that incorporate biological prior information, and (3) novel approaches that integrate external (i.e., not provided by GAW17) prior information, such as pathway and single-nucleotide polymorphism (SNP) annotations. Commonalities among the findings from these contributions are that gene-based analysis of rare variants is advantageous over single-SNP analysis and that the minor allele frequency threshold used to identify rare variants may influence power and thus needs to be carefully considered. A consistent increase in power was also identified by considering only nonsynonymous SNPs in the analyses. Overall, we found that no single method had an appreciable advantage over the other methods. However, methods that carried out sensitivity analyses by comparing biologically informative to noninformative prior probabilities demonstrated that integrating biological knowledge into statistical analyses consistently enabled at least subtle improvements in the performance of the statistical methods applied to these simulated data. Although these statistical improvements reflect the simulation model assumed for GAW17, our hope is that the simulation models provide a reasonable representation of the underlying biology and that these methods can thus be of utility in real data.
exome sequencing; pathway analysis; gene association
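As an illustration of the first category above (index scores that aggregate rare variants at the gene level), the following is a minimal sketch and not any contributor's actual method. It assumes genotypes are coded as minor-allele counts (0, 1, or 2); the function names, the toy data, and the threshold value are hypothetical.

```python
from typing import Dict, List


def minor_allele_frequency(genotypes: List[int]) -> float:
    """MAF for one variant, assuming genotypes are minor-allele counts (0/1/2)."""
    return sum(genotypes) / (2 * len(genotypes))


def burden_scores(variants: Dict[str, List[int]], maf_threshold: float) -> List[int]:
    """Per-individual count of minor alleles over variants with MAF below the threshold."""
    n_individuals = len(next(iter(variants.values())))
    scores = [0] * n_individuals
    for genotypes in variants.values():
        if minor_allele_frequency(genotypes) < maf_threshold:
            for i, g in enumerate(genotypes):
                scores[i] += g
    return scores


# Toy gene with three variants typed in five individuals (hypothetical data).
toy_gene = {
    "snp_a": [0, 1, 0, 0, 0],  # MAF 0.1, treated as rare
    "snp_b": [0, 0, 0, 1, 0],  # MAF 0.1, treated as rare
    "snp_c": [1, 2, 1, 1, 0],  # MAF 0.5, excluded as common
}
scores = burden_scores(toy_gene, maf_threshold=0.2)
```

Changing `maf_threshold` changes which variants enter the score, which is one way the threshold choice can influence power.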
Genome-wide association (GWAS) methods have identified genes contributing to Parkinson disease (PD); we sought to identify additional genes associated with PD susceptibility.
A two stage design was used. First, individual level genotypic data from five recent PD GWAS (Discovery Sample: 4,238 PD cases and 4,239 controls) were combined. Following imputation, a logistic regression model was employed in each dataset to test for association with PD susceptibility and results from each dataset were meta-analyzed. Second, 768 SNPs were genotyped in an independent Replication Sample (3,738 cases and 2,111 controls).
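The abstract does not state which meta-analysis model was used. As a sketch only, the following assumes a fixed-effect, inverse-variance-weighted combination of per-dataset log odds ratios and their standard errors; the function name and the input values are hypothetical.

```python
import math
from typing import Sequence, Tuple


def fixed_effect_meta(betas: Sequence[float], ses: Sequence[float]) -> Tuple[float, float, float]:
    """Combine per-study log odds ratios using inverse-variance weights.

    Returns the pooled log odds ratio, its standard error, and a two-sided
    p-value from a normal approximation.
    """
    weights = [1.0 / se ** 2 for se in ses]
    total_weight = sum(weights)
    pooled_beta = sum(w * b for w, b in zip(weights, betas)) / total_weight
    pooled_se = math.sqrt(1.0 / total_weight)
    z = pooled_beta / pooled_se
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    return pooled_beta, pooled_se, p_value


# Hypothetical per-dataset estimates for a single SNP.
beta, se, p = fixed_effect_meta(betas=[0.2, 0.4], ses=[0.1, 0.1])
pooled_odds_ratio = math.exp(beta)
```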
Genome-wide significance was reached for SNPs in SNCA (rs356165, G: odds ratio (OR)=1.37; p=9.3 × 10−21), MAPT (rs242559, C: OR=0.78; p=1.5 × 10−10), GAK/DGKQ (rs11248051, T: OR=1.35; p=8.2 × 10−9 / rs11248060, T: OR=1.35; p=2.0 × 10−9), and the HLA region (rs3129882, A: OR=0.83; p=1.2 × 10−8), which were previously reported. The Replication Sample confirmed the associations with SNCA, MAPT, and the HLA region and also with GBA (E326K: OR=1.71; p=5 × 10−8, Combined Sample; N370: OR=3.08; p=7 × 10−5, Replication Sample). A novel PD susceptibility locus, RIT2, on chromosome 18 (rs12456492; p=5 × 10−5 Discovery Sample; p=1.52 × 10−7 Replication Sample; p=2 × 10−10 Combined Sample) was replicated. Conditional analyses within each of the replicated regions identified distinct SNP associations within GBA and SNCA, suggesting that there may be multiple risk alleles within these genes.
We identified a novel PD susceptibility locus, RIT2, replicated several previously identified loci, and identified more than one risk allele within SNCA and GBA.
Various genome-wide association studies (GWAS) have been done in ischaemic stroke, identifying a few loci associated with the disease, but sample sizes have been 3500 cases or less. We established the METASTROKE collaboration with the aim of validating associations from previous GWAS and identifying novel genetic associations through meta-analysis of GWAS datasets for ischaemic stroke and its subtypes.
We meta-analysed data from 15 ischaemic stroke cohorts with a total of 12 389 individuals with ischaemic stroke and 62 004 controls, all of European ancestry. For the associations reaching genome-wide significance in METASTROKE, we did a further analysis, conditioning on the lead single nucleotide polymorphism in every associated region. Replication of novel suggestive signals was done in 13 347 cases and 29 083 controls.
We verified previous associations for cardioembolic stroke near PITX2 (p=2·8×10−16) and ZFHX3 (p=2·28×10−8), and for large-vessel stroke at a 9p21 locus (p=3·32×10−5) and HDAC9 (p=2·03×10−12). Additionally, we verified that all associations were subtype specific. Conditional analysis in the three regions for which the associations reached genome-wide significance (PITX2, ZFHX3, and HDAC9) indicated that all the signal in each region could be attributed to one risk haplotype. We also identified 12 potentially novel loci at p<5×10−6. However, we were unable to replicate any of these novel associations in the replication cohort.
Our results show that, although genetic variants can be detected in patients with ischaemic stroke when compared with controls, all associations we were able to confirm are specific to a stroke subtype. This finding has two implications. First, to maximise success of genetic studies in ischaemic stroke, detailed stroke subtyping is required. Second, different genetic pathophysiological mechanisms seem to be associated with different stroke subtypes.
Wellcome Trust, UK Medical Research Council (MRC), Australian National and Medical Health Research Council, National Institutes of Health (NIH) including National Heart, Lung and Blood Institute (NHLBI), the National Institute on Aging (NIA), the National Human Genome Research Institute (NHGRI), and the National Institute of Neurological Disorders and Stroke (NINDS).
The heat shock protein (HSP) 70 family has been implicated in the pathology of Alzheimer’s disease (AD). In this study, we examined common genetic variations in the 80 genes encoding HSP70 and its co-chaperones. We conducted a study in a series of 462 patients and 5238 unaffected participants derived from the Rotterdam Study, a population-based study including 7983 persons aged 55 years and older. We genotyped a total of 12,053 single nucleotide polymorphisms (SNPs) using the HumanHap550K Genotyping BeadChip from Illumina. Replication was performed in two independent cohort studies, the Framingham Heart Study (FHS; N=806) and the Cardiovascular Health Study (CHS; N=2150). After adjusting for multiple testing, we found a small and consistent, though not significant, effect of rs12118313, located 32 kb from PFDN2, with an OR of 1.19 (p-value from meta-analysis = 0.003). However, this SNP lies in the intron of another gene, suggesting that it is unlikely to reflect the effect of PFDN2. In a formal pathway analysis we found nominally significant evidence for an association of BAG, DNAJA and prefoldin with AD. These findings corroborate those of a study of 2032 AD patients and 5328 controls, in which several members of the prefoldin family showed evidence for association with AD. Our study did not reveal evidence for a genetic variant of the HSP70 family with a major effect on AD. However, our findings from the single SNP analysis and pathway analysis suggest that multiple genetic variants in prefoldin are associated with AD.
Heat-Shock Proteins; Alzheimer Disease; prefoldin; Genetic Association Studies
Mutations in the leucine-rich repeat kinase 2 gene (LRRK2), located at 12q12, are the most common known genetic causes of Parkinson’s disease (PD). Studies of LRRK2 mutation carriers have shown incomplete and age-dependent penetrance and previous studies have suggested that inherited susceptibility factors may modify the penetrance of LRRK2 mutations.
Genomewide linkage to age of onset of LRRK2-related PD was evaluated in a sample of 113 LRRK2 mutation carriers from 64 families using single nucleotide polymorphism data from the Illumina HumanCNV370 genotyping array. Association between onset age and SNPs located under suggestive linkage peaks was also evaluated.
The top LOD-score for onset age (LOD-score=2.43) was located in the chromosome 1q32.1 region. Moderate linkage to onset was also identified at 16q12.1 (LOD-score=1.58). Examination of single nucleotide polymorphism association to PD onset under the linkage peaks revealed no statistically significant SNP associations.
The two novel genomic regions identified may harbor modifiers of LRRK2-related PD onset age or penetrance and further study of these regions may provide important insight into LRRK2-related PD.
Parkinson’s Disease; LRRK2; Linkage
White matter hyperintensities (WMH) detectable by magnetic resonance imaging (MRI) are part of the spectrum of vascular injury associated with aging of the brain and are thought to reflect ischemic damage to the small deep cerebral vessels. WMH are associated with an increased risk of cognitive and motor dysfunction, dementia, depression, and stroke. Despite a significant heritability, few genetic loci influencing WMH burden have been identified.
We performed a meta-analysis of genome-wide association studies (GWAS) for WMH burden in 9,361 stroke-free individuals of European descent from 7 community-based cohorts. Significant findings were tested for replication in 3,024 individuals from 2 additional cohorts.
We identified 6 novel risk-associated single nucleotide polymorphisms (SNPs) in one locus on chromosome 17q25 encompassing 6 known genes: WBP2, TRIM65, TRIM47, MRPL38, FBF1, and ACOX1. The most significant association was for rs3744028 (Pdiscovery = 4.0×10−9; Preplication = 1.3×10−7; Pcombined = 4.0×10−15). Other SNPs in this region also reaching genome-wide significance are rs9894383 (P=5.3×10−9), rs11869977 (P=5.7×10−9), rs936393 (P=6.8×10−9), rs3744017 (P=7.3×10−9), and rs1055129 (P=4.1×10−8). Variant alleles at these loci conferred a small increase in WMH burden (4–8% of the overall mean WMH burden in the sample).
This large GWAS of WMH burden in community-based cohorts of individuals of European descent identifies a novel locus on chromosome 17. Further characterization of this locus may provide novel insights into the pathogenesis of cerebral WMH.
Duplications and triplications of the α-synuclein (SNCA) gene increase risk for PD, suggesting increased expression levels of the gene to be associated with increased PD risk. However, past SNCA expression studies in brain tissue report inconsistent results. We examined expression of the full-length SNCA transcript (140 amino acid protein isoform), as well as total SNCA mRNA levels in 165 frontal cortex samples (101 PD, 64 control) using quantitative real-time polymerase chain reaction. Additionally, we evaluated the relationship of eight SNPs in both 5′ and 3′ regions of SNCA with the gene expression levels. The association between postmortem interval (PMI) and SNCA expression was different for PD and control samples: SNCA expression decreased with increasing PMI in cases, while staying relatively constant in controls. For short PMI, SNCA expression was increased in PD relative to control samples, whereas for long PMI, SNCA expression in PD was decreased relative to control samples.
Genetic Analysis Workshop 17 (GAW17) provided a platform for evaluating existing statistical genetic methods and for developing novel methods to analyze rare variants that modulate complex traits. In this article, we present an overview of the 1000 Genomes Project exome data and simulated phenotype data that were distributed to GAW17 participants for analyses, the different issues addressed by the participants, and the process of preparation of manuscripts resulting from the discussions during the workshop.
Genome-wide association studies often emphasize single-nucleotide polymorphisms with the smallest p-values with less attention given to single-nucleotide polymorphisms not ranked near the top. We suggest that gene pathways contain valuable information that can enable identification of additional associations. We used gene set information to identify disease-related pathways using three methods: gene set enrichment analysis (GSEA), empirical enrichment p-values, and Ingenuity pathway analysis (IPA). Association tests were performed for common single-nucleotide polymorphisms and aggregated rare variants with traits Q1 and Q4. These pathway methods were evaluated by type I error, power, and the ranking of the VEGF pathway, the gene set used in the simulation model. GSEA and IPA had high power for detecting the VEGF pathway for trait Q1 (91.2% and 93%, respectively). These two methods were conservative with deflated type I errors (0.0083 and 0.0072, respectively). The VEGF pathway ranked 1 or 2 in 123 of 200 replicates using IPA and ranked among the top 5 in 114 of 200 replicates for GSEA. The empirical enrichment method had lower power and higher type I error. Thus pathway analysis approaches may be useful in identifying biological pathways that influence disease outcomes.
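The abstract does not define how its empirical enrichment p-values were computed. One common construction, shown here purely as a sketch, counts how many genes in a set fall below a significance cutoff and compares that count with random gene sets of the same size. The function name, the cutoff, the number of permutations, and the toy gene names are all assumptions.

```python
import random
from typing import Dict, Iterable


def empirical_enrichment_p(
    gene_pvalues: Dict[str, float],
    gene_set: Iterable[str],
    cutoff: float = 0.05,
    n_perm: int = 1000,
    seed: int = 0,
) -> float:
    """Permutation p-value for enrichment of significant genes in a gene set."""
    rng = random.Random(seed)
    all_genes = list(gene_pvalues)
    members = [g for g in gene_set if g in gene_pvalues]

    def count_hits(genes: Iterable[str]) -> int:
        return sum(gene_pvalues[g] < cutoff for g in genes)

    observed = count_hits(members)
    # Count random sets of the same size with at least as many significant genes.
    at_least_as_extreme = sum(
        count_hits(rng.sample(all_genes, len(members))) >= observed
        for _ in range(n_perm)
    )
    # Add-one correction keeps the estimate strictly positive.
    return (at_least_as_extreme + 1) / (n_perm + 1)


# Toy data: 20 genes, of which only the first three are significant.
toy_pvalues = {f"gene{i}": (0.001 if i < 3 else 0.5) for i in range(20)}
enriched_p = empirical_enrichment_p(toy_pvalues, ["gene0", "gene1", "gene2"])
null_p = empirical_enrichment_p(toy_pvalues, ["gene10", "gene11", "gene12"])
```

A set with no significant genes can never be beaten by chance under this statistic, so its p-value is exactly 1.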
Genome wide association studies (GWAS) have recently identified CLU, PICALM and CR1 as novel genes for late-onset Alzheimer’s disease (AD).
In a three-stage analysis of new and previously published GWAS on over 35000 persons (8371 AD cases), we sought to identify and strengthen additional loci associated with AD and confirm these in an independent sample. We also examined the contribution of recently identified genes to AD risk prediction.
Design, Setting, and Participants
We identified strong genetic associations (p<10−3) in a Stage 1 sample of 3006 AD cases and 14642 controls by combining new data from the population-based Cohorts for Heart and Aging Research in Genomic Epidemiology (CHARGE) consortium (1367 AD cases (973 incident)) with previously reported results from the Translational Genomics Research Institute (TGEN) and Mayo AD GWAS. We identified 2708 single nucleotide polymorphisms (SNPs) with p-values<10−3, and in Stage 2 pooled results for these SNPs with the European AD Initiative (2032 cases, 5328 controls) to identify ten loci with p-values<10−5. In Stage 3, we combined data for these ten loci with data from the Genetic and Environmental Risk in AD consortium (3333 cases, 6995 controls) to identify four SNPs with a p-value<1.7×10−8. These four SNPs were replicated in an independent Spanish sample (1140 AD cases and 1209 controls).
Main outcome measure
We showed genome-wide significance for two new loci: rs744373 near BIN1 (OR:1.13; 95%CI:1.06–1.21 per copy of the minor allele; p=1.6×10−11) and rs597668 near EXOC3L2/BLOC1S3/MARK4 (OR:1.18; 95%CI:1.07–1.29; p=6.5×10−9). Associations of CLU, PICALM, BIN1 and EXOC3L2 with AD were confirmed in the Spanish sample (p<0.05). However, CLU and PICALM did not improve incident AD prediction beyond age, sex, and APOE (improvement in area under receiver-operating-characteristic curve <0.003).
Two novel genetic loci for AD are reported that for the first time reach genome-wide statistical significance; these findings were replicated in an independent population. Two recently reported associations were also confirmed, but these loci did not improve AD risk prediction, although they implicate biological pathways that may be useful targets for potential interventions.
genome-wide association study; genetic epidemiology; genetics; dementia; Alzheimer’s disease; cohort study; meta-analysis; risk
Data relating parental history of stroke to stroke risk in offspring remain surprisingly inconsistent, largely due to heterogeneity of study design, and the absence of verified, as opposed to historical, data on parental stroke status.
Methods and Results
We determined whether prospectively verified parental occurrence of stroke increased incident stroke risk among offspring in a community-based sample by studying 3443 stroke-free Framingham Offspring (53% female, mean age 48±14 years) with verified parental stroke status (by age 65 years), who attended the 1st, 3rd, 5th and/or 7th Offspring examinations, and were followed for up to 8 years after each baseline examination. Over up to 11,029 such person-observation periods (77,534 person-years), we documented 106 parental strokes by age 65, and 128 offspring strokes (74 parental and 106 offspring strokes were ischemic). Using multivariable Cox models adjusted for age, sex, sibship, and baseline stroke risk factors, we observed that parental stroke, both all-stroke generally and ischemic stroke specifically, was associated with an increased risk of incident stroke of the same type in the offspring (HR 2.79, 95% CI: 1.68–4.66; p<0.001 for all stroke, and HR 3.15, 95% CI: 1.69–5.88; p<0.001 for ischemic stroke). This was true for both maternal and paternal stroke.
Documented parental stroke by age 65 years was associated with a three-fold increase in risk of offspring stroke. This increased risk persisted after adjustment for conventional stroke risk factors. Thus, verified parental stroke may serve as a clinically useful risk marker of an individual’s propensity to stroke.
stroke; ischemic stroke; heredity; familial aggregation
Estrogen exposure has been associated with the occurrence of Parkinson’s disease (PD), as well as many other disorders, and yet the mechanisms underlying these relations are often unknown. While it is likely that estrogen exposure modifies the risk of various diseases through many different mechanisms, some estrogen-related disease processes might work in similar manners and result in association between the diseases. Indeed, the association between diseases need not be due only to estrogen-related factors, but due to similar disease processes from a variety of mechanisms.
Patients and methods:
All female Parkinson’s disease cases between 1982 and 2007 (n = 12,093) were identified from the Danish National Registry of Patients, along with 10 controls matched by years of birth and enrollment. Conditional logistic regressions (CLR) were used to calculate risk of PD after diagnosis of the estrogen-related diseases, endometriosis and osteoporosis, conditioning on years of birth and enrollment. To identify novel associations between PD and any other preceding conditions, CLR was also used to calculate the odds ratios (ORs) for risk of PD for 202 different categories of preceding disease diagnoses. Empirical Bayes methods were used to identify the robust associations from the over 200 associations produced by this analysis.
We found a positive association between osteoporosis and osteoporotic fractures and PD (OR = 1.18, 95% confidence interval [CI] of 1.08–1.28), while a lack of association was observed between endometriosis and PD (OR = 1.37, 95% CI 0.99–1.90). Using empirical Bayes analyses, 24 additional categories of diseases, likely unrelated to estrogen exposure, were also identified as potentially associated with PD.
We identified several novel associations, which may provide insight into common causal mechanisms between the diseases or greater understanding of potential early preclinical signs of PD. In particular, the associations with several categories of mental disorders suggest that these may be early warning signs of PD onset or these diseases (or the causes of these diseases) may predispose to PD.
Parkinson’s disease; estrogen; osteoporosis; endometriosis; empirical bayes
Heritability and genetic and environmental correlations of total and regional brain volumes were estimated from a large, generally healthy, community-based sample, to determine if there are common elements to the genetic influence of brain volumes and white matter hyperintensity volume. There were 1538 Framingham Heart Study participants with brain volume measures from quantitative magnetic resonance imaging (MRI) who were free of stroke and other neurological disorders that might influence brain volumes and who were members of families with at least two Framingham Heart Study participants. Heritability was estimated using variance component methodology, adjusting for the components of the Framingham stroke risk profile. Genetic and environmental correlations between traits were obtained from bivariate analysis. Heritability estimates ranging from 0.46 to 0.60 were observed for total brain, white matter hyperintensity, hippocampal, temporal lobe, and lateral ventricular volumes. Moderate, yet significant, heritability was observed for the other measures. Bivariate analyses demonstrated that relationships between brain volume measures, except for white matter hyperintensity, reflected both moderate to strong shared genetic and shared environmental influences. This study confirms strong genetic effects on brain and white matter hyperintensity volumes. These data extend current knowledge by showing that these two different types of MRI measures do not share underlying genetic or environmental influences.
heritability; quantitative MRI; brain volume; white matter hyperintensity
The genes underlying the risk of stroke in the general population remain undetermined.
We carried out an analysis of genomewide association data generated from four large cohorts composing the Cohorts for Heart and Aging Research in Genomic Epidemiology consortium, including 19,602 white persons (mean [±SD] age, 63±8 years) in whom 1544 incident strokes (1164 ischemic strokes) developed over an average follow-up of 11 years. We tested the markers most strongly associated with stroke in a replication cohort of 2430 black persons with 215 incident strokes (191 ischemic strokes), another cohort of 574 black persons with 85 incident strokes (68 ischemic strokes), and 652 Dutch persons with ischemic stroke and 3613 unaffected persons.
Two intergenic single-nucleotide polymorphisms on chromosome 12p13 and within 11 kb of the gene NINJ2 were associated with stroke (P<5×10−8). NINJ2 encodes an adhesion molecule expressed in glia and shows increased expression after nerve injury. Direct genotyping showed that rs12425791 was associated with an increased risk of total (i.e., all types) and ischemic stroke, with hazard ratios of 1.30 (95% confidence interval [CI], 1.19 to 1.42) and 1.33 (95% CI, 1.21 to 1.47), respectively, yielding population attributable risks of 11% and 12% in the discovery cohorts. Corresponding hazard ratios were 1.35 (95% CI, 1.01 to 1.79; P = 0.04) and 1.42 (95% CI, 1.06 to 1.91; P=0.02) in the large cohort of black persons and 1.17 (95% CI, 1.01 to 1.37; P = 0.03) and 1.19 (95% CI, 1.01 to 1.41; P = 0.04) in the Dutch sample; the results of an underpowered analysis of the smaller black cohort were nonsignificant.
A genetic locus on chromosome 12p13 is associated with an increased risk of stroke.
Women have a reduced risk of developing Parkinson's disease (PD) compared with age-matched men. Neuro-protective effects of estrogen potentially explain this difference. Tamoxifen, commonly used in breast cancer treatment, may interfere with the protective effects of estrogen and increase risk of PD. We compared the rate of PD in Danish breast cancer patients treated with tamoxifen to the rate among those not treated with tamoxifen.
A cohort of 15,419 breast cancer patients identified from the Danish Breast Cancer Collaborative Group database was linked to the National Registry of Patients to identify PD diagnoses. Overall risk and rate of PD following identification into the study was compared between patients treated with tamoxifen as adjuvant hormonal therapy and patients not receiving tamoxifen. Time-dependent effects of tamoxifen treatment on PD rate were examined to estimate the likely induction period for tamoxifen.
In total, 35 cases of PD were identified among the 15,419 breast cancer patients. No overall effect of tamoxifen on rate of PD was observed (HR = 1.3, 95% CI: 0.64-2.5), but a PD hazard ratio of 5.1 (95% CI: 1.0-25) was seen four to six years following initiation of tamoxifen treatment.
These results provide evidence that the neuro-protective properties of estrogen against PD occurrence may be disrupted by tamoxifen therapy. Tamoxifen treatment may be associated with an increased rate of PD; however, these effects appear only after four years, are of limited duration, and are outweighed by the protection against breast cancer recurrence conferred by tamoxifen therapy.
In some genetic association studies, samples contain both parental and unrelated controls. Under such scenarios, instead of analyzing only trios using family-based association tests or only unrelated subjects using a case-control study design, [Nagelkerke et al., 2004] and [Epstein et al., 2005] proposed methods that implemented a likelihood ratio test to combine the two different types of data. In this article, we put forward a more powerful and simplified strategy to combine trios with unrelated subjects based on the Haplotype Relative Risk (HRR) [Falk et al., 1987]. The HRR compares parental marker alleles transmitted to an affected offspring to those not transmitted as a test for association, a strategy that is similar to a case-control study that compares allele frequencies in diseased cases to those of unrelated controls. We prove that affected offspring can be pooled with diseased cases and that parental controls can be treated as unrelated controls when the trios and unrelated subjects are randomly sampled from the same population. Therefore, unrelated subjects can be incorporated into the HRR intuitively and effortlessly. For trios without complete parental genotypes, we adopted the strategy proposed by [Guo et al., 2005], which is more feasible than the one proposed by [Weinberg, 1999]. In addition, simulation results suggest that the CHRR is more powerful than Epstein et al.’s method regardless of the disease prevalence in a homogeneous population.
TDT; HRR; case-control association study; trios; missing data
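The pooling idea described above can be sketched at the allele-count level: transmitted parental alleles are pooled with case alleles, non-transmitted parental alleles are pooled with control alleles, and the resulting 2×2 table is tested with a 1-degree-of-freedom chi-square. This is a simplified illustration under the stated random-sampling assumption, not the authors' exact statistic; the function names and toy data are hypothetical, and tables with an empty row or column are not handled.

```python
import math
from typing import List, Sequence, Tuple


def combined_allele_table(
    trios: Sequence[Tuple[int, int]],
    cases: Sequence[int],
    controls: Sequence[int],
) -> Tuple[int, int, int, int]:
    """Build a 2x2 allele-count table from trios and unrelated subjects.

    Each trio is (transmitted, nontransmitted) counts of the test allele (0-2).
    Cases and controls are genotype counts of the test allele (0-2).
    Returns (case_test, case_other, control_test, control_other).
    """
    case_test = sum(t for t, _ in trios) + sum(cases)
    case_total = 2 * (len(trios) + len(cases))
    control_test = sum(u for _, u in trios) + sum(controls)
    control_total = 2 * (len(trios) + len(controls))
    return case_test, case_total - case_test, control_test, control_total - control_test


def chi_square_2x2(a: int, b: int, c: int, d: int) -> Tuple[float, float]:
    """Pearson chi-square (1 df) and its p-value for the table [[a, b], [c, d]]."""
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


# Hypothetical toy data: three trios, two unrelated cases, three unrelated controls.
toy_trios: List[Tuple[int, int]] = [(2, 0), (1, 1), (1, 0)]
table = combined_allele_table(toy_trios, cases=[1, 2], controls=[0, 1, 0])
chi2, p = chi_square_2x2(*table)
```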
Age at onset in Parkinson disease (PD) is a highly heritable quantitative trait for which a significant genetic influence is supported by multiple segregation analyses. Because genes associated with onset age may represent invaluable therapeutic targets to delay the disease, we sought to identify such genetic modifiers using a genomewide association study in familial PD. There have been previous genomewide association studies (GWAS) to identify genes influencing PD susceptibility, but this is the first to identify genes contributing to the variation in onset age.
Initial analyses were performed using genotypes generated with the Illumina HumanCNV370Duo array in a sample of 857 unrelated, familial PD cases. Subsequently, a meta-analysis of imputed SNPs was performed combining the familial PD data with that from a previous GWAS of 440 idiopathic PD cases. The SNPs from the meta-analysis with the lowest p-values and consistency in the direction of effect for onset age were then genotyped in a replication sample of 747 idiopathic PD cases from the Parkinson Institute Biobank of Milan, Italy.
Meta-analysis across the three studies detected consistent association (p < 1 × 10−5) with five SNPs, none of which reached genomewide significance. On chromosome 11, the SNP with the lowest p-value (rs10767971; p = 5.4 × 10−7) lies between the genes QSER1 and PRRG4. Near the PARK3 linkage region on chromosome 2p13, association was observed with a SNP (rs7577851; p = 8.7 × 10−6) which lies in an intron of the AAK1 gene. This gene is closely related to GAK, identified as a possible PD susceptibility gene in the GWAS of the familial PD cases.
Taken together, these results suggest an influence of genes involved in endocytosis and lysosomal sorting in PD pathogenesis.
Genetic variants in embryonic lethal, abnormal vision, Drosophila-like 4 (ELAVL4) have been reported to be associated with onset age of Parkinson disease (PD) or risk for PD affection in Caucasian populations. In the current study we genotyped three single nucleotide polymorphisms in ELAVL4 in a Caucasian study sample consisting of 712 PD patients and 312 unrelated controls from the GenePD study. The minor allele of rs967582 was associated with increased risk of PD (odds ratio = 1.46, nominal P value = 0.011) in the GenePD population. The minor allele of rs967582 was also the risk allele for PD affection or earlier onset age in the previously studied populations. This replication of association with rs967582 in a third cohort further implicates ELAVL4 as a PD susceptibility gene.
We carried out a genome-wide association study of genetic predictors of anti-cyclic citrullinated peptide antibody (anti-CCP) level in 531 self-reported non-Hispanic Caucasian Rheumatoid Arthritis (RA) patients enrolled in the Brigham Rheumatoid Arthritis Sequential Study (BRASS). For replication, we then analyzed 289 single nucleotide polymorphisms (SNPs) with P < 0.001 in BRASS in an independent population of 849 RA patients from the North American Rheumatoid Arthritis Consortium (NARAC). BRASS and NARAC samples were genotyped using the Affymetrix 100K and Illumina 550K platforms respectively. Association between SNPs and anti-CCP titer was tested using general linear models. The five most significant SNPs from BRASS all were within the major histocompatibility complex (MHC) region (P ≤ 3.5 × 10−6). After controlling for the human leukocyte antigen shared epitope (HLA-SE), the top SNPs still yielded P values < 0.0002. In NARAC, a single SNP from the MHC region near BTNL2 and HLA-DRA, rs1980493 (r2 = 0.85 with the top five SNPs from BRASS), was associated significantly with CCP titer (P = 6.1 × 10−5) even after adjustment for the HLA-SE (P = 0.0002). The top SNPs found in BRASS and NARAC had r2 = 0.46 and 0.64, respectively, to HLA-DRB1 DR3 alleles. These results confirm that the most significant genome region affecting anti-CCP titers in RA is the MHC region. We identified a SNP in moderate linkage disequilibrium (LD) with HLA-DR3, which may influence anti-CCP titer independently of the HLA-SE.
We have created a program, HaploBuild, that searches densely genotyped regions for associated non-contiguous haplotypes using a standard family-based haplotype association test. This program was designed to expand upon the ‘sliding window’ methodologies commonly used for haplotype construction by allowing the association of subsets of single nucleotide polymorphisms (SNPs) to drive the construction of the haplotype. This strategy permits HaploBuild to construct more biologically relevant haplotypes that are not constrained by arbitrary length and contiguous orientation.
The PARK3 locus on chromosome 2p13 has shown linkage to both the development and age of onset of Parkinson’s disease (PD). One candidate gene at this locus is sepiapterin reductase (SPR). Sepiapterin reductase catalyzes the final step in the biosynthetic pathway of tetrahydrobiopterin (BH4), an essential cofactor for aromatic amino acid hydrolases including tyrosine hydroxylase, the rate-limiting enzyme in dopamine synthesis. The expression of SPR was assayed using semiquantitative real-time RT-PCR in human post-mortem cerebellar tissue from neuropathologically confirmed PD cases and neurologically normal controls. The expression of other enzymes involved in BH4 biosynthesis, including aldose reductase (AKR1B1), carbonyl reductase (CBR1 and CBR3), GTP-cyclohydrolase I (GCH1), and 6-pyruvoyltetrahydrobiopterin (PTS), was also examined. Single-nucleotide polymorphisms around the SPR gene that have been previously reported to show association with PD affection and onset age were genotyped in these samples. Expression of SPR showed a significant 4-fold increase in PD cases relative to controls, while the expression of AKR1B1 and PTS was significantly decreased in PD cases. No difference in expression was detected for CBR1, CBR3, and GCH1. Genetic variants did not show a significant effect on SPR expression; however, this is likely due to the low frequency of rare genotypes in the sample. While the association of SPR with PD is not strong enough to establish that this is the PARK3 gene, this study further implicates a role for SPR in idiopathic PD.
Parkinson’s disease; PARK3; sepiapterin reductase; RT-PCR; human; tetrahydrobiopterin
We used the simulated data set from Genetic Analysis Workshop 15 Problem 3 to assess a two-stage approach for identifying single-nucleotide polymorphisms (SNPs) associated with rheumatoid arthritis (RA). In the first stage, we used random forests (RF) to screen large amounts of genetic data using the variable importance measure, which takes into account SNP interaction effects as well as main effects without requiring model specification. We used the simulated 9187 SNPs mimicking a 10 K SNP chip, along with the covariates DR (the simulated DRB1 genotype), smoking, and sex, as input to the RF analyses with a training set consisting of 750 unrelated RA cases and 750 controls. We used an iterative RF screening procedure to identify a smaller set of variables for further analysis. In the second stage, we used the software program CaMML for producing Bayesian networks, and developed complex etiologic models for RA risk using the variables identified by our RF screening procedure. We evaluated the performance of this method using independent test data sets for up to 100 replicates.
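The iterative screening idea in the first stage can be sketched as a loop that ranks variables by an importance score and keeps only the top fraction until a small set remains. This is a hedged outline, not the authors' procedure: the function names, the `keep_fraction` and `target_size` parameters, and the toy data are all assumptions, and the importance score below is a simple case-control mean difference standing in for the random forest variable importance (unlike RF importance, it does not capture interaction effects).

```python
from statistics import mean


def iterative_screen(variables, importance, keep_fraction=0.5, target_size=2):
    """Rank variables repeatedly, keeping the top fraction until target_size remain."""
    current = list(variables)
    while len(current) > target_size:
        scores = importance(current)  # re-score using only the surviving variables
        ranked = sorted(current, key=lambda v: scores[v], reverse=True)
        n_keep = max(target_size, int(len(ranked) * keep_fraction))
        current = ranked[:n_keep]
    return current


def mean_difference_importance(cases, controls):
    """Stand-in scorer (assumption): absolute case-control difference in means."""
    def score(variables):
        return {
            v: abs(mean(r[v] for r in cases) - mean(r[v] for r in controls))
            for v in variables
        }
    return score


# Toy records (hypothetical): genotypes coded 0/1/2, smoking coded 0/1.
cases = [
    {"snp1": 2, "snp2": 1, "snp3": 0, "smoking": 1},
    {"snp1": 2, "snp2": 0, "snp3": 1, "smoking": 1},
]
controls = [
    {"snp1": 0, "snp2": 1, "snp3": 1, "smoking": 0},
    {"snp1": 0, "snp2": 0, "snp3": 0, "smoking": 1},
]

selected = iterative_screen(
    ["snp1", "snp2", "snp3", "smoking"],
    mean_difference_importance(cases, controls),
)
```

In a real analysis the `importance` callable would refit a random forest on the surviving variables at each iteration, and the retained set would then be passed to the second-stage network modeling.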
Brain magnetic resonance imaging (MRI) and cognitive tests can identify heritable endophenotypes associated with an increased risk of developing stroke, dementia and Alzheimer's disease (AD). We conducted a genome-wide association (GWA) and linkage analysis exploring the genetic basis of these endophenotypes in a community-based sample.
A total of 705 stroke- and dementia-free Framingham participants (age 62 ± 9 yrs, 50% male) who underwent volumetric brain MRI and cognitive testing (1999–2002) were genotyped. We used linear models adjusting for first-degree relationships via generalized estimating equations (GEE) and family-based association tests (FBAT) in additive models to relate qualifying single nucleotide polymorphisms (SNPs; 70,987 autosomal on the Affymetrix 100K Human Gene Chip with minor allele frequency ≥ 0.10, genotypic call rate ≥ 0.80, and Hardy-Weinberg equilibrium p-value ≥ 0.001) to multivariable-adjusted residuals of 9 MRI measures, including total cerebral brain (TCBV), lobar, ventricular and white matter hyperintensity (WMH) volumes, and 6 cognitive factors/tests assessing verbal and visuospatial memory, visual scanning and motor speed, reading, abstract reasoning and naming. We determined multipoint identity-by-descent utilizing 10,592 informative SNPs and 613 short tandem repeats and used variance component analyses to compute LOD scores.
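The three SNP qualification thresholds stated above (minor allele frequency ≥ 0.10, call rate ≥ 0.80, Hardy-Weinberg p-value ≥ 0.001) can be expressed as a simple filter over genotype counts. The sketch below is illustrative only: the function names and example counts are hypothetical, and the Hardy-Weinberg p-value is computed with a 1-degree-of-freedom chi-square test as an assumption, since the text does not say which test was used.

```python
import math


def hwe_chisq_pvalue(n_aa, n_ab, n_bb):
    """Hardy-Weinberg p-value from a 1-df chi-square test on genotype counts."""
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of allele A
    q = 1 - p
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    if min(expected) == 0:
        return 1.0  # monomorphic marker: no departure can be tested
    chi2 = sum((o - e) ** 2 / e for o, e in zip((n_aa, n_ab, n_bb), expected))
    # For 1 degree of freedom, the chi-square survival function is erfc(sqrt(x / 2)).
    return math.erfc(math.sqrt(chi2 / 2))


def passes_qc(n_aa, n_ab, n_bb, n_missing,
              min_maf=0.10, min_call_rate=0.80, min_hwe_p=0.001):
    """Apply the MAF, call-rate, and HWE thresholds to one SNP."""
    called = n_aa + n_ab + n_bb
    if called == 0:
        return False
    call_rate = called / (called + n_missing)
    p = (2 * n_aa + n_ab) / (2 * called)
    maf = min(p, 1 - p)
    return (call_rate >= min_call_rate
            and maf >= min_maf
            and hwe_chisq_pvalue(n_aa, n_ab, n_bb) >= min_hwe_p)


# Hypothetical genotype counts for a sample of 705 genotyped participants.
common_snp_ok = passes_qc(250, 340, 115, 0)
rare_snp_ok = passes_qc(600, 100, 5, 0)       # MAF below 0.10
hwe_violation_ok = passes_qc(300, 100, 305, 0)  # strong heterozygote deficit
```

For family data such as this sample, genotype counts are not independent across relatives, so a production filter would typically compute these statistics in unrelated individuals or use a test that accounts for relatedness.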
The strongest gene-phenotype association in FBAT analyses was between SORL1 (rs1131497; p = 3.2 × 10⁻⁶) and abstract reasoning, and in GEE analyses between CDH4 (rs1970546; p = 3.7 × 10⁻⁸) and TCBV. SORL1 plays a role in amyloid precursor protein processing and has been associated with the risk of AD. Among the 50 strongest associations (25 each by GEE and FBAT) were other biologically interesting genes. Polymorphisms within 28 of 163 candidate genes for stroke, AD and memory impairment were associated with the endophenotypes studied at p < 0.001. We confirmed our previously reported linkage of WMH on chromosome 4 and describe linkage of reading performance to a marker on chromosome 18 (GATA11A06), previously linked to dyslexia (LOD scores = 2.2 and 5.1).
Our results suggest that genes associated with clinical neurological disease also have detectable effects on subclinical phenotypes. These hypothesis-generating data illustrate the use of an unbiased approach to discover novel pathways that may be involved in brain aging, and could be used to replicate observations made in other studies.
Genetic variations that predispose individuals to complex disorders, such as essential hypertension, may be found in gene coding regions, intronic regions or gene promoter regions. Most studies have focused on gene variations that result in amino acid substitutions because they produce different isoforms of the protein, presumably resulting in differences in protein properties. Less attention has been paid to the role of intronic or promoter mutations. In this report, we examined two single nucleotide polymorphisms (SNPs) in the catalase (CAT) gene promoter region in a cohort of hypertensive Caucasians and African Americans with a mass spectrometry-based Homogeneous MassEXTEND assay. We found an association when a specific combination of the two promoter SNPs was examined in Caucasians. No association was observed in African Americans. Our data suggest that genetic variations in the promoter region of the catalase gene influence the susceptibility to essential hypertension. In addition, the genetic factors that contribute to hypertension may be different between ethnic groups.
catalase; essential hypertension; SNP; promoter
During aging, intracranial volume remains unchanged and represents maximally attained brain size, while various interacting biological phenomena lead to brain volume loss. Consequently, intracranial volume and brain volume in late life reflect different genetic influences. Our genome-wide association study in 8,175 community-dwelling elderly did not reveal any genome-wide significant associations (p < 5 × 10⁻⁸) for brain volume. In contrast, intracranial volume was significantly associated with two loci: rs4273712 (p = 3.4 × 10⁻¹¹), a known height locus on chromosome 6q22, and rs9915547, tagging the inversion on chromosome 17q21 (p = 1.5 × 10⁻¹²). We replicated the associations of these loci with intracranial volume in a separate sample of 1,752 older persons (p = 1.1 × 10⁻³ for 6q22 and p = 1.2 × 10⁻³ for 17q21). Furthermore, we also found suggestive associations of the 17q21 locus with head circumference in 10,768 children (mean age 14.5 months). Our data identify two loci associated with head size, with the inversion on 17q21 also likely involved in attaining maximal brain size.