The role of v-ATPases in cancer biology is being increasingly recognized. Yeast studies indicate that the tyrosine-kinase inhibitor imatinib may interact with the v-ATPase genes and alter the course of cancer progression. Data from humans in this regard is lacking.
We constructed 55 lymphoblastoid cell lines from pedigreed, cancer-free human subjects and treated them with an IC20 concentration of imatinib mesylate. Using these cell lines, we: i) estimated the heritability and differential expression of 19 genes encoding several subunits of the v-ATPase protein in response to imatinib treatment; ii) estimated the genetic similarity among these genes; and iii) conducted a high-density scan to find cis-regulating genetic variation associated with differential expression of these genes.
We found that the imatinib response of the genes encoding v-ATPase subunits is significantly heritable and can be clustered to identify novel drug targets in imatinib therapy. Further, five of these genes were significantly cis-regulated and together represented a nearly half-log fold change in response to imatinib (p = 0.0107) that was homogeneous (p = 0.2598).
Our results proffer support to the growing view that personalized regimens using proton pump inhibitors or v-ATPase inhibitors may improve outcomes of imatinib therapy in various cancers.
chronic myeloid leukemia; metastasis; imatinib; microarray
Intima-media thickness (IMT) of the common and internal carotid arteries is an established surrogate for atherosclerosis and predicts risk of stroke and myocardial infarction. Often IMT is measured as the average of these two arteries, yet they are believed to result from separate biological mechanisms. The aim of this study was to conduct a family-based genome-wide association study (GWAS) for IMT to identify polymorphisms influencing IMT and to determine if distinct carotid artery segments are influenced by different genetic components.
Methods and Results
IMT for the common and internal carotid arteries was determined through B-mode ultrasound in 772 Mexican Americans from the San Antonio Family Heart Study. A GWAS utilizing 931,219 single nucleotide polymorphisms (SNPs) was undertaken with six internal and common carotid artery IMT phenotypes utilizing an additive measured genotype model. The most robust association detected was for two SNPs (rs16983261, rs6113474, p=1.60e−7) in complete linkage disequilibrium on chromosome 20p11 for the internal carotid artery near wall, next to the gene PAX1. We also replicated previously reported GWAS regions on chromosomes 19q13 and 7q22. We found no overlapping associations between internal and common carotid artery phenotypes at p<5.0e−6. The genetic correlation between the two carotid IMT arterial segments was 0.51.
This study represents the first large scale GWAS of carotid IMT in a non-European population and identified several novel loci. We do not detect any shared GWAS signals between common and internal carotid arterial segments but the moderate genetic correlation implies both common and unique genetic components.
intima-media thickness; carotid artery; GWAS; Hispanics
Increased serum uric acid (SUA) is a risk factor for gout and renal and cardiovascular disease (CVD). The purpose of this study was to identify genetic factors that affect the variation in SUA in 632 Mexican American participants of the San Antonio Family Heart Study (SAFHS). A genome-wide association (GWA) analysis was performed using the Illumina Human Hap 550K single nucleotide polymorphism (SNP) microarray. We used a linear regression-based association test under an additive model of allelic effect, while accounting for non-independence among family members via a kinship variance component. All analyses were performed in the software package SOLAR. SNPs rs6832439, rs13131257, and rs737267 in solute carrier protein 2 family, member 9 (SLC2A9) were associated with SUA at genome-wide significance (p < 1.3 × 10−7). The minor alleles of these SNPs had frequencies of 36.2, 36.2, and 38.2%, respectively, and were associated with decreasing SUA levels. All of these SNPs were located in introns 3–7 of SLC2A9, the location of the previously reported associations in European populations. When analyzed for association with cardiovascular-renal disease risk factors, conditional on SLC2A9 SNPs strongly associated with SUA, significant associations were found for SLC2A9 SNPs with BMI, body weight, and waist circumference (p < 1.4 × 10−3) and suggestive associations with albumin-creatinine ratio and total antioxidant status (TAS). The SLC2A9 gene encodes a urate transporter that has considerable influence on variation in SUA. In addition to the primary association locus, suggestive evidence (p < 1.9 × 10−6) for joint linkage/association (JLA) was found at a previously reported urate quantitative trait locus (logarithm of odds score = 3.6) on 3p26.3. In summary, our GWAS extends and confirms the association of SLC2A9 with SUA for the first time in a Mexican American cohort and also shows for the first time its association with cardiovascular-renal disease risk factors.
variance components decomposition approach; joint linkage/association analysis; kinship; hyperuricemia
Several studies have identified effects of genetic variation on DNA methylation patterns and associated heritability, with research primarily focused on Caucasian individuals. In this paper, we examine the evidence for genetic effects on DNA methylation in a Mexican American cohort, a population burdened by a high prevalence of obesity. Using an Illumina-based platform and following stringent quality control procedures, we assessed a total of 395 CpG sites in peripheral blood samples obtained from 183 Mexican American individuals for evidence of heritability, proximal genetic regulation and association with age, sex and obesity measures (i.e. waist circumference and body mass index). We identified 16 CpG sites (∼4%) that were significantly heritable after Bonferroni correction for multiple testing and 27 CpG sites (∼6.9%) that showed evidence of genetic effects. Six CpG sites (∼2%) were associated with age, primarily exhibiting positive relationships, including CpG sites in two genes that have been implicated in previous genome-wide methylation studies of age (FZD9 and MYOD1). In addition, we identified significant associations between three CpG sites (∼1%) and sex, including DNA methylation in CASP6, a gene that may respond to estradiol treatment, and in HSD17B12, which encodes a sex steroid hormone. Although we did not identify any significant associations between DNA methylation and the obesity measures, several nominally significant results were observed in genes related to adipogenesis, obesity, energy homeostasis and glucose homeostasis (ARHGAP9, CDKN2A, FRZB, HOXA5, JAK3, MEST, NPY, PEG3 and SMARCB1). In conclusion, we were able to replicate several findings from previous studies in our Mexican American cohort, supporting an important role for genetic effects on DNA methylation. In addition, we found a significant influence of age and sex on DNA methylation, and report on trend-level, novel associations between DNA methylation and measures of obesity.
Individual differences in biological ageing (i.e., the rate of physiological response to the passage of time) may be due in part to genotype-specific variation in gene action. However, the sources of heritable variation in human age-related gene expression profiles are largely unknown. We have profiled genome-wide expression in peripheral blood mononuclear cells from 1,240 individuals in large families and found 4,472 human autosomal transcripts, representing ~4,349 genes, significantly correlated with age. We identified 623 transcripts that show genotype by age interaction in addition to a main effect of age, defining a large set of novel candidates for characterization of the mechanisms of differential biological ageing. We applied a novel SNP genotype×age interaction test to one of these candidates, the ubiquilin-like gene UBQLNL, and found evidence of joint cis-association and genotype by age interaction as well as trans-genotype by age interaction for UBQLNL expression. Both UBQLNL expression levels at recruitment and cis genotype are associated with longitudinal cancer risk in our study cohort.
Transcriptional ageing; genotype by age interaction; ubiquitins; UBQLNL; cancer risk gene
Antibodies against infectious pathogens provide information on past or present exposure to infectious agents. While host genetic factors are known to affect the immune response, the influence of genetic factors on antibody levels to common infectious agents is largely unknown. Here we test whether antibody levels for 13 common infections are significantly heritable.
IgG antibodies to Chlamydophila pneumoniae, Helicobacter pylori, Toxoplasma gondii, adenovirus 36 (Ad36), hepatitis A virus, influenza A and B, cytomegalovirus, Epstein-Barr virus, herpes simplex virus (HSV)-1 and −2, human herpesvirus-6, and varicella zoster virus were determined for 1,227 Mexican Americans. Both quantitative and dichotomous (seropositive/seronegative) traits were analyzed. Influences of genetic and shared environmental factors were estimated using variance components pedigree analysis, and sharing of underlying genetic factors among traits was investigated using bivariate analyses.
Serological phenotypes were significantly heritable for most pathogens (h2 = 0.17–0.39), except for Ad36 and HSV-2. Shared environment was significant for several pathogens (c2 = 0.10–0.32). The underlying genetic etiology appears to be largely different for most pathogens.
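The h2 and c2 values reported above can be read against the standard variance components decomposition for pedigree data. The block below is a sketch of that textbook model and is not confirmed to be the exact parameterization used in this study; the symbols \Phi (kinship matrix), H (household-sharing matrix) and I (identity matrix) are notation assumed here.

```latex
\Omega = 2\Phi\,\sigma^2_g + H\,\sigma^2_c + I\,\sigma^2_e,
\qquad
\sigma^2_p = \sigma^2_g + \sigma^2_c + \sigma^2_e,
\qquad
h^2 = \frac{\sigma^2_g}{\sigma^2_p},
\qquad
c^2 = \frac{\sigma^2_c}{\sigma^2_p}.
```

Under this reading, h2 = 0.17–0.39 means that additive genetic effects account for roughly 17% to 39% of the phenotypic variance in antibody level, and c2 is the analogous share attributed to shared environment.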
Our results demonstrate, for the first time for many of these pathogens, that individual genetic differences of the human host contribute substantially to antibody levels to many common infectious agents, providing impetus for the identification of underlying genetic variants, which may be of clinical importance.
Pathogen; Infection; Antibody; Serology; Genetics; Heritability; Mexican Americans
A decade ago, there was widespread enthusiasm for the prospects of genome-wide association studies to identify common variants related to common chronic diseases using samples of unrelated individuals from populations. Although technological advancements allow us to query more than a million SNPs across the genome at low cost, a disappointingly small fraction of the genetic portion of common disease etiology has been uncovered. This has led to the hypothesis that less frequent variants might be involved, stimulating a renaissance of the traditional approach of seeking genes using multiplex families from less diverse populations. However, by using the modern genotyping and sequencing technology, we can now look not just at linkage, but jointly at linkage and linkage disequilibrium (LD) in such samples. Software methods that can look simultaneously at linkage and LD in a powerful and robust manner have been lacking. Most algorithms cannot jointly analyze datasets involving families of varying structures in a statistically or computationally efficient manner. We have implemented previously proposed statistical algorithms in a user-friendly software package, PSEUDOMARKER. This paper is an announcement of this software package. We describe the motivation behind the approach, the statistical methods, and software, and we briefly demonstrate PSEUDOMARKER's advantages over other packages by example.
Computer software; Family-based association; Genome-wide association; Likelihood methods; Linkage analysis; Linkage disequilibrium; Study design
Imatinib mesylate is currently the drug of choice to treat chronic myeloid leukemia. However, patient resistance and cytotoxicity make secondary lines of treatment, such as omacetaxine mepesuccinate, a necessity. Given that drug cytotoxicity represents a major problem during treatment, it is essential to understand the biological pathways affected to better predict poor drug response and prioritize a treatment regime.
We conducted cell viability and gene expression assays to determine heritability and gene expression changes associated with imatinib and omacetaxine treatment of 55 non-cancerous lymphoblastoid cell lines, derived from 17 pedigrees. In total, 48,803 transcripts derived from Illumina Human WG-6 BeadChips were analyzed for each sample using SOLAR, whilst correcting for kinship structure.
Cytotoxicity within cell lines was highly heritable following imatinib treatment (h2 = 0.60-0.73), but not omacetaxine treatment. Cell lines treated with an IC20 dose of imatinib or omacetaxine showed differential gene expression for 956 (1.96%) and 3,892 transcripts (7.97%), respectively; 395 of these (0.8%) were significantly influenced by both imatinib and omacetaxine treatment. k-means clustering and DAVID functional annotation showed expression changes in genes related to kinase binding and vacuole-related functions following imatinib treatment, whilst expression changes in genes related to cell division and apoptosis were evident following treatment with omacetaxine. The enrichment scores for these ontologies were very high (mostly >10).
Induction of gene expression changes related to different pathways following imatinib and omacetaxine treatment suggests that the cytotoxicity of such drugs may be differentially tolerated by individuals based on their genetic background.
Chronic myeloid leukemia; Microarray; Toxicity; Gene expression; Imatinib; Omacetaxine
The delta-5 and delta-6 desaturases (D5D and D6D), encoded by fatty acid desaturase 1 (FADS1) and 2 (FADS2) genes, respectively, are rate-limiting enzymes in the metabolism of ω-3 and ω-6 fatty acids. The objective of this study was to identify genes influencing variation in estimated D5D and D6D activities in plasma and erythrocytes in Alaskan Eskimos (n = 761) participating in the Genetics of Coronary Artery Disease in Alaska Natives (GOCADAN) study. Desaturase activity was estimated by the product:precursor ratio of polyunsaturated fatty acids. We found evidence of linkage for estimated erythrocyte D5D (eD5D) on chromosome 11q12-q13 (logarithm of odds score = 3.5). The confidence interval contains candidate genes FADS1, FADS2, 7-dehydrocholesterol reductase (DHCR7), and carnitine palmitoyl transferase 1A, liver (CPT1A). Measured genotype analysis found association between CPT1A, FADS1, and FADS2 single-nucleotide polymorphisms (SNPs) and estimated eD5D activity (p-values between 10−28 and 10−5). A Bayesian quantitative trait nucleotide analysis showed that rs3019594 in CPT1A, rs174541 in FADS1, and rs174568 in FADS2 had posterior probabilities > 0.8, thereby demonstrating significant statistical support for a functional effect on eD5D activity. Highly significant associations of FADS1, FADS2, and CPT1A transcripts with their respective SNPs (p-values between 10−75 and 10−7) in Mexican Americans of the San Antonio Family Heart Study corroborated our results. These findings strongly suggest a functional role for FADS1, FADS2, and CPT1A SNPs in the variation in eD5D activity.
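The product:precursor ratio used to estimate desaturase activity can be sketched in a few lines of Python. The abstract does not name the fatty acid pairs, so the pairs below (arachidonic acid over dihomo-gamma-linolenic acid for D5D, gamma-linolenic acid over linoleic acid for D6D) are an assumption based on a commonly used convention, and all function names are hypothetical.

```python
def desaturase_ratio(product: float, precursor: float) -> float:
    """Estimate desaturase activity as a product:precursor ratio.

    Both arguments are fatty acid levels in the same units
    (for example, percent of total fatty acids).
    """
    if product < 0:
        raise ValueError("product level cannot be negative")
    if precursor <= 0:
        raise ValueError("precursor level must be positive")
    return product / precursor


def estimate_d5d(aa_20_4n6: float, dgla_20_3n6: float) -> float:
    # Assumed pair (not stated in the abstract): 20:4n-6 / 20:3n-6.
    return desaturase_ratio(aa_20_4n6, dgla_20_3n6)


def estimate_d6d(gla_18_3n6: float, la_18_2n6: float) -> float:
    # Assumed pair (not stated in the abstract): 18:3n-6 / 18:2n-6.
    return desaturase_ratio(gla_18_3n6, la_18_2n6)
```

Because the estimate is a ratio, it is unit-free as long as product and precursor are measured on the same scale; it is an indirect proxy for enzyme activity rather than a direct measurement.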
essential fatty acids; single-nucleotide polymorphisms; bayesian quantitative trait nucleotide analysis
The high-density-lipoprotein-(HDL-) associated esterase paraoxonase 1 (PON1) is a likely contributor to the antioxidant and antiatherosclerotic capabilities of HDL. Two nonsynonymous mutations in the structural gene, PON1, have been associated with variation in activity levels, but substantial interindividual differences remain unexplained and are greatest for substrates other than the eponymous paraoxon. PON1 activity levels were measured for three substrates (organophosphate paraoxon, arylester phenyl acetate, and lactone dihydrocoumarin) in 767 Mexican American individuals from San Antonio, Texas. Genetic influences on activity levels for each substrate were evaluated by association with approximately one million single nucleotide polymorphisms (SNPs) while conditioning on PON1 genotypes. Significant associations were detected at five loci including regions on chromosomes 4 and 17 known to be associated with atherosclerosis and lipoprotein regulation and loci on chromosome 3 that regulate ubiquitous transcription factors. These loci explain 7.8% of variation in PON1 activity with lactone as a substrate, 5.6% with the arylester, and 3.0% with paraoxon. In light of the potential importance of PON1 in preventing cardiovascular disease/events, these novel loci merit further investigation.
Elucidating the genetic architecture of preeclampsia is a major goal in obstetric medicine. We have performed a genome-wide association study (GWAS) for preeclampsia in unrelated Australian individuals of Caucasian ancestry using the Illumina OmniExpress-12 BeadChip to successfully genotype 648,175 SNPs in 538 preeclampsia cases and 540 normal pregnancy controls. Two SNP associations (rs7579169, p = 3.58×10−7, OR = 1.57; rs12711941, p = 4.26×10−7, OR = 1.56) satisfied our genome-wide significance threshold (modified Bonferroni p<5.11×10−7). These SNPs reside in an intergenic region less than 15 kb downstream from the 3′ terminus of the Inhibin, beta B (INHBB) gene on 2q14.2. They are in linkage disequilibrium (LD) with each other (r2 = 0.92), but not (r2<0.80) with any other genotyped SNP ±250 kb. DNA re-sequencing in and around the INHBB structural gene identified an additional 25 variants. Of the 21 variants that we successfully genotyped back in the case-control cohort the most significant association observed was for a third intergenic SNP (rs7576192, p = 1.48×10−7, OR = 1.59) in strong LD with the two significant GWAS SNPs (r2>0.92). We attempted to provide evidence of a putative regulatory role for these SNPs using bioinformatic analyses and found that they all reside within regions of low sequence conservation and/or low complexity, suggesting functional importance is low. We also explored the mRNA expression in decidua of genes ±500 kb of INHBB and found a nominally significant correlation between a transcript encoded by the EPB41L5 gene, ∼250 kb centromeric to INHBB, and preeclampsia (p = 0.03). We were unable to replicate the associations shown by the significant GWAS SNPs in case-control cohorts from Norway and Finland, leading us to conclude that it is more likely that these SNPs are in LD with as yet unidentified causal variant(s).
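The significance threshold quoted above can be put in context with a short calculation. The sketch below assumes a family-wise error rate of 0.05, which the abstract does not state, and the function names are hypothetical. It compares a plain Bonferroni correction over all 648,175 genotyped SNPs with the reported modified threshold of 5.11×10−7, and back-calculates how many independent tests that modified threshold would imply.

```python
def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Per-test significance threshold under a plain Bonferroni correction."""
    if not 0.0 < alpha < 1.0:
        raise ValueError("alpha must lie strictly between 0 and 1")
    if n_tests < 1:
        raise ValueError("n_tests must be at least 1")
    return alpha / n_tests


def implied_number_of_tests(alpha: float, threshold: float) -> float:
    """Number of independent tests implied by a given per-test threshold."""
    if threshold <= 0.0:
        raise ValueError("threshold must be positive")
    return alpha / threshold


ALPHA = 0.05                    # assumption: not stated in the abstract
N_SNPS = 648_175                # genotyped SNPs, from the abstract
REPORTED_THRESHOLD = 5.11e-7    # modified Bonferroni threshold, from the abstract
GWAS_P_VALUES = {"rs7579169": 3.58e-7, "rs12711941": 4.26e-7}

if __name__ == "__main__":
    plain = bonferroni_threshold(ALPHA, N_SNPS)
    print(f"plain Bonferroni threshold: {plain:.3g}")
    print(f"implied independent tests: "
          f"{implied_number_of_tests(ALPHA, REPORTED_THRESHOLD):.0f}")
    for snp, p in GWAS_P_VALUES.items():
        print(snp, "passes modified:", p < REPORTED_THRESHOLD,
              "| passes plain:", p < plain)
```

Under the assumed alpha, both GWAS SNPs clear the modified threshold but would not clear a plain Bonferroni correction across every genotyped SNP, which is why the choice of correction matters when reading these results.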
Heart rate (HR) has been identified as a risk factor for cardiovascular disease (CVD), yet little is known regarding genetic factors influencing this phenotype. Previous research in American Indians (AIs) from the Strong Heart Family Study (SHFS) identified a significant quantitative trait locus (QTL) for HR on chromosome 9p21. We conducted a genetic association analysis of HR in the SHFS. HR was measured from electrocardiogram (ECG) and echocardiograph (Echo) Doppler recordings. We examined 2248 single-nucleotide polymorphisms (SNPs) on chromosome 9p21 for association using a gene-centric statistical test. We replicated the aforementioned QTL [logarithm of odds (LOD) = 4.83; genome-wide P = 0.0003] on chromosome 9p21 in one SHFS population using joint linkage of ECG and Echo HR. After correcting for effective number of SNPs using a gene-centric test, six SNPs (rs7875153, rs7848524, rs4446809, rs10964759, rs1125488 and rs7853123) remained significant. We applied a novel bivariate association method, which was a joint test of association of a single locus to two traits using a standard additive genetic model. The SNP rs7875153 provided the strongest evidence for association (P = 7.14 × 10−6). This SNP is rare (minor allele frequency = 0.02) in AIs and is located within intron 9 of the gene KIAA1797. To support this association, we applied lymphocyte RNA expression data from the San Antonio Family Heart Study, a longitudinal study of CVD in Mexican Americans. Expression levels of KIAA1797 were significantly associated (P = 0.012) with HR. These findings in independent populations support the hypothesis that KIAA1797 genetic variation may be associated with HR, but elucidation of a functional relationship requires additional study.
Genome-wide association studies that compare the statistical association between thousands of DNA variations and a human trait have detected 958 loci across 127 different diseases and traits. However, these statistical associations only provide evidence for genomic regions likely to harbor a causal gene(s) and do not directly identify such genes. We combined gene variation and expression data in a human cohort to identify causal genes.
RESEARCH DESIGN AND METHODS
Global gene transcription activity was obtained for each individual in a large human cohort (n = 1,240). These quantitative transcript data were tested for correlation with genotype data generated from the same individuals to identify gene expression patterns influenced by the variants.
Variant rs8050136 lies within intron 1 of the FTO gene on chromosome 16 and marks a locus strongly associated with type 2 diabetes and obesity and widely replicated across many populations. We report that genetic variation at this locus does not influence FTO gene expression levels (P = 0.38), but is strongly correlated with expression of RBL2 (P = 2.7 × 10−5), ∼270,000 base pairs distant to FTO.
These data suggest that variants at FTO influence RBL2 gene expression at large genetic distances. This observation underscores the complexity of human transcriptional regulation and highlights the utility of large human cohorts in which both genetic variation and global gene expression data are available to identify disease genes. Expedient identification of genes mediating the effects of genome-wide association study–identified loci will enable mechanism-of-action studies and accelerate understanding of human disease processes under genetic influence.
Hyperuricemia is associated with the metabolic syndrome, gout, and renal and cardiovascular disease (CVD). American Indians have high rates of CVD, and 25% of individuals in the Strong Heart Family Study (SHFS) have high serum uric acid levels. The aim of this study was to investigate the genetic determinants of serum uric acid variation in American Indian participants of the SHFS. A variance component decomposition approach (implemented in SOLAR) was used to conduct univariate genetic analyses in each of three study centers and the combined sample. Serum uric acid was adjusted for age, sex, age*sex, BMI, estimated glomerular filtration rate, alcohol intake, diabetic status and medications. Overall mean ± SD serum uric acid for all individuals was 5.14 ± 1.5 mg/dl. Serum uric acid was found to be significantly heritable (0.46 ± 0.03 in all centers, and 0.39 ± 0.07, 0.51 ± 0.05, 0.44 ± 0.06 in Arizona, Dakotas and Oklahoma, respectively). Multipoint linkage analysis showed significant evidence of linkage for serum uric acid on chromosome 11 in the Dakotas center (logarithm of odds score (LOD) = 3.02) and in the combined sample (LOD = 3.56) and on chromosome 1 (LOD = 3.51) in the combined sample. A strong positional candidate gene in the chromosome 11 region is solute carrier family 22, member 12 (SLC22A12), which encodes a major uric acid transporter, URAT1. These results show a significant genetic influence and a possible role for one or more genes on chromosomes 1 and 11 on the variation in serum uric acid in American Indian populations.
SLC22A12 gene; URAT1; variance component decomposition approach; chromosome
Population studies have demonstrated an important role of social, behavioral, and environmental factors in blood pressure levels. Accounting for the genetic interaction of these factors may help to identify common blood pressure susceptibility alleles.
Methods and Results
We studied the interaction of additive genetic effects and behavioral (physical activity, smoking, alcohol use) and socioeconomic (education) factors on blood pressure in approximately 3,600 American Indian participants of the Strong Heart Family Study, using variance component models. The mean and standard deviation of resting systolic and diastolic blood pressures were 123 ± 17 and 76 ± 11 mm Hg, respectively. We detected evidence for distinct genetic effects on diastolic blood pressure among ever smokers compared to never smokers (P = 0.01). For alcohol intake, we observed significant genotype-by-environment interactions on diastolic (ρg = 0.10, P = 0.0003) and on systolic blood pressures (ρg = 0.59, P = 0.0008) among current drinkers compared to former or never drinkers. We also detected genotype-by-physical activity interactions on diastolic blood pressure (ρg = 0.35, P = 0.0004). Lastly, there was evidence for distinct genetic effects on diastolic blood pressure among individuals with less than high school education compared to those with 12 or more years of education (ρg = 0.41, P = 0.02).
Our findings suggest that behavioral and socioeconomic factors can modify the genetic effects on blood pressure phenotypes. Accounting for context dependent factors may help us to better understand the complexities of the gene effects on blood pressure and other complex phenotypes with high levels of genetic heterogeneity.
epidemiology; genetics; blood pressure
Pulse pressure, a measure of central arterial stiffness and a predictor of cardiovascular mortality, has known genetic components.
To localize the genetic effects of pulse pressure, we conducted a genome-wide linkage analysis of 1,892 American Indian participants of the Strong Heart Family Study. Blood pressure was measured three times and the average of the last two measures was used for analyses. Pulse pressure, the difference between systolic and diastolic blood pressures, was log-transformed and adjusted for the effects of age and sex within each study center. Variance component linkage analyses were performed using marker allele frequencies derived from all individuals and multipoint identity-by-descent matrices calculated in Loki.
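The phenotype construction described above (average the last two of three readings, take systolic minus diastolic, then log-transform) can be sketched as follows. The abstract does not state the base of the logarithm, so the natural log is an assumption here, and the subsequent adjustment for age and sex within study center is not shown. Function names are hypothetical.

```python
import math
from typing import Sequence, Tuple

Reading = Tuple[float, float]  # (systolic, diastolic) in mm Hg


def pulse_pressure(readings: Sequence[Reading]) -> float:
    """Pulse pressure from three readings, using the mean of the last two."""
    if len(readings) != 3:
        raise ValueError("expected exactly three (systolic, diastolic) readings")
    last_two = readings[-2:]
    mean_sbp = sum(sbp for sbp, _ in last_two) / 2
    mean_dbp = sum(dbp for _, dbp in last_two) / 2
    pp = mean_sbp - mean_dbp
    if pp <= 0:
        raise ValueError("pulse pressure must be positive")
    return pp


def log_pulse_pressure(readings: Sequence[Reading]) -> float:
    # Assumption: natural log; the abstract says only "log-transformed".
    return math.log(pulse_pressure(readings))
```

Discarding the first reading is a common way to reduce the influence of an elevated initial measurement; the log transform is typically applied to reduce skewness before variance component modeling.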
We identified a quantitative trait locus influencing pulse pressure on chromosome 7 at 37 cM (marker D7S493, LOD=3.3) and suggestive evidence of linkage on chromosome 19 at 92 cM (marker D19S888, LOD=1.8).
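To give a feel for the LOD scores reported above, the sketch below converts a LOD score into a likelihood ratio statistic and an asymptotic pointwise p-value. It assumes the usual large-sample result for a single variance component tested on the boundary (a 50:50 mixture of a point mass at zero and a chi-square with one degree of freedom). This is a generic approximation, not the procedure reported by the study, and it yields pointwise rather than genome-wide significance. Function names are hypothetical.

```python
import math


def lod_to_chi_square(lod: float) -> float:
    """Convert a LOD score (log10 likelihood ratio) to a likelihood ratio statistic."""
    return 2.0 * math.log(10.0) * lod


def lod_to_pointwise_p(lod: float) -> float:
    """Asymptotic pointwise p-value, assuming a 50:50 mixture of 0 and chi-square(1)."""
    if lod < 0:
        raise ValueError("LOD score must be non-negative")
    if lod == 0:
        return 1.0
    chi_square = lod_to_chi_square(lod)
    # Upper tail of chi-square with 1 df: P(X > x) = erfc(sqrt(x / 2)).
    return 0.5 * math.erfc(math.sqrt(chi_square / 2.0))


if __name__ == "__main__":
    for label, lod in (("chromosome 7", 3.3), ("chromosome 19", 1.8)):
        print(f"{label}: LOD = {lod}, pointwise p = {lod_to_pointwise_p(lod):.2g}")
```

Under this approximation a LOD of 3 corresponds to a pointwise p-value of about 0.0001, which is the conventional benchmark for significant linkage.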
The signal on 7p15.3 overlaps positive findings for pulse pressure among Utah population samples, suggesting that this region may harbor gene variants for blood pressure related traits.
Genetics; pulse pressure; American Indian
Circulating soluble intercellular adhesion molecule-1 (sICAM-1) is a biochemical marker of inflammation. We performed variance-components-based quantitative genetic analyses in SOLAR of sICAM-1 in 1170 individuals from Mexican American families in the San Antonio Family Heart Study. The trait is heritable (h2 = 0.50±0.06, P<10−6). Multipoint linkage analysis using a ∼10-cM microsatellite map revealed a region on Chromosome 19p near marker D19S586 showing strong evidence of linkage for sICAM-1 (empirically adjusted univariate-equivalent LOD = 4.95), coincident with the structural gene ICAM1. This region has been identified previously as a QTL for inflammatory, autoimmune, and metabolic syndrome traits. There is significant evidence (P=0.0023) of locus heterogeneity for sICAM-1 in this sample: a subset of pedigrees contributes most of the linkage signal for sICAM-1 on Chromosome 19, suggesting a logical focus for future genetic dissection of the trait.
ICAM-1; inflammation; genetic heterogeneity; genome scan; quantitative trait locus; Mexican Americans
Whole-transcriptome expression profiling provides novel phenotypes for analysis of complex traits. Gene expression measurements reflect quantitative variation in transcript-specific messenger RNA levels and represent phenotypes lying close to the action of genes. Understanding the genetic basis of gene expression will provide insight into the processes that connect genotype to clinically significant traits, a central tenet of systems biology. Synchronous in vivo expression profiles of lymphocytes, muscle, and subcutaneous fat were obtained from healthy Mexican men. Most genes were expressed at detectable levels in multiple tissues, and RNA levels were correlated between tissue types. A subset of transcripts with high reliability of expression across tissues (estimated by intraclass correlation coefficients) was enriched for cis-regulated genes, suggesting that proximal sequence variants may influence expression similarly in different cellular environments. This integrative global gene expression profiling approach is proving extremely useful for identifying genes and pathways that contribute to complex clinical traits. Clearly, the coincidence of clinical trait quantitative trait loci and expression quantitative trait loci can help in the prioritization of positional candidate genes. Such data will be crucial for the formal integration of positional and transcriptomic information characterized as genetical genomics.
Infection with Epstein-Barr virus (EBV) is highly prevalent worldwide, and it has been associated with infectious mononucleosis and severe diseases including Burkitt lymphoma, Hodgkin lymphoma, nasopharyngeal lymphoma, and lymphoproliferative disorders. Although EBV has been the focus of extensive research, much still remains unknown concerning what makes some individuals more sensitive to infection and to adverse outcomes as a result of infection. Here we use an integrative genomics approach to localize genetic factors influencing levels of EBV nuclear antigen-1 (EBNA-1) IgG antibodies, as a measure of history of infection with this pathogen, in large Mexican American families. Genome-wide evidence of both significant linkage and association was obtained on chromosome 6 in the human leukocyte antigen (HLA) region and replicated in an independent Mexican American sample of large families (minimum p-value in combined analysis of both datasets is 1.4×10−15 for SNPs rs477515 and rs2516049). Conditional association analyses indicate the presence of at least two separate loci within MHC class II, and along with lymphocyte expression data suggest genes HLA-DRB1 and HLA-DQB1 as the best candidates. The association signals are specific to EBV and are not found with IgG antibodies to 12 other pathogens examined, and therefore do not simply reveal a general HLA effect. We investigated whether SNPs significantly associated with diseases in which EBV is known or suspected to play a role (namely nasopharyngeal lymphoma, Hodgkin lymphoma, systemic lupus erythematosus, and multiple sclerosis) also show evidence of association with EBNA-1 antibody levels, finding an overlap for the HLA locus only and none elsewhere in the genome. The significance of this work is that a major locus related to EBV infection has been identified, which may ultimately reveal the underlying mechanisms by which the immune system regulates infection with this pathogen.
Many factors influence individual differences in susceptibility to infectious disease, including genetic factors of the host. Here we use several genome-wide investigative tools (linkage, association, joint linkage and association, and the analysis of gene expression data) to search for host genetic factors influencing Epstein-Barr virus (EBV) infection. EBV is a human herpes virus that infects up to 90% of adults worldwide, and infection has been associated with severe complications including malignancies and autoimmune disorders. In a sample of >1,300 Mexican American family members, we found significant evidence of association of anti-EBV antibody levels with loci on chromosome 6 in the human leukocyte antigen region, which contains genes related to immune function. The top two independent loci in this region were HLA-DRB1 and HLA-DQB1, both of which are involved in the presentation of foreign antigens to T cells. This finding was specific to EBV and not to 12 other pathogens we examined. We also report an overlap of genetic factors influencing both EBV antibody level and EBV-related cancers and autoimmune disorders. This work demonstrates the presence of EBV susceptibility loci and provides impetus for further investigation to better understand the underlying mechanisms related to differences in disease progression among individuals infected with this pathogen.
nuclear factor kappa B; gene expression network; principal components factor analysis; linkage analysis; systems genetics
A large number of genome-wide association studies have been performed during the past five years to identify associations between SNPs and human complex diseases and traits. Assigning a functional role to an identified disease-associated SNP is not straightforward. Genome-wide expression quantitative trait locus (eQTL) analysis is frequently used as the initial step to define a function, while allele-specific gene expression (ASE) analysis has not yet gained widespread use in disease mapping studies. We compared the power to identify cis-acting regulatory SNPs (cis-rSNPs) by genome-wide ASE analysis with that of traditional eQTL mapping. Our study included 395 healthy blood donors for whom global gene expression profiles in circulating monocytes were determined by Illumina BeadArrays. ASE was assessed in a subset of these monocytes from 188 donors by quantitative genotyping of mRNA using a genome-wide panel of SNP markers. The performance of the two methods for detecting cis-rSNPs was evaluated by comparing associations between SNP genotypes and gene expression levels in sample sets of varying size. We found that up to 8-fold more samples are required for eQTL mapping to reach the same statistical power as that obtained by ASE analysis for the same rSNPs. The performance of ASE is insensitive to SNPs with low minor allele frequencies, and ASE detects a larger number of significantly associated rSNPs using the same sample size as eQTL mapping. An unequivocal conclusion from our comparison is that ASE analysis is more sensitive for detecting cis-rSNPs than standard eQTL mapping. Our study shows the potential of ASE mapping in tissue samples and primary cells, which are difficult to obtain in large numbers.
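The contrast between the two approaches compared in the abstract above can be illustrated with a minimal sketch. It is not the authors' pipeline: the function names, the choice of a least-squares slope for the eQTL signal, and the one-sample t statistic on log2 allelic ratios for the ASE signal are all assumptions made for illustration, and the numbers are invented toy values. The point is only that eQTL mapping compares total expression between individuals of different genotypes, whereas ASE compares the two alleles within each heterozygous individual.

```python
from math import log2, sqrt
from statistics import mean, stdev


def eqtl_slope(genotypes, expression):
    """Least-squares slope of total expression on allele count (0, 1, 2).

    Hypothetical helper: a between-individual comparison, as in eQTL mapping.
    """
    g_bar, e_bar = mean(genotypes), mean(expression)
    sxy = sum((g - g_bar) * (e - e_bar) for g, e in zip(genotypes, expression))
    sxx = sum((g - g_bar) ** 2 for g in genotypes)
    return sxy / sxx


def ase_t_statistic(allele_counts):
    """One-sample t statistic testing mean log2(alt/ref) against zero.

    Hypothetical helper: a within-individual comparison in heterozygotes,
    where each (ref, alt) pair holds allele-specific transcript counts.
    """
    ratios = [log2(alt / ref) for ref, alt in allele_counts]
    return mean(ratios) / (stdev(ratios) / sqrt(len(ratios)))


# Toy data (invented): six individuals for eQTL, three heterozygotes for ASE.
toy_genotypes = [0, 0, 1, 1, 2, 2]
toy_expression = [1.0, 1.2, 1.5, 1.7, 2.0, 2.2]
toy_het_counts = [(100, 150), (80, 130), (120, 170)]

slope = eqtl_slope(toy_genotypes, toy_expression)
t_stat = ase_t_statistic(toy_het_counts)
```

Because each heterozygote acts as its own control in the ASE comparison, between-individual variation in total expression does not enter the test, which is one plausible reason for the sample-size advantage the abstract reports.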
Chronic kidney disease (CKD) is an important public health problem in American Indian populations. Recent research has identified associations of polymorphisms in the myosin heavy chain type II isoform A (MYH9) gene with hypertensive CKD in African-Americans. Whether these associations are also present among American Indian individuals is unknown. To evaluate the role of genetic polymorphisms in the MYH9 gene on kidney disease in American Indians, we genotyped 25 SNPs in the MYH9 gene region in 1,119 comparatively unrelated individuals. Four SNPs failed, and one SNP was monomorphic. We inferred haplotypes using seven SNPs within the region of the previously described E haplotype using Phase v2.1. We studied the association of the remaining 20 MYH9 SNPs with kidney function (estimated glomerular filtration rate, eGFR) and CKD (eGFR < 60 ml/min/1.73 m² or renal replacement therapy or kidney transplant) using age-, sex- and center-adjusted models and measured genotype analysis within variance component models. MYH9 SNPs were not significantly associated with kidney traits in adjusted additive or recessive genetic models. MYH9 haplotypes were also not significantly associated with kidney outcomes. In conclusion, common variants in MYH9 may not confer an increased risk of CKD in American Indian populations. Identification of the actual functional genetic variation responsible for the associations seen in African-Americans will likely help to clarify the lack of replication of this gene in our population of American Indians.
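The CKD case definition and the SNP count in the abstract above can be written out as a short sketch. The function and parameter names are hypothetical and chosen only for illustration; the threshold and the counts come from the abstract itself.

```python
def has_ckd(egfr, on_renal_replacement=False, has_transplant=False):
    """Apply the abstract's CKD definition.

    CKD is present if eGFR < 60 ml/min/1.73 m^2, or the person is on renal
    replacement therapy, or has received a kidney transplant.
    Parameter names are hypothetical.
    """
    return egfr < 60.0 or on_renal_replacement or has_transplant


def analyzable_snps(genotyped, failed, monomorphic):
    """Number of SNPs left for association testing after quality control."""
    return genotyped - failed - monomorphic


# Counts reported in the abstract: 25 genotyped, 4 failed, 1 monomorphic.
n_snps = analyzable_snps(genotyped=25, failed=4, monomorphic=1)
```

With the reported counts this leaves the 20 SNPs that the abstract says were tested for association.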
Systemic lupus erythematosus (SLE) is the prototype autoimmune disease in which genes regulated by type I interferon (IFN) are over-expressed and contribute to the disease pathogenesis. Because signal transducer and activator of transcription 4 (STAT4) plays a key role in type I IFN receptor signaling, we performed a candidate gene study of a comprehensive set of single nucleotide polymorphisms (SNPs) in STAT4 in Swedish patients with SLE. We found that 10 out of 53 analyzed SNPs in STAT4 were associated with SLE, with the strongest signal of association (P = 7.1 × 10^−8) for two perfectly linked SNPs, rs10181656 and rs7582694. The risk alleles of these 10 SNPs form a common risk haplotype for SLE (P = 1.7 × 10^−5). According to conditional logistic regression analysis, the SNP rs10181656 or rs7582694 accounts for all of the observed association signal. By quantitative analysis of the allelic expression of STAT4 we found that the risk allele of STAT4 was over-expressed in primary human cells of mesenchymal origin, but not in B-cells, and that the risk allele of STAT4 was over-expressed (P = 8.4 × 10^−5) in cells carrying the risk haplotype for SLE compared with cells with a non-risk haplotype. The risk allele of the SNP rs7582694 in STAT4 correlated with production of anti-dsDNA (double-stranded DNA) antibodies and displayed a multiplicatively increased, 1.82-fold risk of SLE with two independent risk alleles of the IRF5 (interferon regulatory factor 5) gene.