Several infrequent genetic polymorphisms in the SERPINA1 gene are known to substantially reduce concentration of alpha1-antitrypsin (AAT) in the blood. Since low AAT serum levels fail to protect pulmonary tissue from enzymatic degradation, these polymorphisms also increase the risk for early onset chronic obstructive pulmonary disease (COPD). The role of more common SERPINA1 single nucleotide polymorphisms (SNPs) in respiratory health remains poorly understood.
We present here an agnostic investigation of genetic determinants of circulating AAT levels in a general population sample by performing a genome-wide association study (GWAS) in 1392 individuals of the SAPALDIA cohort.
Five common SNPs, defined as having minor allele frequencies (MAFs) >5%, reached genome-wide significance, all located in the SERPINA gene cluster at 14q32.13. The top-ranking genotyped SNP rs4905179 was associated with an estimated effect of β = −0.068 g/L per minor allele (P = 1.20 × 10−12). However, denser SERPINA1 locus genotyping in 5569 participants with subsequent stepwise conditional analysis, as well as exon-sequencing in a subsample (N = 410), suggested that AAT serum level is causally determined at this locus by rare (MAF<1%) and low-frequency (MAF 1–5%) variants only, in particular by the well-documented protein inhibitor S and Z (PI S, PI Z) variants. Replication of the association of rs4905179 with AAT serum levels in the Copenhagen City Heart Study (N = 8273) was successful (P<0.0001), as was the replication of its synthetic nature (the effect disappeared after adjusting for PI S and Z, P = 0.57). Extending the analysis to lung function revealed a more complex situation. Only in individuals with severely compromised pulmonary health (N = 397) were associations of common SNPs at this locus with lung function driven by rarer PI S or Z variants. Overall, our meta-analysis of lung function in ever-smokers does not support a functional role of common SNPs in the SERPINA gene cluster in the general population.
Low levels of alpha1-antitrypsin (AAT) in the blood are a well-established risk factor for accelerated loss in lung function and chronic obstructive pulmonary disease. While a few infrequent genetic polymorphisms are known to influence the serum levels of this enzyme, the role of common genetic variants has not been examined so far. The present genome-wide scan for associated variants in approximately 1400 Swiss inhabitants revealed a chromosomal locus containing the functionally established variants of AAT deficiency and variants previously associated with lung function and emphysema. We used dense genotyping of this genetic region in more than 5500 individuals and subsequent conditional analyses to unravel which of these associated variants contribute independently to the phenotype's variability. All associations of common variants could be attributed to the rarer functionally established variants, a result which was then replicated in an independent population-based Danish cohort. Hence, this locus represents a textbook example of how a large part of a trait's heritability can be hidden in infrequent genetic polymorphisms. The attempt to transfer these results to lung function furthermore suggests that effects of common variants in this genetic region in ever-smokers may also be explained by rarer variants, but only in individuals with hampered pulmonary health.
DNA methylation plays an important role in the development of disease and the process of aging. In this study we examine DNA methylation at 476,366 sites throughout the genome of white blood cells from a population cohort (N = 421) ranging in age from 14 to 94 years. Age affects DNA methylation at almost one third (29%) of the sites (Bonferroni-adjusted P-value <0.05), of which 60.5% become hypomethylated and 39.5% hypermethylated with increasing age. DNA methylation sites located within CpG islands (CGIs) more often become hypermethylated than sites outside an island. CpG sites in promoters are more often unaffected by age, whereas sites in enhancers more often become hypo- or hypermethylated. Hypermethylated sites are overrepresented among genes that are involved in DNA binding, transcription regulation, processes of anatomical structure and developmental process and cortex neuron differentiation (P-value down to P = 9.14 × 10−67). By contrast, hypomethylated sites are not strongly overrepresented among any biological function or process. Our results indicate that 23% of the variation in DNA methylation is attributable to chronological age, and that hypermethylation is more site-specific than hypomethylation. It appears that the changes in DNA methylation partly overlap with regions that change histone modifications with age, indicating an interaction between the two major epigenetic mechanisms. Epigenetic modifications and changes in gene expression over time most likely reflect the natural process of aging, and variation between individuals might contribute to the development of age-related phenotypes and diseases such as type II diabetes, autoimmune and cardiovascular disease.
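The Bonferroni criterion mentioned above can be made concrete with a little arithmetic. The sketch below assumes the adjustment is a plain multiplication of each raw P-value by the number of tested sites (476,366, from the abstract); the function name and the example raw P-values are hypothetical and are not taken from the study.

```python
N_SITES = 476_366  # number of methylation sites tested (from the abstract)
ALPHA = 0.05       # significance level applied to adjusted P-values


def bonferroni_adjust(p_raw: float, n_tests: int = N_SITES) -> float:
    """Return the Bonferroni-adjusted P-value, capped at 1."""
    return min(1.0, p_raw * n_tests)


# Equivalent per-site threshold on the raw P-value scale.
raw_threshold = ALPHA / N_SITES
print(f"raw P-value threshold: {raw_threshold:.3e}")

# Hypothetical raw P-values, used only to show the decision rule.
for p in (1e-9, 1e-7, 1e-6):
    adjusted = bonferroni_adjust(p)
    print(f"raw={p:.0e} adjusted={adjusted:.3g} significant={adjusted < ALPHA}")
```

Under this assumption, a site counts as age-associated only if its raw P-value falls below roughly 1 × 10−7.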
Advanced ileocecal Crohn's disease (ICD) is characterized by strictures, inflammation in the enteric nervous system (myenteric plexitis), and a high frequency of NOD2 mutations. Recent findings implicate a role of NOD2 and another CD susceptibility gene, ATG16L1, in the host response against single-stranded RNA (ssRNA) viruses. However, the role of viruses in CD is unknown. We hypothesized that human enterovirus species B (HEV-B), which are ssRNA viruses with dual tropism both for the intestinal epithelium and the nervous system, could play a role in ICD.
We used immunohistochemistry and in situ hybridization to study the general presence of HEV-B and the presence of the two HEV-B subspecies, Coxsackie B virus (CBV) and Echovirus, in ileocecal resections from 9 children with advanced, stricturing ICD and 6 patients with volvulus, and in intestinal biopsies from 15 CD patients at the time of diagnosis.
All patients with ICD had disease-associated polymorphisms in NOD2 or ATG16L1. Positive staining for HEV-B was detected both in the mucosa and in myenteric nerve ganglia in all ICD patients, but in none of the volvulus patients. Expression of the cellular receptor for CBV, CAR, was detected in nerve cell ganglia.
The common presence of HEV-B in the mucosa and enteric nervous system of ICD patients in this small cohort is a novel finding that warrants further investigation to analyze whether HEV-B has a role in disease onset or progress. The presence of CAR in myenteric nerve cell ganglia provides a possible route of entry for CBV into the enteric nervous system.
Glycosylation of immunoglobulin G (IgG) influences IgG effector function by modulating binding to Fc receptors. To identify genetic loci associated with IgG glycosylation, we quantitated N-linked IgG glycans using two approaches. After isolating IgG from human plasma, we performed 77 quantitative measurements of N-glycosylation using ultra-performance liquid chromatography (UPLC) in 2,247 individuals from four European discovery populations. In parallel, we measured IgG N-glycans using MALDI-TOF mass spectrometry (MS) in a replication cohort of 1,848 Europeans. Meta-analysis of genome-wide association study (GWAS) results identified 9 genome-wide significant loci (P<2.27×10−9) in the discovery analysis and two of the same loci (B4GALT1 and MGAT3) in the replication cohort. Four loci contained genes encoding glycosyltransferases (ST6GAL1, B4GALT1, FUT8, and MGAT3), while the remaining 5 contained genes that have not been previously implicated in protein glycosylation (IKZF1, IL6ST-ANKRD55, ABCF2-SMARCD3, SUV420H1, and SMARCB1-DERL3). However, most of them have been strongly associated with autoimmune and inflammatory conditions (e.g., systemic lupus erythematosus, rheumatoid arthritis, ulcerative colitis, Crohn's disease, diabetes type 1, multiple sclerosis, Graves' disease, celiac disease, nodular sclerosis) and/or haematological cancers (acute lymphoblastic leukaemia, Hodgkin lymphoma, and multiple myeloma). Follow-up functional experiments in haplodeficient Ikzf1 knock-out mice showed the same general pattern of changes in IgG glycosylation as identified in the meta-analysis. As IKZF1 was associated with multiple IgG N-glycan traits, we explored biomarker potential of affected N-glycans in 101 cases with SLE and 183 matched controls and demonstrated substantial discriminative power in a ROC-curve analysis (area under the curve = 0.842). Our study shows that it is possible to identify new loci that control glycosylation of a single plasma protein using GWAS. 
The results may also provide an explanation for the reported pleiotropy and antagonistic effects of loci involved in autoimmune diseases and haematological cancer.
After analysing glycans attached to human immunoglobulin G in 4,095 individuals, we performed the first genome-wide association study (GWAS) of the glycome of an individual protein. Nine genetic loci were found to associate with glycans with genome-wide significance. Of these, four contained genes encoding enzymes that directly participate in IgG glycosylation, thus the observed associations were biologically founded. The remaining five genetic loci were not previously implicated in protein glycosylation, but most of them have been reported to be relevant for autoimmune and inflammatory conditions and/or haematological cancers. A particularly interesting gene, IKZF1, was found to be associated with multiple IgG N-glycans. This gene has been implicated in numerous diseases, including systemic lupus erythematosus (SLE). We analysed N-glycans in 101 cases with SLE and 183 matched controls and demonstrated their substantial biomarker potential. Our study shows that it is possible to identify new loci that control glycosylation of a single plasma protein using GWAS. Our results may also provide an explanation for opposite effects of some genes in autoimmune diseases and haematological cancer.
We have used targeted genomic sequencing of high-complexity DNA pools based on long-range PCR and deep DNA sequencing by the SOLiD technology. The method was used for sequencing of 286 kb from four chromosomal regions with quantitative trait loci (QTL) influencing blood plasma lipid and uric acid levels in DNA pools of 500 individuals from each of five European populations. The method shows very good precision in estimating allele frequencies as compared with individual genotyping of SNPs (r2=0.95, P<10−16). Validation shows that the method is able to identify novel SNPs and estimate their frequency in high-complexity DNA pools. In our five populations, 17% of all SNPs and 61% of structural variants are not available in the public databases. A large fraction of the novel variants show a limited geographic distribution, with 62% of the novel SNPs and 59% of novel structural variants being detected in only one of the populations. The large number of population-specific novel SNPs underscores the need for comprehensive sequencing of local populations in order to identify the causal variants of human traits.
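As a rough illustration of how an allele frequency might be estimated from a sequenced DNA pool and compared with individual genotyping, the sketch below takes the fraction of reads carrying the alternative allele as the pooled estimate and computes the squared Pearson correlation (r2) between the two sets of frequencies. The simple read-fraction estimator is an assumption, not necessarily the procedure used in the study, and all read counts and genotype-based frequencies are invented for illustration.

```python
from typing import Sequence


def pooled_allele_frequency(alt_reads: int, total_reads: int) -> float:
    """Estimate an allele frequency as the fraction of reads with the alternative allele."""
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    return alt_reads / total_reads


def r_squared(x: Sequence[float], y: Sequence[float]) -> float:
    """Squared Pearson correlation between two equally long sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy ** 2 / (sxx * syy)


# Hypothetical (alt_reads, total_reads) per SNP from one pool.
pool_counts = [(120, 1000), (455, 900), (30, 1100), (610, 1000)]
pool_freqs = [pooled_allele_frequency(a, t) for a, t in pool_counts]

# Hypothetical frequencies for the same SNPs from individual genotyping.
genotyped_freqs = [0.11, 0.52, 0.04, 0.60]

print(f"r2 = {r_squared(pool_freqs, genotyped_freqs):.3f}")
```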
pooling; next-generation DNA sequencing; SOLiD; SNP; indels
The high intake of game meat in populations with a subsistence-based diet may affect their blood lipids and health status.
To examine the association between diet and circulating blood lipid levels in a northern Swedish population.
We compared a group with a traditional lifestyle (TLS) based on reindeer herding (TLS group) with those from the same area with a non-traditional lifestyle (NTLS) typical of more industrialized regions of Sweden (NTLS group). The analysis was based on self-reported intake of animal source food (i.e. non-game meat, game meat, fish, dairy products and eggs) and the serum levels of a number of lipids [total cholesterol (TC), low-density lipoprotein cholesterol (LDL), high-density lipoprotein cholesterol (HDL), triglycerides (TG), glycerophospholipids and sphingolipids].
The TLS group had higher cholesterol, LDL and HDL levels than the reference group. Of the TLS group, 65% had cholesterol levels above the threshold for increased risk of coronary heart disease (≥240 mg/dl), as compared to 38% of the NTLS group. Self-reported consumption of game meat was positively associated with TC and LDL.
The high game meat consumption of the TLS group is associated with increased cholesterol levels. High intake of animal protein and fat and low fibre is known to increase the risk of cardiovascular disease, but other studies of the TLS in northern Sweden have shown comparable incidences of cardiovascular disease to the reference (NTLS) group from the same geographical area. This indicates that factors other than TC influence disease risk. One such possible factor is dietary phospholipids, which are also found in high amounts specifically in game meat and have been shown to inhibit cholesterol absorption.
epidemiology; nutrition; animal source foods; game; lipids; cholesterol; phospholipids; sphingolipids
Atopic allergy is affected by a number of environmental exposures, such as dry air and time spent outdoors, but there are few estimates of the prevalence in populations from sub-arctic areas.
To determine the prevalence and severity of symptoms related to food, inhalation and skin allergens and of coeliac disease (CD) in the sub-arctic region of Sweden. To study the correlation between self-reported allergy and allergy test results. To estimate the heritability of these measures.
The study was conducted in Karesuando and Soppero in Northern Sweden as part of the Northern Sweden Population Health Study (n=1,068). We used a questionnaire for self-reported allergy and CD status and measured inhalation-related allergens using Phadiatop, food-related allergens using the F×5 assay and IgA and IgG antibodies against tissue transglutaminase (anti-tTG) to indicate prevalence of CD.
The prevalence of self-reported allergy was very high, with 42.3% reporting mild to severe allergy. Inhalation-related allergy was reported in 26.7%, food-related allergy in 24.9% and skin-related allergy in 2.4% of the participants. Of inhalation-related allergy, 11.0% reported reactions against fur and 14.6% against pollen/grass. Among food-related reactions, 14.9% reported milk (protein and lactose) as the cause. The IgE measurements showed that 18.4% had elevated values for inhalation allergens and 11.7% for food allergens. Self-reported allergies and symptoms were positively correlated (p<0.01) with age- and sex-corrected inhalation allergens. Allergy prevalence was inversely correlated with age and number of hours spent outdoors. High levels of IgA and IgG anti-tTG antibodies, CD-related allergens, were found in 1.4 and 0.6% of participants, respectively. All allergens were found to be significantly (p<3×10−10) heritable, with estimated heritabilities ranging from 0.34 (F×5) to 0.65 (IgA).
Self-reported allergy correlated well with the antibody measurements. The prevalence of allergy was highest in the young and those working inside. Heritability of atopy and sensitization was high. The prevalence of CD-related autoantibodies was high and did not coincide with the self-reported allergy.
allergy; coeliac disease; atopic allergy; heritability; self reported allergy
Cervical cancer (CxCa) is caused by persistent human papillomavirus (HPV) infection; genetic predisposition is also suspected to play a role. The present study is a targeted candidate gene follow-up based on: i) strong clinical evidence demonstrating that mutations in the TMC6 and TMC8 (EVER1 and EVER2) genes associate with the HPV-associated disease Epidermodysplasia Verruciformis (EV), and ii) recent epidemiological data suggesting a genetic susceptibility conferred by polymorphisms in such genes for skin and cervical cancer. Clarifying the association of the TMC6/8 genes with risk of CxCa will help in understanding why some HPV-infected women develop persistent infection, cervical lesions and eventually cancer while others do not. Twenty-two single nucleotide polymorphisms (SNPs) in the region harbouring the TMC6/8 genes were genotyped in 2,989 cases with cervical intraepithelial neoplasia grade III (CINIII) or invasive cervical cancer (ICC) and 2,281 controls from the Swedish population. Association was evaluated in logistic regression models. Two SNPs displayed association with cervical disease: rs2290907 (ORGGvsAA = 0.6, 95% CI: 0.3–0.9, p = 0.02) and rs16970849 (ORAGvsGG = 0.8, 95% CI: 0.66–0.98, p = 0.03). The present data support the involvement of the TMC6/8 region in CxCa susceptibility, but further analyses are needed to replicate our findings, fully characterize the region and understand the function of the genetic variants involved.
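To show how a genotype-contrast odds ratio and its 95% confidence interval of the kind reported above can be read, the sketch below computes a crude (unadjusted) odds ratio from a 2×2 table with a Wald interval on the log scale. This is a simpler stand-in for the logistic regression models used in the study, and the genotype counts are hypothetical, chosen only so that the point estimate comes out at 0.6.

```python
from math import exp, log, sqrt
from typing import Tuple


def odds_ratio_ci(case_alt: int, case_ref: int,
                  ctrl_alt: int, ctrl_ref: int,
                  z: float = 1.96) -> Tuple[float, float, float]:
    """Crude odds ratio for the alt vs. reference genotype, with a Wald CI."""
    counts = (case_alt, case_ref, ctrl_alt, ctrl_ref)
    if min(counts) <= 0:
        raise ValueError("all cell counts must be positive")
    odds_ratio = (case_alt * ctrl_ref) / (case_ref * ctrl_alt)
    # Standard error of log(OR): square root of the summed reciprocal counts.
    se_log_or = sqrt(sum(1.0 / c for c in counts))
    lower = exp(log(odds_ratio) - z * se_log_or)
    upper = exp(log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Hypothetical counts: GG vs. AA genotype carriers among cases and controls.
or_, lo, hi = odds_ratio_ci(case_alt=20, case_ref=500, ctrl_alt=30, ctrl_ref=450)
print(f"OR = {or_:.2f}, 95% CI: {lo:.2f} to {hi:.2f}")
```

An interval that excludes 1 corresponds to a nominally significant association at the 5% level; with these made-up counts the interval includes 1, which illustrates how strongly the width depends on the number of rare-genotype carriers.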
Cervical cancer; EVER1; EVER2; polymorphism; TMC6; TMC8
Serum concentrations of low-density lipoprotein cholesterol (LDL-C), high-density lipoprotein cholesterol (HDL-C), triglycerides (TGs) and total cholesterol (TC) are important heritable risk factors for cardiovascular disease. Although genome-wide association studies (GWASs) of circulating lipid levels have identified numerous loci, a substantial portion of the heritability of these traits remains unexplained. Evidence of unexplained genetic variance can be detected by combining multiple independent markers into additive genetic risk scores. Such polygenic scores, constructed using results from the ENGAGE Consortium GWAS on serum lipids, were applied to predict lipid levels in an independent population-based study, the Rotterdam Study-II (RS-II). We additionally tested for evidence of a shared genetic basis for different lipid phenotypes. Finally, the polygenic score approach was used to identify an alternative genome-wide significance threshold before pathway analysis and those results were compared with those based on the classical genome-wide significance threshold. Our study provides evidence suggesting that many loci influencing circulating lipid levels remain undiscovered. Cross-prediction models suggested a small overlap between the polygenic backgrounds involved in determining LDL-C, HDL-C and TG levels. Pathway analysis utilizing the best polygenic score for TC uncovered extra information compared with using only genome-wide significant loci. These results suggest that the genetic architecture of circulating lipids involves a number of undiscovered variants with very small effects, and that increasing GWAS sample sizes will enable the identification of novel variants that regulate lipid levels.
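A minimal sketch of an additive polygenic score follows. It assumes the score is the sum of allele counts weighted by per-allele effect sizes from a discovery GWAS, restricted to SNPs whose discovery P-value falls below a chosen threshold. The SNP identifiers, effect sizes, P-values and genotypes are hypothetical placeholders, not ENGAGE or RS-II data.

```python
from typing import Dict, NamedTuple


class SnpEffect(NamedTuple):
    beta: float     # per-allele effect size from the discovery GWAS
    p_value: float  # discovery P-value, used to select SNPs


def polygenic_score(dosages: Dict[str, float],
                    effects: Dict[str, SnpEffect],
                    p_threshold: float) -> float:
    """Sum of beta * allele count over SNPs with discovery P <= p_threshold."""
    return sum(
        effects[snp].beta * dosage
        for snp, dosage in dosages.items()
        if snp in effects and effects[snp].p_value <= p_threshold
    )


# Hypothetical discovery results for three SNPs.
effects = {
    "snp1": SnpEffect(beta=0.10, p_value=1e-9),
    "snp2": SnpEffect(beta=-0.05, p_value=1e-3),
    "snp3": SnpEffect(beta=0.02, p_value=0.2),
}

# Hypothetical allele counts (0, 1 or 2) for one individual.
person = {"snp1": 2, "snp2": 1, "snp3": 0}

for threshold in (5e-8, 0.01, 1.0):
    score = polygenic_score(person, effects, threshold)
    print(f"P <= {threshold:g}: score = {score:.3f}")
```

Relaxing the P-value threshold admits SNPs that are not individually significant; if scores built this way predict the trait better in an independent sample, that is evidence for many small undiscovered effects.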
serum lipids; polygenic; genome-wide association; polygenic score; pathway analysis
The genetic structure of human populations is important in population genetics, forensics and medicine. Using genome-wide scans and individuals with all four grandparents born in the same settlement, we here demonstrate remarkable geographical structure across 8–30 km in three different parts of rural Europe. After excluding close kin and inbreeding, village of origin could still be predicted correctly on the basis of genetic data for 89–100% of individuals.
population structure; principal components; genome-wide genotyping
Somatic mutations of mtDNA are implicated in the aging process, but there is no universally accepted method for their accurate quantification. We have used ultra-deep sequencing to study genome-wide mtDNA mutation load in the liver of normally- and prematurely-aging mice. Mice that are homozygous for an allele expressing a proof-reading–deficient mtDNA polymerase (mtDNA mutator mice) have 10-times-higher point mutation loads than their wildtype siblings. In addition, the mtDNA mutator mice have increased levels of a truncated linear mtDNA molecule, resulting in decreased sequence coverage in the deleted region. In contrast, circular mtDNA molecules with large deletions occur at extremely low frequencies in mtDNA mutator mice and can therefore not drive the premature aging phenotype. Sequence analysis shows that the main proportion of the mutation load in heterozygous mtDNA mutator mice and their wildtype siblings is inherited from their heterozygous mothers consistent with germline transmission. We found no increase in levels of point mutations or deletions in wildtype C57Bl/6N mice with increasing age, thus questioning the causative role of these changes in aging. In addition, there was no increased frequency of transversion mutations with time in any of the studied genotypes, arguing against oxidative damage as a major cause of mtDNA mutations. Our results from studies of mice thus indicate that most somatic mtDNA mutations occur as replication errors during development and do not result from damage accumulation in adult life.
Mitochondria represent the powerhouses of cells and have their own DNA. Mutations in the mitochondrial genome are associated with a range of human diseases and have also been implicated as a driving force behind the aging process. We have used ultra-deep sequencing to study the genome-wide mutation load in the mitochondrial DNA (mtDNA) of liver from normal inbred mice and mice that express a proof-reading–deficient mtDNA polymerase (mtDNA mutator mice), which causes premature aging. The mtDNA mutator mice show a dramatic increase of point mutations with age and have 10-times-higher point mutation levels than wildtype siblings or normal C57Bl/6N mice. Circular mtDNA molecules with large deletions occur at very low frequencies in mtDNA mutator mice and are therefore unlikely to contribute to the premature aging phenotype. We found no increase in levels of point mutations or deletions in normal mice with increasing age, arguing against the accumulation of mtDNA mutations as contributing to aging. Our results indicate that most somatic mtDNA mutations occur as replication errors during the rapid amplification of mtDNA during embryogenesis and do not result from damage accumulation in adult life.
Family-based research in genetically isolated populations is an effective approach for identifying loci influencing variation in disease traits. In common with all studies in humans, those in genetically isolated populations need ethical approval; however, existing ethical frameworks may be inadequate to protect participant privacy and confidentiality and to address participants' information needs in such populations. Using the ethical–legal guidelines of the Council for International Organizations of Medical Sciences (CIOMS) as a template, we compared the participant information leaflets and consent forms of studies in five European genetically isolated populations to identify additional information that should be incorporated into information leaflets and consent forms to guarantee satisfactorily informed consent. We highlight the additional information that participants require on the research purpose and the reasons why their population was chosen; on the potential risks and benefits of participation; on the opportunities for benefit sharing; on privacy; on the withdrawal of consent and on the disclosure of genetic data. This research raises some important issues that should be addressed properly and identifies relevant types of information that should be incorporated into information leaflets for this type of study.
informed consent; isolates; participation; EUROSPAN; information leaflets; ethics
Over half of all proteins are glycosylated, and alterations in glycosylation have been observed in numerous physiological and pathological processes. Attached glycans significantly affect protein function; but, contrary to polypeptides, they are not directly encoded by genes, and the complex processes that regulate their assembly are poorly understood. A novel approach combining genome-wide association and high-throughput glycomics analysis of 2,705 individuals in three population cohorts showed that common variants in the Hepatocyte Nuclear Factor 1α (HNF1α) and fucosyltransferase genes FUT6 and FUT8 influence N-glycan levels in human plasma. We show that HNF1α and its downstream target HNF4α regulate the expression of key fucosyltransferase and fucose biosynthesis genes. Moreover, we show that HNF1α is both necessary and sufficient to drive the expression of these genes in hepatic cells. These results reveal a new role for HNF1α as a master transcriptional regulator of multiple stages in the fucosylation process. This mechanism has implications for the regulation of immunity, embryonic development, and protein folding, as well as for our understanding of the molecular mechanisms underlying cancer, coronary heart disease, and metabolic and inflammatory disorders.
By combining recently developed high-throughput glycan analysis with genome-wide association study, we performed the first comprehensive analysis of common genetic polymorphisms that affect protein glycosylation. Over half of all proteins are glycosylated; but, due to difficulties in glycan analysis and the absence of a genetic template for their synthesis, knowledge about the complex processes that regulate glycan assembly is still limited. We demonstrated that HNF1α regulates the expression of key fucosyltransferase and fucose biosynthesis genes and acts as a master regulator of plasma protein fucosylation. Proper protein fucosylation is essential in numerous processes including inflammation, cancer, and coronary heart disease, thus the identification of a master regulator of plasma protein fucosylation has important implications for understanding both normal biological functions and disease processes.
We profile the chimpanzee transcriptome by using deep sequencing of cDNA from brain and liver, aiming to quantify expression of known genes and to identify novel transcribed regions.
Using stringent criteria for transcription, we identify 12,843 expressed genes, with a majority being found in both tissues. We further identify 9,826 novel transcribed regions that are not overlapping with annotated exons, mRNAs or ESTs. Over 80% of the novel transcribed regions map within or in the vicinity of known genes, and by combining sequencing data with de novo splice predictions we predict several of the novel transcribed regions to be new exons or 3' UTRs. For approximately 350 novel transcribed regions, the corresponding DNA sequence is absent in the human reference genome. The presence of novel transcribed regions in five genes and in one intergenic region is further validated with RT-PCR. Finally, we describe and experimentally validate a putative novel multi-exon gene that belongs to the ATP-cassette transporter gene family. This gene does not appear to be functional in human since one exon is absent from the human genome. In addition to novel exons and UTRs, novel transcribed regions may also stem from different types of noncoding transcripts. We note that expressed repeats and introns from unspliced mRNAs are especially common in our data.
Our results extend the chimpanzee gene catalogue with a large number of novel exons and 3' UTRs and thus support the view that mammalian gene annotations are not yet complete.
Pulmonary function measures are heritable traits that predict morbidity and mortality and define chronic obstructive pulmonary disease (COPD). We tested genome-wide association with forced expiratory volume in 1 s (FEV1) and the ratio of FEV1 to forced vital capacity (FVC) in the SpiroMeta consortium (n = 20,288 individuals of European ancestry). We conducted a meta-analysis of top signals with data from direct genotyping (n ≤ 32,184 additional individuals) and in silico summary association data from the CHARGE Consortium (n = 21,209) and the Health 2000 survey (n ≤ 883). We confirmed the reported locus at 4q31 and identified associations with FEV1 or FEV1/FVC and common variants at five additional loci: 2q35 in TNS1 (P = 1.11 × 10−12), 4q24 in GSTCD (P = 2.18 × 10−23), 5q33 in HTR4 (P = 4.29 × 10−9), 6p21 in AGER (P = 3.07 × 10−15) and 15q23 in THSD4 (P = 7.24 × 10−15). mRNA analyses showed expression of TNS1, GSTCD, AGER, HTR4 and THSD4 in human lung tissue. These associations offer mechanistic insight into pulmonary function regulation and indicate potential targets for interventions to alleviate respiratory disease.
SplitSeek can be used to detect novel splicing events in SOLiD RNA-seq data without the need for a pre-defined library.
We have developed a new strategy for de novo prediction of splice junctions in short-read RNA-seq data, suitable for detection of novel splicing events and chimeric transcripts. When tested on mouse RNA-seq data, >31,000 splice events were predicted, of which 88% bridged between two regions separated by ≤100 kb, and 74% connected two exons of the same RefSeq gene. Our method also reports genomic rearrangements such as insertions and deletions.
Serum creatinine (SCR) is the most important biomarker for a quick and non-invasive assessment of kidney function in population-based surveys. A substantial proportion of the inter-individual variability in SCR level is explicable by genetic factors.
We performed a meta-analysis of genome-wide association studies of SCR undertaken in five population isolates ('discovery cohorts'), all of which are part of the European Special Population Network (EUROSPAN) project. Genes showing the strongest evidence for an association with SCR (candidate loci) were replicated in two additional population-based samples ('replication cohorts').
After the discovery meta-analysis, 29 loci were selected for replication. Associations between SCR level and polymorphisms in the collagen type XXII alpha 1 (COL22A1) gene, on chromosome 8, and in the synaptotagmin-1 (SYT1) gene, on chromosome 12, were successfully replicated in the replication cohorts (p value = 1.0 × 10−6 and 1.7 × 10−4, respectively). Evidence of association was also found for polymorphisms in a locus including the gamma-aminobutyric acid receptor rho-2 (GABRR2) gene and the ubiquitin-conjugating enzyme E2-J1 (UBE2J1) gene (replication p value = 3.6 × 10−3). Previously reported findings associating glomerular filtration rate with SNPs in the uromodulin (UMOD) gene and in the shroom family member 3 (SHROOM3) gene were also replicated.
While confirming earlier results, our study provides new insights into the genetic basis of serum creatinine regulatory processes. In particular, the association with the genes SYT1 and GABRR2 corroborates previous findings that highlighted a possible role of neurotransmitter GABAA receptors in the regulation of the glomerular basement membrane and a possible interaction between GABAA receptors and synaptotagmin-I at the podocyte level.
Genome-wide homozygosity estimation from genomic data is becoming an increasingly interesting research topic. The aim of this study was to compare different methods for estimating individual homozygosity-by-descent based on the information from human genome-wide scans rather than genealogies. We considered the four most commonly used methods and investigated their applicability to single-nucleotide polymorphism (SNP) data in both a simulation study and by using the human genotyped data. A total of 986 inhabitants from the isolated Island of Vis, Croatia (where inbreeding is present, but no pedigree-based inbreeding was observed at the level of F > 0.0625) were included in this study. All individuals were genotyped with the Illumina HumanHap300 array with 317,503 SNP markers.
Simulation data suggested that multi-point FEstim is the method most strongly correlated to true homozygosity-by-descent. Correlation coefficients between the homozygosity-by-descent estimates were high but only for inbred individuals, with nearly absolute correlation between single-point measures.
Deciding who is really inbred is a methodological challenge where multi-point approaches can be very helpful once the set of SNP markers is filtered to remove linkage disequilibrium. The use of several different methodological approaches and hence different homozygosity measures can help to distinguish between homozygosity-by-state and homozygosity-by-descent in studies investigating the effects of genomic autozygosity on human health.
We set out to identify common genetic determinants of the length of RR and QT intervals in 2,325 individuals from isolated European populations.
Methods and Results
We analyzed heart rate at rest, measured as RR interval, and length of corrected QT interval for association with 318,237 SNPs. RR interval was associated with common variants within GPR133, a G-protein-coupled receptor (rs885389, P = 3.9 × 10−8). QT interval was associated with the previously reported NOS1AP gene (rs2880058, P = 2.00 × 10−10) and with a region on chromosome 13 (rs2478333, P = 4.34 × 10−8), which is 100 kb from the closest known transcript LOC730174 and has not previously been associated with length of QT interval.
Our results suggested association between RR interval and GPR133 and confirmed association between QT interval and NOS1AP.
genetics; heart rate; population
Genes for height have gained interest for decades, but only recently have candidate genes started to be identified. We have performed linkage analysis and genome-wide association for height in approximately 4000 individuals from five European populations. A total of five chromosomal regions showed suggestive linkage and in one of these regions, two SNPs (rs849140 and rs1635852) were associated with height (nominal P = 7.0 × 10−8 and P = 9.6 × 10−7, respectively). In total, five SNPs across the genome showed an association with height that reached the threshold of genome-wide significance (nominal P < 1.6 × 10−7). The association with height was replicated for two SNPs (rs1635852 and rs849140) using three independent studies (n = 31 077, n = 1268 and n = 5746) with overall meta P-values of 9.4 × 10−10 and 5.3 × 10−8. These SNPs are located in the JAZF1 gene, which has recently been associated with type II diabetes, prostate and endometrial cancer. JAZF1 is a transcriptional repressor of NR2C2, which, when knocked out in mice, results in low IGF1 serum concentrations, perinatal and early postnatal hypoglycemia and growth retardation. Both the linkage and association analyses independently identified the JAZF1 region as affecting human height. We have demonstrated, through replication in additional independent populations, the consistency of the effect of the JAZF1 SNPs on height. Since this gene also has a key function in the metabolism of growth, JAZF1 represents one of the strongest candidates influencing human height identified so far.
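The genome-wide significance threshold quoted above is consistent with a Bonferroni correction, sketched below. The abstract does not state how the threshold was derived, so both the correction method and the number of tests are assumptions: the SNP count of 318,237 is borrowed from other abstracts in this collection that used the same genotyping array, and the function name is hypothetical.

```python
def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Per-test significance threshold that keeps the family-wise error at alpha."""
    if n_tests <= 0:
        raise ValueError("n_tests must be positive")
    return alpha / n_tests


# Assumed inputs: family-wise alpha of 0.05 and 318,237 tested SNPs.
threshold = bonferroni_threshold(0.05, 318_237)

# Nominal P values reported for the two SNPs in the linked region.
passes_rs849140 = 7.0e-8 < threshold
passes_rs1635852 = 9.6e-7 < threshold
```

Under these assumptions the threshold rounds to 1.6 × 10−7, so the nominal P value of 7.0 × 10−8 falls below it while 9.6 × 10−7 does not.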
Genome-wide association studies (GWAS) have identified 38 larger genetic regions affecting classical blood lipid levels without adjusting for important environmental influences. We modeled diet and physical activity in a GWAS in order to identify novel loci affecting total cholesterol, LDL cholesterol, HDL cholesterol, and triglyceride levels. The Swedish (SE) EUROSPAN cohort (NSE = 656) was screened for candidate genes and the non-Swedish (NS) EUROSPAN cohorts (NNS = 3,282) were used for replication. In total, 3 SNPs were associated in the Swedish sample and were replicated in the non-Swedish cohorts. While SNP rs1532624 was a replication of the previously published association between CETP and HDL cholesterol, the other two were novel findings. For the latter SNPs, the p-value for association was substantially improved by inclusion of environmental covariates: SNP rs5400 (pSE,unadjusted = 3.6×10−5, pSE,adjusted = 2.2×10−6, pNS,unadjusted = 0.047) in the SLC2A2 (Glucose transporter type 2) and rs2000999 (pSE,unadjusted = 1.1×10−3, pSE,adjusted = 3.8×10−4, pNS,unadjusted = 0.035) in the HP gene (Haptoglobin-related protein precursor). Both showed evidence of association with total cholesterol. These results demonstrate that inclusion of important environmental factors in the analysis model can reveal new genetic susceptibility loci.
In this article we report a genome-wide association study on cholesterol levels in the human blood. We used a Swedish cohort to select genetic polymorphisms that showed the strongest association with cholesterol levels adjusted for diet and physical activity. We replicated several genetic loci in other European cohorts. This approach extends present genome-wide association studies on lipid levels, which did not take these lifestyle factors into account, to improve statistical results and discover novel genes. In our analysis, we could identify two genetic loci in the SLC2A2 (Glucose transporter type 2) and the HP (Haptoglobin-related protein precursor) gene whose effects on total cholesterol have not been reported yet. The results show that inclusion of important environmental factors in the analysis model can reveal new insights into genetic determinants of clinical parameters relevant for metabolic and cardiovascular disease.
Cervical cancer is one of the most important cancers in African women. Polymorphisms in the Fas (FasR) and Fas ligand (FasL) genes have been reported to be associated with cervical cancer in certain populations. This study investigated whether these polymorphisms are associated with cervical cancer or human papillomavirus (HPV) infection in South African women.
Participants were 447 women with invasive cervical cancer (106 black African and 341 women of mixed-ancestry) and 424 healthy women controls (101 black African and 323 women of mixed-ancestry), matched by age and domicile (rural or urban). Two polymorphisms in the Fas gene (FasR-1377G/A, FasR-670A/G) and one in the FasL gene (FasL-844T/C) were genotyped by TaqMan. None of the polymorphisms, or the Fas haplotypes, showed a significant association with cervical cancer. There was also no association with HPV infection in the control group. However, on analysis of the control group, highly significant allele, genotype and haplotype differences were found between the two ethnic groups. There were generally low frequencies of FasR-1377A alleles, FasR-670A alleles and FasL-844C alleles in black women compared to the women of mixed-ancestry.
This is the first study on the role of Fas and FasL polymorphisms in cervical cancer in African populations. Our results suggest that these SNPs are not associated with cervical cancer in these populations. The allele frequencies of the three SNPs differed markedly between the indigenous African black and mixed-ancestry populations.
Sphingolipids have essential roles as structural components of cell membranes and in cell signalling, and disruption of their metabolism causes several diseases, with diverse neurological, psychiatric, and metabolic consequences. Increasingly, variants within a few of the genes that encode enzymes involved in sphingolipid metabolism are being associated with complex disease phenotypes. Direct experimental evidence supports a role of specific sphingolipid species in several common complex chronic disease processes including atherosclerotic plaque formation, myocardial infarction (MI), cardiomyopathy, pancreatic β-cell failure, insulin resistance, and type 2 diabetes mellitus. Therefore, sphingolipids represent novel and important intermediate phenotypes for genetic analysis, yet little is known about the major genetic variants that influence their circulating levels in the general population. We performed a genome-wide association study (GWAS) between 318,237 single-nucleotide polymorphisms (SNPs) and levels of circulating sphingomyelin (SM), dihydrosphingomyelin (Dih-SM), ceramide (Cer), and glucosylceramide (GluCer) single lipid species (33 traits); and 43 matched metabolite ratios measured in 4,400 subjects from five diverse European populations. Associated variants (32) in five genomic regions were identified with genome-wide significant corrected p-values ranging down to 9.08×10−66. The strongest associations were observed in or near 7 genes functionally involved in ceramide biosynthesis and trafficking: SPTLC3, LASS4, SGPP1, ATP10D, and FADS1–3. Variants in 3 loci (ATP10D, FADS3, and SPTLC3) associate with MI in a series of three German MI studies. An additional 70 variants across 23 candidate genes involved in sphingolipid-metabolizing pathways also demonstrate association (p = 10−4 or less). 
Circulating concentrations of several key components in sphingolipid metabolism are thus under strong genetic control, and variants in these loci can be tested for a role in the development of common cardiovascular, metabolic, neurological, and psychiatric diseases.
Although several rare monogenic diseases are caused by defects in enzymes involved in sphingolipid biosynthesis and metabolism, little is known about the major variants that control the circulating levels of these important bioactive molecules. As well as being essential components of plasma membranes and endosomes, sphingolipids play critical roles in cell surface protection, protein and lipid transport and sorting, and cellular signalling cascades. Experimental evidence supports a role for sphingolipids in several common complex chronic metabolic, cardiovascular, or neurological disease processes. Therefore, sphingolipids represent novel and important intermediate phenotypes for genetic analysis, and discovering the genetic variants that influence their circulating concentrations is an important step towards understanding how the genetic control of sphingolipids might contribute to common human disease. We have identified 32 variants in 7 genes that have a strong effect on the circulating plasma levels of 33 distinct sphingolipids, and 43 matched metabolite ratios. In a series of 3 German MI studies, we see association with MI for variants in 3 of the genes tested. Further cardiovascular, metabolic, neurological, and psychiatric disease associations can be tested with the variants described here, which may identify additional disease risk and potentially useful therapeutic targets.
Recent genome-wide association (GWA) studies of lipids have been conducted in samples ascertained for other phenotypes, particularly diabetes. Here we report the first GWA analysis of loci affecting total cholesterol (TC), low-density lipoprotein (LDL) cholesterol, high-density lipoprotein (HDL) cholesterol and triglycerides sampled randomly from 16 population-based cohorts and genotyped using mainly the Illumina HumanHap300-Duo platform. Our study included a total of 17,797–22,562 persons, aged 18–104 years and from geographic regions spanning from the Nordic countries to Southern Europe. We established 22 loci associated with serum lipid levels at a genome-wide significance level (P < 5 × 10−8), including 16 loci that were identified by previous GWA studies. The six newly identified loci in our cohort samples are ABCG5 (TC, P = 1.5 × 10−11; LDL, P = 2.6 × 10−10), TMEM57 (TC, P = 5.4 × 10−10), CTCF-PRMT8 region (HDL, P = 8.3 × 10−16), DNAH11 (LDL, P = 6.1 × 10−9), FADS3-FADS2 (TC, P = 1.5 × 10−10; LDL, P = 4.4 × 10−13) and MADD-FOLH1 region (HDL, P = 6 × 10−11). For three loci, effect sizes differed significantly by sex. Genetic risk scores based on lipid loci explain up to 4.8% of variation in lipids and were also associated with increased intima media thickness (P = 0.001) and coronary heart disease incidence (P = 0.04). The genetic risk score improves the screening of high-risk groups of dyslipidemia over classical risk factors.
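A genetic risk score of the kind mentioned above is commonly built as a weighted sum of risk-allele counts across the associated loci. The sketch below shows that general construction only; whether this study weighted alleles by effect size is not stated in the abstract, and the function name, weights, and genotypes are hypothetical values chosen for illustration.

```python
from typing import Sequence


def genetic_risk_score(
    risk_allele_counts: Sequence[int], weights: Sequence[float]
) -> float:
    """Weighted sum of risk-allele counts (0, 1 or 2 per locus).

    Setting every weight to 1.0 gives the unweighted allele-count score.
    """
    if len(risk_allele_counts) != len(weights):
        raise ValueError("one weight is needed per locus")
    return sum(w * c for w, c in zip(weights, risk_allele_counts))


# Hypothetical individual genotyped at three loci with made-up per-allele weights.
example_score = genetic_risk_score([0, 1, 2], [0.1, 0.2, 0.3])
```

With effect-size weights, loci with larger per-allele effects contribute more to the score; the unweighted version simply counts risk alleles.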
Central abdominal fat is a strong risk factor for diabetes and cardiovascular disease. To identify common variants influencing central abdominal fat, we conducted a two-stage genome-wide association analysis for waist circumference (WC). In total, three loci reached genome-wide significance. In stage 1, 31,373 individuals of Caucasian descent from eight cohort studies confirmed the role of FTO and MC4R and identified one novel locus associated with WC in the neurexin 3 gene [NRXN3 (rs10146997, p = 6.4×10−7)]. The association with NRXN3 was confirmed in stage 2 by combining stage 1 results with those from 38,641 participants in the GIANT consortium (p = 0.009 in GIANT only, p = 5.3×10−8 for combined analysis, n = 70,014). Mean WC increase per copy of the G allele was 0.0498 z-score units (0.65 cm). This SNP was also associated with body mass index (BMI) [p = 7.4×10−6, 0.024 z-score units (0.10 kg/m2) per copy of the G allele] and the risk of obesity (odds ratio 1.13, 95% CI 1.07–1.19; p = 3.2×10−5 per copy of the G allele). The NRXN3 gene has been previously implicated in addiction and reward behavior, lending further evidence that common forms of obesity may be a central nervous system-mediated disorder. Our findings establish that common variants in NRXN3 are associated with WC, BMI, and obesity.
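The paired effect sizes above (z-score units alongside raw units) imply the standard deviation used for standardization. The sketch below back-calculates it, assuming the z-score is a simple division by a single pooled standard deviation; the actual analysis may have standardized within cohort or sex, and the function name is hypothetical.

```python
def implied_sd(effect_raw_units: float, effect_z_units: float) -> float:
    """Standard deviation implied by one effect reported in raw and z-score units."""
    if effect_z_units == 0:
        raise ValueError("z-score effect must be non-zero")
    return effect_raw_units / effect_z_units


# Waist circumference: 0.65 cm corresponds to 0.0498 z-score units per G allele.
wc_sd_cm = implied_sd(0.65, 0.0498)
# Body mass index: 0.10 kg/m2 corresponds to 0.024 z-score units per G allele.
bmi_sd = implied_sd(0.10, 0.024)
```

Under this assumption the reported figures correspond to a standard deviation of roughly 13 cm for waist circumference and roughly 4.2 kg/m2 for BMI.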
Obesity is a major health concern worldwide. In the past two years, genome-wide association studies of DNA markers known as SNPs (single nucleotide polymorphisms) have identified two novel genetic factors that may help scientists better understand why some people may be more susceptible to obesity. Similarly, this paper describes results from a large scale genome-wide association analysis for obesity susceptibility genes that includes 31,373 individuals from 8 separate studies. We uncovered a new gene influencing waist circumference, the neurexin 3 gene (NRXN3), which has been previously implicated in studies of addiction and reward behavior. These findings lend further evidence that our genes may influence our desire and consumption of food and, in turn, our susceptibility to obesity.