It is well established that autism spectrum disorders (ASD) have a strong genetic component. However, for at least 70% of cases, the underlying genetic cause is unknown1. Under the hypothesis that de novo mutations underlie a substantial fraction of the risk for developing ASD in families with no previous history of ASD or related phenotypes (so-called sporadic or simplex families2,3), we sequenced all coding regions of the genome, i.e. the exome, for parent-child trios exhibiting sporadic ASD, including 189 new trios and 20 previously reported4. We also sequenced the exomes of 50 unaffected siblings corresponding to these new (n = 31) and previously reported trios (n = 19)4, for a total of 677 individual exomes from 209 families. Here we show that de novo point mutations are overwhelmingly paternal in origin (4:1 bias) and positively correlated with paternal age, consistent with the modest increased risk for children of older fathers to develop ASD5. Moreover, 39% (49/126) of the most severe or disruptive de novo mutations map to a highly interconnected beta-catenin/chromatin remodeling protein network ranked significantly for autism candidate genes. In proband exomes, recurrent protein-altering mutations were observed in two genes, CHD8 and NTNG1. Mutation screening of six candidate genes in 1,703 ASD probands identified additional de novo, protein-altering mutations in GRIN2B, LAMC3, and SCN1A. Combined with copy number variant (CNV) data, these results suggest extreme locus heterogeneity but also provide a target for future discovery, diagnostics, and therapeutics.
Adverse neurodevelopmental sequelae are reported among children who undergo early cardiac surgery to repair congenital heart defects (CHD). APOE genotype has previously been determined to contribute to the prediction of these outcomes. Understanding further genetic causes for the development of poor neurobehavioral outcomes should enhance patient risk stratification and improve both prevention and treatment strategies.
We performed a prospective observational study of children who underwent cardiac surgery before six months of age; this included a neurodevelopmental evaluation between their fourth and fifth birthdays. Attention and behavioral skills were assessed through parental report utilizing the Attention Deficit-Hyperactivity Disorder-IV scale preschool edition (ADHD-IV), and Child Behavior Checklist (CBCL/1.5-5), respectively. Of the seven investigated, three neurodevelopmental phenotypes met genomic quality control criteria. Linear regression was performed to determine the effect of genome-wide genetic variation on these three neurodevelopmental measures in 316 subjects.
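As a rough illustration of the per-SNP regression described above, the sketch below fits a least-squares line of a phenotype score on allele dosage. The additive 0/1/2 genotype coding, the function name `additive_fit`, and the toy data are assumptions made for illustration; the abstract does not state the genetic model or any covariates used.

```python
from statistics import mean


def additive_fit(dosages, phenotype):
    """Least-squares slope and intercept of a phenotype on allele dosage.

    dosages: copies of the tested allele per subject (0, 1 or 2); this
    additive coding is an assumption, not taken from the abstract.
    """
    if len(dosages) != len(phenotype):
        raise ValueError("dosages and phenotype must have the same length")
    x_bar = mean(dosages)
    y_bar = mean(phenotype)
    sxx = sum((x - x_bar) ** 2 for x in dosages)
    if sxx == 0:
        raise ValueError("SNP is monomorphic in this sample")
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(dosages, phenotype))
    slope = sxy / sxx               # change in score per extra allele copy
    intercept = y_bar - slope * x_bar
    return slope, intercept


# Hypothetical toy data: six subjects, two per genotype class.
toy_dosages = [0, 0, 1, 1, 2, 2]
toy_scores = [10.0, 12.0, 13.0, 15.0, 16.0, 18.0]
slope, intercept = additive_fit(toy_dosages, toy_scores)
print(f"per-allele effect: {slope:.2f}, intercept: {intercept:.2f}")
```

In a genome-wide scan the same fit would be repeated for every SNP, typically with covariates and a test statistic for the slope, which this sketch leaves out.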
This genome-wide association study identified single nucleotide polymorphisms (SNPs) associated with three neurobehavioral phenotypes in the postoperative children: ADHD-IV Impulsivity/Hyperactivity, CBCL/1.5-5 PDPs, and CBCL/1.5-5 Total Problems. The most predictive SNPs for each phenotype were: an LGALS8 intronic SNP, rs4659682, associated with ADHD-IV Impulsivity (P = 1.03×10^−6); a PCSK5 intronic SNP, rs2261722, associated with CBCL/1.5-5 PDPs (P = 1.11×10^−6); and an intergenic SNP, rs11617488, 50 kb from FGF9, associated with CBCL/1.5-5 Total Problems (P = 3.47×10^−7). Ten SNPs (3 for ADHD-IV Impulsivity, 5 for CBCL/1.5-5 PDPs, and 2 for CBCL/1.5-5 Total Problems) had P < 10^−5.
No SNPs met genome-wide significance for our three neurobehavioral phenotypes; however, 10 SNPs reached a threshold for suggestive significance (P < 10^−5). Given the unique nature of this cohort, larger studies and/or replication are not possible. Studies to further investigate the mechanisms through which these newly identified genes may influence neurodevelopmental dysfunction are warranted.
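The distinction between genome-wide and suggestive significance can be made concrete with a small sketch that classifies the three top P values reported above. The suggestive threshold (10^-5) is the one stated in the abstract; the genome-wide threshold of 5×10^-8 is the conventional value and is an assumption here, since the abstract does not give the cutoff it applied.

```python
SUGGESTIVE = 1e-5     # threshold stated in the abstract
GENOME_WIDE = 5e-8    # conventional cutoff; assumed, not stated in the abstract


def classify(p_value):
    """Label a P value against the two thresholds."""
    if p_value < GENOME_WIDE:
        return "genome-wide significant"
    if p_value < SUGGESTIVE:
        return "suggestive"
    return "not significant"


# Top SNP per phenotype, P values as reported in the abstract.
top_snps = {
    "rs4659682": 1.03e-6,
    "rs2261722": 1.11e-6,
    "rs11617488": 3.47e-7,
}
labels = {snp: classify(p) for snp, p in top_snps.items()}
for snp, label in labels.items():
    print(snp, label)
```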
Polymorphisms within the ICAM1 structural gene have been shown to influence circulating levels of soluble intercellular adhesion molecule-1 (sICAM-1), but their relation to atherosclerosis has not been clearly established. We sought to determine whether ICAM1 SNPs are associated with circulating sICAM-1 concentration, coronary artery calcium (CAC), and common and internal carotid intima medial thickness (IMT).
Methods and Results
A total of 3,550 black and white Coronary Artery Risk Development in Young Adults (CARDIA) Study subjects who participated in the year 15 and/or 20 examinations and were part of the Young Adult Longitudinal Study of Antioxidants (YALTA) ancillary study were included in this analysis. In whites, rs5498 was significantly associated with sICAM-1 (p < 0.001) and each G-allele of rs5498 was associated with 5% higher sICAM-1 concentration. In blacks, each C-allele of rs5490 was associated with 6% higher sICAM-1 level; this SNP was in strong linkage disequilibrium with rs5491, a functional variant. Subclinical measurements of atherosclerosis in either year 15 or year 20 were not significantly related to ICAM1 SNPs.
In CARDIA, ICAM1 DNA segment variants were associated with sICAM-1 protein level including the novel finding that levels differ by the functional variant rs5491. However, ICAM1 SNPs were not strongly related to either IMT or CAC. Our findings in CARDIA suggest that ICAM1 variants are not major early contributors to subclinical atherosclerosis.
cell adhesion molecules; atherosclerosis; coronary calcium; genetics; inflammation
Kabuki syndrome is a rare, multiple malformation disorder characterized by a distinctive facial appearance, cardiac anomalies, skeletal abnormalities, and mild to moderate intellectual disability. Simplex cases make up the vast majority of the reported cases with Kabuki syndrome, but parent-to-child transmission in more than a half-dozen instances indicates that it is an autosomal dominant disorder. We recently reported that Kabuki syndrome is caused by mutations in MLL2, a gene that encodes a Trithorax-group histone methyltransferase, a protein important in the epigenetic control of active chromatin states. Here, we report on the screening of 110 families with Kabuki syndrome. MLL2 mutations were found in 81/110 (74%) of families. In simplex cases for which DNA was available from both parents, 25 mutations were confirmed to be de novo, while a transmitted MLL2 mutation was found in two of three familial cases. The majority of variants found to cause Kabuki syndrome were novel nonsense or frameshift mutations that are predicted to result in haploinsufficiency. The clinical characteristics of MLL2 mutation-positive cases did not differ significantly from MLL2 mutation-negative cases with the exception that renal anomalies were more common in MLL2 mutation-positive cases. These results are important for understanding the phenotypic consequences of MLL2 mutations for individuals and their families as well as for providing a basis for the identification of additional genes for Kabuki syndrome.
Kabuki syndrome; MLL2; ALR; Trithorax group histone methyltransferase
Structural variations in the chromosome 22q11.2 region mediated by non-allelic homologous recombination result in 22q11.2 deletion (del22q11.2) and 22q11.2 duplication (dup22q11.2) syndromes. The majority of del22q11.2 cases have facial and cardiac malformations, immunologic impairments, a specific cognitive profile, and an increased risk for schizophrenia and autism spectrum disorders. The phenotype of dup22q11.2 is frequently without physical features but includes the spectrum of neurocognitive abnormalities. Although there is substantial evidence that haploinsufficiency for TBX1 plays a role in the physical features of del22q11.2, it is not known which gene(s) in the critical 1.5 Mb region are responsible for the observed spectrum of behavioral phenotypes. We identified an individual with a balanced translocation 46,XY,t(1;22)(p36.1;q11.2) and a behavioral phenotype characterized by cognitive impairment, autism, and schizophrenia in the absence of congenital malformations. Using somatic cell hybrids and comparative genomic hybridization, we mapped the chromosome 22 breakpoint within intron 7 of the GNB1L gene. Copy number evaluations and direct DNA sequencing of GNB1L in 271 schizophrenia and 513 autism cases revealed dup22q11.2 in two families with autism and private GNB1L missense variants in conserved residues in three families (p=0.036). The identified missense variants affect residues in the WD40 repeat domains and are predicted to have deleterious effects on the protein. Prior studies provided evidence that GNB1L may have a role in schizophrenia. Our findings support involvement of GNB1L in autism spectrum disorders as well.
22q11.2; translocation; neurodevelopmental disorders
Evidence for the etiology of autism spectrum disorders (ASD) has consistently pointed to a strong genetic component complicated by substantial locus heterogeneity1,2. We sequenced the exomes of 20 sporadic cases of ASD and their parents, reasoning that these families would be enriched for de novo mutations of major effect. We identified 21 de novo mutations, of which 11 were protein-altering. Protein-altering mutations were significantly enriched for changes at highly conserved residues. We identified potentially causative de novo events in 4/20 probands, particularly among more severely affected individuals, in FOXP1, GRIN2B, SCN1A, and LAMC3. In the FOXP1 mutation carrier, we also observed a rare inherited CNTNAP2 mutation and provide functional support for a multihit model for disease risk3. Our results demonstrate that trio-based exome sequencing is a powerful approach for identifying novel candidate genes for ASD and suggest that de novo mutations may contribute substantially to the genetic risk for ASD.
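The core idea behind trio-based de novo discovery is that a candidate variant is seen in the child but in neither parent. A minimal sketch of that set logic follows; the tuple representation of a variant, the function name `de_novo_candidates`, and the toy calls are all hypothetical, and the sketch omits the coverage and genotype-quality checks that a real analysis would need before trusting a parental absence.

```python
def de_novo_candidates(child, mother, father):
    """Return variants called in the child but in neither parent.

    Each variant is a (chromosome, position, ref, alt) tuple; this
    representation is an assumption made for illustration.
    """
    return sorted(set(child) - set(mother) - set(father))


# Hypothetical toy calls for one trio.
child_calls = [("chr1", 100, "A", "G"), ("chr2", 200, "C", "T"), ("chr3", 300, "G", "A")]
mother_calls = [("chr1", 100, "A", "G")]
father_calls = [("chr3", 300, "G", "A")]

candidates = de_novo_candidates(child_calls, mother_calls, father_calls)
print(candidates)
```

Candidates produced this way would still need validation, since an apparent de novo call can also arise from a missed parental genotype or a sequencing error.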
Massively parallel sequencing has enabled the rapid, systematic identification of variants on a large scale. This has, in turn, accelerated the pace of gene discovery and disease diagnosis on a molecular level and has the potential to revolutionize methods, particularly for the analysis of Mendelian disease. Massively parallel sequencing has enabled investigators to interrogate variants both in the context of linkage intervals and on a genome-wide scale, in the absence of linkage information entirely. The primary challenge now is to distinguish between background polymorphisms and pathogenic mutations. Recently developed strategies for rare monogenic disorders have met with some early success. These strategies include filtering for potential causal variants based on frequency and function, and ranking variants based on conservation scores and predicted deleteriousness to protein structure. Here, we review the recent literature on the use of high-throughput sequence data and its analysis in the discovery of causal mutations for rare disorders.
Complement factor H shows very strong association with Age-related Macular Degeneration (AMD), and recent data suggest that multiple causal variants are associated with disease. To refine the location of the disease-associated variants, we characterized in detail the structural variation at CFH and its paralogs, including two copy number polymorphisms (CNP), CNP147 and CNP148, and several rare deletions and duplications. Examination of 34 AMD-enriched extended families (N = 293) and AMD cases (White N = 4210; Indian = 134; Malay = 140) and controls (White N = 3229; Indian = 117; Malay = 2390) demonstrated that deletion CNP148 was protective against AMD, independent of SNPs at CFH. Regression analysis of seven common haplotypes showed three haplotypes, H1, H6 and H7, as conferring risk for AMD development. Being the most common haplotype, H1 confers the greatest risk by increasing the odds of AMD by 2.75-fold (95% CI = [2.51, 3.01]; p = 8.31×10^−109); Caucasian (H6) and Indian-specific (H7) recombinant haplotypes increase the odds of AMD by 1.85-fold (p = 3.52×10^−9) and by 15.57-fold (p = 0.007), respectively. We identified a 32 kb region downstream of Y402H (rs1061170), shared by all three risk haplotypes, suggesting that this region may be critical for AMD development. Further analysis showed that two SNPs within the 32 kb block, rs1329428 and rs203687, optimally explain disease association. rs1329428 resides in a 20 kb unique sequence block, but rs203687 resides in a 12 kb block that is 89% similar to a noncoding region contained in ΔCNP148. We conclude that causal variation in this region potentially encompasses both regulatory effects at single markers and copy number.
Self-identified race or ethnic group is used to determine normal reference standards in the prediction of pulmonary function. We conducted a study to determine whether the genetically determined percentage of African ancestry is associated with lung function and whether its use could improve predictions of lung function among persons who identified themselves as African American.
We assessed the ancestry of 777 participants self-identified as African American in the Coronary Artery Risk Development in Young Adults (CARDIA) study and evaluated the relation between pulmonary function and ancestry by means of linear regression. We performed similar analyses of data for two independent cohorts of subjects identifying themselves as African American: 813 participants in the Health, Aging, and Body Composition (HABC) study and 579 participants in the Cardiovascular Health Study (CHS). We compared the fit of two types of models to lung-function measurements: models based on the covariates used in standard prediction equations and models incorporating ancestry. We also evaluated the effect of the ancestry-based models on the classification of disease severity in two asthma-study populations.
African ancestry was inversely related to forced expiratory volume in 1 second (FEV1) and forced vital capacity in the CARDIA cohort. These relations were also seen in the HABC and CHS cohorts. In predicting lung function, the ancestry-based model fit the data better than standard models. Ancestry-based models resulted in the reclassification of asthma severity (based on the percentage of the predicted FEV1) in 4 to 5% of participants.
Current predictive equations, which rely on self-identified race alone, may misestimate lung function among subjects who identify themselves as African American. Incorporating ancestry into normative equations may improve lung-function estimates and more accurately categorize disease severity. (Funded by the National Institutes of Health and others.)
Although statins are efficacious for lowering LDL-cholesterol (LDLC), there is wide inter-individual variation in response. We tested the extent to which combined effects of common alleles of LDLR and HMGCR can contribute to this variability.
Methods and Results
Haplotypes in the LDLR 3′-untranslated region (3UTR) were tested for association with lipid-lowering response to simvastatin treatment in the Cholesterol and Pharmacogenetics (CAP) trial (335 African-Americans and 609 European-Americans). LDLR haplotype 5 (L5) was associated with smaller simvastatin-induced reductions in LDLC, total cholesterol, non-HDL cholesterol, and apolipoprotein B (P=0.0002–0.03) in African-Americans, but not European-Americans. The combined presence of L5 and previously described HMGCR haplotypes in African-Americans was associated with significantly attenuated apoB reduction (−22.4±1.5%, N=89) both compared to noncarriers (−30.6±1.5%, N=78, P=0.0001) and to carriers of either individual haplotype (−28.2±1.1%, N=158, P=0.001). We observed similar differences when measuring simvastatin-mediated induction of LDLR surface expression using lymphoblast cell lines (P=0.03).
We have identified a common LDLR 3UTR haplotype that is associated with attenuated lipid-lowering response to simvastatin treatment. Response was further reduced in individuals with both LDLR and previously described HMGCR haplotypes. Previously identified racial differences in statin efficacy were partially explained by increased prevalence of these combined haplotypes in African-Americans.
LDLR; HMGCR; statin; LDL-cholesterol; pharmacogenomics
While whole genome resequencing remains expensive, genomic partitioning provides an affordable means of targeting sequence efforts towards regions of high interest. There are several competitive methods for targeted capture; these include molecular inversion probes, microdroplet-segregated multiplex PCR, and on-array or in-solution capture-by-hybridization. Enrichment of the human exome by array hybridization has been successfully applied to pinpoint the causative allele of Mendelian disorders. This protocol focuses on the application of Agilent 1M arrays for capture-by-hybridization and sequencing on the Illumina platform, though the library preparation method may be adaptable to other vendors’ array platforms and sequencing technologies.
Resequencing; exome; hybridization; targeted enrichment
The discovery of expression quantitative trait loci (“eQTLs”) can help to unravel genetic contributions to complex traits. We identified genetic determinants of human liver gene expression variation using two independent collections of primary tissue profiled with Agilent (n = 206) and Illumina (n = 60) expression arrays and Illumina SNP genotyping (550K), and we also incorporated data from a published study (n = 266). We found that ∼30% of SNP-expression correlations in one study failed to replicate in either of the others, even at thresholds yielding high reproducibility in simulations, and we quantified numerous factors affecting reproducibility. Our data suggest that drug exposure, clinical descriptors, and unknown factors associated with tissue ascertainment and analysis have substantial effects on gene expression and that controlling for hidden confounding variables significantly increases replication rate. Furthermore, we found that reproducible eQTL SNPs were heavily enriched near gene starts and ends, and subsequently resequenced the promoters and 3′UTRs for 14 genes and tested the identified haplotypes using luciferase assays. For three genes, significant haplotype-specific in vitro functional differences correlated directly with expression levels, suggesting that many bona fide eQTLs result from functional variants that can be mechanistically isolated in a high-throughput fashion. Finally, given our study design, we were able to discover and validate hundreds of liver eQTLs. Many of these relate directly to complex traits for which liver-specific analyses are likely to be relevant, and we identified dozens of potential connections with disease-associated loci. These included previously characterized eQTL contributors to diabetes, drug response, and lipid levels, and they suggest novel candidates such as a role for NOD2 expression in leprosy risk and C2orf43 in prostate cancer. In general, the work presented here will be valuable for future efforts to precisely identify and functionally characterize genetic contributions to a variety of complex traits.
Many disease-associated genetic variants do not alter protein sequences and are difficult to precisely identify. Discovery of expression quantitative trait loci (eQTL), or correlations between genetic variants and gene expression levels, offers one means of addressing this challenge. However, eQTL studies in primary cells have several shortcomings. In particular, their reproducibility is largely unknown, the variables that generate unreliable associations are uncharacterized, and the resolution of their findings is constrained by linkage disequilibrium. We performed a three-way replication study of eQTLs in primary human livers. We demonstrated that ∼67% of cis-eQTL associations are replicated in an independent study and that known polymorphisms overlapping expression probes, SNP-to-gene distance, and unmeasured confounding variables all influence the replication rate. We fine-mapped 14 eQTLs and identified causative polymorphisms in the promoter or 3′UTR for 3 genes, suggesting that a considerable fraction of eQTLs are driven by proximal variants that are amenable to functional isolation. Finally, we found hundreds of overlaps between SNPs associated with complex traits and replicated eQTL SNPs. Our data provide both cautionary (i.e. non-reproducibility of many strong eQTLs) and optimistic (i.e. precise identification of functional non-coding variants) forecasts for future eQTL analyses and the complex traits that they influence.
We demonstrate the successful application of exome sequencing1–3 to discover a gene for an autosomal dominant disorder, Kabuki syndrome (OMIM %147920). The exomes of ten unrelated probands were subjected to massively parallel sequencing. After filtering against SNP databases, there was no compelling candidate gene containing novel variants in all affected individuals. Less stringent filtering criteria permitted modest genetic heterogeneity or missing data, but identified multiple candidate genes. However, genotypic and phenotypic stratification highlighted MLL2, a Trithorax-group histone methyltransferase4, in which seven probands had novel nonsense or frameshift mutations. Follow-up Sanger sequencing detected MLL2 mutations in two of the three remaining cases, and in 26 of 43 additional cases. In families where parental DNA was available, the mutation was confirmed to be de novo (n = 12) or transmitted (n = 2) in concordance with phenotype. Our results strongly suggest that mutations in MLL2 are a major cause of Kabuki syndrome.
The distribution of lipoprotein(a) [Lp(a)] levels can differ dramatically across diverse racial/ethnic populations. The extent to which genetic variation in LPA can explain these differences is not fully understood. To explore this, 19 LPA tagSNPs were genotyped in 7,159 participants from the Third National Health and Nutrition Examination Survey (NHANES III). NHANES III is a diverse population-based survey with DNA samples linked to hundreds of quantitative traits, including serum Lp(a). Tests of association between LPA variants and transformed Lp(a) levels were performed across the three different NHANES subpopulations (non-Hispanic whites, non-Hispanic blacks, and Mexican Americans). At a significance threshold of p < 0.0001, 15 of the 19 SNPs tested were strongly associated with Lp(a) levels in at least one subpopulation, six in at least two subpopulations, and none in all three subpopulations. In non-Hispanic whites, three variants were associated with Lp(a) levels, including previously known rs6919246 (p = 1.18×10^−30). Additionally, 12 and 6 variants had significant associations in non-Hispanic blacks and Mexican Americans, respectively. The additive effects of these associated alleles explained up to 11% of the variance observed for Lp(a) levels in the different racial/ethnic populations. The findings reported here replicate previous candidate gene and genome-wide association studies for Lp(a) levels in European-descent populations and extend these findings to other populations. While we demonstrate that LPA is an important contributor to Lp(a) levels regardless of race/ethnicity, the lack of generalization of associations across all subpopulations suggests that specific LPA variants may be contributing to the observed Lp(a) between-population variance.
Signatures of natural selection occur throughout the human genome and can be detected at the sequence level. We have re-sequenced ABCE1, a host candidate gene essential for HIV-1 capsid assembly, in European- (n=23) and African-descent (Yoruban; n=24) reference populations for genetic variation discovery. We identified an excess of rare genetic variation in Yoruban samples, and the resulting Tajima’s D was low (−2.27). The trend of excess rare variation persisted in flanking candidate genes ANAPC10 and OTUD4, suggesting that this pattern of positive selection can be detected across the 184.5 kb examined on chromosome 4. Because of ABCE1’s role in HIV-1 replication, we re-sequenced the candidate gene in three small cohorts of HIV-1-infected or resistant individuals. We were able to confirm the excess of rare genetic variation among HIV-1 positive African-American individuals (n=53; Tajima’s D = −2.34). These results highlight the potential importance of ABCE1’s role in infectious diseases such as HIV-1.
ABCE1; African-Americans; single nucleotide polymorphisms; HIV-1
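The ABCE1 abstract above summarizes an excess of rare variation with Tajima's D, which compares the mean number of pairwise differences with the number of segregating sites scaled by a harmonic-sum constant. The sketch below implements the standard published formula for aligned sequences of equal length; the function name `tajimas_d` and the toy alignments are hypothetical, and nothing here reproduces the study's own data or software.

```python
from itertools import combinations
from math import sqrt


def tajimas_d(sequences):
    """Tajima's D for a list of aligned, equal-length sequences."""
    n = len(sequences)
    if n < 4:
        raise ValueError("need at least 4 sequences (variance is zero below that)")
    length = len(sequences[0])
    if any(len(s) != length for s in sequences):
        raise ValueError("sequences must be aligned to the same length")

    # S: number of segregating (polymorphic) sites.
    seg_sites = sum(1 for i in range(length) if len({s[i] for s in sequences}) > 1)
    if seg_sites == 0:
        raise ValueError("no segregating sites; D is undefined")

    # pi: mean number of differences over all pairs of sequences.
    pairs = list(combinations(sequences, 2))
    pi = sum(sum(a != b for a, b in zip(s, t)) for s, t in pairs) / len(pairs)

    a1 = sum(1 / i for i in range(1, n))
    a2 = sum(1 / i ** 2 for i in range(1, n))
    b1 = (n + 1) / (3 * (n - 1))
    b2 = 2 * (n * n + n + 3) / (9 * n * (n - 1))
    c1 = b1 - 1 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1 ** 2
    e1 = c1 / a1
    e2 = c2 / (a1 ** 2 + a2)
    return (pi - seg_sites / a1) / sqrt(e1 * seg_sites + e2 * seg_sites * (seg_sites - 1))


# Hypothetical toy alignments: only singletons versus evenly split sites.
rare_excess = ["TAAA", "ATAA", "AATA", "AAAT", "AAAA", "AAAA"]
balanced = ["TTTT", "TTTT", "TTTT", "AAAA", "AAAA", "AAAA"]
d_rare = tajimas_d(rare_excess)
d_balanced = tajimas_d(balanced)
print(f"singletons only: D = {d_rare:.2f}; balanced: D = {d_balanced:.2f}")
```

A sample dominated by singletons gives a negative D, the direction reported in the abstract, while sites at intermediate frequency push D positive.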
We demonstrate the first successful application of exome sequencing to discover the gene for a rare, Mendelian disorder of unknown cause, Miller syndrome (OMIM %263750). For four affected individuals in three independent kindreds, we captured and sequenced coding regions to a mean coverage of 40X, and sufficient depth to call variants at ~97% of each targeted exome. Filtering against public SNP databases and a small number of HapMap exomes for genes with two novel variants in each of the four cases identified a single candidate gene, DHODH, which encodes a key enzyme in the pyrimidine de novo biosynthesis pathway. Sanger sequencing confirmed the presence of DHODH mutations in three additional families with Miller syndrome. Exome sequencing of a small number of unrelated, affected individuals is a powerful, efficient strategy for identifying the genes underlying rare Mendelian disorders and will likely transform the genetic analysis of monogenic traits.
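The filtering strategy described above (keep genes that carry at least two novel variants in every affected individual) reduces to a per-case count followed by a set intersection. The sketch below shows that logic under stated assumptions: the data layout, the function name `shared_candidate_genes`, and the gene and variant identifiers are hypothetical placeholders, not the study's pipeline or data.

```python
def shared_candidate_genes(case_variants, known_variants, min_novel=2):
    """Genes with at least `min_novel` novel variants in every case.

    case_variants: dict mapping a case ID to a list of (gene, variant_id).
    known_variants: set of variant IDs already present in reference
    databases or control exomes, which are filtered out as not novel.
    """
    per_case_genes = []
    for variants in case_variants.values():
        counts = {}
        for gene, variant_id in set(variants):
            if variant_id not in known_variants:
                counts[gene] = counts.get(gene, 0) + 1
        per_case_genes.append({g for g, c in counts.items() if c >= min_novel})
    if not per_case_genes:
        return []
    return sorted(set.intersection(*per_case_genes))


# Hypothetical toy input: two cases, one known variant.
known = {"rs_known_1"}
cases = {
    "case1": [("GENE_A", "v1"), ("GENE_A", "v2"), ("GENE_B", "v3"), ("GENE_B", "rs_known_1")],
    "case2": [("GENE_A", "v4"), ("GENE_A", "v5"), ("GENE_B", "v6"), ("GENE_B", "v7")],
}
shared = shared_candidate_genes(cases, known)
print(shared)
```

Requiring two variants per case matches a recessive model; for a dominant disorder the `min_novel` parameter would be set to 1.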
Genome-wide association studies suggest that common genetic variants explain only a small fraction of heritable risk for common diseases, raising the question of whether rare variants account for a significant fraction of unexplained heritability1,2. While DNA sequencing costs have fallen dramatically3, they remain far from what is necessary for rare and novel variants to be routinely identified at a genome-wide scale in large cohorts. We have therefore sought to develop second-generation methods for targeted sequencing of all protein-coding regions (“exomes”), to reduce costs while enriching for discovery of highly penetrant variants. Here we report on the targeted capture and massively parallel sequencing of the exomes of twelve humans. These include eight HapMap individuals representing three populations4, and four unrelated individuals with a rare dominantly inherited disorder, Freeman-Sheldon syndrome (FSS)5. We demonstrate the sensitive and specific identification of rare and common variants in over 300 megabases (Mb) of coding sequence. Using FSS as a proof-of-concept, we show that candidate genes for monogenic disorders can be identified by exome sequencing of a small number of unrelated, affected individuals. This strategy may be extendable to diseases with more complex genetics through larger sample sizes and appropriate weighting of nonsynonymous variants by predicted functional impact.
Statins effectively lower total and plasma LDL-cholesterol, but the magnitude of decrease varies among individuals. To identify single nucleotide polymorphisms (SNPs) contributing to this variation, we performed a combined analysis of genome-wide association (GWA) results from three trials of statin efficacy.
Methods and Principal Findings
Bayesian and standard frequentist association analyses were performed on untreated and statin-mediated changes in LDL-cholesterol, total cholesterol, HDL-cholesterol, and triglyceride on a total of 3932 subjects using data from three studies: Cholesterol and Pharmacogenetics (40 mg/day simvastatin, 6 weeks), Pravastatin/Inflammation CRP Evaluation (40 mg/day pravastatin, 24 weeks), and Treating to New Targets (10 mg/day atorvastatin, 8 weeks). Genotype imputation was used to maximize genomic coverage and to combine information across studies. Phenotypes were normalized within each study to account for systematic differences among studies, and fixed-effects analyses of the combined sample were performed to detect consistent effects across studies. Two SNP associations were assessed as having posterior probability greater than 50%, indicating that they were more likely than not to be genuinely associated with statin-mediated lipid response. SNP rs8014194, located within the CLMN gene on chromosome 14, was strongly associated with statin-mediated change in total cholesterol with an 84% probability by Bayesian analysis, and a p-value exceeding conventional levels of genome-wide significance by frequentist analysis (P = 1.8×10^−8). This SNP was less significantly associated with change in LDL-cholesterol (posterior probability = 0.16, P = 4.0×10^−6). Bayesian analysis also assigned a 51% probability that rs4420638, located in APOC1 and near APOE, was associated with change in LDL-cholesterol.
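One common way to combine per-study effect estimates under a fixed-effects model is inverse-variance weighting, sketched below. This is an illustration only: the abstract does not say which fixed-effects estimator was used, so the weighting scheme, the function name `fixed_effects_meta`, and the toy effect sizes are all assumptions.

```python
from math import erfc, sqrt


def fixed_effects_meta(betas, standard_errors):
    """Inverse-variance weighted fixed-effects combination.

    Returns the pooled effect, its standard error, the z statistic and a
    two-sided P value from the normal approximation.
    """
    if len(betas) != len(standard_errors) or not betas:
        raise ValueError("need one standard error per effect estimate")
    weights = [1 / se ** 2 for se in standard_errors]
    total_weight = sum(weights)
    pooled = sum(w * b for w, b in zip(weights, betas)) / total_weight
    pooled_se = sqrt(1 / total_weight)
    z = pooled / pooled_se
    p_value = erfc(abs(z) / sqrt(2))  # equals 2 * (1 - Phi(|z|))
    return pooled, pooled_se, z, p_value


# Hypothetical per-study estimates for one SNP (not taken from the trials).
pooled, pooled_se, z, p_value = fixed_effects_meta([0.1, 0.2], [0.1, 0.1])
print(f"pooled effect {pooled:.3f} (SE {pooled_se:.3f}), z = {z:.2f}, P = {p_value:.3f}")
```

Studies with smaller standard errors dominate the pooled estimate, which is why normalizing phenotypes within each study first, as the abstract describes, matters for comparability.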
Conclusions and Significance
Using combined GWA analysis from three clinical trials involving nearly 4,000 individuals treated with simvastatin, pravastatin, or atorvastatin, we have identified SNPs that may be associated with variation in the magnitude of statin-mediated reduction in total and LDL-cholesterol, including one in the CLMN gene for which statistical evidence for association exceeds conventional levels of genome-wide significance.
PRINCE and TNT are not registered. CAP is registered at ClinicalTrials.gov (NCT00451828).
The transcription factor hepatocyte nuclear factor 1 (HNF-1) α regulates the activity of a number of genes involved in innate immunity, blood coagulation, lipid and glucose transport and metabolism, and cellular detoxification. Common polymorphisms of the HNF-1α gene (HNF1A) were recently associated with plasma C-reactive protein (CRP) and gamma-glutamyl transferase (GGT) concentration in middle-aged to older European-Americans (EA).
Methods and Results
We assessed whether common variants of HNF1A are associated with CRP, GGT, and other atherosclerotic and metabolic risk factors, in the large, population-based CARDIA study of healthy young European-American (EA; n=2,154) and African-American (AA; n=2,083) adults. The minor alleles of Ile27Leu (rs1169288) and Ser486Asn (rs2464196) were associated with 0.10 to 0.15 standard deviation units lower CRP and GGT levels in EA. The same HNF1A coding variants were associated with higher LDL cholesterol, apolipoprotein B, creatinine, and fibrinogen in EA. We replicated the associations between HNF1A coding variants and CRP, fibrinogen, LDL cholesterol, and renal function in a second population-based sample of EA adults 65 years and older from the Cardiovascular Health Study (CHS). The HNF1A Ser486Asn and/or Ile27Leu variants were also associated with increased risk of subclinical coronary atherosclerosis in CARDIA and with incident coronary heart disease in CHS. The Ile27Leu and Ser486Asn variants were 3-fold less common in AA than in EA. There was little evidence of association between HNF1A genotype and atherosclerosis-related phenotypes in AA.
Common polymorphisms of HNF1A appear to influence multiple phenotypes related to cardiovascular risk in the general population of younger and older EA adults.
atherosclerosis; genetics; C-reactive protein; HNF-1; gamma glutamyl transferase
Summary: Copy number variants (CNVs) contribute substantially to human genomic diversity, and development of accurate and efficient methods for CNV genotyping is a central problem in exploring human genotype–phenotype associations. SCIMMkit provides a robust, integrated implementation of three previously validated algorithms [SCIMM (SNP-Conditional Mixture Modeling), SCIMM-Search and SCOUT (SNP-Conditional OUTlier detection)] for targeted interrogation of CNVs using Illumina Infinium II and GoldenGate SNP assays. SCIMMkit is applicable to standardized genome-wide SNP arrays and customized multiplexed SNP panels, providing economy, efficiency and flexibility in experimental design.
Availability: Source code and documentation are available for noncommercial use at http://droog.gs.washington.edu/scimmkit.
Supplementary information: Supplementary data are available at Bioinformatics online.
Single nucleotide polymorphism (SNP) genotyping has emerged as a technology to incorporate copy-number variants (CNVs) into genetic analyses of human traits. However, the extent to which SNP platforms accurately capture CNVs remains unclear. Using independent, sequence-based CNV maps, we find that commonly used SNP platforms have limited or no probe coverage for a large fraction of CNVs. Despite this, in nine samples we inferred 368 CNVs using Illumina SNP genotyping data and experimentally validated over two-thirds of these. We also developed a method (SCIMM) to robustly genotype deletions using as few as two SNP probes. We find that HapMap SNPs are strongly correlated with 82% of common deletions, but the newest SNP platforms effectively tag about 50%. We conclude that currently available genome-wide SNP assays can capture CNVs accurately, but improvements in array designs, particularly in duplicated sequences, are necessary to facilitate more comprehensive analyses of genomic variation.
Rationale: Polymorphisms affecting Toll-like receptor (TLR)–mediated responses could predispose to excessive inflammation during an infection and contribute to an increased risk for poor outcomes in patients with sepsis.
Objectives: To identify hypermorphic polymorphisms causing elevated TLR-mediated innate immune cytokine and chemokine responses and to test whether these polymorphisms are associated with increased susceptibility to death, organ dysfunction, and infections in patients with sepsis.
Methods: We screened single-nucleotide polymorphisms (SNPs) in 43 TLR-related genes to identify variants affecting TLR-mediated inflammatory responses in blood from healthy volunteers ex vivo. The SNP associated most strongly with hypermorphic responses was tested for associations with death, organ dysfunction, and type of infection in two studies: a nested case–control study in a cohort of intensive care unit patients with sepsis, and a case–control study using patients with sepsis, patients with sepsis-related acute lung injury, and healthy control subjects.
Measurements and Main Results: The SNP demonstrating the most hypermorphic effect was the G allele of TLR1−7202A/G (rs5743551), which was associated with elevated TLR1-mediated cytokine production (P < 2 × 10^−20). TLR1−7202G marked a coding SNP that causes higher TLR1-induced NF-κB activation and higher cell surface TLR1 expression. In the cohort of patients with sepsis, TLR1−7202G predicted worse organ dysfunction and death (odds ratio, 1.82; 95% confidence interval, 1.07–3.09). In the case–control study, TLR1−7202G was associated with sepsis-related acute lung injury (odds ratio, 3.40; 95% confidence interval, 1.59–7.27). TLR1−7202G was also associated with a higher prevalence of gram-positive cultures in both clinical studies.
Conclusions: Hypermorphic genetic variation in TLR1 is associated with increased susceptibility to organ dysfunction, death, and gram-positive infection in sepsis.
innate immunity; genetic variation; genetic predisposition
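The odds ratios reported above for TLR1−7202G come with 95% confidence intervals, and a short sketch may help in reading them. The abstract does not say how the intervals were computed. The code below assumes a Wald-type interval built on the log-odds scale (an assumption, not the authors' stated method), under which the point estimate sits at the geometric midpoint of the bounds. The helper names `geometric_midpoint` and `implied_log_or_se` are illustrative only.

```python
import math


def geometric_midpoint(lower: float, upper: float) -> float:
    """Midpoint of an interval on the log scale, i.e. sqrt(lower * upper)."""
    return math.sqrt(lower * upper)


def implied_log_or_se(lower: float, upper: float, z: float = 1.96) -> float:
    """Standard error of log(OR) implied by a 95% interval.

    Assumes the interval is log(OR) +/- z * SE (a Wald-type interval);
    this is an assumption, not something stated in the abstract.
    """
    return (math.log(upper) - math.log(lower)) / (2.0 * z)


# Odds ratios and 95% confidence intervals as reported in the abstract.
reported = [
    ("organ dysfunction and death", 1.82, 1.07, 3.09),
    ("sepsis-related acute lung injury", 3.40, 1.59, 7.27),
]

for outcome, odds_ratio, lower, upper in reported:
    mid = geometric_midpoint(lower, upper)
    se = implied_log_or_se(lower, upper)
    print(f"{outcome}: reported OR={odds_ratio:.2f}, "
          f"log-scale midpoint={mid:.2f}, implied SE(log OR)={se:.3f}")
```

For both outcomes the log-scale midpoint matches the reported odds ratio to within rounding, which is consistent with (but does not prove) an interval constructed on the log-odds scale. The wider interval for acute lung injury corresponds to a larger implied standard error.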
Background
Genome-wide genetic association analysis represents an opportunity for comprehensive survey of genes governing lipid metabolism, potentially revealing new insights or even therapeutic strategies for cardiovascular disease and related metabolic disorders.
Methods and Results
We have performed large-scale, genome-wide genetic analysis among 6382 Caucasian women with replication in two cohorts of 970 additional Caucasian men and women for associations between common SNPs and LDL-C, HDL-C, triglycerides, apolipoprotein A1 (ApoA1), and apolipoprotein B (ApoB). Genome-wide associations (P < 5 × 10^−8) were found at the PCSK9 gene, the APOB gene, the LPL gene, the APOA1-APOA5 locus, the LIPC gene, the CETP gene, the LDLR gene, and the APOE locus. In addition, genome-wide associations with triglycerides at the GCKR gene confirm and extend emerging links between glucose and lipid metabolism. Still other genome-wide associations at the 1p13.3 locus are consistent with emerging biological properties for a region of the genome, possibly related to the SORT1 gene. Below genome-wide significance, our study provides confirmatory evidence for associations at five novel loci with LDL-C, HDL-C, or triglycerides reported recently in separate genome-wide association studies. The total proportion of variance explained by common variation at the genome-wide candidate loci ranges from 4.3% for triglycerides to 12.6% for ApoB.
Conclusions
Genome-wide associations at the GCKR gene and near the SORT1 gene as well as confirmatory associations at five additional novel loci suggest emerging biological pathways for lipid metabolism among Caucasian women.
lipoprotein; lipid; GWAS; cardiovascular disease
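The genome-wide significance threshold used in the lipid study above (P below 5 × 10^−8) is conventionally read as a Bonferroni correction of a 0.05 family-wise error rate over roughly one million independent common-variant tests. The abstract does not spell out that derivation, so the sketch below treats it as an assumption. The locus labels and P values are hypothetical placeholders, not results from the study.

```python
def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Per-test significance threshold that controls the family-wise error rate."""
    return alpha / n_tests


def genome_wide_hits(p_values: dict, threshold: float) -> list:
    """Return the labels whose P value falls below the threshold, sorted by P."""
    hits = [(label, p) for label, p in p_values.items() if p < threshold]
    return [label for label, _ in sorted(hits, key=lambda item: item[1])]


# Assumption: about one million independent tests across the genome.
threshold = bonferroni_threshold(alpha=0.05, n_tests=1_000_000)

# Hypothetical P values for illustration only.
example_p_values = {
    "locus_A": 3.0e-12,
    "locus_B": 4.0e-8,
    "locus_C": 6.0e-8,
    "locus_D": 1.0e-3,
}

print(f"threshold = {threshold:.1e}")
print("genome-wide significant:", genome_wide_hits(example_p_values, threshold))
```

Under these assumptions, `locus_C` would fall just short of genome-wide significance. Results in that range are the kind the abstract describes as confirmatory evidence "below genome-wide significance".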