Prior studies have identified recurrent oncogenic mutations in colorectal adenocarcinoma1 and have surveyed exons of protein-coding genes for mutations in 11 affected individuals2,3. Here we report whole-genome sequencing from nine individuals with colorectal cancer, including primary colorectal tumors and matched adjacent non-tumor tissues, at an average of 30.7× and 31.9× coverage, respectively. We identify an average of 75 somatic rearrangements per tumor, including complex networks of translocations between pairs of chromosomes. Eleven rearrangements encode predicted in-frame fusion proteins, including a fusion of VTI1A and TCF7L2 found in 3 out of 97 colorectal cancers. Although TCF7L2 encodes TCF4, which cooperates with β-catenin4 in colorectal carcinogenesis5,6, the fusion lacks the TCF4 β-catenin–binding domain. We found a colorectal carcinoma cell line harboring the fusion gene to be dependent on VTI1A-TCF7L2 for anchorage-independent growth using RNA interference-mediated knockdown. This study shows previously unidentified levels of genomic rearrangements in colorectal carcinoma that can lead to essential gene fusions and other oncogenic events.
Lung adenocarcinoma, the most common subtype of non-small cell lung cancer, is responsible for over 500,000 deaths per year worldwide. Here, we report exome and genome sequences of 183 lung adenocarcinoma tumor/normal DNA pairs. These analyses revealed a mean exonic somatic mutation rate of 12.0 events/megabase and identified the majority of genes previously reported as significantly mutated in lung adenocarcinoma. In addition, we identified statistically recurrent somatic mutations in the splicing factor gene U2AF1 and truncating mutations affecting RBM10 and ARID1A. Analysis of nucleotide context-specific mutation signatures grouped the sample set into distinct clusters that correlated with smoking history and alterations of reported lung adenocarcinoma genes. Whole genome sequence analysis revealed frequent structural re-arrangements, including in-frame exonic alterations within EGFR and SIK2 kinases. The candidate genes identified in this study are attractive targets for biological characterization and therapeutic targeting of lung adenocarcinoma.
Despite recent insights into melanoma genetics, systematic surveys for driver mutations are challenged by an abundance of passenger mutations caused by carcinogenic ultraviolet (UV) light exposure. We developed a permutation-based framework to address this challenge, employing mutation data from intronic sequences to control for passenger mutational load on a per gene basis. Analysis of large-scale melanoma exome data by this approach discovered six novel melanoma genes (PPP6C, RAC1, SNX31, TACC1, STK19 and ARID2), three of which - RAC1, PPP6C and STK19 - harbored recurrent and potentially targetable mutations. Integration with chromosomal copy number data contextualized the landscape of driver mutations, providing oncogenic insights in BRAF- and NRAS-driven melanoma as well as those without known NRAS/BRAF mutations. The landscape also clarified a mutational basis for RB and p53 pathway deregulation in this malignancy. Finally, the spectrum of driver mutations provided unequivocal genomic evidence for a direct mutagenic role of UV light in melanoma pathogenesis.
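The idea of using intronic mutations to calibrate the passenger load of each gene can be illustrated with a toy permutation test. This is a minimal sketch under a strong simplifying assumption (mutations are reshuffled uniformly across a gene's exonic and intronic bases); the function name, parameters, and counts are hypothetical and it is not the framework used in the study.

```python
import random

def permutation_pvalue(exon_mut, intron_mut, exon_len, intron_len,
                       n_perm=2000, seed=0):
    """One-sided permutation p-value for excess exonic mutations in one gene.

    Hypothetical simplification: every mutation in the gene is reassigned
    uniformly at random across exonic + intronic bases, so the intronic
    sequence sets the expected passenger load for that gene.
    """
    rng = random.Random(seed)
    total = exon_mut + intron_mut
    p_exon = exon_len / (exon_len + intron_len)
    hits = 0
    for _ in range(n_perm):
        # Count how many reshuffled mutations land in exons.
        simulated = sum(rng.random() < p_exon for _ in range(total))
        if simulated >= exon_mut:
            hits += 1
    # Add-one correction keeps the estimate away from exactly zero.
    return (hits + 1) / (n_perm + 1)

# Invented example: exons are 10% of the gene but carry half of its mutations.
p_enriched = permutation_pvalue(exon_mut=10, intron_mut=10,
                                exon_len=2000, intron_len=18000)
# Invented example: exonic mutations match the intronic background.
p_background = permutation_pvalue(exon_mut=2, intron_mut=18,
                                  exon_len=2000, intron_len=18000)
```

A gene whose exonic count is rarely matched by the reshuffled counts gets a small p-value, while a gene that tracks its intronic background does not, regardless of how heavily mutated the gene is overall.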
Systemic lupus erythematosus (SLE) is a common systemic autoimmune disease with complex etiology but strong clustering in families (λS = ~30). We performed a genome-wide association scan using 317,501 SNPs in 720 women of European ancestry with SLE and in 2,337 controls, and we genotyped consistently associated SNPs in two additional independent sample sets totaling 1,846 affected women and 1,825 controls. Aside from the expected strong association between SLE and the HLA region on chromosome 6p21 and the previously confirmed non-HLA locus IRF5 on chromosome 7q32, we found evidence of association with replication (1.6 × 10^−23 < P_overall < 1.1 × 10^−7; odds ratio 0.82–1.62) in four regions: 16p11.2 (ITGAM), 11p15.5 (KIAA1542), 3p14.3 (PXK) and 1q25.1 (rs10798269). We also found evidence for association (P < 1 × 10^−5) at FCGR2A, PTPN22 and STAT4, regions previously associated with SLE and other autoimmune diseases, as well as at ≥9 other loci (P < 2 × 10^−7). Our results show that numerous genes, some with known immune-related functions, predispose to SLE.
As a first step toward understanding how rare variants contribute to risk for complex diseases, we sequenced 15,585 human protein-coding genes to an average median depth of 111× in 2440 individuals of European (n = 1351) and African (n = 1088) ancestry. We identified over 500,000 single-nucleotide variants (SNVs), the majority of which were rare (86% with a minor allele frequency less than 0.5%), previously unknown (82%), and population-specific (82%). On average, 2.3% of the 13,595 SNVs each person carried were predicted to affect protein function of ∼313 genes per genome, and ∼95.7% of SNVs predicted to be functionally important were rare. This excess of rare functional variants is due to the combined effects of explosive, recent accelerated population growth and weak purifying selection. Furthermore, we show that large sample sizes will be required to associate rare variants with complex traits.
Establishing the age of each mutation segregating in contemporary human populations is important to fully understand our evolutionary history1,2 and will help facilitate the development of new approaches for disease gene discovery3. Large-scale surveys of human genetic variation have reported signatures of recent explosive population growth4-6, notable for an excess of rare genetic variants, qualitatively suggesting that many mutations arose recently. To more quantitatively assess the distribution of mutation ages, we resequenced 15,336 genes in 6,515 individuals of European (n=4,298) and African (n=2,217) American ancestry and inferred the age of 1,146,401 autosomal single nucleotide variants (SNVs). We estimate that ~73% of all protein-coding SNVs and ~86% of SNVs predicted to be deleterious arose in the past 5,000-10,000 years. The average age of deleterious SNVs varied significantly across molecular pathways, and disease genes contained a significantly higher proportion of recently arisen deleterious SNVs compared to other genes. Furthermore, European Americans had an excess of deleterious variants in essential and Mendelian disease genes compared to African Americans, consistent with weaker purifying selection due to the out-of-Africa dispersal. Our results better delimit the historical details of human protein-coding variation, illustrate the profound effect recent human history has had on the burden of deleterious SNVs segregating in contemporary populations, and provide important practical information that can be used to prioritize variants in disease gene discovery.
Autism spectrum disorders are a genetically heterogeneous constellation of syndromes characterized by impairments in reciprocal social interaction. Available somatic treatments have limited efficacy. We have identified inactivating mutations in the gene BCKDK (Branched Chain Ketoacid Dehydrogenase Kinase) in consanguineous families with autism, epilepsy, and intellectual disability. The encoded protein is responsible for phosphorylation-mediated inactivation of the E1α subunit of branched-chain ketoacid dehydrogenase (BCKDH). Patients with homozygous BCKDK mutations display reductions in BCKDK messenger RNA and protein, E1α phosphorylation, and plasma branched-chain amino acids. Bckdk knockout mice show abnormal brain amino acid profiles and neurobehavioral deficits that respond to dietary supplementation. Thus, autism presenting with intellectual disability and epilepsy caused by BCKDK mutations represents a potentially treatable syndrome.
The somatic genetic basis of chronic lymphocytic leukemia, a common and clinically heterogeneous leukemia occurring in adults, remains poorly understood.
We obtained DNA samples from leukemia cells in 91 patients with chronic lymphocytic leukemia and performed massively parallel sequencing of 88 whole exomes and whole genomes, together with sequencing of matched germline DNA, to characterize the spectrum of somatic mutations in this disease.
Nine genes that are mutated at significant frequencies were identified, including four with established roles in chronic lymphocytic leukemia (TP53 in 15% of patients, ATM in 9%, MYD88 in 10%, and NOTCH1 in 4%) and five with unestablished roles (SF3B1, ZMYM3, MAPK1, FBXW7, and DDX3X). SF3B1, which functions at the catalytic core of the spliceosome, was the second most frequently mutated gene (with mutations occurring in 15% of patients). SF3B1 mutations occurred primarily in tumors with deletions in chromosome 11q, which are associated with a poor prognosis in patients with chronic lymphocytic leukemia. We further discovered that tumor samples with mutations in SF3B1 had alterations in pre–messenger RNA (mRNA) splicing.
Our study defines the landscape of somatic mutations in chronic lymphocytic leukemia and highlights pre-mRNA splicing as a critical cellular process contributing to chronic lymphocytic leukemia.
Prostate cancer is the second most common cancer in men worldwide and causes over 250,000 deaths each year1. Overtreatment of indolent disease also results in significant morbidity2. Common genetic alterations in prostate cancer include losses of NKX3.1 (8p21)3,4 and PTEN (10q23)5,6, gains of the androgen receptor gene (AR)7,8 and fusion of ETS-family transcription factor genes with androgen-responsive promoters9–11. Recurrent somatic base-pair substitutions are believed to be less contributory in prostate tumorigenesis12,13 but have not been systematically analyzed in large cohorts. Here we sequenced the exomes of 112 prostate tumor/normal pairs. Novel recurrent mutations were identified in multiple genes, including MED12 and FOXA1. SPOP was the most frequently mutated gene, with mutations involving the SPOP substrate binding cleft in 6–15% of tumors across multiple independent cohorts. SPOP-mutant prostate cancers lacked ETS rearrangements and exhibited a distinct pattern of genomic alterations. Thus, SPOP mutations may define a new molecular subtype of prostate cancer.
Neighboring genes are often coordinately expressed within cis-regulatory modules, but evidence that nonparalogous genes share functions in mammals is lacking. Here, we report that mutation of either TMEM138 or TMEM216 causes a phenotypically indistinguishable human ciliopathy, Joubert syndrome. Despite a lack of sequence homology, the genes are aligned in a head-to-tail configuration and joined by chromosomal rearrangement at the amphibian-to-reptile evolutionary transition. Expression of the two genes is mediated by a conserved regulatory element in the noncoding intergenic region. Coordinated expression is important for their interdependent cellular role in vesicular transport to primary cilia. Hence, during vertebrate evolution of genes involved in ciliogenesis, nonparalogous genes were arranged into a functional gene cluster with shared regulatory elements.
We report on results from whole-exome sequencing (WES) of 1,039 subjects diagnosed with autism spectrum disorders (ASD) and 870 controls selected from the NIMH repository to be of similar ancestry to cases. The WES data came from two centers using different methods to produce sequence and to call variants from it. Therefore, an initial goal was to ensure the distribution of rare variation was similar for data from different centers. This proved straightforward by filtering called variants by fraction of missing data, read depth, and balance of alternative to reference reads. Results were evaluated using seven samples sequenced at both centers and by results from the association study. Next we addressed how the data and/or results from the centers should be combined. Gene-based analysis of association was an obvious choice, but should statistics for association be combined across centers (meta-analysis) or should data be combined and then analyzed (mega-analysis)? Because of the nature of many gene-based tests, we showed by theory and simulations that mega-analysis has better power than meta-analysis. Finally, before analyzing the data for association, we explored the impact of population structure on rare variant analysis in these data. Like other recent studies, we found evidence that population structure can confound case-control studies by the clustering of rare variants in ancestry space; yet, unlike some recent studies, for these data we found that principal component-based analyses were sufficient to control for ancestry and produce test statistics with appropriate distributions. After using a variety of gene-based tests and both meta- and mega-analysis, we found no new risk genes for ASD in this sample. Our results suggest that standard gene-based tests will require much larger samples of cases and controls before being effective for gene discovery, even for a disorder like ASD.
This study evaluates association of rare variants and autism spectrum disorders (ASD) in case and control samples sequenced by two centers. Before doing association analyses, we studied how to combine information across studies. We first harmonized the whole-exome sequence (WES) data, across centers, in terms of the distribution of rare variation. Key features included filtering called variants by fraction of missing data, read depth, and balance of alternative to reference reads. After filtering, the vast majority of variant calls from seven samples sequenced at both centers matched. We also evaluated whether one should combine summary statistics from data from each center (meta-analysis) or combine data and analyze it together (mega-analysis). For many gene-based tests, we showed that mega-analysis yields more power. After quality control of data from 1,039 ASD cases and 870 controls and a range of analyses, no gene showed exome-wide evidence of significant association. Our results comport with recent results demonstrating that hundreds of genes affect risk for ASD; they suggest that rare risk variants are scattered across these many genes, and thus larger samples will be required to identify those genes.
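The distinction between meta-analysis and mega-analysis can be made concrete with a small sketch. It assumes a deliberately simple gene-level statistic (a two-proportion z test on rare-variant carrier counts) and a sample-size-weighted Stouffer combination for the meta-analysis; the function names and the per-center counts are invented for illustration, and the gene-based tests used in the study are more elaborate.

```python
import math

def carrier_z(case_carriers, n_cases, ctrl_carriers, n_ctrls):
    """Two-proportion z statistic for rare-variant carrier frequency."""
    p_case = case_carriers / n_cases
    p_ctrl = ctrl_carriers / n_ctrls
    pooled = (case_carriers + ctrl_carriers) / (n_cases + n_ctrls)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_cases + 1 / n_ctrls))
    return (p_case - p_ctrl) / se if se > 0 else 0.0

def meta_z(centers):
    """Meta-analysis: combine per-center z scores, weighted by sqrt(N)."""
    weights = [math.sqrt(c[1] + c[3]) for c in centers]
    zs = [carrier_z(*c) for c in centers]
    numerator = sum(w * z for w, z in zip(weights, zs))
    return numerator / math.sqrt(sum(w * w for w in weights))

def mega_z(centers):
    """Mega-analysis: pool raw counts from all centers, then test once."""
    pooled_counts = [sum(column) for column in zip(*centers)]
    return carrier_z(*pooled_counts)

# Invented counts per center for one gene:
# (case carriers, cases, control carriers, controls)
centers = [(6, 500, 1, 450), (5, 539, 1, 420)]
z_meta = meta_z(centers)
z_mega = mega_z(centers)
```

With this simple statistic and balanced centers the two routes give similar answers; the power advantage reported for mega-analysis concerns the gene-based tests studied in the paper and is not demonstrated by this toy example.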
Autism spectrum disorders (ASD) are believed to have genetic and environmental origins, yet in only a modest fraction of individuals can specific causes be identified1,2. To identify further genetic risk factors, we assess the role of de novo mutations in ASD by sequencing the exomes of ASD cases and their parents (n= 175 trios). Fewer than half of the cases (46.3%) carry a missense or nonsense de novo variant and the overall rate of mutation is only modestly higher than the expected rate. In contrast, there is significantly enriched connectivity among the proteins encoded by genes harboring de novo missense or nonsense mutations, and excess connectivity to prior ASD genes of major effect, suggesting a subset of observed events are relevant to ASD risk. The small increase in rate of de novo events, when taken together with the connections among the proteins themselves and to ASD, is consistent with an important but limited role for de novo point mutations, similar to that documented for de novo copy number variants. Genetic models incorporating these data suggest that the majority of observed de novo events are unconnected to ASD; those that do confer risk are distributed across many genes and are incompletely penetrant (i.e., not necessarily causal). Our results support polygenic models in which spontaneous coding mutations in any of a large number of genes increase risk by 5- to 20-fold. Despite the challenge posed by such models, results from de novo events and a large parallel case-control study provide strong evidence in favor of CHD8 and KATNAL2 as genuine autism risk factors.
Medulloblastomas are the most common malignant brain tumors in children1. Identifying and understanding the genetic events that drive these tumors is critical for the development of more effective diagnostic, prognostic and therapeutic strategies. Recently, our group and others described distinct molecular subtypes of medulloblastoma based on transcriptional and copy number profiles2–5. Here, we utilized whole exome hybrid capture and deep sequencing to identify somatic mutations across the coding regions of 92 primary medulloblastoma/normal pairs. Overall, medulloblastomas exhibit low mutation rates consistent with other pediatric tumors, with a median of 0.35 non-silent mutations per megabase. We identified twelve genes mutated at statistically significant frequencies, including previously known mutated genes in medulloblastoma such as CTNNB1, PTCH1, MLL2, SMARCA4 and TP53. Recurrent somatic mutations were identified in an RNA helicase gene, DDX3X, often concurrent with CTNNB1 mutations, and in the nuclear co-repressor (N-CoR) complex genes GPS2, BCOR, and LDB1, novel findings in medulloblastoma. We show that mutant DDX3X potentiates transactivation of a TCF promoter and enhances cell viability in combination with mutant but not wild type beta-catenin. Together, our study reveals the alteration of Wnt, Hedgehog, histone methyltransferase and now N-CoR pathways across medulloblastomas and within specific subtypes of this disease, and nominates the RNA helicase DDX3X as a component of pathogenic beta-catenin signaling in medulloblastoma.
Multiple myeloma is an incurable malignancy of plasma cells, and its pathogenesis is poorly understood. Here we report the massively parallel sequencing of 38 tumor genomes and their comparison to matched normal DNAs. Several new and unexpected oncogenic mechanisms were suggested by the pattern of somatic mutation across the dataset. These include the mutation of genes involved in protein translation (seen in nearly half of the patients), genes involved in histone methylation, and genes involved in blood coagulation. In addition, a broader than anticipated role of NF-κB signaling was suggested by mutations in 11 members of the NF-κB pathway. Of potential immediate clinical relevance, activating mutations of the kinase BRAF were observed in 4% of patients, suggesting the evaluation of BRAF inhibitors in multiple myeloma clinical trials. These results indicate that cancer genome sequencing of large collections of samples will yield new insights into cancer not anticipated by existing knowledge.
Hepatocellular carcinoma (HCC) is a highly heterogeneous disease, and prior attempts to develop genomics-based classification for HCC have yielded highly divergent results, indicating the difficulty of identifying a unified molecular anatomy. We performed a meta-analysis of gene expression profiles in datasets from 8 independent patient cohorts across the world. In addition, aiming to establish the real-world applicability of a classification system, we profiled 118 formalin-fixed, paraffin-embedded tissues from an additional patient cohort. A total of 603 patients were analyzed, representing the major etiologies of HCC (hepatitis B and C) collected from Western and Eastern countries. We observed 3 robust HCC subclasses (termed S1, S2, and S3), each correlated with clinical parameters such as tumor size, extent of cellular differentiation, and serum alpha-fetoprotein levels. An analysis of the components of the signatures indicated that S1 reflected aberrant activation of the WNT signaling pathway, S2 was characterized by proliferation as well as MYC and AKT activation, and S3 was associated with hepatocyte differentiation. Functional studies indicated that the WNT pathway activation signature characteristic of S1 tumors was not simply the result of beta-catenin mutation, but rather was the result of TGF-beta activation, thus representing a new mechanism of WNT pathway activation in HCC. These experiments establish the first consensus classification framework for HCC based on gene-expression profiles, and highlight the power of integrating multiple datasets to define a robust molecular taxonomy of the disease.
hepatocellular carcinoma; transcriptome; meta-analysis; transforming growth factor-beta; WNT pathway
As researchers begin probing deep coverage sequencing data for increasingly rare mutations and subclonal events, the fidelity of next generation sequencing (NGS) laboratory methods will become increasingly critical. Although error rates for sequencing and polymerase chain reaction (PCR) are well documented, the effects that DNA extraction and other library preparation steps could have on downstream sequence integrity have not been thoroughly evaluated. Here, we describe the discovery of novel C > A/G > T transversion artifacts found at low allelic fractions in targeted capture data. Characteristics such as sequencer read orientation and presence in both tumor and normal samples strongly indicated a non-biological mechanism. We identified the source as oxidation of DNA during acoustic shearing in samples containing reactive contaminants from the extraction process. We show generation of 8-oxoguanine (8-oxoG) lesions during DNA shearing, present analysis tools to detect oxidation in sequencing data and suggest methods to reduce DNA oxidation through the introduction of antioxidants. Further, informatics methods are presented to confidently filter these artifacts from sequencing data sets. Though only seen in a low percentage of reads in affected samples, such artifacts could have profoundly deleterious effects on the ability to confidently call rare mutations, and eliminating other possible sources of artifacts should become a priority for the research community.
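The two signals described above, low allelic fraction and a dependence on read orientation, suggest a simple screening rule for candidate oxidation artifacts. The sketch below is an illustrative heuristic only: the function name, the orientation count inputs, and both thresholds are assumptions made for this example, and it is not the filtering method presented in the paper.

```python
# Substitutions in which 8-oxoG damage is expected to appear.
OXOG_SUBSTITUTIONS = {("C", "A"), ("G", "T")}

def looks_like_oxog(ref, alt, alt_f1r2, alt_f2r1, depth,
                    max_allele_fraction=0.1, min_orientation_bias=0.9):
    """Flag a variant call as a possible oxidation artifact.

    alt_f1r2 and alt_f2r1 are counts of alt-supporting reads in each
    read-pair orientation. Both thresholds are illustrative guesses,
    not values taken from the study.
    """
    if (ref, alt) not in OXOG_SUBSTITUTIONS:
        return False
    alt_reads = alt_f1r2 + alt_f2r1
    if alt_reads == 0 or depth == 0:
        return False
    allele_fraction = alt_reads / depth
    # Fraction of alt reads that come from the dominant orientation.
    orientation_bias = max(alt_f1r2, alt_f2r1) / alt_reads
    return (allele_fraction <= max_allele_fraction
            and orientation_bias >= min_orientation_bias)

# Invented calls: a low-fraction, one-orientation G>T and a balanced G>T.
suspect = looks_like_oxog("G", "T", alt_f1r2=12, alt_f2r1=0, depth=400)
balanced = looks_like_oxog("G", "T", alt_f1r2=60, alt_f2r1=55, depth=400)
```

A real filter would also need to account for sequence context and for how much orientation skew arises by chance at low read counts, which this rule ignores.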
Knowledge of “actionable” somatic genomic alterations present in each tumor (e.g., point mutations, small insertions/deletions, and copy number alterations that direct therapeutic options) should facilitate individualized approaches to cancer treatment. However, clinical implementation of systematic genomic profiling has rarely been achieved beyond limited numbers of oncogene point mutations. To address this challenge, we utilized a targeted, massively parallel sequencing approach to detect tumor genomic alterations in formalin-fixed, paraffin-embedded (FFPE) tumor samples. Nearly 400-fold mean sequence coverage was achieved, and single nucleotide sequence variants, small insertions/deletions, and chromosomal copy number alterations were detected simultaneously with high accuracy compared to other methods in clinical use. Putatively actionable genomic alterations, including those that predict sensitivity or resistance to established and experimental therapies, were detected in each tumor sample tested. Thus, targeted deep sequencing of clinical tumor material may enable mutation-driven clinical trials and, ultimately, “personalized” cancer treatment.
Because of the high risk of recurrence in high-grade serous ovarian carcinoma (HGS-OvCa), the development of outcome predictors could be valuable for patient stratification. Using the catalog of The Cancer Genome Atlas (TCGA), we developed subtype and survival gene expression signatures, which, when combined, provide a prognostic model of HGS-OvCa classification, named “Classification of Ovarian Cancer” (CLOVAR). We validated CLOVAR on an independent dataset consisting of 879 HGS-OvCa expression profiles. The worst outcome group, accounting for 23% of all cases, was associated with a median survival of 23 months and a platinum resistance rate of 63%, versus a median survival of 46 months and platinum resistance rate of 23% in other cases. Associating the outcome prediction model with BRCA1/BRCA2 mutation status, residual disease after surgery, and disease stage further optimized outcome classification. Ovarian cancer is a disease in urgent need of more effective therapies. The spectrum of outcomes observed here and their association with CLOVAR signatures suggest variations in underlying tumor biology. Prospective validation of the CLOVAR model in the context of additional prognostic variables may provide a rationale for optimal combination of patient and treatment regimens.
Melanoma is notable for its metastatic propensity, lethality in the advanced setting, and association with ultraviolet (UV) exposure early in life1. To obtain a comprehensive genomic view of melanoma, we sequenced the genomes of 25 metastatic melanomas and matched germline DNA. A wide range of point mutation rates was observed: lowest in melanomas whose primaries arose on non-UV exposed hairless skin of the extremities (3 and 14 per Mb genome), intermediate in those originating from hair-bearing skin of the trunk (range = 5 to 55 per Mb), and highest in a patient with a documented history of chronic sun exposure (111 per Mb). Analysis of whole-genome sequence data identified PREX2 - a PTEN-interacting protein and negative regulator of PTEN in breast cancer2 - as a significantly mutated gene with a mutation frequency of approximately 14% in an independent extension cohort of 107 human melanomas. PREX2 mutations are biologically relevant, as ectopic expression of mutant PREX2 accelerated tumor formation of immortalized human melanocytes in vivo. Thus, whole-genome sequencing of human melanoma tumors revealed genomic evidence of UV pathogenesis and discovered a new recurrently mutated gene in melanoma.
Congenital diarrheal disorders (CDDs) are a collection of rare, heterogeneous enteropathies with early onset and often severe outcomes. Here, we report a family of Ashkenazi Jewish descent, with 2 out of 3 children affected by CDD. Both affected children presented 3 days after birth with severe, intractable diarrhea. One child died from complications at age 17 months. The second child showed marked improvement, with resolution of most symptoms at 10 to 12 months of age. Using exome sequencing, we identified a rare splice site mutation in the DGAT1 gene and found that both affected children were homozygous carriers. Molecular analysis of the mutant allele indicated a total loss of function, with no detectable DGAT1 protein or activity produced. The precise cause of diarrhea is unknown, but we speculate that it relates to abnormal fat absorption and buildup of DGAT substrates in the intestinal mucosa. Our results identify DGAT1 loss-of-function mutations as a rare cause of CDDs. These findings prompt concern for DGAT1 inhibition in humans, which is being assessed for treating metabolic and other diseases.
Head and neck squamous cell carcinoma (HNSCC) is a common, morbid, and frequently lethal malignancy. To uncover its mutational spectrum, we analyzed whole-exome sequencing data from 74 tumor-normal pairs. The majority exhibited a mutational profile consistent with tobacco exposure; human papilloma virus was detectable by sequencing of DNA from infected tumors. In addition to identifying previously known HNSCC genes (TP53, CDKN2A, PTEN, PIK3CA, and HRAS), the analysis revealed many genes not previously implicated in this malignancy. At least 30% of cases harbored mutations in genes that regulate squamous differentiation (e.g., NOTCH1, IRF6, and TP63), implicating its dysregulation as a major driver of HNSCC carcinogenesis. More generally, the results indicate the ability of large-scale sequencing to reveal fundamental tumorigenic mechanisms.
Pacific Biosciences technology offers a fundamentally new data type with the potential to overcome some limitations of current next generation sequencing platforms through significantly longer reads, single molecule sequencing, low composition bias and an error profile that is orthogonal to other platforms. With these potential advantages in mind, we here evaluate the utility of the Pacific Biosciences RS platform for human medical amplicon resequencing projects.
We evaluated the Pacific Biosciences technology for SNP discovery in medical resequencing projects using the Genome Analysis Toolkit, observing high sensitivity and specificity for calling differences in amplicons containing known true or false SNPs. We assessed data quality: most errors were indels (~14%) with few apparent miscalls (~1%). In this work, we define a custom data processing pipeline for Pacific Biosciences data for human data analysis.
Critically, the error properties were largely free of the context-specific effects that affect other sequencing technologies. These data show excellent utility for follow-up validation and extension studies in human data and medical genetics projects, but can be extended to other organisms with a reference genome.
More than a thousand disease susceptibility loci have been identified via genome-wide association studies (GWAS) of common variants; however, the specific genes and full allelic spectrum of causal variants underlying these findings generally remain to be defined. We utilize pooled next-generation sequencing to study 56 genes in regions associated with Crohn’s disease (CD) in 350 cases and 350 controls. Follow-up genotyping of 70 rare and low-frequency protein-altering variants (MAF ~ 0.001–0.05) in nine independent case-control series (16,054 CD patients, 12,153 UC patients, 17,575 healthy controls) identifies four additional independent risk factors in NOD2, two additional protective variants in IL23R, a highly significant association with a novel, protective splice variant in CARD9 (p < 1e-16, OR ~ 0.29), as well as additional associations with coding variants in IL18RAP, CUL2, C1orf106, PTPN22 and MUC19. We extend the results of successful GWAS by providing novel, rare, and likely functional variants that will empower functional experiments and predictive models.
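For readers less familiar with the effect sizes quoted above, an odds ratio below 1 indicates a protective allele. The sketch below computes an odds ratio and a Woolf (log-scale) confidence interval from a 2×2 carrier table; the function name is hypothetical and the counts are invented purely so that the result falls near 0.3, and they are not data from the study.

```python
import math

def odds_ratio_ci(case_carriers, case_noncarriers,
                  ctrl_carriers, ctrl_noncarriers, z=1.96):
    """Odds ratio with a Woolf (log-scale) confidence interval.

    Assumes all four cell counts are non-zero.
    """
    a, b = case_carriers, case_noncarriers
    c, d = ctrl_carriers, ctrl_noncarriers
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR) from the four cell counts.
    se_log = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    low = math.exp(math.log(odds_ratio) - z * se_log)
    high = math.exp(math.log(odds_ratio) + z * se_log)
    return odds_ratio, low, high

# Invented counts: carriers are rarer among cases than among controls.
odds_ratio, ci_low, ci_high = odds_ratio_ci(30, 9970, 100, 9900)
```

An interval that lies entirely below 1, as it does for these invented counts, is what a protective association looks like on this scale.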
Whereas it is well established that plasma lipid levels have substantial heritability within populations, it remains unclear how many of the genetic determinants reported in previous studies (largely performed in European American cohorts) are relevant in different ethnicities.
We tested a set of ∼50,000 polymorphisms from ∼2,000 candidate genes and genetic loci from genome-wide association studies (GWAS) for association with low-density lipoprotein cholesterol (LDL-C), high-density lipoprotein cholesterol (HDL-C), and triglycerides (TG) in 25,000 European Americans and 9,000 African Americans in the National Heart, Lung, and Blood Institute (NHLBI) Candidate Gene Association Resource (CARe). We replicated associations for a number of genes in one or both ethnicities and identified a novel lipid-associated variant in a locus harboring ICAM1. We compared the architecture of genetic loci associated with lipids in both African Americans and European Americans and found that the same genes were relevant across ethnic groups but the specific associated variants at each gene often differed.
We identify or provide further evidence for a number of genetic determinants of plasma lipid levels through population association studies. In many loci the determinants appear to differ substantially between African Americans and European Americans.