1.  Somatic Mutations in Cerebral Cortical Malformations 
The New England journal of medicine  2014;371(8):733-743.
Although there is increasing recognition of the role of somatic mutations in genetic disorders, the prevalence of somatic mutations in neurodevelopmental disease and the optimal techniques to detect somatic mosaicism have not been systematically evaluated.
Using a customized panel of known and candidate genes associated with brain malformations, we applied targeted high-coverage sequencing (depth, ≥200×) to leukocyte-derived DNA samples from 158 persons with brain malformations, including the double-cortex syndrome (subcortical band heterotopia, 30 persons), polymicrogyria with megalencephaly (20), periventricular nodular heterotopia (61), and pachygyria (47). We validated candidate mutations with the use of Sanger sequencing and, for variants present at unequal read depths, subcloning followed by colony sequencing.
Validated, causal mutations were found in 27 persons (17%; range, 10 to 30% for each phenotype). Mutations were somatic in 8 of the 27 (30%), predominantly in persons with the double-cortex syndrome (in whom we found mutations in DCX and LIS1), persons with periventricular nodular heterotopia (FLNA), and persons with pachygyria (TUBB2B). Of the somatic mutations we detected, 5 (63%) were undetectable with the use of traditional Sanger sequencing but were validated through subcloning and subsequent sequencing of the subcloned DNA. We found potentially causal mutations in the candidate genes DYNC1H1, KIF5C, and other kinesin genes in persons with pachygyria.
Targeted sequencing was found to be useful for detecting somatic mutations in patients with brain malformations. High-coverage sequencing panels provide an important complement to whole-exome and whole-genome sequencing in the evaluation of somatic mutations in neuropsychiatric disease. (Funded by the National Institute of Neurological Disorders and Stroke and others.)
PMCID: PMC4274952  PMID: 25140959
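The percentages in this abstract follow directly from the counts it reports. A minimal check, using only numbers stated above (the variable and function names are illustrative, not from the paper):

```python
# Counts as stated in the abstract; names are illustrative.
phenotype_counts = {
    "double-cortex syndrome": 30,
    "polymicrogyria with megalencephaly": 20,
    "periventricular nodular heterotopia": 61,
    "pachygyria": 47,
}
cohort_size = 158
with_causal_mutation = 27
somatic = 8
missed_by_sanger = 5


def percent(part, whole):
    """Percentage rounded half-up (Python's round() rounds 62.5 down to 62)."""
    return int(100 * part / whole + 0.5)


# The four phenotype groups account for the whole cohort.
assert sum(phenotype_counts.values()) == cohort_size

print(percent(with_causal_mutation, cohort_size))  # share of cohort with a validated causal mutation
print(percent(somatic, with_causal_mutation))      # share of those mutations that were somatic
print(percent(missed_by_sanger, somatic))          # share of somatic mutations missed by Sanger sequencing
```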
2.  A variational Bayes discrete mixture test for rare variant association 
Genetic epidemiology  2014;38(1):21-30.
Recently, many statistical methods have been proposed to test for associations between rare genetic variants and complex traits. Most of these methods test for association by aggregating genetic variations within a predefined region, such as a gene. Although there is evidence that “aggregate” tests are more powerful than the single marker test, these tests generally ignore neutral variants and therefore are unable to identify specific variants driving the association with phenotype. We propose a novel aggregate rare-variant test that explicitly models a fraction of variants as neutral, tests associations at the gene-level, and infers the rare-variants driving the association. Simulations show that in the practical scenario where there are many variants within a given region of the genome with only a fraction causal our approach has greater power compared to other popular tests such as the Sequence Kernel Association Test (SKAT), the Weighted Sum Statistic (WSS), and the collapsing method of Morris and Zeggini (MZ). Our algorithm leverages a fast variational Bayes approximate inference methodology to scale to exome-wide analyses, a significant computational advantage over exact inference model selection methodologies. To demonstrate the efficacy of our methodology we test for associations between von Willebrand Factor (VWF) levels and VWF missense rare-variants imputed from the National Heart, Lung, and Blood Institute’s Exome Sequencing project into 2,487 African Americans within the VWF gene. Our method suggests that a relatively small fraction (~10%) of the imputed rare missense variants within VWF are strongly associated with lower VWF levels in African Americans.
PMCID: PMC4030763  PMID: 24482836
Exome sequencing study; approximate inference; von Willebrand Factor genetics
3.  Pharmacogenetic meta-analysis of genome-wide association studies of LDL cholesterol response to statins 
Postmus, Iris | Trompet, Stella | Deshmukh, Harshal A. | Barnes, Michael R. | Li, Xiaohui | Warren, Helen R. | Chasman, Daniel I. | Zhou, Kaixin | Arsenault, Benoit J. | Donnelly, Louise A. | Wiggins, Kerri L. | Avery, Christy L. | Griffin, Paula | Feng, QiPing | Taylor, Kent D. | Li, Guo | Evans, Daniel S. | Smith, Albert V. | de Keyser, Catherine E. | Johnson, Andrew D. | de Craen, Anton J. M. | Stott, David J. | Buckley, Brendan M. | Ford, Ian | Westendorp, Rudi G. J. | Slagboom, P. Eline | Sattar, Naveed | Munroe, Patricia B. | Sever, Peter | Poulter, Neil | Stanton, Alice | Shields, Denis C. | O’Brien, Eoin | Shaw-Hawkins, Sue | Chen, Y.-D. Ida | Nickerson, Deborah A. | Smith, Joshua D. | Dubé, Marie-Pierre | Boekholdt, S. Matthijs | Hovingh, G. Kees | Kastelein, John J. P. | McKeigue, Paul M. | Betteridge, John | Neil, Andrew | Durrington, Paul N. | Doney, Alex | Carr, Fiona | Morris, Andrew | McCarthy, Mark I. | Groop, Leif | Ahlqvist, Emma | Bis, Joshua C. | Rice, Kenneth | Smith, Nicholas L. | Lumley, Thomas | Whitsel, Eric A. | Stürmer, Til | Boerwinkle, Eric | Ngwa, Julius S. | O’Donnell, Christopher J. | Vasan, Ramachandran S. | Wei, Wei-Qi | Wilke, Russell A. | Liu, Ching-Ti | Sun, Fangui | Guo, Xiuqing | Heckbert, Susan R. | Post, Wendy | Sotoodehnia, Nona | Arnold, Alice M. | Stafford, Jeanette M. | Ding, Jingzhong | Herrington, David M. | Kritchevsky, Stephen B. | Eiriksdottir, Gudny | Launer, Leonore J. | Harris, Tamara B. | Chu, Audrey Y. | Giulianini, Franco | MacFadyen, Jean G. | Barratt, Bryan J. | Nyberg, Fredrik | Stricker, Bruno H. | Uitterlinden, André G. | Hofman, Albert | Rivadeneira, Fernando | Emilsson, Valur | Franco, Oscar H. | Ridker, Paul M. | Gudnason, Vilmundur | Liu, Yongmei | Denny, Joshua C. | Ballantyne, Christie M. | Rotter, Jerome I. | Cupples, L. Adrienne | Psaty, Bruce M. | Palmer, Colin N. A. | Tardif, Jean-Claude | Colhoun, Helen M. | Hitman, Graham | Krauss, Ronald M. | Jukema, J. Wouter | Caulfield, Mark J.
Nature Communications  2014;5:5068.
Statins effectively lower LDL cholesterol levels in large studies and the observed interindividual response variability may be partially explained by genetic variation. Here we perform a pharmacogenetic meta-analysis of genome-wide association studies (GWAS) in studies addressing the LDL cholesterol response to statins, including up to 18,596 statin-treated subjects. We validate the most promising signals in a further 22,318 statin recipients and identify two loci, SORT1/CELSR2/PSRC1 and SLCO1B1, not previously identified in GWAS. Moreover, we confirm the previously described associations with APOE and LPA. Our findings advance the understanding of the pharmacogenetic architecture of statin response.
Statins are effectively used to prevent and manage cardiovascular disease, but patient response to these drugs is highly variable. Here, the authors identify two new genes associated with the response of LDL cholesterol to statins and advance our understanding of the genetic basis of drug response.
PMCID: PMC4220464  PMID: 25350695
4.  Imputation of coding variants in African Americans: better performance using data from the exome sequencing project 
Bioinformatics  2013;29(21):2744-2749.
Summary: Although the 1000 Genomes haplotypes are the most commonly used reference panel for imputation, medical sequencing projects are generating large alternate sets of sequenced samples. Imputation in African Americans using 3384 haplotypes from the Exome Sequencing Project, compared with 2184 haplotypes from 1000 Genomes Project, increased effective sample size by 8.3–11.4% for coding variants with minor allele frequency <1%. No loss of imputation quality was observed using a panel built from phenotypic extremes. We recommend using haplotypes from Exome Sequencing Project alone or concatenation of the two panels over quality score-based post-imputation selection or IMPUTE2’s two-panel combination.
Supplementary information: Supplementary data are available at Bioinformatics online.
PMCID: PMC3799474  PMID: 23956302
5.  Exome Sequencing Reveals Novel Rare Variants in the Ryanodine Receptor and Calcium Channel Genes in Malignant Hyperthermia Families 
Anesthesiology  2013;119(5):1054-1065.
About half of malignant hyperthermia (MH) cases are associated with skeletal muscle ryanodine receptor 1 (RYR1) and calcium channel, voltage-dependent, L type, α1S subunit (CACNA1S) gene mutations, leaving many with an unknown cause. We chose to apply a sequencing approach to uncover causal variants in unknown cases. Sequencing the exome, the protein-coding region of the genome, has power at low sample sizes and has identified the cause of over a dozen Mendelian disorders.
We considered four families with multiple MH cases but in whom no mutations in RYR1 and CACNA1S had been identified by Sanger sequencing of complementary DNA. Exome sequences of two affected individuals per family, chosen for maximum genetic distance, were compared. Variants were ranked by allele frequency, protein change, and measures of conservation among mammals to assess likelihood of causation. Finally, putative pathogenic mutations were genotyped in other family members to verify cosegregation with MH.
Exome sequencing revealed 1 rare RYR1 nonsynonymous variant in each of 3 families (Asp1056His, Val2627Met, Val4234Leu), and 1 CACNA1S variant (Thr1009Lys) in a 4th family. These were not seen in variant databases or in our control population sample of 5379 exomes. Follow-up sequencing in other family members verified cosegregation of alleles with MH.
Using both exome sequencing and allele frequency data from large sequencing efforts may aid genetic diagnosis of MH. In our sample, it was more sensitive for variant detection in known genes than Sanger sequencing of complementary DNA, and allows for the possibility of novel gene discovery.
PMCID: PMC4115638  PMID: 24013571
6.  Comparative effectiveness of next generation genomic sequencing for disease diagnosis: Design of a randomized controlled trial in patients with colorectal cancer/polyposis syndromes
Whole exome and whole genome sequencing are applications of next generation sequencing transforming clinical care, but there is little evidence whether these tests improve patient outcomes or if they are cost effective compared to current standard of care. These gaps in knowledge can be addressed by comparative effectiveness and patient-centered outcomes research. We designed a randomized controlled trial that incorporates these research methods to evaluate whole exome sequencing compared to usual care in patients being evaluated for hereditary colorectal cancer and polyposis syndromes. Approximately 220 patients will be randomized and followed for 12 months after return of genomic findings. Patients will receive findings associated with colorectal cancer in a first return of result visit, and findings not associated with colorectal cancer (incidental findings) during a second return of result visit. The primary outcome is efficacy to detect mutations associated with these syndromes; secondary outcomes include psychosocial impact, cost-effectiveness and comparative costs. The secondary outcomes will be obtained via surveys before and after each return visit. The expected challenges in conducting this randomized controlled trial include the relatively low prevalence of genetic disease, difficult interpretation of some genetic variants, and uncertainty about which incidental findings should be returned to patients. The approaches utilized in this study may help guide other investigators in clinical genomics to identify useful outcome measures and strategies to address comparative effectiveness questions about the clinical implementation of genomic sequencing in clinical care.
PMCID: PMC4175052  PMID: 24997220
Comparative effectiveness research; Genomics; Next generation sequencing; Randomized clinical trial; Outcomes research; Whole exome sequencing
7.  Altered splicing of ATP6AP2 causes X-linked parkinsonism with spasticity (XPDS) 
Human Molecular Genetics  2013;22(16):3259-3268.
We report a novel gene for a parkinsonian disorder. X-linked parkinsonism with spasticity (XPDS) presents either as typical adult onset Parkinson's disease or earlier onset spasticity followed by parkinsonism. We previously mapped the XPDS gene to a 28 Mb region on Xp11.2–X13.3. Exome sequencing of one affected individual identified five rare variants in this region, of which none was missense, nonsense or frame shift. Using patient-derived cells, we tested the effect of these variants on expression/splicing of the relevant genes. A synonymous variant in ATP6AP2, c.345C>T (p.S115S), markedly increased exon 4 skipping, resulting in the overexpression of a minor splice isoform that produces a protein with internal deletion of 32 amino acids in up to 50% of the total pool, with concomitant reduction of isoforms containing exon 4. ATP6AP2 is an essential accessory component of the vacuolar ATPase required for lysosomal degradative functions and autophagy, a pathway frequently affected in Parkinson's disease. Reduction of the full-size ATP6AP2 transcript in XPDS cells and decreased level of ATP6AP2 protein in XPDS brain may compromise V-ATPase function, as seen with siRNA knockdown in HEK293 cells, and may ultimately be responsible for the pathology. Another synonymous mutation in the same exon, c.321C>T (p.D107D), has a similar molecular defect of exon inclusion and causes X-linked mental retardation Hedera type (MRXSH). Mutations in XPDS and MRXSH alter binding sites for different splicing factors, which may explain the marked differences in age of onset and manifestations.
PMCID: PMC3723311  PMID: 23595882
8.  Exome Sequencing Identifies SMAD3 Mutations as a Cause of Familial Thoracic Aortic Aneurysm and Dissection with Intracranial and Other Arterial Aneurysms 
Circulation research  2011;109(6):680-686.
Thoracic aortic aneurysms leading to acute aortic dissections (TAAD) can be inherited in families in an autosomal dominant manner. As part of the spectrum of clinical heterogeneity of familial TAAD, we recently described families with multiple members that had TAAD and intracranial aneurysms or TAAD and intracranial and abdominal aortic aneurysms inherited in an autosomal dominant manner.
To identify the causative mutation in a large family with autosomal dominant inheritance of TAAD with intracranial and abdominal aortic aneurysms by performing exome sequencing of two distantly related individuals with TAAD and identifying shared rare variants.
Methods and Results
A novel frame shift mutation, p.N218fs (c.652delA), was identified in the SMAD3 gene and segregated with the vascular diseases in this family with a LOD score of 2.52. Sequencing of 181 probands with familial TAAD identified three additional SMAD3 mutations in 4 families, p.R279K (c.836G>A), p.E239K (c.715G>A), and p.A112V (c.235C>T) resulting in a combined LOD score of 5.21. These four mutations were notably absent in 2300 control exomes. SMAD3 mutations were recently described in patients with Aneurysms Osteoarthritis Syndrome and some of the features of this syndrome were identified in individuals in our cohort, but these features were notably absent in many SMAD3 mutation carriers.
SMAD3 mutations are responsible for 2% of familial TAAD. Mutations are found in families with TAAD alone, along with families with TAAD, intracranial aneurysms, aortic and bilateral iliac aneurysms segregating in an autosomal dominant manner.
PMCID: PMC4115811  PMID: 21778426
thoracic aortic aneurysm and dissection; intracranial aneurysm; arterial aneurysms; SMAD3
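For readers unfamiliar with LOD scores: a LOD score is the base-10 logarithm of the odds in favor of linkage, so it converts to odds by exponentiation. The sketch below applies that to the two scores quoted in the abstract. It assumes, as is conventional but not stated in the abstract, that scores from independent families add and that the combined score of 5.21 includes the first family's 2.52; the function name is illustrative.

```python
def lod_to_odds(lod):
    """Convert a LOD score (log10 of the odds for linkage) to odds."""
    return 10.0 ** lod


first_family_lod = 2.52   # reported for the large index family
combined_lod = 5.21       # reported after adding the SMAD3-positive families

# Assumption: LOD scores from independent families are additive,
# so the remaining families contribute the difference.
additional_lod = round(combined_lod - first_family_lod, 2)

print(f"odds for first family: about {lod_to_odds(first_family_lod):.0f} to 1")
print(f"odds for combined evidence: about {lod_to_odds(combined_lod):.0f} to 1")
print(f"LOD contributed by additional families: {additional_lod}")
```

A LOD of 3, the traditional threshold for declaring linkage, corresponds to odds of 1000 to 1.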
9.  TGFB2 loss of function mutations cause familial thoracic aortic aneurysms and acute aortic dissections associated with mild systemic features of the Marfan syndrome 
Nature genetics  2012;44(8):916-921.
A predisposition for thoracic aortic aneurysms leading to acute aortic dissections can be inherited in families in an autosomal dominant manner. Genome-wide linkage analysis of two large unrelated families with thoracic aortic disease, followed by whole exome sequencing of affected relatives, identified causative mutations in TGFB2. These mutations, a frameshift mutation in exon 6 and a nonsense mutation in exon 4, segregated with disease with a combined LOD score of 7.7. Sanger sequencing of 276 probands from families with inherited thoracic aortic disease identified two additional TGFB2 mutations. TGFB2 encodes the transforming growth factor beta-2 (TGF-β2) and the mutations are predicted to cause haploinsufficiency for TGFB2, but aortic tissue from cases paradoxically shows increased TGF-β2 expression and immunostaining. Thus, haploinsufficiency of TGFB2 predisposes to thoracic aortic disease, suggesting the initial pathway driving disease is decreased cellular TGF-β2 levels leading to a secondary increase in TGF-β2 production in the diseased aorta.
PMCID: PMC4033668  PMID: 22772371
10.  Exome Sequencing Implicates an Increased Burden of Rare Potassium Channel Variants in the Risk of Drug Induced Long QT Syndrome 
To test the hypothesis that rare variants are associated with Drug-induced long QT syndrome (diLQTS) and torsade de pointes (TdP).
diLQTS is associated with the potentially fatal arrhythmia TdP. The contribution of rare genetic variants to the underlying genetic framework predisposing diLQTS has not been systematically examined.
We performed whole exome sequencing (WES) on 65 diLQTS cases and 148 drug-exposed controls of European descent. We employed rare variant analyses (variable threshold [VT] and sequence kernel association test [SKAT]) and gene-set analyses to identify genes enriched with rare amino-acid coding (AAC) variants associated with diLQTS. Significant associations were reanalyzed by comparing diLQTS cases to 515 ethnically matched controls from the NHLBI GO Exome Sequencing Project (ESP).
Rare variants in 7 genes were enriched in the diLQTS cases according to SKAT or VT compared to drug exposed controls (p<0.001). Of these, we replicated the diLQTS associations for KCNE1 and ACN9 using 515 ESP controls (p<0.05). A total of 37% of the diLQTS cases also had ≥1 rare AAC variant, as compared to 21% of controls (p=0.009), in a predefined set of seven congenital LQTS (cLQTS) genes encoding potassium channels or channel modulators (KCNE1, KCNE2, KCNH2, KCNJ2, KCNJ5, KCNQ1, AKAP9).
By combining WES with aggregated rare variant analyses, we implicate rare variants in KCNE1 and ACN9 as risk factors for diLQTS. Moreover, diLQTS cases were more burdened by rare AAC variants in cLQTS genes encoding potassium channel modulators, supporting the idea that multiple rare variants, notably across cLQTS genes, predispose to diLQTS.
PMCID: PMC4018823  PMID: 24561134
exome; torsade de pointes; long QT syndrome; genetics; adverse drug event
11.  A statin-dependent QTL for GATM expression is associated with statin-induced myopathy 
Nature  2013;502(7471):377-380.
Statins are widely prescribed for lowering plasma low-density lipoprotein (LDL) concentrations and cardiovascular disease risk [1], but there is considerable interindividual variation in treatment response [2,3] and increasing concern regarding the potential for adverse effects, including myopathy [4] and type 2 diabetes [5]. Despite evidence for substantial genetic influence on LDL concentrations [6], pharmacogenomic trials have failed to identify genetic variations with large effects on either statin efficacy [7-9] or toxicity [10], and have yielded little information regarding mechanisms that modulate statin response. Here we identify a downstream target of statin treatment by screening for the effects of in vitro statin exposure on genetic associations with gene expression levels in lymphoblastoid cell lines derived from 480 participants of a clinical trial of simvastatin treatment [7]. This analysis identified six expression quantitative trait loci (eQTLs) that interacted with simvastatin exposure including rs9806699, a cis-eQTL for the gene GATM that encodes glycine amidinotransferase, a rate-limiting enzyme in creatine synthesis. We found this locus to be associated with incidence of statin-induced myotoxicity in two separate populations (meta-analysis odds ratio = 0.60, 95% confidence interval = 0.45-0.81, P = 6.0×10⁻⁴). Furthermore, we found that GATM knockdown in hepatocyte-derived cell lines attenuated transcriptional response to sterol depletion, demonstrating that GATM may act as a functional link between statin-mediated cholesterol lowering and susceptibility to statin-induced myopathy.
PMCID: PMC3933266  PMID: 23995691
12.  Exome Sequencing and Genome-Wide Linkage Analysis in 17 Families Illustrates the Complex Contribution of TTN Truncating Variants to Dilated Cardiomyopathy 
Circulation. Cardiovascular genetics  2013;6(2):10.1161/CIRCGENETICS.111.000062.
Familial dilated cardiomyopathy is a genetically heterogeneous disease with >30 known genes. TTN truncating variants were recently implicated in a candidate gene study to cause 25% of familial and 18% of sporadic dilated cardiomyopathy (DCM) cases.
Methods and Results
We used an unbiased genome-wide approach employing both linkage analysis and variant filtering across the exome sequences of 48 individuals affected with DCM from 17 families to identify genetic cause. Linkage analysis ranked the TTN region as falling under the second highest genome-wide multipoint linkage peak, MLOD 1.59. We identified six TTN truncating variants carried by individuals affected with DCM in 7 of 17 DCM families (LOD 2.99); 2 of these 7 families also had novel missense variants that segregated with disease. Two additional novel truncating TTN variants did not segregate with DCM. Nucleotide diversity at the TTN locus, including missense variants, was comparable to five other known DCM genes. The average number of missense variants in the exome sequences from the DCM cases or the ~5,400 cases from the Exome Sequencing Project was ~23 per individual. The average number of TTN truncating variants in the Exome Sequencing Project was 0.014 per individual. We also identified a region (chr9q21.11-q22.31) with no known DCM genes with a maximum heterogeneity LOD score of 1.74.
These data suggest that TTN truncating variants contribute to DCM cause. However, the lack of segregation of all identified TTN truncating variants illustrates the challenge of determining variant pathogenicity even with full exome sequencing.
PMCID: PMC3815606  PMID: 23418287
genetics; human; genome-wide analysis; dilated cardiomyopathy; exome
13.  Characterization of Statin Dose-response within Electronic Medical Records 
Efforts to define the genetic architecture underlying variable statin response have met with limited success, possibly because previous studies were limited to effects based on a single dose. We leveraged electronic medical records (EMRs) to extract potency (ED50) and efficacy (Emax) of statin dose-response curves and tested them for association with 144 pre-selected variants. Two large biobanks were used to construct dose-response curves for 2,026 (simvastatin) and 2,252 subjects (atorvastatin). Atorvastatin was more efficacious, more potent, and demonstrated less inter-individual variability than simvastatin. A pharmacodynamic variant emerging from randomized trials (PRDM16) was associated with Emax for both. For atorvastatin, Emax was 51.7 mg/dl in those homozygous for the minor allele versus 75.0 mg/dl for those homozygous for the major allele. We also identified several loci associated with ED50. The extraction of rigorously defined traits from EMRs for pharmacogenetic studies represents a promising approach to further the understanding of genetic factors contributing to drug response.
PMCID: PMC3944214  PMID: 24096969
14.  Novel Rare Variants in Congenital Cardiac Arrhythmia Genes are Frequent in Drug-induced Torsades de Pointes 
The pharmacogenomics journal  2012;13(4):325-329.
Marked prolongation of the QT interval and polymorphic ventricular tachycardia following medication (drug-induced long QT syndrome, diLQTS) is a severe adverse drug reaction (ADR) that phenocopies congenital long QT syndrome (cLQTS) and is one of the leading causes for drug withdrawal and relabeling. We evaluated the frequency of rare non-synonymous variants in genes contributing to the maintenance of heart rhythm in cases of diLQTS using targeted capture coupled to next generation sequencing. Eleven of 31 diLQTS subjects (36%) carried a novel missense mutation in genes with known congenital arrhythmia associations or a known cLQTS mutation. In the 26 Caucasian subjects, 23% carried a highly conserved rare variant predicted to be deleterious to protein function in these genes compared with only 2-4% in public databases (p < 0.003). We conclude that rare variation in genes responsible for congenital arrhythmia syndromes is frequent in diLQTS. Our findings demonstrate that diLQTS is a pharmacogenomic syndrome predisposed by rare genetic variants.
PMCID: PMC3422407  PMID: 22584458
pharmacogenomics; sudden cardiac death; adverse drug reaction; next generation sequencing
15.  Utilizing Graph Theory to Select the Largest Set of Unrelated Individuals for Genetic Analysis 
Genetic epidemiology  2012;37(2):136-141.
Many statistical analyses of genetic data rely on the assumption of independence among samples. Consequently, relatedness is either modeled in the analysis or samples are removed to “clean” the data of any pairwise relatedness above a tolerated threshold. Current methods do not maximize the number of unrelated individuals retained for further analysis, and this is a needless loss of resources. We report a novel application of graph theory that identifies the maximum set of unrelated samples in any dataset given a user-defined threshold of relatedness as well as all networks of related samples. We have implemented this method into an open source program called Pedigree Reconstruction and Identification of a Maximum Unrelated Set, PRIMUS. We show that PRIMUS outperforms the three existing methods, allowing researchers to retain up to 50% more unrelated samples. A unique strength of PRIMUS is its ability to weight the maximum clique selection using additional criteria (e.g. affected status and data missingness). PRIMUS is a permanent solution to identifying the maximum number of unrelated samples for a genetic analysis.
PMCID: PMC3770842  PMID: 22996348
genome-wide association study; Bron–Kerbosch; cryptic relatedness; bioinformatics; sample selection
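The idea described in this abstract can be illustrated on a toy dataset. The sketch below is not PRIMUS itself: it assumes one straightforward formulation in which an edge joins every pair of samples whose relatedness is at or below the threshold, so that the largest mutually unrelated set is a maximum clique, found here with Bron–Kerbosch with pivoting (the algorithm named in the keywords). The function names, sample IDs, relatedness values, and threshold are all hypothetical.

```python
from itertools import combinations


def unrelated_graph(samples, relatedness, threshold):
    """Adjacency sets linking pairs whose relatedness is at or below threshold."""
    adj = {s: set() for s in samples}
    for a, b in combinations(samples, 2):
        # Pairs missing from the table are treated as unrelated (0.0).
        r = relatedness.get((a, b), relatedness.get((b, a), 0.0))
        if r <= threshold:
            adj[a].add(b)
            adj[b].add(a)
    return adj


def maximum_clique(adj):
    """Largest clique via Bron-Kerbosch with pivoting and a size bound."""
    best = set()

    def expand(r, p, x):
        nonlocal best
        if not p and not x:
            if len(r) > len(best):
                best = set(r)
            return
        if len(r) + len(p) <= len(best):
            return  # this branch cannot beat the best clique found so far
        pivot = max(p | x, key=lambda v: len(adj[v] & p))
        for v in list(p - adj[pivot]):
            expand(r | {v}, p & adj[v], x & adj[v])
            p = p - {v}
            x = x | {v}

    expand(set(), set(adj), set())
    return best


# Hypothetical data: A-B and B-C are related pairs, as are D-E.
samples = ["A", "B", "C", "D", "E"]
relatedness = {("A", "B"): 0.25, ("B", "C"): 0.25, ("D", "E"): 0.5}
unrelated = maximum_clique(unrelated_graph(samples, relatedness, threshold=0.1))
print(sorted(unrelated))
```

Dropping B keeps both A and C, whereas a naive rule that removes one member of each related pair in turn could discard A and C and retain fewer samples.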
16.  “Mandibulofacial Dysostosis with Microcephaly” Caused by EFTUD2 Mutations: Expanding the Phenotype 
Heterozygous mutations in EFTUD2 were identified in 12 individuals with a rare sporadic craniofacial condition termed Mandibulofacial dysostosis with microcephaly (MIM 610536). We present clinical and radiographic features of three additional patients with de novo heterozygous mutations in EFTUD2. Although clinical features overlap with findings of the original report (choanal atresia, cleft palate, maxillary and mandibular hypoplasia, and microtia), microcephaly was present in two of three patients and cognitive impairment was milder in those with head circumference proportional to height. Our cases expand the phenotypic spectrum to include epibulbar dermoids and zygomatic arch clefting. We suggest that craniofacial computed tomography studies to assess cleft of zygomatic arch may assist in making this diagnosis. We recommend consideration of EFTUD2 testing in individuals with features of oculo-auriculo-vertebral spectrum and bilateral microtia, or individuals with atypical CHARGE syndrome who do not have a CHD7 mutation, particularly those with a zygomatic arch cleft. The absence of microcephaly in one patient indicates that it is a highly variable phenotypic feature.
PMCID: PMC3535578  PMID: 23239648
craniofacial development; EFTUD2; epibulbar dermoid; craniofacial microsomia; oculo-auriculo-vertebral spectrum (OAVS); choanal atresia
17.  Evolution and Functional Impact of Rare Coding Variation from Deep Sequencing of Human Exomes 
Science (New York, N.Y.)  2012;337(6090):64-69.
As a first step toward understanding how rare variants contribute to risk for complex diseases, we sequenced 15,585 human protein-coding genes to an average median depth of 111× in 2440 individuals of European (n = 1351) and African (n = 1088) ancestry. We identified over 500,000 single-nucleotide variants (SNVs), the majority of which were rare (86% with a minor allele frequency less than 0.5%), previously unknown (82%), and population-specific (82%). On average, 2.3% of the 13,595 SNVs each person carried were predicted to affect protein function of ∼313 genes per genome, and ∼95.7% of SNVs predicted to be functionally important were rare. This excess of rare functional variants is due to the combined effects of explosive, recent accelerated population growth and weak purifying selection. Furthermore, we show that large sample sizes will be required to associate rare variants with complex traits.
PMCID: PMC3708544  PMID: 22604720
18.  Analysis of 6,515 exomes reveals a recent origin of most human protein-coding variants 
Nature  2012;493(7431):216-220.
Establishing the age of each mutation segregating in contemporary human populations is important to fully understand our evolutionary history [1,2] and will help facilitate the development of new approaches for disease gene discovery [3]. Large-scale surveys of human genetic variation have reported signatures of recent explosive population growth [4-6], notable for an excess of rare genetic variants, qualitatively suggesting that many mutations arose recently. To more quantitatively assess the distribution of mutation ages, we resequenced 15,336 genes in 6,515 individuals of European (n=4,298) and African (n=2,217) American ancestry and inferred the age of 1,146,401 autosomal single nucleotide variants (SNVs). We estimate that ~73% of all protein-coding SNVs and ~86% of SNVs predicted to be deleterious arose in the past 5,000-10,000 years. The average age of deleterious SNVs varied significantly across molecular pathways, and disease genes contained a significantly higher proportion of recently arisen deleterious SNVs compared to other genes. Furthermore, European Americans had an excess of deleterious variants in essential and Mendelian disease genes compared to African Americans, consistent with weaker purifying selection due to the out-of-Africa dispersal. Our results better delimit the historical details of human protein-coding variation, illustrate the profound effect recent human history has had on the burden of deleterious SNVs segregating in contemporary populations, and provide important practical information that can be used to prioritize variants in disease gene discovery.
PMCID: PMC3676746  PMID: 23201682
19.  Variation in the TLR10/TLR1/TLR6 Locus is the Major Genetic Determinant of Inter-Individual Difference in TLR1/2-Mediated Responses 
Genes and immunity  2012;14(1):52-57.
Toll-like receptor (TLR)-mediated innate immune responses are important in early host defense. Using a candidate gene approach, we previously identified genetic variation within TLR1 that is associated with hyper-responsiveness to a TLR1/2 agonist in vitro and with death and organ dysfunction in patients with sepsis. Here we report a genome-wide association study designed to identify genetic loci controlling whole blood cytokine responses to the TLR1/2 lipopeptide agonist, Pam3CSK4 ex vivo. We identified a very strong association (p < 1×10⁻²⁷) between genetic variation within the TLR10/1/6 locus on chromosome 4, and Pam3CSK4-induced cytokine responses. This was the predominant association explaining over 35% of the population variance for this phenotype. Notably, strong associations were observed within TLR10 suggesting genetic variation in TLR10 may influence bacterial lipoprotein-induced responses. These findings establish the TLR10/1/6 locus as the dominant common genetic factor controlling inter-individual variability in Pam3CSK4-induced whole blood responses in the healthy population.
PMCID: PMC3554851  PMID: 23151486
TLR; polymorphism; genomics; innate immunity
21.  Autosomal Dominant Familial Dyskinesia and Facial Myokymia: Single Exome Sequencing Identifies a Mutation in Adenylate Cyclase 5 
Archives of neurology  2012;69(5):630-635.
Familial dyskinesia with facial myokymia (FDFM) is an autosomal dominant disorder that is exacerbated by anxiety. In a five-generation family of German ancestry we previously mapped FDFM to chromosome 3p21-3q21. The 72.5 Mbp linkage region was too large for traditional positional mutation identification.
To identify the gene responsible for FDFM by exome resequencing of a single affected individual.
Design, Setting and Participants
We performed whole exome sequencing in one affected individual and used a series of bioinformatic filters, including functional significance and presence in dbSNP or 1000 Genomes project, to reduce the number of candidate variants. Co-segregation analysis was performed in 15 additional individuals in three generations.
The exome contained 23428 single nucleotide variants, of which 9391 were missense, nonsense or splice site alterations. The critical region contained 323 variants, five of which were not present in one of the sequence-databases. Adenylate cyclase 5 (ADCY5) was the only gene in which the variant (c.2176G>A) was co-transmitted perfectly with disease status and was not present in 3510 control Caucasian exomes. This residue is highly conserved and the change is nonconservative and predicted to be damaging.
ADCY5 is highly expressed in striatum. Mice deficient in Adcy5 develop a movement disorder that is worsened by stress. We conclude that FDFM likely results from a missense mutation in ADCY5. This study demonstrates the power of a single exome sequence in combination with linkage information to identify causative genes for rare autosomal dominant Mendelian diseases.
PMCID: PMC3508680  PMID: 22782511
22.  Evaluating pathogenicity of rare variants from dilated cardiomyopathy in the exome era 
Human exome sequencing is a recently developed tool to aid in the discovery of novel coding variants. Now broadly applied, exome sequencing datasets provide a novel opportunity to evaluate the allele frequencies of previously published pathogenic rare variants.
Methods and Results
We examined the exome dataset from the NHLBI Exome Sequencing Project (ESP) and compared this dataset with a catalog of 197 previously published rare variants reported as causative of dilated cardiomyopathy (DCM) from familial and sporadic cases. Of these 197, 33 (16.8%) were also present in the ESP database, raising the question of whether they were uncommon polymorphisms. Supporting functional data have been published for 14 of the 33 (42%), suggesting they are unlikely to be false positives. The frequencies of these functional variants in the ESP dataset ranged from 0.02–1.33% (median 0.04%), which when applied as a cut-off to filter variants in a DCM pedigree identified an additional DCM candidate gene. A greater proportion of sporadic DCM cases had variants that were present in the ESP dataset vs novel variants (i.e. not in ESP; 44% vs 21%, p=0.002), suggesting some of the variants identified as disease causing in sporadic DCM are either false positives or low penetrance alleles in human populations.
Rare nonsynonymous variants identified in DCM subjects that are also present at very low frequencies in public databases are likely relevant for DCM. Allele frequencies >0.04% are of less certain pathogenicity, especially if identified in sporadic cases, although this cut-off should be viewed as preliminary.
PMCID: PMC3332064  PMID: 22337857
cardiomyopathy; genetics; genes
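The proportions reported in the dilated cardiomyopathy abstract above can be rechecked with simple arithmetic. The following is a minimal sketch, not the authors' analysis: the counts (197, 33, 14) and the 0.04% cut-off come from the abstract, while the variant names and their frequencies are hypothetical placeholders chosen only to illustrate applying the cut-off.

```python
# Counts quoted in the abstract.
PUBLISHED_VARIANTS = 197      # previously published DCM variants
IN_ESP = 33                   # of those, also present in the ESP dataset
WITH_FUNCTIONAL_DATA = 14     # of the 33, with supporting functional data


def percent(part, whole, digits=1):
    """Return part/whole as a percentage rounded to `digits` decimals."""
    return round(100.0 * part / whole, digits)


share_in_esp = percent(IN_ESP, PUBLISHED_VARIANTS)            # 33 of 197
share_functional = percent(WITH_FUNCTIONAL_DATA, IN_ESP, 0)   # 14 of 33

# Preliminary cut-off from the abstract, expressed in percent.
CUTOFF_PERCENT = 0.04

# Hypothetical variants and allele frequencies (percent), for illustration only.
example_variants = {"variant_a": 0.02, "variant_b": 0.04, "variant_c": 1.33}


def at_or_below_cutoff(frequencies, cutoff=CUTOFF_PERCENT):
    """Keep variants whose allele frequency does not exceed the cut-off."""
    return sorted(name for name, freq in frequencies.items() if freq <= cutoff)


if __name__ == "__main__":
    print(share_in_esp, share_functional)
    print(at_or_below_cutoff(example_variants))
```

The filter keeps frequencies equal to the cut-off because the abstract flags only frequencies above 0.04% as less certain.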
23.  Sporadic autism exomes reveal a highly interconnected protein network of de novo mutations 
Nature  2012;485(7397):246-250.
It is well established that autism spectrum disorders (ASD) have a strong genetic component. However, for at least 70% of cases, the underlying genetic cause is unknown1. Under the hypothesis that de novo mutations underlie a substantial fraction of the risk for developing ASD in families with no previous history of ASD or related phenotypes (so-called sporadic or simplex families2,3), we sequenced all coding regions of the genome, i.e. the exome, for parent-child trios exhibiting sporadic ASD, including 189 new trios and 20 previously reported4. We also sequenced the exomes of 50 unaffected siblings corresponding to these new (n = 31) and previously reported trios (n = 19)4, for a total of 677 individual exomes from 209 families. Here we show that de novo point mutations are overwhelmingly paternal in origin (4:1 bias) and positively correlated with paternal age, consistent with the modest increased risk for children of older fathers to develop ASD5. Moreover, 39% (49/126) of the most severe or disruptive de novo mutations map to a highly interconnected beta-catenin/chromatin remodeling protein network ranked significantly for autism candidate genes. In proband exomes, recurrent protein-altering mutations were observed in two genes, CHD8 and NTNG1. Mutation screening of six candidate genes in 1,703 ASD probands identified additional de novo, protein-altering mutations in GRIN2B, LAMC3, and SCN1A. Combined with copy number variant (CNV) data, these results suggest extreme locus heterogeneity but also provide a target for future discovery, diagnostics, and therapeutics.
PMCID: PMC3350576  PMID: 22495309
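The sample counts in the autism exome abstract above are internally consistent, as a short worked check shows. This sketch assumes each trio contributes exactly three exomes (two parents and one proband); that assumption is ours, not stated in the abstract, though it reproduces the reported total.

```python
# Figures quoted in the abstract.
new_trios = 189
previously_reported_trios = 20
siblings_new = 31
siblings_previous = 19

# Assumption: one trio = two parents + one proband = three exomes.
EXOMES_PER_TRIO = 3

families = new_trios + previously_reported_trios
siblings = siblings_new + siblings_previous
total_exomes = families * EXOMES_PER_TRIO + siblings

# Share of the most severe de novo mutations mapping to the network (49 of 126).
network_share_percent = round(100 * 49 / 126)

if __name__ == "__main__":
    print(families, siblings, total_exomes, network_share_percent)
```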
24.  Results of Genome-Wide Analyses on Neurodevelopmental Phenotypes at Four-Year Follow-Up following Cardiac Surgery in Infancy 
PLoS ONE  2012;7(9):e45936.
Adverse neurodevelopmental sequelae are reported among children who undergo early cardiac surgery to repair congenital heart defects (CHD). APOE genotype has previously been determined to contribute to the prediction of these outcomes. Understanding further genetic causes for the development of poor neurobehavioral outcomes should enhance patient risk stratification and improve both prevention and treatment strategies.
We performed a prospective observational study of children who underwent cardiac surgery before six months of age; this included a neurodevelopmental evaluation between their fourth and fifth birthdays. Attention and behavioral skills were assessed through parental report utilizing the Attention Deficit-Hyperactivity Disorder-IV scale preschool edition (ADHD-IV), and Child Behavior Checklist (CBCL/1.5-5), respectively. Of the seven investigated, three neurodevelopmental phenotypes met genomic quality control criteria. Linear regression was performed to determine the effect of genome-wide genetic variation on these three neurodevelopmental measures in 316 subjects.
This genome-wide association study identified single nucleotide polymorphisms (SNPs) associated with three neurobehavioral phenotypes in the postoperative children: ADHD-IV Impulsivity/Hyperactivity, CBCL/1.5-5 PDPs, and CBCL/1.5-5 Total Problems. The most predictive SNPs for each phenotype were: a LGALS8 intronic SNP, rs4659682, associated with ADHD-IV Impulsivity (P = 1.03×10^−6); a PCSK5 intronic SNP, rs2261722, associated with CBCL/1.5-5 PDPs (P = 1.11×10^−6); and an intergenic SNP, rs11617488, 50 kb from FGF9, associated with CBCL/1.5-5 Total Problems (P = 3.47×10^−7). Ten SNPs (3 for ADHD-IV Impulsivity, 5 for CBCL/1.5-5 PDPs, and 2 for CBCL/1.5-5 Total Problems) had p<10^−5.
No SNPs met genome-wide significance for our three neurobehavioral phenotypes; however, 10 SNPs reached a threshold for suggestive significance (p<10^−5). Given the unique nature of this cohort, larger studies and/or replication are not possible. Studies to further investigate the mechanisms through which these newly identified genes may influence neurodevelopmental dysfunction are warranted.
PMCID: PMC3457986  PMID: 23049896
25.  Polymorphisms in the ICAM1 gene predict circulating soluble intercellular adhesion molecule-1 (sICAM-1) 
Atherosclerosis  2011;216(2):390-394.
Polymorphisms within the ICAM1 structural gene have been shown to influence circulating levels of soluble intercellular adhesion molecule-1 (sICAM-1) but their relation to atherosclerosis has not been clearly established. We sought to determine whether ICAM1 SNPs are associated with circulating sICAM-1 concentration, coronary artery calcium (CAC), and common and internal carotid intima medial thickness (IMT).
Methods and Results
3,550 black and white Coronary Artery Risk Development in Young Adults (CARDIA) Study subjects who participated in the year 15 and/or 20 examinations and were part of the Young Adult Longitudinal Study of Antioxidants (YALTA) ancillary study were included in this analysis. In whites, rs5498 was significantly associated with sICAM-1 (p < 0.001) and each G-allele of rs5498 was associated with 5% higher sICAM-1 concentration. In blacks, each C-allele of rs5490 was associated with 6% higher sICAM-1 level; this SNP was in strong linkage disequilibrium with rs5491, a functional variant. Subclinical measurements of atherosclerosis in either year 15 or year 20 were not significantly related to ICAM1 SNPs.
In CARDIA, ICAM1 DNA segment variants were associated with sICAM-1 protein level including the novel finding that levels differ by the functional variant rs5491. However, ICAM1 SNPs were not strongly related to either IMT or CAC. Our findings in CARDIA suggest that ICAM1 variants are not major early contributors to subclinical atherosclerosis.
PMCID: PMC3402038  PMID: 21392767
cell adhesion molecules; atherosclerosis; coronary calcium; genetics; inflammation
