1.  A Genome-Wide Assessment of the Role of Untagged Copy Number Variants in Type 1 Diabetes 
PLoS Genetics  2014;10(5):e1004367.
Genome-wide association studies (GWAS) for type 1 diabetes (T1D) have successfully identified more than 40 independent T1D associated tagging single nucleotide polymorphisms (SNPs). However, owing to technical limitations of copy number variant (CNV) genotyping assays, the assessment of the role of CNVs has been limited to the subset of these in high linkage disequilibrium with tag SNPs. The contribution of untagged CNVs, often multi-allelic and difficult to genotype using existing assays, to the heritability of T1D remains an open question. To investigate this issue, we designed a custom comparative genomic hybridization array (aCGH) to assay untagged CNV loci identified from a variety of sources. To overcome the technical limitations of the case control design for this class of CNVs, we genotyped the Type 1 Diabetes Genetics Consortium (T1DGC) family resource (representing 3,903 transmissions from parents to affected offspring) and used an association testing strategy that does not necessitate obtaining discrete genotypes. Our design targeted 4,309 CNVs, of which 3,410 passed stringent quality control filters. As a positive control, the scan confirmed the known T1D association at the INS locus by direct typing of the 5′ variable number of tandem repeat (VNTR) locus. Our results clarify the fact that the disease association is indistinguishable from the two main polymorphic allele classes of the INS VNTR, class I and class III. We also identified novel technical artifacts resulting in spurious associations at the somatically rearranging loci, T cell receptor, TCRA/TCRD and TCRB, and Immunoglobulin heavy chain, IGH, loci on chromosomes 14q11.2, 7q34 and 14q32.33, respectively. However, our data did not identify novel T1D loci. Our results do not support a major role of untagged CNVs in T1D heritability.
Author Summary
For many complex traits, and in particular type 1 diabetes (T1D), the genome-wide association study (GWAS) design has been successful at detecting a large number of loci that contribute to disease risk. However, in the case of T1D as well as almost all other traits, the sum of these loci does not fully explain the heritability estimated from familial studies. This observation raises the possibility that additional variants exist but have not yet been found because they have not effectively been targeted by the GWAS design. Here, we focus on a specific class of large deletions/duplications called copy number variants (CNVs), and more precisely on the subset of these loci that mutate rapidly and are highly polymorphic. A consequence of this high level of polymorphism is that these variants have typically not been captured by previous GWAS studies. We use a family based design that is optimized to capture these previously untested variants. We then perform a genome-wide scan to assess their contribution to T1D. Our scan was technically successful but did not identify novel associations. This suggests that little was missed by the GWAS strategy, and that the remaining heritability of T1D is most likely driven by a large number of variants, either rare or common, but with a small individual contribution to disease risk.
doi:10.1371/journal.pgen.1004367
PMCID: PMC4038470  PMID: 24875393
2.  DeNovoGear: de novo indel and point mutation discovery and phasing 
Nature methods  2013;10(10):985-987.
We present the DeNovoGear software for analyzing de novo mutations from familial and somatic tissue sequencing data. DeNovoGear uses likelihood-based error modeling to reduce the false positive rate of mutation discovery in exome analysis, and fragment information to identify the parental origin of germline mutations. We used our program to create a whole-genome de novo indel callset with a 95% validation rate, producing a direct estimate of the human germline indel mutation rate.
doi:10.1038/nmeth.2611
PMCID: PMC4003501  PMID: 23975140
3.  Cerebral organoids model human brain development and microcephaly 
Nature  2013;501(7467):10.1038/nature12517.
The complexity of the human brain has made it difficult to study many brain disorders in model organisms, and highlights the need for an in vitro model of human brain development. We have developed a human pluripotent stem cell-derived 3D organoid culture system, termed cerebral organoid, which develops various discrete though interdependent brain regions. These include cerebral cortex containing progenitor populations that organize and produce mature cortical neuron subtypes. Furthermore, cerebral organoids recapitulate features of human cortical development, namely characteristic progenitor zone organization with abundant outer radial glial stem cells. Finally, we use RNAi and patient-specific iPS cells to model microcephaly, a disorder that has been difficult to recapitulate in mice. We demonstrate premature neuronal differentiation in patient organoids, a defect that could explain the disease phenotype. Our data demonstrate that 3D organoids can recapitulate development and disease of even this most complex human tissue.
doi:10.1038/nature12517
PMCID: PMC3817409  PMID: 23995685
4.  The Rate of Nonallelic Homologous Recombination in Males Is Highly Variable, Correlated between Monozygotic Twins and Independent of Age 
PLoS Genetics  2014;10(3):e1004195.
Nonallelic homologous recombination (NAHR) between highly similar duplicated sequences generates chromosomal deletions, duplications and inversions, which can cause diverse genetic disorders. Little is known about interindividual variation in NAHR rates and the factors that influence this. We estimated the rate of deletion at the CMT1A-REP NAHR hotspot in sperm DNA from 34 male donors, including 16 monozygotic (MZ) co-twins (8 twin pairs) aged 24 to 67 years. The average NAHR rate was 3.5×10⁻⁵ with a seven-fold variation across individuals. Despite good statistical power to detect even a subtle correlation, we observed no relationship between age of unrelated individuals and the rate of NAHR in their sperm, likely reflecting the meiotic-specific origin of these events. We then estimated the heritability of deletion rate by calculating the intraclass correlation (ICC) within MZ co-twins, revealing a significant correlation between MZ co-twins (ICC = 0.784, p = 0.0039), with MZ co-twins being significantly more correlated than unrelated pairs. We showed that this heritability cannot be explained by variation in PRDM9, a known regulator of NAHR, or variation within the NAHR hotspot itself. We also did not detect any correlation between Body Mass Index (BMI), smoking status or alcohol intake and rate of NAHR. Our results suggest that other, as yet unidentified, genetic or environmental factors play a significant role in the regulation of NAHR and are responsible for the extensive variation in the population for the probability of fathering a child with a genomic disorder resulting from a pathogenic deletion.
Author Summary
Many genetic disorders are caused by deletions of specific regions of DNA in sperm or egg cells that go on to produce a child. This can occur through ectopic homologous recombination between highly similar segments of DNA at different positions within the genome. Little is known about the differences in rates of deletion between individuals or the factors that influence this. We analysed the rate of deletion at one such section of DNA in sperm DNA from 34 male donors, including 16 monozygotic co-twins. We observed a seven-fold variation in deletion rate across individuals. Deletion rate is significantly correlated between monozygotic co-twins, indicating that deletion rate is heritable. This heritability cannot be explained by age, any known genetic regulator of deletion rate, Body Mass Index, smoking status or alcohol intake. Our results suggest that other, as yet unidentified, genetic or environmental factors play a significant role in the regulation of deletion. These factors are responsible for the extensive variation in the population for the probability of fathering a child with a genomic disorder resulting from a pathogenic deletion.
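The abstract above reports an intraclass correlation (ICC) within monozygotic co-twins but does not say which ICC estimator was used. As an illustration only, the sketch below assumes a one-way random-effects ICC for pairs; the function name `twin_pair_icc` and the input layout are hypothetical, and no data from the study are reproduced.

```python
from statistics import mean


def twin_pair_icc(pairs):
    """One-way random-effects ICC for a list of (twin_a, twin_b) values.

    Assumption: this is the ICC(1,1) estimator with k = 2 measurements
    per pair; the study's exact estimator is not stated in the abstract.
    """
    n = len(pairs)
    if n < 2:
        raise ValueError("need at least two twin pairs")
    grand_mean = mean(value for pair in pairs for value in pair)
    pair_means = [mean(pair) for pair in pairs]
    # Between-pair mean square: k = 2 values per pair, n - 1 degrees of freedom.
    ms_between = 2 * sum((m - grand_mean) ** 2 for m in pair_means) / (n - 1)
    # Within-pair mean square: n * (k - 1) = n degrees of freedom.
    ms_within = sum(
        (x - m) ** 2 for pair, m in zip(pairs, pair_means) for x in pair
    ) / n
    if ms_between + ms_within == 0:
        raise ValueError("all values are identical; ICC is undefined")
    # ICC(1,1) = (MSB - MSW) / (MSB + (k - 1) * MSW) with k = 2.
    return (ms_between - ms_within) / (ms_between + ms_within)
```

Under this estimator, pairs whose members agree exactly give an ICC of 1, and pairs whose members differ as much as the whole sample does push the ICC toward or below 0.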
doi:10.1371/journal.pgen.1004195
PMCID: PMC3945173  PMID: 24603440
5.  Exome sequencing improves genetic diagnosis of structural fetal abnormalities revealed by ultrasound 
Human Molecular Genetics  2014;23(12):3269-3277.
The genetic etiology of non-aneuploid fetal structural abnormalities is typically investigated by karyotyping and array-based detection of microscopically detectable rearrangements, and submicroscopic copy-number variants (CNVs), which collectively yield a pathogenic finding in up to 10% of cases. We propose that exome sequencing may substantially increase the identification of underlying etiologies. We performed exome sequencing on a cohort of 30 non-aneuploid fetuses and neonates (along with their parents) with diverse structural abnormalities first identified by prenatal ultrasound. We identified candidate pathogenic variants with a range of inheritance models, and evaluated these in the context of detailed phenotypic information. We identified 35 de novo single-nucleotide variants (SNVs), small indels, deletions or duplications, of which three (accounting for 10% of the cohort) are highly likely to be causative. These are de novo missense variants in FGFR3 and COL2A1, and a de novo 16.8 kb deletion that includes most of OFD1. In five further cases (17%) we identified de novo or inherited recessive or X-linked variants in plausible candidate genes, which require additional validation to determine pathogenicity. Our diagnostic yield of 10% is comparable to, and supplementary to, the diagnostic yield of existing microarray testing for large chromosomal rearrangements and targeted CNV detection. The de novo nature of these events could enable couples to be counseled as to their low recurrence risk. This study outlines the way for a substantial improvement in the diagnostic yield of prenatal genetic abnormalities through the application of next-generation sequencing.
doi:10.1093/hmg/ddu038
PMCID: PMC4030780  PMID: 24476948
6.  Empirical research on the ethics of genomic research 
There is no universally accepted definition of what an incidental finding is [Wolf et al., 2008] and broadly speaking this could include variants of known and unknown clinical significance, variants linked to highly penetrant, serious, life-threatening conditions, non-paternity or ancestry data. For the purposes of our study, we have adopted a pragmatic distinction between ‘pertinent’ and ‘incidental’ findings as set out in this text. Whilst in the US definitions of incidental findings are becoming accepted in practice [Green et al., 2013] it is still not known how and whether these also apply elsewhere around the world.
doi:10.1002/ajmg.a.36067
PMCID: PMC3884757  PMID: 23813698
7.  DECIPHER: database for the interpretation of phenotype-linked plausibly pathogenic sequence and copy-number variation 
Nucleic Acids Research  2013;42(D1):D993-D1000.
The DECIPHER database (https://decipher.sanger.ac.uk/) is an accessible online repository of genetic variation with associated phenotypes that facilitates the identification and interpretation of pathogenic genetic variation in patients with rare disorders. Contributing to DECIPHER is an international consortium of >200 academic clinical centres of genetic medicine and ≥1600 clinical geneticists and diagnostic laboratory scientists. Information integrated from a variety of bioinformatics resources, coupled with visualization tools, provides a comprehensive set of tools to identify other patients with similar genotype–phenotype characteristics and highlights potentially pathogenic genes. In a significant development, we have extended DECIPHER from a database of just copy-number variants to allow upload, annotation and analysis of sequence variants such as single nucleotide variants (SNVs) and InDels. Other notable developments in DECIPHER include a purpose-built, customizable and interactive genome browser to aid combined visualization and interpretation of sequence and copy-number variation against informative datasets of pathogenic and population variation. We have also introduced several new features to our deposition and analysis interface. This article provides an update to the DECIPHER database, an earlier instance of which has been described elsewhere [Swaminathan et al. (2012) DECIPHER: web-based, community resource for clinical interpretation of rare variants in developmental disorders. Hum. Mol. Genet., 21, R37–R44].
doi:10.1093/nar/gkt937
PMCID: PMC3965078  PMID: 24150940
8.  DECIPHER: web-based, community resource for clinical interpretation of rare variants in developmental disorders 
Human Molecular Genetics  2012;21(R1):R37-R44.
Patients with developmental disorders often harbour sub-microscopic deletions or duplications that lead to a disruption of normal gene expression or perturbation in the copy number of dosage-sensitive genes. Clinical interpretation for such patients in isolation is hindered by the rarity and novelty of such disorders. The DECIPHER project (https://decipher.sanger.ac.uk) was established in 2004 as an accessible online repository of genomic and associated phenotypic data with the primary goal of aiding the clinical interpretation of rare copy-number variants (CNVs). DECIPHER integrates information from a variety of bioinformatics resources and uses visualization tools to identify potential disease genes within a CNV. A two-tier access system permits clinicians and clinical scientists to maintain confidential linked anonymous records of phenotypes and CNVs for their patients that, with informed consent, can subsequently be shared with the wider clinical genetics and research communities. Advances in next-generation sequencing technologies are making it practical and affordable to sequence the whole exome/genome of patients who display features suggestive of a genetic disorder. This approach enables the identification of smaller intragenic mutations including single-nucleotide variants that are not accessible even with high-resolution genomic array analysis. This article briefly summarizes the current status and achievements of the DECIPHER project and looks ahead to the opportunities and challenges of jointly analysing structural and sequence variation in the human genome.
doi:10.1093/hmg/dds362
PMCID: PMC3459644  PMID: 22962312
9.  Mutations in B4GALNT1 (GM2 synthase) underlie a new disorder of ganglioside biosynthesis 
Brain  2013;136(12):3618-3624.
Glycosphingolipids are ubiquitous constituents of eukaryotic plasma membranes, and their sialylated derivatives, gangliosides, are the major class of glycoconjugates expressed by neurons. Deficiencies in their catabolic pathways give rise to a large and well-studied group of inherited disorders, the lysosomal storage diseases. Although many glycosphingolipid catabolic defects have been defined, only one proven inherited disease arising from a defect in ganglioside biosynthesis is known. This disease, caused by defects in the first step of ganglioside biosynthesis (GM3 synthase), results in a severe epileptic disorder found at high frequency amongst the Old Order Amish. Here we investigated an unusual neurodegenerative phenotype, most commonly classified as a complex form of hereditary spastic paraplegia, present in families from Kuwait, Italy and the Old Order Amish. Our genetic studies identified mutations in B4GALNT1 (GM2 synthase), encoding the enzyme that catalyzes the second step in complex ganglioside biosynthesis, as the cause of this neurodegenerative phenotype. Biochemical profiling of glycosphingolipid biosynthesis confirmed a lack of GM2 in affected subjects in association with a predictable increase in levels of its precursor, GM3, a finding that will greatly facilitate diagnosis of this condition. With the description of two neurological human diseases involving defects in two sequentially acting enzymes in ganglioside biosynthesis, there is the real possibility that a previously unidentified family of ganglioside deficiency diseases exists. The study of patients and animal models of these disorders will pave the way for a greater understanding of the role gangliosides play in neuronal structure and function and provide insights into the development of effective treatment therapies.
doi:10.1093/brain/awt270
PMCID: PMC3859217  PMID: 24103911
ganglioside biosynthesis; B4GALNT1; Amish; SPG26; hereditary spastic paraplegia
11.  NDUFA4 Mutations Underlie Dysfunction of a Cytochrome c Oxidase Subunit Linked to Human Neurological Disease 
Cell Reports  2013;3(6):1795-1805.
Summary
The molecular basis of cytochrome c oxidase (COX, complex IV) deficiency remains genetically undetermined in many cases. Homozygosity mapping and whole-exome sequencing were performed in a consanguineous pedigree with isolated COX deficiency linked to a Leigh syndrome neurological phenotype. Unexpectedly, affected individuals harbored homozygous splice donor site mutations in NDUFA4, a gene previously assigned to encode a mitochondrial respiratory chain complex I (NADH:ubiquinone oxidoreductase) subunit. Western blot analysis of denaturing gels and immunocytochemistry revealed undetectable steady-state NDUFA4 protein levels, indicating that the mutation causes a loss-of-function effect in the homozygous state. Analysis of one- and two-dimensional blue-native polyacrylamide gels confirmed an interaction between NDUFA4 and the COX enzyme complex in control muscle, whereas the COX enzyme complex without NDUFA4 was detectable with no abnormal subassemblies in patient muscle. These observations support recent work in cell lines suggesting that NDUFA4 is an additional COX subunit and demonstrate that NDUFA4 mutations cause human disease. Our findings support reassignment of the NDUFA4 protein to complex IV and suggest that patients with unexplained COX deficiency should be screened for NDUFA4 mutations.
Highlights
•Mutations in NDUFA4, assigned to encode a complex I subunit, cause human COX deficiency
•Confirmed interaction between NDUFA4 and the COX holoenzyme in control muscle
•The COX holoenzyme without NDUFA4 is detectable with no abnormal subassemblies in patient muscle
•NDUFA4 is essential for complex IV activity, but is not required for assembly of the COX holoenzyme
Isolated cytochrome c oxidase (COX) deficiency is a frequent finding in human mitochondrial disease. Mutations in nuclear-encoded structural subunits are extremely rare, and, in many cases, the molecular basis remains undetermined. Recent evidence in cell lines has suggested that NDUFA4, previously assigned to encode a complex I subunit, actually encodes a structural component of COX. Hanna and colleagues now demonstrate that NDUFA4 mutations cause human COX deficiency, thus confirming NDUFA4 as a COX subunit that is essential for the enzyme’s activity.
doi:10.1016/j.celrep.2013.05.005
PMCID: PMC3701321  PMID: 23746447
12.  Quantifying single nucleotide variant detection sensitivity in exome sequencing 
BMC Bioinformatics  2013;14:195.
Background
The targeted capture and sequencing of genomic regions has rapidly demonstrated its utility in genetic studies. Inherent in this technology is considerable heterogeneity of target coverage and this is expected to systematically impact our sensitivity to detect genuine polymorphisms. To fully interpret the polymorphisms identified in a genetic study it is often essential to both detect polymorphisms and to understand where and with what probability real polymorphisms may have been missed.
Results
Using down-sampling of 30 deeply sequenced exomes and a set of gold-standard single nucleotide variant (SNV) genotype calls for each sample, we developed an empirical model relating the read depth at a polymorphic site to the probability of calling the correct genotype at that site. We find that measured sensitivity in SNV detection is substantially worse than that predicted from the naive expectation of sampling from a binomial. This calibrated model allows us to produce single nucleotide resolution SNV sensitivity estimates which can be merged to give summary sensitivity measures for any arbitrary partition of the target sequences (nucleotide, exon, gene, pathway, exome). These metrics are directly comparable between platforms and can be combined between samples to give “power estimates” for an entire study. We estimate a local read depth of 13X is required to detect the alleles and genotype of a heterozygous SNV 95% of the time, but only 3X for a homozygous SNV. At a mean on-target read depth of 20X, commonly used for rare disease exome sequencing studies, we predict 5–15% of heterozygous and 1–4% of homozygous SNVs in the targeted regions will be missed.
Conclusions
Non-reference alleles in the heterozygote state have a high chance of being missed when commonly applied read coverage thresholds are used despite the widely held assumption that there is good polymorphism detection at these coverage levels. Such alleles are likely to be of functional importance in population based studies of rare diseases, somatic mutations in cancer and explaining the “missing heritability” of quantitative traits.
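The Results above contrast measured sensitivity with "the naive expectation of sampling from a binomial". The sketch below illustrates that naive baseline only, not the paper's calibrated empirical model. It assumes each read at a heterozygous site shows the non-reference allele with probability 0.5, and that a call needs at least `min_alt_reads` supporting reads. Both the threshold and the function name are assumptions made for this example.

```python
from math import comb


def naive_het_sensitivity(depth, min_alt_reads=2, alt_fraction=0.5):
    """Probability of seeing at least `min_alt_reads` non-reference reads.

    Reads are modelled as Binomial(depth, alt_fraction). This is the naive
    expectation only; the abstract reports that real sensitivity is
    substantially worse than this kind of estimate.
    """
    if depth < 0 or min_alt_reads < 0:
        raise ValueError("depth and min_alt_reads must be non-negative")
    # Sum the probabilities of observing too few alternate-allele reads.
    p_too_few = sum(
        comb(depth, k) * alt_fraction**k * (1 - alt_fraction) ** (depth - k)
        for k in range(min(min_alt_reads, depth + 1))
    )
    return 1.0 - p_too_few
```

Evaluating the function over a range of depths gives an optimistic upper bound that can be compared against empirically measured sensitivity at the same depths.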
doi:10.1186/1471-2105-14-195
PMCID: PMC3695811  PMID: 23773188
13.  Human Spermatogenic Failure Purges Deleterious Mutation Load from the Autosomes and Both Sex Chromosomes, including the Gene DMRT1 
PLoS Genetics  2013;9(3):e1003349.
Gonadal failure, along with early pregnancy loss and perinatal death, may be an important filter that limits the propagation of harmful mutations in the human population. We hypothesized that men with spermatogenic impairment, a disease with unknown genetic architecture and a common cause of male infertility, are enriched for rare deleterious mutations compared to men with normal spermatogenesis. After assaying genomewide SNPs and CNVs in 323 Caucasian men with idiopathic spermatogenic impairment and more than 1,100 controls, we estimate that each rare autosomal deletion detected in our study multiplicatively changes a man's risk of disease by 10% (OR 1.10 [1.04–1.16], p<2×10⁻³), rare X-linked CNVs by 29% (OR 1.29 [1.11–1.50], p<1×10⁻³), and rare Y-linked duplications by 88% (OR 1.88 [1.13–3.13], p<0.03). By contrasting the properties of our case-specific CNVs with those of CNV callsets from cases of autism, schizophrenia, bipolar disorder, and intellectual disability, we propose that the CNV burden in spermatogenic impairment is distinct from the burden of large, dominant mutations described for neurodevelopmental disorders. We identified two patients with deletions of DMRT1, a gene on chromosome 9p24.3 orthologous to the putative sex determination locus of the avian ZW chromosome system. In an independent sample of Han Chinese men, we identified 3 more DMRT1 deletions in 979 cases of idiopathic azoospermia and none in 1,734 controls, and found none in an additional 4,519 controls from public databases. The combined results indicate that DMRT1 loss-of-function mutations are a risk factor and potential genetic cause of human spermatogenic failure (frequency of 0.38% in 1,306 cases and 0% in 7,754 controls, p = 6.2×10⁻⁵). Our study identifies other recurrent CNVs as potential causes of idiopathic azoospermia and generates hypotheses for directing future studies on the genetic basis of male infertility and IVF outcomes.
Author Summary
Infertility is a disease that prevents the transmission of DNA from one generation to the next, and consequently it has been difficult to study the genetics of infertility using classical human genetics methods. Now, new technologies for screening entire genomes for rare and patient-specific mutations are revolutionizing our understanding of reproductively lethal diseases. Here, we apply techniques for variation discovery to study a condition called azoospermia, the failure to produce sperm. Large deletions of the Y chromosome are the primary known genetic risk factor for azoospermia, and genetic testing for these deletions is part of the standard treatment for this condition. We have screened over 300 men with azoospermia for rare deletions and duplications, and find an enrichment of these mutations throughout the genome compared to unaffected men. Our results indicate that sperm production is affected by mutations beyond the Y chromosome and will motivate whole-genome analyses of larger numbers of men with impaired spermatogenesis. Our finding of an enrichment of rare deleterious mutations in men with poor sperm production also raises the possibility that the slightly increased rate of birth defects reported in children conceived by in vitro fertilization may have a genetic basis.
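The abstract above reports DMRT1 deletions at a frequency of 0.38% in 1,306 cases (five carriers: two plus three) and none in 7,754 controls, but it does not name the test behind the reported p-value. As a hedged illustration, the sketch below assumes a one-sided Fisher exact test. When every carrier is a case, that test reduces to a single hypergeometric term. The function name is hypothetical.

```python
from math import comb


def p_all_carriers_in_cases(carriers, n_cases, n_controls):
    """One-sided Fisher exact p-value when all carriers are cases.

    Under the null, carriers are a random draw from cases plus controls,
    so P(all carriers are cases) = C(n_cases, c) / C(n_cases + n_controls, c).
    Assumption: the study's actual test is not stated in the abstract.
    """
    if carriers > n_cases:
        raise ValueError("cannot have more carriers than cases")
    return comb(n_cases, carriers) / comb(n_cases + n_controls, carriers)


# Counts taken from the abstract: 5 carriers, 1,306 cases, 7,754 controls.
p_value = p_all_carriers_in_cases(5, 1306, 7754)
```

With these counts the result is of the same order as the p-value quoted in the abstract, which fits this kind of exact test but does not confirm it was the one used.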
doi:10.1371/journal.pgen.1003349
PMCID: PMC3605256  PMID: 23555275
14.  Harnessing genomics to identify environmental determinants of heritable disease 
Mutation research  2012;752(1):6-9.
Next-generation sequencing technologies can now be used to directly measure heritable de novo DNA sequence mutations in humans. However, these techniques have not been used to examine environmental factors that induce such mutations and their associated diseases. To address this issue, a working group on environmentally induced germline mutation analysis (ENIGMA) met in October 2011 to propose the necessary foundational studies, which include sequencing of parent–offspring trios from highly exposed human populations, and controlled dose–response experiments in animals. These studies will establish background levels of variability in germline mutation rates and identify environmental agents that influence these rates and heritable disease. Guidance for the types of exposures to examine comes from rodent studies that have identified agents such as cancer chemotherapeutic drugs, ionizing radiation, cigarette smoke, and air pollution as germ-cell mutagens. Research is urgently needed to establish the health consequences of parental exposures on subsequent generations.
doi:10.1016/j.mrrev.2012.08.002
PMCID: PMC3556182  PMID: 22935230
Germ cell; Heritable mutation; Next generation sequencing; Copy number variants
15.  Accurate Whole Human Genome Sequencing using Reversible Terminator Chemistry 
Bentley, David R. | Balasubramanian, Shankar | Swerdlow, Harold P. | Smith, Geoffrey P. | Milton, John | Brown, Clive G. | Hall, Kevin P. | Evers, Dirk J. | Barnes, Colin L. | Bignell, Helen R. | Boutell, Jonathan M. | Bryant, Jason | Carter, Richard J. | Cheetham, R. Keira | Cox, Anthony J. | Ellis, Darren J. | Flatbush, Michael R. | Gormley, Niall A. | Humphray, Sean J. | Irving, Leslie J. | Karbelashvili, Mirian S. | Kirk, Scott M. | Li, Heng | Liu, Xiaohai | Maisinger, Klaus S. | Murray, Lisa J. | Obradovic, Bojan | Ost, Tobias | Parkinson, Michael L. | Pratt, Mark R. | Rasolonjatovo, Isabelle M. J. | Reed, Mark T. | Rigatti, Roberto | Rodighiero, Chiara | Ross, Mark T. | Sabot, Andrea | Sankar, Subramanian V. | Scally, Aylwyn | Schroth, Gary P. | Smith, Mark E. | Smith, Vincent P. | Spiridou, Anastassia | Torrance, Peta E. | Tzonev, Svilen S. | Vermaas, Eric H. | Walter, Klaudia | Wu, Xiaolin | Zhang, Lu | Alam, Mohammed D. | Anastasi, Carole | Aniebo, Ify C. | Bailey, David M. D. | Bancarz, Iain R. | Banerjee, Saibal | Barbour, Selena G. | Baybayan, Primo A. | Benoit, Vincent A. | Benson, Kevin F. | Bevis, Claire | Black, Phillip J. | Boodhun, Asha | Brennan, Joe S. | Bridgham, John A. | Brown, Rob C. | Brown, Andrew A. | Buermann, Dale H. | Bundu, Abass A. | Burrows, James C. | Carter, Nigel P. | Castillo, Nestor | Catenazzi, Maria Chiara E. | Chang, Simon | Cooley, R. Neil | Crake, Natasha R. | Dada, Olubunmi O. | Diakoumakos, Konstantinos D. | Dominguez-Fernandez, Belen | Earnshaw, David J. | Egbujor, Ugonna C. | Elmore, David W. | Etchin, Sergey S. | Ewan, Mark R. | Fedurco, Milan | Fraser, Louise J. | Fajardo, Karin V. Fuentes | Furey, W. Scott | George, David | Gietzen, Kimberley J. | Goddard, Colin P. | Golda, George S. | Granieri, Philip A. | Green, David E. | Gustafson, David L. | Hansen, Nancy F. | Harnish, Kevin | Haudenschild, Christian D. | Heyer, Narinder I. | Hims, Matthew M. | Ho, Johnny T. | Horgan, Adrian M. 
| Hoschler, Katya | Hurwitz, Steve | Ivanov, Denis V. | Johnson, Maria Q. | James, Terena | Jones, T. A. Huw | Kang, Gyoung-Dong | Kerelska, Tzvetana H. | Kersey, Alan D. | Khrebtukova, Irina | Kindwall, Alex P. | Kingsbury, Zoya | Kokko-Gonzales, Paula I. | Kumar, Anil | Laurent, Marc A. | Lawley, Cynthia T. | Lee, Sarah E. | Lee, Xavier | Liao, Arnold K. | Loch, Jennifer A. | Lok, Mitch | Luo, Shujun | Mammen, Radhika M. | Martin, John W. | McCauley, Patrick G. | McNitt, Paul | Mehta, Parul | Moon, Keith W. | Mullens, Joe W. | Newington, Taksina | Ning, Zemin | Ng, Bee Ling | Novo, Sonia M. | O'Neill, Michael J. | Osborne, Mark A. | Osnowski, Andrew | Ostadan, Omead | Paraschos, Lambros L. | Pickering, Lea | Pike, Andrew C. | Pike, Alger C. | Pinkard, D. Chris | Pliskin, Daniel P. | Podhasky, Joe | Quijano, Victor J. | Raczy, Come | Rae, Vicki H. | Rawlings, Stephen R. | Rodriguez, Ana Chiva | Roe, Phyllida M. | Rogers, John | Rogert Bacigalupo, Maria C. | Romanov, Nikolai | Romieu, Anthony | Roth, Rithy K. | Rourke, Natalie J. | Ruediger, Silke T. | Rusman, Eli | Sanches-Kuiper, Raquel M. | Schenker, Martin R. | Seoane, Josefina M. | Shaw, Richard J. | Shiver, Mitch K. | Short, Steven W. | Sizto, Ning L. | Sluis, Johannes P. | Smith, Melanie A. | Sohna, Jean Ernest Sohna | Spence, Eric J. | Stevens, Kim | Sutton, Neil | Szajkowski, Lukasz | Tregidgo, Carolyn L. | Turcatti, Gerardo | vandeVondele, Stephanie | Verhovsky, Yuli | Virk, Selene M. | Wakelin, Suzanne | Walcott, Gregory C. | Wang, Jingwen | Worsley, Graham J. | Yan, Juying | Yau, Ling | Zuerlein, Mike | Rogers, Jane | Mullikin, James C. | Hurles, Matthew E. | McCooke, Nick J. | West, John S. | Oaks, Frank L. | Lundberg, Peter L. | Klenerman, David | Durbin, Richard | Smith, Anthony J.
Nature  2008;456(7218):53-59.
doi:10.1038/nature07517
PMCID: PMC2581791  PMID: 18987734
16.  Inheritance of low-frequency regulatory SNPs and a rare null mutation in exon-junction complex subunit RBM8A causes TAR 
Nature genetics  2012;44(4):435-S2.
The exon-junction complex (EJC) performs essential RNA processing tasks [1-5]. Here, we describe the first human disorder, Thrombocytopenia with Absent Radii (TAR) [6], caused by deficiency in one of the four EJC subunits. A compound inheritance mechanism of a rare null allele and one of two low-frequency SNPs in the regulatory regions of RBM8A, encoding the Y14 subunit of EJC, causes TAR. We found that this mechanism explained 53 of 55 cases (P<5×10⁻²²⁸) with the rare congenital malformation syndrome. Fifty-one of those 53 carried a previously associated [7] submicroscopic deletion of 1q21.1; two carried a truncation or frameshift null mutation in RBM8A. We show that the two regulatory SNPs result in reduction of RBM8A transcription in vitro and that Y14 expression is reduced in platelets from TAR cases. Our data implicate Y14 insufficiency, and presumably EJC defect, as the cause of TAR syndrome.
doi:10.1038/ng.1083
PMCID: PMC3428915  PMID: 22366785
17.  Genome-Wide Screen for Metabolic Syndrome Susceptibility Loci Reveals Strong Lipid Gene Contribution but No Evidence for Common Genetic Basis for Clustering of Metabolic Syndrome Traits 
Background
Genome-wide association (GWA) studies have identified several susceptibility loci for metabolic syndrome (MetS) component traits, but have had variable success in identifying susceptibility loci to the syndrome as an entity. We conducted a GWA study on MetS and its component traits in four Finnish cohorts consisting of 2637 MetS cases and 7927 controls, both free of diabetes, and followed the top loci in an independent sample with transcriptome and NMR-based metabonomics data. Furthermore, we tested for loci associated with multiple MetS component traits using factor analysis and built a genetic risk score for MetS.
Methods and Results
A previously known lipid locus, APOA1/C3/A4/A5 gene cluster region (SNP rs964184), was associated with MetS in all four study samples (P=7.23×10^−9 in meta-analysis). The association was further supported by serum metabolite analysis, where rs964184 associated with various VLDL, TG, and HDL metabolites (P=0.024-1.88×10^−5). Twenty-two previously identified susceptibility loci for individual MetS component traits were replicated in our GWA and factor analysis. Most of these associated with lipid phenotypes and none with two or more uncorrelated MetS components. A genetic risk score, calculated as the number of alleles in loci associated with individual MetS traits, was strongly associated with MetS status.
Conclusions
Our findings suggest that genes from lipid metabolism pathways have the key role in the genetic background of MetS. We found little evidence for pleiotropy linking dyslipidemia and obesity to the other MetS component traits such as hypertension and glucose intolerance.
doi:10.1161/CIRCGENETICS.111.961482
PMCID: PMC3378651  PMID: 22399527
metabolic syndrome; risk factors; genome-wide association study; meta-analysis; lipids
19.  Challenges and standards in integrating surveys of structural variation 
Nature genetics  2007;39(7 Suppl):S7-15.
There has been an explosion of data describing newly recognized structural variants in the human genome. In the flurry of reporting, there has been no standard approach to collecting the data, assessing its quality or describing identified features. This risks becoming a rampant problem, in particular with respect to surveys of copy number variation and their application to disease studies. Here, we consider the challenges in characterizing and documenting genomic structural variants. From this, we derive recommendations for standards to be adopted, with the aim of ensuring the accurate presentation of this form of genetic variation to facilitate ongoing research.
doi:10.1038/ng2093
PMCID: PMC2698291  PMID: 17597783
20.  Global variation in copy number in the human genome 
Nature  2006;444(7118):444-454.
Copy number variation (CNV) of DNA sequences is functionally significant but has yet to be fully ascertained. We have constructed a first-generation CNV map of the human genome through the study of 270 individuals from four populations with ancestry in Europe, Africa or Asia (the HapMap collection). DNA from these individuals was screened for CNV using two complementary technologies: single nucleotide polymorphism (SNP) genotyping arrays, and clone-based comparative genomic hybridization. 1,447 copy number variable regions covering 360 megabases (12% of the genome) were identified in these populations; these CNV regions contained hundreds of genes, disease loci, functional elements and segmental duplications. Strikingly, these CNVs encompassed more nucleotide content per genome than SNPs, underscoring the importance of CNV in genetic diversity and evolution. The data obtained delineate linkage disequilibrium patterns for many CNVs, and reveal dramatic variation in copy number among populations. We also demonstrate the utility of this resource for genetic disease studies.
doi:10.1038/nature05329
PMCID: PMC2669898  PMID: 17122850
21.  Mutation spectrum revealed by breakpoint sequencing of human germline CNVs 
Nature genetics  2010;42(5):385-391.
Precisely characterizing the breakpoints of copy number variants (CNVs) is crucial for assessing their functional impact. However, fewer than 10% of known germline CNVs have been mapped to the single-nucleotide level. We characterized the sequence breakpoints from a dataset of all CNVs detected in three unrelated individuals in previous array-based CNV discovery experiments. We used targeted hybridization-based DNA capture and 454 sequencing to sequence 324 CNV breakpoints, including 315 deletions. We observed two major breakpoint signatures: 70% of the deletion breakpoints have 1–30 bp of microhomology, whereas 33% of deletion breakpoints contain 1–367 bp of inserted sequence. The co-occurrence of microhomology and inserted sequence is low (10%), suggesting that there are at least two different mutational mechanisms. Approximately 5% of the breakpoints represent more complex rearrangements, including local microinversions, suggesting a replication-based strand switching mechanism. Despite a rich literature on DNA repair processes, reconstruction of the molecular events generating each of these mutations is not yet possible.
doi:10.1038/ng.564
PMCID: PMC3428939  PMID: 20364136
22.  Relative impact of nucleotide and copy number variation on gene expression phenotypes 
Science (New York, N.Y.)  2007;315(5813):848-853.
Extensive studies are currently being performed to associate disease susceptibility with one form of genetic variation, namely single nucleotide polymorphisms (SNPs). In recent years another type of common genetic variation has been characterised, namely structural variation, including copy number variations (CNVs). To determine the overall contribution of CNVs to complex phenotypes we have performed association analyses of expression levels of 14,925 transcripts with SNPs and CNVs in individuals who are part of the International HapMap project. SNPs and CNVs captured 83.6% and 17.7% of the total detected genetic variation in gene expression, respectively, but the signals from the two types of variation had little overlap. Interrogation of the genome for both types of variants may be an effective way to elucidate the causes of complex phenotypes and disease in humans.
doi:10.1126/science.1136678
PMCID: PMC2665772  PMID: 17289997
23.  A systematic survey of loss-of-function variants in human protein-coding genes 
Science (New York, N.Y.)  2012;335(6070):823-828.
Genome sequencing studies indicate that all humans carry many genetic variants predicted to cause loss of function (LoF) of protein-coding genes, suggesting unexpected redundancy in the human genome. Here we apply stringent filters to 2,951 putative LoF variants obtained from 185 human genomes to determine their true prevalence and properties. We estimate that human genomes typically contain ~100 genuine LoF variants with ~20 genes completely inactivated. We identify rare and likely deleterious LoF alleles, including 26 known and 21 predicted severe disease-causing variants, as well as common LoF variants in non-essential genes. We describe functional and evolutionary differences between LoF-tolerant and recessive disease genes, and a method for using these differences to prioritize candidate genes found in clinical sequencing studies.
doi:10.1126/science.1215040
PMCID: PMC3299548  PMID: 22344438
24.  Origins and functional impact of copy number variation in the human genome 
Nature  2009;464(7289):704-712.
Structural variations of DNA greater than 1 kilobase in size account for most bases that vary among human genomes, but are still relatively under-ascertained. Here we use tiling oligonucleotide microarrays, comprising 42 million probes, to generate a comprehensive map of 11,700 copy number variations (CNVs) greater than 443 base pairs, of which most (8,599) have been validated independently. For 4,978 of these CNVs, we generated reference genotypes from 450 individuals of European, African or East Asian ancestry. The predominant mutational mechanisms differ among CNV size classes. Retrotransposition has duplicated and inserted some coding and non-coding DNA segments randomly around the genome. Furthermore, by correlation with known trait-associated single nucleotide polymorphisms (SNPs), we identified 30 loci with CNVs that are candidates for influencing disease susceptibility. Despite this, having assessed the completeness of our map and the patterns of linkage disequilibrium between CNVs and SNPs, we conclude that, for complex traits, the heritability void left by genome-wide association studies will not be accounted for by common CNVs.
doi:10.1038/nature08516
PMCID: PMC3330748  PMID: 19812545
25.  Discovery of common Asian copy number variants using integrated high-resolution array CGH and massively parallel DNA sequencing 
Nature genetics  2010;42(5):400-405.
Copy number variants (CNVs) account for the majority of human genomic diversity in terms of base coverage. Here, we have developed and applied a new method to combine high-resolution array comparative genomic hybridization (CGH) data with whole-genome DNA sequencing data to obtain a comprehensive catalog of common CNVs in Asian individuals. The genomes of 30 individuals from three Asian populations (Korean, Chinese and Japanese) were interrogated with an ultra-high-resolution array CGH platform containing 24 million probes. Whole-genome sequencing data from a reference genome (NA10851, with 28.3× coverage) and two Asian genomes (AK1, with 27.8× coverage and AK2, with 32.0× coverage) were used to transform the relative copy number information obtained from array CGH experiments into absolute copy number values. We discovered 5,177 CNVs, of which 3,547 were putative Asian-specific CNVs. These common CNVs in Asian populations will be a useful resource for subsequent genetic studies in these populations, and the new method of calling absolute CNVs will be essential for applying CNV data to personalized medicine.
doi:10.1038/ng.555
PMCID: PMC3329635  PMID: 20364138
