DNA sequence information underpins genetic research, enabling discoveries of important biological or medical benefit. Sequencing projects have traditionally employed long (400–800 bp) reads, but the existence of reference sequences for the human and many other genomes makes it possible to develop new, fast approaches to re-sequencing, whereby shorter reads are compared to a reference to identify intra-species genetic variation. We report an approach that generates several billion bases of accurate nucleotide sequence per experiment at low cost. Single molecules of DNA are attached to a flat surface, amplified in situ and used as templates for synthetic sequencing with fluorescent reversible terminator deoxyribonucleotides. Images of the surface are analysed to generate high quality sequence. We demonstrate application of this approach to human genome sequencing on flow-sorted X chromosomes and then scale the approach to determine the genome sequence of a male Yoruba from Ibadan, Nigeria. We build an accurate consensus sequence from >30x average depth of paired 35-base reads. We characterise four million SNPs and four hundred thousand structural variants, many of which are previously unknown. Our approach is effective for accurate, rapid and economical whole genome re-sequencing and many other biomedical applications.
StellaBase, the Nematostella vectensis Genomics Database, is a web-based resource that will facilitate desktop and bench-top studies of the starlet sea anemone. Nematostella is an emerging model organism that has already proven useful for addressing fundamental questions in developmental evolution and evolutionary genomics. StellaBase allows users to query the assembled Nematostella genome, a confirmed gene library, and a predicted genome using both keyword and homology based search functions. Data provided by these searches will elucidate gene family evolution in early animals. Unique research tools, including a Nematostella genetic stock library, a primer library, a literature repository and a gene expression library will provide support to the burgeoning Nematostella research community. The development of StellaBase accompanies significant upgrades to CnidBase, the Cnidarian Evolutionary Genomics Database. With the completion of the first sequenced cnidarian genome, genome comparison tools have been added to CnidBase. In addition, StellaBase provides a framework for the integration of additional species-specific databases into CnidBase. StellaBase is available at .
Melanocytes, the pigment-producing cells, arise from multipotent neural crest (NC) cells during embryogenesis. Many genes required for melanocyte development were identified using mouse pigmentation mutants. The variable spotting mouse pigmentation mutant arose spontaneously at the Jackson Laboratory. We identified a G-to-A nucleotide transition in exon 3 of the Ets1 gene in variable spotting, which results in a missense G102E mutation. Homozygous variable spotting mice exhibit sporadic white spotting. Similarly, mice carrying a targeted deletion of Ets1 exhibit hypopigmentation; nevertheless, the function of Ets1 in melanocyte development is unknown. The transcription factor Ets1 is widely expressed in developing organs and tissues, including the NC. In the chick, Ets1 is required for the expression of Sox10, a transcription factor critical for the development of various NC derivatives, including melanocytes. We show that Ets1 is required early for murine NC cell and melanocyte precursor survival in vivo. Given the importance of Ets1 for Sox10 expression in the chick, we investigated a potential genetic interaction between these genes by comparing the hypopigmentation phenotypes of single and double heterozygous mice. The incidence of hypopigmentation in double heterozygotes was significantly greater than in single heterozygotes. The area of hypopigmentation in double heterozygotes was significantly larger than would be expected from the addition of the areas of hypopigmentation of single heterozygotes, suggesting that Ets1 and Sox10 interact synergistically in melanocyte development. Since Sox10 is also essential for enteric ganglia development, we examined the distal colons of Ets1 null mutants and found a significant decrease in enteric innervation, which was exacerbated by Sox10 heterozygosity. At the molecular level, Ets1 was found to activate an enhancer critical for Sox10 expression in NC-derived structures. 
Furthermore, enhancer activation was significantly inhibited by the variable spotting mutation. Together, these results suggest that Ets1 and Sox10 interact to promote proper melanocyte and enteric ganglia development from the NC.
melanocyte; neural crest; Ets1; Sox10; enteric ganglia
Neandertals, the closest evolutionary relatives of present-day humans, lived in large parts of Europe and western Asia before disappearing 30,000 years ago. We present a draft sequence of the Neandertal genome composed of more than 4 billion nucleotides from three individuals. Comparisons of the Neandertal genome to the genomes of five present-day humans from different parts of the world identify a number of genomic regions that may have been affected by positive selection in ancestral modern humans, including genes involved in metabolism and in cognitive and skeletal development. We show that Neandertals shared more genetic variants with present-day humans in Eurasia than with present-day humans in sub-Saharan Africa, suggesting that gene flow from Neandertals into the ancestors of non-Africans occurred before the divergence of Eurasian groups from each other.
The transcription factor SOX10 is essential for all stages of Schwann cell development including myelination. SOX10 cooperates with other transcription factors to activate the expression of key myelin genes in Schwann cells and is therefore a context-dependent, pro-myelination transcription factor. As such, the identification of genes regulated by SOX10 will provide insight into Schwann cell biology and related diseases. While genome-wide studies have successfully revealed SOX10 target genes, these efforts mainly focused on myelinating stages of Schwann cell development. We propose that less-biased approaches will reveal novel functions of SOX10 outside of myelination.
We developed a stringent, computational-based screen for genome-wide identification of SOX10 response elements. Experimental validation of a pilot set of predicted binding sites in multiple systems revealed that SOX10 directly regulates a previously unreported alternative promoter at SOX6, which encodes a transcription factor that inhibits glial cell differentiation. We further explored the utility of our computational approach by combining it with DNase-seq analysis in cultured Schwann cells and previously published SOX10 ChIP-seq data from rat sciatic nerve. Remarkably, this analysis enriched for genomic segments that map to loci involved in the negative regulation of gliogenesis including SOX5, SOX6, NOTCH1, HMGA2, HES1, MYCN, ID4, and ID2. Functional studies in Schwann cells revealed that: (1) all eight loci are expressed prior to myelination and down-regulated subsequent to myelination; (2) seven of the eight loci harbor validated SOX10 binding sites; and (3) seven of the eight loci are down-regulated upon repressing SOX10 function.
Our computational strategy revealed a putative novel function for SOX10 in Schwann cells, which suggests a model where SOX10 activates the expression of genes that inhibit myelination during non-myelinating stages of Schwann cell development. Importantly, the computational and functional datasets we present here will be valuable for the study of transcriptional regulation, SOX protein function, and glial cell biology.
Electronic supplementary material
The online version of this article (doi:10.1186/s12864-016-3167-3) contains supplementary material, which is available to authorized users.
Comparative sequence analysis; Transcriptional regulation; Ultra-conserved sequences; Myelination; Schwann cells; SOX10
Alternative isoform regulation (AIR) vastly increases transcriptome diversity and plays an important role in numerous biological processes and pathologies. However, the detection and analysis of isoform-level differential regulation is difficult, particularly in the face of complex and incompletely-annotated transcriptomes. Here we have used Illumina short-read/high-throughput RNA-Seq to identify 55 genes that exhibit neurally-regulated AIR in the pineal gland, and then used two other complementary experimental platforms to further study and characterize the Ttc8 gene, which is involved in Bardet-Biedl syndrome and non-syndromic retinitis pigmentosa. Use of the JunctionSeq analysis tool led to the detection of several novel exons and splice junctions in this gene, including two novel alternative transcription start sites which were found to display disproportionately strong neurally-regulated differential expression in several independent experiments. These high-throughput sequencing results were validated and augmented via targeted qPCR and long-read Pacific Biosciences SMRT sequencing. We confirmed the existence of numerous novel splice junctions and the selective upregulation of the two novel start sites. In addition, we identified more than 20 novel isoforms of the Ttc8 gene that are co-expressed in this tissue. By using information from multiple independent platforms we not only greatly reduce the risk of errors, biases, and artifacts influencing our results, we also are able to characterize the regulation and splicing of the Ttc8 gene more deeply and more precisely than would be possible via any single platform. The hybrid method outlined here represents a powerful strategy in the study of the transcriptome.
Cnidarians are a group of early branching animals including corals, jellyfish and hydroids that are renowned for their high regenerative ability, growth plasticity and longevity. Because cnidarian genomes are conventional in terms of protein-coding genes, their remarkable features are likely a consequence of epigenetic regulation. To facilitate epigenetics research in cnidarians, we analysed the histone complement of the cnidarian model organism Hydractinia echinata using phylogenomics, proteomics, transcriptomics and mRNA in situ hybridisations.
We find that the Hydractinia genome encodes 19 histones and analyse their spatial expression patterns, genomic loci and replication-dependency. Alongside core and other replication-independent histone variants, we find several histone replication-dependent variants, including a rare replication-dependent H3.3, a female germ cell-specific H2A.X and an unusual set of five H2B variants, four of which are male germ cell-specific. We further confirm the absence of protamines in Hydractinia.
Since no protamines are found in hydroids, we suggest that the novel H2B variants are pivotal for sperm DNA packaging in this class of Cnidaria. This study adds to the limited number of full histone gene complements available in animals and sets a comprehensive framework for future studies on the role of histones and their post-translational modifications in cnidarian epigenetics. Finally, it provides insight into the evolution of spermatogenesis.
Electronic supplementary material
The online version of this article (doi:10.1186/s13072-016-0085-1) contains supplementary material, which is available to authorized users.
Histone; Chromatin; Cnidaria; Histone variants; Sperm-specific histones
Patients with autosomal dominant vibratory urticaria have localized hives and systemic manifestations in response to dermal vibration, with coincident degranulation of mast cells and increased histamine levels in serum. We identified a previously unknown missense substitution in ADGRE2 (also known as EMR2), which was predicted to result in the replacement of cysteine with tyrosine at amino acid position 492 (p.C492Y), as the only nonsynonymous variant cosegregating with vibratory urticaria in two large kindreds. The ADGRE2 receptor undergoes autocatalytic cleavage, producing an extracellular subunit that noncovalently binds a transmembrane subunit. We showed that the variant probably destabilizes an autoinhibitory subunit interaction, sensitizing mast cells to IgE-independent vibration-induced degranulation. (Funded by the National Institutes of Health.)
Reproducibility is receiving increased attention across many domains of science, and genomics is no exception. Efforts to identify copy number variations (CNVs) from exome sequence (ES) data have been increasing. Many algorithms have been published to discover CNVs from exomes, and a major challenge is the reproducibility of their calls in other datasets. Here we test exome CNV calling reproducibility under three conditions: data generated by different sequencing centers; varying sample sizes; and varying capture methodology.
Four CNV tools were tested: eXome Hidden Markov Model (XHMM), Copy Number Inference From Exome Reads (CoNIFER), EXCAVATOR, and Copy Number Analysis for Targeted Resequencing (CONTRA). To examine the reproducibility, we ran the callers on four datasets, on varying sample sizes (N = 10, 30, 75, 100, 300), and on data generated with different capture methodology. We examined the false negative (FN) calls and false positive (FP) calls for potential limitations of the CNV callers. The positive predictive value (PPV) was measured by checking the CNV call concordance against single nucleotide polymorphism array data.
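As a rough illustration of the PPV measurement described above, the sketch below treats an exome CNV call as a true positive when the same call also appears in the SNP-array set, and computes PPV = TP / (TP + FP). The tuple encoding of a call (sample, region, type) and the exact-match concordance rule are assumptions made for this example; the study's actual overlap criteria are not stated in the abstract.

```python
from typing import Iterable, Set, Tuple

# Hypothetical encoding of a CNV call: (sample_id, region_id, "DEL" or "DUP").
Call = Tuple[str, str, str]


def positive_predictive_value(exome_calls: Iterable[Call],
                              array_calls: Iterable[Call]) -> float:
    """Return PPV = TP / (TP + FP) for exome calls checked against array calls."""
    exome: Set[Call] = set(exome_calls)
    truth: Set[Call] = set(array_calls)
    if not exome:
        raise ValueError("no exome CNV calls; PPV is undefined")
    true_positives = len(exome & truth)  # calls confirmed by the array
    return true_positives / len(exome)   # denominator is TP + FP


def false_negatives(exome_calls: Iterable[Call],
                    array_calls: Iterable[Call]) -> Set[Call]:
    """Return array-supported CNVs that the exome caller missed."""
    return set(array_calls) - set(exome_calls)


if __name__ == "__main__":
    exome = [("s1", "r1", "DEL"), ("s1", "r2", "DUP"),
             ("s2", "r3", "DEL"), ("s2", "r4", "DEL")]
    array = [("s1", "r1", "DEL"), ("s2", "r3", "DEL"), ("s2", "r9", "DUP")]
    print(positive_predictive_value(exome, array))
    print(false_negatives(exome, array))
```

In practice concordance would more likely be judged by interval overlap than by exact region match; the set-based rule here only keeps the arithmetic visible.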
Using independently generated datasets, we examined the PPV for each dataset and observed a wide range of PPVs. The PPV values were highly data dependent (p < 0.001). For the sample size and capture method analyses, we tested the callers in triplicate. Both analyses resulted in wide ranges of PPVs, even for the same test. Interestingly, a negative correlation between the PPV and the sample size was observed for CoNIFER (ρ = –0.80). Further examination of FN calls showed that 44 % of these were missed by all callers and were attributed to the CNV size (46 % spanned ≤3 exons). Overlap of the FP calls showed that FPs were unique to each caller, indicative of algorithm dependency.
Our results demonstrate that further improvements in CNV callers are necessary to improve reproducibility and to include a wider spectrum of CNVs (including small CNVs). These CNV callers should be evaluated on multiple independent, heterogeneously generated datasets of varying size to increase robustness and utility. These approaches to the evaluation of exome CNV are essential to support wide utility and applicability of CNV discovery in exome studies.
Electronic supplementary material
The online version of this article (doi:10.1186/s13073-016-0336-6) contains supplementary material, which is available to authorized users.
Copy number variations (CNV); Exomes; CNV predictions; Reproducibility
Broadly neutralizing antibodies (bNAbs) against HIV-1-Env V1V2 arise in multiple donors. However, atomic-level interactions had only been determined with antibodies from a single donor, making commonalities in recognition uncertain. Here we report the co-crystal structure of V1V2 with antibody CH03 from a second donor and model Env interactions of antibody CAP256-VRC26 from a third. These V1V2-directed bNAbs utilized strand-strand interactions between a protruding antibody loop and a V1V2 strand, but differed in their N-glycan recognition. Ontogeny analysis indicated protruding loops to develop early, with glycan interactions maturing over time. Altogether, the multidonor information suggested V1V2-directed bNAbs to form an ‘extended class’, for which we engineered ontogeny-specific antigens: Env trimers with chimeric V1V2s that interacted with inferred ancestor and intermediate antibodies. The ontogeny-based design of vaccine antigens described here may provide a general means for eliciting antibodies of a desired class.
Antibody 10E8 targets the membrane-proximal external region (MPER) of HIV-1 gp41, neutralizes >97% of HIV-1 isolates, and lacks the auto-reactivity often associated with MPER-directed antibodies. The developmental pathway of 10E8 might therefore serve as a promising template for vaccine design, but samples from time-of-infection—often used to infer the B cell record—are unavailable. In this study, we used crystallography, next-generation sequencing (NGS), and functional assessments to infer the 10E8 developmental pathway from a single time point. Mutational analysis indicated somatic hypermutation of the 2nd-heavy chain-complementarity determining region (CDR H2) to be critical for neutralization, and structures of 10E8 variants with V-gene regions reverted to genomic origin for heavy-and-light chains or heavy chain-only showed structural differences >2 Å relative to mature 10E8 in the CDR H2 and H3. To understand these developmental changes, we used bioinformatic sieving, maximum likelihood, and parsimony analyses of immunoglobulin transcripts to identify 10E8-lineage members, to infer the 10E8-unmutated common ancestor (UCA), and to calculate 10E8-developmental intermediates. We were assisted in this analysis by the preservation of a critical D-gene segment, which was unmutated in most 10E8-lineage sequences. UCA and early intermediates weakly bound a 26-residue-MPER peptide, whereas HIV-1 neutralization and epitope recognition in liposomes were only observed with late intermediates. Antibody 10E8 thus develops from a UCA with weak MPER affinity and substantial differences in CDR H2 and H3 from the mature 10E8; only after extensive somatic hypermutation do 10E8-lineage members gain recognition in the context of membrane and HIV-1 neutralization.
Systemic autoinflammatory diseases are driven by abnormal activation of innate immunity [1]. Herein we describe a new syndrome caused by high penetrance heterozygous germline mutations in the NFκB regulatory protein TNFAIP3 (A20) in six unrelated families with early onset systemic inflammation. The syndrome resembles Behçet’s disease (BD), which is typically considered a polygenic disorder with onset in early adulthood [2]. A20 is a potent inhibitor of the NFκB signaling pathway [3]. TNFAIP3 mutant truncated proteins are likely to act by haploinsufficiency since they do not exert a dominant-negative effect in overexpression experiments. Patients’ cells show increased degradation of IκBα and nuclear translocation of NFκB p65, and increased expression of NFκB-mediated proinflammatory cytokines. A20 restricts NFκB signals via deubiquitinating (DUB) activity. In cells expressing the mutant A20 protein, there is defective removal of K63-linked ubiquitin from TRAF6, NEMO, and RIP1 after TNF stimulation. NFκB-dependent pro-inflammatory cytokines are potential therapeutic targets for these patients.
The site on the HIV-1 gp120 glycoprotein that binds the CD4 receptor is recognized by broadly reactive antibodies, several of which neutralize over 90% of HIV-1 strains. To understand how antibodies achieve such neutralization, we isolated CD4-binding-site (CD4bs) antibodies and analyzed 16 co-crystal structures (8 determined here) of CD4bs antibodies from 14 donors. The 16 antibodies segregated by recognition mode and developmental ontogeny into two types: CDR H3-dominated and VH-gene-restricted. Both could achieve greater than 80% neutralization breadth, and both could develop in the same donor. Although paratope chemistries differed, all 16 gp120-CD4bs antibody complexes showed geometric similarity, with antibody-neutralization breadth correlating with antibody-angle of approach relative to the most effective antibody of each type. The repertoire for effective recognition of the CD4 supersite thus comprises antibodies with distinct paratopes arrayed about two optimal geometric orientations, one achieved by CDR H3 ontogenies and the other achieved by VH-gene-restricted ontogenies.
Although RNA-Seq data provide unprecedented isoform-level expression information, detection of alternative isoform regulation (AIR) remains difficult, particularly when working with an incomplete transcript annotation. We introduce JunctionSeq, a new method that builds on the statistical techniques used by the well-established DEXSeq package to detect differential usage of both exonic regions and splice junctions. In particular, JunctionSeq is capable of detecting differential usage of novel splice junctions without the need for an additional isoform assembly step, greatly improving performance when the available transcript annotation is flawed or incomplete. JunctionSeq also provides a powerful and streamlined visualization toolset that allows bioinformaticians to quickly and intuitively interpret their results. We tested our method on publicly available data from several experiments performed on the rat pineal gland and Toxoplasma gondii, successfully detecting known and previously validated AIR genes in 19 out of 19 gene-level hypothesis tests. Due to its ability to query novel splice sites, JunctionSeq is still able to detect these differences even when all alternative isoforms for these genes were not included in the transcript annotation. JunctionSeq thus provides a powerful method for detecting alternative isoform regulation even with low-quality annotations. An implementation of JunctionSeq is available as an R/Bioconductor package.
RUNX1-RUNX1T1; t(8;21); inv(16); CBFB-MYH11; relapse; AML; DHX15; GATA2
Mutations of TCF4, which encodes a basic helix-loop-helix transcription factor, cause Pitt-Hopkins syndrome (PTHS) via multiple genetic mechanisms. TCF4 is a complex locus expressing multiple transcripts by alternative splicing and use of multiple promoters. To address the relationship between mutation of these transcripts and phenotype, we report a three-generation family segregating mild intellectual disability with a chromosomal translocation disrupting TCF4.
Using whole genome sequencing, we detected a complex unbalanced karyotype disrupting TCF4 (46,XY,del(14)(q23.3q23.3)del(18)(q21.2q21.2)del(18)(q21.2q21.2)inv(18)(q21.2q21.2)t(14;18)(q23.3;q21.2)(14pter→14q23.3::18q21.2→18q21.2::18q21.1→18qter;18pter→18q21.2::14q23.3→14qter)). Subsequent transcriptome sequencing, qRT-PCR and nCounter analyses revealed that cultured skin fibroblasts and peripheral blood had normal expression of genes along chromosomes 14 or 18 and no marked changes in expression of genes other than TCF4. Affected individuals had 12–33 fold higher mRNA levels of TCF4 than did unaffected controls or individuals with PTHS. Although the derivative chromosome generated a PLEKHG3-TCF4 fusion transcript, the increased levels of TCF4 mRNA arose from transcript variants originating distal to the translocation breakpoint, not from the fusion transcript.
Although validation in additional patients is required, our findings suggest that the dysmorphic features and severe intellectual disability characteristic of PTHS are partially rescued by overexpression of those short TCF4 transcripts encoding a nuclear localization signal, a transcription activation domain, and the basic helix-loop-helix domain.
Electronic supplementary material
The online version of this article (doi:10.1186/s13023-016-0439-6) contains supplementary material, which is available to authorized users.
Intellectual disability; Promoter utilization; Pitt-Hopkins syndrome; TCF4; Gene expression; Translocation; Transcriptome; RNAseq
HIV-1-neutralizing antibodies develop in most HIV-1-infected individuals, although highly effective antibodies are generally observed only after years of chronic infection. Here we characterize the rate of maturation and extent of diversity for the lineage that produced the broadly neutralizing antibody VRC01 through longitudinal sampling of peripheral B cell transcripts over 15 years and co-crystal structures of lineage members. Next-generation sequencing identified VRC01-lineage transcripts, which encompassed diverse antibodies organized into distinct phylogenetic clades. Prevalent clades maintained characteristic features of antigen recognition, though each evolved binding loops and disulfides that formed distinct recognition surfaces. Over the course of the study period, VRC01-lineage clades showed continuous evolution, with rates of ~2 substitutions per 100 nucleotides per year, comparable to that of HIV-1 evolution. This high rate of antibody evolution provides a mechanism by which antibody lineages can achieve extraordinary diversity and, over years of chronic infection, develop effective HIV-1 neutralization.
Whole-exome sequencing (WES) is rapidly evolving into a tool of choice for rapid and inexpensive identification of molecular genetic lesions within targeted regions of the human genome. While biases in WES coverage of nucleotides in targeted regions are recognized, it is not well understood how repetition of WES improves the interpretation of sequencing results in a clinical diagnostic setting.
To address this, we compared independently generated exome-capture libraries from six individuals across three generations, each sequenced in triplicate. This generated 48x–86x mean target depth of high-quality mapped bases (>Q20) for each technical replicate library. Cumulatively, we achieved 179x–208x average target coverage for each individual in the pedigree. Using this experimental design, we evaluated stochastics in WES interpretation, genotyping sensitivity, and accuracy in detecting de novo variants.
In this study, we show that repetition of WES improved the interpretation of the capture target regions after aggregating the data (93.5–93.9 %). Compared to 81.2–89.6 % (50.2–55.4 Mb of 61.7 Mb) coverage of targeted bases at ≥20x in the individual technical replicates, the aggregated data covered 93.5–93.9 % of targeted bases (57.7–58.0 Mb of 61.7 Mb) at the ≥20x threshold, suggesting a 4.3–12.7 % improvement in coverage. Each individual’s aggregate dataset recovered 3.4–6.4 million bases within variable targeted regions. We uncovered technical variability (2–5 %) inherent to the WES technique. We also show improved interpretation in assessing clinically important regions that lack interpretation under current conditions, affecting 12–16 of the 56 genes recommended for secondary analysis by the American College of Medical Genetics (ACMG). We demonstrate that comparing technical replicate WES datasets and their derived aggregate data can effectively address overall WES genotyping discrepancies.
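The coverage comparison above can be illustrated with a small sketch: compute the fraction of targeted bases at or above a 20x threshold in each technical replicate, then in the aggregate obtained by summing per-base depths across replicates. The function names and the toy depth vectors are hypothetical and chosen only to show the calculation; they are not the study's data or pipeline.

```python
from typing import List, Sequence


def fraction_covered(depths: Sequence[int], threshold: int = 20) -> float:
    """Fraction of target bases whose depth is at or above the threshold."""
    if not depths:
        raise ValueError("empty target")
    return sum(d >= threshold for d in depths) / len(depths)


def aggregate_depths(replicates: Sequence[Sequence[int]]) -> List[int]:
    """Sum per-base depths across technical replicates of the same target."""
    if len({len(r) for r in replicates}) != 1:
        raise ValueError("replicates must cover the same target bases")
    return [sum(per_base) for per_base in zip(*replicates)]


if __name__ == "__main__":
    # Toy per-base depths over a four-base target (illustrative values only).
    replicates = [
        [25, 10, 5, 30],
        [22, 12, 8, 18],
        [30, 3, 6, 21],
    ]
    for i, rep in enumerate(replicates, start=1):
        print(f"replicate {i}: {fraction_covered(rep):.2f}")
    print(f"aggregate:   {fraction_covered(aggregate_depths(replicates)):.2f}")
```

Bases that fall just short of the threshold in every replicate can pass it once depths are pooled, which is the effect the aggregated figures describe.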
We describe a method to evaluate the reproducibility and stochastics in exome library preparation, and delineate the advantages of aggregating the data derived from technical replicates. The implications of this study are directly applicable to improved experimental design and provide an opportunity to rapidly, efficiently, and accurately arrive at reliable candidate nucleotide variants.
Electronic supplementary material
The online version of this article (doi:10.1186/s12864-015-2107-y) contains supplementary material, which is available to authorized users.
Public health officials have raised concerns that plasmid transfer between Enterobacteriaceae species may spread resistance to carbapenems, an antibiotic class of last resort, thereby rendering common healthcare-associated infections nearly impossible to treat. We performed comprehensive surveillance and genomic sequencing to identify carbapenem-resistant Enterobacteriaceae in the NIH Clinical Center patient population and hospital environment in order to articulate the diversity of carbapenemase-encoding plasmids and assess the mobility of these plasmids between bacterial species. We isolated a repertoire of carbapenemase-encoding Enterobacteriaceae, including multiple strains of Klebsiella pneumoniae, Klebsiella oxytoca, Escherichia coli, Enterobacter cloacae, Citrobacter freundii, and Pantoea species. Long-read genome sequencing with full end-to-end assembly revealed that these organisms carry the carbapenem-resistance genes on a wide array of plasmids. Klebsiella pneumoniae and Enterobacter cloacae isolated simultaneously from a single patient harbored two different carbapenemase-encoding plasmids, overriding the epidemiological scenario of plasmid transfer between organisms within this patient. We did, however, find evidence supporting horizontal transfer of carbapenemase-encoding plasmids between Klebsiella pneumoniae, Enterobacter cloacae and Citrobacter freundii in the hospital environment. Our comprehensive sequence data, with full plasmid identification, challenge assumptions about horizontal gene transfer events within patients and identify wider possible connections between patients and the hospital environment. In addition, we identified a new carbapenemase-encoding plasmid of potentially high clinical impact carried by Klebsiella pneumoniae, Escherichia coli, Enterobacter cloacae and Pantoea species, from unrelated patients and the hospital environment.
The term neurotranscriptomics is used here to describe genome-wide analysis of neural control of transcriptomes. In this report, next-generation RNA sequencing was used to analyze the effects of neonatal (5-days-of-age) surgical stimulus deprivation on the adult rat pineal transcriptome. In intact animals, more than 3000 coding genes were found to exhibit differential expression (adjusted-p < 0.001) on a night/day basis in the pineal gland (70% of these increased at night, 376 genes changed more than 4-fold in either direction). Of these, more than two thousand genes were not previously known to be differentially expressed on a night/day basis. The night/day changes in expression were almost completely eliminated by neonatal removal (SCGX) or decentralization (DCN) of the superior cervical ganglia (SCG), which innervate the pineal gland. Other than the loss of rhythmic variation, surgical stimulus deprivation had little impact on the abundance of most genes; of particular interest, expression levels of the melatonin-synthesis-related genes Tph1, Gch1, and Asmt displayed little change (less than 35%) following DCN or SCGX. However, strong and consistent changes were observed in the expression of a small number of genes including the gene encoding Serpina1, a secreted protease inhibitor that might influence extracellular architecture. Many of the genes that exhibited night/day differential expression in intact animals also exhibited similar changes following in vitro treatment with norepinephrine, a superior cervical ganglia transmitter, or with an analog of cyclic AMP, a norepinephrine second messenger in this tissue. These findings are of significance in that they show that the pineal-defining transcriptome is established prior to the neonatal period. Further, this work expands our knowledge of the biological processes under neural control in this tissue and underlines the value of RNA sequencing in revealing how neurotransmission influences cell biology.
Limb body wall complex (LBWC) and amniotic band sequence (ABS) are multiple congenital anomaly conditions with craniofacial, limb, and ventral wall defects. LBWC and ABS are considered separate entities by some, and a continuum of severity of the same condition by others. The etiology of LBWC/ABS remains unknown and multiple hypotheses have been proposed. One individual with features of LBWC and his unaffected parents underwent whole exome sequencing, with Sanger sequencing used to confirm the mutation. Functional studies were conducted using morpholino knockdown followed by human mRNA rescue experiments. Using whole exome sequencing, a de novo heterozygous mutation was found in the gene IQCK: c.667C>G; p.Q223E and confirmed by Sanger sequencing in an individual with LBWC. Morpholino knockdown of iqck mRNA in the zebrafish showed ventral defects including failure of the ventral fin to develop and cardiac edema. Human wild-type IQCK mRNA rescued the zebrafish phenotype, whereas human p.Q223E IQCK mRNA did not and instead worsened the phenotype of the morpholino knockdown zebrafish. This study supports a genetic etiology for LBWC/ABS, or potentially a new syndrome.
Amniotic bands; ectopia cordis; limb anomalies; ventral midline defect
Despite the tremendous drop in the cost of nucleotide sequencing in recent years, many research projects still utilize sequencing of pools containing multiple samples for the detection of sequence variants as a cost saving measure. Various software tools exist to analyze these pooled sequence data, yet little has been reported on the relative accuracy and ease of use of these different programs.
In this manuscript we evaluate five different variant detection programs—The Genome Analysis Toolkit (GATK), CRISP, LoFreq, VarScan, and SNVer—with regard to their ability to detect variants in synthetically pooled Illumina sequencing data, by creating simulated pooled binary alignment/map (BAM) files using single-sample sequencing data from varying numbers of previously characterized samples at varying depths of coverage per sample. We report the overall runtimes and memory usage of each program, as well as each program’s sensitivity and specificity to detect known true variants.
GATK, CRISP, and LoFreq all gave balanced accuracy of 80 % or greater for datasets with varying per-sample depth of coverage and numbers of samples per pool. VarScan and SNVer generally had balanced accuracy lower than 80 %. CRISP and LoFreq required up to four times less computational time and up to ten times less physical memory than GATK did, and without filtering, gave results with the highest sensitivity. VarScan and SNVer had generally lower false positive rates, but also significantly lower sensitivity than the other three programs.
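For readers unfamiliar with the metric, balanced accuracy is conventionally the mean of sensitivity and specificity. The sketch below shows that standard definition; it is not taken from the article's own code, and the function name and example counts are hypothetical.

```python
def balanced_accuracy(tp, fn, tn, fp):
    """Mean of sensitivity and specificity from confusion-matrix counts.

    tp, fn : known true variants that were called / missed
    tn, fp : non-variant sites correctly left uncalled / wrongly called
    """
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one true variant and one non-variant site")
    sensitivity = tp / (tp + fn)  # fraction of true variants detected
    specificity = tn / (tn + fp)  # fraction of non-variant sites left uncalled
    return (sensitivity + specificity) / 2


# Hypothetical counts, for illustration only: sensitivity 0.9, specificity 0.7.
print(balanced_accuracy(tp=90, fn=10, tn=70, fp=30))
```

Because the two rates are weighted equally, a caller with few false positives but low sensitivity can still score below one that is more permissive but detects more true variants.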
Electronic supplementary material
The online version of this article (doi:10.1186/s12859-015-0624-y) contains supplementary material, which is available to authorized users.
Pooling; Sequencing; Algorithms
High-throughput next-generation RNA sequencing has matured into a viable and powerful method for detecting variations in transcript expression and regulation. Proactive quality control is of critical importance as unanticipated biases, artifacts, or errors can potentially drive false associations and lead to flawed results.
We have developed the Quality of RNA-Seq Toolset, or QoRTs, a comprehensive, multifunction toolset that assists in quality control and data processing of high-throughput RNA sequencing data.
QoRTs generates an unmatched variety of quality control metrics, and can provide cross-comparisons of replicates contrasted by batch, biological sample, or experimental condition, revealing any outliers and/or systematic issues that could drive false associations or otherwise compromise downstream analyses. In addition, QoRTs simultaneously replaces the functionality of numerous other data-processing tools, and can quickly and efficiently generate quality control metrics, coverage counts (for genes, exons, and known/novel splice-junctions), and browser tracks. These functions can all be carried out as part of a single unified data-processing/quality control run, greatly reducing both the complexity and the total runtime of the analysis pipeline. The software, source code, and documentation are available online at http://hartleys.github.io/QoRTs.
Electronic supplementary material
The online version of this article (doi:10.1186/s12859-015-0670-5) contains supplementary material, which is available to authorized users.
Quality Control; RNA-Seq; Next-generation sequencing; Differential expression; Differential transcript regulation; Differential splicing
Over the past 5 years, a new generation of highly potent and broadly neutralizing HIV-1 antibodies has been identified. These antibodies can protect against lentiviral infection in nonhuman primates (NHPs), suggesting that passive antibody transfer would prevent HIV-1 transmission in humans. To increase the protective efficacy of such monoclonal antibodies, we employed next-generation sequencing, computational bioinformatics, and structure-guided design to enhance the neutralization potency and breadth of VRC01, an antibody that targets the CD4 binding site of the HIV-1 envelope. One variant, VRC07-523, was 5- to 8-fold more potent than VRC01, neutralized 96% of viruses tested, and displayed minimal autoreactivity. To compare its protective efficacy to that of VRC01 in vivo, we performed a series of simian-human immunodeficiency virus (SHIV) challenge experiments in nonhuman primates and calculated the doses of VRC07-523 and VRC01 that provide 50% protection (EC50). VRC07-523 prevented infection in NHPs at a 5-fold lower concentration than VRC01. These results suggest that increased neutralization potency in vitro correlates with improved protection against infection in vivo, documenting the improved functional efficacy of VRC07-523 and its potential clinical relevance for protecting against HIV-1 infection in humans.
IMPORTANCE In the absence of an effective HIV-1 vaccine, alternative strategies are needed to block HIV-1 transmission. Direct administration of HIV-1-neutralizing antibodies may be able to prevent HIV-1 infections in humans. This approach could be especially useful in individuals at high risk for contracting HIV-1 and could be used together with antiretroviral drugs to prevent infection. To optimize the chance of success, such antibodies can be modified to improve their potency, breadth, and in vivo half-life. Here, knowledge of the structure of a potent neutralizing antibody, VRC01, that targets the CD4-binding site of the HIV-1 envelope protein was used to engineer a next-generation antibody with 5- to 8-fold increased potency in vitro. When administered to nonhuman primates, this antibody conferred protection at a 5-fold lower concentration than the original antibody. Our studies demonstrate an important correlation between in vitro assays used to evaluate the therapeutic potential of antibodies and their in vivo effectiveness.