Immune systems evolve as essential strategies to maintain homeostasis with the environment, prevent microbial assault and recycle damaged host tissues. The immune system is composed of two components, innate and adaptive immunity. The former is common to all animals while the latter consists of a vertebrate-specific system that relies on somatically derived lymphocytes and is associated with near limitless genetic diversity as well as long-term memory. Deuterostome invertebrates provide a view of immune repertoires in phyla that immediately predate the origins of vertebrates. Genomic studies in amphioxus, a cephalochordate, have revealed homologs of genes encoding most innate immune receptors found in vertebrates; however, many of the gene families have undergone dramatic expansions, greatly increasing the innate immune repertoire. In addition, domain-swapping accounts for the innovation of new predicted pathways of receptor function. In both amphioxus and Ciona, a urochordate, the VCBPs (variable region containing chitin-binding proteins), which consist of immunoglobulin V (variable) and chitin binding domains, mediate recognition through the V domains. The V domains of VCBPs in amphioxus exhibit high levels of allelic complexity that presumably relate to functional specificity. Various features of the amphioxus immune repertoire reflect novel selective pressures, which likely have resulted in innovative strategies. Functional genomic studies underscore the value of amphioxus as a model for studying innate immunity and may help reveal how unique relationships between innate immune receptors and both pathogens and symbionts factored in the evolution of adaptive immune systems.
innate immunity; Toll-like receptors; expanded immune repertoire; allelic complexity; gut immunity
The p53 transcription factor regulates the synthesis of mRNAs encoding proteins involved in diverse cellular stress responses such as cell-cycle arrest, apoptosis, autophagy and senescence. In this review, we discuss how these mRNAs are concurrently regulated at the post-transcriptional level by microRNAs (miRNAs) and RNA-binding proteins (RBPs), which consequently modify the p53 transcriptional program in a cell type- and stimulus-specific manner. We also discuss the action of specific miRNAs and RBPs that are direct transcriptional targets of p53 and how they act coordinately with protein-coding p53 target genes to orchestrate p53-dependent cellular responses.
p53; post-transcriptional regulation; RNA-binding proteins; miRNA
The regulation of mRNA translation is a major checkpoint in the flux of information from the transcriptome to the proteome. Critical for translational control are the trans-acting factors, RNA-binding proteins (RBPs) and small RNAs that bind to the mRNA and modify its translatability. This review summarizes the mechanisms by which RBPs regulate mRNA translation, with special focus on those binding to the 3′-untranslated region. It also discusses how recent high-throughput technologies are revealing exquisite layers of complexity and are helping to untangle translational regulation at a genome-wide scale.
RNA-binding protein; translation; UTR; RNP; CLIP; ribosome profiling
The mitogen-activated protein kinase kinases (the MAPK/ERK kinases; MKKs or MEKs) and their downstream substrates, the extracellular signal-regulated kinases (ERKs), have been intensively studied for their roles in development and disease. Until recently, it had been assumed that any mutation affecting their function would have lethal consequences. However, the identification of MEK1 and MEK2 mutations in developmental syndromes as well as chemotherapy-resistant tumors, and the discovery of genomic variants in MEK1 and MEK2, have led to the realization that the extent of genomic variation associated with MEKs is much greater than had been appreciated. In this review, we discuss these recent advances, relate them to what is currently understood about the structure and function of MEKs, and describe how they change our understanding of the role of MEKs in development and disease.
MEK; MAPK; ERK; cardio-facio-cutaneous syndrome; cancer; SNP
Piwi-interacting RNAs (piRNAs) and CRISPR RNAs (crRNAs) are two recently discovered classes of small noncoding RNA that are found in animals and prokaryotes, respectively. Both of these novel RNA species function as components of adaptive immune systems that protect their hosts from foreign nucleic acids: piRNAs repress transposable elements in animal germlines, whereas crRNAs protect their bacterial hosts from phage and plasmids. The piRNA and CRISPR systems are nonhomologous and have instead evolved independently into logically similar defense mechanisms based on the specificity of targeting via nucleic acid base complementarity. Here we review what is known about the piRNA and CRISPR systems with a focus on comparing their evolutionary properties. In particular, we highlight the influence of several factors on the pattern of piRNA and CRISPR evolution, including the population genetic environment, the role of alternate defense systems and the mechanisms of acquisition of new piRNAs and CRISPRs.
piRNA; CRISPR; co-evolution; transposable elements; phage; plasmids
One of the main motivations to study amphioxus is its potential for understanding the last common ancestor of chordates, which notably gave rise to the vertebrates. An important feature in this respect is the slow evolutionary rate that seems to have characterized the cephalochordate lineage, making amphioxus an interesting proxy for the chordate ancestor, as well as a key lineage to include in comparative studies. Whereas slow evolution was first noticed at the phenotypic level, it has also been described at the genomic level. Here, we examine whether the amphioxus genome is indeed a good proxy for the genome of the chordate ancestor, with a focus on protein-coding genes. We investigate genome features, such as synteny, gene duplication and gene loss, and contrast the amphioxus genome with those of other deuterostomes that are used in comparative studies, such as Ciona, Oikopleura and urchin.
deuterostomes; evolutionary rates; gene duplication; gene loss; orthology; synteny
DNA microarrays have emerged as a viable platform for detection of pathogenic organisms in clinical and environmental samples. These microbial detection arrays occupy a middle ground between low cost, narrowly focused assays such as multiplex PCR and more expensive, broad-spectrum technologies like high-throughput sequencing. While pathogen detection arrays have been used primarily in a research context, several groups are aggressively working to develop arrays for clinical diagnostics, food safety testing, environmental monitoring and biodefense. Statistical algorithms that can analyze data from microbial detection arrays and provide easily interpretable results are absolutely required in order for these efforts to succeed. In this article, we will review the most promising array designs and analysis algorithms that have been developed to date, comparing their strengths and weaknesses for pathogen detection and discovery.
microarrays; pathogens; genomics
In this review, we discuss the latest targeted enrichment methods and aspects of their utilization along with second-generation sequencing for complex genome analysis. In doing so, we provide an overview of issues involved in detecting genetic variation, for which targeted enrichment has become a powerful tool. We explain how targeted enrichment for next-generation sequencing has made great progress in terms of methodology, ease of use and applicability, but emphasize the remaining challenges such as the lack of even coverage across targeted regions. Costs are also considered versus the alternative of whole-genome sequencing which is becoming ever more affordable. We conclude that targeted enrichment is likely to be the most economical option for many years to come in a range of settings.
targeted enrichment; next-generation sequencing; genome partitioning; exome; genetic variation
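The uneven coverage mentioned in the abstract above can be summarized with a simple uniformity score. The following is a minimal sketch, not a method taken from the review: the function name `coverage_uniformity`, the 0.2 × mean threshold and the toy depth values are all illustrative assumptions.

```python
from statistics import mean


def coverage_uniformity(depths, fraction_of_mean=0.2):
    """Fraction of targeted bases covered at >= fraction_of_mean * mean depth.

    `depths` is a list of per-base read depths across the targeted regions.
    The 0.2 default is an assumed, illustrative threshold.
    """
    if not depths:
        raise ValueError("depths must not be empty")
    threshold = fraction_of_mean * mean(depths)
    return sum(d >= threshold for d in depths) / len(depths)


# Hypothetical per-base depths for two small target sets.
even_target = [100, 100, 100, 100]
uneven_target = [0, 10, 190, 200]

print(coverage_uniformity(even_target))
print(coverage_uniformity(uneven_target))
```

Both toy targets have the same mean depth, but only the first is evenly covered, which is the distinction such a score is meant to capture.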
Schizophrenia (SZ) is a complex disorder resulting from both genetic and environmental causes, with a worldwide lifetime prevalence of 1%; however, there are no specific, sensitive and validated biomarkers for SZ. A general unifying hypothesis has been put forward that disease-associated single nucleotide polymorphisms (SNPs) from genome-wide association studies (GWAS) are more likely to be associated with gene expression quantitative trait loci (eQTLs). We describe this hypothesis and review the primary methodology, with refinements, for testing this paradigmatic approach in SZ. We describe biomarker studies of SZ and tests for enrichment of SNPs that are associated both with eQTLs and with existing GWAS of SZ. SZ-associated SNPs that overlap with eQTLs can be placed into gene–gene expression, protein–protein and protein–DNA interaction networks. Further, those networks can be tested by reducing or silencing the expression of critical nodes. We present pilot data to support these methods of investigation, such as the use of eQTLs to annotate GWAS of SZ, which could be applied to the field of biomarker discovery. Networks that are associated with SNP markers, especially those with cis-regulated expression, might lead to a clearer understanding of important candidate genes that predispose to disease and alter expression. This method has general application to many complex disorders.
expression quantitative trait loci; cis-regulatory SNPs; GWAS; gene expression; lymphoblastoid cell lines
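Testing whether GWAS-associated SNPs are enriched for eQTLs, as described in the abstract above, can be framed as a one-sided hypergeometric test. This is a minimal sketch under that assumption, not the authors' procedure. The function name and all counts are hypothetical, and a real analysis would also need to account for linkage disequilibrium and allele-frequency matching.

```python
from math import comb


def hypergeom_enrichment_p(total_snps, eqtl_snps, gwas_snps, overlap):
    """One-sided P(X >= overlap) under the hypergeometric null.

    Null model: the `gwas_snps` associated SNPs are drawn without replacement
    from `total_snps` tested SNPs, of which `eqtl_snps` are eQTLs.
    """
    max_overlap = min(eqtl_snps, gwas_snps)
    denominator = comb(total_snps, gwas_snps)
    tail = sum(
        comb(eqtl_snps, k) * comb(total_snps - eqtl_snps, gwas_snps - k)
        for k in range(overlap, max_overlap + 1)
    )
    return tail / denominator


# Hypothetical counts: 10 tested SNPs, 5 eQTLs, 5 GWAS hits, all 5 overlapping.
print(hypergeom_enrichment_p(10, 5, 5, 5))
```

A small p-value indicates more overlap between GWAS hits and eQTLs than expected by chance under this simplified null.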
The systematic investigation of the phenotypes associated with genotypes in model organisms holds the promise of revealing genotype–phenotype relations directly and without additional, intermediate inferences. Large-scale projects are now underway to catalog the complete phenome of a species, notably the mouse. With the increasing amount of phenotype information becoming available, a major challenge that biology faces today is the systematic analysis of this information and the translation of research results across species and into an improved understanding of human disease. The challenge is to integrate and combine phenotype descriptions within a species and to systematically relate them to phenotype descriptions in other species, in order to form a comprehensive understanding of the relations between those phenotypes and the genotypes involved in human disease. We distinguish between two major approaches for comparative phenotype analyses: the first relies on evolutionary relations to bridge the species gap, while the other approach compares phenotypes directly. In particular, the direct comparison of phenotypes relies heavily on the quality and coherence of phenotype and disease databases. We discuss major achievements and future challenges for these databases in light of their potential to contribute to the understanding of the molecular mechanisms underlying human disease. In particular, we discuss how the use of ontologies and automated reasoning can significantly contribute to the analysis of phenotypes and demonstrate their potential for enabling translational research.
phenotype; animal model; disease; database; comparative phenomics; ontology
With the growing number of microRNAs (miRNAs) being identified each year, more innovative molecular tools are required to efficiently characterize these small RNAs in living animal systems. Caenorhabditis elegans is a powerful model to study how miRNAs regulate gene expression and control diverse biological processes during development and in the adult. Genetic strategies such as large-scale miRNA deletion studies in nematodes have been used with limited success since the majority of miRNA genes do not exhibit phenotypes when individually mutated. Recent work has indicated that miRNAs function in complex regulatory networks with other small RNAs and protein-coding genes, and therefore the challenge will be to uncover these functional redundancies. The use of miRNA inhibitors such as synthetic antisense 2′-O-methyl oligoribonucleotides is emerging as a promising in vivo approach to dissect out the intricacies of miRNA regulation.
microRNA; Caenorhabditis elegans; miRNA inhibitors; antisense 2′-O-methyl oligoribonucleotides
Drosophila melanogaster has a long history as a model organism with several unique features that make it an ideal research tool for the study of the relationship between genotype and phenotype. Importantly, fundamental genetic principles as well as key human disease genes have been uncovered through the use of Drosophila. The contribution of the fruit fly to science and medicine continues in the postgenomic era, as cell-based Drosophila RNAi screens are a cost-effective and scalable enabling technology that can be used to quantify the contribution of different genes to diverse cellular processes. Drosophila high-throughput screens can also be used as an integral part of systems-level approaches to describe the architecture and dynamics of cellular networks.
RNAi; cell-based screens; Drosophila melanogaster; genetic interactions; systems biology
Morpholino oligonucleotides (MOs) are an effective, gene-specific antisense knockdown technology used in many model systems. Here we describe the application of MOs in zebrafish (Danio rerio) for in vivo functional characterization of gene activity. We summarize our screening experience beginning with gene target selection. We then discuss screening parameter considerations and data and database management. Finally, we emphasize the importance of off-target effect management and thorough downstream phenotypic validation. We discuss current morpholino limitations, including reduced stability when stored in aqueous solution. Advances in MO technology now provide a measure of spatiotemporal control over MO activity, presenting the opportunity for incorporating more finely tuned analyses into MO-based screening. Therefore, with careful management, MOs remain a valuable tool for discovery screening as well as individual gene knockdown analysis.
morpholinos; zebrafish; knockdown
More than 70 years after the first ex situ genebanks were established, major efforts in this field are still concerned with completing individual collections and securing their storage. Attempts to valorize ex situ collections for plant breeders have been hampered by the limited availability of phenotypic and genotypic information. With the advent of molecular marker technologies, first efforts were made to fingerprint genebank accessions, albeit on a very small scale and mostly based on inadequate DNA marker systems. Advances in DNA sequencing technology and the development of high-throughput systems for multiparallel interrogation of thousands of single nucleotide polymorphisms (SNPs) now provide a suite of technological platforms that facilitate the analysis of several hundred gigabases per day or the simultaneous genotyping of thousands of SNPs. The present review summarizes recent developments regarding the deployment of these technologies for the analysis of plant genetic resources, in order to identify patterns of genetic diversity, map quantitative traits and mine novel alleles from the vast amount of genetic resources maintained in genebanks around the world. It also refers to the various shortcomings and bottlenecks that need to be overcome to leverage the full potential of high-throughput DNA analysis for the targeted utilization of plant genetic resources.
genetic resources; next-generation sequencing; SNP; allele mining; genetic diversity; association analysis
Enhancers, silencers and insulators are DNA elements that play central roles in genome regulation and are crucial for development and differentiation. In metazoans, these elements are often separated from target genes by distances that can reach 100 kb. How regulation can be accomplished over long distances has long been intriguing. Current data indicate that although the mechanisms by which these diverse regulatory elements affect gene transcription may vary, an underlying feature is the establishment of close contacts or chromatin loops. With the generalization of this principle, new questions emerge, such as how the close contacts are formed and stabilized and, importantly, how they contribute to the regulation of transcriptional output at target genes. This review will concentrate on examples where a functional role and a mechanistic understanding have been explored for loops formed between genes and their regulatory elements or among the elements themselves.
long range interactions; chromosome conformation capture; enhancers; silencers; insulators
The eukaryotic cell nucleus displays a high degree of spatial organization, with discrete functional subcompartments that provide microenvironments where specialized processes take place. Concordantly, the genome also adopts defined conformations that, in part, enable specific genomic regions to interface with these functional centers. Yet the roles of many subcompartments and the genomic regions that contact them have not been explored fully. More fundamentally, it is not entirely clear how genome organization impacts function, and vice versa. The past decade has witnessed the development of a new breed of methods that are capable of assessing the spatial organization of the genome. These stand to further our understanding of the relationship between genome structure and function, and potentially assign function to various nuclear subcompartments. Here, we review the principal techniques used for analyzing genomic interactions and the functional insights they have afforded, and discuss the outlook for future advances in nuclear structure and function dynamics.
genome organization; nuclear structure and function; chromosome conformation capture; next generation sequencing
Pluripotent embryonic stem (ES) cells are specialized cells with a dynamic chromatin structure, which is intimately connected with their pluripotency and physiology. In recent years, somatic cells have been reprogrammed to a pluripotent state through over-expression of a defined set of transcription factors. These cells, known as induced pluripotent stem (iPS) cells, recapitulate ES cell properties and can be differentiated to apparently all cell lineages, making iPS cells a suitable replacement for ES cells in future regenerative medicine. Chromatin modifiers play a key role in establishing and maintaining pluripotency; elucidating the mechanisms controlling chromatin structure in both ES and iPS cells is therefore of utmost importance to understanding their properties and harnessing their therapeutic potential. In this review, we discuss recent studies that provide a genome-wide view of the chromatin structure signature in ES cells and iPS cells and that highlight the central role of histone modifiers and chromatin remodelers in pluripotency maintenance and induction.
embryonic stem cells; induced pluripotent stem cells; reprogramming; epigenetics; chromatin structure; differentiation
The mechanisms regulating the coordinate activation of tens of thousands of replication origins in multicellular organisms remain poorly explored. Recent advances in genomics have provided valuable information about the sites at which DNA replication is initiated and the selection mechanisms of specific sites in both yeast and vertebrates. Studies in yeast have advanced to the point that it is now possible to develop convincing models for origin selection. A general model has emerged, but yeast data have also revealed an unsuspected diversity of strategies for origin positioning. We focus here on the ways in which chromatin structure may affect the formation of pre-replication complexes, a prerequisite for origin activation. We also discuss the need to exercise caution when trying to extrapolate yeast models directly to more complex vertebrate genomes.
DNA replication origin; nucleosome positioning; chromatin structure; transcription factors; genome-wide studies
Chromatin modifications at both histones and DNA are critical for regulating gene expression. Mis-regulation of such epigenetic marks can lead to pathological states; indeed, cancer affecting the hematopoietic system is frequently linked to epigenetic abnormalities. Here, we discuss the different types of modifications and their general impact on transcription, as well as the polycomb group of proteins, which effect transcriptional repression and are often mis-regulated. Further, we discuss how chromosomal translocations leading to fusion proteins can aberrantly regulate gene transcription through chromatin modifications within the hematopoietic system. PML–RARa, AML1–ETO and MLL-fusions are examples of fusion proteins that mis-regulate epigenetic modifications (either directly or indirectly), which can lead to acute myeloblastic leukemia (AML). An in-depth understanding of the mechanisms behind the mis-regulation of epigenetic modifications that lead to the development and progression of AMLs could be critical for designing effective treatments.
chromatin; epigenetics; transcription; leukemia; polycomb
Structural variations are widespread in the human genome and can serve as genetic markers in clinical and evolutionary studies. With advances in next-generation sequencing technology, recent methods allow identification of structural variations with unprecedented resolution and accuracy. They also provide opportunities to discover variants that could not be detected on conventional microarray-based platforms, such as dosage-invariant chromosomal translocations and inversions. In this review, we describe some of the sequencing-based algorithms for detection of structural variations and discuss the key issues in future development.
copy number variations; paired-end sequencing; chromosomal alterations; translocations; indels
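One family of sequencing-based approaches to structural variation flags read pairs whose mapped distance or orientation disagrees with the library design. The sketch below illustrates that idea only; it is not an algorithm taken from the review. It assumes a forward/reverse paired-end library, and the `ReadPair` type, the insert-size parameters and the category labels are hypothetical. The distance between read start positions stands in for the true insert size.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ReadPair:
    chrom1: str
    pos1: int
    strand1: str  # '+' or '-'
    chrom2: str
    pos2: int
    strand2: str  # '+' or '-'


def classify_pair(pair, mean_insert=300, sd_insert=30, n_sd=3):
    """Label a mapped read pair with the candidate event its signature suggests."""
    if pair.chrom1 != pair.chrom2:
        return "translocation_candidate"
    # Order the two mates by mapped position on the shared chromosome.
    (left_pos, left_strand), (right_pos, right_strand) = sorted(
        [(pair.pos1, pair.strand1), (pair.pos2, pair.strand2)]
    )
    if left_strand == right_strand:
        return "inversion_candidate"
    if left_strand == "-" and right_strand == "+":
        # Mates point away from each other (everted pair).
        return "tandem_duplication_candidate"
    span = right_pos - left_pos
    if span > mean_insert + n_sd * sd_insert:
        return "deletion_candidate"
    if span < mean_insert - n_sd * sd_insert:
        return "insertion_candidate"
    return "concordant"


print(classify_pair(ReadPair("chr1", 1000, "+", "chr1", 5000, "-")))
```

In practice a single discordant pair is weak evidence. Callers built on this signature typically cluster many pairs that support the same breakpoint before reporting a variant.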
Next generation sequencing has brought epigenomic studies to the forefront of current research. The power of massively parallel sequencing coupled to innovative molecular and computational techniques has allowed researchers to profile the epigenome at resolutions that were unimaginable only a few years ago. With early proof-of-concept studies published, the field is now moving into the next phase, where method standardization and rigorous quality control are becoming paramount. In this review, we describe methodologies that have been developed to profile the epigenome using next generation sequencing platforms. We discuss these in terms of library preparation, sequencing platforms and analysis techniques.
epigenomics; next generation sequencing
The epigenome plays a pivotal role as the interface between genome and environment. True genome-wide assessments of epigenetic marks, such as DNA methylation (methylomes) or chromatin modifications (chromatinomes), are now possible, either through high-throughput arrays or, increasingly, by second-generation DNA sequencing methods. The ability to collect data at this level of resolution enables us to pose detailed questions about changes that occur due to development, lineage and tissue specificity and, significantly, those caused by environmental influences such as ageing, stress, diet, hormones or toxins. Common complex traits are under variable levels of genetic influence and, additionally, epigenetic effect. The detection of pathological epigenetic alterations will reveal additional insights into their aetiology and how possible environmental modulation of this mechanism may occur. Due to the reversibility of these marks, the potential for sequence-specific targeted therapeutics exists. This review surveys recent epigenomic advances and their current and prospective application to the study of common diseases.
Genomics; epigenetics; epigenomics; common disease; complex traits; gene environment interaction
Gene Set Enrichment (GSE) is a computational technique that determines whether an a priori defined set of genes shows statistically significant differential expression between two phenotypes. Currently, the gene sets used for GSE are derived from annotation or pathway databases, which often contain computationally based and unrepresentative data. Here, we propose a novel approach for the generation of comprehensive and biologically derived gene sets, deriving sets through the application of machine learning techniques to gene expression data. These gene sets can be produced for specific tissues, developmental stages or environments. They provide a powerful and functionally meaningful way in which to mine genome-wide association and next generation sequencing data in order to identify disease-associated variants and pathways.
gene set enrichment; annotation database; gene expression data; machine learning; next generation sequencing
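The basic question behind GSE, whether a predefined gene set is differentially expressed between two phenotypes, can be illustrated with an exact label-permutation test. This is a simplified sketch: it is not the running-sum statistic used by established GSE tools, nor the machine learning approach proposed in the abstract above. The scoring function, function names and toy expression values are all hypothetical.

```python
from itertools import combinations
from statistics import mean


def gene_set_score(expr, gene_set, group_a, group_b):
    """Mean absolute difference in group means across the genes in the set."""
    return mean(
        abs(mean(expr[g][i] for i in group_a) - mean(expr[g][i] for i in group_b))
        for g in gene_set
    )


def exact_permutation_p(expr, gene_set, group_a, group_b):
    """P-value from enumerating every relabelling of samples into two groups."""
    samples = sorted(set(group_a) | set(group_b))
    observed = gene_set_score(expr, gene_set, sorted(group_a), sorted(group_b))
    n_total = 0
    n_extreme = 0
    for relabelled_a in combinations(samples, len(group_a)):
        relabelled_b = [s for s in samples if s not in relabelled_a]
        score = gene_set_score(expr, gene_set, relabelled_a, relabelled_b)
        n_total += 1
        n_extreme += score >= observed
    return n_extreme / n_total


# Hypothetical expression values: samples 0-3 are phenotype A, 4-7 phenotype B.
expr = {
    "g1": [1.0, 1.1, 0.9, 1.2, 5.0, 5.2, 4.8, 5.1],
    "g2": [2.0, 2.1, 1.9, 2.2, 6.0, 6.1, 5.9, 6.2],
}
print(exact_permutation_p(expr, ["g1", "g2"], [0, 1, 2, 3], [4, 5, 6, 7]))
```

Exhaustive enumeration is only feasible for small cohorts. With more samples, one would typically draw random relabellings instead.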