Drosophilists have identified many, or perhaps most, of the key regulatory genes determining sex using classical genetics. However, regulatory genes must ultimately result in the deployment of the genome in a quantitative manner, replete with complex interactions with other regulatory pathways. In the last decade, genomics has provided a rich picture of the transcriptional profile of the sexes that underlies sexual dimorphism. The current challenge is linking transcriptional profiles with the regulatory genes. This will be a complex synthesis, but the prospects for progress are outstanding.
Drosophila; transcriptome; sex determination; effector; selector
Researchers have now had access to the fully sequenced Drosophila melanogaster genome for over a decade, and the sequenced genomes of 11 additional Drosophila species have been available for almost 5 years, with more species’ genomes becoming available every year [Adams MD, Celniker SE, Holt RA, et al. The genome sequence of Drosophila melanogaster. Science 2000;287:2185–95; Clark AG, Eisen MB, Smith DR, et al. Evolution of genes and genomes on the Drosophila phylogeny. Nature 2007;450:203–18]. Although the best studied of the D. melanogaster transcription factors (TFs) were cloned before sequencing of the genome, the availability of sequence data promised to transform our understanding of TFs and gene regulatory networks. Sequenced genomes have allowed researchers to generate tools for high-throughput characterization of gene expression levels, genome-wide TF localization and analyses of evolutionary constraints on DNA elements across multiple species. With an estimated 700 DNA-binding proteins in the Drosophila genome, it will be many years before each potential sequence-specific TF is studied in detail, yet the last decade of functional genomics research has already impacted our view of gene regulatory networks and TF DNA recognition.
Drosophila; transcription factor; genomics; enhancer; Zelda
Plasmodium falciparum is an obligate intracellular parasite and the leading cause of severe malaria, responsible for tremendous morbidity and mortality, particularly in sub-Saharan Africa. Successful completion of the P. falciparum genome sequencing project in 2002 provided a comprehensive foundation for functional genomic studies on this pathogen in the following decade. Over this period, a large spectrum of experimental approaches has been deployed to improve and expand the scope of functionally annotated genes. Meanwhile, rapidly evolving methods of systems biology have also begun to contribute to a more global understanding of various aspects of the biology and pathogenesis of malaria. Herein we provide an overview of metabolic modelling, which has the capability to integrate information from functional genomics studies in P. falciparum and guide future malaria research efforts towards the identification of novel candidate drug targets.
Plasmodium falciparum; central carbon metabolism; systems biology; flux-balance analysis; constraint-based modelling; in silico gene essentiality
Methylation of histone H3 at lysine 4 (H3K4) is a conserved feature of active chromatin catalyzed by methyltransferases of the SET1 family (SET1A, SET1B, MLL1, MLL2, MLL3 and MLL4 in humans). These enzymes participate in diverse gene regulatory networks with a multitude of known biological functions, including direct involvement in several human disease states. Unlike most lysine methyltransferases, SET1-family enzymes are only fully active in the context of a multi-subunit complex, which includes a protein module composed of WDR5, RbBP5, ASH2L and DPY-30 (WRAD). These proteins bind in close proximity to the catalytic SET domain of SET1-family enzymes and stimulate H3K4 methyltransferase activity. The mechanism by which WRAD promotes catalysis involves elements of allosteric control and possibly the utilization of a second H3K4 methyltransferase active site present within WRAD itself. WRAD components also engage in physical interactions that recruit SET1-family proteins to target sites on chromatin. Here, the known molecular mechanisms through which WRAD enables the function of SET1-related enzymes will be reviewed.
SET1; MLL; WDR5; RbBP5; ASH2L; DPY-30
Epigenetic genome marking and chromatin regulation are central to establishing tissue-specific gene expression programs, and hence to several biological processes. Until recently, the only known epigenetic mark on DNA in mammals was 5-methylcytosine, established and propagated by DNA methyltransferases and generally associated with gene repression. Suddenly, a host of new actors has appeared on the scene, namely novel cytosine modifications and the ten-eleven translocation (TET) enzymes, sparking great interest. The challenge is now to uncover the roles they play and how they relate to DNA demethylation. Knowledge is accumulating at a frantic pace, linking these new players to essential biological processes (e.g. cell pluripotency and development) and also to carcinogenesis. Here, we review the recent progress in this exciting field, highlighting the TET enzymes as epigenetic DNA modifiers, their physiological roles, and their functions in health and disease. We also discuss the need to find relevant TET interactants and the newly discovered TET–O-linked N-acetylglucosamine transferase (OGT) pathway.
epigenetics; DNA methylation; hydroxymethylation; TET proteins; OGT
There is an increasing availability of complete or draft genome sequences for microbial organisms. These data form a potentially valuable resource for genotype–phenotype association and gene function prediction, provided that phenotypes are consistently annotated for all the sequenced strains. In this review, we address the requirements for successful gene-trait matching. We outline a basic protocol for microbial functional genomics, including genome assembly, annotation of genotypes (including single nucleotide polymorphisms, orthologous groups and prophages), data pre-processing, genotype–phenotype association, visualization and interpretation of results. The methodologies for association described herein can be applied to other data types, opening up possibilities to analyze transcriptome–phenotype associations, and correlate microbial population structure or activity, as measured by metagenomics, to environmental parameters.
genotype–phenotype association; genome-wide association studies; functional genomics; microbial genomics; random forest
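The gene-trait matching step named in the abstract above can be illustrated with a deliberately naive sketch. It scores each gene (or orthologous group) by how often its presence or absence agrees with a binary phenotype across strains. The function name, the data layout and the example strains and genes are all hypothetical; this is not the protocol or the random forest approach the review describes, only a minimal concordance score under those assumptions.

```python
def rank_genes_by_trait_match(presence, phenotype):
    """Rank genes by agreement between gene presence and a binary phenotype.

    presence:  dict mapping gene name -> set of strains that carry the gene
    phenotype: dict mapping strain name -> True/False for the trait
    Returns a list of (gene, concordance) sorted from best to worst match,
    where concordance is the fraction of strains in which
    "gene present" equals "trait present".
    """
    strains = sorted(phenotype)
    if not strains:
        raise ValueError("phenotype must contain at least one strain")
    ranked = []
    for gene, carriers in presence.items():
        # Count strains where presence/absence agrees with the phenotype.
        matches = sum((strain in carriers) == phenotype[strain] for strain in strains)
        ranked.append((gene, matches / len(strains)))
    # Highest concordance first; ties are broken alphabetically by gene name.
    ranked.sort(key=lambda item: (-item[1], item[0]))
    return ranked


# Hypothetical toy data: four strains, two of which show the trait.
example_phenotype = {"strainA": True, "strainB": True, "strainC": False, "strainD": False}
example_presence = {
    "geneX": {"strainA", "strainB"},  # present exactly in the trait-positive strains
    "geneY": {"strainA", "strainC"},  # uninformative
    "geneZ": {"strainC", "strainD"},  # present exactly in the trait-negative strains
}
example_ranking = rank_genes_by_trait_match(example_presence, example_phenotype)
```

A concordance near 0 is as informative as one near 1, since it marks genes whose absence tracks the trait. A real analysis would also need to correct for population structure and multiple testing, which this sketch ignores.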
Epigenetic memory represents a natural mechanism whereby the identity of a cell is maintained through successive cell cycles, allowing the specification and maintenance of differentiation during development and in adult cells. Cancer is a loss or reversal of the stable differentiated state of adult cells and may be mediated in part by epigenetic changes. The identity of somatic cells can also be reversed experimentally by nuclear reprogramming. Nuclear reprogramming experiments reveal the mechanisms required to activate embryonic gene expression in adult cells and thus provide insight into the reversal of epigenetic memory. In this article, we will introduce epigenetic memory and the mechanisms by which it may operate. We limit our discussion primarily to the context of nuclear reprogramming and briefly discuss the relevance of memory and reprogramming to cancer biology.
epigenetic memory; nuclear reprogramming; cancer; histone modifications; DNA methylation
Immune systems evolve as essential strategies to maintain homeostasis with the environment, prevent microbial assault and recycle damaged host tissues. The immune system is composed of two components, innate and adaptive immunity. The former is common to all animals while the latter consists of a vertebrate-specific system that relies on somatically derived lymphocytes and is associated with near limitless genetic diversity as well as long-term memory. Deuterostome invertebrates provide a view of immune repertoires in phyla that immediately predate the origins of vertebrates. Genomic studies in amphioxus, a cephalochordate, have revealed homologs of genes encoding most innate immune receptors found in vertebrates; however, many of the gene families have undergone dramatic expansions, greatly increasing the innate immune repertoire. In addition, domain-swapping accounts for the innovation of new predicted pathways of receptor function. In both amphioxus and Ciona, a urochordate, the VCBPs (variable region containing chitin-binding proteins), which consist of immunoglobulin V (variable) and chitin-binding domains, mediate recognition through the V domains. The V domains of VCBPs in amphioxus exhibit high levels of allelic complexity that presumably relate to functional specificity. Various features of the amphioxus immune repertoire reflect novel selective pressures, which likely have resulted in innovative strategies. Functional genomic studies underscore the value of amphioxus as a model for studying innate immunity and may help reveal how unique relationships between innate immune receptors and both pathogens and symbionts factored in the evolution of adaptive immune systems.
innate immunity; Toll-like receptors; expanded immune repertoire; allelic complexity; gut immunity
The involvement of epigenetic processes in the origin and progression of cancer is now widely appreciated. Consequently, targeting the enzymatic machinery that controls the epigenetic regulation of the genome has emerged as an attractive new strategy for therapeutic intervention. The development of epigenetic drugs requires a detailed knowledge of the processes that govern chromatin regulation. In recent years, mass spectrometry (MS) has become an indispensable tool in epigenetics research. In this review, we will give an overview of the applications of MS-based proteomics in studying various aspects of chromatin biology. We will focus on the use of MS in the discovery and mapping of histone modifications and how novel proteomic approaches are being utilized to identify and study chromatin-associated proteins and multi-subunit complexes. Finally, we will discuss the application of proteomic methods in the diagnosis and prognosis of cancer based on epigenetic biomarkers and comment on their future impact on cancer epigenetics.
Mass spectrometry; quantitative proteomics; chromatin; histone modifications; epigenetic readers; cancer epigenetics
Carcinogenesis is thought to occur through a combination of mutational and epimutational events that disrupt key pathways regulating cellular growth and division. The DNA methylomes of cancer cells can exhibit two striking differences from normal cells: a global reduction of DNA methylation levels and the aberrant hypermethylation of some sequences, particularly CpG islands (CGIs). This aberrant hypermethylation is often invoked as a mechanism causing the transcriptional inactivation of tumour suppressor genes that directly drives the carcinogenic process. Here, we review our current understanding of this phenomenon, focusing on how global analysis of cancer methylomes indicates that most affected CGI genes are already silenced prior to aberrant hypermethylation during cancer development. We also discuss how genome-scale analyses of both normal and cancer cells have refined our understanding of the elusive mechanism(s) that may underpin aberrant CGI hypermethylation.
epigenetics; epigenomics; cancer; DNA methylation; CpG islands
The p53 transcription factor regulates the synthesis of mRNAs encoding proteins involved in diverse cellular stress responses such as cell-cycle arrest, apoptosis, autophagy and senescence. In this review, we discuss how these mRNAs are concurrently regulated at the post-transcriptional level by microRNAs (miRNAs) and RNA-binding proteins (RBPs), which consequently modify the p53 transcriptional program in a cell type- and stimulus-specific manner. We also discuss the action of specific miRNAs and RBPs that are direct transcriptional targets of p53 and how they act coordinately with protein-coding p53 target genes to orchestrate p53-dependent cellular responses.
p53; post-transcriptional regulation; RNA-binding proteins; miRNA
The regulation of mRNA translation is a major checkpoint in the flux of information from the transcriptome to the proteome. Critical for translational control are the trans-acting factors, RNA-binding proteins (RBPs) and small RNAs that bind to the mRNA and modify its translatability. This review summarizes the mechanisms by which RBPs regulate mRNA translation, with special focus on those binding to the 3′-untranslated region. It also discusses how recent high-throughput technologies are revealing exquisite layers of complexity and are helping to untangle translational regulation at a genome-wide scale.
RNA-binding protein; translation; UTR; RNP; CLIP; ribosome profiling
The mitogen-activated protein kinase kinases (the MAPK/ERK kinases; MKKs or MEKs) and their downstream substrates, the extracellular signal-regulated kinases (ERKs), have been intensively studied for their roles in development and disease. Until recently, it had been assumed that any mutation affecting their function would have lethal consequences. However, the identification of MEK1 and MEK2 mutations in developmental syndromes as well as chemotherapy-resistant tumors, and the discovery of genomic variants in MEK1 and MEK2, have led to the realization that the extent of genomic variation associated with MEKs is much greater than had been appreciated. In this review, we will discuss these recent advances, relating them to what is currently understood about the structure and function of MEKs, and describe how they change our understanding of the role of MEKs in development and disease.
MEK; MAPK; ERK; cardio-facio-cutaneous syndrome; cancer; SNP
Piwi-interacting RNAs (piRNAs) and CRISPR RNAs (crRNAs) are two recently discovered classes of small noncoding RNA that are found in animals and prokaryotes, respectively. Both of these novel RNA species function as components of adaptive immune systems that protect their hosts from foreign nucleic acids: piRNAs repress transposable elements in animal germlines, whereas crRNAs protect their bacterial hosts from phage and plasmids. The piRNA and CRISPR systems are not homologous; rather, they have independently evolved into logically similar defense mechanisms based on the specificity of targeting via nucleic acid base complementarity. Here we review what is known about the piRNA and CRISPR systems with a focus on comparing their evolutionary properties. In particular, we highlight the influence of several factors on the pattern of piRNA and CRISPR evolution, including the population genetic environment, the role of alternate defense systems and the mechanisms of acquisition of new piRNAs and CRISPRs.
piRNA; CRISPR; co-evolution; transposable elements; phage; plasmids
One of the main motivations to study amphioxus is its potential for understanding the last common ancestor of chordates, which notably gave rise to the vertebrates. An important feature in this respect is the slow evolutionary rate that seems to have characterized the cephalochordate lineage, making amphioxus an interesting proxy for the chordate ancestor, as well as a key lineage to include in comparative studies. Whereas slow evolution was first noticed at the phenotypic level, it has also been described at the genomic level. Here, we examine whether the amphioxus genome is indeed a good proxy for the genome of the chordate ancestor, with a focus on protein-coding genes. We investigate genome features, such as synteny, gene duplication and gene loss, and contrast the amphioxus genome with those of other deuterostomes that are used in comparative studies, such as Ciona, Oikopleura and urchin.
deuterostomes; evolutionary rates; gene duplication; gene loss; orthology; synteny
DNA microarrays have emerged as a viable platform for detection of pathogenic organisms in clinical and environmental samples. These microbial detection arrays occupy a middle ground between low-cost, narrowly focused assays such as multiplex PCR and more expensive, broad-spectrum technologies like high-throughput sequencing. While pathogen detection arrays have been used primarily in a research context, several groups are aggressively working to develop arrays for clinical diagnostics, food safety testing, environmental monitoring and biodefense. Statistical algorithms that can analyze data from microbial detection arrays and provide easily interpretable results are essential for these efforts to succeed. In this article, we will review the most promising array designs and analysis algorithms that have been developed to date, comparing their strengths and weaknesses for pathogen detection and discovery.
microarrays; pathogens; genomics
In this review, we discuss the latest targeted enrichment methods and aspects of their utilization along with second-generation sequencing for complex genome analysis. In doing so, we provide an overview of issues involved in detecting genetic variation, for which targeted enrichment has become a powerful tool. We explain how targeted enrichment for next-generation sequencing has made great progress in terms of methodology, ease of use and applicability, but emphasize the remaining challenges such as the lack of even coverage across targeted regions. Costs are also considered versus the alternative of whole-genome sequencing which is becoming ever more affordable. We conclude that targeted enrichment is likely to be the most economical option for many years to come in a range of settings.
targeted enrichment; next-generation sequencing; genome partitioning; exome; genetic variation
Schizophrenia (SZ) is a complex disorder resulting from both genetic and environmental causes, with a worldwide lifetime prevalence of 1%; however, there are no specific, sensitive and validated biomarkers for SZ. A general unifying hypothesis has been put forward that disease-associated single nucleotide polymorphisms (SNPs) from genome-wide association studies (GWAS) are more likely to be associated with gene expression quantitative trait loci (eQTL). We will describe this hypothesis and review primary methodology with refinements for testing this paradigmatic approach in SZ. We will describe biomarker studies of SZ and testing enrichment of SNPs that are associated both with eQTLs and existing GWAS of SZ. SZ-associated SNPs that overlap with eQTLs can be placed into gene–gene expression, protein–protein and protein–DNA interaction networks. Further, those networks can be tested by reducing/silencing the gene expression levels of critical nodes. We present pilot data to support these methods of investigation such as the use of eQTLs to annotate GWASs of SZ, which could be applied to the field of biomarker discovery. Those networks that have association with SNP markers, especially cis-regulated expression, might lead to a clearer understanding of important candidate genes that predispose to disease and alter expression. This method has general application to many complex disorders.
expression quantitative trait loci; cis-regulatory SNPs; GWAS; gene expression; lymphoblastoid cell lines
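One common way to test whether GWAS-associated SNPs are enriched for eQTLs, as the abstract above proposes, is a hypergeometric upper-tail test. The sketch below is illustrative only: the function names and the counts in the example are hypothetical, and it assumes SNPs are independent draws, which ignores linkage disequilibrium and allele-frequency matching that a real analysis would have to handle. It is not the authors' stated method.

```python
from math import comb


def eqtl_enrichment_p_value(total_snps, eqtl_snps, gwas_snps, overlap):
    """Upper-tail hypergeometric p-value P(X >= overlap).

    total_snps: number of SNPs tested (the background set)
    eqtl_snps:  number of background SNPs that are eQTLs
    gwas_snps:  number of background SNPs associated with the trait
    overlap:    number of SNPs that are both eQTLs and trait-associated
    """
    if min(total_snps, eqtl_snps, gwas_snps, overlap) < 0:
        raise ValueError("counts must be non-negative")
    if max(eqtl_snps, gwas_snps) > total_snps or overlap > min(eqtl_snps, gwas_snps):
        raise ValueError("counts are inconsistent with each other")
    denominator = comb(total_snps, gwas_snps)
    largest_overlap = min(eqtl_snps, gwas_snps)
    # Sum the probability of every overlap at least as large as the observed one.
    numerator = sum(
        comb(eqtl_snps, k) * comb(total_snps - eqtl_snps, gwas_snps - k)
        for k in range(overlap, largest_overlap + 1)
    )
    return numerator / denominator


def fold_enrichment(total_snps, eqtl_snps, gwas_snps, overlap):
    """Observed overlap divided by the overlap expected by chance."""
    expected = eqtl_snps * gwas_snps / total_snps
    if expected == 0:
        raise ValueError("expected overlap is zero; fold enrichment is undefined")
    return overlap / expected


# Hypothetical counts, chosen only to make the arithmetic easy to follow.
p_value = eqtl_enrichment_p_value(total_snps=10, eqtl_snps=5, gwas_snps=5, overlap=5)
fold = fold_enrichment(total_snps=10, eqtl_snps=5, gwas_snps=5, overlap=5)
```

Exact integer binomials keep the sketch dependency-free; for genome-scale counts, a statistics library or a permutation scheme that matches SNPs on allele frequency and local linkage would be a more realistic choice.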
The systematic investigation of the phenotypes associated with genotypes in model organisms holds the promise of revealing genotype–phenotype relations directly and without additional, intermediate inferences. Large-scale projects are now underway to catalog the complete phenome of a species, notably the mouse. With the increasing amount of phenotype information becoming available, a major challenge that biology faces today is the systematic analysis of this information and the translation of research results across species and into an improved understanding of human disease. The challenge is to integrate and combine phenotype descriptions within a species and to systematically relate them to phenotype descriptions in other species, in order to form a comprehensive understanding of the relations between those phenotypes and the genotypes involved in human disease. We distinguish between two major approaches for comparative phenotype analyses: the first relies on evolutionary relations to bridge the species gap, while the other approach compares phenotypes directly. In particular, the direct comparison of phenotypes relies heavily on the quality and coherence of phenotype and disease databases. We discuss major achievements and future challenges for these databases in light of their potential to contribute to the understanding of the molecular mechanisms underlying human disease. In particular, we discuss how the use of ontologies and automated reasoning can significantly contribute to the analysis of phenotypes and demonstrate their potential for enabling translational research.
phenotype; animal model; disease; database; comparative phenomics; ontology
With the growing number of microRNAs (miRNAs) being identified each year, more innovative molecular tools are required to efficiently characterize these small RNAs in living animal systems. Caenorhabditis elegans is a powerful model to study how miRNAs regulate gene expression and control diverse biological processes during development and in the adult. Genetic strategies such as large-scale miRNA deletion studies in nematodes have been used with limited success since the majority of miRNA genes do not exhibit phenotypes when individually mutated. Recent work has indicated that miRNAs function in complex regulatory networks with other small RNAs and protein-coding genes, and therefore the challenge will be to uncover these functional redundancies. The use of miRNA inhibitors such as synthetic antisense 2′-O-methyl oligoribonucleotides is emerging as a promising in vivo approach to dissect out the intricacies of miRNA regulation.
microRNA; Caenorhabditis elegans; miRNA inhibitors; antisense 2′-O-methyl oligoribonucleotides
Drosophila melanogaster has a long history as a model organism with several unique features that make it an ideal research tool for the study of the relationship between genotype and phenotype. Importantly, fundamental genetic principles as well as key human disease genes have been uncovered through the use of Drosophila. The contribution of the fruit fly to science and medicine continues in the postgenomic era, as cell-based Drosophila RNAi screens are a cost-effective and scalable enabling technology that can be used to quantify the contribution of different genes to diverse cellular processes. Drosophila high-throughput screens can also be used as an integral part of systems-level approaches to describe the architecture and dynamics of cellular networks.
RNAi; cell-based screens; Drosophila melanogaster; genetic interactions; systems biology
Morpholino oligonucleotides (MOs) are an effective, gene-specific antisense knockdown technology used in many model systems. Here we describe the application of MOs in zebrafish (Danio rerio) for in vivo functional characterization of gene activity. We summarize our screening experience beginning with gene target selection. We then discuss screening parameter considerations and data and database management. Finally, we emphasize the importance of off-target effect management and thorough downstream phenotypic validation. We discuss current morpholino limitations, including reduced stability when stored in aqueous solution. Advances in MO technology now provide a measure of spatiotemporal control over MO activity, presenting the opportunity for incorporating more finely tuned analyses into MO-based screening. Therefore, with careful management, MOs remain a valuable tool for discovery screening as well as individual gene knockdown analysis.
morpholinos; zebrafish; knockdown
More than 70 years after the first ex situ genebanks were established, major efforts in this field are still concerned with issues related to further completion of individual collections and securing of their storage. Attempts regarding valorization of ex situ collections for plant breeders have been hampered by the limited availability of phenotypic and genotypic information. With the advent of molecular marker technologies, first efforts were made to fingerprint genebank accessions, albeit on a very small scale and mostly based on inadequate DNA marker systems. Advances in DNA sequencing technology and the development of high-throughput systems for multiparallel interrogation of thousands of single nucleotide polymorphisms (SNPs) now provide a suite of technological platforms that facilitate the analysis of several hundred gigabases per day using state-of-the-art sequencing technology or of thousands of SNPs simultaneously. The present review summarizes recent developments regarding the deployment of these technologies for the analysis of plant genetic resources, in order to identify patterns of genetic diversity, map quantitative traits and mine novel alleles from the vast amount of genetic resources maintained in genebanks around the world. It also refers to the various shortcomings and bottlenecks that need to be overcome to leverage the full potential of high-throughput DNA analysis for the targeted utilization of plant genetic resources.
genetic resources; next-generation sequencing; SNP; allele mining; genetic diversity; association analysis
Enhancers, silencers and insulators are DNA elements that play central roles in regulation of the genome, which is crucial for development and differentiation. In metazoans, these elements are often separated from target genes by distances that can reach 100 kb. How regulation can be accomplished over long distances has long been intriguing. Current data indicate that although the mechanisms by which these diverse regulatory elements affect gene transcription may vary, an underlying feature is the establishment of close contacts or chromatin loops. With the generalization of this principle, new questions emerge, such as how the close contacts are formed and stabilized and, importantly, how they contribute to the regulation of transcriptional output at target genes. This review will concentrate on examples where a functional role and a mechanistic understanding have been explored for loops formed between genes and their regulatory elements or among the elements themselves.
long range interactions; chromosome conformation capture; enhancers; silencers; insulators
The eukaryotic cell nucleus displays a high degree of spatial organization, with discrete functional subcompartments that provide microenvironments where specialized processes take place. Concordantly, the genome also adopts defined conformations that, in part, enable specific genomic regions to interface with these functional centers. Yet the roles of many subcompartments and the genomic regions that contact them have not been explored fully. More fundamentally, it is not entirely clear how genome organization impacts function, and vice versa. The past decade has witnessed the development of a new breed of methods that are capable of assessing the spatial organization of the genome. These stand to further our understanding of the relationship between genome structure and function, and potentially assign function to various nuclear subcompartments. Here, we review the principal techniques used for analyzing genomic interactions and the functional insights they have afforded, and discuss the outlook for future advances in nuclear structure and function dynamics.
genome organization; nuclear structure and function; chromosome conformation capture; next generation sequencing