We review different uses of retroviral mutagenesis technology as a tool to manipulate the zebrafish genome. In addition to serving as a mutagen in phenotype-driven forward mutagenesis screens, the purpose for which it was originally adapted, retroviral insertional mutagenesis can also be exploited in reverse genetic approaches, delivering enhancer- and gene-trap vectors for the purpose of examining gene expression patterns and mutagenesis, making sensitized mutants amenable to chemical and genetic modifier screens, and producing gain-of-function mutations by epigenetically overexpressing the retroviral-inserted genes. From a technology point of view, we also summarize the recent advances in the high-throughput cloning of retroviral integration sites, a pivotal step toward identifying mutations. Lastly, we point to some potential directions that retroviral mutagenesis might take from the lessons of studying other model organisms.
genetics; Moloney murine leukemia virus; enhancer traps; gene traps; linker-mediated PCR; zebrafish
Quantification of a transcriptional profile is a useful way to evaluate the activity of a cell at a given point in time. Although RNA-Seq has revolutionized transcriptional profiling, the costs of RNA-Seq are still significantly higher than microarrays, and often the depth of data delivered from RNA-Seq is in excess of what is needed for simple transcript quantification. Digital Gene Expression (DGE) is a cost-effective, sequence-based approach for simple transcript quantification: by sequencing one read per molecule of RNA, this technique can be used to efficiently count transcripts while obviating the need for transcript-length normalization and reducing the total numbers of reads necessary for accurate quantification. Here, we present trieFinder, a program specifically designed to rapidly map, parse, and annotate DGE tags of various lengths against cDNA and/or genomic sequence databases.
The trieFinder algorithm maps DGE tags in a two-step process. First, it scans FASTA files of RefSeq, UniGene, and genomic DNA sequences to create a database of all tags that can be derived from a predefined restriction site. Next, it compares the experimental DGE tags to this tag database, taking advantage of the fact that the tags are stored as a prefix tree, or “trie”, which allows exact matches to be found in time linear in the tag length. DGE tags with mismatches are analyzed by recursive calls in the data structure. We find that, in terms of alignment speed, the mapping functionality of trieFinder compares favorably with Bowtie.
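The two-step mapping described above can be illustrated with a minimal sketch. This is not the trieFinder source code: the names `TrieNode`, `derive_tags`, `insert`, `exact_match`, and `fuzzy_match`, the `CATG` restriction site, and the 6-base tag length are all assumptions chosen for illustration. The sketch builds a trie of tags found downstream of a restriction site, looks up exact matches by walking one node per base, and handles mismatches by recursing into every child while spending a mismatch budget.

```python
from typing import Dict, List, Optional, Tuple


class TrieNode:
    """One node of the prefix tree; `annotation` is set only at the end of a tag."""

    def __init__(self) -> None:
        self.children: Dict[str, "TrieNode"] = {}
        self.annotation: Optional[str] = None


def derive_tags(sequence: str, site: str = "CATG", tag_length: int = 6) -> List[str]:
    """Return every full-length tag that directly follows an occurrence of `site`.

    The site and tag length are illustrative assumptions, not trieFinder defaults.
    """
    tags = []
    start = sequence.find(site)
    while start != -1:
        begin = start + len(site)
        tag = sequence[begin:begin + tag_length]
        if len(tag) == tag_length:  # skip tags truncated by the sequence end
            tags.append(tag)
        start = sequence.find(site, start + 1)
    return tags


def insert(root: TrieNode, tag: str, annotation: str) -> None:
    """Store a tag in the trie, one node per base."""
    node = root
    for base in tag:
        node = node.children.setdefault(base, TrieNode())
    node.annotation = annotation


def exact_match(root: TrieNode, tag: str) -> Optional[str]:
    """Walk one node per base; the cost is linear in the tag length."""
    node = root
    for base in tag:
        node = node.children.get(base)
        if node is None:
            return None
    return node.annotation


def fuzzy_match(node: TrieNode, tag: str, max_mismatches: int,
                pos: int = 0, prefix: str = "") -> List[Tuple[str, str]]:
    """Recursively collect stored tags within `max_mismatches` substitutions of `tag`."""
    if pos == len(tag):
        return [(prefix, node.annotation)] if node.annotation is not None else []
    hits: List[Tuple[str, str]] = []
    for base, child in node.children.items():
        cost = 0 if base == tag[pos] else 1
        if cost <= max_mismatches:  # prune branches that exceed the budget
            hits.extend(fuzzy_match(child, tag, max_mismatches - cost,
                                    pos + 1, prefix + base))
    return hits


if __name__ == "__main__":
    # Hypothetical toy "transcripts", each containing one CATG site.
    transcripts = {"geneA": "TTCATGACGTACGGA", "geneB": "GGCATGACGTTCAAT"}
    root = TrieNode()
    for name, seq in transcripts.items():
        for tag in derive_tags(seq):
            insert(root, tag, name)
    print(exact_match(root, "ACGTAC"))
    print(fuzzy_match(root, "ACGTAA", max_mismatches=1))
```

In this sketch a tag stored under two different annotations keeps only the last one; a real annotation database would presumably need to keep all sources for a tag.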
trieFinder quickly annotates DGE tags from three sources simultaneously, simplifying transcript quantification and novel transcript detection. It delivers the data in a simple parsed format, obviating the need to post-process the alignment results. trieFinder is available at http://research.nhgri.nih.gov/software/trieFinder/.
RNA-Seq; Transcriptional profiling; DGE; SAGE
Neurosensory epithelia in the inner ear are the crucial structures for hearing and balance functions. Therefore, it is important to understand the cellular and molecular features of the epithelia, which are mainly composed of two types of cells: hair cells (HCs) and supporting cells (SCs). Here we choose to study the inner ear sensory epithelia in adult zebrafish not only because the epithelial structures are highly conserved in all vertebrates studied, but also because the adult zebrafish is able to regenerate HCs, an ability that mammals lose shortly after birth. We use the inner ear of adult zebrafish as a model system to study the mechanisms of inner ear HC regeneration in adult vertebrates, which could be helpful for clinical therapy of hearing/balance deficits in humans that result from HC loss.
Here we demonstrate how to do gross and fine dissections of inner ear sensory epithelia in adult zebrafish. The gross dissection removes the tissues surrounding the inner ear and is helpful for preparing tissue sections, which allows us to examine the detailed structure of the sensory epithelia. The fine dissection cleans up the non-sensory-epithelial tissues of each individual epithelium and enables us to examine the heterogeneity of the whole epithelium easily in whole-mount epithelial samples.
Substantial intrastrain variation at the nucleotide level complicates molecular and genetic studies in zebrafish, such as the use of CRISPRs or morpholinos to inactivate genes. In the absence of robust inbred zebrafish lines, we generated NHGRI-1, a healthy and fecund strain derived from founder parents we sequenced to a depth of ∼50×. Within this strain, we have identified the majority of the genome that matches the reference sequence and documented most of the variants. This strain has utility for many reasons, but in particular it will be useful for any researcher who needs to know the exact sequence (with all variants) of a particular genomic region or who wants to be able to robustly map sequences back to a genome with all possible variants defined.
zebrafish; SNV; genome sequence; CRISPR; variants
Retroviruses integrate into the host genome in patterns specific to each virus. Understanding the causes of these patterns can provide insight into viral integration mechanisms, pathology and genome evolution, and is critical to the development of safe gene therapy vectors. We generated murine leukemia virus integrations in human HepG2 and K562 cells and subjected them to second-generation sequencing, using a DNA barcoding technique that allowed us to quantify independent integration events. We characterized >3 700 000 unique integration events in two ENCODE-characterized cell lines. We find that integrations were most highly enriched in a subset of strong enhancers and active promoters. In both cell types, approximately half the integrations were found in <2% of the genome, demonstrating genomic influences even narrower than previously believed. The integration pattern of murine leukemia virus appears to be largely driven by regions that have high enrichment for multiple marks of active chromatin; the combination of histone marks present was sufficient to explain why some strong enhancers were more prone to integration than others. The approach we used is applicable to analyzing the integration pattern of any exogenous element and could be a valuable preclinical screen to evaluate the safety of gene therapy vectors.
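The statement that roughly half of the integrations fall in a small fraction of the genome can be made concrete with a short calculation. The following is only a sketch under stated assumptions, not the analysis pipeline used in the study: it treats the genome as a single coordinate axis, divides it into fixed-size bins, and asks how many of the densest bins are needed to cover a given share of integration sites. The function name `genome_fraction_for_share` and its parameters are hypothetical.

```python
from collections import Counter
from typing import Sequence


def genome_fraction_for_share(positions: Sequence[int], genome_length: int,
                              bin_size: int, share: float = 0.5) -> float:
    """Fraction of genome bins (densest first) needed to hold `share` of the sites.

    Assumes 0-based positions on one linear coordinate axis; a real analysis
    would handle chromosomes separately and choose bins or regions differently.
    """
    if not positions:
        raise ValueError("at least one integration position is required")
    n_bins = -(-genome_length // bin_size)  # ceiling division
    counts = Counter(pos // bin_size for pos in positions)
    needed = share * len(positions)
    covered = 0
    used_bins = 0
    # Take bins from most to least populated until the share is reached.
    for count in sorted(counts.values(), reverse=True):
        covered += count
        used_bins += 1
        if covered >= needed:
            break
    return used_bins / n_bins


if __name__ == "__main__":
    # Hypothetical toy data: half of the ten sites cluster in the first bin.
    sites = [5, 6, 7, 8, 9, 15, 250, 500, 750, 990]
    print(genome_fraction_for_share(sites, genome_length=1000, bin_size=10))
```

The result depends strongly on the bin size, so any such figure is only meaningful together with the resolution at which it was computed.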
All nonmammalian vertebrates studied can regenerate inner ear mechanosensory receptors, i.e. hair cells (Corwin and Cotanche, 1988; Lombarte et al., 1993; Baird et al., 1996), but mammals possess only a very limited capacity for regeneration after birth (Roberson and Rubel, 1994). As a result, mammals suffer from permanent deficiencies in hearing and balance once their inner ear hair cells are lost. The mechanisms of hair cell regeneration are poorly understood. Because the inner ear sensory epithelium is highly conserved in all vertebrates (Fritzsch et al., 2007), we chose to study the mechanisms of hair cell regeneration in adult zebrafish, hoping the results would be transferable to inducing hair cell regeneration in mammals. We defined the comprehensive network of genes involved in hair cell regeneration in the inner ear of adult zebrafish with the powerful transcriptional profiling technique, Digital Gene Expression (DGE), which leverages the power of next-generation sequencing ('t Hoen et al., 2008). We also identified a key pathway, stat3/socs3, and demonstrated its role in promoting hair cell regeneration through stem cell activation, cell division, and differentiation. In addition, transient pharmacological up-regulation of stat3 signaling accelerated hair cell regeneration without over-producing cells. Taking other published datasets into account (Sano et al., 1999; Schebesta et al., 2006; Zhu et al., 2008; Riehle et al., 2008; Dierssen et al., 2008; Qin et al., 2009), we propose that the stat3/socs3 pathway is a key response in all tissue regeneration and thus an important therapeutic target for a broad application in tissue repair and injury healing.
ZInC (Zebrafish Insertional Collection, http://research.nhgri.nih.gov/ZInC/) is a web-searchable interface of insertional mutants in zebrafish. Over the last two decades, the zebrafish has become a popular model organism for studying vertebrate development as well as for modeling human diseases. To facilitate such studies, we are generating a genome-wide knockout resource that targets every zebrafish protein-coding gene. All mutant fish are freely available to the scientific community through the Zebrafish International Resource Center (ZIRC). To assist researchers in finding mutant and insertion information, we developed ZInC, a comprehensive database with a web front-end. It can be queried using multiple types of input such as ZFIN (Zebrafish Information Network) IDs, UniGene accession numbers and gene symbols from zebrafish, human and mouse. In the future, ZInC may include data from other insertional mutation projects as well. ZInC cross-references all integration data with ZFIN (http://zfin.org/).
Gene expression profiling is a powerful technique for studying biological processes, especially tissue/organ-specific ones, at the molecular level. With the rapid development of next-generation sequencing techniques, high-throughput sequencing-based expression profiling techniques have been increasingly adopted in molecular biology studies. In this chapter, we describe a protocol for applying one of the sequencing-based expression profiling techniques, Digital Gene Expression (DGE), to zebrafish research. The protocol provides guidelines for wet-bench experimental procedures as well as for bioinformatics-focused data analyses. We also discuss potential issues/challenges with the use of DGE.
Folic acid supplementation reduces the risk of neural tube defects and congenital heart defects. The biological mechanisms through which folate prevents birth defects are not well understood. We explore the use of zebrafish as a model system to investigate the role of folate metabolism during development.
We first identified zebrafish orthologs of 12 human folate metabolic genes. RT-PCR and in situ analysis indicated that maternal transcripts supply the embryo with mRNA so that the embryo has an intact folate pathway. To perturb folate metabolism we exposed zebrafish embryos to methotrexate (MTX), a potent inhibitor of dihydrofolate reductase (Dhfr), an essential enzyme in the folate metabolic pathway. Embryos exposed to high doses of MTX exhibited developmental arrest prior to early segmentation. Lower doses of MTX resulted in embryos with a shortened anterior-posterior axis and cardiac defects: linear heart tubes or incomplete cardiac looping. Inhibition of dhfr mRNA with antisense morpholino oligonucleotides resulted in embryonic lethality. One function of the folate pathway is to provide essential one-carbon units for dTMP synthesis, a rate-limiting step of DNA synthesis. After 24 hours of exposure to high levels of MTX, treated embryos continue to incorporate the thymidine analog BrdU. However, additional experiments indicate that these embryos have fewer mitotic cells, as assayed with phospho-histone H3 antibodies, and that treated embryos have perturbed cell cycles.
Our studies demonstrate that human and zebrafish utilize similar one-carbon pathways. Our data indicate that folate metabolism is essential for early zebrafish development. Zebrafish studies of the folate pathway and its deficiencies could provide insight into the underlying etiology of human birth defects and the natural role of folate in development.
Because of the structural and molecular similarities between the two systems, the lateral line, a fish and amphibian specific sensory organ, has been widely used in zebrafish as a model to study the development/biology of neuroepithelia of the inner ear. Both organs have hair cells, which are the mechanoreceptor cells, and supporting cells providing other functions to the epithelium. In most vertebrates (excluding mammals), supporting cells comprise a pool of progenitors that replace damaged or dead hair cells. However, the lack of regenerative capacity in mammals is the single leading cause for acquired hearing disorders in humans.
In an effort to understand the regenerative process of hair cells in fish, we characterized and cloned a stable egfp transgenic fish line that trapped tnks1bp1, a highly conserved gene that has been implicated in the maintenance of telomere length. We then used this Tg(tnks1bp1:EGFP) line in a FACS-based sorting strategy combined with microarrays to identify new molecular markers for supporting cells.
We present a Tg(tnks1bp1:EGFP) stable transgenic line, which we used to establish a transcriptional profile of supporting cells in the zebrafish lateral line. We thereby provide a new set of markers specific for supporting cells as well as candidates for functional analysis of this important cell type. This will prove to be a valuable tool for the study of regeneration in the lateral line of zebrafish in particular and for regeneration of neuroepithelia in general.
Regeneration; hair cells; progenitor cells; lateral line; zebrafish; supporting cells; accessory cells; microarrays; Tnks1bp1
The increasing number of developmental events and molecular mechanisms associated with the Hedgehog (Hh) pathway from Drosophila to vertebrates suggests that gene regulation is crucial for diverse cellular responses, including target genes not yet described. Although several high-throughput, genome-wide approaches have yielded information at the genomic, transcriptional and proteomic levels, the specificity of Gli binding sites related to direct target gene activation still remains elusive. This study aims to identify novel putative targets of Gli transcription factors through a protein-DNA binding assay using yeast, and to validate a subset of targets both in-vitro and in-vivo. By testing in different Hh/Gli gain- and loss-of-function scenarios, we identified known (e.g., ptc1) and novel Hh-regulated genes in zebrafish embryos.
The combined yeast-based screening and MEME/MAST analysis were able to predict Gli transcription factor binding sites, and position mapping of these sequences upstream or in the first intron of promoters served to identify new putative target genes of Gli regulation. These candidates were validated by qPCR in combination with either the pharmacological Hh/Gli antagonist cyc or the agonist pur in Hh-responsive C3H10T1/2 cells. We also used small-hairpin RNAs against Gli proteins to evaluate targets and confirm specific Gli regulation of their expression. Taking advantage of mutants that have been identified affecting different components of the Hh/Gli signaling system in the zebrafish model, we further analyzed specific novel candidates. Studying Hh function with pharmacological inhibition or activation complemented these genetic loss-of-function approaches. We provide evidence that in zebrafish embryos, Hh signaling regulates sfrp2, neo1, and c-myc expression in-vivo.
A recently described yeast-based screening allowed us to identify new Hh/Gli target genes, functionally important in different contexts of vertebrate embryonic development.
Hh/Gli targets; zebrafish; purmorphamine; cyclopamine; neogenin 1; c-myc; sfrp2
Development of the posterior lateral line (PLL) system in zebrafish involves cell migration, proliferation and differentiation of mechanosensory cells. The PLL forms when cranial placodal cells delaminate and become a coherent, migratory primordium that traverses the length of the fish to form this sensory system. As it migrates, the primordium deposits groups of cells called neuromasts, the specialized organs that contain the mechanosensory hair cells. Therefore, the primordium provides a model for studying both collective directional cell migration and the differentiation of sensory cells from multipotent progenitor cells.
Through the combined use of transgenic fish, Fluorescence Activated Cell Sorting and microarray analysis, we identified a repertoire of key genes expressed in the migrating primordium and in differentiated neuromasts. We validated the specific expression in the primordium of a subset of the identified sequences by quantitative RT-PCR and by in situ hybridization. We also show that interfering with the function of two genes, f11r and cd9b, induces defects in primordium migration. Finally, pathway construction revealed functional relationships among the genes enriched in the migrating cell population.
Our results demonstrate that this is a robust approach to globally analyze tissue-specific expression and we predict that many of the genes identified in this study will show critical functions in developmental events involving collective cell migration and possibly in pathological situations such as tumor metastasis.
The Hedgehog (Hh) signaling pathway plays critical instructional roles during embryonic development. Mis-regulation of Hh/Gli signaling is a major causative factor in human congenital disorders and in a variety of cancers. The zebrafish is a powerful genetic model for the study of Hh signaling during embryogenesis, as a large number of mutants have been identified affecting different components of the Hh/Gli signaling system. By performing global profiling of gene expression in different Hh/Gli gain- and loss-of-function scenarios, we identified several known (e.g. ptc1 and nkx2.2a) as well as a large number of novel Hh-regulated genes that are differentially expressed in embryos with altered Hh/Gli signaling function. By uncovering changes in tissue-specific gene expression, we revealed new embryological processes that are influenced by Hh signaling. We thus provide a comprehensive survey of Hh/Gli-regulated genes during embryogenesis and we identify new Hh-regulated genes that may be targets of mis-regulation during tumorigenesis.
hedgehog; detour(dtr); slow muscle omitted(smu); interrenal gland; pronephros; nervous system; microarray; transcriptional profiling
In humans, the absence or irreversible loss of hair cells, the sensory mechanoreceptors in the cochlea, accounts for a large majority of acquired and congenital hearing disorders. In the auditory and vestibular neuroepithelia of the inner ear, hair cells are accompanied by another cell type called supporting cells. This second cell population has been described as having stem cell-like properties, allowing efficient hair cell replacement during embryonic and larval/fetal development of all vertebrates. However, mammals lose their regenerative capacity in most inner ear neuroepithelia in postnatal life. Remarkably, reptiles, birds, amphibians, and fish are different in that they can regenerate hair cells throughout their lifespan. The lateral line in amphibians and in fish is an additional sensory organ, which is used to detect water movements and is comprised of neuroepithelial patches, called neuromasts. These are similar in ultra-structure to the inner ear's neuroepithelia and they share the expression of various molecular markers. We examined the regeneration process in hair cells of the lateral line of zebrafish larvae carrying a retroviral integration in a previously uncharacterized gene, phoenix (pho). Phoenix mutant larvae develop normally and display a morphologically intact lateral line. However, after ablation of hair cells with copper or neomycin, their regeneration in pho mutants is severely impaired. We show that proliferation in the supporting cells is strongly decreased after damage to hair cells and correlates with the reduction of newly formed hair cells in the regenerating phoenix mutant neuromasts. The retroviral integration linked to the phenotype is in a novel gene with no known homologs showing high expression in neuromast supporting cells. Whereas its role during early development of the lateral line remains to be addressed, in later larval stages phoenix defines a new class of proteins implicated in hair cell regeneration.
By screening for regeneration deficient zebrafish mutations, we identified a zebrafish mutant line deficient in a highly specific regeneration process, the renewal of hair cells in the lateral line. Although this organ is specific to fish and amphibians, it contains essentially the same mechanosensory cells (the hair cells) that function in the ear for sound and balance detection in all vertebrates. Mammals are unusual vertebrates in that they have lost the ability to regenerate functional hair cells after damage by sound or chemical exposure. All other vertebrates retain their ability to regenerate their hair cells after damage, but this process is not well understood at the molecular level. The retroviral insertion linked to the phoenix mutation is in a new gene family class that is specifically required for the supporting cells to enter into mitosis after hair cell damage. What is particularly unusual about this mutation is that it appears not to affect the normal development and differentiation pathways, but only seems to affect the cells' post-differentiation regeneration.
Knowledge of all binding sites for transcriptional activators and repressors is essential for computationally aided identification of transcriptional networks. The techniques developed for defining the binding sites of transcription factors tend to be cumbersome and not adaptable to high throughput. We refined a versatile yeast strategy to rapidly and efficiently identify genomic targets of DNA-binding proteins. Yeast expressing a transcription factor is mated to yeast containing a library of genomic fragments cloned upstream of the reporter gene URA3. DNA fragments with target-binding sites are identified by growth of yeast clones in media lacking uracil. The experimental approach was validated with the tumor suppressor protein p53 and the forkhead protein FoxI1 using genomic libraries for zebrafish and mouse generated by shotgun cloning of short genomic fragments. Computational analysis of the genomic fragments recapitulated the published consensus-binding site for each protein. Identified fragments were mapped to identify the genomic context of each binding site. Our yeast screening strategy, combined with bioinformatics approaches, will allow both detailed and high-throughput characterization of transcription factors, scalable to the analysis of all putative DNA-binding proteins.
All forkhead (Fox) proteins contain a highly conserved DNA binding domain whose structure is remarkably similar to the winged-helix structures of histones H1 and H5. Little is known about Fox protein binding in the context of higher-order chromatin structure in living cells. We created a stable cell line expressing FoxI1-green fluorescent protein (GFP) or FoxI1-V5 fusion proteins under control of the reverse tetracycline-controlled transactivator doxycycline inducible system and found that, unlike most transcription factors, FoxI1 remains bound to the condensed chromosomes during mitosis. To isolate DNA fragments directly bound by the FoxI1 protein within living cells, we performed chromatin immunoprecipitation assays (ChIPs) with antibodies to either enhanced GFP or the V5 epitope and subcloned the FoxI1-enriched DNA fragments. Sequence analyses indicated that 88% (106/121) of ChIP sequences contain the consensus binding sites for all Fox proteins. Testing ChIP sequences with a quantitative DNase I hypersensitivity assay showed that FoxI1 created stable DNase I sensitivity changes in condensed chromosomes. The majority of ChIP targets and random targets increased in resistance to DNase I in FoxI1-expressing cells, but a small number of targets became more accessible to DNase I. Consistently, the accessibility of micrococcal nuclease to chromatin was generally inhibited. Micrococcal nuclease partial digestion generated a ladder in which all oligonucleosomes were slightly longer than those observed with the controls. On the basis of these findings, we propose that FoxI1 is capable of remodeling chromatin higher-order structure and can create stable, site-specific changes in chromatin that either create or remove DNase I hypersensitive sites.
Integration into the host genome is one of the hallmarks of the retroviral life cycle and is catalyzed by virus-encoded integrases. While integrase has strict sequence requirements for the viral DNA ends, target site sequences have been shown to be very diverse. We carefully examined a large number of integration target site sequences from several retroviruses, including human immunodeficiency virus type 1, simian immunodeficiency virus, murine leukemia virus, and avian sarcoma-leukosis virus, and found that a statistical palindromic consensus, centered on the virus-specific duplicated target site sequence, was a common feature at integration target sites for these retroviruses.
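To illustrate what a statistical palindromic consensus means, the sketch below scores a set of aligned target-site sequences for reverse-complement symmetry. It is an illustration under assumptions, not the method used in the study: it assumes equal-length sequences already aligned and centered on the integration site, and the function names and the scoring scheme (mean absolute difference between each base frequency and the complementary base frequency at the mirrored position) are hypothetical choices.

```python
from collections import Counter
from typing import Dict, List, Sequence

COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def position_frequencies(sites: Sequence[str]) -> List[Dict[str, float]]:
    """Per-position base frequencies for equal-length, pre-aligned sequences."""
    length = len(sites[0])
    if any(len(site) != length for site in sites):
        raise ValueError("all sites must have the same length")
    freqs = []
    for i in range(length):
        counts = Counter(site[i] for site in sites)
        freqs.append({base: counts[base] / len(sites) for base in "ACGT"})
    return freqs


def consensus(freqs: Sequence[Dict[str, float]]) -> str:
    """Most frequent base at each position (ties resolved in A, C, G, T order)."""
    return "".join(max("ACGT", key=lambda base: column[base]) for column in freqs)


def reverse_complement(seq: str) -> str:
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def palindrome_score(freqs: Sequence[Dict[str, float]]) -> float:
    """0.0 means the frequency profile equals its own reverse complement.

    Larger values mean less symmetry; this score is an illustrative choice.
    """
    n = len(freqs)
    total = 0.0
    for i in range(n):
        mirror = freqs[n - 1 - i]
        for base in "ACGT":
            total += abs(freqs[i][base] - mirror[COMPLEMENT[base]])
    return total / (4 * n)


if __name__ == "__main__":
    # Hypothetical aligned sites; only two of the four are themselves palindromes.
    sites = ["GTTAAC", "GTAAAC", "GTTTAC", "GTTAAC"]
    freqs = position_frequencies(sites)
    cons = consensus(freqs)
    print(cons, cons == reverse_complement(cons), palindrome_score(freqs))
```

The toy data show why the symmetry is called statistical: individual sites need not be palindromes for the frequency profile across many sites to be symmetric.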
Recombinant adeno-associated virus (rAAV) vectors hold promise for gene therapy. Despite a low frequency of chromosomal integration of vector genomes, recent studies have raised concerns about the risk of rAAV integration because integration occurs preferentially in genes and accompanies chromosomal deletions, which may lead to loss-of-function insertional mutagenesis. Here, by analyzing 347 rAAV integrations in mice, we elucidate novel features of rAAV integration: the presence of hot spots for integration and a strong preference for integrating near gene regulatory sequences. The most prominent hot spot was a harmless chromosomal niche in the rRNA gene repeats, whereas nearly half of the integrations landed near transcription start sites or CpG islands, suggesting the possibility of activating flanking cellular disease genes by vector integration, similar to retroviral gain-of-function insertional mutagenesis. Possible cancer-related genes were hit by rAAV integration at a frequency of 3.5%. In addition, the information about chromosomal changes at 218 integration sites and 602 breakpoints of vector genomes has provided a clue to how vector terminal repeats and host chromosomal DNA are joined in the integration process. Thus, the present study provides new insights into the risk of rAAV-mediated insertional mutagenesis and the mechanisms of rAAV integration.
The Sleeping Beauty (SB) transposon is an emerging tool for transgenesis, gene discovery, and therapeutic gene delivery in mammals. Here we studied 1,336 SB insertions in primary and cultured mammalian cells in order to better understand its target site preferences. We report that, although widely distributed, SB integration recurrently targets certain genomic regions and shows a small but significant bias toward genes and their upstream regulatory sequences. Compared to those of most integrating viruses, however, the regional preferences associated with SB-mediated integration were much less pronounced and were not significantly influenced by transcriptional activity. Insertions were also distinctly nonrandom with respect to intergenic sequences, including a strong bias toward microsatellite repeats, which are predominantly enriched in noncoding DNA. Although we detected a consensus sequence consistent with a twofold dyad symmetry at the target site, the most widely used sites did not match this consensus. In conjunction with an observed SB integration preference for bent DNA, these results suggest that physical properties may be the major determining factor in SB target site selection. These findings provide basic insights into the transposition process and reveal important distinctions between transposon- and virus-based integrating vectors.
The mitochondrial outer membrane protein, Mmm1p, is required for normal mitochondrial shape in yeast. To identify new morphology proteins, we isolated mutations incompatible with the mmm1-1 mutant. One of these mutants, mmm2-1, is defective in a novel outer membrane protein. Lack of Mmm2p causes a defect in mitochondrial shape and loss of mitochondrial DNA (mtDNA) nucleoids. Like the Mmm1 protein (Aiken Hobbs, A.E., M. Srinivasan, J.M. McCaffery, and R.E. Jensen. 2001. J. Cell Biol. 152:401–410.), Mmm2p is located in dot-like particles on the mitochondrial surface, many of which are adjacent to mtDNA nucleoids. While some of the Mmm2p-containing spots colocalize with those containing Mmm1p, at least some of Mmm2p is separate from Mmm1p. Moreover, while Mmm2p and Mmm1p both appear to be part of large complexes, we find that Mmm2p and Mmm1p do not stably interact and appear to be members of two different structures. We speculate that Mmm2p and Mmm1p are components of independent machinery, whose dynamic interactions are required to maintain mitochondrial shape and mtDNA structure.
mitochondrial morphology; mtDNA nucleoids; outer membrane protein