Although forward genetics has been a powerful tool for studying gene function, its effectiveness is fundamentally limited by issues such as overlapping gene function and the need for an observable or measurable phenotype. It is thus also desirable to go in the ‘reverse’ direction: systematically disrupt the genes in the zebrafish genome and test the effects of gene inactivation, either one at a time or on a larger scale. This targeted mutagenesis approach has proven to be a powerful means of elucidating gene function in the mouse. However, because long-term zebrafish ES cell cultures are currently lacking, targeted mutagenesis mediated by homologous recombination is not yet feasible in zebrafish. Recently, a target-selected reverse genetic approach, Targeting Induced Local Lesions IN Genomes (TILLING), has been developed to identify a desired mutation in a population of ENU-mutagenized fish through either DNA resequencing or CEL1 nuclease assays [21–23]. However, this technique requires a significant investment of time and money for each mutation identified [24, 25]. An alternative approach to targeted gene knockouts is to perform saturation mutagenesis using retroviruses as the mutagen, map the retroviral integrations, and cryopreserve the corresponding sperm samples. Once saturation mutagenesis is achieved, any desired mutant can be readily recovered by in vitro fertilization (IVF) from the archived sperm sample carrying the desired mutation. Furthermore, this approach takes the middle road between random and targeted mutagenesis, because phenotypes resulting from the disruption of uncharacterized genes can also be assessed by systematically testing mutant alleles of those genes. There are two large-scale efforts utilizing similar approaches: one is a collaboration among three groups (our laboratory, the University of California, Los Angeles, and Peking University), and the other is run by a private company, Znomics Inc. Both groups use the VSV-G-pseudotyped MLV-based retroviruses originally developed in the large-scale retroviral mutagenesis screen at the Massachusetts Institute of Technology [16–20]. The two independent projects are compared below.
The mutagenesis pipeline in our laboratories () begins with the production of high-titer pseudotyped MLV, followed by injection of the retroviruses into blastula-stage embryos. We have improved the original protocols for producing pseudotyped MLV and for microinjection [26] so that the infection rate in the injected embryos (i.e. founders) averages ~45 proviral copies per cell, approximately 3-fold higher than that achieved in previous retroviral mutagenesis screens [13, 20]. We raise the founders, outcross them with wild-type fish, raise a small number of male F1 fish, and subsequently sacrifice them to cryopreserve their testes. A small DNA sample from each F1 fish is used to clone the genomic sequences adjacent to the retroviral integrations; the cloned sequences are mapped back to the zebrafish genome to locate the integrations and are simultaneously indexed to the corresponding sperm samples. We expect to generate enough retroviral integrations to ‘hit’ every zebrafish gene at least once within 3–5 years. This insertional mutant library will be a public-domain resource, freely available to the entire zebrafish research community.
The mutagenesis project by Znomics, Inc. takes a similar approach. The major difference is that Znomics, Inc. cryopreserves sperm samples from the original injected founder fish and maps the integrations in the founder sperm samples rather than in F1 male fish. Because the germline of injected founder fish is highly mosaic, any given retroviral integration found in the founder germline is estimated to be present in 3–20% of the F1 progeny produced by IVF. Because of this mosaicism, more unique retroviral integrations can be recovered from the sperm of a founder than from that of an F1 fish (as used in our approach). This reduces the initial work and allows very high recovery of unique integrations from a relatively small number of fish. However, it is often necessary to screen a large number of F1 progeny to find any given integration, and the number of fish identified as carriers after IVF can be relatively small. Our approach bypasses this drawback because the integrations are recovered and mapped in the F1 progeny; any given integration would thus be represented in 50% of the F2 progeny produced by IVF with the F1 sperm sample. At the time of this writing, Znomics Inc. has integrations in more than 10 000 genes, with the sequence tags associated with these integrations deposited in the company's publicly accessible database. Fish carrying specific integrations can be purchased directly from the company.
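To make the screening burden concrete, the following minimal sketch estimates how many IVF progeny would need to be genotyped to find at least one carrier of a given integration. It assumes each progeny inherits the integration independently with a fixed probability (3–20% for founder sperm, 50% for F1 sperm, as quoted above). The function name `progeny_to_screen` and the 95% confidence threshold are illustrative assumptions, not part of either project's protocol.

```python
import math


def progeny_to_screen(carrier_fraction: float, confidence: float = 0.95) -> int:
    """Smallest n with P(at least one carrier among n progeny) >= confidence.

    Assumes independent inheritance with probability `carrier_fraction`,
    so P(no carrier in n progeny) = (1 - carrier_fraction) ** n.
    """
    if not 0.0 < carrier_fraction < 1.0:
        raise ValueError("carrier_fraction must be strictly between 0 and 1")
    if not 0.0 < confidence < 1.0:
        raise ValueError("confidence must be strictly between 0 and 1")
    # Solve (1 - carrier_fraction) ** n <= 1 - confidence for n.
    return math.ceil(math.log(1.0 - confidence) / math.log(1.0 - carrier_fraction))


if __name__ == "__main__":
    # Carrier fractions taken from the text; the labels are descriptive only.
    scenarios = [
        ("founder sperm, low end (3%)", 0.03),
        ("founder sperm, high end (20%)", 0.20),
        ("F1 sperm (50%)", 0.50),
    ]
    for label, fraction in scenarios:
        print(f"{label}: screen {progeny_to_screen(fraction)} progeny")
```

Under these assumptions, the number of progeny to screen falls from roughly a hundred at 3% transmission to a handful at 50%, which is the practical advantage of mapping integrations in F1 fish.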
To estimate how many integrations it will take to ‘hit’ every zebrafish gene, it is critical to know what percentage of integrations land in genes. Based on our pilot data examining ~600 unique mapped integrations in the zebrafish genome [26], we found that ~65% of MLV integrations landed either in genes or within 3 kb on either side of a gene. Furthermore, ~65% of these hits in or near genes landed in the first intron or <3 kb upstream of the gene. This apparent preference of MLV for integrating at the 5′ end of genes has also been observed in the mouse [27, 28] and in human tissue culture cells [29]. We further demonstrated that integrations are mutagenic not only when they directly hit exons; integrations landing in the first intron of a gene are also highly mutagenic. In 80% of first-intron hits, the mRNA level of the affected gene is reduced to <30% of wild-type, and the reduction usually exceeds 90%. Integrations landing in the putative promoter region may also be mutagenic. Overall, roughly one in five retroviral integrations will disrupt a gene by reducing its mRNA level to less than 30% of the wild-type level.
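As a rough illustration of how the one-in-five figure translates into library size, the sketch below uses a simple Poisson model. It assumes that disruptive integrations are spread uniformly and independently across genes, which the 5′ preference described above suggests is an oversimplification. The gene count `ASSUMED_GENE_COUNT` is a placeholder chosen for illustration, not a number from this article, and the function names are hypothetical.

```python
import math


def fraction_genes_hit(n_integrations: int, n_genes: int, p_disrupt: float) -> float:
    """Expected fraction of genes disrupted at least once (Poisson model).

    Assumes each integration disrupts some gene with probability `p_disrupt`
    and that disrupted genes are chosen uniformly at random.
    """
    mean_hits_per_gene = n_integrations * p_disrupt / n_genes
    return 1.0 - math.exp(-mean_hits_per_gene)


def integrations_for_coverage(target_fraction: float, n_genes: int, p_disrupt: float) -> int:
    """Integrations needed so the expected disrupted fraction reaches the target."""
    if not 0.0 < target_fraction < 1.0:
        raise ValueError("target_fraction must be strictly between 0 and 1")
    # Invert 1 - exp(-N * p / G) = target for N.
    return math.ceil(-n_genes / p_disrupt * math.log(1.0 - target_fraction))


if __name__ == "__main__":
    ASSUMED_GENE_COUNT = 20_000  # placeholder assumption, not from the article
    P_DISRUPT = 0.2              # "roughly one in five" integrations, from the text
    for target in (0.50, 0.90, 0.95, 0.99):
        n = integrations_for_coverage(target, ASSUMED_GENE_COUNT, P_DISRUPT)
        print(f"{target:.0%} of genes disrupted: about {n:,} mapped integrations")
```

In this model coverage approaches 100% only asymptotically, so ‘every gene’ is better read as a high target fraction; each additional increment of coverage costs disproportionately more integrations.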
Apart from direct exon disruptions, the mechanisms by which retroviral integrations abrogate gene expression are currently unclear, especially for integrations landing in the intronic or promoter region of a gene. One possible mechanism for intronic integrations is that the retroviral integration causes premature truncation of the gene's transcript, either through the polyadenylation signals of the LTRs or through a cryptic polyadenylation signal present in the antisense orientation of the virus. Premature termination of transcription may destabilize the truncated transcript. A similar mechanism has been suggested for MLV-mediated abrogation of gene expression in the mouse [30]. Both the 5′ and 3′ LTRs of the MLV used in the current zebrafish retroviral mutagenesis contain the canonical polyadenylation signal, AATAAA, raising the possibility that this mechanism accounts for at least a portion of the gene abrogation events. In some cases, integrations landing in or close to an exon–intron junction may induce aberrant splicing events (e.g. exon skipping), producing truncated proteins and thereby abrogating gene function (Behra M and Burgess SM, unpublished data and [31]). Truncated proteins can also be produced intentionally by another mechanism: the retrovirus used in our mutagenesis scheme contains a splice-in, splice-out, frameshift-producing gene-trap cassette (C and [13]). When the retrovirus lands in an intron in the correct orientation, the splice donor of the preceding endogenous exon splices into the splice acceptor of the gene-trap cassette, and the splice donor of the gene-trap cassette splices out into the next endogenous exon. The gene-trap cassette is thereby inserted between the two adjacent exons, creating a frameshift or truncated fusion mutant. However, this mechanism does not appear to be particularly efficient, as the endogenous splicing machinery often skips the gene-trap ‘exon’. To date we have observed only 1 gene-trap event among 11 correctly oriented integrations in introns (Jao L and Burgess SM, unpublished data). Another possible mechanism of retrovirus-induced gene inactivation in zebrafish is de novo methylation of host sequences flanking the integration sites: insertion of a provirus has been shown to change the methylation pattern of host DNA in the mouse, and this retrovirus-induced methylation has been correlated with gene inactivation [32]. This has not been demonstrated in zebrafish, but it remains a distinct possibility.