In this study, we created a fission yeast insertion mutant library in which all mutants were tagged with unique barcode sequences and stored in two readily available selection platforms. The 384-well mutant arrays allow genetic screens on individual mutants and can be extended to genetic approaches such as synthetic genetic array (SGA) [45]. These mutant arrays have been used to identify mutants with four distinct phenotypes (Table ) as well as strains that are hyper-sensitive to the cancer chemotherapeutics camptothecin and bleomycin (Hale and Runge, unpublished data). In addition to the 384-well mutant arrays, mutant pools of 1800 mutants are available for parallel analysis.
The insertion mutagenesis used in this study relied on random non-homologous recombination, where the vast majority of transformants have unstable, circularized vector DNA and only a small portion have stable insertions. To facilitate the collection of stable insertion mutants, we included low-dose 5-FOA in our initial selection medium in an effort to eliminate unstable cells bearing a high copy number of the ura4+ vector and producing high levels of Ura4p (Additional file 1: Figure S1), and subsequently re-screened for mutants that were stably ura4+ and 5-FOA-sensitive (Figure ). While this approach increased the proportion of stable insertion mutants among the total transformants (from 4% to 30%), we note that some mutants might have been excluded. For example, insertions into genomic regions where expression switches between on and off states (e.g. telomeres [47]) would be excluded from the final library. Likewise, insertion at a locus that causes high ura4+ expression, or tandem integration of many functional ura4+ markers, could result in increased sensitivity to low-dose 5-FOA and eliminate some mutants during the initial selection.
To increase the versatility of the mutant library, some previously characterized functional DNA sequences were included in our insertion vector, including the lox71 sequence and the mutated human HSP70 promoter. We demonstrated that the mutated lox66 and lox71 sites could undergo Cre recombinase-dependent integration in S. pombe, similar to what has been reported in mammalian cells [29], and showed that this method can be used to clone genomic sequences surrounding the insertion mutations using the pLox66 plasmid. The mutated human HSP70 promoter, which exhibits dramatically reduced activity in S. pombe, was tethered to a lexA binding site for its potential activation by a lexA DNA-binding domain and transactivator fusion protein. While the goal was to provide an opportunity to ectopically express nearby genes, the tandem integrations of the insertion vector and the co-integration of mitochondrial DNA could impede the use of this promoter to activate genes near the insertion site. Testing of the HSP70 promoter was, therefore, not pursued.
Another novel addition to this insertion mutant library is the inclusion of unique barcodes. These barcodes allow one to take a census of the selected mutants by tracking barcode frequencies after the selection. Barcode sequencing can be facilitated by converting individual Sfi I site-bordered barcodes to barcode oligomers (Figure ), which yields multiple barcode sequences per Sanger sequencing reaction and provides a cost-effective and rapid alternative to cloning and sequencing individual barcodes. Although high-throughput sequencing and microarrays are possible analytic means for our mutant library, the design of our barcodes provides a medium-throughput alternative that requires only easily accessible laboratory techniques. Moreover, whereas microarray and high-throughput sequencing require sophisticated bioinformatics support, the sequences of several hundred barcodes obtained from our approach can be easily sorted and analyzed with basic spreadsheet software. We note that our oligomerization and sequencing strategy can also be used to monitor other nucleic acids in cells (e.g. RNAs, mitochondrial genomes). By amplifying a small region around a sequence difference between two different nucleic acids, our oligomerization and sequencing method can easily produce ~200 or more sequences, which should be sufficient for determining the relative proportions of the two forms with data that are much easier to process than data from a high-throughput sequencing experiment.
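As a rough illustration of the census described above, the following Python sketch tallies barcode reads and attaches an approximate 95% interval to an observed proportion. The barcode strings, the function names and the choice of the Wilson score interval are assumptions made for illustration; they are not part of the method reported here, and the same tally could equally be done in spreadsheet software.

```python
from collections import Counter
from math import sqrt


def barcode_census(barcodes):
    """Return {barcode: (count, fraction)} for a list of barcode reads."""
    counts = Counter(barcodes)
    total = sum(counts.values())
    return {bc: (n, n / total) for bc, n in counts.items()}


def wilson_interval(successes, total, z=1.96):
    """Approximate 95% Wilson score interval for a binomial proportion."""
    if total <= 0:
        raise ValueError("total must be positive")
    p = successes / total
    denom = 1 + z**2 / total
    centre = (p + z**2 / (2 * total)) / denom
    half = z * sqrt(p * (1 - p) / total + z**2 / (4 * total**2)) / denom
    return centre - half, centre + half


# Hypothetical data: 200 reads covering two forms (sequences are made up).
reads = ["ACGTACGT"] * 150 + ["TTGACCAA"] * 50
census = barcode_census(reads)
low, high = wilson_interval(50, 200)
```

With these made-up reads, the minor form is observed at 25% and the interval spans roughly 0.20 to 0.31, which gives a sense of the resolution that about 200 sequences can provide under a simple binomial sampling assumption.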
Random insertion mutant libraries may overcome some challenges in the study of essential genes. Mutants that lack essential genes are not present in haploid gene deletion mutant banks such as the budding and fission yeast ORF deletion collections. Among the S. pombe insertion mutants analyzed in this work, we found six mutants in which the insertions were in or adjacent to essential genes (Additional file 2: Table S1), presumably generating truncated proteins or altering the expression of these genes. These results indicate that this insertion mutant library approach provides opportunities for functional analysis of essential genes.
The insertion events characterized in this library are consistent with those shown in previous studies, including insertions in both genes and intergenic regions, large deletions of insertion DNA and little or no deletion of surrounding chromosomal sequences (Additional file 5: Table S2, and [20]). We also discovered 16 mutants with mitochondrial DNA co-integrated with the insertion vector. The presence of mitochondrial DNA in the wild type S. pombe nuclear genome has recently been characterized, with one wild type strain containing 12 mitochondrial DNA insertions in its nuclear chromosomes [48]. Mitochondrial DNA fragments were also found in all repaired plasmid-based double strand breaks in S. pombe cells in an independent study [26]. It is worth noting that capture of mitochondrial DNA in the nuclear genome has also been observed in hemiascomycetous yeasts, plant, insect, rodent and human cells and appears to be an active and ongoing process [49], indicating that S. pombe transformation provides a way to study this process.
One consequence of mitochondrial and tandem ura4+ DNA insertions is that they can impede the detection of the insertion sites by TAIL-PCR. We have tested three additional approaches for mapping insertion mutations: splinkerette PCR, inverse splinkerette PCR and lox66/71-dependent cloning. While these methods were not 100% efficient, we showed that they could complement each other in defining insertion sites. One advantage of TAIL-PCR is that it detects the junction of insertion vector and genomic sequences and provides the exact location of the integrated vector. In contrast to TAIL-PCR, inverse splinkerette PCR in our mutants directly determined the closest EcoR V sites to the 5’ end (λ buffer end) of the insertion vector. Even when combined with the length of the PCR products, this information yields only an approximate region of insertion. While we did not follow up the results from our inverse splinkerette PCR, one could determine the exact location of insertion by cloning the PCR product and sequencing the genomic regions with gene-specific primers. Depending on the complexity of the insertion structure, splinkerette PCR could determine either the insertion vector-chromosome junction or the restriction site used in the assay that lies closest to the insertion site. It is important to note that all mutants are tagged by the lox71 sequence, which allows the genomic sequences flanking the insertion to be cloned in E. coli in the event that insertion mutations cannot be mapped by these three PCR methods.
In addition to non-homologous recombination-based integration, other methods for generating insertion mutations include transposon-mediated mutagenesis. At least three types of transposons have been analyzed in a genome-wide context in S. pombe. The S. pombe retrotransposon Tf1 has been shown to exhibit a preference for targeting the promoters of RNA polymerase II transcribed genes [56]. The piggyBac (PB) transposon, originally isolated from the cabbage looper moth, preferentially targets TTAA sites in the genome. Although as much as 79% of the piggyBac transposition events analyzed in a haploid S. pombe strain were located in intergenic regions, it was assumed that this seeming preference of the PB transposon for intergenic sequences was a consequence of selective pressure against insertions in ORFs that cause reduced fitness [57]. High-throughput sequencing performed in both studies indicated that the transposition events of both transposons are broadly distributed among the three chromosomes. The Hermes transposon from the housefly Musca domestica has a strong preference for T at position 2 and for A at position 7 of the target sites and has been adapted for S. pombe. The limited number of insertion mutants analyzed in that study suggested that Hermes targets both intergenic regions and coding sequences with no apparent bias [58]. Insertions generated by non-homologous recombination in our work also have a broad distribution on the three chromosomes, similar to what was observed for the Tf1 retrotransposon and the PB transposon described above and in a previous report on non-homologous recombination in S. pombe. Although 60% (16/26) of the partially characterized insertions resided in ORFs or non-coding RNA genes, the enrichment of this type of insertion may be due to pre-selection of the corresponding mutants by visible phenotypes. Thus, the three transposon approaches and our insertion vector approach can create a wide variety of mutations. Our approach has the advantage of adding unique barcodes to each insertion, which has not been applied to the transposon approaches.
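The target-site preferences mentioned above can be made concrete with a short Python sketch that scans a sequence for TTAA sites (the piggyBac preference) and for windows with T at position 2 and A at position 7 (the Hermes preference). The 8-bp window length, the function names and the example sequence are assumptions for illustration only, and the scan considers a single strand; it is not the analysis used in the cited studies.

```python
def find_ttaa_sites(seq):
    """Return 0-based start positions of every TTAA in seq (overlaps allowed)."""
    seq = seq.upper()
    return [i for i in range(len(seq) - 3) if seq[i:i + 4] == "TTAA"]


def find_hermes_like_sites(seq, window=8):
    """Return 0-based starts of windows with T at position 2 and A at position 7.

    Positions are 1-based within the window, so they map to offsets 1 and 6.
    The window length (assumed to be 8 here) must be at least 7.
    """
    if window < 7:
        raise ValueError("window must cover position 7")
    seq = seq.upper()
    return [
        i
        for i in range(len(seq) - window + 1)
        if seq[i + 1] == "T" and seq[i + 6] == "A"
    ]


# Hypothetical 14-bp sequence, made up for this example.
example = "CCATCGGCAGTTAA"
ttaa_hits = find_ttaa_sites(example)
hermes_hits = find_hermes_like_sites(example)
```

Because TTAA is palindromic, a single-strand scan finds every such site; the Hermes-like pattern is not palindromic, so a fuller analysis would presumably also scan the reverse complement.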
The main difference between non-homologous integration and transposon transposition is the structure of the insertion events. While transposons generally integrate at individual genomic locations as unmodified single copies with defined junctions between genomic and transposon DNA, tandem integration of the insertion vector DNA or co-insertion of non-nuclear DNA during non-homologous recombination-based integration makes this junction more variable. The simple insertion events in transposon mutagenesis allow insertion sites to be mapped by high-throughput sequencing, while the complex insertion structures generated by non-homologous recombination require insertion sites to be determined by low- or medium-throughput approaches. Nonetheless, the high mutation variety, the presence of random barcodes, and the availability of multiple methods for mapping insertion mutations still make this insertion mutant library an attractive tool for genome-wide studies that can complement the existing S. pombe ORF deletion set.