We used a genetic screen based on tRNA-mediated suppression (TMS) in a Schizosaccharomyces pombe La protein (Sla1p) mutant. Suppressor pre-tRNASerUCA-C47:6U with a debilitating substitution in its variable arm fails to produce tRNA in a sla1-rrm mutant deficient for RNA chaperone-like activity. The parent strain and spontaneous mutant were analyzed using Solexa sequencing. One synonymous single-nucleotide polymorphism (SNP), unrelated to the phenotype, was identified. Further sequence analyses found a duplication of the tRNASerUCA-C47:6U gene, which was shown to cause the phenotype. Ninety percent of 28 isolated mutants contain duplicated tRNASerUCA-C47:6U genes. The tRNA gene duplication led to a disproportionately large increase in tRNASerUCA-C47:6U levels in sla1-rrm but not sla1-null cells, consistent with non-specific low-affinity interactions contributing to the RNA chaperone-like activity of La, similar to other RNA chaperones. Our analysis also identified 24 SNPs between our strain and S. pombe 972h- strain yFS101, which was recently sequenced using Solexa. By including mitochondrial (mt) DNA in our analysis, overall coverage increased from 52% to 96%. mtDNA from our strain and yFS101 shared 14 mtSNPs relative to a ‘reference’ mtDNA, providing the first identification of these S. pombe mtDNA discrepancies. Thus, strain-specific and spontaneous phenotypic mutations can be mapped in S. pombe by Solexa sequencing.
tRNA gene number and arrangement are of interest to genome biologists (1–7) because tRNA abundance is matched to codon usage (3) and tRNA genes affect nuclear and genome organization [(8,9), reviewed in ref. 10, and see ref. 11]. Biased codon usage in functionally related mRNAs suggests that the relative levels of tRNA isoacceptors may reflect a means of genetic control (12–15). Most eukaryotes contain a variable number of tRNA genes, from ~200 to several thousand (4). Yet, in humans as in species from yeast to dog, about 12 codons have no tRNA with direct Watson:Crick basepairing (Supplementary Table and ref. 4), and must rely on wobble decoding. Further, the ratio of different isoacceptor tRNA genes can vary significantly between species (16) (Supplementary Table S1). Thus, eukaryotic evolution has been accompanied by highly variable proliferation of certain tRNA genes while many others are absent or extremely underrepresented.
Promoters for eukaryotic tRNA genes are all quite similar and provide similar transcription output. As such, differences in tRNA levels usually occur by varying tRNA gene copy number (17) which appears to coevolve with codon usage (3,5). Sequence identity suggests that multicopy tRNA genes arise by duplication and are maintained by gene conversion [(4), reviewed in refs 10,18 and 19]. Chromatin at eukaryotic tRNA loci contains RNA polymerase III (pol III) and other factors including the RNA-binding, molecular chaperone protein, La (20,21). Transcription termination by pol III generates UUU-3′OH ends on its nascent RNA products (22,23) which is a binding site for La (24–26) and thus directs early RNA processing events (27).
Maturation of a nascent pre-tRNA to a mature functional molecule requires many enzymatic and carrier-mediated processes (28). Inherent variability in the sequence and structure of different tRNAs suggests that they will differ in their activity as substrate for a particular enzyme, binding protein, or process (27). Indeed, significant variability in the order of processing and modification steps exists for different tRNAs (27). In the collective ‘pathway’ of tRNA maturation some events differ in the extent to which they are concentration-driven, kinetics-limited, spatially ordered and amenable to alternate mechanisms (27). Processes that degrade pre-tRNAs, such as nuclear surveillance, compete with some of the productive steps (27). Accordingly, pre-tRNAs that are not efficiently engaged by productive processing become subjected to surveillance systems that actively degrade hypomodified, improperly processed or structurally impaired pre-tRNAs (28–31). As a chaperone for pol III transcripts, La protein can offset this type of decay (27,32). Thus, the course for a pre-tRNA can vary depending on if it is engaged by La and/or other factors (27,28). Moreover, recent findings that unprocessed pre-tRNAs can activate stress signals (33,34) that can impact the small interfering RNA (siRNA) (35) or other pathways (31,36–40) suggest that our understanding of tRNA metabolism is incomplete.
While La serves to facilitate pre-tRNA processing, it is nonessential in yeasts (24). La contains two conserved RNA binding motifs, the La motif (LAM) and an adjacent RRM1 that act differently during tRNA maturation (41). Two types of chaperone activity have been attributed to La proteins, molecular chaperone activity and RNA chaperone activity (27). Its principal RNA-binding activity, UUU-3′-OH binding, is mediated primarily by the LAM to protect against 3′ exonucleases and promote 3′ end processing and RNP maturation. Mutations in human La (hLa) or Schizosaccharomyces pombe La, Sla1p, that disrupt RNA 3′ end binding or intracellular trafficking cause tRNA processing defects consistent with dysfunctional molecular chaperone activity (42,43).
The RNA chaperone-like activity noted for human and yeast La proteins (44–47) is an enigmatic aspect of La function that is discernible from UUU-3′OH binding. La promotes the maturation of pre-tRNAs with structural impairments that otherwise subject them to misfolding and degradation (32,46). The typical β-sheet surface and associated loop-3 of La RRM1 constitute an RNA-binding site that confers RNA chaperone-like activity (32,47). Two classes of mutations in La RRM1 have been distinguished. Mutation of basic residues in RRM1 loop-3 do not impair UUU-3′OH binding but significantly decrease binding to other parts of the pre-tRNA, decrease RNA chaperone activity and are detrimental to tRNA maturation (47). Mutation of the two invariant aromatic residues on the RRM1 β-sheet surface debilitates hLa and Sla1p for maturation of structurally impaired pre-tRNAs in vivo, but with too subtle a defect in RNA binding for detection in vitro (32). Nonetheless, these β-sheet surface aromatic mutations inactivate hLa and Sla1p for maturation of structurally impaired pre-tRNASer even in cells that lack Rrp6p (32), the 3′ exonuclease component of nuclear surveillance that degrades one class of structurally impaired pre-tRNAs (30,32,48). This observation suggested a novel, rrp6-independent degradation pathway for structurally impaired pre-tRNASer in the sla1-rrm1 mutants (32).
To try to better understand this, we developed a genetic screen in S. pombe that relies on accumulation of a functional suppressor tRNASerUCA-C47:6U despite a debilitating substitution of a G:C to a G:U basepair in its variable arm (Figure 1A). In cells carrying wild-type Sla1p, the pre-tRNASerUCA-C47:6U from this allele is matured and functionally suppresses a nonsense codon in ade6-704, alleviating accumulation of red pigment. However, this allele does not produce tRNA in the sla1-rrm1 mutant carrying the point mutations in the RRM1 β-sheet surface at levels sufficient to mediate TMS (32). We isolated spontaneous revertants that restore tRNASer-mediated suppression in sla1-rrm cells and mapped the phenotypic mutation by whole genome sequencing. We establish that this approach is applicable to S. pombe and report that the phenotypic mutation is duplication of the suppressor tRNASer gene. We show that a surprising disproportionate increase in suppressor-tRNA levels that occurs with tRNA gene duplication requires sla1-rrm, provoking a model in which activation by La of tRNA maturation is dependent on the concentration of the pre-tRNA.
In the process of genome analysis, we also uncovered 24 strain-specific single-nucleotide polymorphisms (SNPs) between the genomes of our laboratory strain and the reference strain 972h- (yFS101), which were separated ~30 years ago and passed through multiple laboratories.
Strains are listed in Table 3. sla1-Y157A or sla1-Y157A/F201A RRM1 mutations were introduced into ySH18 which harbors a copy of tRNASerUCA-C47:6U and associated pJK148 DNA at the leu1-32 locus as described (49) to generate yMWB2-4 (test strain) and yMWB3-15 (parent), respectively. yMWB3-15 was plated on media containing limiting (10mg/l) adenine, and spontaneously arising white colonies (e.g. yDA317) were isolated after 3 days of growth and characterized as described.
The S. pombe genome used in this study originated from the sequence obtained at Sanger Institute, dated June 2008 (50). Modifications were introduced to better reflect the strains used in this study following a first pass of read alignment. SNP and insertion/deletion modifications noted by the Broad Institute (51) were incorporated after identifying identical mutations in the parent strain. pJK148 vector sequence including the suppressor tRNA insertion (49) was incorporated at the leu1-32 locus as described (52).
The four poly-N gaps in the Sanger reference were probed as follows. Gap-flanking sequences were used in a seeded search of unaligned reads. For the gap at Chr 2 position 80202 this led to isolation of continuous sequence tracts on either side. A BLAST search identified the previously resolved ~18-kb gap-spanning sequence (53) which was inserted to replace the poly-N gap, and predicted genes in this ~18 kb were incorporated into the sequence annotations. Introduction of this sequence provided continuous coverage at near average depth across the region. As a result, some Chr 2 gene positions in our reference are offset by ~18000 relative to the Sanger positions (Tables 1 and 4). Similar attempts to resolve the remaining three contig gaps (Chr 1: 29662; Chr 2: 1616568; and Chr 2: 1083349) were unsuccessful with this read data set.
Sequencing of samples was according to standard methods. Unpaired 36-bp reads were collected and pooled under designation WT (yMWB3-15), MUT (yDA317) or TEST (yMWB2-4) for each. Our updated S. pombe reference genome was used in alignments performed using MAQ (Mapping and Assembly with Qualities) software (56) without any trimming. SNPs were called from sequence alignments. Deletions and insertions were identified and called by alignment of flanking sequence seeds to regions of zero coverage using the total read database, followed by verification by polymerase chain reaction (PCR) amplification of genomic DNA and conventional sequencing.
Reads were obtained and adjusted to Sanger FASTQ format and pooled for each strain dataset (57). MAQ software was utilized in alignment of reads and calling of SNPs (56). Alignments were manipulated with SAMtools (58). Programs used in our analysis were written primarily in Perl with some use of BioPerl modules (59) and are described and available for free download as the MATCH-G toolset at http://science.nichd.nih.gov/confluence/display/smcb/MATCH-G+Program or through the Maraia laboratory website http://science.nichd.nih.gov/confluence/display/smcb/Home. Briefly, these programs performed the following tasks: assignment of SNPs to the reference genome and determination of resulting mutations with corresponding confidence scores; determination of regions corresponding to zero read coverage; seeded search and alignment of zero coverage flanking sequence regions for determination of small deletions and insertions; comparison of annotated feature average mapped read depth between data sets; and a sliding averaging window comparison of mapped read depth between data sets.
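One of the tasks listed above, determining regions of zero read coverage, can be illustrated with a minimal Python sketch. This is not the MATCH-G code (which is described as written primarily in Perl); the function name and the per-base depth list used as input are assumptions made for illustration only.

```python
def zero_coverage_regions(depth):
    """Return (start, end) pairs, 0-based and end-exclusive,
    for every run of consecutive positions with zero mapped reads."""
    regions = []
    start = None
    for pos, d in enumerate(depth):
        if d == 0 and start is None:
            start = pos                      # a zero-coverage run begins
        elif d != 0 and start is not None:
            regions.append((start, pos))     # the run ended at the previous base
            start = None
    if start is not None:                    # run extends to the end of the sequence
        regions.append((start, len(depth)))
    return regions


# Hypothetical per-base depth values, not data from this study
example_depth = [12, 9, 0, 0, 0, 7, 11, 0, 5]
print(zero_coverage_regions(example_depth))  # [(2, 5), (7, 8)]
```

Each reported interval would then supply the flanking sequence used as a seed when searching the read set for small deletions or insertions.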
Mutations were verified from DNA prepared from acid-washed bead-lysed S. pombe cells. Following basic phenol–chloroform extraction, lysate was treated with DNase-free RNase A and DNA was ethanol precipitated and quantitated by nanodrop. PCR amplification of regions of interest was performed and purified using a QIAGEN QIAquick PCR cleanup kit. PCR products were sequenced by ACGT, Inc. (Glenview, Ill) to verify mutations identified by Illumina sequencing.
Identification of regions of duplication was performed using mapped coverage depth by two approaches. In the first approach, for each annotated feature ‘context’ of the genome, the number of mapped reads at each position was summed and averaged over the full length of the annotated region. These averages were compared for yMWB3-15 and yDA317 to obtain a ratio corrected by a factor of 1.12 to reflect the difference in average mapped read depth for these strains. The ratios at each feature were used to establish a statistical distribution across all annotated features, excluding only features <50 nt long and those with depth exceeding 1000 reads, to limit noise.
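The context-dependent calculation can be sketched in Python as follows. This is an illustrative sketch under stated assumptions, not the authors' implementation: the function name, the (name, start, end) feature tuples and the per-base depth lists are hypothetical, the correction is assumed to be applied by dividing the mutant/parent ratio by 1.12, and the filter on 1000 reads is read here as a cap on mean feature depth.

```python
from statistics import mean

DEPTH_CORRECTION = 1.12   # from the text: genome-wide depth difference between strains
MIN_FEATURE_LEN = 50      # from the text: features shorter than 50 nt are excluded
MAX_MEAN_DEPTH = 1000     # assumed reading of the "exceeding 1000 reads" filter


def feature_depth_ratios(features, parent_depth, mutant_depth):
    """Return {feature name: corrected mutant/parent mean-depth ratio}.

    features: iterable of (name, start, end), 0-based, end-exclusive.
    parent_depth, mutant_depth: per-base mapped read depth lists.
    """
    ratios = {}
    for name, start, end in features:
        if end - start < MIN_FEATURE_LEN:
            continue                                  # too short, noisy
        p = mean(parent_depth[start:end])
        m = mean(mutant_depth[start:end])
        if p == 0 or p > MAX_MEAN_DEPTH or m > MAX_MEAN_DEPTH:
            continue                                  # undefined ratio or very high depth
        ratios[name] = (m / p) / DEPTH_CORRECTION
    return ratios


# Hypothetical depth tracks: the second feature is doubled in the mutant
parent = [40] * 200
mutant = [45] * 100 + [90] * 100
features = [("geneA", 0, 100), ("sup_tRNA", 100, 200), ("short", 0, 10)]
print(feature_depth_ratios(features, parent, mutant))
```

With these made-up inputs the unchanged feature gives a ratio close to 1 and the doubled feature a ratio close to 2, while the 10-nt feature is skipped.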
In the second, context-independent approach, we used a sliding averaging window of 400 bases, and the average mapped read depth was compared between yMWB3-15 and yDA317. Positions where the average depth differed by more than 1.8-fold were identified as indicative of a duplication or copy-loss event and were combined into continuous units.
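A minimal Python sketch of this sliding-window comparison is given below. It is an illustration only: the function names are hypothetical, the window is taken as each base plus 200 bases on either side, windows are simply truncated at sequence ends, and no genome-wide depth correction is applied unless one is passed in.

```python
def windowed_means(depth, half_window=200):
    """Mean depth over each base plus half_window bases on either side."""
    n = len(depth)
    prefix = [0]
    for d in depth:
        prefix.append(prefix[-1] + d)         # prefix sums for fast window totals
    means = []
    for i in range(n):
        lo = max(0, i - half_window)
        hi = min(n, i + half_window + 1)
        means.append((prefix[hi] - prefix[lo]) / (hi - lo))
    return means


def copy_number_regions(parent, mutant, half_window=200, fold=1.8, correction=1.0):
    """Merge positions whose windowed depth ratio differs by >= fold
    (gain or loss) into continuous (start, end) units, end-exclusive."""
    p_means = windowed_means(parent, half_window)
    m_means = windowed_means(mutant, half_window)
    regions, start = [], None
    for i, (p, m) in enumerate(zip(p_means, m_means)):
        flagged = False
        if p > 0 and m > 0:
            ratio = (m / p) / correction
            flagged = ratio >= fold or ratio <= 1 / fold
        if flagged and start is None:
            start = i
        elif not flagged and start is not None:
            regions.append((start, i))
            start = None
    if start is not None:
        regions.append((start, len(p_means)))
    return regions


# Hypothetical tracks: positions 1000-1999 have doubled depth in the mutant
parent = [40] * 3000
mutant = [40] * 1000 + [80] * 1000 + [40] * 1000
print(copy_number_regions(parent, mutant))
```

Because each position is averaged with its neighbours, the flagged unit is somewhat narrower than the doubled segment itself, so region boundaries from this kind of analysis are approximate.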
The ~400-bp suppressor tRNASerUCA-C47:6U PstI–SacI gene fragment (49) and ~150bp of flanking DNA on either side was PCR amplified from the pJK148 vector used to create yMWB3-15, and ligated into pFA6A at the SalI and BamHI sites. pFA6A with and without the suppressor tRNA insert were linearized with NdeI and transformed into yMWB3-15 and yMWB1-1 in parallel. Transformed cells were plated onto YES media then replica plated onto YES+G418 to screen for integrants.
Three G418-resistant transformed colonies from the empty pFA6A and five from the suppressor tRNA-containing pFA6A were isolated and streak purified. These were then spotted onto media with limiting adenine (10mg/l) alongside yMWB3-15 and yDA317 for comparison to examine suppression phenotype.
Total RNA was prepared from cells and northern blotted as described (49) using 10% TBE-urea gels and iBlot transfer apparatus (InVitrogen). The RNA was transferred to a Genescreen-Plus membrane which was UV-cross-linked and vacuum baked and hybridized with a 32P-5′-GACAGAGCCCATTAGATTTGAAG DNA complementary to spliced mature suppressor tRNASerUCA-C47:6U in the presence of an equal amount of unlabelled oligo GGCAGAGCCCATTAGATTTGAAG to block hybridization of the 32P-5′-probe with the highly related endogenous tRNASerUGA, a method known as homolog exclusion probing (60). After washing, the membrane was sealed, exposed and quantitated by PhosphorImager. The membrane was stripped, re-hybridized with a U5 snRNA probe to standardize quantitation.
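The logic of homolog exclusion probing can be seen by comparing the two oligos directly. The short Python sketch below is an illustration, not part of the original protocol; the function and variable names are hypothetical. It locates the position at which the labelled probe and the unlabelled competitor differ and reports the RNA base each would pair with.

```python
PROBE = "GACAGAGCCCATTAGATTTGAAG"     # 32P-labelled oligo given in the text
BLOCKER = "GGCAGAGCCCATTAGATTTGAAG"   # unlabelled competitor oligo given in the text

# Watson-Crick partner in RNA for each base of a DNA oligo
RNA_PARTNER = {"A": "U", "C": "G", "G": "C", "T": "A"}


def oligo_differences(a, b):
    """Return (1-based position, base in a, base in b) for each mismatch."""
    if len(a) != len(b):
        raise ValueError("oligos must have equal length")
    return [(i + 1, x, y) for i, (x, y) in enumerate(zip(a, b)) if x != y]


diffs = oligo_differences(PROBE, BLOCKER)
print(diffs)  # [(2, 'A', 'G')]
for pos, probe_base, blocker_base in diffs:
    # RNA base paired by the probe vs. by the competitor at that position
    print(pos, RNA_PARTNER[probe_base], RNA_PARTNER[blocker_base])  # 2 U C
```

The two oligos differ at a single position, which pairs with U for the labelled probe and with C for the competitor, in line with the C-to-U substitution that distinguishes the suppressor allele.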
Genomic copy number of the suppressor tRNA gene was verified by semi-quantitative duplex PCR utilizing a single copy region of the genome (sla1 gene) for comparison. Purified genomic DNA was incubated with two primer pairs (sla1F: CCGAATATTGTTACGATTTAAGCATT, sla1R: AACATTGTCTTCATGAGTAGGAAA, SUPF: GCGGGCCTCTTCGCTATTA, SUPR: GGCTCCTATGTTGTGTGGAATT) and amplified as follows (95°C 5min; 25 cycles at 95°C 45s, 55°C 45s, 72°C 90s; 72°C 3min, 4°C). Twenty-five cycles had been demonstrated to be within the linear range of amplification for both primer sets (data not shown). Bands were stained with ethidium bromide, visualized by UV and quantitated using VisionWorksLS software. A ~700-bp band density corresponding to the suppressor tRNA was divided by the ~1000-bp band density corresponding to sla1. The ratios for each of four independent PCR data sets were normalized using yMWB3-15 ratio as representing a 1:1 copy number.
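The normalization described here amounts to a ratio of ratios, which the following Python sketch makes explicit. It is a minimal illustration: the function name and data layout are hypothetical, and the band densities in the example are invented placeholders, not measurements from this study.

```python
from statistics import mean


def normalized_copy_number(replicates, sample, reference="yMWB3-15"):
    """Average, over independent PCR data sets, of the sample's
    suppressor/sla1 band-density ratio divided by the reference strain's ratio.

    replicates: list of dicts mapping strain -> (suppressor density, sla1 density).
    """
    values = []
    for rep in replicates:
        ref_sup, ref_sla1 = rep[reference]
        sup, sla1 = rep[sample]
        values.append((sup / sla1) / (ref_sup / ref_sla1))
    return mean(values)


# Invented densities for two hypothetical data sets
replicates = [
    {"yMWB3-15": (100.0, 200.0), "yDA317": (200.0, 200.0)},
    {"yMWB3-15": (90.0, 180.0), "yDA317": (210.0, 200.0)},
]
print(normalized_copy_number(replicates, "yDA317"))
print(normalized_copy_number(replicates, "yMWB3-15"))  # 1.0 by construction
```

Dividing by the sla1 band in the same lane corrects for differences in template input, and dividing by the parent ratio sets the single-copy state to 1.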
Tandem duplication was detected by a PCR assay strategy (61) using four primers: (i) ATAAGGAAGCCTTGGGAGGA, (ii) TATCCGCTCACAATTCCACA, (iii) CCAATTCGCCCTATAGTGAGTC and (iv) TGGATGAACATTGTAAATGGTAGG. The positions of primers 1–4 are indicated in Figure 4A. PCR was performed on purified genomic DNA in 25mM Tricine pH 8.7, 85mM KOAc, 8% glycerol, 1% DMSO and 1.2mM Mg(OAc)2 using a mixture of polymerases 2U Tth (Promega) and 0.1U Vent (NEB) in 50 µl. Cycling was 95°C 5min; 30 cycles at 95°C 45s, 55°C 45s, 68°C 7min; 72°C 3min, 4°C. Five microliters of each reaction was resolved on 1% agarose gel.
Our laboratory strains yAS95 and yAS99 containing ade6-704, were derived in 1996 after crossing sp1190 (62) (a.k.a., yAS50) with spDJV1 (63) derived from spB67, a diploid with the ade6-M210/ade6-216 alleles (64). The strain generated for the present study, yMWB3-15 (parent) harbors ade6-704, which contains a UGA stop codon (65) that leads to accumulation of red pigment when grown in media with limiting adenine (66). Suppressor tRNASerUCA can suppress this nonsense stop codon and lead to production of functional Ade6p enzyme, which suppresses red pigment. yMWB3-15 contains the suppressor tRNASerUCA-C47:6U allele which carries a nucleotide substitution in the variable arm, and the sla1-rrm mutant allele sla1-Y157A,F201A (Figure 1B). The tRNASer substitution changes a G:C basepair to a G:U basepair in the tRNASerUCA variable arm. This substitution prevents tRNASerUCA-C47:6U-mediated suppression in sla1-Δ as well as sla1-Y157A,F201A cells (32). yMWB3-15 harbors ade6-704, tRNASerUCA-C47:6U and chromosomal sla1-Y157A,F201A, and expresses Sla1p-Y157A,F201A at wild-type levels (data not shown), but accumulates red pigment due to lack of tRNA-mediated suppression (TMS).
yMWB3-15 was plated on limiting adenine media in order to isolate spontaneously arising white colonies. Multiple isolated colonies were purified and subjected to conventional sequencing of the tRNASer-C47:6U, sla1-Y157A,F201A and ade6-704 loci to screen for mutants that had not reverted these alleles to their functional counterparts (no reversions were found, data not shown). The mutant designated yDA317 was chosen for Solexa sequencing because it exhibited higher levels of mature suppressor tRNASerUCA-C47:6U, than the parent strain (data not shown, see below).
yMWB2-4 is a strain generated during the development of yMWB3-15 that was known to harbor mutations in and around the sla1+ locus that differ from the parent strain. yMWB2-4 was subjected to whole genome sequencing in parallel with parent (yMWB3-15) and mutant (yDA317) to serve as a test and to help train our analytical methods. SNPs relative to the parent were determined blindly by MAQ alignment software (56) utilizing S. pombe genome 972h- (GenBank) with modifications (1,51) as the alignment reference (Table 1). This led to identification of all of the yMWB2-4 mutations at the nhp6-sla1-tim40 locus (Table 1, known), as well as three additional differences (unknown), all of which were verified by conventional sequencing.
We then compared the parent (yMWB3-15) and spontaneous mutant yDA317 and identified only one SNP, which represents a synonymous mutation to valine 74 of the erg3+ gene open reading frame (Table 1). Analysis of this SNP in erg3+ predicted no significant change in valine codon usage or local RNA structure. Furthermore, no reverse strand RNA or open reading frame (ORF) was annotated nor predicted. Given the number of adventitious synonymous SNPs discovered in our test strain (and in yFS101, below), it seemed unlikely that the erg3+ synonymous SNP caused the TMS phenotype in yDA317. We therefore sought to examine the whole genome sequences of our parent and mutant strains for mutations other than SNPs by further analysis of the Solexa data.
Insertions and deletions were sought by identifying sequence reads that flanked regions of zero mapped coverage in the reference genome, followed by verification by PCR sequencing. This approach identified a single base deletion in an annotated miscRNA, at chromosome 2, Sanger position T603280 (Table 1). However, this deletion relative to the reference was shared by the parent and mutant strains (Table 1) and is therefore not responsible for the mutant phenotype.
The next approach was to determine if a change in copy number of a gene or region had occurred, and, for this, two strategies were employed. At the positions of known and predicted genes, the mapped read depth was averaged across the entire length of the annotated feature, as an annotation context-dependent approach. These averages led to ratios of mapped read depth for the parent and mutant at each known transcription/annotation unit. These ratios were corrected slightly to reflect the difference in average read depth for the entire genomes of the strains, 40 for the parent and 45 for the mutant. When the ratios were plotted in a simple statistical distribution, the copy number varied along a standard bell curve at a mean of a 1-to-1 ratio (Figure 2). One significant outlier, however, was in the number of reads mapping to the suppressor tRNASerUCA-C47:6U sequence, suggesting gene duplication. The position on the bell curve of this copy number variation was more than 4 SD away from the median, toward increased copy number in the mutant strain (Figure 2).
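The outlier test implied by this distribution can be sketched in Python as below. This is an illustration under assumptions rather than the analysis actually run: the function name is hypothetical, the example ratios are invented, and distances are measured from the mean using the sample standard deviation.

```python
from statistics import mean, stdev


def depth_ratio_outliers(ratios, n_sd=4.0):
    """Return {feature: z-score} for features whose mutant/parent depth ratio
    lies more than n_sd standard deviations from the mean of all ratios."""
    values = list(ratios.values())
    mu = mean(values)
    sd = stdev(values)
    return {
        name: (r - mu) / sd
        for name, r in ratios.items()
        if abs(r - mu) > n_sd * sd
    }


# Invented ratios: 100 features scattered near 1.0 plus one doubled feature
ratios = {f"gene{i}": 1.0 + 0.01 * ((i % 5) - 2) for i in range(100)}
ratios["sup_tRNA"] = 2.0
print(depth_ratio_outliers(ratios))
```

In this made-up data set only the doubled feature is reported, since the remaining features cluster tightly around a 1-to-1 ratio.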
The potential for gene duplication was then assessed by a second strategy that also yielded size information as part of a context-independent analysis of mapped read depth. For each base of all three chromosomes, the average read depth of that base and 200 bases on either side was compared between parent and mutant strains using a cutoff of ≥1.8 fold difference in read depth between strains. This approach identified four regions of ~2-fold increase in mapped reads in the mutant relative to the parent, two intergenic, one corresponding to a 43-bp region of the USP coding sequence and a large section (5.8kb) corresponding to the suppressor tRNASerUCA-C47:6U allele plus associated flanking DNA used for its prior integration in ySH18, a precursor of yMWB3-15 (Table 2). While both strategies identified few differences in copy number at loci other than the suppressor tRNASerUCA-C47:6U gene, the others were not as well supported statistically as the tRNASerUCA-C47:6U gene duplication (see ‘Discussion’ section), nor could they as readily explain the suppression phenotype. Thus, the tRNASerUCA-C47:6U gene duplication was an excellent candidate as causative of the suppression phenotype.
Since two analytical approaches suggested duplication of the suppressor tRNASerUCA-C47:6U gene, we tested if introduction of an additional copy of this gene would result in the suppression phenotype. The parent strain was transformed by chromosomal integration of linearized kanR-containing pFA6A vector harboring either a suppressor tRNASerUCA-C47:6U insert of ~300bp, or no insert. Transformed cells were plated on YES, then replica plated onto YES+G418. G418-resistant colonies were randomly picked from both the no-insert and the suppressor tRNASerUCA-C47:6U-insert transformants, streak purified and transferred to limiting adenine for comparison to each other, parent and mutant, for TMS phenotype. None of the three analyzed transformants derived from the empty vector showed suppression (Figure 3A, row 3). By contrast, all five analyzed transformants derived from the suppressor tRNASerUCA-C47:6U gene clearly demonstrated suppression greater than in the parent strain yMWB3-15 (Figure 3A, compare rows 1 and 4) and to a similar degree as in the spontaneous mutant yDA317 (Figure 3A, row 2). This demonstrated that introduction of a suppressor tRNASerUCA-C47:6U gene into our parent strain results in a phenocopy of the mutant phenotype. Moreover, although yDA317 contained an extra tRNASerUCA-C47:6U gene as well as associated pJK148 vector DNA, the present result isolated the phenotypic segment to the extra tRNASerUCA-C47:6U gene. It also showed that the synonymous erg3-SNP in yDA317 was not required for the suppression phenotype. The data argue that the suppressor tRNASerUCA-C47:6U gene duplication detected by whole genome sequencing is responsible for the suppression phenotype in the yDA317 mutant.
As noted in the ‘Introduction’ section, preliminary characterization indicated that yDA317 exhibited higher levels of tRNASerUCA-C47:6U than the parent, yMWB3-15. To evaluate if introduction of an extra tRNASerUCA-C47:6U gene also led to elevated tRNASerUCA-C47:6U levels, we performed northern blots. Figure 3B demonstrates increased suppressor tRNASerUCA-C47:6U levels in both yDA317 mutant and yMWB3-15 transformed with the suppressor tRNASerUCA-C47:6U gene, relative to the nontransformed parent strain. Using U5 snRNA for calibration, suppressor tRNASerUCA-C47:6U levels were reproducibly elevated roughly 8-fold in the yDA317 mutant relative to the yMWB3-15 parent (Figure 3B, compare lanes 2 and 3). Similar increase occurred in yMWB3-15 transformed with the suppressor tRNASerUCA-C47:6U gene (lane 5) but not with the vector only (lane 4).
This verification was achieved by semi-quantitative duplex PCR amplification of genomic DNA at the suppressor tRNASerUCA-C47:6U and the sla1 loci, the latter as a single copy gene control in the same reaction (Figure 3C). PCR products from four independent reactions were quantitated for the tRNASerUCA-C47:6U bands and normalized to the sla1 PCR bands in the same lanes with the results shown below the lanes of Figure 3C. The tRNASerUCA-C47:6U fragment was 2-fold increased in yDA317 relative to the yMWB3-15 parent strain (lanes 2 and 3), consistent with estimation derived from the Solexa sequence read coverage. A similar increase was observed for the tRNA-transformed parent (Figure 3C, lane 5) relative to the empty vector-transformed parent strain (lane 4). The cumulative data establish that a suppressor tRNASerUCA-C47:6U gene duplication causes elevated tRNASerUCA-C47:6U levels and increased tRNASerUCA-C47:6U-mediated suppression.
The Solexa data indicated duplication of the suppressor tRNA gene and flanking DNA. We suspected a tandem duplication and tested this using a four-primer PCR approach (61). Genomic DNA was amplified using different combinations of four primers as diagramed in Figure 4A. The expected PCR fragments from primer pairs 1, 2 and 3, 4 should occur regardless of copy number and can be considered as positive controls for the primers (Figure 4B, lanes 1, 5, 8 and 12). Primer 2 alone or primer 3 alone would produce products only if tandem duplication occurred in opposite orientation, which was not observed (Figure 4B, lanes 6 and 7, and 13 and 14). However, combining primers 2 and 3 produced a ~5.2-kb fragment from yDA317 but not yMWB3-15 (Figure 4B, compare lanes 3 and 10). This fragment would arise only if tandem duplication occurred in the same orientation. As confirmation, this 5157-bp fragment was subjected to standard sequencing and found to have the anticipated sequence including an intact leu1+ (data not shown). (Although yDA317 contains two leu1+ genes these cells do not exhibit growth advantage over yMWB3-15, which contains a single leu1+ gene, in media lacking leucine, data not shown.) From these data, it appeared that the yMWB3-15 region demarcated by the gray filled triangles (Figure 4A) containing the suppressor tRNA gene was duplicated in yDA317.
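The interpretation of the four-primer assay follows a simple decision rule, sketched here in Python. The function name and the input format (a mapping from primer combination to whether a band was observed) are hypothetical; the rule itself restates the reasoning given in the text.

```python
def classify_tandem_duplication(bands):
    """Interpret four-primer PCR results.

    bands: dict mapping a tuple of primer numbers to True/False (band seen).
    Pairs (1, 2) and (3, 4) are positive controls; primer 2 alone or 3 alone
    amplifies only across an opposite-orientation junction; primers 2 + 3
    amplify only across a same-orientation tandem junction.
    """
    if not (bands.get((1, 2)) and bands.get((3, 4))):
        return "inconclusive: positive control failed"
    if bands.get((2, 3)):
        return "tandem duplication, same orientation"
    if bands.get((2,)) or bands.get((3,)):
        return "tandem duplication, opposite orientation"
    return "no tandem duplication detected"


# Band patterns as described in the text for the two strains
yDA317 = {(1, 2): True, (3, 4): True, (2,): False, (3,): False, (2, 3): True}
yMWB3_15 = {(1, 2): True, (3, 4): True, (2,): False, (3,): False, (2, 3): False}
print(classify_tandem_duplication(yDA317))    # tandem duplication, same orientation
print(classify_tandem_duplication(yMWB3_15))  # no tandem duplication detected
```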
Using the PCR tRNA gene duplication assay we examined genomic DNA from 30 additional mutants isolated at the same time as yDA317 and stored at −80°C. Upon reevaluation, three of the mutants had weak or no suppression phenotype and exhibited a single copy tRNASerUCA-C47:6U gene. Of the remaining 27 suppressed mutants, 21 had two copies of the tRNASerUCA-C47:6U gene and four had three or more copies. Thus, 25 of 28 spontaneous mutants (90%) with increased TMS had undergone tRNASerUCA-C47:6U gene duplication or additional amplification. The three suppressed mutants with a single tRNASerUCA-C47:6U gene did not express elevated levels of tRNASerUCA-C47:6U (data not shown) and are thus of questionable interest.
The northern blot data were somewhat surprising because prior experience with TMS suggested that an increase in suppressor tRNASerUCA levels necessary to cause a TMS phenotype would be greater than a 2-fold increase expected from a gene duplication.
We considered two models by which a tRNASerUCA-C47:6U gene duplication can lead to a disproportionately large increase in tRNASer-C47:6U levels, which could potentially be distinguished experimentally. It was assumed for both models that the increase in pre-tRNASerUCA-C47:6U transcription that results from gene duplication would overcome a concentration-dependent process that would allow some of the pre-tRNASerUCA-C47:6U to evade decay for at least long enough to be engaged by a more productive maturation process that promotes accumulation. In the first model, a 2-fold increase in the concentration of nascent pre-tRNA substrate might enhance the rate of association with the Sla1-Y157A,F201A protein, whereas in the second model, the increase in pre-tRNA might exceed the capacity of a degradative enzyme. The former was more appealing for multiple reasons including because as noted in the ‘Introduction’ section, Sla1p-rrm is expected to exhibit only a subtle deficiency in RNA binding relative to wild type Sla1p (32), which might be over ridden by a 2-fold increase in the concentration of nascent pre-tRNASerUCA-C47:6U ligand. In the second model, the increased pre-tRNASerUCA-C47:6U substrate might exceed the capacity of a degradative nuclease in the absence of Sla1p-rrm. It might be expected that in the absence of Sla1p more pre-tRNASerUCA-C47:6U would be targeted for decay than in its presence, and the increased pre-tRNA produced from a second copy of the tRNASerUCA-C47:6U gene would further exceed the capacity of the decay pathway and lead to an increased TMS phenotype.
We next asked whether sla1-rrm was required for the TMS phenotype upon an increase in tRNASerUCA-C47:6U gene copy number. We examined yMWB1-1 (sla1-Δ) which is isogenic with yMWB3-15 except that it is devoid of Sla1-rrm protein (Table 3 and data not shown). Neither yMWB1-1 nor yMWB3-15 exhibits TMS (Figure 5A, row 1). yMWB1-1 was transformed with empty vector or vector containing the tRNASerUCA-C47:6U gene, and then examined for TMS, with parallel transformation of yMWB3-15 to serve as control. Individual transformants were picked randomly and then examined for TMS (Figure 5A, rows 2–5). Introduction of vector alone or vector plus an additional copy of the suppressor tRNA gene into yMWB1-1 did not produce suppression (Figure 5A rows 4 and 5). This was in contrast to yMWB3-15 (sla1-Y157A,F201A) into which an extra copy of the suppressor tRNA gene produced increased TMS in all five transformants, whereas empty vector did not (Figure 5A, rows 2 and 3, respectively). These data would appear to distinguish between the two models considered above and suggest that the mechanism by which tRNASerUCA-C47:6U gene duplication leads to the observed TMS phenotype requires Sla1-rrm protein (see ‘Discussion’ section).
The data are consistent with the expectation that an increase in nascent pre-tRNASerUCA-C47:6U produced from the second copy gene would be enough to increase its rate of association with the Sla1p-rrm, with resulting positive effects on TMS. If so, we might expect that overexpression of ectopic Sla1p-rrm in yMWB3-15 would lead to TMS. Indeed, ectopic expression of either wild-type sla1+ or sla1-Y157A,F201A, led to TMS (Figure 5B). The data in Figure 5 support the contention that increasing the association of pre-tRNASerUCA-C47:6U with Sla1p-Y157A,F201A, is sufficient to lead to TMS.
The original Sanger Institute sequence assembly of 972h- (1) was recently compared to that obtained using Solexa technology for a single 972h- isolate designated yFS101 (51). We compared our sequence to the yFS101 sequence obtained by Solexa technology and identified 24 differences (Table 4). At each of these positions, yDA317 and yMWB3-15 were identical. We prepared DNA from yFS101 (kindly provided by N. Rhind) and verified these differences in yMWB3-15 and yFS101 in parallel by conventional sequencing. The 24 SNPs consisted of four coding mutations that changed identities of single amino acids in each of four protein sequences, one miscRNA of unknown function, four synonymous coding and 14 intergenic or intronic mutations (Table 4). In addition to these, the Solexa data confirmed mutation in ade6-704 allele in our strain, a T→A that changes a cysteine codon at position 215 of Ade6p to a stop codon (Table 4) as reported (65), as well as mutations in the ura4+ locus (data not shown).
After establishing our modified reference genome for yMWB3-15 and obtaining and verifying the nuclear DNA SNPs relative to yFS101, we further modified it by incorporating the available S. pombe mtDNA reference sequence (54,55). This had two notable effects on the analysis. First, overall mapped read coverage increased from 52% to 96%, with 44% of the reads mapping to the mtDNA. Second, read depth for the mtDNA was uniformly very high, ranging from 5000- to 10 000-fold (e.g. see Table 4). Assuming ~100 copies of mtDNA per S. pombe cell (67), this degree of coverage is congruent with our nuclear DNA coverage.
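The copy-number reasoning above can be sketched as a short calculation. This is a minimal illustration, assuming that read depth scales linearly with the number of template copies per cell and that the nuclear genome is present at one copy (haploid); the function name and its parameters are hypothetical and are not part of the analysis pipeline used here.

```python
def implied_nuclear_depth(mt_depth: float,
                          mt_copies_per_cell: float,
                          nuclear_copies_per_cell: float = 1.0) -> float:
    """Nuclear read depth implied by an observed mtDNA read depth.

    Assumption (not from the source): depth is proportional to the number
    of template copies per cell, with no mapping or extraction bias.
    """
    return mt_depth * nuclear_copies_per_cell / mt_copies_per_cell


# Observed mtDNA depth range from the text, with ~100 mtDNA copies per cell.
for mt_depth in (5000, 10000):
    depth = implied_nuclear_depth(mt_depth, mt_copies_per_cell=100)
    print(f"mtDNA depth {mt_depth} implies nuclear depth of about {depth:.0f}-fold")
```

Under these assumptions, the observed mtDNA depth corresponds to nuclear coverage in the tens to one hundred fold, which is the sense in which the two coverage values can be called congruent.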
With a cutoff of 85% identity, we observed 13 SNPs and two single-nucleotide insertions/deletions in the mtDNA of our parent and mutant strains relative to the reference mtDNA sequence (54,55) (Table 4, Mt). In contrast to the SNPs observed in the nuclear chromosomes, the majority of these mtDNA differences result in coding changes. The source(s) of these mtDNA differences is unknown (see ‘Discussion’ section).
We then examined the available yFS101 data set which had not previously been analyzed for mtDNA (51), comparing it to the reference mtDNA (54,55). The same set of 13 SNPs and two insertion/deletions was identified in yFS101 as in our strain, revealing 100% identity of yMWB3-15 and yFS101 throughout their mtDNA (Table 4, Mt). In conclusion, our strain and yFS101 differ in 24 SNPs in their nuclear DNA but are identical in their mtDNA.
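The comparison of two strains against a common reference amounts to comparing two sets of variant calls. The sketch below shows that logic on toy data; the `Variant` record, the `compare_calls` helper and all positions are hypothetical placeholders, not the calls reported in Table 4.

```python
from typing import Dict, NamedTuple, Set


class Variant(NamedTuple):
    """One difference from the reference sequence (hypothetical record)."""
    contig: str
    pos: int
    ref: str
    alt: str


def compare_calls(a: Set[Variant], b: Set[Variant]) -> Dict[str, Set[Variant]]:
    """Partition two call sets into shared and strain-specific variants."""
    return {"shared": a & b, "only_a": a - b, "only_b": b - a}


# Toy call sets: identical on the mitochondrial contig, different on a nuclear one.
strain_a = {Variant("mt", 101, "A", "G"), Variant("mt", 2050, "C", "T"),
            Variant("chr1", 5000, "T", "A")}
strain_b = {Variant("mt", 101, "A", "G"), Variant("mt", 2050, "C", "T")}

result = compare_calls(strain_a, strain_b)
print(len(result["shared"]), "shared;", len(result["only_a"]), "only in strain A")
```

Two strains with identical mtDNA would give empty `only_a` and `only_b` sets for the mitochondrial contig, while nuclear differences between them would appear as strain-specific entries.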
One conclusion that can be drawn from this work is that comparative whole genome sequencing of S. pombe can be used to detect and map mutations in spontaneous mutants selected for a phenotype, in this case related to tRNA expression, after one plating. The analytical tools described here can detect insertions, deletions, differences in gene copy number and SNPs. This led to the discovery that duplication of a suppressor tRNASerUCA-C47:6U gene is responsible for the selected phenotype.
To compare our strains with each other we first generated an electronic version of our parent strain genome using the original S. pombe reference as a platform (1). Having this in place, we could then compare our strain with the reference 972h- yFS101 genome that had been recently updated using Illumina (Solexa) technology (1,51). This revealed that our strain differs from yFS101 by only 24 SNPs (in addition to expected difference in marker alleles such as ade6-704) (Table 4) (65).
A second conclusion is that the RNA chaperone-like activity of La protein appears to be able to activate its target pre-tRNA ligands for subsequent accumulation of the mature functional tRNA. As will be discussed in more detail below, and consistent with other RNA chaperones that function via low-affinity and nonspecific interactions with their RNA ligands, our data suggest that the deficiency in nonspecific RNA-binding activity of the mutated Sla1-rrm protein can apparently be overcome by the increase in nascent pre-tRNA that would result from a 2-fold increase in gene dosage. Although a 2-fold increase in a nascent pre-tRNA may appear modest, the effects should be considered in the context of the increased concentration of a single pre-tRNA in a diverse mixture of other pre-tRNAs that compete with variable affinity for La. Accordingly, with a 2-fold increase in its concentration, the suppressor pre-tRNA should associate more competitively with the Sla1-rrm protein and thereby benefit from its chaperone-like activity.
Our genetic screen was designed to select for mutants that exhibited a gain-of-function in codon-specific TMS of ade6-704. Indeed, the output of the screen reported here was an increase in codon-specific TMS that occurred in association with increased suppressor tRNASerUCA-C47:6U levels. Analysis led to the discovery that duplication of the suppressor tRNASerUCA-C47:6U gene is responsible for the selected phenotype. Although at first glance this outcome might appear to have been likely, it was unexpected for the following reasons. First, on the basis of other genetic screens of yeast, SNPs appear to be uncovered more often than gene duplications. Second, prior and current evidence indicate that the deficiency in tRNASerUCA-C47:6U expression in our parent strain is at the posttranscriptional level, due to degradation of the precursor-tRNA in the absence of protection by La activity-2 (32). Thus, our expectation was that we might find a debilitating mutation in a gene that contributes to surveillance-related degradation of the pre-tRNASerUCA-C47:6U in the absence of La activity-2. Finally, prior experience suggested that the increase in tRNASerUCA levels required to cause an increased TMS phenotype would be greater than the 2-fold increase expected from a gene duplication. On this issue, we were surprised by the disproportionately high (8-fold) increase in mature suppressor tRNASerUCA-C47:6U in yDA317 and upon intentional introduction of a second copy of the tRNASerUCA-C47:6U gene.
We hoped to identify a trans-acting factor whose mutation would allow the structurally defective pre-tRNASer-C47:6U to overcome decay as in wild-type sla1+ cells. Surprisingly, tRNASerUCA-C47:6U gene duplication led to disproportionately high levels of tRNASerUCA-C47:6U accumulation. An initial idea, that the tandem arrangement of tRNASerUCA-C47:6U genes in yDA317 contributes to activated transcription due to increased local concentration of pol III, was dismissed after finding that non-targeted introduction of a second tRNASerUCA-C47:6U gene also led to increased suppression in all of the multiple transformants examined, with a 5–8-fold increase in tRNASerUCA-C47:6U levels (Figure 3, data not shown). To gain insight into the mechanism by which the gene duplication led to unexpectedly high levels of tRNASerUCA-C47:6U, we determined that this effect depended on the presence of the sla1-rrm allele, which carries the Y157A,F201A substitutions (or wild-type sla1+), since introduction of an extra copy of the tRNASer-C47:6U gene into the sla1-null strain yMWB1-1 did not cause suppression (Figure 5A). Likewise, overexpression of sla1-Y157A,F201A (or wild-type sla1+) also led to increased TMS in yMWB3-15 (and yMWB1-1, not shown) (Figure 5B).
The question is how a 2-fold increase in synthesis of a structurally deficient pre-tRNA, as would be expected to occur upon gene duplication, can lead to a greater-fold increase in tRNA accumulation in a La-dependent manner. We believe that a 2-fold increase in nascent pre-tRNA is sufficient to overcome the slight decrease in the RNA-binding affinity of Sla1p caused by the Y157A,F201A mutations to the β-sheet surface of RRM1. By overcoming this binding deficiency, the pre-tRNASerUCA-C47:6U would benefit from the RNA chaperone-like activity of Sla1p and evade nuclear decay during a critically vulnerable period in its maturation. Avoiding decay during a vulnerable period, even transiently, could allow a fraction of the tRNASerUCA-C47:6U to accumulate, e.g. by export to the cytoplasm or other locale away from the nuclear decay system.
To better understand the Sla1p-rrm related aspects of this model, some of the RNA-binding characteristics of La protein and other RRM proteins, as well as RNA chaperone activities, should be considered. Different RRM-containing proteins use their β-sheets and connecting loops in a variety of ways to engage their target RNAs (68). In some RRM proteins, the aromatic residues equivalent to Sla1p Y157 and F201 interact with RNA via stacking in a sequence-nonspecific manner and contribute significantly to overall affinity, whereas in others their contribution is much less (68). For La protein, the LaM confers high-affinity UUU-3′OH recognition, while the RRM1 β-sheet surface appears to confer nonspecific binding to other regions of the pre-tRNA, as might be expected for interaction with a variety of RNAs that differ in sequence (47). Whereas mutation of the conserved aromatic residues on the RRM1 β-sheet of some other RRM proteins decreases RNA binding, for La RRM1 no significant reduction in pre-tRNA binding was observed in vitro, although the negative effects on TMS were apparent in vivo (32). Thus, mutation of the RRM1 β-sheet aromatic residues appears to minimally decrease the nonspecific RNA-binding activity of La protein (32,47). Accordingly, a 2-fold increase in the concentration of pre-tRNA ligand may be sufficient to increase the rate of association with Sla1p-Y157A,F201A and overcome this minimal deficiency. Consistent with this model, nonspecific, low-affinity interactions indeed characterize other RNA chaperones that assist RNAs during folding (69). That Sla1p-rrm confers this activity advances our understanding of the RNA chaperone-like activity of La protein.
Finally, as noted in the third paragraph of the ‘Discussion’ above, we should also consider the effect of a 2-fold increase in the concentration of a single pre-tRNA on the potential competitive strength for La binding in the presence of all of the other nascent pre-tRNAs. Accordingly, a 2-fold increase in nascent suppressor pre-tRNA should confer better competition for the Sla1-rrm protein and thereby afford it the benefit of its chaperone-like activity.
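The competition argument can be illustrated with a toy equilibrium calculation. The sketch assumes a single binding site, ligands in excess over protein, and standard competitive mass-action binding; the pool of 50 competitor pre-tRNAs, their concentrations and the dissociation constants are arbitrary placeholders rather than measured values, and the function name is hypothetical.

```python
from typing import Dict


def occupancy_share(conc: Dict[str, float], kd: Dict[str, float], target: str) -> float:
    """Fraction of a single-site protein bound to `target` at equilibrium.

    Assumes all ligands compete for the same site and are in excess, so that
    free ligand concentration is approximately total concentration.
    """
    weights = {name: conc[name] / kd[name] for name in conc}
    return weights[target] / (1.0 + sum(weights.values()))


# Hypothetical pool: one suppressor pre-tRNA among 50 competitors, all at the
# same arbitrary concentration (1.0) and the same low affinity (Kd = 10.0).
conc = {"suppressor": 1.0, **{f"pre-tRNA-{i}": 1.0 for i in range(50)}}
kd = {name: 10.0 for name in conc}

before = occupancy_share(conc, kd, "suppressor")
after = occupancy_share({**conc, "suppressor": 2.0}, kd, "suppressor")
print(f"share before: {before:.4f}, after: {after:.4f}, fold change: {after / before:.2f}")
```

In this toy setting, doubling one minor ligand nearly doubles its share of the protein because the competing pool is essentially unchanged. The calculation illustrates only the competition step; it does not by itself account for the larger, disproportionate increase in mature tRNA, which in the model discussed here would also depend on escape from nuclear decay.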
The number of discrepancies detected between our parent and mutant strains was low, consisting of a synonymous SNP and a duplication of >500 bp, the latter of which was causative of the selected phenotype. We note that in addition to the suppressor tRNA gene, three other duplications were detected (Table 2), although the relatively small size of these regions, at 23, 43 and 108 bp, and the intergenic locale of two of them made them unlikely candidates as causative of the phenotype. Moreover, for these loci the ratio of mutant-to-parent reads was <2.0, whereas for the suppressor tRNA gene the ratio was >2.0. Perhaps more significantly, the read depth at each of these three loci was lower than for the suppressor tRNA gene (Table 2), and the apparent signals likely result from noise in relative mapped read depth that exceeds the averaging window. On the basis of the last two points, we suspect that these may not reflect true duplications in the yDA317 strain, although we have not attempted to verify this.
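The read-depth criteria described above can be sketched as a windowed comparison of mutant and parent coverage. This is an illustrative sketch only, not the MATCH-G implementation: the function name, the median-based normalization, the window size and the thresholds are all assumptions, and the depth profiles are toy data.

```python
from statistics import mean, median
from typing import List, Sequence, Tuple


def flag_duplications(parent_depth: Sequence[float],
                      mutant_depth: Sequence[float],
                      window: int = 100,
                      min_ratio: float = 1.8,
                      min_parent_depth: float = 10.0) -> List[Tuple[int, int, float]]:
    """Return (start, end, ratio) for windows with elevated mutant/parent depth.

    Depths are per-base read counts along one contig. The mutant profile is
    rescaled by the ratio of median depths so that a single-copy region gives
    a ratio near 1.0 and a duplicated region a ratio near 2.0.
    """
    if len(parent_depth) != len(mutant_depth):
        raise ValueError("depth profiles must have the same length")
    scale = median(parent_depth) / median(mutant_depth)
    hits = []
    for start in range(0, len(parent_depth) - window + 1, window):
        p = mean(parent_depth[start:start + window])
        m = mean(mutant_depth[start:start + window]) * scale
        if p < min_parent_depth:
            continue  # too little coverage for a reliable ratio
        ratio = m / p
        if ratio >= min_ratio:
            hits.append((start, start + window, ratio))
    return hits


# Toy profiles: uniform 50-fold depth, with positions 400-599 doubled in the mutant.
parent = [50] * 1000
mutant = [50] * 400 + [100] * 200 + [50] * 400
for start, end, ratio in flag_duplications(parent, mutant):
    print(f"candidate duplication {start}-{end}, ratio {ratio:.2f}")
```

Requiring both a minimum parent depth and a minimum ratio reflects the two points made above: candidate regions that are short relative to the window, or that sit in poorly covered sequence, are the ones most likely to arise from depth noise.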
The original sequence assembly of 972h- (1) was recently compared to that obtained using Solexa technology for a single 972h- isolate designated yFS101 (51). In that case, 200-fold coverage provided a very low false-positive rate. Significantly, 190 discrepancies were observed and noted to be possibly due to errors in the reference 972h- assembly, sequence polymorphisms in isolates of 972h- or systematic errors (51). Our sequence matched the yFS101 sequence at all of these previously noted discrepancies, but did not match either the original reference or yFS101 at 24 other positions (Table 4). At each of these 24 positions, yDA317 and yMWB3-15 were identical. These differences between yMWB3-15 and yFS101 were verified by PCR of the loci from both strains (yFS101 provided by N. Rhind) and conventional sequencing (Table 4, last two columns).
The 24 SNPs observed between our strain and yFS101 consisted of four coding mutations that changed the identity of an amino acid in a protein sequence, one SNP in a miscRNA of unknown function, four synonymous coding SNPs and 14 intergenic or intronic SNPs (Table 4). The functional consequences of these SNPs are unknown; in some cases they could confer a phenotype, and in others they could be completely neutral.
Of the four coding changes, we note the following. For Rpb2p, L872R changes a conserved Leu in the several yeast sequences examined and therefore appears to be a mutation in our strain or its predecessor. For Puf3p, I709M occurs close to the end of the protein; the surrounding sequence is highly conserved among Schizosaccharomyces species (although not S. cerevisiae or human) and, therefore, appears to be a mutation in our strain or its predecessor.
The observation of 24 SNPs between two strains that were separated about 30 years ago suggests genetic drift, or possibly inadvertent selection if a SNP conferred a phenotype. In any case, we would caution against the use of these data for the purpose of calculating a ‘mutation rate’, because the length of time these strains have been continuously growing (versus in frozen storage) is unknown, as are other aspects of their propagation. Nonetheless, the perspective that arises is that the overall mutation burden in laboratory strains of S. pombe appears not to be very high. Indeed, it is low enough to make whole genome sequencing of spontaneous mutants a plausible endeavor.
We note that this study is to be distinguished from a previous report of whole genome sequencing for mutant mapping in S. pombe (70). In that case, sequencing was done after cells were mutagenized with N-methyl-N′-nitro-N-nitrosoguanidine (MNNG) and the mutant was isolated. Comparative analyses revealed a relatively large number of SNPs, 73 of which were present in the selected mutant that had been mutagenized (70). The presence of so many associated mutations required substantial genetic analysis before the selected phenotype could be attributed to the causative mutation (70). By contrast, our analysis by whole genome sequencing after spontaneous mutation revealed one synonymous SNP plus a gene duplication that was shown to be causative of the selected phenotype, with far less need for genetic analysis.
That we isolated a tRNA gene duplication rather than SNPs as the common mutation (90%) in our TMS mutants may reflect an idiosyncrasy of our genetic screen, or perhaps a characteristic of tRNA genes. As noted above, SNPs appear to be uncovered more often than gene duplications in genetic screens of yeast. However, since mutants obtained from genetic screens are typically characterized before detailed analysis, it seems possible that this prerequisite examination disfavors the characterization of gene duplication mutants. The ability to map phenotype-causative mutations by whole genome sequencing without the need for extensive prerequisite characterization may allow a wider variety of mutations to be isolated.
The toolset of programs compiled for this purpose is designated MATCH-G (Mutational Analysis Toolset Comparing wHole Genomes) and is available with description and instructions for free download at http://science.nichd.nih.gov/confluence/display/smcb/MATCH-G+Program. Some of the programs preexisted and some were developed specifically for this purpose. Use requires only basic computer knowledge (folder structure, file formats). The toolset is written to scale to a genome of any size for the purpose of finding differences between two sequence data sets where a pre-existing related reference genome exists. It is run either from the command prompt or with a simple graphical user interface. No coding or knowledge of programming languages is required. Use in Mac OS X or UNIX-like environments, where terminal windows and Perl are present by default, is most straightforward. Use in Windows operating systems may be more complicated.
Our work demonstrates that whole-genome sequencing of S. pombe has the ability to detect single-nucleotide mutations, base insertions and deletions, and changes in gene copy number. The tools developed here should be applicable toward identifying factors and pathway components in high-throughput screens that produce spontaneously arising mutants. This work expands the tools available for the S. pombe community. The tools developed here for comparative whole genome sequencing could readily be applied to S. cerevisiae and other model systems.
Supplementary Data are available at NAR Online.
The Intramural Research Program of the NICHD; National Institutes of Health. Funding for open access charge: National Institutes of Health.
Conflict of interest statement. None declared.
We would like to thank Sergei Gaidamakov and Steven Salzberg for contributions to this work, and Jim Kennison for the tandem duplication PCR strategy. We also thank Valerie Wood and Nick Rhind for information and materials related to the reference strain.