The completion of the genome sequencing projects for major pathogens Trypanosoma brucei, Trypanosoma cruzi and Leishmania major has enabled numerous studies that would have been difficult or impossible to perform otherwise. New technologies in sequencing and protein analyses promise further rapid expansion in our capabilities. The keys to successful use of these new tools are recognizing the power and limitations of studies performed thus far, grasping the unrealized potential of new and developing technologies, and creating access to a multidisciplinary set of skills that will facilitate research, particularly in the bioinformatic analysis of the reams of data that will be forthcoming. In this Discussion, we will provide an overview of kinetoplastid genomics studies with emphasis on studies advanced through genomic data, and a preview of what may come in the near future.
bioinformatics; Leishmania; Trypanosoma; trypanosome
The genus Perkinsus occupies a precarious phylogenetic position. To gain a better understanding of the relationship between perkinsids, dinoflagellates and other alveolates, we analyzed the nuclear-encoded spliced-leader (SL) RNA and mitochondrial genes, intron prevalence, and multi-protein phylogenies. In contrast to the canonical 22-nt SL found in dinoflagellates (DinoSL), P. marinus has a shorter (21-nt) and a longer (22-nt) SL with slightly different sequences than DinoSL. The major SL RNA transcripts range in size between 80–83 nt in P. marinus, and ∼83 nt in P. chesapeaki, significantly larger than the typical ≤56-nt dinoflagellate SL RNA. In most of the phylogenetic trees based on 41 predicted protein sequences, P. marinus branched at the base of the dinoflagellate clade that included the ancient taxa Oxyrrhis and Amoebophrya, sister to the clade of apicomplexans, and in some cases clustered with apicomplexans as a sister to the dinoflagellate clade. Of 104 Perkinsus spp. genes examined, 69.2% had introns, a higher intron prevalence than in dinoflagellates. Examination of Perkinsus spp. mitochondrial cytochrome B and cytochrome C oxidase subunit I genes and their cDNAs revealed no mRNA editing, but these transcripts can only be translated when frameshifts are introduced at every AGG and CCC codon, as if AGGY codes for glycine and CCCCU for proline. These results, along with the presence of the numerous uncharacterized ‘marine alveolate group I’ and Perkinsus-like lineages separating perkinsids from core dinoflagellates, expand support for the affiliation of the genus Perkinsus with an independent lineage (Perkinsozoa) positioned between the phyla of Apicomplexa and Dinoflagellata.
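The frameshift rule described for the Perkinsus mitochondrial transcripts can be illustrated with a minimal sketch. It assumes the standard genetic code for all other codons (the abstract does not state which code applies), and it applies the rule only where the stated context is present: an in-frame AGG followed by a pyrimidine is read as a 4-nt glycine, and an in-frame CCC followed by CU is read as a 5-nt proline. The function name and the example sequence are hypothetical.

```python
from itertools import product

# Standard genetic code in UCAG order (an assumption; the actual
# Perkinsus mitochondrial code is not specified in the text).
BASES = "UCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def translate_with_frameshifts(rna):
    """Translate an mRNA, reading AGGY as Gly (4 nt) and CCCCU as Pro (5 nt)."""
    rna = rna.upper().replace("T", "U")
    protein = []
    i = 0
    while i + 3 <= len(rna):
        codon = rna[i:i + 3]
        if codon == "AGG" and rna[i + 3:i + 4] in ("U", "C"):
            protein.append("G")   # AGGY: +1 frameshift, glycine
            i += 4
        elif codon == "CCC" and rna[i + 3:i + 5] == "CU":
            protein.append("P")   # CCCCU: +2 frameshift, proline
            i += 5
        else:
            aa = CODON_TABLE[codon]
            if aa == "*":         # stop codon ends translation
                break
            protein.append(aa)
            i += 3
    return "".join(protein)


# Hypothetical toy transcript: AUG | AGGU | UUU | CCCCU | GCA | UAA
print(translate_with_frameshifts("AUGAGGUUUUCCCCUGCAUAA"))  # MGFPA
```

Without the two frameshift branches the same toy sequence would be read in a single frame and give a different, unrelated peptide, which is the sense in which the transcripts "can only be translated" under this rule.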
The 5′ cap of human messenger RNA consists of an inverted 7-methylguanosine linked to the first transcribed nucleotide by a unique 5′–5′ triphosphate bond followed by 2′-O-ribose methylation of the first and often the second transcribed nucleotides, likely serving to modify efficiency of transcript processing, translation and stability. We report the validation of a human enzyme that methylates the ribose of the second transcribed nucleotide encoded by FTSJD1, henceforth renamed HMTR2 to reflect function. Purified recombinant hMTr2 protein transfers a methyl group from S-adenosylmethionine to the 2′-O-ribose of the second nucleotide of messenger RNA and small nuclear RNA. Neither N7 methylation of the guanosine cap nor 2′-O-ribose methylation of the first transcribed nucleotide is required for hMTr2 activity, but the presence of cap1 methylation increases hMTr2 activity. The hMTr2 protein is distributed throughout the nucleus and cytosol, in contrast to the nuclear hMTr1. The details of how and why specific transcripts undergo modification with these ribose methylations remain to be elucidated. The 2′-O-ribose RNA cap methyltransferases are present in varying combinations in most eukaryotic and many viral genomes. With the capping enzymes in hand, their biological purpose can be ascertained.
Chagas disease has a diverse pathology caused by the parasite Trypanosoma cruzi, and is indigenous to Central and South America. A pronounced feature of the trypanosomes is the kinetoplast, which is composed of catenated maxicircles and minicircles that provide the transcripts involved in uridine insertion/deletion RNA editing. T. cruzi exchanges genetic material through a hybridization event. Extant strains are grouped into six discrete typing units by nuclear markers, and three clades, A, B, and C, based on maxicircle gene analysis. Clades A and B are the more closely related. Representative clade B and C maxicircles are known in their entirety, and portions of A, B, and C clades from multiple strains show intra-strain heterogeneity with the potential for maxicircle taxonomic markers that may correlate with clinical presentation.
To perform a genome-wide analysis of the three maxicircle clades, the coding region of clade A representative strain Sylvio X10 (a.k.a. Silvio X10) was sequenced by PCR amplification of specific fragments followed by assembly and comparison with the known CL Brener and Esmeraldo maxicircle sequences. The clade A rRNA and protein coding region maintained synteny with clades B and C. Amino acid analysis of non-edited and 5'-edited genes for Sylvio X10 showed the anticipated gene sequences, with notable frameshifts in the non-edited regions of Cyb and ND4. Comparisons of genes that undergo extensive uridine insertion and deletion display a high number of insertion/deletion mutations that are likely permissible due to the post-transcriptional activity of RNA editing.
Phylogenetic analysis of the entire maxicircle coding region supports the closer evolutionary relationship of clade B to A, consistent with uniparental mitochondrial inheritance from a discrete typing unit TcI parental strain and studies on smaller fragments of the mitochondrial genome. Gene variance that can be corrected by RNA editing hints at an unusual depth for maxicircle taxonomic markers, which will aid in the ability to distinguish strains, their corresponding symptoms, and further our understanding of the T. cruzi population structure. The prevalence of apparently compromised coding regions outside of normally edited regions hints at undescribed but active mechanisms of genetic exchange.
Spliced leader (SL) trans-splicing is a common mRNA processing mechanism in dinoflagellates, in which a 22-nt sequence is transferred from the 5′-end of a small noncoding RNA, the SL RNA, to the 5′-end of mRNA molecules. Although the SL RNA gene was shown initially to be organized as tandem repeats with transcripts of 50–60 nt, shorter than most of their counterparts in other organisms, other gene organizations and transcript lengths were reported subsequently. To address the evolutionary gradient of gene organization complexity, we thoroughly examined transcript and gene organization of the SL RNA in a phylogenetically and ecologically diverse group of dinoflagellates representing four Orders. All these dinoflagellates possessed SL RNA transcripts of 50–60 nt, although in one species additional transcripts of up to 92 nt were also detected. At the genomic level, various combinations of SL RNA and 5S rRNA tandem gene arrays, including SL RNA–only, 5S rRNA–only, and mixed SL RNA–5S rRNA (SL–5S) clusters, were amplified by polymerase chain reaction for six dinoflagellates, containing intergenic spacers ranging from 88 bp to over 1.2 kb. Of these species, no SL–5S cluster was detected in Prorocentrum minimum, and only Karenia brevis showed the U6 small nuclear RNA gene associated with these mixed arrays. The 5S rRNA–only array was also found in three dinoflagellates, along with two SL–5S-adjacent arrangements found in two other species that could represent junctions. Two species contained multimeric SL exon repeats with no associated intron. 
These results suggest that 1) both the SL RNA tandem repeat and the SL–5S cluster genomic organizations are “ancient” and widespread features within the phylum of dinoflagellates and 2) rampant genomic duplication and recombination are ongoing independently in each dinoflagellate lineage, giving rise to the highly complex and diversified genomic arrangements of the SL RNA gene, while conserving the length and structure of the functional SL RNA.
Dinoflagellate; SL RNA; complex genomic arrangement
Through trans-splicing of a 39-nt Spliced Leader (SL) onto each protein-coding transcript, mature kinetoplastid mRNAs acquire a hypermethylated 5′-cap structure, but its function has been unclear. Gene deletions for three Trypanosoma brucei cap 2′-O-ribose methyltransferases, TbMTr1, TbMTr2, and TbMTr3, reveal distinct roles for four 2′-O-methylated nucleotides. Elimination of individual gene pairs yields viable cells; however, attempts at double knockouts resulted in the generation of a TbMTr2−/−/TbMTr3−/− cell line only. Absence of both kinetoplastid-specific enzymes in TbMTr2−/−/TbMTr3−/− lines yielded substrate SL RNA and mRNA with cap 1. TbMTr1−/− translation is comparable to wildtype, while cap 3 and cap 4 loss reduced translation rates, exacerbated by the additional loss of cap 2. TbMTr1−/− and TbMTr2−/−/TbMTr3−/− lines grow to lower densities under normal culture conditions relative to wildtype cells, with growth rate differences apparent under low serum conditions. Cell viability may not tolerate delays at both the nucleolar Sm-independent and nucleoplasmic Sm-dependent stages of SL RNA maturation combined with reduced rates of translation. A minimal level of mRNA cap ribose methylation is essential for trypanosome viability, providing the first functional role for the cap 4.
gene knockout; methyltransferase; ribose 2′-O-methylation; SL RNA; spliced leader; trans-splicing
Kinetoplastid flagellates attach a 39-nucleotide spliced leader (SL) upstream of protein-coding regions in polycistronic RNA precursors through trans splicing. SL modifications include cap 2′-O-ribose methylation of the first four nucleotides and pseudouridine (ψ) formation at uracil 28. In Trypanosoma brucei, TbMTr1 performs 2′-O-ribose methylation of the first transcribed nucleotide, or cap 1. We report the characterization of an SL RNA processing complex with TbMTr1 and the SLA1 H/ACA small nucleolar ribonucleoprotein (snoRNP) particle that guides SL ψ28 formation. TbMTr1 is in a high-molecular-weight complex containing the four conserved core proteins of H/ACA snoRNPs, a kinetoplastid-specific protein designated methyltransferase-associated protein (TbMTAP), and the SLA1 snoRNA. TbMTAP-null lines are viable but have decreased SL RNA processing efficiency in cap methylation, 3′-end maturation, and ψ28 formation. TbMTAP is required for association between TbMTr1 and the SLA1 snoRNP but does not affect U1 small nuclear RNA methylation. A complex methylation profile in the mRNA population of TbMTAP-null lines indicates an additional effect on cap 4 methylations. The TbMTr1 complex specializes the SLA1 H/ACA snoRNP for efficient processing of multiple modifications on the SL RNA substrate.
Many components of the RNA polymerase II transcription machinery have been identified in kinetoplastid protozoa, but they diverge substantially from other eukaryotes. Furthermore, protein-coding genes in these organisms lack individual transcriptional regulation, since they are transcribed as long polycistronic units. The transcription initiation sites are assumed to lie within the 'divergent strand-switch' regions at the junction between opposing polycistronic gene clusters. However, the mechanism by which Kinetoplastidae initiate transcription is unclear, and promoter sequences are undefined.
The chromosomal locations of TATA-binding protein (TBP or TRF4), Small Nuclear Activating Protein complex (SNAP50), and H3 histones were assessed in Leishmania major using microarrays hybridized with DNA obtained through chromatin immunoprecipitation (ChIP-chip). The TBP and SNAP50 binding patterns were almost identical and high intensity peaks were associated with tRNAs and snRNAs. Only 184 peaks of acetylated H3 histone were found in the entire genome, with substantially higher intensity in rapidly dividing cells than in stationary-phase cells. The majority of the acetylated H3 peaks were found at divergent strand-switch regions, but some occurred at chromosome ends and within polycistronic gene clusters. Almost all these peaks were associated with lower intensity peaks of TBP/SNAP50 binding a few kilobases upstream, evidence that they represent transcription initiation sites.
The first genome-wide maps of DNA-binding protein occupancy in a kinetoplastid organism suggest that H3 histones at the origins of polycistronic transcription of protein-coding genes are acetylated. Global regulation of transcription initiation may be achieved by modifying the acetylation state of these origins.
mRNA cap 1 2′-O-ribose methylation is a widespread modification that is implicated in processing, trafficking, and translational control in eukaryotic systems. The eukaryotic enzyme has yet to be identified. In kinetoplastid flagellates trans-splicing of spliced leader (SL) to polycistronic precursors conveys a hypermethylated cap 4, including a cap 0 m7G and seven additional methylations on the first 4 nucleotides, to all nuclear mRNAs. We report the first eukaryotic cap 1 2′-O-ribose methyltransferase, TbMTr1, a member of a conserved family of viral and eukaryotic enzymes. Recombinant TbMTr1 methylates the ribose of the first nucleotide of an m7G-capped substrate. Knockdowns and null mutants of TbMTr1 in Trypanosoma brucei grow normally, with loss of 2′-O-ribose methylation at cap 1 on substrate SL RNA and U1 small nuclear RNA. TbMTr1-null cells have an accumulation of cap 0 substrate without further methylation, while spliced mRNA is modified efficiently at position 4 in the absence of 2′-O-ribose methylation at position 1; downstream cap 4 methylations are independent of cap 1. Based on TbMTr1-green fluorescent protein localization, 2′-O-ribose methylation at position 1 occurs in the nucleus. Accumulation of 3′-extended SL RNA substrate indicates a delay in processing and suggests a synergistic role for cap 1 in maturation.
Members of the family Trypanosomatidae infect many organisms, including animals, plants and humans. Plant-infecting trypanosomes are grouped under the single genus Phytomonas, failing to reflect the wide biological and pathological diversity of these protists. While some Phytomonas spp. multiply in the latex of plants, or in fruit or seeds without apparent pathogenicity, others colonize the phloem sap and afflict plants of substantial economic value, including the coffee tree, coconut and oil palms. Plant trypanosomes have not been studied extensively at the genome level, a major gap in understanding and controlling pathogenesis. We describe the genome sequences of two plant trypanosomatids, one pathogenic isolate from a Guianan coconut and one non-symptomatic isolate from Euphorbia collected in France. Although these parasites have markedly distinct pathogenic impacts, very few genes are unique to either, with the vast majority of genes shared by both isolates. Significantly, both Phytomonas spp. genomes consist essentially of single-copy genes for the bulk of their metabolic enzymes, whereas other trypanosomatids, e.g. Leishmania and Trypanosoma, possess multiple paralogous genes or families. Indeed, comparison with other trypanosomatid genomes revealed a highly streamlined genome, encoding a minimized metabolic system while conserving the major pathways, and with retention of a full complement of endomembrane organelles, but with no evidence for functional complexity. Identification of the metabolic genes of Phytomonas provides opportunities for establishing in vitro culturing of these fastidious parasites and new tools for the control of agricultural plant disease.
Some plant trypanosomes, single-celled organisms living in phloem sap, are responsible for important palm diseases, inducing frequent expensive and toxic insecticide treatments against their insect vectors. Other trypanosomes multiply in latex tubes without detriment to their host. Despite the wide range of behaviors and impacts, these trypanosomes have been rather unceremoniously lumped into a single genus: Phytomonas. A battery of molecular probes has been used for their characterization but no clear phylogeny or classification has been established. We have sequenced the genomes of a pathogenic phloem-specific Phytomonas from a diseased South American coconut palm and a latex-specific isolate collected from an apparently healthy wild euphorb in the south of France. Upon comparison with each other and with human pathogenic trypanosomes, both Phytomonas revealed distinctive compact genomes, consisting essentially of single-copy genes, with the vast majority of genes shared by both isolates irrespective of their effect on the host. A strong cohort of enzymes in the sugar metabolism pathways was consistent with the nutritional environments found in plants. The genetic nuances may reveal the basis for the behavioral differences between these two unique plant parasites, and indicate the direction of our future studies in search of effective treatment of the crop disease parasites.
The structurally complex network of minicircles and maxicircles comprising the mitochondrial DNA of kinetoplastids mirrors the complexity of the RNA editing process that is required for faithful expression of encrypted maxicircle genes. Although a few of the guide RNAs that direct this editing process have been discovered on maxicircles, guide RNAs are mostly found on the minicircles. The nuclear and maxicircle genomes have been sequenced and assembled for Trypanosoma cruzi, the causative agent of Chagas disease; however, the complement of 1.4-kb minicircles, carrying four guide RNA genes per molecule in this parasite, has been less thoroughly characterised.
Fifty-four CL Brener and 53 Esmeraldo strain minicircle sequence reads were extracted from T. cruzi whole genome shotgun sequencing data. With these sequences and all published T. cruzi minicircle sequences, 108 unique guide RNAs from all known T. cruzi minicircle sequences and two guide RNAs from the CL Brener maxicircle were predicted using a local alignment algorithm and mapped onto predicted or experimentally determined sequences of edited maxicircle open reading frames. For half of the sequences no statistically significant guide RNA could be assigned. Likely positions of these unidentified gRNAs in T. cruzi minicircle sequences were estimated using a simple Hidden Markov Model. With the local alignment predictions as a standard, the HMM had an ~85% chance of correctly identifying at least 20 nucleotides of guide RNA from a given minicircle sequence. Inter-minicircle recombination was documented. Variable regions contain species-specific areas of distinct nucleotide preference. Two maxicircle guide RNA genes were found.
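The idea of mapping candidate guide RNAs onto edited mRNA by local alignment can be sketched as follows. This is a deliberately simplified, ungapped stand-in and not the algorithm used in the study, whose scoring scheme and significance test are not described here. It assumes that a guide pairs antiparallel to the edited mRNA and that G:U wobble pairs are allowed alongside Watson-Crick pairs; the function name and the toy sequences are hypothetical.

```python
# Allowed base pairs: Watson-Crick plus G:U wobble (an assumption of this sketch).
PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}


def longest_duplex(mrna, grna):
    """Find the longest ungapped antiparallel duplex between mrna and grna.

    Returns (length, mrna_start, grna_start) with 0-based start positions,
    both given in the 5'->3' coordinates of the respective sequence.
    """
    mrna = mrna.upper().replace("T", "U")
    grna = grna.upper().replace("T", "U")
    rev = grna[::-1]                      # reverse so pairing runs along diagonals
    best = (0, 0, 0)
    prev = [0] * (len(rev) + 1)
    for i in range(1, len(mrna) + 1):
        cur = [0] * (len(rev) + 1)
        for j in range(1, len(rev) + 1):
            if (mrna[i - 1], rev[j - 1]) in PAIRS:
                cur[j] = prev[j - 1] + 1  # extend the run on this diagonal
                if cur[j] > best[0]:
                    best = (cur[j], i - cur[j], len(rev) - j)
        prev = cur
    return best


# Toy example: the mRNA core AUGGUUA pairs with the guide core UAAUUAU,
# using two G:U wobble pairs; the A flanks do not pair.
print(longest_duplex("AAAUGGUUAAA", "AAUAAUUAUAA"))  # (7, 2, 2)
```

A realistic search would add gap handling, weighted scores for wobble versus canonical pairs, and a significance threshold; the dynamic-programming recurrence above only shows the core matching step.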
The identification of new minicircle sequences and the further characterization of all published minicircles are presented, including the first observation of recombination between minicircles. Extrapolation suggests a level of 4% recombinants in the population, supporting a relatively high recombination rate that may serve to minimize the persistence of gRNA pseudogenes. Characteristic nucleotide preferences observed within variable regions provide potential clues regarding the transcription and maturation of T. cruzi guide RNAs. Based on these preferences, a method of predicting T. cruzi guide RNAs using only primary minicircle sequence data was created.
The spliced leader (SL) RNA provides the 5' m7G cap and first 39 nt for all nuclear mRNAs in kinetoplastids. This small nuclear RNA is transcribed by RNA polymerase II from individual promoters. In Leishmania tarentolae the SL RNA genes reside in two multi-copy tandem arrays designated MINA and MINB. The transcript accumulation from the SL promoter on the drug-selected, episomal SL RNA gene cassette pX-tSL is ~10% that of the genomic array in uncloned L. tarentolae transfectants. This disparity is neither sequence- nor copy-number related, and thus may be due to interference of SL promoter function by epigenetic factors. To explore these possibilities we examined the nucleoplasmic localization of the SL RNA genes as well as their nucleosomal architecture.
The genomic SL RNA genes and the episome did not co-localize within the nucleus. Each genomic repeat contains one nucleosome regularly positioned within the non-transcribed intergenic region. The 363-bp MINA array was resistant to micrococcal nuclease digestion between the -258 and -72 positions relative to the transcription start point due to nucleosome association, leaving the promoter elements and the entire transcribed region exposed for protein interactions. A pattern of ~164-bp protected segments was observed, corresponding to the amount of DNA typically bound by a nucleosome. By contrast, nucleosomes on the pX-tSL episome were randomly distributed over the episomal SL cassette, reducing transcription factor access to the episomal promoter by approximately 74%. Cloning of the episome transfectants revealed a range of transcriptional activities, implicating a mechanism of epigenetic heredity.
The disorganized nucleosomes on the pX episome are in a permissive conformation for transcription of the SL RNA cassette approximately 25% of the time within a given parasite. Nucleosome interference is likely the major factor in the apparent transcriptional repression of the SL RNA gene cassette. Coupled with the requirement for run-around transcription that drives expression of the selectable drug marker, transcription of the episomal SL may be reduced even further due to sub-optimal nucleoplasmic localization and initiation complex disruption.
In kinetoplastids spliced leader (SL) RNA is trans-spliced onto the 5′ ends of all nuclear mRNAs, providing a universal exon with a unique cap. Mature SL contains an m7G cap, ribose 2′-O methylations on the first four nucleotides, and base methylations on nucleotides 1 and 4 (AACU). This structure is referred to as cap 4. Mutagenized SL RNAs that exhibit reduced cap 4 are trans-spliced, but these mRNAs do not associate with polysomes, suggesting a direct role in translation for cap 4, the primary SL sequence, or both. To separate SL RNA sequence alterations from cap 4 maturation, we have examined two ribose 2′-O-methyltransferases in Trypanosoma brucei. Both enzymes fall into the Rossmann fold class of methyltransferases and model into a conserved structure based on vaccinia virus homolog VP39. Knockdown of the methyltransferases individually or in combination did not affect growth rates and suggests a temporal placement in the cap 4 formation cascade: TbMT417 modifies A2 and is not required for subsequent steps; TbMT511 methylates C3, without which U4 methylations are reduced. Incomplete cap 4 maturation was reflected in substrate SL and mRNA populations. Recombinant methyltransferases bind to a methyl donor and show preference for m7G-capped RNAs in vitro. Both enzymes reside in the nucleoplasm. Based on the cap phenotype of substrate SL stranded in the cytosol, A2, C3, and U4 methylations are added after nuclear reimport of Sm protein-complexed substrate SL RNA. As mature cap 4 is dispensable for translation, cap 1 modifications and/or SL sequences are implicated in ribosomal interaction.
The mitochondrial DNA of kinetoplastid flagellates is distinctive in the eukaryotic world due to its massive size, complex form and large sequence content. Composed of catenated maxicircles that contain rRNA and protein-coding genes and thousands of heterogeneous minicircles encoding small guide RNAs, the kinetoplast network has evolved along with an extreme form of mRNA processing: uridine insertion and deletion RNA editing. Many maxicircle-encoded mRNAs cannot be translated without this post-transcriptional sequence modification.
We present the complete sequence and annotation of the Trypanosoma cruzi maxicircles for the CL Brener and Esmeraldo strains. Gene order is syntenic with Trypanosoma brucei and Leishmania tarentolae maxicircles. The non-coding components have strain-specific repetitive regions and a variable region that is unique for each strain with the exception of a conserved sequence element that may serve as an origin of replication, but shows no sequence identity with L. tarentolae or T. brucei. Alternative assemblies of the variable region demonstrate intra-strain heterogeneity of the maxicircle population. The extent of mRNA editing required for particular genes approximates that seen in T. brucei. Extensively edited genes were more divergent among the genera than non-edited and rRNA genes. Esmeraldo contains a unique 236-bp deletion that removes the 5'-ends of ND4 and CR4 and the intergenic region. Esmeraldo shows additional insertions and deletions outside of areas edited in other species in ND5, MURF1, and MURF2, while CL Brener has a distinct insertion in MURF2.
The CL Brener and Esmeraldo maxicircles represent two of three previously defined maxicircle clades and promise utility as taxonomic markers. Restoration of the disrupted reading frames might be accomplished by strain-specific RNA editing. Elements in the non-coding region may be important for replication, transcription, and anchoring of the maxicircle within the kinetoplast network.
In all trypanosomatids, trans splicing of the spliced leader (SL) RNA is a required step in the maturation of all nucleus-derived mRNAs. The SL RNA is transcribed with an oligo-U 3′ extension that is removed prior to trans splicing. Here we report the identification and characterization of a nonexosomal, 3′→5′ exonuclease required for SL RNA 3′-end formation in Trypanosoma brucei. We named this enzyme SNIP (for snRNA incomplete 3′ processing). The central 158-amino-acid domain of SNIP is related to the exonuclease III (ExoIII) domain of the 3′→5′ proofreading ɛ subunit of Escherichia coli DNA polymerase III holoenzyme. SNIP had a preference for oligo(U) 3′ extensions in vitro. RNA interference-mediated knockdown of SNIP resulted in a growth defect and correlated with the accumulation of one- to two- nucleotide 3′ extensions of SL RNA, U2 and U4 snRNAs, a five-nucleotide extension of 5S rRNA, and the destabilization of U3 snoRNA and U2 snRNA. SNIP-green fluorescent protein localized to the nucleoplasm, and substrate SL RNA derived from SNIP knockdown cells showed wild-type cap 4 modification, indicating that SNIP acts on SL RNA after cytosolic trafficking. Since the primary SL RNA transcript was not the accumulating species in SNIP knockdown cells, SL RNA 3′-end formation is a multistep process in which SNIP provides the ultimate 3′-end polishing. We speculate that SNIP is part of an organized nucleoplasmic machinery responsible for processing of SL RNA.
The Sm-binding site of the kinetoplastid spliced leader RNA has been implicated in accurate spliced leader RNA maturation and trans-splicing competence. In Trypanosoma brucei, RNA interference-mediated knockdown of SmD1 caused defects in spliced leader RNA maturation, displaying aberrant 3′-end formation, partial formation of cap 4, and overaccumulation in the cytoplasm; U28 pseudouridylation was unaffected.
The kinetoplastid protozoan spliced leader (SL) RNA is the common substrate pre-mRNA utilized in all trans-splicing reactions. Here we show by fluorescence in situ hybridization that the SL RNA is present in the cytoplasm of Leishmania tarentolae and Trypanosoma brucei. Treatment with the karyopherin-specific inhibitor leptomycin B was toxic to T. brucei and eliminated the cytoplasmic SL RNA, suggesting that cytoplasmic SL RNA was dependent on the nuclear exporter exportin 1 (XPO1). Ectopic expression of xpo1 with a C506S mutation in T. brucei conferred resistance to leptomycin B. A reduction in SL RNA 3′ extension removal and 5′ methylation of nucleotide U4 was observed in wild-type T. brucei treated with leptomycin B, suggesting that the cytoplasmic stage is necessary for SL RNA biogenesis. This study demonstrates spatial and mechanistic similarities between the posttranscriptional trafficking of the kinetoplastid protozoan SL RNA and the metazoan cis-spliceosomal small nuclear RNAs.
Addition of a 39-nucleotide (nt) spliced leader (SL) by trans splicing is a basic requirement for all trypanosome nuclear mRNAs. The SL RNA in Leishmania tarentolae is a 96-nt precursor transcript synthesized by a polymerase that resembles polymerase II most closely. To analyze SL RNA genesis, we mutated SL RNA intron structures and sequence elements: stem-loops II and III, the Sm-binding site, and the downstream T tract. Using an exon-tagged SL RNA gene, we examined the phenotypes produced by a second-site 10-bp linker scan mutagenic series and directed mutagenesis. Here we report that transcription is terminated by the T tract, which is common to the 3′ end of all kinetoplastid SL RNA genes, and that more than six T’s are required for efficient termination in vivo. We describe mutants whose SL RNAs end in the T tract or appear to lack efficient termination but can generate wild-type 3′ ends. Transcriptionally active nuclear extracts show staggered products in the T tract, directed by eight or more T’s. The in vivo and in vitro data suggest that SL RNA transcription termination is staggered in the T tract and is followed by nucleolytic processing to generate the mature 3′ end. We show that the Sm-binding site and stem-loop III structures are necessary for correct 3′-end formation. Thus, we have defined the transcription termination element for the SL RNA gene. The termination mechanism differs from that of vertebrate small nuclear RNA genes and the SL RNA homologue in Ascaris.
First characterized in Trypanosoma brucei, the spliced leader-associated (SLA) RNA gene locus has now been isolated from the kinetoplastids Leishmania tarentolae and Trypanosoma cruzi. In addition to the T. brucei SLA RNA, both L. tarentolae and T. cruzi SLA RNA repeat units also yield RNAs of 75 or 76 nucleotides (nt), 92 or 94 nt, and ∼450 or ∼350 nt, respectively, each with significant sequence identity to transcripts previously described from the T. brucei SLA RNA locus. Cell fractionation studies localize the three additional RNAs to the nucleolus; the presence of box C/D-like elements in two of the transcripts suggests that they are members of a class of small nucleolar RNAs (snoRNAs) that guide modification and cleavage of rRNAs. Candidate rRNA-snoRNA interactions can be found for one domain in each of the C/D element-containing RNAs. The putative target site for the 75/76-nt RNA is a highly conserved portion of the small subunit rRNA that contains 2′-O-ribose methylation at a conserved position (Gm1830) in L. tarentolae and in vertebrates. The 92/94-nt RNA has the potential to form base pairs near a conserved methylation site in the large subunit rRNA, which corresponds to position Gm4141 of small rRNA 2 in T. brucei. These data suggest that trypanosomatids do not obey the general 5-bp rule for snoRNA-mediated methylation.