Examination of the 5′ UTRs of all available protein-encoding genes from one of the earliest-diverging eukaryotes studied to date,
T. vaginalis (
8,
14), has revealed a highly conserved TCA+1Y(T/A) motif surrounding the start site of transcription (
32). The functional analysis of this motif, presented here, shows that this element is essential for transcription, is interchangeable between trichomonad genes, and can be replaced by a mammalian initiator (Inr) element. We demonstrate that this motif is a promoter element which is both structurally and functionally similar to metazoan Inrs (
3,
15,
27,
39,
40). Specific, conserved nucleotides comprising the core of this promoter element are shown to be necessary for accurate selection of the start of transcription of trichomonad genes, demonstrating its function as a bona fide Inr. These studies show that the Inr acts as a ubiquitous core promoter element in trichomonads.
It is remarkable that this early-diverging eukaryote appears not to use TATA elements to direct transcription initiation but instead invariably uses an Inr with strong similarity to metazoan Inrs. The structural and functional similarities between trichomonad and metazoan Inrs are particularly striking, since this is the first cognate promoter shown to be used by both protist and metazoan genes. There are, however, interesting differences between this protist Inr and its metazoan homologue. As revealed by our transcription analyses of mutant trichomonad Inrs, this Inr appears to have stricter sequence requirements than those observed for metazoan Inrs (
19,
22). This is also reflected in a stronger consensus sequence for trichomonad Inrs. It is noteworthy that the Inr is found in all
T. vaginalis genes, indicating that it is essential for transcription of all protein-encoding genes. This differs from the situation with metazoan Inrs, where only a small subset of genes has been shown to rely on the Inr for transcriptional activity (
39). The trichomonad Inr also differs from its metazoan counterpart in its close proximity to the ATG translation initiation codon (see Fig. ). The Inr is invariably located within 20 nucleotides of the ATG and may be as close as 6 nucleotides, resulting in unusually short 5′ UTRs for trichomonad mRNAs. This strict spatial conservation between the Inr and the ATG initiation codon likely reflects selection for efficient translation of the mRNAs; however, since nothing is known about the translational machinery of trichomonads, this remains speculative.
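As an illustration only, the consensus and spacing constraint described above can be expressed as a simple sequence scan. The following Python sketch is not part of the analyses reported here. It assumes that the motif reads T, C, A, Y, then T or A, with the A at +1 and Y denoting C or T, and that distance is counted from the +1 A to the A of the ATG. The function name `find_inr_candidates`, its parameters, and the example sequence are hypothetical.

```python
import re

# Assumed reading of the consensus: T C A(+1) Y (T/A), with Y = C or T.
# A lookahead is used so that overlapping candidates are not skipped.
INR_PATTERN = re.compile(r"(?=(TCA[CT][TA]))")


def find_inr_candidates(seq, atg_index, max_distance=20):
    """Return (a_index, distance, motif) for each motif match upstream of the ATG.

    a_index   0-based position of the putative +1 A within seq
    distance  atg_index - a_index (an assumed way of measuring spacing)
    motif     the matched 5-nucleotide sequence
    Only matches with 0 < distance <= max_distance are returned.
    """
    seq = seq.upper()
    hits = []
    for match in INR_PATTERN.finditer(seq):
        a_index = match.start(1) + 2  # third base of the motif is the +1 A
        distance = atg_index - a_index
        if 0 < distance <= max_distance:
            hits.append((a_index, distance, match.group(1)))
    return hits


# Hypothetical sequence, not taken from any T. vaginalis gene.
example = "GGGTTTCACTTTAAAGGATGGCC"
candidates = find_inr_candidates(example, atg_index=example.index("ATG"))
```

A scan of this kind only flags sequence matches; it says nothing about whether a candidate actually functions as an Inr, which requires the transcription analyses described in the text.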
Studies showing that transcription of protein-encoding genes in
T. vaginalis is insensitive to the fungal toxin α-amanitin, an inhibitor of RNA polymerase II, have raised the question of whether these genes are transcribed by this polymerase (
33). The fact that the homologous Inr in metazoa is an RNA polymerase II promoter (
39) indicates that RNA polymerase II transcribes these genes in
T. vaginalis as well. RNA polymerase II promoters in other protists appear not to use either metazoan-like Inrs or TATA boxes, with the possible exception of the apicomplexan genus
Toxoplasma (
5,
9,
21,
30,
38,
41). The recent observation that a sequence similar to that of the Inr described here is required for the transcription of a
Toxoplasma gene (
30) raises the possibility that the Inr plays a more general role in the transcription of protist genes. However, it should be noted that that study did not test whether this sequence element actually functions as an Inr. Since so little is currently known about core promoter elements in protists, future studies on genes from this diverse group of eukaryotes will be necessary to determine whether the Inr, or elements that are functionally homologous to the Inr, is a common, essential feature of protist promoters.
The presence of a highly conserved ubiquitous Inr element that is indispensable for transcription initiation of
T. vaginalis protein-encoding genes strongly indicates that the Inr arose early in eukaryotic evolution. Although this promoter element is not generally conserved in all eukaryotes, little divergence has occurred between trichomonad and metazoan Inrs, supporting a common origin. Nevertheless, we cannot rule out the possibility that the metazoan and trichomonad Inrs evolved independently of one another. It seems more likely, however, that the lack of conservation of the Inr in previously examined, early-evolving eukaryotes reflects either the divergence of these organisms or the limited number of sequence-specific protist promoters that have been identified. Indeed, conserved elements do surround and contain the transcription start site of genes in the entamoebid
Entamoeba histolytica (
38) and the diplomonad
Giardia lamblia (
13); however, these elements have no sequence similarity to each other or to trichomonad or metazoan Inrs. Finally, there is evidence of proteins that interact with metazoan Inrs (
31,
39), but sequence-specific, Inr-binding proteins have been difficult to identify. Purification of the trichomonad nuclear protein that specifically recognizes a functional, but not a nonfunctional, Inr should advance our understanding of transcription in eukaryotes, since this protein will likely be a homologue of metazoan Inr-binding proteins.