More than half of all mammalian transcripts are produced from promoters with no known promoter elements – a crucial fact that went unappreciated until genome-wide studies were undertaken. Since kinetoplastid protozoa share conserved transcription factors with other eukaryotes, but lack complex transcriptional regulation, they may serve as a model for studying the mechanisms involved in non-conventional transcription initiation. On the other hand, if aspects of kinetoplastid transcription turn out to be distinct among eukaryotes, then these mechanisms could provide good targets for clinical therapy. The first whole-genome studies aimed at understanding the mechanisms of kinetoplastid transcription initiation are presented here.
The roles of TBP and SNAP50 in kinetoplastid transcription have been the subject of several studies, with apparently contradictory observations. This study confirms the expected role of TBP and SNAP50 in binding to the SL RNA promoter, as observed in several different kinetoplastids using a variety of experimental techniques [14, 24, 35, 46]. There is also strong evidence that TBP and SNAP50 are involved in transcription of snRNAs [23, 24]. However, in T. brucei, SNAP50 did not bind to an snRNA promoter under conditions where it efficiently bound the SL RNA gene promoter [25]. In the data presented here, TBP and SNAP50 bind to all snRNA promoter regions, and both proteins are found at all tRNA and 5S rRNA promoter regions as well. This contrasts with ChIP data from L. tarentolae, where only TBP was observed binding to those sites [24]. Without similar whole-genome studies in these other kinetoplastid model organisms, the apparent contradictions among the available data are difficult to resolve.
With regard to rRNA transcription, TBP knockdown by RNA interference in T. brucei had no effect on steady-state rRNA levels [26], and neither TBP nor SNAP50 bound above background to the rRNA promoter region [24], even though an element of the T. brucei rRNA promoter is reported to bind SNAP50 in vitro [47]. The data here show that both TBP and SNAP50 apparently bind within the rRNA gene coding sequence, but not at the promoter. One possible explanation is that this signal represents precipitation of rRNAs along with nascent TBP/SNAP50 polypeptides, although how this RNA would be labeled and why it was not also seen with H3 is not clear. Alternatively, the apparent pattern of binding may be merely an artifact caused by repetitive sequences, although this does not appear to be the case for the SL RNA locus. However, transcription of rRNAs is notably distinct from organism to organism, even among closely related crown-group eukaryotes, and species-specific effects may explain the apparent differences between T. brucei and L. major. With key components still being identified [31], the full story of RNA polymerase I transcription in kinetoplastids is clearly yet to be written.
K9/K14 acetylation of histone H3 is a marker for sites of active transcription initiation in other eukaryotic systems. Our observation that similar H3 acetylation is found at all divergent strand-switch regions, as well as a few other sites throughout the L. major genome, in a polarity consistent with the expected direction of transcription, suggests that this acetylation represents the first marker for sites of RNA polymerase II-mediated transcription initiation of protein-coding genes in kinetoplastids. This conclusion is bolstered by our finding that peaks of TBP binding were observed immediately upstream of the vast majority of acetylated regions. This suggests that histone acetylation is a marker for open chromatin, which makes specific regions of the genome available for binding of transcription initiation complexes, as observed in other organisms. Since TBP and SNAP50 signals were correlated genome-wide, and both were associated with H3 acetylation at the 5' end of polycistronic gene clusters, it seems likely that SNAP50 has a role as a general transcription factor in kinetoplastid transcription, an adaptation that would be unique among all model eukaryotes.
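As an illustration of what a genome-wide correlation between two ChIP signals means in practice, the following is a minimal sketch that computes a Pearson correlation coefficient between two signal tracks sampled at the same genomic positions. The function name and the input representation (equal-length lists of per-position signal values) are assumptions made for this example; the source does not state which correlation statistic or binning was used in this study.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation between two equal-length lists of signal values.

    Assumption: xs and ys hold ChIP signal (e.g. TBP and SNAP50) measured
    at the same genomic positions, in the same order.
    """
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two tracks of equal length with >= 2 points")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-deviations and sums of squared deviations from the means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant track")
    return cov / sqrt(var_x * var_y)
```

A coefficient near 1 would indicate that the two proteins tend to give high signal at the same positions; it says nothing by itself about whether they bind as a complex.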
The hypothesis that kinetoplastids regulate overall transcription rates according to cell density [43] led to the idea that the mechanism could involve changes in chromatin structure. If histone acetylation serves as a marker for transcriptional availability, then one should observe a decrease in acetylation levels in stationary cells compared with rapidly dividing ones. Our observation that acetyl-H3 peaks are considerably reduced in magnitude in stationary-stage cells supports the hypothesis that kinetoplastids do regulate overall transcription rates.
Aside from identifying the sites of transcription initiation, one of the main goals of this research is to identify DNA elements responsible for recruiting RNA polymerases to those sites. Several different methods have been used successfully in other organisms to identify regulatory elements from an enriched selection of sequences likely to contain similar elements. MEME and MAST analysis of sequences at sites of enhanced ChIP signal easily recovered known tRNA promoter elements, but no clear motifs were discovered for protein-coding gene transcription initiation sites. The only identifiable motif found in a majority of the isolated sequences was a G-tract (or C-tract) longer than 10 nucleotides. Similar elements were found within the transcription-enhancing 73-bp sequence derived from the L. major chr1 strand-switch region [4], which is conserved across Leishmania species [45]. However, because these elements are also ubiquitous in regions devoid of TBP and acetylated H3 peaks, it is unlikely that they are sufficient to direct transcription initiation. Furthermore, the breadth of the TBP and SNAP50 peaks is not indicative of typical promoter-directed initiation from a single initiation point. This is consistent with the finding that the chr1 strand-switch region contains several distinct transcription initiation sites in both directions [4].
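The G-tract (or C-tract) criterion described above can be expressed as a simple sequence scan. The sketch below is illustrative only and is not the MEME/MAST analysis used in the study: it assumes that a "tract" means an uninterrupted run of a single base, and that "longer than 10 nucleotides" means at least 11. The function name and the coordinate convention are hypothetical choices made for this example.

```python
import re

# Assumption: a tract is an uninterrupted homopolymer run of G (or of C)
# longer than 10 nt, i.e. 11 or more identical bases.
TRACT_PATTERN = re.compile(r"G{11,}|C{11,}", re.IGNORECASE)


def find_gc_tracts(sequence):
    """Return (start, end, base) for each G- or C-tract in the sequence.

    Coordinates are 0-based and end-exclusive, as in Python slicing.
    """
    return [
        (match.start(), match.end(), match.group()[0].upper())
        for match in TRACT_PATTERN.finditer(sequence)
    ]
```

Running such a scan over both ChIP-enriched and non-enriched regions is one way to see why the tracts alone are unlikely to be sufficient: the same pattern also matches in regions without TBP or acetylated H3 peaks.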
Previous bioinformatic analyses of several divergent strand-switch regions from L. major revealed an unusually high AT composition, a lack of putative hairpins and a strong curvature of the DNA [48]. Our data indicate that the peaks of acetylated H3 are associated with increased AT content ~1–2 kb downstream of the G/C-tracts described above. The mechanistic implications of this finding are not yet clear, although it is tempting to speculate that it may be associated with enhanced melting of the DNA strands during transcription initiation. Local bending of the divergent strand-switch regions could allow access for transcription initiation [49], and binding of proteins to DNA can drastically alter its shape in ways that increase curvature or facilitate secondary structures. Genome-wide mapping of predicted curvature makes clear that the divergent strand-switch regions do possess greater predicted curvature based on the dinucleotide stacking models used. However, other regions not associated with ChIP signal peaks also possess predicted curvature of similar magnitude. Thus, while DNA secondary structures and curvature may play a role in kinetoplastid transcription, the mechanisms by which this may happen elude current bioinformatic predictions. Even with ChIP data suggesting that a protein binds to a given region, the bending that the protein induces cannot be predicted without performing very involved in vitro bending assays or resolving the structure of the protein-bound DNA. Other secondary structures, such as triple helices, are difficult to predict bioinformatically at present.
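The association between acetylated H3 peaks and local AT content can be examined with a sliding-window composition profile. The following is a minimal sketch under stated assumptions: the window and step sizes are arbitrary illustrative defaults, not the values used in this study, and the function name is hypothetical. Only full-length windows are reported, and any base other than A or T (including N) counts as non-AT.

```python
def at_content_windows(sequence, window=500, step=250):
    """Return a list of (start, at_fraction) for each full window.

    start is the 0-based position of the window's first base;
    at_fraction is the fraction of A and T bases within that window.
    """
    if window <= 0 or step <= 0:
        raise ValueError("window and step must be positive")
    seq = sequence.upper()
    profile = []
    # Stop at the last position where a full-length window still fits.
    for start in range(0, len(seq) - window + 1, step):
        chunk = seq[start:start + window]
        at_count = chunk.count("A") + chunk.count("T")
        profile.append((start, at_count / window))
    return profile
```

Plotting such a profile alongside the ChIP signal would show whether AT-rich stretches coincide with the acetylated regions, but, as noted for curvature, composition alone does not predict where initiation occurs.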