While considering the contribution of KLF1 expression to regulation of the Cldn11 gene in mouse testis, we encountered diffuse transcription initiation in exon 2 of the Klf1 gene. In previous analyses of erythroid tissue and a number of bone marrow-derived stem cell lines, the exon 1 TSS has been the only site identified [23]. However, transcripts of 0.8–0.9 kb in length were previously observed in testis by Anderson et al. [40]. In the current study, we have determined the intragenic TSSs of these RNAs in exon 2 and have characterized their temporal expression during postnatal testis development. In addition, we observe full-length Klf1 transcripts in testis as well as protein expression in Sertoli cells and spermatogonia.
Mouse transcriptome support of the exon 2 Klf1 TSSs is not limited to testis; FANTOM CAGE data (Supplemental Fig. S1 and Supplemental Table S1), which are high-quality, rigorously characterized data sets derived from direct sequencing, support intragenic TSSs in bone marrow, liver, cerebellum, and lung. Together with deep-sequencing and Northern blot evidence of intragenic initiation, this multitissue expression profile supports our contention that truncated Klf1 RNAs arise from specific, nonstochastic intragenic initiation. Furthermore, the absence of these RNA species in spleen total RNA (this study and Anderson et al. [40]) indicates that intragenic transcription from the Klf1 CGI is tissue restricted. Finally, our Southern blot analysis suggests that expression of these truncated, noncoding transcripts is not associated with developmentally regulated or tissue-specific hypermethylation of the overlapping CGI. Nevertheless, we have examined 50% of the CpGs in the CGI and cannot exclude the possibility that low-level or site-specific methylation may regulate the developmental expression of Klf1
Several lines of evidence presented here argue for intragenic transcription initiation of the Klf1 gene in testis. First, we demonstrate by Northern blot the developmentally regulated expression of a 0.8- to 0.9-kb transcript lacking exon 1 of the gene. Second, because our 5′ RACE reaction is based on cap trapping and 5′ cap addition to primary transcripts occurs cotranscriptionally for short genes (reviewed in Cowling [46]), it is likely that our cloned cDNAs arise from the TSSs of a bona fide TATA-less promoter rather than from RNA truncation artifacts. Third, in silico prediction of an evolutionarily conserved intragenic CGI overlying exon 2 of KLF1 in placental mammals from dog to human, as well as methylation-dependent cloning of this region from human blood cell genomic DNA [2], is suggestive of function. Fourth, in silico alignment to the Klf1 gene of EST clusters from multiple libraries, tissues, and species indicates that exon 2 TSS clusters are evolutionarily conserved, even though they do not generally contain ORFs of substantial length or homology to KLF1 (Supplemental Table S2).
Two recently published genome-wide analyses also lend support to the existence of exon 2 intragenic TSSs in the KLF1 gene from multiple species. In one study [47], a digital DNase hypersensitivity analysis of human ectoderm-, mesoderm-, and endoderm-derived ENCODE cell lines (UCSC Genome Browser track: Digital DNAse I Hypersensitivity, hg19) reveals an open chromatin conformation across the 3′ end of exon 2 in the KLF1 gene for most of the 74 cell lines tested in duplicate, which is consistent with transcriptional initiation of truncated transcripts. A fine-scale analysis of the DNase I hypersensitivity region supports its extension through intron 2 to the start of exon 3, coincident with the 5′ end (putative TSS) of the EST W57216.
In the second study of adult human forebrain chromatin [5], histone 3 lysine 4 trimethylation (H3K4Me3) status, which is a well-established marker of transcriptional activity at canonical promoters, is indicative of transcription initiation across the 3′ end of exon 2 of KLF1 (refer to Worksheet 2 of Supplemental File 1 from Maunakea et al. [5] at hg18 chr19:12857086–12858039). In addition, RNA-seq analysis in this tissue reveals a TSS centered at nucleotide 647 of the KLF1 mRNA (NM_006563.3), which corresponds to nucleotide 667 of mouse Klf1 (NM_010635.2), only 68 bp upstream of the most abundant 5′ RACE cluster. Together, these studies provide evidence that KLF1 exon 2 is characterized by a chromatin conformation that is consistent with transcriptional initiation and that truncated mRNAs from this gene arise in at least two tissues from different species.
It is unlikely that functional proteins could be generated from AUG-dependent initiator codons, particularly for those with homology to KLF1, although a possible exception is found in sheep. However, there is potential for noncanonical translation of a family of amino-terminally truncated KLF1 isoforms in multiple species from dog to human. Supplemental Figure S4 shows four conserved in-frame non-AUG-dependent initiator codons in exon 2, downstream of the TSSs identified in the current study (B), that could initiate translation. With regard to our 5′ RACE analysis, the major TSS at nucleotide 735 would lead to translation initiation at nucleotide 813, and the resulting protein isoform would be 13.4 kD and would include the nuclear localization signals at amino acids 275–296/293–376 [48] as well as the full-length zinc finger domain for binding to its CCN CNC CCN target sites in DNA.
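The reasoning in the preceding paragraph (locate in-frame non-AUG initiator codons downstream of a TSS, then estimate the size of the resulting product) can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the procedure used in the study: the set of near-cognate codons, the function names, the toy sequence, and the 110-Da average residue mass (a common rule of thumb) are all hypothetical choices made for the example. Positions are 1-based, as in the text.

```python
# Illustrative sketch only; names, codon set, and sequence are assumptions.

# Near-cognate codons: single-nucleotide variants of AUG (assumed candidate set).
NEAR_COGNATE = {"CUG", "GUG", "UUG", "ACG", "AUC", "AUU", "AUA", "AAG", "AGG"}
STOP_CODONS = {"UAA", "UAG", "UGA"}
AVG_RESIDUE_DA = 110.0  # rule-of-thumb average mass of one amino acid residue


def to_rna(seq):
    """Upper-case the sequence and convert cDNA-style T to U."""
    return seq.upper().replace("T", "U")


def in_frame_starts(mrna, tss, frame_anchor, codons=NEAR_COGNATE):
    """Return 1-based positions of candidate initiator codons at or after
    `tss` that share the reading frame of `frame_anchor` (the 1-based
    position of the first base of any codon in the reference frame)."""
    rna = to_rna(mrna)
    hits = []
    for pos in range(tss, len(rna) - 1):  # pos is 1-based; codon must fit
        if (pos - frame_anchor) % 3 == 0 and rna[pos - 1:pos + 2] in codons:
            hits.append(pos)
    return hits


def orf_length_codons(mrna, start):
    """Count sense codons from the 1-based `start` up to (not including)
    the first in-frame stop codon or the end of the sequence."""
    rna = to_rna(mrna)
    n = 0
    for i in range(start - 1, len(rna) - 2, 3):
        if rna[i:i + 3] in STOP_CODONS:
            break
        n += 1
    return n


def approx_mass_kda(n_residues):
    """Rough protein mass in kD from the residue count."""
    return n_residues * AVG_RESIDUE_DA / 1000.0


if __name__ == "__main__":
    toy = "AUGGCCCUGAAAGUGUUUUAA"  # hypothetical sequence, not Klf1
    for pos in in_frame_starts(toy, tss=5, frame_anchor=1):
        n = orf_length_codons(toy, pos)
        print(pos, n, approx_mass_kda(n))
```

With a real transcript, `frame_anchor` would be the position of the annotated start codon and `tss` the mapped initiation site. Under the same rule of thumb, a product of about 13.4 kD corresponds to roughly 120 residues.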
FIG. 5. 5′ RACE and public transcriptome data support intragenic transcription initiation in exon 2 of the mouse Klf1 gene. A) Relative abundance of intragenic TSSs from all sources identified in the current study across the ~800 bp of exon 2 ...
In light of functional studies to characterize the KLF1 transactivation domain (reviewed in Siatecka and Bieker [25]), the properties of the short isoforms of KLF1 could differ significantly from those of the full-length protein in several respects. First, short isoforms would have a longer half-life in the cell because they lack two amino-terminal PEST sequences [49]. Second, they would be comparatively weak transcriptional activators because they include a critical acetylation site at lysine 288 for interacting with SWI/SNF-related complexes but lack a regulatory phosphorylation site at threonine 41. Finally, the truncated proteins would be relatively weak repressors because they include the acetylation site at lysine 302 for binding to Sin3A/HDAC1 but lack the sumoylation site at lysine 74 that can bind to the NuRD inhibitor complex. Thus, because KLF1 probably interacts with its target binding sites on DNA as a monomer [24], the greater stability of the short isoforms could enable them to competitively inhibit the recruitment of full-length KLF1 to transcriptional complexes and, thereby, influence gene expression.
Despite the potential for short non-AUG-dependent KLF1s in testis, we cannot detect these proteins on Western blots from testis or spleen. Three monoclonal antibodies have been raised against full-length KLF1, and their approximate binding sites have been mapped by Western blot of truncated recombinant proteins from bacteria [50]. Only the 6B3 antibody recognizes KLF1 from testis homogenates (data not shown), and its epitope lies within the amino-terminal 60 amino acids of KLF1. Accordingly, this antibody recognizes the full-length protein but not theoretical truncated isoforms. The commercial carboxyl-terminal antibody (LS) used in this study also recognizes the full-length protein but does not reveal any proteins approximating 13 kD.
An aspect of our study of potential interest is the developmental activation of Klf1 exon 2 TSS clusters from P20 to adulthood, which is concomitant with increasing abundance of round and condensed spermatids. Germ line cells make up 75% of cells in the seminiferous epithelium at the beginning of this developmental period [52] and rise to 95% by P45 and beyond [53], with relatively transcriptionally inactive meiotic spermatocytes accounting for the vast majority. The temporal correlation between truncated Klf1 RNA induction and spermatocyte dominance of the seminiferous epithelium during development suggests germ cell-based expression of these RNAs. The levels of full-length Klf1 transcript increase up to P20, when somatic cells are abundant, and subsequently remain relatively constant in similar fashion to the proportion of somatic cells, which falls below 5% by P35. Indeed, we are able to confirm Sertoli cell expression of canonical KLF1 using immunocytochemistry. Together, these data are consistent with expression of full-length Klf1 mRNA by Sertoli and germ cells and truncated Klf1 transcripts by differentiating germ line cells, although we have not directly demonstrated that spermatids express the 0.8-kb Klf1
FIG. 7. Developmental changes in relative abundance of different cell types in testis. The relative proportions of each major cell type in mouse testis are plotted as a function of age between P6 and P84 (y-axis, left) to illustrate the changing composition of ...
In light of our data and taking into account the lack of EST evidence for antisense transcripts in the Klf1 gene (antisense CAGE tags in the Supplemental data are of very low abundance and likely background), it is tempting to speculate that TSS choice and evolutionarily conserved transcription of spliced ORF-less RNAs from exon 2 initiation sites may repress canonical Klf1 expression by diverting transcriptional machinery away from the core promoter at exon 1 through a mechanism known as squelching [54]. Alternatively, recent work on promoter-associated short RNAs [55] shows that posttranscriptional processing may generate intermediates with 5′ caps that are indistinguishable from mature functional mRNAs. However, consideration of the ENCODE DNase I hypersensitivity data and the H3K4Me3 deep-sequencing profiles provides two independent lines of evidence that exon 2 of human KLF1 has an epigenetic configuration with multiple properties of a TSS [5]. This chromatin state strongly suggests that the truncated Klf1 transcripts initiating in this region from multiple species are genuine transcription initiation events rather than products of posttranscriptional cleavage and recapping.