Variation in chromatin composition and organization often reflects differences in genome function. Histone variants, for example, replace canonical histones to contribute to regulation of numerous nuclear processes including transcription, DNA repair and chromosome segregation. Here we focus on H2A.Bbd, a rapidly evolving variant found in mammals but not in invertebrates. We report that in human cells, nucleosomes bearing H2A.Bbd form unconventional chromatin structures enriched within actively transcribed genes and characterized by shorter DNA protection and nucleosome spacing. Analysis of transcriptional profiles from cells depleted for H2A.Bbd demonstrated widespread changes in gene expression with a net down-regulation of transcription and disruption of normal mRNA splicing patterns. In particular, we observed changes in exon inclusion rates and increased presence of intronic sequences in mRNA products upon H2A.Bbd depletion. Taken together, our results indicate that H2A.Bbd is involved in formation of a specific chromatin structure that facilitates both transcription and initial mRNA processing.
The dynamic packaging of DNA into chromatin regulates gene expression by altering the accessibility of DNA to the transcriptional machinery. The basic building block of chromatin is the nucleosome, consisting of about 150 base pairs (bp) of DNA wrapped around a core of eight histones of four different types: H2A, H2B, H3 and H4 (Luger et al., 1997). Histones are subject to modifications including addition and removal of covalent marks and replacement of ‘major’ histones with non-allelic variants. The exchange of an entire histone with a variant represents a drastic alteration in local chromatin organization and as such is expected to be significant to chromatin biology. Replacement histones are typically expressed throughout the cell cycle (Wu and Bonner, 1981) and play a role in various processes by affecting chromatin structure and dynamics during G1 and G2 beyond bulk packaging of the genome. These processes include chromosome segregation, DNA repair, transcriptional regulation and mRNA processing (Talbert and Henikoff, 2010). However, the biological pathways and functions of histone variants are not fully understood.
In this study, we focus on a histone H2A family member, the mammalian-specific variant H2A.Bbd (Barr Body Deficient). This is one of the most rapidly evolving variants, which shares approximately 50% sequence identity with the major H2A histone (Chadwick and Willard, 2001; Gonzalez-Romero et al., 2008; Malik and Henikoff, 2003). The histone H2A family includes a number of replacement variants shown to have distinct biophysical properties and biological functions. For instance, H2A.Z is associated with transcriptional regulation and is enriched in nucleosomes positioned near transcriptional start sites (TSS) (Barski et al., 2007; John et al., 2008). The macroH2A variant is associated with the inactivated X chromosome in female mammals, indicating a possible relationship with gene silencing. A recent study showed depletion of this variant at the TSS of active genes (Buschbeck et al., 2009; Gamble et al., 2010). In contrast to macroH2A, H2A.Bbd is excluded from the inactive X chromosome and is enriched in active regions of the genome (Chadwick and Willard, 2001). The genes encoding H2A.Bbd are expressed in a variety of tissues and cell types in both mouse and human, suggesting a widespread role in chromatin regulation (Gonzalez-Romero et al., 2008).
Biochemical analyses using recombinant H2A.Bbd support the association of this variant with active transcription. H2A.Bbd nucleosomes were reported to be less stable than those bearing the major H2A histone and to organize only ~120 bp of DNA, indicating that DNA in H2A.Bbd chromatin may be more accessible and, thus, more amenable to transcription (Bao et al., 2004). An increased transcription rate was demonstrated in vitro on arrays of H2A.Bbd nucleosomes as compared to arrays containing nucleosomes with the major H2A histone (Zhou et al., 2007). Furthermore, the H2A.Bbd nucleosomes exhibit faster exchange rates in vivo, a hallmark of active transcription (Gautier et al., 2004).
To further study the role of H2A.Bbd in vivo, we profiled the genome-wide localization of nucleosomes containing H2A.Bbd, macroH2A1.2, H2A.Z, and major H2A using high-throughput sequencing. We report that H2A.Bbd is associated with altered chromatin organization, is significantly enriched in actively transcribed genes, and is associated with the elongating form of RNA polymerase. Analysis of the protein composition of H2A.Bbd-enriched chromatin and the effects of H2A.Bbd depletion on gene expression indicates that H2A.Bbd is involved with mRNA splicing. Therefore, our data imply that H2A.Bbd may link transcription and subsequent mRNA processing, revealing a role for nucleosomes in integrating these two critical gene expression pathways.
To characterize the function of nucleosomes containing H2A.Bbd, we compared the properties of variants from the H2A family using HeLa cells stably expressing FLAG-tagged versions of H2A.Bbd, macroH2A1.2, H2A.Z and the major H2A histone. Each recombinant histone variant was found predominantly in the chromatin fraction (H2A - 99%, H2A.Z - 99%, H2A.Bbd - 84%, macroH2A - 99%), suggesting that each recombinant protein was efficiently incorporated. The recombinant histones H2A, H2A.Z, and macroH2A were expressed at levels lower than the endogenous histone but sufficiently high to constitute a significant fraction of the total histone content (17%, 33%, and 44% respectively). The level of the recombinant H2A.Bbd protein relative to endogenous H2A.Bbd could not be calculated since the amount of endogenous H2A.Bbd protein is unknown (see Experimental Procedures). Western analysis using FLAG antisera showed that recombinant H2A.Bbd is 4-fold less abundant than recombinant H2A.Z and 12-fold less abundant than endogenous H2A.Z (data not shown). Nuclei were MNase-digested, and chromatin was enriched for each variant by immuno-precipitation. Purified DNA was subjected to high-throughput sequencing (ChIP-seq) using the Illumina platform, resulting in at least 10 million uniquely mappable tags for each variant (Table S1).
We began characterization of the H2A.Bbd in vivo by determining whether the unusual biochemical properties of this variant (e.g., length of DNA protected from nuclease digestion by the histone octamer) were conserved in cultured cells. Structural features of chromatin containing this variant can be inferred from deep sequencing data (Figure 1A) (Tolstorukov et al., 2009). Correlation analysis of the tag frequencies on the positive and negative DNA strands reveals that the H2A.Bbd nucleosomes protect an average of 124 bp of DNA (Figure 1B), while other forms of histone H2A analyzed in this study show protection lengths close to 147 bp as expected for a canonical nucleosome (Figure S1A). This observation was corroborated by capillary gel electrophoresis analysis of the DNA fragments submitted for sequencing (Figure 1C, Figure S1B). Thus, these data are in agreement with previous reports based on in vitro-assembled H2A.Bbd nucleosomes (Bao et al., 2004).
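The strand-correlation idea can be illustrated with a minimal sketch. It assumes that positive-strand tags mark the left edge of a protected fragment and negative-strand tags mark its right edge, in inclusive coordinates. The function name, the search window, and the toy tag positions are hypothetical, and the score is a raw co-occurrence count rather than the normalized correlation used in the published analysis.

```python
from collections import Counter

def protected_length(plus_tags, minus_tags, min_len=50, max_len=250):
    """Return the fragment length whose strand shift maximizes the
    co-occurrence of positive- and negative-strand tag positions."""
    plus = Counter(plus_tags)
    minus = Counter(minus_tags)
    best_len, best_score = None, -1
    for length in range(min_len, max_len + 1):
        shift = length - 1  # inclusive coordinates: right edge = left edge + length - 1
        score = sum(n * minus.get(pos + shift, 0) for pos, n in plus.items())
        if score > best_score:
            best_len, best_score = length, score
    return best_len

# Toy data (hypothetical): five fragments of 124 bp each.
starts = [1000, 1137, 1274, 5000, 5137]
plus_tags = starts
minus_tags = [s + 124 - 1 for s in starts]
estimate = protected_length(plus_tags, minus_tags)
print(estimate)
```

On real data the score would be computed over millions of tags and normalized, and the peak would be broader than in this idealized example.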
We analyzed tag distributions on the same strands to estimate the average nucleosome repeat length, which represents the sum of the nucleosome-associated DNA and the intervening linker DNA. Thus, this analysis addresses the issue of whether the shortened amount of DNA associated with the H2A.Bbd nucleosomes is also reflected in H2A.Bbd nucleosome spacing in cells. We found that the average repeat length is 137 bp for H2A.Bbd nucleosomes, substantially shorter than the 170–180 bp reported for other nucleosomes in human chromatin and other variants analyzed in our study (Lohr et al., 1977) (Figure 1D, Figure S1C). This finding agrees with a previous study in which nucleosomes were assembled on a template in vitro (Bao et al., 2004). Assuming that H2A.Bbd nucleosomes are arranged adjacently, this indicates that the linker length in H2A.Bbd chromatin is 13 bp (137–124) rather than the canonical 25–30 bp. We note that canonical nucleosomes, which organize 147 bp of DNA (Luger et al., 1997), could not be arranged in arrays with the repeat lengths of 137 bp observed for H2A.Bbd due to likely steric clashes between adjacent nucleosomes. The shorter protection observed for the H2A.Bbd nucleosomes has been thought to reflect dynamic breathing of the DNA arms off the histone surface (Zhou et al., 2007). However, the shortened repeat length observed for H2A.Bbd nucleosomes argues for stable changes in the nucleosome particle upon H2A.Bbd incorporation (Doyen et al., 2006), as only nucleosomes with altered structural organization can be arranged with such short repeats.
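The same-strand estimate and the linker arithmetic can be sketched as follows. This is a simplified illustration, not the published procedure: it takes the most frequent distance between same-strand tags within a search window as the repeat length, and the function name, window bounds, and toy positions are assumptions.

```python
from collections import Counter
from itertools import combinations

def repeat_length(same_strand_tags, min_dist=100, max_dist=250):
    """Return the most frequent pairwise distance between same-strand tags
    inside [min_dist, max_dist], or None if no pair falls in the window."""
    dists = Counter()
    for a, b in combinations(sorted(same_strand_tags), 2):
        d = b - a
        if min_dist <= d <= max_dist:
            dists[d] += 1
    return dists.most_common(1)[0][0] if dists else None

# Toy data (hypothetical): two short arrays of regularly spaced nucleosomes.
tags = [1000, 1137, 1274, 5000, 5137, 5274]
repeat = repeat_length(tags)

# Linker length = repeat length minus the protected length (124 bp in the text).
linker = repeat - 124
print(repeat, linker)
```

The quadratic pairwise loop is only suitable for toy inputs; a genome-scale analysis would bin distances per chromosome or use an autocorrelation of the tag density.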
Deviation of the H2A.Bbd variant from major H2A histone in both primary sequence and nucleosome arrangement suggests that H2A.Bbd nucleosomes may serve a distinct function in the genome and are unlikely to be distributed randomly. We therefore analyzed the ChIP-seq data to investigate and compare genomic distribution of H2A.Bbd and other members of the H2A family (Figure 2). In agreement with previous reports, H2A.Z nucleosomes showed localized enrichment at TSS (Barski et al., 2007) and macroH2A occupied extended regions that often encompassed multiple genes (Gamble et al., 2010). In contrast, H2A.Bbd was enriched within individual genes and the regions of its enrichment often corresponded to the regions of macroH2A depletion. Overall, 63% of H2A.Bbd enrichment clusters overlap with genes and 36% of genes are associated with at least one H2A.Bbd locus, in comparison to 25% and 14% respectively for macroH2A (see Experimental Procedures), indicating a possible role for H2A.Bbd within genes.
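An overlap statistic of this kind can be computed as in the sketch below. It assumes half-open intervals on a single chromosome and uses hypothetical coordinates; the actual cluster-calling and gene annotation are described in the Experimental Procedures and are not reproduced here.

```python
def overlaps(a, b):
    """True if half-open intervals a = (start, end) and b = (start, end) share a base."""
    return a[0] < b[1] and b[0] < a[1]

def fraction_overlapping(queries, targets):
    """Fraction of query intervals that overlap at least one target interval."""
    if not queries:
        return 0.0
    hits = sum(any(overlaps(q, t) for t in targets) for q in queries)
    return hits / len(queries)

# Toy data (hypothetical coordinates on one chromosome).
clusters = [(100, 200), (500, 600), (900, 950), (2000, 2100)]
genes = [(150, 400), (550, 1000)]

clusters_in_genes = fraction_overlapping(clusters, genes)    # share of clusters overlapping a gene
genes_with_cluster = fraction_overlapping(genes, clusters)   # share of genes with at least one cluster
print(clusters_in_genes, genes_with_cluster)
```

Swapping the two arguments gives the two percentages quoted in the text: clusters that overlap genes, and genes associated with at least one cluster.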
To characterize the localization in transcription units further, we calculated the average profile of each variant around the TSS and transcription termination sites (TTS) (Figure 3A–D). In this analysis, the genes were divided into five groups (quantiles) by expression levels and the profiles were normalized by the tag densities in bulk chromatin. H2A.Bbd was enriched in bodies of expressed genes, with strong enrichment throughout the entire gene and clear correlation with the expression level. H2A.Bbd was depleted in the immediate vicinity of TSS, with steep increase to the body of the genes, especially for highly expressed genes (Figure 3A). In contrast, macroH2A was depleted throughout expressed genes but showed elevated enrichment levels over both promoters and transcribed regions of silent genes (Figure 3B). In accordance with published results, H2A.Z was enriched while H2A was depleted at the start of active genes. Association of the H2A variants with gene expression status was statistically significant (Figure S2A–C).
In principle, the average profile may be a composite of multiple classes of genes, each containing distinct enrichment patterns across the transcribed regions. To rule out this possibility, we grouped the genes using K-means clustering after averaging the profiles over the TSS-proximal, gene body, and the TTS-proximal regions. We then computed the average expression for each resulting cluster (Figure S2D–F). The result of this analysis was consistent with the data presented in Figure 3, showing for example that the clusters characterized by high H2A.Bbd enrichment within gene bodies are comprised of expressed genes.
Since we used cells expressing the recombinant H2A.Bbd in addition to endogenous histone, we wondered if H2A.Bbd enrichment in transcribed genes could be explained merely by nonspecific association with regions of high nucleosome turnover. However, we observed depletion rather than enrichment of H2A.Bbd at DNase I hypersensitive sites (Figure S2G) and TSS (Figure 3), which are often associated with high nucleosome turnover rates (Deal et al., 2010; Mito et al., 2007). Thus, our observation of H2A.Bbd enrichment in transcribed regions is unlikely to be a simple consequence of non-specific association of H2A.Bbd with ‘open’ chromatin regions.
The enrichment of H2A.Bbd within active genes may have resulted from strong association of H2A.Bbd with a group of active genes involved in specific biological processes or from ubiquitous enrichment of H2A.Bbd in all active genes. To distinguish between these alternatives, we performed gene ontology (GO) analysis for the top 10% of the H2A.Bbd enriched genes (Table 1). This analysis showed that genes involved in intracellular metabolic processes, including protein and RNA metabolism, are strongly over-represented among the H2A.Bbd-enriched genes. This is in contrast to the results for macroH2A-enriched genes, which showed enrichment for intercellular processes including development (Gamble et al., 2010). To distinguish H2A.Bbd-specific GO categories from those enriched in highly expressed genes, we repeated the analysis using only the top 40% of active genes as our background reference set, and we identified similar GO categories (Table S2). We conclude that H2A.Bbd enrichment is a characteristic of genes that are involved in a broad range of cellular functions but is not pronounced in all active genes.
We expected that H2A variant-containing chromatin would be enriched for covalent marks appropriate to the regulatory characteristics of genes associated with the particular variant (Barski et al., 2007). To investigate whether the expected covalent marks on histone H3 colocalize with H2A variant chromatin, we analyzed bulk-purified H2A variant chromatin by western blot (Figure 4A). Marks correlated with gene silencing were depleted from H2A.Bbd chromatin (see results for H3K27me3 and H3K79me3 marks) while active marks were enriched (cf. results for H3K4me2 and H3K4me3 marks). A covalent modification enriched in gene bodies without preference for particular regions of gene bodies (H3K27me1) was also slightly enriched with H2A.Bbd. Overall, the abundance of H3 modifications on bulk H2A.Bbd chromatin is in agreement with the localization observed previously (see Figure 3).
To further explore the possible involvement of H2A.Bbd in transcriptional regulation, we probed equal amounts of purified chromatin enriched in each of the H2A variants with an antibody against the N-terminus of RNA Pol II by western blot (Figure 4B). As expected, relative to H2A, both H2A.Z and H2A.Bbd chromatin are enriched for total RNA Pol II. The Ser2 phosphorylated form of RNA Pol II, however, is only enriched with H2A.Bbd chromatin. The observed specificity of the elongating RNA Pol II to H2A.Bbd chromatin is consistent with the distinct distribution patterns of H2A variants along genes shown in Figure 3. Thus, our results indicate that H2A.Bbd is a reliable marker of expressed genes and is associated with other chromatin features of transcribed genes.
The genomic distribution and association of H2A.Bbd with elongating RNA Pol II led us to consider that H2A.Bbd might be involved in the regulation of transcription or processes related to transcriptional elongation. To investigate how deficiency in H2A.Bbd can affect gene expression, we depleted H2A.Bbd using a short-hairpin RNA (shRNA) construct and used high-throughput sequencing of nuclear RNA (RNA-seq) to characterize changes in transcript abundance. To rule out possible artifacts due to over-expression of H2A.Bbd in the transgenic cell line, we used such cells only to check efficiency of the shRNA constructs, while the actual RNA-seq experiment was performed in non-transgenic cells. Transcription profiles for cells treated with shRNA constructs targeting the H2A.Bbd transcript were compared with those obtained using a control shRNA with no homology to the human genome.
Depletion of H2A.Bbd results in substantial changes in gene expression levels, with genes being both up- and down-regulated (Figure 4C). With two biological replicates in each experiment, we identified ~1,200 differentially expressed genes (false discovery rate of 0.05 and greater than two-fold change for a conservative estimate; see Experimental Procedures), with more genes down-regulated than up-regulated (716 and 522 genes respectively, P = 6 × 10^-9).
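One way to ask whether such an imbalance between down- and up-regulated genes could arise by chance is an exact binomial test against an equal split. The text does not state which test produced the reported P-value, so the sketch below is an assumption and is not expected to reproduce that number exactly; the function name is hypothetical.

```python
from math import comb

def binomial_two_sided_p(k, n):
    """Exact two-sided P-value for observing k of n outcomes in one class
    when both classes are equally likely (null probability 0.5)."""
    k = max(k, n - k)  # the null is symmetric, so fold onto the larger tail
    upper_tail = sum(comb(n, i) for i in range(k, n + 1))
    return min(1.0, 2 * upper_tail / 2 ** n)

down, up = 716, 522  # gene counts quoted in the text
p_value = binomial_two_sided_p(down, down + up)
print(p_value)
```

Python's arbitrary-precision integers keep the tail sum exact, and the final division returns an ordinary float.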
To evaluate if H2A.Bbd nucleosomes directly regulate transcription, we examined how the amount of H2A.Bbd in a gene is correlated with the extent of change in gene expression following H2A.Bbd depletion. The expression fold-change is mildly correlated with the enrichment of the gene in H2A.Bbd (Pearson’s correlation coefficient, R = 0.2, Figure S3). Also, the range of the expression fold-change is considerably smaller for the genes with lower levels of H2A.Bbd enrichment. These findings, combined with the association of H2A.Bbd with the marks of active transcription (Figures 4A,B), suggest the importance of this variant for positive regulation of transcription. We note, however, that the skew in the number of affected genes towards down-regulation is only modest (less than two-fold), which can be indicative of a complex regulatory role of this variant. We therefore considered other regulatory mechanisms that may involve H2A.Bbd.
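For reference, Pearson’s correlation coefficient between per-gene enrichment and expression fold-change can be computed as in this minimal sketch. The function name and the toy values are hypothetical and only illustrate the formula; they are not data from this study.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson's correlation coefficient for two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

# Toy data (hypothetical): enrichment score and absolute log2 fold-change per gene.
enrichment = [0.5, 1.0, 1.5, 2.0, 2.5]
fold_change = [0.2, 0.1, 0.6, 0.3, 0.7]
r = pearson_r(enrichment, fold_change)
print(round(r, 2))
```

A coefficient near 0.2, as reported in the text, indicates a weak positive linear relationship.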
To further investigate possible functions of H2A.Bbd, we undertook a proteomic approach to determine the factors that are associated with H2A.Bbd chromatin using mass spectrometry. We isolated chromatin enriched for H2A.Bbd using the epitope-tagged H2A.Bbd cell line and compared it to similarly purified macroH2A-enriched chromatin (Figure 5A). The mass spectrometry analysis revealed that 57 proteins were enriched only with H2A.Bbd chromatin and additional 9 proteins were at least two-fold enriched over macroH2A (Table S3). Some proteins were also enriched with macroH2A (data not shown), most notably the previously described PARP1, which was present in roughly stoichiometric amounts with the tagged macroH2A (Figure 5A) (Ouararhni et al., 2006). The majority of proteins specific to H2A.Bbd chromatin were RNA processing factors including 15 proteins identified as components of the spliceosome, leading us to the surprising hypothesis that H2A.Bbd regulates interactions with RNA splicing machinery. In particular, five of the seven members of the Sf3b complex required for branch point recognition and U2 snRNP assembly were found specifically enriched in the H2A.Bbd sample (Corrionero et al., 2011; Folco et al., 2011; Gozani et al., 1996). These data indicate that H2A.Bbd chromatin associates with proteins important for assembly and function of the spliceosome.
The results obtained with mass spectrometry were confirmed by western blot comparing similar amounts of purified chromatin containing H2A variants. To ensure that Sf3b association with H2A.Bbd is not a common feature of chromatin present in active gene bodies, the histone variant H3.3 was included in this analysis since H3.3 is also found inside active genes (Jin et al., 2009; Schwartz and Ahmad, 2005). Again, proteins from the Sf3b complex as well as other spliceosome factors were found enriched with H2A.Bbd chromatin (Figure 5B). Enzymes involved in incorporation of H2A.Z and H3.3 (EP400 and DAXX) were found enriched with chromatin only from the purified variant, validating the purification of H2A.Z and H3.3 chromatin. Since MNase used to solubilize chromatin prior to purification is a potent single-stranded nuclease, little or no RNA remained in the purification, indicating that the specific interaction of Sf3b with H2A.Bbd chromatin likely occurs through nucleosomes and not RNA (Figure S4A). We conclude that H2A.Bbd nucleosomes associate with proteins linked to the spliceosome, particularly with the Sf3b subcomplex required for assembly of the U2 snRNP.
Furthermore, analysis of the H2A.Bbd density around splice sites reveals a distinct pattern in nucleosome positioning around the intron-exon junction, which extends to the branch point recognized by U2 snRNP (Figure 5C). Comparison of the H2A.Bbd and H2A.Z densities shows that this pattern is specific to H2A.Bbd nucleosomes. In particular, the peak in the tag density located on the intron side of the junction is pronounced for H2A.Bbd and not for H2A.Z. Interestingly, the distance between two stable H2A.Bbd nucleosome positions flanking the junction is about 135 bp, which is less than the DNA protection length in canonical nucleosomes but is consistent with the size of DNA fragments associated with H2A.Bbd nucleosomes. This observation suggests that the specific structural properties of H2A.Bbd nucleosomes, described above, can be operational for splice site definition.
To establish the extent of the association between H2A.Bbd and splicing, we compared the enrichment of H2A.Bbd in single- and multiple-exon genes (Figure 5D–E). We hypothesized that if H2A.Bbd were generally important to RNA splicing, multiple-exon genes (average length: ~50 Kb) would be more enriched with H2A.Bbd since single-exon genes (average length: ~2 Kb) are not likely to require splicing. As expected, we observed significant enrichment along the entire length of multiple-exon genes, in contrast to non-uniform and weak enrichment in single-exon genes. As another control, we used a set of randomly selected genes with the same number of genes as in the single-exon group (Figure S4B), which confirmed that the difference between the single- and multi-exon groups is not a function of group size. We also confirmed that the fraction of single-exon genes in a group decreases with the increase in H2A.Bbd density (Figure S4C). Based on these unexpected observations, we hypothesized that the H2A.Bbd variant is associated with mRNA processing. Alterations in splicing efficiency are also likely to lead to changes in steady state amounts of affected RNAs, consistent with the conclusion of the expression analysis described above.
To investigate the role of histone H2A.Bbd in regulating RNA processing, we further characterized the RNA-seq data from H2A.Bbd-depleted cells. As in the analysis of the transcription effects of tagged H2A.Bbd, the cells treated with shRNA with no homology to the human genome were used as control. To determine whether the effects observed upon H2A.Bbd depletion were specific to this variant, the cells depleted for another H2A variant associated with active chromatin, H2A.Z, were used for comparison. Analysis of RNA-seq data revealed changes in mRNA splicing patterns upon H2A.Bbd depletion, with the characteristic feature of increased read density within introns (see example in Figure 6A). The frequency of paired-end reads that indicate retention of intronic sequences (those with one end mapping to an exon and the other mapping to an intron) also increases upon H2A.Bbd depletion. These changes are likely to result from a decrease in efficiency of splicing upon H2A.Bbd depletion. We note that the density of intronic reads was non-zero in the control sample in which no variant was depleted; the density increased upon depletion of H2A.Z, but it was much less than the increase in the H2A.Bbd-depleted sample.
To assess how widespread this phenomenon is, we performed a number of genome-wide analyses. First, we confirmed that on average the fraction of intronic reads is significantly higher in the H2A.Bbd-depleted sample compared to both controls (Figure 6B). A more detailed analysis reveals that 1,122 transcripts have significantly increased intronic read density upon H2A.Bbd depletion, which is almost twice as many as 594 transcripts in the case of H2A.Z depletion (Figure S5A). The change in the intronic read density shows moderate correlation with the density of H2A.Bbd tags inside the genes even though the RNA-seq and ChIP experiments were performed in different cell lines (see Experimental Procedures and discussion above), indicating a direct nature of the effect (Figure S5B).
Second, we characterized each exon by estimating a ‘spilling’ score, defined as the fraction of RNA-seq reads that map partly to this exon and partly to an intron (Figure 6C). Using replicate data, we identified exons for which spilling score significantly changed between the control sample and the samples where either H2A.Bbd or H2A.Z was depleted. Comparison of the scores for these exons showed that depletion of either H2A variant resulted in an increased fraction of exon-intron reads; however, the magnitude of the increase is significantly larger in the case of H2A.Bbd depletion (P = 10^-13).
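A per-exon spilling score along these lines can be sketched as below. The sketch assumes contiguous (unspliced) read alignments given as half-open intervals and takes all reads touching the exon as the denominator; the text does not specify the denominator, so that choice, the function name, and the toy reads are assumptions.

```python
def spilling_score(exon, reads):
    """Fraction of reads touching the exon that also extend past an exon
    boundary into flanking intronic sequence."""
    ex_start, ex_end = exon
    touching = [r for r in reads if r[0] < ex_end and ex_start < r[1]]
    if not touching:
        return 0.0
    spilled = [r for r in touching if r[0] < ex_start or r[1] > ex_end]
    return len(spilled) / len(touching)

# Toy data (hypothetical): one exon and five contiguous read alignments.
exon = (100, 200)
reads = [(110, 160), (150, 200), (180, 230), (80, 130), (300, 350)]
score = spilling_score(exon, reads)
print(score)
```

Here two of the four reads touching the exon cross a boundary, and the fifth read is ignored because it does not touch the exon.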
Third, we computed the numbers of exon-exon junctions supported by the RNA-seq fragments spanning over the junction from one exon to another, which represent instances of normal splicing (Figure 6D). Rather than using a threshold for the minimal number of RNA-seq fragments supporting a junction, we compared the results for several threshold values and observed that the numbers of supported junctions significantly decreased upon H2A.Bbd depletion. As before, the effect of H2A.Bbd depletion is significantly more pronounced than that of H2A.Z depletion (P = 10^-5). Results of all these analyses indicate that histone H2A.Bbd is involved in splicing of introns from precursor mRNAs.
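Counting supported junctions over a range of thresholds, rather than committing to one cutoff, might look like the following sketch. The junction labels, fragment counts, and threshold values are hypothetical.

```python
def supported_junctions(support, thresholds):
    """For each threshold, count junctions supported by at least that many fragments."""
    return {t: sum(1 for n in support.values() if n >= t) for t in thresholds}

# Toy data (hypothetical): spanning-fragment counts per exon-exon junction.
control = {"e1-e2": 12, "e2-e3": 5, "e3-e4": 1}
depleted = {"e1-e2": 6, "e2-e3": 1, "e3-e4": 0}

thresholds = (1, 2, 5)
control_counts = supported_junctions(control, thresholds)
depleted_counts = supported_junctions(depleted, thresholds)
print(control_counts, depleted_counts)
```

If the depleted sample shows fewer supported junctions at every threshold, the decrease does not hinge on one arbitrary cutoff.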
The RNA-seq experiment also allows estimation of the exon inclusion rates, which measure the frequency with which exons are included in the mature transcript. Here, we defined the inclusion rate for each exon as the ratio of the number of the tags mapped to that exon relative to the maximal number of tags mapped to any single exon in the same gene. Using this approach, we detected significant changes in exon inclusion rates between the H2A.Bbd-depleted and control samples and observed that the mean inclusion rate was significantly increased in the H2A.Bbd-depleted sample (Figure S5C). On average, the changes in exon inclusion rates were positively correlated with changes in transcript levels (Figure S5F); however, the transcript level changes were relatively small (less than two-fold) in most of the transcripts exhibiting changes in exon inclusion. We also addressed the relationship between presence of H2A.Bbd and changes in exon inclusion rates by comparing the inclusion rates in control and H2A.Bbd-depleted samples for exons grouped according to H2A.Bbd enrichment levels. We observed that the change in inclusion rate upon H2A.Bbd depletion is more pronounced for the exons with higher normal enrichment levels of H2A.Bbd (Figure S5D–E). We conclude that the presence of H2A.Bbd affects alternative splicing with a net decrease in exon inclusion.
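The inclusion-rate definition given above translates directly into code. The sketch below uses raw tag counts per exon; whether counts were normalized by exon length is not stated in the text, so the use of raw counts is an assumption, as are the function name and the toy values.

```python
def inclusion_rates(exon_tags):
    """Inclusion rate per exon: tags on the exon divided by the largest
    tag count observed on any single exon of the same gene."""
    top = max(exon_tags.values(), default=0)
    if top == 0:
        return {exon: 0.0 for exon in exon_tags}
    return {exon: n / top for exon, n in exon_tags.items()}

# Toy data (hypothetical): tag counts for the exons of one gene.
gene = {"exon1": 80, "exon2": 20, "exon3": 100}
rates = inclusion_rates(gene)
print(rates)
```

By construction the best-covered exon in each gene has a rate of 1.0, so the measure is relative within a gene.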
Taken together, our results provide evidence for functional association of H2A.Bbd chromatin with mRNA processing and, specifically, splicing. In particular, our observations show that H2A.Bbd depletion results in increased read density in intronic sequences and elevated exon inclusion rates. These changes are consistent with a decrease in splicing efficiency, indicating that H2A.Bbd facilitates splicing in unperturbed cells.
We report that the properties of the H2A.Bbd chromatin differ from those of bulk chromatin in physical parameters and in an increased association with transcribed genes. The changes we observe in splicing patterns upon H2A.Bbd depletion imply a role for this variant in modulating splicing efficiency. Examples of the impact of H2A.Bbd on chromatin function include the shorter length of nucleosome-protected DNA and the related nucleosomal repeat, corresponding to a non-canonical arrangement in the primary structure of chromatin. The enrichment of H2A.Bbd chromatin in spliceosome components and in the elongating Pol II suggests that H2A.Bbd facilitates a connection between transcription elongation and mRNA processing. Our analysis demonstrated that H2A.Bbd chromatin is associated with the Sf3b complex, a component of the U2 snRNP required for spliceosome function and that H2A.Bbd nucleosomes are arranged in a specific pattern at the intron-exon junction, which may facilitate splice site recognition. Consistent with a role in stimulating spliceosome assembly, depletion of H2A.Bbd resulted in fewer spliced junctions and increased abundance of intronic sequences in the transcripts.
We performed a number of controls and checks to ensure that the observed effects are specific to H2A.Bbd. Since transcription and splicing are known to be linked (Luco et al., 2011; Moore and Proudfoot, 2009), assessing the impact of another histone H2A variant involved in gene activation, H2A.Z, on mRNA processing was important for this purpose. While we observed some association of H2A.Z with Sf3b complex, it was considerably less pronounced than that of H2A.Bbd (Figure 5B). Likewise, the magnitude of the changes in splicing upon H2A.Z depletion was significantly lower than that observed for H2A.Bbd (Figures 6, S5), while H2A.Z and H2A.Bbd depletions were equally strong and H2A.Z depletion resulted in a pronounced down-regulation of a set of genes (see Supplemental Experimental Procedures). Furthermore, the number of splicing junctions directly supported by RNA-seq reads did not change significantly upon H2A.Z depletion (Figure 6D), suggesting that the association of this variant with splicing is relatively weak. This indicates that the impact of H2A.Bbd on splicing goes beyond what might be expected for a histone variant involved generally in active transcription.
Two hypotheses are consistent with H2A.Bbd-mediated integration of transcription and mRNA processing. First, H2A.Bbd may arrange chromatin structure that facilitates a transcriptionally favorable ‘open’ chromatin state with indirect effects on downstream mRNA processing. In this ‘non-specific’ model, splicing regulation could be, e.g., a consequence of transit time of RNA Pol II with resultant changes in availability of splice sites (Keren et al., 2010; Kornblihtt et al., 2004; Tilgner et al., 2009). Second, the H2A.Bbd variant may specifically interact with the transcriptional machinery to coordinate both transcription elongation and mRNA processing factors (Luco et al., 2011). The two models are not mutually exclusive and the changes in splicing that we observe upon H2A.Bbd depletion are consistent with either model. At the same time, the enrichment of H2A.Bbd chromatin with splicing factors favors the latter mechanism. Other studies have demonstrated a connection between chromatin factors and mRNA processing by the association between proteins that bind both nucleosomes and RNA-splicing machinery (Batsché et al., 2006; Luco et al., 2010; Sims et al., 2007). Here, we elaborate upon this idea by showing the involvement of the specific H2A variant in integration of transcription and splicing.
A possible role for nucleosomes in mRNA processing was hypothesized from the observation that typical exon length correlates well with nucleosome size in metazoans (Keren et al., 2010). One interpretation is that nucleosomes may be involved in exon definition, helping the splicing machinery to identify shorter exons from the many kilobases of introns in a gene. This is supported by the observation that nucleosomes display higher density in the exons separated by long rather than short introns (Spies et al., 2009). We note that the average human exon length of ~126 bp is shorter than both the exon length in other metazoans and canonical nucleosome protection size (Schwartz et al., 2009), but agrees with the average protection size of H2A.Bbd nucleosomes reported here. Furthermore, a decrease in splicing events and increased prevalence of intronic sequences in mRNA, as observed in the absence of H2A.Bbd, are phenotypes that are reconcilable with defects in splice site definition. However, the mechanism of such an association is yet to be identified.
The landscape of H2A variants in the genome reflects the local regulation of gene expression, with each variant being associated with different stages of transcriptional regulation. Active chromatin consists of promoters enriched with H2A.Z and gene bodies enriched with H2A.Bbd, both increasing with gene activity. Repressed genes are largely devoid of H2A.Z and H2A.Bbd but may have some association with macroH2A. Concordantly, H2A.Bbd and macroH2A generally display mutual exclusivity along the genome as they do on the inactive X. However, regions demonstrating inclusion of both variants are also apparent, and the presence of macroH2A was detected by mass spectrometry of H2A.Bbd chromatin (see Table S3). Therefore, alternative inclusion of H2A.Bbd and macroH2A is likely a consequence of their association with different regulation pathways rather than of possible physical limitation on having a chromatin fiber with adjacent H2A.Bbd and macroH2A nucleosomes.
It is intriguing that H2A.Bbd is observed only in mammals and not in invertebrates, especially in light of our findings that H2A.Bbd is associated with transcription and splicing, processes that occur in all eukaryotes. While several histone variants are enriched in active genes throughout eukaryotes (Talbert and Henikoff, 2010; Weber et al., 2010), the hypervariability of the quickly evolving H2A.Bbd might have been instrumental during the evolution of higher organisms (Gonzalez-Romero et al., 2008). Alternative splicing of mRNA, which increases protein product diversity without increasing the number of genes (Nilsen and Graveley, 2010), has been associated with the development of organism complexity. Therefore, H2A.Bbd may have co-evolved with other pathways to further regulate mRNA splicing important during mammalian evolution. Additional studies will be required to test this hypothesis.
Stable HeLa cell lines expressing H2A.Bbd-FLAG, macroH2A1.2-FLAG, H2A.Z-FLAG or H2A-FLAG were constructed as previously described (Nakatani and Ogryzko, 2003; Viens et al., 2006). The cell lines showed no obvious growth defects. Immuno-affinity purified chromatin was isolated as described (Foltz et al., 2006; Goldman et al., 2010; see Supplemental Material for more details, including a description of the control for relative expression of the recombinant protein and a list of the antibodies used). The purified DNA was cloned using the standard Illumina sample kit and sequenced on a Genome Analyzer I instrument.
A panel of 5 different H2A.Bbd (NM_080720) targeted constructs from Sigma-Aldrich (Mission shRNA clones) was assayed for the ability to deplete H2A.Bbd in the HeLa stable cell line after transduction of lentivirus expressing the respective shRNA. Construct TRC106827, which displayed the most potent H2A.Bbd depletion, estimated at >90% after 7 days of puromycin selection following transduction, was selected. Construct TRC72585 (NM_002106) was used for H2A.Z, with >95% depletion, and non-target shRNA lentivirus (SHC002) was used as a control. Briefly, after transduction, nuclei were isolated to enrich for immature transcripts using an NP-40 lysis buffer (Pandya-Jones and Black, 2009). Total RNA was purified using the TRIzol reagent (Invitrogen) and subsequently depleted of rRNA using the Ribo-Zero kit (Epicentre #RZH1046). Purified RNA was then cloned using the TruSeq RNA sample kit (Illumina) without any polyA enrichment.
A detailed description of the computational procedures is provided in the Supplemental Material. In brief, sequenced tags were mapped to the human genome (hg18) using the Bowtie aligner (Langmead et al., 2009) in the case of ChIP-seq data and TopHat in the case of RNA-seq data (Trapnell et al., 2009). Tags were filtered for possible artifacts prior to further analyses. Differential gene expression analysis was performed with the Bioconductor package DESeq (Anders and Huber, 2010), using RNA-seq data from two biological replicates for each sample (at least 85 million paired reads per replicate). The Gene Ontology Term Finder web server (http://go.princeton.edu/cgi-bin/GOTermFinder/GOTermFinder) was used for gene ontology analysis. Exon inclusion and intron incorporation rates were estimated based on the normalized numbers of fragments mapped within each interrogated locus. The ‘spilling’ score was computed for each exon as the number of RNA-seq reads that lie partly within the exon and partly outside all exons of the corresponding gene, normalized by the total number of reads associated with this exon.
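The ‘spilling’ score described above can be illustrated with a minimal sketch. This is not the pipeline used in this study (see the Supplemental Material for that); it only restates the verbal definition under several assumptions of our own: coordinates are half-open intervals on a single chromosome, each read is given as a list of aligned blocks (so a spliced read has more than one block), and a read is "associated" with an exon if any of its blocks overlaps that exon. The function names `merge_intervals`, `covered_length` and `spilling_score` are hypothetical.

```python
from typing import List, Tuple

Interval = Tuple[int, int]   # half-open [start, end), assumed convention
Read = List[Interval]        # aligned blocks of one read (spliced reads have >1)


def overlaps(a: Interval, b: Interval) -> bool:
    """True if two half-open intervals share at least one base."""
    return a[0] < b[1] and b[0] < a[1]


def merge_intervals(intervals: List[Interval]) -> List[Interval]:
    """Merge overlapping or touching intervals into a sorted, disjoint list."""
    merged: List[Interval] = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1]:
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


def covered_length(block: Interval, merged_exons: List[Interval]) -> int:
    """Number of bases of `block` that fall inside any of the merged exons."""
    total = 0
    for start, end in merged_exons:
        lo, hi = max(block[0], start), min(block[1], end)
        if hi > lo:
            total += hi - lo
    return total


def spilling_score(exon: Interval, gene_exons: List[Interval],
                   reads: List[Read]) -> float:
    """Fraction of reads associated with `exon` that also extend outside
    all exons of the gene. Returns 0.0 if no read overlaps the exon."""
    merged = merge_intervals(gene_exons)
    associated = 0
    spilling = 0
    for blocks in reads:
        if not any(overlaps(b, exon) for b in blocks):
            continue  # read is not associated with this exon
        associated += 1
        # The read spills if any aligned block has bases outside every exon.
        if any(covered_length(b, merged) < b[1] - b[0] for b in blocks):
            spilling += 1
    return spilling / associated if associated else 0.0


# Toy data (invented for illustration only).
example_exons = [(100, 200), (500, 600)]
example_reads = [
    [(120, 170)],              # entirely inside the exon
    [(180, 230)],              # runs from the exon into the intron
    [(150, 200), (500, 550)],  # spliced read joining two exons
    [(300, 350)],              # intronic only, not associated with the exon
    [(90, 140)],               # starts upstream of the exon
]
print(spilling_score((100, 200), example_exons, example_reads))
```

In this toy example four reads overlap the exon at 100-200 and two of them extend beyond annotated exons, so the score is 0.5. Working with aligned blocks rather than whole read spans is a design choice of the sketch: it prevents correctly spliced reads from being counted as spilling merely because their span crosses an intron.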
We are grateful to Drs. Douglas Black, Erika Larschan, Artyom Alekseyenko and Andrey Gortchakov for critical reading of the manuscript and many insightful comments. This work was supported by NIH grants R01GM082798 and U01HG00425 to P.J.P. and R37GM048405 to R.E.K.
Data described in this paper were deposited into the NCBI GEO database with accession number GSE38284.