In this study, we characterize a four-protein nucleosome-binding complex from Schizosaccharomyces pombe, termed SAPHIRE, that includes two orthologs of human Lsd1, a histone demethylase. The SAPHIRE complex is essential for cell viability, whereas saphire mutants lacking key conserved catalytic residues are viable but thermosensitive, suggesting that SAPHIRE has both an important enzymatic function and an essential nonenzymatic function. SAPHIRE is present in (or adjacent to) particular heterochromatic loci and also in the transcription start site regions of many highly active polymerase II genes. However, ribosomal protein genes are notably SAPHIRE deficient. SAPHIRE promotes activation, as target genes are selectively attenuated in saphire mutants. Interestingly, saphire mutants display increased histone H3 lysine 4 dimethylation, a modification typically associated with euchromatin. SAPHIRE localization is dynamic, as activated genes rapidly acquire SAPHIRE. Furthermore, saphire mutants dramatically shift a heterochromatin-euchromatin boundary in Chr1, suggesting a novel role in boundary regulation.
Chromatin is a dynamic material that partitions chromosomes into functional domains (telomeres, centromeres, and rRNA gene arrays) and also partitions individual genes into segments, each with a distinct role in transcriptional regulation. All chromatin regions contain arrays of nucleosomes, which consist of 147 base pairs of DNA wrapped around an octameric disk of histone proteins (24, 34). However, different chromatin regions have distinctive characteristics, including (i) differences in chromatin composition, including histone variants, linker histones, and associated nonhistone chromatin architectural proteins; (ii) structural diversity (nucleosome positioning or compaction); and (iii) variation in covalent modifications of the DNA and histones (34).
Covalent histone modifications serve as docking sites for protein domains present in specific factors, which in turn endow the region with unique characteristics (46). Methylation of lysines in the unstructured termini (tails) of histones regulates a wide array of DNA-templated processes, including transcription (12, 47). Methylation of lysines in histone tails is mediated by two families of enzymes, namely, the DOT family and the SET family (30, 40). These enzymes can mono-, di-, or trimethylate lysines, and even the subtle difference between di- and trimethylation of the same lysine residue can be discriminated by recognition domains and, therefore, used to recruit distinct factors (21, 33, 52).
Dimethylation of histone H3 lysine 4 (H3K4me2) is generally thought to make chromatin permissive to transcription, whereas trimethylation of the same residue (H3K4me3) is associated only with transcriptionally active genes (38). Set1p, the protein which is responsible for all H3K4 methylation in Saccharomyces cerevisiae (budding yeast) (6, 19), bears several autoregulatory domains which control whether the enzyme di- or trimethylates (6, 39). This observation argues that a balance between these two very similar methyl marks is critical for homeostasis. Indeed, chromatin immunoprecipitation (ChIP)-microarray experiments with budding yeast demonstrated that, on average, these two marks are not identically distributed across genes. H3K4me3 is concentrated at the 5′ ends of transcribed genes, whereas H3K4me2 tends to peak in the middle of genes but shifts closer to the 3′ end when genes are highly transcribed (23, 35). Although several factors that bind H3K4 methylated tails have been identified, it is still unclear what role H3K4me2 and H3K4me3 play in transcription. However, deletion of SET1 leads to defects in gene activation, highlighting the importance of these marks (5).
Histone methylation is not restricted to transcriptional activation, as histone lysine methylation at alternative locations on the H3 tail promotes repression. First, methylation of H3K9 is present at sites of constitutive heterochromatin and recruits repressive proteins, such as HP1, to enforce the silent state of these domains (10). Second, H3K36 methylation in the bodies and 3′ ends of transcribed genes recruits a repressive complex that inhibits spurious transcription initiation from cryptic promoters inside the gene (7, 13-15, 29).
Histone methylation is not a static mark in vivo, as shown by the discovery of histone lysine demethylases (HDMs). The largest family of HDMs is the Jumonji (JMJ) domain family. To date, five JMJ HDMs have been characterized, with substrate specificities directed toward H3K9, H3K36, or both (18, 49-51, 53). Thus, this class of HDMs appears to demethylate repressive methyl marks. Significantly, the active-site chemistry of JMJ domains allows for demethylation of mono-, di-, or trimethylated lysine residues.
However, only one protein, Lsd1, has been isolated that can demethylate H3K4. Lsd1 belongs to a class of proteins called nuclear amine oxidases (NAOs), which are defined by the presence of an N-terminal SWIRM domain followed by an amine oxidase domain (43). The SWIRM domain is found in many nuclear proteins, most of which are members of chromatin-associated multiprotein complexes (e.g., the SAGA histone acetyltransferase complex or the SWI/SNF remodeler complex) (1). The amine oxidase domain is a flavin adenine dinucleotide (FAD)-dependent oxidoreductase. Amine oxidases use the oxidative potential of FAD to break a carbon-nitrogen bond that forms the substrate amine. This reaction can also be thought of as the dealkylation of an amine, as is the case in a histone demethylation reaction. Importantly, this class of enzymes cannot oxidize quaternary amines; thus, Lsd1 can only demethylate mono- and dimethylated H3K4 (3, 43).
Although most of the data regarding Lsd1 point to a role in transcriptional repression, a recent study observed that Lsd1 is recruited to the prostate-specific antigen (PSA) gene during activation by the androgen receptor (AR) (20, 27, 43, 44). Small interfering RNA-mediated knockdown of the Lsd1 protein caused a drastic decrease in the activation of AR-responsive reporter genes, suggesting that Lsd1 can be an activator. Knockdown studies also demonstrated that Lsd1 is required for demethylation of repressive H3K9 methylation when PSA is activated. These studies suggested that Lsd1 demethylated H3K9 to promote activation. However, these results are also consistent with Lsd1 activity being required upstream of another H3K9 demethylase rather than Lsd1 being an H3K9 demethylase itself. Indeed, a JMJ domain protein (JHDM2A) with robust in vitro H3K9me2 demethylase activity using purified components is also recruited to PSA upon activation by AR. Small interfering RNA studies demonstrated that JHDM2A knockdown creates profound defects in H3K9 demethylation at the PSA gene upon activation, whereas Lsd1 knockdown effects are much more modest (53).
Thus, many important questions remain to be answered about NAOs. For example, most of the work done to date has focused on Lsd1 function at a limited number of gene targets (see Discussion), and it remains to be determined whether the principles elucidated so far are general. Because most of the data point to a repressive function for NAOs, it is particularly important to determine if the activation function seen at PSA is broadly applicable and dependent upon H3K9 demethylation. Moreover, since Lsd1 is present in large complexes, it is important to determine the catalytic and noncatalytic roles of this class of proteins and the contributions of associated proteins to Lsd1 function. Finally, the only biological function attributed to NAOs so far is transcriptional regulation, and since chromatin modifications impact many aspects of chromosome biology, it is imperative to explore other possible roles for NAOs.
Here we report the genome-wide dynamics of SAPHIRE, a novel NAO complex from Schizosaccharomyces pombe. Our results demonstrate that SAPHIRE promotes gene activation, most likely through enzymatic and nonenzymatic means. SAPHIRE occupancy is dynamic and shifts to newly activated genes during heat shock. Furthermore, mutant analysis reveals that defects in transcription of SAPHIRE targets are associated with increased H3K4me2 of the affected genes. Finally, we identify a new role for NAOs in maintaining the boundary between euchromatin and heterochromatin at a telomere.
Yeast growth, manipulations, and molecular biology were performed according to standard protocols. Strains were grown on yeast extract supplemented with histidine, uracil, leucine, and adenine at 32°C unless otherwise noted. For heat shock experiments, cells were cultured to an optical density at 600 nm of 0.5 and transferred to flasks which were preheated in a 40°C water bath. Cells were incubated, with shaking, for 15 min and harvested for cross-linking and ChIP or RNA isolation.
Full-length Saf110 with an N-terminal six-His tag was expressed in Escherichia coli and purified under denaturing conditions, using Ni-nitrilotriacetic acid agarose, following the manufacturer's instructions (QIAGEN). Antibodies against the protein, purified in-gel by Covance, were raised in a rabbit. The antiserum was purified using hydroxyapatite chromatography and used at a dilution of 1:1,000 for Western blotting following standard procedures.
SAPHIRE was purified using a tandem affinity purification (TAP) tag, which carries (in tandem) the protein A motif, a tobacco etch virus (TEV) protease recognition sequence, and a calmodulin-binding protein domain. The TAP tag was placed at the 3′ end of either the Saf140p or Saf110p gene by homologous recombination (36). Whole-cell extracts were prepared in buffer A (20 mM Tris-HCl [pH 7.5], 10% glycerol, 400 mM NaCl, 1 mM EDTA, 10 mM ZnOAc, 0.1% NP-40, 0.1 mM dithiothreitol, and a protease inhibitor cocktail). Whole-cell extracts were bound to immunoglobulin G-Sepharose and washed with buffer A, and the tagged proteins were eluted by cleavage with TEV protease in buffer B (20 mM Tris-HCl [pH 8.0], 10% glycerol, 150 mM NaCl, 10 mM ZnOAc, 0.1% NP-40, 1 mM dithiothreitol, and a protease inhibitor cocktail). The TEV eluates were then bound to calmodulin-Sepharose in buffer B containing 2 mM CaCl2 and eluted in buffer C (20 mM Tris-HCl [pH 7.5], 10% glycerol, 150 mM NaCl, 10 mM ZnOAc supplemented with 2 mM EGTA [no CaCl2]). Purified SAPHIRE was resolved by sodium dodecyl sulfate-7.5% polyacrylamide gel electrophoresis and silver staining. The bands were excised, and tryptic digestion was performed in-gel. The peptides were purified on reversed-phase microtip columns. Mass spectrometric analysis (matrix-assisted laser desorption ionization-time-of-flight analysis) was performed using a Bruker Reflex III instrument with delayed extraction. The experimental masses (m/z) were used to search a nonredundant S. cerevisiae or S. pombe protein database (NCBI, Bethesda, MD). Mass spectrometric sequencing (matrix-assisted laser desorption ionization-tandem time of flight-tandem mass spectrometry) of selected peptides was done on a Bruker Ultraflex TOF/TOF instrument.
Gel mobility shift analysis was performed with a 172-bp fragment from Xenopus laevis 5S assembled into mononucleosomes (10 ng; 44 fmol) reconstituted with S. cerevisiae recombinant histones. Nucleosomes were incubated with ~16 ng of SAPHIRE (~44 fmol) in a 5-μl reaction mix containing 20 mM Tris-HCl (pH 7.5), 10% glycerol, 150 mM NaCl, and 10 mM ZnOAc; this resulted in an equimolar concentration of DNA/nucleosomes and SAPHIRE (8.8 nM). After incubation at room temperature for 30 min, the binding reaction mix was chilled on ice and resolved in a precooled native polyacrylamide gel (4% polyacrylamide, 5% glycerol, 1 mM MgCl2 in 0.5× Tris-borate-EDTA) run at 4°C. For supershift assay, SAPHIRE was incubated with 1 μl (~500 ng) of hydroxyapatite-purified anti-Saf110 antibody for 20 min on ice prior to addition to nucleosomes.
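The equimolar concentration quoted above follows from the stated amounts and the reaction volume. A minimal sketch of the arithmetic is given here; the function names are hypothetical, and the 360-kDa mass used to convert nanograms of SAPHIRE to femtomoles is an assumption (the sum of the apparent subunit masses reported in Results), not a measured value.

```python
def mass_ng_to_fmol(mass_ng, mol_mass_kda):
    """Convert a mass in ng to an amount in fmol for a molecular mass in kDa."""
    # 1 ng / 1 kDa = 1e-9 g / 1e3 g/mol = 1e-12 mol = 1 pmol = 1000 fmol
    return mass_ng / mol_mass_kda * 1000.0


def concentration_nM(amount_fmol, volume_ul):
    """Concentration in nM for an amount in fmol dissolved in a volume in µl."""
    # 1 fmol/µl = 1e-15 mol / 1e-6 l = 1e-9 mol/l = 1 nM
    return amount_fmol / volume_ul


# Values taken from the binding reaction described in the text.
nucleosome_nM = concentration_nM(44.0, 5.0)

# Assumed complex mass of 360 kDa (140 + 110 + 60 + 50); illustrative only.
saphire_fmol = mass_ng_to_fmol(16.0, 360.0)
saphire_nM = concentration_nM(saphire_fmol, 5.0)
```

With these inputs the nucleosome concentration evaluates to 8.8 nM, and the SAPHIRE amount comes out close to the ~44 fmol stated, which is consistent with the description of the reaction as equimolar.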
ChIP was performed as previously described (37). Briefly, cells grown to mid-log phase were cross-linked in 1% formaldehyde for 30 min. Cells were broken, and the chromatin fraction was sheared to roughly 500-bp fragments. For TAP IP, 4 μl of anti-protein A antibody (Sigma) was coupled to 100 μl of Dynabeads (Invitrogen). ChIPs of histone H3 (Abcam ab1791), H3K4me2 (Upstate 03-070), H3K9me2 (Abcam ab1220), and polymerase II (Pol II) (Abcam ab817) were performed using 3 μl of antibody and 100 μl of Dynabeads.
S arrays were hybridized, scanned, and normalized as previously described (25). O arrays (Agilent Technologies) were hybridized, scanned, and normalized according to standard procedures. Spots flagged by the scanner as below background or saturated were removed from analysis. The data shown represent the averages for multiple biological replicates. Importantly, in order to properly interpret H3K4me2 and H3K9me2 ChIP data, signals from modification-specific H3 ChIPs were normalized to genome-wide histone H3 occupancy (determined by multiple analyses using an anti-H3 C-terminal antibody). For analysis of SAPHIRE promoter occupancy-gene expression correlations, promoters were defined as the region from −800 to +200 relative to the ATG of all genes. SAPHIRE ChIP ratios were averaged for all tiles falling within this promoter region, and the resulting value was used to represent SAPHIRE promoter occupancy for the downstream gene. Expression values were derived from one of two sources, either raw fluorescence intensity values from microarray data for a wild-type strain or by averaging the Pol II ChIP ratios for every tile falling within an annotated open reading frame (ORF). In order to reveal genome-wide trends, transcription (Pol II ChIP), expression (microarray), and histone methylation (H3K4me2 ChIP) data were plotted as moving averages against SAPHIRE occupancy for various data sets. In all cases, moving averages were calculated using a moving window that constituted 5% of the total data set, with a step size of 1. For BLAST analysis of subtelomeric regions, the DNA sequence spanning coordinates 5522385 to 5579828 (57 kb) of chromosome 1 was split into 60-bp segments spaced 55 bp apart (5 bp overlap on either side). Each of these segments was used to query the entire S. pombe genomic sequence (obtained from the Wellcome Trust Sanger Institute at www.sanger.ac.uk/Projects/S_pombe/) with the BLAST algorithm available from the NIH-NCBI website (www.ncbi.nlm.nih.gov/BLAST/). Segments were considered to be duplicated if they generated a secondary BLAST hit that showed ≥90% sequence identity with the query sequence. Finally, linear regression and chi-square analyses were done using standard statistical methods.
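Two of the computations described above, the moving average over a 5% window and the division of the subtelomeric region into overlapping 60-bp query segments, can be sketched as follows. This is an illustrative reconstruction rather than the code used in the study: the function names are hypothetical, the coordinates are assumed to be inclusive, and it is assumed that genes were ranked by SAPHIRE occupancy before averaging.

```python
def tile_segments(start, end, length=60, step=55):
    """Return inclusive (seg_start, seg_end) coordinates of fixed-length
    segments whose start positions are `step` bp apart."""
    segments = []
    pos = start
    while pos + length - 1 <= end:
        segments.append((pos, pos + length - 1))
        pos += step
    return segments


def moving_average_by_rank(x, y, window_fraction=0.05, step=1):
    """Sort paired values by x, then average x and y over a sliding window
    covering `window_fraction` of the data set (assumed ranking by x)."""
    pairs = sorted(zip(x, y))
    n = len(pairs)
    window = max(1, int(round(n * window_fraction)))
    result = []
    for i in range(0, n - window + 1, step):
        chunk = pairs[i:i + window]
        mean_x = sum(p[0] for p in chunk) / window
        mean_y = sum(p[1] for p in chunk) / window
        result.append((mean_x, mean_y))
    return result


# Region of chromosome 1 named in the text.
segments = tile_segments(5522385, 5579828)

# Toy data only: 100 hypothetical genes with occupancy x and expression y.
toy_x = list(range(100))
toy_y = [2 * v for v in toy_x]
smoothed = moving_average_by_rank(toy_x, toy_y)
```

Consecutive segments produced this way share 5 bp, matching the overlap described in the text. Each segment would then be submitted as a separate BLAST query, a step that is not shown here.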
We recognized two proteins in S. pombe that contained three domains of interest, namely, a SWIRM domain, an FAD-dependent amine oxidase domain, and a high-mobility-group B [HMG(B)] domain. Since the SWIRM and HMG(B) domains are closely associated with chromatin function, we initially pursued these proteins as potential new chromatin enzymes that could be studied in a genetically tractable organism. The amine oxidase domains of these proteins contain the key catalytic residues present in known amine dealkylases, suggesting that these S. pombe proteins have the capacity for amine demethylation. The two S. pombe paralogs are highly similar to each other (BLAST E value, 2 × 10^−84) and share the same domain order. The human protein Lsd1 is highly similar (BLAST E value, 4 × 10^−14) and shares key catalytic residues, the SWIRM domain, and the same domain order with the S. pombe proteins.
To characterize the biochemical properties of these S. pombe NAO paralogs and to identify associated proteins, we purified each to homogeneity. To facilitate their purification, we integrated a DNA sequence into the 3′ end of each gene (separately, at the chromosomal locus) to encode an in-frame fusion of the TAP epitope. Integrations were performed separately in diploids, with sporulation and dissection generating tagged haploid progeny that grew as well as their untagged siblings, demonstrating that the tags do not impair function. Thus, the two strains used for purification contained one tagged NAO and one untagged NAO. Purification of either NAO to homogeneity yielded the same four proteins, as identified by mass spectrometry: these were the tagged NAO, the untagged NAO, and two additional proteins, of 60 and 50 kDa (Fig. 1A). Mass spectrometric analysis also showed that the additional bands in these lanes are proteolytic products of these four proteins, and thus both purifications yielded the same four-protein complex. We call the complex SAPHIRE, for a SWIRM-amine oxidase and plant homeodomain (PHD) protein complex involved in regulating gene expression (shown below), with members designated by their apparent molecular weights, i.e., Saf140p, Saf110p, Saf60p, and Saf50p. The systematic gene names are as follows: saf140+, SPAC23E2.02; saf110+, SPBC146.09c; saf60+, SPAC30D11.08c; and saf50+, SPCC4G3.07c. Thus, the two NAO paralogs are both present in the isolated complex. Interestingly, the additional 60-kDa and 50-kDa proteins are themselves paralogs (BLAST E value, 7 × 10^−44), and both contain a single PHD (Fig. 1B). Gel filtration of the entire complex provided an apparent mass of ~350 kDa, suggesting a simple tetramer (data not shown).
Since SAPHIRE contains multiple domains for chromatin association, we used electrophoretic mobility shift assays to assess the interaction with mononucleosomes. We found that SAPHIRE binds to mononucleosomes with a Kd of about 8 nM (Fig. 1C and data not shown). Specificity for this interaction was confirmed with a polyclonal antibody raised against Saf110p, which supershifted the nucleosome-SAPHIRE complex (Fig. 1C, lane 3).
To date, the only Lsd1-related protein that has demonstrated histone lysine demethylation activity in vitro is Lsd1 itself. Extensive biochemical testing of both native and recombinant SAPHIRE complexes has not revealed appreciable activity. This is consistent with the work of others on these S. pombe orthologs, showing no observed activity in vitro (31). Therefore, we focused on the biological and genomic impact of removing the conserved catalytic residue of these enzymes, as described below.
To determine the importance of SAPHIRE, we constructed gene deletion mutations in each subunit in diploids. We then induced sporulation of the heterozygous diploids and examined the phenotypes of haploid progeny. We found that saf140+, saf60+, and saf50+ are essential genes, as viability segregated 2:2 in four-spore tetrads (data not shown). Somewhat surprisingly, saf110+/saf110Δ diploids produced four viable spores, with the saf110Δ progeny always displaying extremely slow growth (Fig. 2A). Since 2:2 segregation of slow growth was observed in a large number of tetrads and was always linked to saf110Δ::kanMX, the phenotype was likely not a result of background mutation. Importantly, normal growth could be restored by transformation with a plasmid bearing only saf110+, confirming the null phenotype (data not shown). In addition to having slow growth, saf110Δ strains are temperature sensitive (Fig. 2A).
To test whether saf110+ and saf140+ are partially redundant, we overexpressed saf110+ or saf140+ (using high-copy plasmids) in heterozygous diploids (saf110+/saf110Δ or saf140+/saf140Δ) and tested whether we could acquire haploid segregants where the high-copy plasmid conferred growth ability to the reciprocal null segregants. We found that a high-copy saf110+ plasmid will not suppress the inviability of a saf140Δ strain, nor will a high-copy saf140+ plasmid suppress the extremely slow growth of a saf110Δ strain. These experiments suggest that the Saf140p and Saf110p proteins make largely unique contributions to SAPHIRE.
Saf140p, Saf110p, and Lsd1 share a high degree of similarity to FAD-dependent amine oxidases, such as monoamine oxidase (MAO; dealkylates neurotransmitters) and polyamine oxidase (PAO; dealkylates spermine and spermidine). Although these enzymes have different substrates, they all dealkylate an amine, using flavin reduction to oxidize the amine to an imine, followed by imine hydrolysis to yield two products; in the case of Lsd1, the products derived from monomethyl-lysine are formaldehyde and lysine. The most extensively studied catalytic residue is a specific lysine. In all crystal structures examined, this lysine positions the lone water molecule in the active site into a bridging position with the flavin N-5 atom of the FAD cofactor (2, 4, 9, 45). This lysine is absolutely conserved in all Lsd1 orthologues, all PAOs, and all MAOs and is the only residue in the catalytic pocket that is absolutely conserved in the distantly related sarcosine oxidases. Crystal structures and/or alignments identified the corresponding catalytic lysine as K661 in Lsd1, K300 in maize PAO, K862 in Saf140p, and K604 in Saf110p (Fig. 2B). In all cases tested, including Lsd1 and maize PAO, mutation of this residue alone produced no detectable activity in the resultant protein.
We determined the importance of this lysine on SAPHIRE function by replacing it with alanine. Since Saf140p and Saf110p have a lysine adjacent to this position, we also replaced K861/K603 with alanine, resulting in the double mutations KK861-862AA and KK603-604AA (KK→AA), to further ensure an impact of these substitutions on catalytic activity. Strains bearing saf140KK→AA and saf110KK→AA were constructed by replacing one of the two wild-type alleles with a KK→AA mutant allele at the endogenous genomic locus in a diploid, inducing the diploid to sporulate, and then isolating haploid KK→AA progeny. Separately, saf140KK→AA and saf110KK→AA mutants were viable, with the saf110KK→AA mutant displaying a moderate temperature sensitivity. Interestingly, the saf140KK→AA saf110KK→AA double mutant was also viable, though displaying pronounced temperature sensitivity (Fig. 2C). These phenotypes could not be attributed to a significant reduction in SAPHIRE stability, as SAPHIRE abundance and integrity were similar in the double mutant at the permissive and nonpermissive temperatures (Fig. 2D), though we have not rigorously quantified the impact on stability by purification of the mutant to homogeneity. No clear phenotype was detected when these strains were tested for growth in the presence of either cycloheximide, EGTA, methyl methanesulfonate, H2O2, latrunculin A, or sodium dodecyl sulfate. These results suggest that catalytic impairment does not render cells inviable but rather confers conditional growth defects. Thus, SAPHIRE exhibits an important catalytic activity and an essential noncatalytic activity.
To identify the sites in the genome where SAPHIRE functions, we determined the genome-wide localization of complex members. Genome-wide SAPHIRE occupancy was initially determined with a haploid S. pombe strain during asynchronous growth in rich medium containing glucose. To isolate DNA segments associated with SAPHIRE, we performed ChIP using TAP-tagged Saf140p or Saf110p derivatives. ChIP-enriched fragments from TAP-tagged or untagged strains were labeled with fluorescent dyes (Cy5 and Cy3, alternated between replicates) and used to probe two different DNA array formats for the S. pombe genome. The first array consisted of the entire genome parsed into two types of segments (~11,000 segments; the S array), i.e., ORFs and intergenic regions (IGRs), which were used to create spots on separate glass slides. The initial results derived from this array (described below) motivated questions that required the use of a higher-density array, which consisted of 60-mer single-stranded oligonucleotide tiles (~43,000 oligonucleotides; the O array) that tiled the entire genome at a resolution of ~250 bp on a single glass slide. For each array, the normalized Cy5/Cy3 ratio (tagged/untagged) provided a measurement of SAPHIRE occupancy at each position.
SAPHIRE occupancy determinations were reproducible and specific. For example, with the S array, three biological replicates of Saf140p-TAP and (separately) Saf110p-TAP yielded Pearson correlation coefficients (r) of ≥0.97 (data not shown). Furthermore, Saf140p-TAP and Saf110p-TAP appear to occupy the same DNA segments, as a comparison of their data sets yielded an r^2 value of 0.91 (Fig. 3A). For the S array, we defined segments that displayed enrichment above the 95.5th percentile as SAPHIRE occupied, which yielded a total of 119 occupied loci. For the O array, the higher density required alternative criteria to define occupancy: a tile was considered occupied if the tile (i) displayed a signal intensity well above background, (ii) displayed an average enrichment of at least threefold (tagged/untagged) over multiple biological replicates, (iii) displayed an enrichment of at least twofold in a majority of individual replicates, and (iv) was adjacent to a tile that displayed an average enrichment of at least twofold. This stringent set of criteria was designed to ensure a very small number of false-positive results. By these criteria, 426 separate loci are occupied by SAPHIRE genome-wide. As expected, the targets identified by the S array are a subset of those identified by the better-resolved O array.
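The four O array criteria can be expressed as a single predicate, sketched below for illustration. The function and parameter names are hypothetical, and the numerical background cutoff is an assumption, since the text specifies only that the signal be "well above background".

```python
from statistics import mean


def tile_is_occupied(replicate_ratios, neighbor_mean_ratios,
                     signal_intensity, background_cutoff):
    """Apply the four O array occupancy criteria to one tile.

    replicate_ratios     -- tagged/untagged enrichment in each replicate
    neighbor_mean_ratios -- average enrichment of each adjacent tile
    signal_intensity     -- raw signal of the tile
    background_cutoff    -- assumed threshold for "well above background"
    """
    # (i) signal intensity well above background (cutoff is an assumption)
    above_background = signal_intensity > background_cutoff
    # (ii) average enrichment of at least threefold across replicates
    average_ok = mean(replicate_ratios) >= 3.0
    # (iii) at least twofold enrichment in a majority of replicates
    n_twofold = sum(1 for r in replicate_ratios if r >= 2.0)
    majority_ok = n_twofold > len(replicate_ratios) / 2
    # (iv) at least one adjacent tile with average enrichment >= twofold
    neighbor_ok = any(m >= 2.0 for m in neighbor_mean_ratios)
    return above_background and average_ok and majority_ok and neighbor_ok


# Toy values only, not data from the study.
example_hit = tile_is_occupied([3.5, 3.1, 2.9], [2.4, 1.1], 1500.0, 500.0)
example_isolated = tile_is_occupied([4.0, 4.0, 1.0], [1.5, 1.2], 1500.0, 500.0)
```

Requiring an enriched neighbor, criterion (iv), rejects isolated single-tile spikes such as the second toy example, which is one plausible reason the criteria yield few false-positive results.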
In general, we found SAPHIRE distributed widely throughout the genome and dispersed at genes over all three chromosomes. We also observed SAPHIRE at several regions associated with heterochromatin. For example, we observed SAPHIRE in the region of the silent mating (MAT) locus and in peaks in the subtelomeric regions of chromosomes 1 and 2 (presented below). In addition, SAPHIRE occupies the junctions between the central region of the centromeres and the dg/dh repeats (data not shown). Notably, these regions also bear clusters of tRNAs which bind TFIIIC, and both tRNAs and TFIIIC sites themselves have demonstrated chromatin boundary functions (32, 41). However, their relationship to SAPHIRE has not been explored. Notably deficient in SAPHIRE are the rRNA gene repeats at the distal tips of chromosome 3.
Since the majority class of SAPHIRE targets comprises genes, we examined whether SAPHIRE prefers ORFs or IGRs. In contrast to the S array, the O array included all ORF and IGR tiles on the same slide, allowing a direct comparison of these two classes. On the O array, tiles representing ORFs and IGRs comprised 65% and 35% of the total tiles, respectively. However, if we examined only SAPHIRE-occupied tiles, the distribution changed to 18% ORFs and 82% IGRs (which is highly statistically significant [P < 0.0001]). Also, nearly all ORF tiles in the SAPHIRE-occupied class clustered at the extreme 5′ and 3′ ends of the ORFs, indicating that this occupancy was likely bleeding from IGR occupancy due to limitations in shearing resolution (data not shown). Thus, the vast majority of SAPHIRE occupancy is in IGRs.
We next determined the extents to which SAPHIRE occupies the three different IGR classes, namely, (i) single promoters, which are flanked by one ORF 5′ end and one ORF 3′ end; (ii) double promoters, which are flanked by two ORF 5′ ends; and (iii) nonpromoters, which are flanked by two ORF 3′ ends. In eukaryotic genomes, these three classes partition as would be predicted by random orientation relationships, i.e., 25%, 50%, and 25% (double, single, and nonpromoters, respectively). However, note that in S. pombe, double promoters are longer and nonpromoters are shorter, on average, than single promoters. Since the O array had a relatively fixed number of tiles per length of DNA, more tiles were present in double promoters and fewer were present in nonpromoters. Thus, the O array tile ratio was 39% to 47% to 14% for the genome as a whole (double, single, and nonpromoters, respectively). However, we found that the partitioning of SAPHIRE-occupied loci was 46%, 46%, and 8%, and this change in distribution favoring promoters was highly significant (Fig. 3B). Thus, SAPHIRE primarily occupies promoters.
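A chi-square goodness-of-fit comparison of this kind can be sketched as follows. The counts are an assumption made purely for illustration: the percentages reported above are applied to a hypothetical total of 426 loci, whereas the actual number of intergenic loci entering the test is not stated here. The function names are likewise hypothetical, and the P value shortcut is valid only for 2 degrees of freedom (three classes).

```python
import math


def chi_square_gof(observed_counts, expected_fractions):
    """Chi-square goodness-of-fit statistic for observed counts against
    expected class fractions (fractions should sum to 1)."""
    total = sum(observed_counts)
    statistic = 0.0
    for observed, fraction in zip(observed_counts, expected_fractions):
        expected = total * fraction
        statistic += (observed - expected) ** 2 / expected
    return statistic


def p_value_two_df(statistic):
    """Upper-tail P value of a chi-square distribution with exactly 2
    degrees of freedom, for which the survival function is exp(-x/2)."""
    return math.exp(-statistic / 2.0)


# Genome-wide O array tile fractions: double, single, nonpromoter.
expected = [0.39, 0.47, 0.14]

# Assumed counts: 46%, 46%, and 8% of a hypothetical 426 loci.
observed = [196, 196, 34]

stat = chi_square_gof(observed, expected)
p_value = p_value_two_df(stat)
```

Under these assumed counts, most of the statistic comes from the depletion of nonpromoters, in line with the conclusion that SAPHIRE favors promoter-containing IGRs.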
We next examined whether SAPHIRE localizes to active or inactive genes genome-wide. Here we compared SAPHIRE promoter occupancy with transcription of the downstream ORF. Promoters were defined as ~1-kb regions spanning positions −800 to +200 relative to the translation initiation codon, as our array resolution is ~250 bp. To ensure the proper attribution of SAPHIRE to a target gene, only single promoters were included in this analysis. SAPHIRE ChIP ratios for each tile were averaged for these single promoters and collated with their respective downstream genes for occupancy-expression correlations.
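The promoter-level occupancy value described above can be sketched as a simple window average. The names and data layout are hypothetical; it is assumed here that each tile is represented by its midpoint coordinate and that the window is mirrored for genes on the reverse strand, details that the text does not specify.

```python
def promoter_occupancy(tiles, atg, strand, upstream=800, downstream=200):
    """Average ChIP ratio of tiles falling in the promoter window.

    tiles  -- list of (midpoint_coordinate, chip_ratio) pairs (assumed layout)
    atg    -- genomic coordinate of the translation initiation codon
    strand -- '+' or '-'; the window is mirrored on the reverse strand
    """
    if strand == '+':
        low, high = atg - upstream, atg + downstream
    else:
        low, high = atg - downstream, atg + upstream
    ratios = [ratio for position, ratio in tiles if low <= position <= high]
    if not ratios:
        return None  # no tile falls inside the promoter window
    return sum(ratios) / len(ratios)


# Toy tiles only, not data from the study.
toy_tiles = [(100, 1.0), (300, 2.0), (550, 4.0), (1500, 8.0)]
forward_value = promoter_occupancy(toy_tiles, atg=1000, strand='+')
reverse_value = promoter_occupancy(toy_tiles, atg=1000, strand='-')
```

With tiles spaced roughly 250 bp apart, a 1-kb window of this kind would typically capture about four tiles per promoter.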
We then compared SAPHIRE occupancy to two different measures of gene activity, namely, steady-state mRNA levels and Pol II occupancy. For mRNA levels, expression profiling by microarray analysis was performed under the identical conditions utilized for SAPHIRE occupancy determinations, and the relative normalized fluorescence for the wild-type channel was used to quantify the relative amount of each transcript. Our unpublished data show that this measure is a generally accurate representation of transcript abundance. However, differences in hybridization efficiency can also affect the relative fluorescence of some transcripts with equal representation in the RNA pool. Therefore, we determined the genome-wide occupancy of RNA Pol II and used the average Pol II occupancy over the entire ORF of each gene as an alternative measure of the transcription rate. We found that SAPHIRE occupancy is strongly correlated with transcription when either Pol II occupancy or mRNA abundance is examined (Fig. (Fig.3C3C and data not shown).
The data presented in Fig. Fig.3C3C are useful for examining activity-occupancy relationships, but as they represent moving averages for many genes, they do not reveal how uniformly individual genes conform to this correlation. To examine this, we compared the distribution of SAPHIRE-occupied loci versus that of all loci to Pol II occupancy. Note that because only ORF class Pol II was examined (intergenic regions were omitted), the distribution curve does not center around zero Pol II occupancy (Fig. (Fig.3D).3D). We found that most SAPHIRE targets are highly transcribed, as opposed to the small percentage of the total genome with a high transcription rate. Also, although a small number of SAPHIRE-occupied genes display somewhat less than the average amount of Pol II, we did not find SAPHIRE at any highly repressed promoters. Furthermore, we found that SAPHIRE targets represent only a fraction of all highly expressed genes (defined as genes with average Pol II ORF enrichment of ≥2.8-fold [log2 value, ≥1.5]) (Fig. (Fig.3E).3E). One particularly large class which is notably deficient in SAPHIRE is the ribosomal protein genes, the most transcriptionally active Pol II gene class in the genome. If we removed this coordinately regulated class of genes from the analysis, SAPHIRE occupied a very significant fraction of the remaining highly transcribed genes (Fig. (Fig.3E).3E). Intriguingly, the ribosomal protein genes share a cis promoter sequence specific to S. pombe, i.e., a homology D box, that is absent from most SAPHIRE targets. Thus, promoter architecture differences may determine whether an active gene recruits SAPHIRE.
Our examination of the Pol II genome-wide occupancy allowed us to approximate the transcription start sites (TSS) of highly transcribed genes to within ~125 bp. We defined the TSS as the location of the first tile upstream of an expressed gene with >2-fold occupancy. Then, selecting SAPHIRE-occupied genes with known TSS (45 targets), we compared the peak of SAPHIRE occupancy to the TSS. We found that SAPHIRE localizes at or just upstream of the TSS and falls sharply at locations following the TSS (Fig. (Fig.3F3F).
To determine whether SAPHIRE is dynamically recruited to genes undergoing activation, we examined SAPHIRE occupancy in cells subjected to 15 min of heat shock, a treatment known to impact the transcriptional activity of hundreds of genes. Remarkably, SAPHIRE was redistributed to genes undergoing activation at a magnitude proportional to their activation (Fig. 4A). In this case, heat shock genes (such as hsp16+) that are bound by heat shock transcription factor (HSTF) were among the acquired targets. However, new targets also included many genes that are not believed to be bound by HSTF, suggesting that several types of promoters recruit SAPHIRE when they are activated. As a test for HSTF recruitment, we placed ectopic HSTF sites on a plasmid but did not observe heat shock-dependent recruitment of SAPHIRE, suggesting that recruitment may require both enhancer elements and promoter elements. As observed at 32°C, SAPHIRE occupied many, but not all, of the genes highly activated by heat shock, suggesting that SAPHIRE targeting is selective (data not shown). Taken together, these results demonstrate that SAPHIRE is a dynamic complex that is recruited to the region near the TSS of particular genes during the activation process.
We then took a genome-wide approach to determine whether SAPHIRE is required for the activation of target genes. If SAPHIRE assists in activation, then target genes should be selectively attenuated in saphire mutants. To test this, we performed mRNA expression profiling (microarray analysis) of wild-type, saf140KK→AA saf110KK→AA, and saf110Δ strains. Genes occupied by SAPHIRE displayed reduced transcription in the mutants, whereas unoccupied genes remained largely unaffected. The impact of the saf140KK→AA saf110KK→AA double mutation was moderate, whereas that of the saf110Δ mutation was much more pronounced (Fig. 4B). Although the saf110Δ mutation confers very slow growth, the observation of a selective impact on the transcription of SAPHIRE targets strongly supports the notion that the primary effect of the saf110Δ mutation is a reduction in the activation of SAPHIRE targets. Furthermore, although the impact of the saf140KK→AA saf110KK→AA double mutation is modest, it suggests that SAPHIRE catalytic activity also makes a contribution to target gene activation. Taken together, these results strongly suggest that SAPHIRE-occupied genes rely on SAPHIRE for their full activation and that contributions are made by both the catalytic function and the noncatalytic function.
Since Lsd1 is a known H3 demethylase, we correlated H3 methylation levels with SAPHIRE occupancy genome-wide. To this end, we determined the genome-wide patterns of H3K4me2 and H3K9me2. To properly interpret H3K4me2 and H3K9me2 ChIP experiments, we normalized all data to H3 occupancy, determined by multiple independent genome-wide ChIP analyses (using a C-terminal anti-H3 antibody). Genome-wide examination revealed an extremely weak correlation with H3K9 (data not shown), which was due to SAPHIRE occupancy at or adjacent to particular heterochromatic loci. However, we observed that the levels of H3K9me2 at promoters and ORFs were extremely low compared to the levels at heterochromatic loci (<1%) or in comparison to H3K4me2 levels. Therefore, we focused our analysis on the impact on H3K4 methylation. For H3K4 methylation, we focused on the trends observed at loci that contain SAPHIRE (i.e., display a log2 occupancy ratio of ≥0). In this analysis, we observed a modest positive correlation of SAPHIRE occupancy with H3K4me2 genome-wide (Fig. 4C) but a modest negative correlation at promoters (Fig. 4D). This raises the possibility that SAPHIRE interacts with H3K4me2 regions but promotes H3K4 demethylation selectively at promoters. To address this possibility, we examined H3K4me2 genome-wide in wild-type cells and in our mutants bearing substitutions in the conserved catalytic residues. Interestingly, H3K4me2 increased at SAPHIRE targets in the saf140KK→AA saf110KK→AA double mutant, consistent with impairment of an H3K4 demethylation function (Fig. 4C and D). Taken together, these results suggest that misregulation of H3K4 demethylation at promoters is correlated with reduced transcriptional activation (see Discussion).
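The normalization and filtering steps described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the pipeline used in the study: all values are assumed to be log2 ratios (so that normalization to H3 is a subtraction), Pearson correlation is assumed as the correlation measure, and the function names and data are hypothetical.

```python
import math


def normalize_to_h3(mod_log2, h3_log2):
    """Normalize a histone modification signal to H3 occupancy.

    With log2 ratios, dividing by H3 occupancy becomes a subtraction.
    """
    return [m - h for m, h in zip(mod_log2, h3_log2)]


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def correlation_at_occupied(saphire_log2, mod_log2, h3_log2, min_occupancy=0.0):
    """Correlate SAPHIRE occupancy with the H3-normalized modification,
    using only loci whose SAPHIRE log2 occupancy is >= min_occupancy."""
    normalized = normalize_to_h3(mod_log2, h3_log2)
    pairs = [(s, k) for s, k in zip(saphire_log2, normalized)
             if s >= min_occupancy]
    xs, ys = zip(*pairs)
    return pearson(xs, ys)


# Hypothetical per-locus log2 values (not from the study).
saphire = [-1.0, 0.0, 1.0, 2.0]
h3k4me2 = [5.0, 1.0, 2.0, 3.0]
h3 = [0.0, 0.5, 0.5, 0.5]

print(correlation_at_occupied(saphire, h3k4me2, h3))
```

Running the same function separately on all loci and on promoter loci only would yield the two comparisons reported in Fig. 4C and D.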
At present, little is known about the factors which restrict the spreading of telomeric heterochromatin in S. pombe. The telomere-proximal regions of all three S. pombe chromosomes bear repetitive segments with high levels of H3K9 methylation, which decrease in telomere-distal regions as unique segments are encountered. Although H3K9me2 was not significantly affected at SAPHIRE targets genome-wide (data not shown), perturbations were observed at particular loci. The clearest example, by far, is a major change in the heterochromatin-euchromatin boundary of the right telomere of chromosome 1 in the saf140KK→AA saf110KK→AA double mutant (Fig. 5). In this double mutant, H3K9me2 spread ~30 kb past its normal boundary (Fig. 5B). We recognized that certain telomeric repeat elements (and certain genes and pseudogenes) in S. pombe are also located on other telomeres, which can prevent the attribution of ChIP enrichment to a particular locus. However, the affected region mapped almost exclusively with unique O array tiles, showing that the spreading observed can be attributed unambiguously to the right subtelomeric region of chromosome 1 (Fig. 5D). Interestingly, examination of this subtelomeric region reveals a clear peak of SAPHIRE that coincides with the location where H3K9me2 begins to spread in the mutant (Fig. 5C). This peak of SAPHIRE occupancy includes three consecutive tiles with ≥2-fold occupancy that are unique in the genome. This raises the interesting possibility that the catalytic activity of SAPHIRE helps to promote boundary activity at this telomere.
In this study, a search for new chromatin-modifying enzymes in S. pombe yielded a four-protein complex called SAPHIRE, which binds nucleosomes and resides at the promoters of particular active genes. SAPHIRE bears two subunits, Saf140p and Saf110p, which are highly similar to the human histone demethylase Lsd1. Our genetic data identify SAPHIRE as an essential complex, providing evidence that Lsd1 orthologues are essential for viability. Our in vivo work also provides several lines of evidence that SAPHIRE promotes gene activation. While we were preparing this paper for submission, others purified a complex of largely the same composition, localized the complex to promoters, and demonstrated that transcription is reduced at three selected targets in saf110Δ mutants (31). However, our work is unique in its (i) evidence for nucleosome binding, (ii) examination of knockouts of all four complex members, (iii) preparation and analysis of catalytically impaired mutants, (iv) observation of occupancy of the complex at heterochromatin regions, (v) genome-wide analysis of the impact of both saf110Δ and catalytic mutants on transcription, (vi) observation of changes in histone methylation that occur in catalytic mutant strains at occupied genes, (vii) observation of dynamic relocalization of SAPHIRE during gene activation, and (viii) demonstration of a large shift in the euchromatin-heterochromatin boundary at chromosome 1 in catalytic mutants, suggesting a novel role in boundary formation. We discuss aspects of these features and the insights gained below.
Gel filtration of purified SAPHIRE as well as recombinant versions of the complex (data not shown) suggested that SAPHIRE consists of a heterodimer of Saf140p-Saf110p bound to a heterodimer of Saf60p-Saf50p. One curiosity is that while saf140+, saf60+, and saf50+ are essential genes, the saf110Δ strain is viable, though extremely slow growing. One possibility is that the complex can partially form and function without Saf110p. Another possibility is that Saf140p can homodimerize at a low frequency, with the homodimer taking the place of the Saf140p-Saf110p heterodimer in the complex.
Lsd1 is purified in a multiprotein complex named BHC (11). Comparison of BHC and SAPHIRE reveals that SAPHIRE bears proteins with domain architectures similar to those of a subset of BHC subunits. For example, BHC80 is purified with Lsd1 and regulates demethylation by Lsd1 in vitro (44). BHC80 bears a single PHD, analogous to the proteins Saf50p and Saf60p. Also, although Lsd1 does not have an HMG domain (present on Saf140p and Saf110p), the BHC complex contains the HMG domain protein BRAF35 (11). Thus, SAPHIRE may constitute a conserved core demethylase-related complex.
SAPHIRE binds nucleosomes with a fairly high affinity (Kd, ~8 nM). However, our interaction experiments were conducted with recombinant nucleosomes which lacked modifications of their N-terminal tails. Notably, Saf60p and Saf50p both contain a PHD, a motif that was recently shown to bind directly to methylated lysine residues, including H3K4me2 (21, 33, 42, 52). Furthermore, we detected a moderate correlation of SAPHIRE with this mark genome-wide. Thus, a clear line of investigation is to test whether H3K4 methylation enhances SAPHIRE interaction with nucleosomes due to its recognition by the PHD.
Our work provides several lines of evidence that SAPHIRE promotes transcription. First, SAPHIRE occupancy is well correlated with transcriptional activity. However, SAPHIRE is present at only a few hundred genes, and many genes with high levels of transcription are not occupied by SAPHIRE. This suggests that SAPHIRE recruitment and action are selective. At present, factors or DNA sites that account for this selectivity have not been obtained but represent an important area of future work. Interestingly, SAPHIRE localization is dynamic, as SAPHIRE is recruited to promoters during activation. This suggests that SAPHIRE is not a factor that assembles at repressed/basal genes to promote subsequent activation (as observed with the histone variant Htz1) but, rather, is dynamically recruited at the time of activation.
We impaired Saf140p and Saf110p catalytic activities in combination and observed conditional growth phenotypes but not the inviability observed with null alleles. Furthermore, saf110+ deletion had a strong and selective impact on SAPHIRE target transcription, whereas the effect with the impaired alleles was modest. This suggests that Saf140p and Saf110p have very important nonenzymatic roles in transcriptional activation. One possibility is that SAPHIRE components recruit proteins that are important for full transcriptional activation. Another possibility is that SAPHIRE binding to chromatin helps to establish a chromatin state that promotes transcription.
Interestingly, the combined catalytic mutants conferred two effects on histone methylation at SAPHIRE targets, namely, a clear increase in H3K4me2 levels (Fig. 4C and D) and a very slight increase in H3K9me2 levels (data not shown). First, we address the impact on H3K9me. H3K9 methylation is generally correlated with gene repression, and thus removing this mark might promote activation. This is in keeping with one proposed transcriptional activation function for human Lsd1, where Lsd1-mediated H3K9 demethylation contributes to AR target gene activation (27). However, our genome-wide experiments indicated that H3K9me2 is extremely low or absent from virtually all promoters, making H3K9 methylation an unlikely player in general gene activation in S. pombe, and therefore H3K9 demethylation by SAPHIRE is an unlikely regulatory mechanism. Although we did observe a slight increase in H3K9 methylation in our catalytic mutants, this acquired H3K9 methylation signal remained <4% of the levels observed at pericentric heterochromatin or the silent mating loci. In contrast, H3K4 dimethylation is generally correlated with gene activity and/or a chromatin environment permissive for activation. However, in our catalytic mutants, the acquisition of H3K4 methylation was considerable and was correlated with downregulation of SAPHIRE target transcription genome-wide. This result raises the intriguing possibility that H3K4 demethylation can help to activate certain genes. In support of this notion, H3K4 methylation was recently shown to participate in a novel histone deacetylase-dependent gene repression pathway (42), and loss of H3K4 methylation has been correlated with transcriptional activation at certain loci (8, 16, 22, 26). Indeed, since H3K4 methylation is highly regulated both spatially and temporally at transcribed genes, changes in H3K4me2 might have opposite effects at distinct times and places along the gene. For example, at a typical gene in S. cerevisiae, H3K4me2 peaks in the middle of the ORF, and the peak moves closer to the 3′ end as transcription increases. Furthermore, as H3K4me3 peaks at the very 5′ end of the ORF, the promoter becomes devoid of H3K4me2. This orchestration of H3K4 methylation states may need to be properly balanced by the opposing activities of histone methyltransferases and demethylases. Our observation of SAPHIRE at activated genes, promoting their activation through (at least in part) H3K4 demethylation, is strongly supported by genome-wide studies of human Lsd1 showing very similar results, which were published while this paper was in revision (8).
One observation that may shed light on the H3K4 methylation issue is that S. pombe lacks chromatin remodelers of the ISWI family, a family present in both budding yeasts and vertebrates. Remodelers of the ISWI class are primarily linked to the ordering of nucleosomes following replication and also to the ordering of nucleosomes at the promoters of genes. For example, the ISW1a complex in S. cerevisiae is localized to the 5′ end of the MET16 gene, where it helps to promote repression (28). Interestingly, S. pombe has a larger number of chromodomain (CHD)- and PHD-containing remodelers. Since CHDs and PHDs interact with methylated lysines, we speculate that S. pombe may utilize CHD/PHD remodelers to help organize gene 5′ ends (and/or the proximal promoter). We further speculate that demethylation of lysines by SAPHIRE might prevent this repressive remodeling, thus promoting activation.
Our finding that saf140KK→AA saf110KK→AA mutations lead to spreading of telomeric heterochromatin on the right arm of chromosome 1 suggests that SAPHIRE may act as a boundary element. One clear question is why only one telomere is affected. Studies of S. cerevisiae provide the precedent that mutations in telomere boundary elements do not always affect chromatin at all telomeres (17, 48, 54). SAPHIRE does not strongly occupy all telomeric boundaries, although subtelomeric regions are enriched for SAPHIRE targets in comparison to the genome overall (data not shown). In principle, SAPHIRE might directly demethylate histones at the boundary region on Chr1, helping to prevent the propagation of heterochromatin. Alternatively, a locus of active transcription mediated by SAPHIRE in the subtelomeric region might provide a barrier to heterochromatin spreading. Another possibility, consistent with our speculation above regarding H3K4 methylation, is that impaired SAPHIRE activity leads to increased remodeling by CHD class remodelers. These CHD class remodelers might organize nucleosomes in a manner that both promotes repression and favors heterochromatin propagation. These issues can be addressed in future studies.
In conclusion, we describe SAPHIRE as an essential nucleosome-binding complex that dynamically relocalizes to active promoters and demonstrates both catalytic and noncatalytic functions that promote gene expression and chromatin boundary function. Future studies on SAPHIRE will address how it is targeted to particular loci, whether SAPHIRE affects the recruitment of CHD/PHD remodelers, how SAPHIRE contributes to the chromatin boundary at Chr1, and whether SAPHIRE functions at other loci, such as centromeres, ORFs, and the MAT locus.
We thank Brian Dalley for ChIP array development and processing and Brett Milash for database development. We thank Robin Allshire for strains and plasmids.
This work was supported by the Howard Hughes Medical Institute (supplies and support of B.R.C. and D.H.), an NIH Developmental Biology training grant (support of M.G.), and the Huntsman Cancer Institute (support of A.P. and genomics resources). B.T.W. was supported by Sanger postdoctoral and Canadian NSERC fellowships. Research in the Bähler laboratory is funded by Cancer Research UK (CUK) under grant no. C9546/A6517.
Published ahead of print on 19 March 2007.