Purification of SAPHIRE complex.
We recognized two proteins in S. pombe that contained three domains of interest, namely, a SWIRM domain, an FAD-dependent amine oxidase domain, and a high-mobility-group B [HMG(B)] domain. Since the SWIRM and HMG(B) domains are closely associated with chromatin function, we initially pursued these proteins as potential new chromatin enzymes that could be studied in a genetically tractable organism. The amine oxidase domains of these proteins contain the key catalytic residues present in known amine dealkylases, suggesting that these S. pombe proteins have the capacity for amine demethylation. The two S. pombe paralogs are highly similar to each other (BLAST E value, 2 × 10^−84) and share the same domain order. The human protein Lsd1 is highly similar (BLAST E value, 4 × 10^−14) and shares key catalytic residues, the SWIRM domain, and the same domain order with the S. pombe proteins.
To characterize the biochemical properties of these S. pombe NAO paralogs and to identify associated proteins, we purified each to homogeneity. To facilitate their purification, we integrated a DNA sequence into the 3′ end of each gene (separately, at the chromosomal locus) to encode an in-frame fusion of the TAP epitope. Integrations were performed separately in diploids, with sporulation and dissection generating tagged haploid progeny that grew as well as their untagged siblings, demonstrating that the tags do not impair function. Thus, the two strains used for purification contained one tagged NAO and one untagged NAO. Purification of either NAO to homogeneity yielded the same four proteins, as identified by mass spectrometry: these were the tagged NAO, the untagged NAO, and two additional proteins, of 60 and 50 kDa (Fig. ). Mass spectrometric analysis also showed that the additional bands in these lanes are proteolytic products of these four proteins, and thus both purifications yielded the same four-protein complex. We call the complex SAPHIRE, for a SWIRM-amine oxidase and plant homeodomain (PHD) protein complex involved in regulating gene expression (shown below), with members designated by their apparent molecular weights, i.e., Saf140p, Saf110p, Saf60p, and Saf50p. The systematic gene names are as follows: saf140+, SPAC23E2.02; saf110+, SPBC146.09c; saf60+, SPAC30D11.08c; and saf50+, SPCC4G3.07c. Thus, the two NAO paralogs are both present in the isolated complex. Interestingly, the additional 60-kDa and 50-kDa proteins are themselves paralogs (BLAST E value, 7 × 10^−44), and both contain a single PHD (Fig. ). Gel filtration of the entire complex provided an apparent mass of ~350 kDa, suggesting a simple tetramer (data not shown).
FIG. 1. Purification of SAPHIRE, a nucleosome-binding complex. (A) TAP of S. pombe NAO proteins. Only two NAO proteins are encoded by the S. pombe genome. TAP of either protein yielded the same complex, with subunits designated Saf140p, Saf110p, Saf60p, and Saf50p.
SAPHIRE binds nucleosomes.
Since SAPHIRE contains multiple domains for chromatin association, we used electrophoretic mobility shift assays to assess the interaction with mononucleosomes. We found that SAPHIRE binds to mononucleosomes with a Kd of about 8 nM (Fig. and data not shown). Specificity for this interaction was confirmed with a polyclonal antibody raised against Saf110p, which supershifted the nucleosome-SAPHIRE complex (Fig. , lane 3).
To date, the only Lsd1-related protein that has demonstrated histone lysine demethylation activity in vitro is Lsd1 itself. Extensive biochemical testing of both native and recombinant SAPHIRE complexes has not revealed appreciable activity. This is consistent with the work of others on these S. pombe orthologs, showing no observed activity in vitro (31). Therefore, we focused on the biological and genomic impact of removing the conserved catalytic residue of these enzymes, as described below.
SAPHIRE is essential for viability.
To determine the importance of SAPHIRE, we deleted the gene encoding each subunit in diploids. We then induced sporulation of the heterozygous diploids and examined the phenotypes of haploid progeny. We found that saf140+, saf60+, and saf50+ are essential genes, as viability segregated 2:2 in four-spore tetrads (data not shown). Somewhat surprisingly, saf110+/saf110Δ diploids produced four viable spores, with the saf110Δ progeny always displaying extremely slow growth (Fig. ). Since 2:2 segregation of slow growth was observed in a large number of tetrads and was always linked to saf110Δ::kanMX, the phenotype was likely not a result of background mutation. Importantly, normal growth could be restored by transformation with a plasmid bearing only saf110+, confirming the null phenotype (data not shown). In addition to having slow growth, saf110Δ strains are temperature sensitive (Fig. ).
FIG. 2. Characterization of SAPHIRE mutant strains. (A) saf110Δ strains have slow growth and temperature-sensitive phenotypes. Serial fivefold dilutions of wild-type and saf110Δ strains were spotted onto rich medium plates containing glucose and ...
To test whether saf110+ and saf140+ are partially redundant, we overexpressed saf110+ or saf140+ (using high-copy plasmids) in heterozygous diploids (saf110+/saf110Δ or saf140+/saf140Δ) and tested whether we could acquire haploid segregants where the high-copy plasmid conferred growth ability to the reciprocal null segregants. We found that a high-copy saf110+ plasmid will not suppress the inviability of a saf140Δ strain, nor will a high-copy saf140+ plasmid suppress the extremely slow growth of a saf110Δ strain. These experiments suggest that the Saf140p and Saf110p proteins make largely unique contributions to SAPHIRE.
SAPHIRE NAO mutants lacking a conserved catalytic residue are viable.
Saf140p, Saf110p, and Lsd1 share a high degree of similarity to FAD-dependent amine oxidases, such as monoamine oxidase (MAO; dealkylates neurotransmitters) and polyamine oxidase (PAO; dealkylates spermine and spermidine). Although these enzymes have different substrates, they all dealkylate an amine, using flavin reduction to oxidize the amine to an imine, followed by imine hydrolysis to yield two products; in the case of Lsd1, the products derived from monomethyl-lysine are formaldehyde and lysine. The most extensively studied catalytic residue is a specific lysine. In all crystal structures examined, this lysine positions the lone water molecule in the active site into a bridging position with the flavin N-5 atom of the FAD cofactor (2). This lysine is absolutely conserved in all Lsd1 orthologs, all PAOs, and all MAOs and is the only residue in the catalytic pocket that is absolutely conserved in the distantly related sarcosine oxidases. Crystal structures and/or alignments identified the corresponding catalytic lysine as K661 in Lsd1, K300 in maize PAO, K862 in Saf140p, and K604 in Saf110p (Fig. ). In all cases tested, including Lsd1 and maize PAO, mutation of this residue alone produced no detectable activity in the resultant protein.
We determined the importance of this lysine for SAPHIRE function by replacing it with alanine. Since Saf140p and Saf110p have a lysine adjacent to this position, we also replaced K861/K603 with alanine, resulting in the double mutations KK861-862AA and KK603-604AA (KK→AA), to further ensure an impact of these substitutions on catalytic activity. Strains bearing saf140KK→AA and saf110KK→AA were constructed by replacing one of the two wild-type alleles with a KK→AA mutant allele at the endogenous genomic locus in a diploid, inducing the diploid to sporulate, and then isolating haploid KK→AA progeny. Separately, saf140KK→AA and saf110KK→AA mutants were viable, with the saf110KK→AA mutant displaying a moderate temperature sensitivity. Interestingly, the saf140KK→AA saf110KK→AA double mutant was also viable, though it displayed pronounced temperature sensitivity (Fig. ). These phenotypes could not be attributed to a significant reduction in SAPHIRE stability, as SAPHIRE abundance and integrity were similar in the double mutant at the permissive and nonpermissive temperatures (Fig. ), though we have not rigorously quantified the impact on stability by purification of the mutant to homogeneity. No clear phenotype was detected when these strains were tested for growth in the presence of cycloheximide, EGTA, methyl methanesulfonate, H2O2, latrunculin A, or sodium dodecyl sulfate. These results suggest that catalytic impairment does not render cells inviable but rather confers conditional growth defects. Thus, SAPHIRE exhibits an important catalytic activity and an essential noncatalytic activity.
Genome-wide localization of SAPHIRE.
To identify the sites in the genome where SAPHIRE functions, we determined the genome-wide localization of complex members. Genome-wide SAPHIRE occupancy was initially determined with a haploid S. pombe strain during asynchronous growth in rich medium containing glucose. To isolate DNA segments associated with SAPHIRE, we performed ChIP using TAP-tagged Saf140p or Saf110p derivatives. ChIP-enriched fragments from TAP-tagged or untagged strains were labeled with fluorescent dyes (Cy5 and Cy3, alternated between replicates) and used to probe two different DNA array formats for the S. pombe genome. The first array consisted of the entire genome parsed into two types of segments (~11,000 segments; the S array), i.e., ORFs and intergenic regions (IGRs), which were used to create spots on separate glass slides. The initial results derived from this array (described below) motivated questions that required the use of a higher-density array, which consisted of 60-mer single-stranded oligonucleotide tiles (~43,000 oligonucleotides; the O array) that tiled the entire genome at a resolution of ~250 bp on a single glass slide. For each array, the normalized Cy5/Cy3 ratio (tagged/untagged) provided a measurement of SAPHIRE occupancy at each position.
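Because the dyes were alternated between replicates, the raw Cy5/Cy3 ratio must be oriented so that it always reads tagged/untagged before replicates are compared. The following Python sketch illustrates one way to do this. The function names, the example intensities, and the use of median centering as the normalization step are assumptions made for illustration; the normalization procedure actually used is not specified here.

```python
import math
from statistics import median

def log2_occupancy(tagged, untagged):
    """Return median-centered log2(tagged/untagged) ratios for one array.

    Median centering is an assumed normalization, chosen only for illustration.
    """
    raw = [math.log2(t / u) for t, u in zip(tagged, untagged)]
    center = median(raw)
    return [r - center for r in raw]

def from_channels(cy5, cy3, tagged_in_cy5):
    """Orient a (possibly dye-swapped) replicate so the ratio is tagged/untagged."""
    if tagged_in_cy5:
        return log2_occupancy(cy5, cy3)
    return log2_occupancy(cy3, cy5)

# Hypothetical intensities for three spots, hybridized once in each dye orientation.
rep1 = from_channels([400.0, 100.0, 200.0], [100.0, 100.0, 100.0], tagged_in_cy5=True)
rep2 = from_channels([100.0, 100.0, 100.0], [400.0, 100.0, 200.0], tagged_in_cy5=False)
```

In this toy case the two orientations give identical occupancy values, which is the property that dye alternation is meant to allow one to check.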
SAPHIRE occupancy determinations were reproducible and specific. For example, with the S array, three biological replicates of Saf140p-TAP and (separately) Saf110p-TAP yielded Pearson correlation coefficients (r) of ≥0.97 (data not shown). Furthermore, Saf140p-TAP and Saf110p-TAP appear to occupy the same DNA segments, as a comparison of their data sets yielded an r^2 value of 0.91 (Fig. ). For the S array, we defined segments that displayed enrichment above the 95.5th percentile as SAPHIRE occupied, which yielded a total of 119 occupied loci. For the O array, the higher density required alternative criteria to define occupancy: a tile was considered occupied if the tile (i) displayed a signal intensity well above background, (ii) displayed an average enrichment of at least threefold (tagged/untagged) over multiple biological replicates, (iii) displayed an enrichment of at least twofold in a majority of individual replicates, and (iv) was adjacent to a tile that displayed an average enrichment of at least twofold. This stringent set of criteria was designed to ensure a very small number of false-positive results. By these criteria, 426 separate loci are occupied by SAPHIRE genome-wide. As expected, the targets identified by the S array are a subset of those identified by the better-resolved O array.
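The four O array criteria can be expressed as a simple filter over tiles in chromosomal order. The Python sketch below is illustrative only. It assumes that criterion (i) has already been reduced to a per-tile boolean (`signal_ok`), that enrichments are averaged as linear fold values, and that "adjacent" means the immediately neighboring tile on either side; the data layout and function names are hypothetical.

```python
from statistics import mean

def tile_passes(ratios, signal_ok, avg_cutoff=3.0, rep_cutoff=2.0):
    """Apply criteria (i)-(iii) to one tile.

    ratios: tagged/untagged fold enrichment, one value per biological replicate.
    signal_ok: True if the signal intensity is well above background (assumed precomputed).
    """
    if not signal_ok:                      # criterion (i)
        return False
    if mean(ratios) < avg_cutoff:          # criterion (ii): average >= threefold
        return False
    n_above = sum(r >= rep_cutoff for r in ratios)
    return n_above > len(ratios) / 2       # criterion (iii): majority >= twofold

def occupied_tiles(tiles, neighbor_cutoff=2.0):
    """Return one boolean per tile; tiles must be in chromosomal order."""
    calls = []
    for i, tile in enumerate(tiles):
        neighbors = [tiles[j] for j in (i - 1, i + 1) if 0 <= j < len(tiles)]
        # criterion (iv): an adjacent tile averages at least twofold
        has_neighbor = any(mean(n["ratios"]) >= neighbor_cutoff for n in neighbors)
        calls.append(tile_passes(tile["ratios"], tile["signal_ok"]) and has_neighbor)
    return calls

# Hypothetical run of seven consecutive tiles, three replicates each.
example = [
    {"ratios": [1.0, 1.1, 0.9], "signal_ok": True},
    {"ratios": [2.5, 2.2, 2.4], "signal_ok": True},
    {"ratios": [3.5, 3.0, 4.0], "signal_ok": True},
    {"ratios": [3.2, 3.4, 1.5], "signal_ok": True},
    {"ratios": [4.0, 4.0, 4.0], "signal_ok": True},
    {"ratios": [1.0, 1.0, 1.0], "signal_ok": True},
    {"ratios": [5.0, 5.0, 5.0], "signal_ok": True},
]
calls = occupied_tiles(example)
```

In the example, the last tile is strongly enriched but is rejected because its only neighbor is not enriched, which shows how criterion (iv) suppresses isolated single-tile signals.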
FIG. 3. Genome-wide localization of SAPHIRE. (A) Saf140p and Saf110p colocalize in the genome. The scatter plot depicts each unique tile on the S array as a point on the graph. The x and y coordinates of each point are equal to the log2 ratio of Saf140p and Saf110p ...
SAPHIRE occupancy at chromosomal loci.
In general, we found SAPHIRE distributed widely throughout the genome and dispersed at genes over all three chromosomes. We also observed SAPHIRE at several regions associated with heterochromatin. For example, we observed SAPHIRE in the region of the silent mating (MAT) locus and in peaks in the subtelomeric regions of chromosomes 1 and 2 (presented below). In addition, SAPHIRE occupies the junctions between the central region of the centromeres and the dg/dh repeats (data not shown). Notably, these regions also bear clusters of tRNAs which bind TFIIIC, and both tRNAs and TFIIIC sites themselves have demonstrated chromatin boundary functions (32). However, their relationship to SAPHIRE has not been explored. Notably deficient in SAPHIRE are the rRNA gene repeats at the distal tips of chromosome 3.
SAPHIRE occupies promoters.
Since the majority class of SAPHIRE targets comprises genes, we examined whether SAPHIRE prefers ORFs or IGRs. In contrast to the S array, the O array included all ORF and IGR tiles on the same slide, allowing a direct comparison of these two classes. On the O array, tiles representing ORFs and IGRs comprised 65% and 35% of the total tiles, respectively. However, if we examined only SAPHIRE-occupied tiles, the distribution changed to 18% ORFs and 82% IGRs (which is highly statistically significant [P < 0.0001]). Also, nearly all ORF tiles in the SAPHIRE-occupied class clustered at the extreme 5′ and 3′ ends of the ORFs, indicating that this occupancy was likely bleeding from IGR occupancy due to limitations in shearing resolution (data not shown). Thus, the vast majority of SAPHIRE occupancy is in IGRs.
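The shift from 65%/35% to 18%/82% can be checked with a goodness-of-fit test. The Python sketch below uses a chi-square test with one degree of freedom. Both the choice of test and the numbers of occupied tiles are assumptions, since neither the test applied nor the tile count is stated here; only the percentages are taken from the text.

```python
import math

def chi_square_two_class(observed, expected_fractions):
    """Chi-square goodness of fit for two classes (1 degree of freedom).

    Returns (statistic, p_value). For 1 df the upper-tail probability
    equals erfc(sqrt(statistic / 2)).
    """
    total = sum(observed)
    stat = 0.0
    for obs, frac in zip(observed, expected_fractions):
        exp = total * frac
        stat += (obs - exp) ** 2 / exp
    return stat, math.erfc(math.sqrt(stat / 2))

background = (0.65, 0.35)  # ORF and IGR share of all O array tiles (from the text)

# Hypothetical counts of occupied tiles, split 18% ORF / 82% IGR.
stat_large, p_large = chi_square_two_class((180, 820), background)
stat_small, p_small = chi_square_two_class((9, 41), background)
```

Under these assumptions the difference remains significant at P < 0.0001 even if only 50 occupied tiles are considered, so the conclusion does not depend strongly on the unknown tile count.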
We next determined the extents to which SAPHIRE occupies the three different IGR classes, namely, (i) single promoters, which are flanked by one ORF 5′ end and one ORF 3′ end; (ii) double promoters, which are flanked by two ORF 5′ ends; and (iii) nonpromoters, which are flanked by two ORF 3′ ends. In eukaryotic genomes, these three classes partition as would be predicted by random orientation relationships, i.e., 25%, 50%, and 25% (double, single, and nonpromoters, respectively). However, note that in S. pombe, double promoters are longer and nonpromoters are shorter, on average, than single promoters. Since the O array had a relatively fixed number of tiles per length of DNA, more tiles were present in double promoters and fewer were present in nonpromoters. Thus, the O array tile ratio was 39% to 47% to 14% for the genome as a whole (double, single, and nonpromoters, respectively). However, we found that the partitioning of SAPHIRE-occupied loci was 46%, 46%, and 8%, and this change in distribution favoring promoters was highly significant (Fig. ). Thus, SAPHIRE primarily occupies promoters.
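The 25%, 50%, and 25% expectation follows from enumerating the four equally likely strand combinations of the two genes flanking an IGR. The short Python sketch below makes that reasoning explicit; the function and variable names are illustrative.

```python
from collections import Counter
from itertools import product

def classify_igr(left_strand, right_strand):
    """Classify the IGR between a left gene and a right gene.

    A left gene on the '-' strand and a right gene on the '+' strand
    each present their 5' end to the IGR.
    """
    five_prime_ends = (left_strand == "-") + (right_strand == "+")
    return {2: "double", 1: "single", 0: "nonpromoter"}[five_prime_ends]

# All four orientation pairs, assumed equally likely.
counts = Counter(classify_igr(left, right) for left, right in product("+-", repeat=2))
fractions = {name: n / 4 for name, n in counts.items()}
```

Divergent gene pairs give double promoters, convergent pairs give nonpromoters, and the two tandem arrangements give single promoters, which is why the single class is twice as frequent as either of the others.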
SAPHIRE occupancy is positively correlated with gene activity.
We next examined whether SAPHIRE localizes to active or inactive genes genome-wide. Here we compared SAPHIRE promoter occupancy with transcription of the downstream ORF. Promoters were defined as ~1-kb regions spanning positions −800 to +200 relative to the translation initiation codon, as our array resolution is ~250 bp. To ensure the proper attribution of SAPHIRE to a target gene, only single promoters were included in this analysis. SAPHIRE ChIP ratios for each tile were averaged for these single promoters and collated with their respective downstream genes for occupancy-expression correlations.
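The promoter averaging step can be sketched as follows in Python. The sketch assumes that each tile is assigned to a promoter by its midpoint coordinate and that the window bounds are inclusive; the function name, data layout, and example values are hypothetical.

```python
from statistics import mean

def promoter_occupancy(tiles, atg, strand, upstream=800, downstream=200):
    """Average the log2 ChIP ratio of tiles whose midpoint lies in the promoter window.

    tiles: iterable of (midpoint_coordinate, log2_ratio) pairs.
    atg: chromosomal coordinate of the translation initiation codon.
    strand: '+' or '-'; on '-' the upstream direction is toward higher coordinates.
    Returns None if no tile falls within the window.
    """
    values = []
    for midpoint, log2_ratio in tiles:
        rel = (midpoint - atg) if strand == "+" else (atg - midpoint)
        if -upstream <= rel <= downstream:
            values.append(log2_ratio)
    return mean(values) if values else None

# Hypothetical tiles spaced at roughly the stated array resolution.
tiles = [(9000, 0.1), (9300, 1.2), (9800, 2.0), (10100, 1.6), (10500, 0.2)]
plus_value = promoter_occupancy(tiles, atg=10000, strand="+")
minus_value = promoter_occupancy(tiles, atg=10000, strand="-")
```

The same tiles give different averages for a plus-strand and a minus-strand gene, which illustrates why the analysis was restricted to single promoters, where the window belongs to one gene only.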
We then compared SAPHIRE occupancy to two different measures of gene activity, namely, steady-state mRNA levels and Pol II occupancy. For mRNA levels, expression profiling by microarray analysis was performed under the identical conditions utilized for SAPHIRE occupancy determinations, and the relative normalized fluorescence for the wild-type channel was used to quantify the relative amount of each transcript. Our unpublished data show that this measure is a generally accurate representation of transcript abundance. However, differences in hybridization efficiency can also affect the relative fluorescence of some transcripts with equal representation in the RNA pool. Therefore, we determined the genome-wide occupancy of RNA Pol II and used the average Pol II occupancy over the entire ORF of each gene as an alternative measure of the transcription rate. We found that SAPHIRE occupancy is strongly correlated with transcription when either Pol II occupancy or mRNA abundance is examined (Fig. and data not shown).
The data presented in Fig. are useful for examining activity-occupancy relationships, but as they represent moving averages for many genes, they do not reveal how uniformly individual genes conform to this correlation. To examine this, we compared the distribution of SAPHIRE-occupied loci versus that of all loci to Pol II occupancy. Note that because only ORF class Pol II was examined (intergenic regions were omitted), the distribution curve does not center around zero Pol II occupancy (Fig. ). We found that most SAPHIRE targets are highly transcribed, as opposed to the small percentage of the total genome with a high transcription rate. Also, although a small number of SAPHIRE-occupied genes display somewhat less than the average amount of Pol II, we did not find SAPHIRE at any highly repressed promoters. Furthermore, we found that SAPHIRE targets represent only a fraction of all highly expressed genes (defined as genes with average Pol II ORF enrichment of ≥2.8-fold [log2 value, ≥1.5]) (Fig. ). One particularly large class which is notably deficient in SAPHIRE is the ribosomal protein genes, the most transcriptionally active Pol II gene class in the genome. If we removed this coordinately regulated class of genes from the analysis, SAPHIRE occupied a very significant fraction of the remaining highly transcribed genes (Fig. ). Intriguingly, the ribosomal protein genes share a cis promoter sequence specific to S. pombe, i.e., a homology D box, that is absent from most SAPHIRE targets. Thus, promoter architecture differences may determine whether an active gene recruits SAPHIRE.
Our examination of the Pol II genome-wide occupancy allowed us to approximate the transcription start sites (TSS) of highly transcribed genes to within ~125 bp. We defined the TSS as the location of the first tile upstream of an expressed gene with >2-fold occupancy. Then, selecting SAPHIRE-occupied genes with known TSS (45 targets), we compared the peak of SAPHIRE occupancy to the TSS. We found that SAPHIRE localizes at or just upstream of the TSS and falls sharply at locations following the TSS (Fig. ).
Dynamic relocalization of SAPHIRE during heat shock.
To determine whether SAPHIRE is dynamically recruited to genes undergoing activation, we examined SAPHIRE occupancy in cells subjected to 15 min of heat shock, a treatment known to impact the transcriptional activity of hundreds of genes. Remarkably, SAPHIRE was redistributed to genes undergoing activation at a magnitude proportional to their activation (Fig. ). In this case, heat shock genes (such as hsp16+) that are bound by heat shock transcription factor (HSTF) were among the acquired targets. However, new targets also included many genes that are not believed to be bound by HSTF, suggesting that several types of promoters recruit SAPHIRE when they are activated. As a test for HSTF recruitment, we placed ectopic HSTF sites on a plasmid but did not observe heat shock-dependent recruitment of SAPHIRE, suggesting that recruitment may require both enhancer elements and promoter elements. As observed at 32°C, SAPHIRE occupied many, but not all, of the genes highly activated by heat shock, suggesting that SAPHIRE targeting is selective (data not shown). Taken together, these results demonstrate that SAPHIRE is a dynamic complex that is recruited to the region near the TSS of particular genes during the activation process.
FIG. 4. SAPHIRE regulation of target gene transcript levels is associated with changes in histone methylation. (A) During heat shock, SAPHIRE dynamically relocates to new sites of active transcription. The log2 ratio of the change in transcript level was plotted ...
Target genes require SAPHIRE for full transcriptional activation.
We then took a genome-wide approach to determine whether SAPHIRE is required for the activation of target genes. If SAPHIRE assists in activation, then target genes should be selectively attenuated in saphire mutants. To test this, we performed mRNA expression profiling (microarray analysis) of wild-type, saf140KK→AA saf110KK→AA, and saf110Δ strains. Genes occupied by SAPHIRE displayed reduced transcription in the mutants, whereas unoccupied genes remained largely unaffected. The impact of the saf140KK→AA saf110KK→AA double mutation was moderate, whereas that of the saf110Δ mutation was much more pronounced (Fig. ). Although the saf110Δ mutation confers very slow growth, the observation of a selective impact on the transcription of SAPHIRE targets strongly supports the notion that the primary effect of the saf110Δ mutation is a reduction in the activation of SAPHIRE targets. Furthermore, although the impact of the saf140KK→AA saf110KK→AA double mutation is modest, it suggests that SAPHIRE catalytic activity also makes a contribution to target gene activation. Taken together, these results strongly suggest that SAPHIRE-occupied genes rely on SAPHIRE for their full activation and that contributions are made by both the catalytic function and the noncatalytic function.
Catalytic impairment increases H3K4 methylation levels at SAPHIRE targets.
Since Lsd1 is a known H3 demethylase, we correlated H3 methylation levels with SAPHIRE occupancy genome-wide. To this end, we determined the genome-wide patterns of H3K4me2 and H3K9me2. To properly interpret H3K4me2 and H3K9me2 ChIP experiments, we normalized all data to H3 occupancy, determined by multiple independent genome-wide ChIP analyses (using a C-terminal anti-H3 antibody). Genome-wide examination revealed an extremely weak correlation of SAPHIRE occupancy with H3K9me2 (data not shown), which was due to SAPHIRE occupancy at or adjacent to particular heterochromatic loci. However, we observed that the levels of H3K9me2 at promoters and ORFs were extremely low compared to the levels at heterochromatic loci (<1%) or in comparison to H3K4me2 levels. Therefore, we focused our analysis on the impact on H3K4 methylation. For H3K4 methylation, we focused on the trends observed at loci that contain SAPHIRE (i.e., display a log2 occupancy ratio of ≥0). In this analysis, we observed a modest positive correlation of SAPHIRE occupancy with H3K4me2 genome-wide (Fig. ) but a modest negative correlation at promoters (Fig. ). This raises the possibility that SAPHIRE interacts with H3K4me2 regions but promotes H3K4 demethylation selectively at promoters. To address this possibility, we examined H3K4me2 genome-wide in wild-type cells and in our mutants bearing substitutions in the conserved catalytic residues. Interestingly, H3K4me2 increased at SAPHIRE targets in the saf140KK→AA saf110KK→AA double mutant, consistent with impairment of an H3K4 demethylation function (Fig. ). Taken together, these results suggest that misregulation of H3K4 demethylation at promoters is correlated with reduced transcriptional activation (see Discussion).
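Normalizing a histone modification signal to H3 occupancy corrects for differences in nucleosome density between loci. A minimal Python sketch is given below. It assumes that all ChIP values are log2 enrichment ratios, so that the normalization is a subtraction, and that the independent H3 data sets are averaged per tile before use; both points are assumptions, as are the function name and example values.

```python
from statistics import mean

def normalize_to_h3(mark_log2, h3_replicates_log2):
    """Subtract the mean H3 log2 enrichment from a histone-mark log2 enrichment.

    mark_log2: per-tile log2 ChIP ratios for the modification (e.g., H3K4me2).
    h3_replicates_log2: list of per-tile log2 H3 ChIP ratios, one list per replicate.
    """
    h3_mean = [mean(values) for values in zip(*h3_replicates_log2)]
    return [m - h for m, h in zip(mark_log2, h3_mean)]

# Hypothetical values for three tiles and two H3 replicates.
k4me2 = [2.0, 1.0, 0.0]
h3 = [[1.0, 1.0, -1.0], [1.0, 0.0, -1.0]]
normalized = normalize_to_h3(k4me2, h3)
```

The third tile shows no raw H3K4me2 enrichment, yet after normalization it is as enriched as the first, because it carries less H3; this is the type of misreading that the normalization is intended to prevent.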
saf140KK→AA saf110KK→AA mutations cause spreading of telomeric heterochromatin.
At present, little is known about the factors which restrict the spreading of telomeric heterochromatin in S. pombe. The telomere-proximal regions of all three S. pombe chromosomes bear repetitive segments with high levels of H3K9 methylation, which decrease in telomere-distal regions as unique segments are encountered. Although H3K9me2 was not significantly affected at SAPHIRE targets genome-wide (data not shown), perturbations were observed at particular loci. The clearest example, by far, is a major change in the heterochromatin-euchromatin boundary of the right telomere of chromosome 1 in the saf140KK→AA saf110KK→AA double mutant (Fig. ). In this double mutant, H3K9me2 spread ~30 kb past its normal boundary (Fig. ). We recognized that certain telomeric repeat elements (and certain genes and pseudogenes) in S. pombe are also located on other telomeres, which can prevent the attribution of ChIP enrichment to a particular locus. However, the affected region mapped almost exclusively with unique O array tiles, showing that the spreading observed can be attributed unambiguously to the right subtelomeric region of chromosome 1 (Fig. ). Interestingly, examination of this subtelomeric region reveals a clear peak of SAPHIRE that coincides with the location where H3K9me2 begins to spread in the mutant (Fig. ). This peak of SAPHIRE occupancy includes three consecutive tiles with ≥2-fold occupancy that are unique in the genome. This raises the interesting possibility that the catalytic activity of SAPHIRE helps to promote boundary activity at this telomere.
FIG. 5. SAPHIRE regulates telomeric heterochromatin. (A) Physical map of the right subtelomeric region of chromosome 1. The map corresponds to coordinates 5522385 to 5579828 in the Sanger Institute chromosome 1 virtual contig. A scale drawing of genetic elements ...