A centromere is a chromosomal region on which several proteins assemble to form the kinetochore. The centromere-kinetochore complex helps in the attachment of chromosomes to spindle microtubules to mediate segregation of chromosomes to daughter cells during mitosis and meiosis. In several budding yeast species, the centromere forms in a DNA sequence-dependent manner, whereas in most other fungi, factors other than the DNA sequence also determine the centromere location, as centromeres were able to form on nonnative sequences (neocentromeres) when native centromeres were deleted in engineered strains. Thus, in the absence of a common DNA sequence, the cues that have facilitated centromere formation on a specific DNA sequence for millions of years remain a mystery. Kinetochore formation is facilitated by binding of a centromere-specific histone protein member of the centromeric protein A (CENP-A) family that replaces a canonical histone H3 to form a specialized centromeric chromatin structure. However, the process of kinetochore formation on the rapidly evolving and seemingly diverse centromere DNAs in different fungal species is largely unknown. More interestingly, studies in various yeasts suggest that the factors required for de novo centromere formation (establishment) may be different from those required for maintenance (propagation) of an already established centromere. Apart from the DNA sequence and CENP-A, many other factors, such as posttranslational modification (PTM) of histones at centric and pericentric chromatin, RNA interference, and DNA methylation, are also involved in centromere formation, albeit in a species-specific manner. In this review, we discuss how several genetic and epigenetic factors influence the evolution of structure and function of centromeres in fungal species.
A complete understanding of the complexities of the process of cellular differentiation requires a thorough analysis of the molecular events occurring during eukaryotic cell division. As an important part of this process, a cell has to ensure accurate segregation of duplicated chromosomes into its progeny cells. In eukaryotes, specific DNA sequences, and the factors that bind to them immediately after replication, partly dictate the state of chromatin. Apart from the genetic factors, many epigenetic phenomena also contribute to formation of specialized chromatin required for specific functions. One such specialized chromatin domain, the centromere (CEN)-kinetochore complex, plays a crucial role in high-fidelity chromosome segregation and has great implications for human health. A consequence of improper chromosome segregation is the abnormal chromosome numbers associated with most human cancers. For example, in human colorectal cancers, almost 85% of cells are aneuploid, with 60 to 90 chromosomes (128).
The high-fidelity chromosome segregation that occurs during mitosis and meiosis requires a functional centromere, defined as the primary constriction on a chromosome. The centromere is a region where spindle fibers attach to bring about congression and subsequent movement of the chromatids to opposite poles during the anaphase stage of the cell cycle. In organisms for which meiotic analysis is possible, a centromere can be defined genetically as a chromosomal region where the mutant and wild-type alleles of a diploid heterozygous genetic marker always separate from each other in the first meiotic division. A centromere region is generally transcriptionally inactive, gene-poor, recombination deficient, and heterochromatic in nature. From the cell biologist's point of view, the centromere is an essential specialized chromosomal locus upon which the kinetochore, a macromolecular proteinaceous structure that links a chromosome to spindle microtubules, assembles and powers segregation of chromosomes in high fidelity during cell division.
Even though we cannot visualize a centromere on the small and relatively uncondensed chromosomes of certain lower eukaryotes such as yeasts, we can recognize its presence, localize it accurately, isolate it physically on a cloned segment of DNA, and study its structure and function. More than 30 years ago, the discovery of a functional centromere in a short sequence in the unicellular budding yeast Saccharomyces cerevisiae first revealed the molecular nature of a centromere at the DNA sequence level (26, 27). Molecular determination of the presence of centromeres in a few other model organisms, followed by the advent of the availability of genome sequence information, fuelled identification and characterization of centromeres for a large number of eukaryotes. Functional identification of centromeres from various organisms relied on many distinct properties, such as the presence of a region defined by tetrad analysis or binding of an evolutionarily conserved kinetochore protein or even a minimal region that can provide stable inheritance through mitosis and meiosis to an otherwise unstable piece of naked DNA introduced into a cell by chemical transformation. However, the factors that are required for de novo assembly of a functional centromere on an artificially introduced piece of naked DNA (centromere establishment) and those that support stable inheritance of a natural or artificial chromosome with an already-established centromere (centromere propagation) may not always be the same.
Centromeres usually occur once per chromosome. Cloning and characterization of centromeres from many organisms have shown that, while functional centromeres are contained in short sequences with conserved DNA motifs in a few budding yeasts, most other organisms carry longer centromere sequences that are usually rich in repeat DNA elements. Interestingly, the length of an individual repeat unit in centromeres in many organisms, including humans (171 bp), mice (120 bp), and Arabidopsis spp. (178 bp), is approximately equal to that of a nucleosome (1, 32, 52, 68, 136). However, centromere functions are strictly species specific, and there is no common sequence determinant that specifies the centromeres of all organisms. Even after decades of research since its discovery, these diverse properties mark the centromere as one of the most mysterious regions of a chromosome.
What could be the most relevant general definition of a centromere today? A serendipitous finding of anti-centromere antibodies isolated from human patients with an autoimmune disease called CREST led to identification of several centromeric proteins (CENPs), including CENP-A (36, 96). Subsequently, members of the CENP-A family of proteins were identified in many organisms as centromere-specific histone H3 (CenH3) variants. CENP-A molecules form specialized centromeric chromatin and are present at the functional centromeres. Thus, a centromere can be redefined as the binding region of CENP-A in a eukaryotic chromosome. It has been observed that, in S. cerevisiae, CENP-A/Cse4 also gets recruited at a low level to other noncentromeric loci, indicating that there must be tight epigenetic regulation of formation of the complete kinetochore architecture exclusively at the native centromere locus that acts as the sole microtubule attachment site (16, 71). In this review, we focus on the wide diversity and evolution of structure and function of the centromere (CEN) DNA and of the associated CENP-A-containing chromatin in various fungal species.
The centromere DNAs of a few budding yeasts of the Hemiascomycetes, including S. cerevisiae, have been characterized and found to be contained in a relatively short stretch (<400 bp) of DNA. Members of this class of centromeres are often referred to as “point” centromeres. Since conserved DNA motifs serve as the binding sites of specific kinetochore proteins, formation of point centromeres is directed by the DNA sequence and is thus genetically determined. Plasmids carrying an autonomously replicating sequence (ARS) that serves as the DNA replication origin and the centromere (CEN) behave like aneuploid chromosomes (minichromosomes) that segregate normally through mitosis and meiosis. Yeast Artificial Chromosomes (YACs) can be circular or linear with telomeres at the ends and are useful tools for cloning large pieces of heterologous DNA that can transmit faithfully in mitosis.
The centromere DNA in S. cerevisiae is the best-studied example in this class (Fig. 1A). The 125-bp CEN DNA contains three consensus Centromeric DNA Elements (CDEs). The central element (CDEII) is a nonconserved 78- to 86-bp-long AT-rich (>86%) sequence (41, 57). CDEII acts as a “spacer” element and is flanked by two conserved motifs, the 8-bp CDEI (PuTCACPuTG) and the 25-bp CDEIII [TGTTT(T/A)TGNTTTCCGAAANNNAAAAA]. The CDEIII sequence is an imperfect palindrome. The role of these conserved elements has been studied extensively (28, 33). Deletion of CDEI causes a marginal (20-fold) increase in chromosome missegregation in mitosis, suggesting that CDEI is not absolutely essential for centromere function. Interestingly, deletion of CDEI or certain alterations of CDEI cause premature sister chromatid separation in meiosis I (33). Changes in the length or base composition (AT richness) of CDEII cause only partial inactivation, but a portion of CDEII is essential for CEN function. CDEIII is absolutely essential. Deletion of CDEIII, or even single-base substitutions in the central CCG sequence, completely abolishes CEN function (79, 89).
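The CDEI-CDEII-CDEIII layout described above can be illustrated with a small sequence scan. The following is only a sketch under stated assumptions, not a published centromere-prediction method: the consensus strings are transcribed from the text into IUPAC codes (R for "Pu", W for "(T/A)"), the function names and thresholds are hypothetical, and only the given strand is searched.

```python
import re

# IUPAC codes needed for the consensus strings below.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]",    # purine, written "Pu" in the text
    "W": "[AT]",    # written "(T/A)" in the text
    "N": "[ACGT]",  # any base
}

# Consensus strings transcribed from the text (assumed transcription).
CDEI = "RTCACRTG"
CDEIII = "TGTTTWTGNTTTCCGAAANNNAAAAA"


def consensus_to_regex(consensus):
    """Translate an IUPAC consensus string into a compiled regex."""
    return re.compile("".join(IUPAC[base] for base in consensus))


def find_cen_candidates(seq, min_cdeii=78, max_cdeii=86, min_at=0.86):
    """Return (start, end, cdeii_at_fraction) for each CDEI-CDEII-CDEIII hit.

    A hit needs a CDEI match, a spacer of min_cdeii..max_cdeii bases whose
    AT fraction exceeds min_at, and a CDEIII match right after the spacer.
    """
    seq = seq.upper()
    cdei_re = consensus_to_regex(CDEI)
    cdeiii_re = consensus_to_regex(CDEIII)
    hits = []
    for m1 in cdei_re.finditer(seq):
        for gap in range(min_cdeii, max_cdeii + 1):
            start3 = m1.end() + gap
            m3 = cdeiii_re.match(seq, start3)
            if m3 is None:
                continue
            cdeii = seq[m1.end():start3]
            at = (cdeii.count("A") + cdeii.count("T")) / len(cdeii)
            if at > min_at:
                hits.append((m1.start(), m3.end(), at))
    return hits
```

Because the CDEIII pattern spells out the central CCG literally, a sequence with a substitution in that triplet is not reported, which loosely mirrors the observation that such substitutions abolish CEN function. A pattern match is of course no evidence of function in vivo.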
Genome sequences of three other species of the Saccharomyces genus, S. bayanus, S. mikatae, and S. paradoxus, are available, but they are not completely annotated. On the basis of the sequence similarity and synteny of genes of these species in comparison to S. cerevisiae, putative CEN regions with CDEI-CDEII-CDEIII motifs were identified in all 16 chromosomes of each of these three organisms (65, 75, 81). In contrast, S. castellii does not seem to have a point centromere-like sequence (29). Centromeres in a pathogenic budding yeast, Candida glabrata, also contain elements highly homologous to those of S. cerevisiae (69). In this case, a 153-bp fragment accommodates all three DNA elements: CDEI (8 bp) and CDEIII (18 bp) are well conserved among all the chromosomes, while CDEII (77 to 79 bp) is AT rich (83 to 93%) (69). Like S. cerevisiae CDEIIIs, C. glabrata CDEIIIs have a central CCG sequence (35, 69). Both CDEI and CDEIII are required for centromere function. Some mutations in CDEIII cause a complete loss of centromere activity. The C. glabrata centromeres do not function in S. cerevisiae, suggesting that, in spite of the highly similar sequences, the centromere function is species specific. Kluyveromyces lactis and Ashbya gossypii (Eremothecium gossypii) also contain point centromeres. Mutual comparisons of the nucleotide sequences of K. lactis centromeres led to identification of consensus elements similar to those of S. cerevisiae (54, 55, 81). In K. lactis, a longer CDEII element (KlCDEII [160 bp long; ~90% AT rich]) that is almost double the size of the S. cerevisiae CDEII is flanked by two short, highly conserved boxes, CDEI and CDEIII. CDEIII also contains a conserved CCG element. An additional 100-bp common AT-rich element (CDE0) is present 150 bp upstream of CDEI. Centromeres of K. lactis are nonfunctional in S. cerevisiae and vice versa. Both the lengths and sequences of CDEI and CDEIII are essential for proper centromere function in K. lactis. Centromeres in A. gossypii are very similar to those in K. lactis, as the lengths of CDEII in both these organisms are approximately the same and double the length of the S. cerevisiae CDEII (34, 81). Like those of S. cerevisiae, the CEN sequences of C. glabrata and K. lactis are able to provide mitotic stability to an ARS plasmid (54, 69).
Centromeres in two other budding yeasts, Yarrowia lipolytica and Candida maltosa, have features that are slightly different from those of a point centromere. Centromeres in Y. lipolytica span up to ~200 bp but, surprisingly, lack any consensus sequences such as those seen in CDEI or CDEIII. Nonetheless, the AT richness of the Y. lipolytica centromere DNA is similar to that of S. cerevisiae CDEII (35, 43). Interestingly, the functions of ARS and CEN are interdependent in Y. lipolytica (131). The centromere DNA in C. maltosa contains consensus CDEI and AT-rich CDEII elements, but a conserved element equivalent to CDEIII is absent. However, a 325-bp region is sufficient to provide an ARS plasmid with high mitotic stability (94).
Most organisms, except certain budding yeasts with short sequence-dependent point centromeres as described above, possess longer (>40 kb to a few megabases) centromere DNAs that are usually highly repetitive and heterochromatic in nature. These centromeres are often referred to as “large regional” centromeres. Due to the absence of the motifs of the DNA binding proteins that are exclusively present in all centromere regions, the formation of centromeres in this class is probably mediated by a sequence-independent epigenetic mechanism that may or may not be conserved among all the species.
Centromeres of the fission yeast S. pombe are the best-studied centromeres in this class. Each centromere in this organism, originally identified in a DNA region of 40 to 110 kb, is organized symmetrically in a 10-to-15-kb CENP-A-rich chromatin that is flanked by an ~10-to-60-kb pericentric heterochromatin (119, 124, 138) (Fig. 1A). A nonhomologous unique sequence of 4 to 7 kb forms the central core (cnt or cc). The central cores of chromosome 1 (cc1) and 3 (cc3) have almost identical 3.3-kb-long “tm” elements. cc2 contains a 1.5-kb-long DNA element that is 48% identical to tm (138). Each cc is flanked on either side by innermost repeats (imr; also called B repeats) that have a unique sequence in each centromere. The DNA that contains cc and imr is flanked on either side by outer repeats (otr) that include “dg” and “dh” elements (also called K and L repeats) (10, 25, 40, 87, 99). The requirement of various centromere DNA elements for its activity was determined by a plasmid-based assay using cen2 of S. pombe (10). That study revealed that cc and K repeats are essential and sufficient for maintaining an active centromere in S. pombe. All other elements are dispensable and not sufficient to complement the function of cc and K repeats. Even the cc region reveals functional redundancy, because deletion of short sequences from any part of this region does not cause a significant reduction in centromere function. Intriguingly, deletion of the entire cc results in abolishment of centromere function. The lack of a conserved sequence of any considerable length among all the three cc regions implies that the part of cc that is critical for CEN function in S. pombe is very small or degenerate.
The centromere in Neurospora crassa, a multicellular filamentous ascomycete, is another example of this class (21, 24, 115) (Fig. 1A). The centromere DNA is contained within a 300-kb region which is AT rich and recombination deficient and consists of centromere-specific repeat sequences. Unlike those of S. pombe, N. crassa CENs have no inverted repeats. The centromere-specific repeats are highly divergent, indicating that repeat-induced point mutation (RIP) (111, 112), which causes C-to-T transitions at repetitive sequences with high frequency, is active in these loci (24). Further sequence analysis of a 16.1-kb region in N. crassa CEN7 revealed that it contains a cluster of three retrotransposon-like elements (Tcen, Tgl1, and Tgl2) along with sequences of degenerate fragments of Tad, a previously characterized LINE retrotransposon (21).
Putative centromeres in another filamentous ascomycetous fungus, Aspergillus nidulans, are predicted to be associated with large repeat sequences which are less prone to meiotic recombination and consist of two degenerated LTR retrotransposons, Dane1 and Dane2 (Degenerated Aspergillus nidulans element). The pattern of degeneration suggests that, as seen with N. crassa, a process of RIP may be active in these sequences (2, 91).
The centromeric regions in the basidiomycetous fungus Cryptococcus neoformans, a human pathogen, were predicted by sequence analysis (74). Transposon (Tcn5, Tcn6)-rich gene-free regions of 40 to 110 kb, present once in each chromosome, are considered presumptive centromeres. This analysis was further supported by linkage studies of two marker genes, URA5 and ADE2, which have been found to be linked to centromeres (60). Direct experimental evidence to define these putative regions as centromeres is still lacking.
The putative centromeres in another basidiomycete, Coprinus cinereus, a mushroom, lie within the transposon clusters that are localized at the less meiotic recombination-prone regions of the genome (117). These DNA elements can correspond to the sequence-independent regional centromeres, which are common in the fungi such as C. neoformans and N. crassa (24, 74, 115).
Apart from the two major classes of fungal centromeres described above, centromeres of a third type, “small regional” centromeres, were identified in Candida species. Two closely related pathogenic Candida species, Candida albicans and Candida dubliniensis, have centromeres with several properties that are unique and thus absent from point and large regional centromeres. Each of the eight chromosomes of these two Candida species has a 3-to-5-kb-long CENP-A-rich centromere DNA, present in 4-to-18-kb open reading frame (ORF)-free regions, that has no common sequence motifs or repeats (95, 110) (Fig. 1A). Thus, each chromosome has a unique and different CEN DNA sequence. Unlike those in other budding yeasts, a circular ARS plasmid with a CENP-A-rich CEN region does not produce a stable minichromosome in C. albicans, suggesting that centromere formation is not strictly sequence dependent in this organism (11). Although chromosome-specific short inverted repeats were found at the centromeres of some of the chromosomes, a long inverted repeat is present only at CEN5. CEN5 of C. dubliniensis is also flanked with similar long inverted-repeat sequences. It is also important that many point centromere-specific proteins, which are absent from organisms having regional centromeres, are also absent from these two Candida species (81).
The locations of the putative centromeres in three other yeasts of the Candida clade, Candida lusitaniae, Pichia stipitis, and Debaryomyces hansenii (75), have been predicted by sequence analysis. Analysis of the GC content of the total genome of these organisms identified a single GC-poor region in every chromosome. In C. lusitaniae, GC-poor troughs are located in intergenic regions approximately 4 kb in length. C. albicans and C. dubliniensis centromeres, which were identified experimentally, are also located in intergenic regions of similar length, although those DNA regions are not GC-poor (96, 110). Putative centromeres in P. stipitis and D. hansenii have clusters of retrotransposons.
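The GC-content analysis mentioned above can be sketched as a sliding-window scan that reports the GC-poorest window of a chromosome sequence. This is a minimal illustration, not the pipeline used in the cited work: the function names are hypothetical, and the default window of 4,000 bp is an assumption chosen only to echo the ~4-kb troughs reported for C. lusitaniae.

```python
def gc_fraction(seq):
    """Fraction of G and C bases in seq (case-insensitive)."""
    seq = seq.upper()
    if not seq:
        raise ValueError("empty sequence")
    return (seq.count("G") + seq.count("C")) / len(seq)


def gc_profile(seq, window=4000, step=1000):
    """Yield (start, gc_fraction) for each full window along seq."""
    for start in range(0, len(seq) - window + 1, step):
        yield start, gc_fraction(seq[start:start + window])


def gc_trough(seq, window=4000, step=1000):
    """Return (start, gc_fraction) of the GC-poorest window.

    Raises ValueError if seq is shorter than one window.
    """
    return min(gc_profile(seq, window, step), key=lambda item: item[1])
```

On a real chromosome, one would inspect the whole profile rather than only its minimum, since a GC-poor window is merely a candidate and, as noted for C. albicans and C. dubliniensis, experimentally identified centromeres need not be GC-poor.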
Analysis of chromosome III of three closely related lineages of the wild yeast Saccharomyces paradoxus suggests that centromeres are the most rapidly evolving regions (12). Centromeres of the other chromosomes also show a higher rate of nucleotide substitutions than noncentromeric regions. More recently, rapid evolution of centromeres has been demonstrated by comparative genomic analyses in different species of fission yeast in the Schizosaccharomyces clade (104). S. pombe and Schizosaccharomyces octosporus contain no transposons at the centromere, whereas Schizosaccharomyces japonicus has clusters of transposons at the pericentric region. Although repeats are present at the centromeres of all three Schizosaccharomyces species, the repeat sequences are completely dissimilar. Repeats present at the centromeres in different chromosomes of S. pombe or S. octosporus share high homology, suggesting homogenization of repeats by nonreciprocal recombination in the large array of inverted repeat structures. However, the lack of symmetry observed in the repeats of different chromosomes of S. japonicus indicates that transposition occurred more rapidly than homogenization. It has been argued that evolution of symmetric centromeric repeats occurred by suppression of transposition. Although C. albicans and C. dubliniensis exhibit a high degree of sequence homology throughout the genome and the relative centromere locations in these two species are conserved, comparison of the CEN DNA sequences of orthologous chromosomes reveals very little or no conservation. Genome-wide analysis of orthologous noncoding regions in these two species shows a significantly lower rate of sequence diversity. Thus, comparative genome analysis suggests that centromeres are the most rapidly evolving loci in those two Candida species (95).
Although several lines of evidence suggest that centromeres are rapidly evolving, the reason and the underlying mechanism of such rapid evolution are not clearly understood. It is possible that rapid changes in the centromere sequence may help in speciation, as centromeres function in a species-specific manner. It is worth noting that rapid evolution of centromeres has been shown to occur in plants and animals as well (77).
The lengths, sequences, and organizations of DNA elements at various centromeres in different fungi are seemingly diverse. In spite of this diversity in the characteristics of the cis-acting DNA elements, one common feature of all the functional centromeres is the exclusive presence of specialized chromatin marked by the centromere-specific histone H3 (CenH3) variant of the CENP-A family: Cse4 in S. cerevisiae, Cnp1 in Schizosaccharomyces pombe, CID in Drosophila melanogaster, and CENP-A in humans. Localization of most kinetochore proteins in many well-studied organisms has been shown to be regulated by CENP-A, confirming that CENP-A initiates kinetochore formation; thus, it is considered the epigenetic hallmark of a functional centromere (50). However, the presence of CENP-A alone is insufficient to initiate kinetochore formation in humans (46, 129). The mechanism by which only a few molecules of CENP-A compete with a high molar excess of canonical histone H3 molecules to make unique nucleosome structures at centromeric chromatin remains an enigma.
Strikingly, unlike the almost invariant histone H3 proteins, CENP-A proteins show a high degree of sequence diversity across species (Fig. 1B). The N-terminal domain (the essential N-terminal domain [END] in S. cerevisiae) of CENP-A (Cse4) is hypervariable among most species (51) except for members of the Saccharomyces-Kluyveromyces clade (7). Divergence of this region probably reflects differences in the underlying centromere DNA sequence that is believed to be undergoing rapid evolution (50). In addition, CENP-A proteins are known to be loaded at the centromere by the activity of a rapidly evolving chaperone Scm3 in fungi (5, 19, 100, 120, 137). CENP-A and H3 share a C-terminal globular Histone Fold Domain (HFD), where amino acid sequence identity is ~50%. The HFD is shared by all histone proteins and is composed of three alpha helices (α1, α2, and α3) connected by loops (L1 and L2) (6). H3 and CENP-A contain an additional helix, the N-helix, present upstream of α1. L1 in CENP-A is longer than that of histone H3 (76). In S. cerevisiae, the L1-α2 region of CENP-A/Cse4, which contains the CENP-A Targeting Domain (CATD), interacts directly with the Cse4 binding domain of Scm3 (140).
Partial micrococcal nuclease (MNase) digestion along with DNase I digestion of chromatin in S. cerevisiae revealed more distinct ladder patterns at the centromeric chromatin than were seen in bulk chromatin (15). A distinctly protected region of 220 to 250 bp of centromeric chromatin was observed. A similar centromeric chromatin structure in K. lactis was reported as well (53). Partial MNase digestion analysis of both S. pombe and C. albicans revealed that the centromeric chromatin structure in each case is unusual. The bulk chromatin showed ~150-bp ladder patterns, while the centromeric chromatin exhibited smeary patterns of digestion (11, 101). The reason for this unusual pattern of centromeric chromatin seen in vivo is still unknown, but it may be attributable to replacement of histone H3 by CENP-A. CENP-A molecules have been shown to replace histone H3 molecules either partially or fully in centromeric chromatin in different fungi (17, 80, 110, 123). Studies in S. pombe have suggested that the unusual smeary pattern of partial MNase digestion could be due to protection provided by an intact kinetochore. Random access of MNase could potentially yield different sizes of CEN DNA fragments that appear to have a continuous smeary pattern due to blocking by the complex kinetochore architecture that is formed on CENP-A nucleosomes (116).
The unusual features of centromeric chromatin in different organisms suggest that the composition of CENP-A-containing nucleosomes is different from that of canonical nucleosomes (discussed in more detail in a review in reference 3). Strikingly, Scm3 replaces H2A-H2B dimers at centromeres in S. cerevisiae and forms hexameric nucleosomes containing two molecules each of Scm3, Cse4, and H4 (83). However, the existence of octameric nucleosomes is evident from another recent study on CENP-A/Cse4 nucleosomes in S. cerevisiae (20). A previous proteomic study performed with purified centromeric nucleosomes of S. cerevisiae revealed that H2A and H2B copurified with CENP-A/Cse4 (135). Not even the formation of intermediate or hybrid nucleosomes of CENP-A-H3 can be completely ruled out. The occurrence of tetrameric or half-nucleosomes is also a possibility, since CENP-A octamers have a higher propensity for disassembly (31). Although the exact composition of CENP-A nucleosomes in S. cerevisiae is still uncertain, it is plausible that the composition of the CENP-A nucleosome changes in different stages of the cell cycle (3).
Propagation of centromeric chromatin for newly replicated CEN DNA is another active area of research. Unlike the results seen in studies of humans and flies, replication of the centromere DNA in C. albicans and S. pombe occurs at the early S phase (67, 70). If CENP-A nucleosomes behave as do canonical H3-containing nucleosomes at the time of replication, one would assume that half of the preexisting CENP-A molecules would be segregated to each of the two sister chromatids. Using a fluorescence recovery after photobleaching (FRAP)-based method, it has been shown that old Cse4 molecules are replaced by new ones at the kinetochore in S. cerevisiae during the early S phase of the cell cycle (98). In S. pombe, incorporation of CENP-A/Cnp1 molecules occurs at both the S and late G2 phases (125). Hence, it appears that CENP-A deposition may be coincident with replication, at least in unicellular yeasts (23).
Like incorporation of other histones, recruitment of CENP-A can also be affected by various events (including CENP-A expression, turnover of CENP-A protein or RNA, expression of other histones, and posttranslational modifications [PTMs]) and factors (including chaperones, assembly factors, essential components of kinetochore architecture, and histone modifiers) (3). In S. cerevisiae, CENP-A/Cse4 proteolysis, which is mediated by Psh1, an E3 ubiquitin ligase, restricts its localization primarily to the centromere (30, 56, 102). Overexpression of CENP-A/CaCse4 results in recruitment of more CENP-A/CaCse4 molecules along with other kinetochore proteins such as Mtw1 at the centromere in C. albicans (17, 106). Posttranslational modifications associated with the N-terminal histone tail are related to different functional states of chromatin, such as repression (silencing) or activation of transcription. Although most of the longer centromeres are flanked by pericentric heterochromatin, surprisingly, the interspersed canonical H3 molecules show modifications that are characteristic of both euchromatin and heterochromatin (122) (Fig. 1A). These include molecules of H3K4Me2 (dimethylated lysine at the fourth position in histone H3), which are commonly observed in transcriptionally active regions, although these histone H3 molecules do not harbor any acetylation marks related to open chromatin. CENP-A-rich core centromere regions usually do not contain H3K9Me2 (dimethylated lysine at the ninth position in histone H3), a signature mark of closed or silent chromatin. The pericentric chromatin of long repetitive centromeres in various eukaryotes, however, harbors H3K9Me2/3 (13, 88, 92, 93, 103). In S. pombe, such H3K9Me2 marks are present but are restricted solely to pericentric heterochromatin (18, 23). Strikingly, H3K4Me2/3 marks, often found in centromeres of other eukaryotes, are absent from the N. crassa centromeres. Instead, H3K9Me3 marks are present and have been shown to colocalize with other kinetochore proteins at the centromeres in that organism. The flanking pericentric region has H3K9Me3 along with cytosine DNA methylation, which partially overlaps with kinetochore protein distribution. Interestingly, DNA methylation is directed by Dim5, a histone H3 methyltransferase that is also responsible for the presence of H3K9Me3 in N. crassa. Moreover, heterochromatin formation is essential for CENP-A localization at the centromere in that organism (115). Thus, the factors required for centromere formation in N. crassa, which has a centromere organization more similar to that seen in higher eukaryotes, are different from those in S. pombe, which carries relatively simple long centromeres.
The epigenetic phenomenon has been defined as a change in phenotype that is heritable but does not involve DNA mutations. Fungal centromeres are excellent systems for the study of such epigenetic regulation (Fig. 2).
Epigenetic regulation in the point centromere has been documented to illustrate the differential requirements of centromere establishment and propagation of an already established centromere. The classic example of such regulation became obvious from studies of the behavior of a centromeric plasmid in the chl4/mcm17/ctf17 kinetochore mutant of S. cerevisiae (86) (Fig. 2A). A centromeric plasmid (YCp) propagated stably in the wild-type strain but exhibited very low mitotic stability when introduced into this mutant, suggesting that de novo assembly of a kinetochore on CEN DNA requires the Chl4 protein (107). Interestingly, when the same centromeric plasmid was introduced into a wild-type cell and the CHL4 gene was then deleted, two classes of transformants were obtained. About 50% of the transformants showed high mitotic stability of the centromeric plasmid, while the rest exhibited lower mitotic stability. That study revealed that two different chromatin states can exist and that Chl4 is not absolutely required for propagation of an already established centromere (Fig. 2A). Differential recruitment of cohesin (the molecular glue that prevents sister chromatid separation until anaphase onset) due to mutations in conserved DNA elements in a naive versus an established centromere also illustrates epigenetic propagation of point centromeres (126).
The epigenetic regulation of the in vivo centromere function of various in vitro-constructed minichromosomes was demonstrated in S. pombe (118). Cells carrying circular plasmids with incomplete centromere regions exhibited two states of mitotic stability: either unstable, as seen with an ARS plasmid without any centromere, or stable, as seen with a plasmid having a full-length centromere. The most striking observation from that study is that, when an unstable plasmid with a nonfunctional centromere switches to a stable state in which the centromere is fully activated, the active state is faithfully inherited for many generations (Fig. 2B). Once active, the plasmids with reduced centromeres exhibit mitotic stability that is indistinguishable from that seen with plasmids having a larger centromere DNA. The precise mechanism for this delayed or slow activation could not be determined, but it is postulated that formation of the higher-order centromeric chromatin structure required for its function, which is readily achieved when a plasmid having the full-length centromere is present, is a time-intensive process when the full-length centromere DNA is absent. That study thus set the stage for investigation of nongenetic factors that control centromere function. A striking observation was reported from a later study performed by Folco and coworkers (42). When plasmids containing the essential elements of the S. pombe cen2 gene were transformed in clr4 mutant cells (a histone H3 methyl transferase; see below), CENP-A/Cnp1 did not get recruited on the cc2 region of cen2 in plasmids. However, when wild-type cells were transformed with the same set of plasmids, proper CENP-A chromatin was established on the cc2 region of the plasmids, confirming that Clr4 is essential for establishment of CENP-A chromatin. 
However, when the wild-type strain transformed with these CEN plasmids was crossed with a clr4 mutant strain, surprisingly, both the wild-type and clr4 mutant progeny strains exhibited proper recruitment of CENP-A on the plasmid centromere, implying that inheritance of an established CENP-A chromatin does not require Clr4.
The most striking example of epigenetic regulation of centromere function was observed in C. albicans (11). In that organism, attempts to construct a minichromosome with an ARS and a CEN sequence failed. Even ectopic insertion of the centromere DNA into a nearby locus did not result in CENP-A deposition or formation of a functional kinetochore, suggesting that de novo centromere formation on an introduced CEN DNA does not occur in C. albicans. However, a short 85-kb chromosome fragment (CF) carrying a functional centromere, obtained by in vivo telomere-mediated truncation of native chromosome 7, was found to be mitotically stable (Fig. 2C). Strikingly, when total DNA isolated from this strain was used to reintroduce the CF into wild-type C. albicans, the CF was found to be unstable in the transformed cells. MNase digestion assays confirmed that, after reintroduction, CENP-A molecules could not be deposited on the introduced CEN DNA sequence of the CF, implying that an epigenetic memory is required for de novo assembly of a functional centromere.
Thus, regulation of the centromere function of each of the three types of fungal centromeres (point, large regional, and small regional) clearly demonstrates that the requirement for de novo centromere assembly can be different from that for centromere propagation, which is largely epigenetically determined.
A centromere can occasionally form at a nonnative location on the same chromosome when a native centromere is deleted or inactivated. Formation of a new centromere, more popularly known as the neocentromere, was first observed in humans several years ago (9, 109). Since neocentromeres can form on DNA sequences that share no common sequence features (i.e., neither DNA sequence nor features of DNA elements such as repetitive, gene-free regions) with a native centromere, the discovery of neocentromeres suggested redundancy in the sequence requirement for formation of a functional regional centromere. Subsequently, neocentromere formation has been demonstrated in fungi, including both S. pombe and C. albicans, each of which has been shown to carry epigenetically determined centromeres (61, 66).
The requirement for centromere formation in the fission yeast S. pombe, which carries a metazoan-like centromere organization with pericentric heterochromatic repeats on both sides of a nonrepetitive CENP-A-rich core region, is more complex but better understood. In a study using a conditional strain, cen1-deleted so-called “acentric” chromosome-containing survivors were recovered (61). Whereas the acentric chromosome has been found in most cases to be fused to one of the other two chromosomes, in rare instances neocentromeres have formed. The neocentromeric regions did not have centromeric repeats but rather formed on regions with several ORFs that are normally upregulated under conditions of nitrogen starvation. Interestingly, once neocentromeres are formed, these ORFs are found to be largely silenced, suggesting a possible correlation of transcriptional repression and neocentromere seeding. Binding of several key kinetochore proteins such as CENP-A, Mis12, and CENP-C was found to be associated with a 20-kb region of the neocentromere, as seen with the native centromere. One of the neocentromere regions is in the proximity of a telomere where the chromatin had already been modified by H3K9Me2 (see below). The other neocentromere also formed close to a telomere. In this case, interestingly, the formation of the neocentromere stimulated the accumulation of H3K9Me2 at a proximal region that is devoid of such marks in wild-type strains. These phenomena again imply that there is a possible interdependency between neocentromere formation and heterochromatinization. The neocentromere-containing chromosomes appeared to have been replicated and segregated normally through the cell cycle. 
In the absence of a protein involved in heterochromatin formation, such as Swi6/HP1, Dicer (RNA interference [RNAi] machinery), or Clr4 (H3K9 methyltransferase), however, neocentromeres can still form, but the ratio of neocentromere formation to telomere-telomere fusion is drastically reduced compared with that observed in the wild type. This result indicates that the presence of neighboring heterochromatin can assist but is not essential for de novo neocentromere formation. One of the striking observations from that study is the role of telomere-proximal regions in serving as the platform for centromere formation. This may support the hypothesis that centromeres originated from telomeres during karyotype evolution (132).
Native centromere sequences of C. albicans do not show any common sequence motifs or repeats (110); an inverted repeat is associated only with the centromere of chromosome 5. Deletion of the CENP-A-rich centromere region from chromosome 7 in C. albicans leads to a high rate of chromosome loss (11). However, when the CEN5 region was replaced with a marker gene, the so-called “acentric” chromosome remained stable in mitosis (66). Further investigation suggested that a neocentromere forms at an astonishingly high rate on the chromosome devoid of the native centromere. Two classes of neocentromeres, centromere-proximal and centromere-distal neocentromeres, were observed. In both cases, the extent of CENP-A binding (~3 kb) was found to be similar to that seen with the native centromere, although the number of CENP-A molecules on neocentromeres was found to be lower than on the native centromeres. Moreover, a chromosome with a neocentromere shows mitotic stability as high as that seen with the native centromere. Interestingly, as in humans, neocentromeres show no sequence similarities with the native centromere, suggesting that factors other than the DNA sequence alone determine centromere and neocentromere location. Since neocentromeres can form very efficiently at multiple locations, at least on chromosome 5 in C. albicans, it is intriguing that centromere location was found to be conserved in a variety of the clinical isolates of C. albicans examined (82).
RNA interference (RNAi) is a conserved eukaryotic process that is triggered by double-stranded RNA (dsRNA) and results in transcriptional repression or silencing of genes with complementary sequences. The term “RNAi” was first used to represent silencing mediated by exogenous dsRNA in Caenorhabditis elegans (39). However, it is now used broadly to describe both endogenous and exogenous gene silencing mediated by small RNAs of various types: short interfering RNAs (siRNAs), microRNAs (miRNAs), and PIWI-interacting RNAs (piRNAs) (reviewed in references 72 and 84). Among these, siRNAs are found in fungi. More recently, a new type of small RNA degradation product, referred to as primal RNA (priRNA), has been shown to be associated with pericentric heterochromatin formation at S. pombe centromeres (48).
S. pombe has two distinct chromatin subdomains at the centromere: a core CENP-A-rich central domain where most histone H3 molecules are replaced by CenH3 molecules and the pericentric heterochromatic domain on outer repeats that form chromatin with H3K9Me2 molecules (13, 88, 93). However, a transgene placed in the central region that supports kinetochore formation shows reversible gene silencing, indicating the facultative nature of heterochromatin at the core centromere (4, 37, 134). As mentioned above, the histone H3 molecules that remain at the central regions have been found to be H3K4Me2 molecules, a mark associated with chromatin with active transcription (18). A landmark discovery of about a decade ago demonstrated a relation between RNAi machinery and pericentric heterochromatin formation in S. pombe (133, 134). The RNAi pathway in S. pombe is mediated by three key members, each encoded by a single gene: Dicer (dcr1+), Argonaute (ago1+), and RNA-dependent RNA polymerase (rdp1+) (141) (Table 1). Heterochromatin formation at the centromere requires a number of other trans-acting factors: histone deacetylases (HDACs), Clr4 (histone H3K9 methyltransferase [HKMT]), and the histone H3K9-methyl binding proteins such as Swi6 (an HP1 homolog) and Chp1 (8, 44, 62, 88, 90, 103, 108, 113, 121, 139). Both strands of the outer repeats are transcribed by RNA polymerase II (RNA PolII) during the S phase; the dsRNA is recognized by Dicer (Dcr1), the RNase III protein, which cleaves dsRNAs into 21-to-23-nucleotide-long complementary siRNAs (14). Dcr1, which presumably shuttles between the cytoplasm and nucleus, needs to be retained in the nucleus for production of siRNAs (38, 49, 78). Argonaute-related proteins bind to the Dcr1-processed siRNAs through their PIWI domains and thus serve as the core components of RNA-induced silencing complexes (RISCs) (22). Ago1, Chp1, and Tas3 are components of RNA-induced transcriptional silencing (RITS) complexes (130).
Ago1 “slices” one strand of the dsRNA, releasing the other strand to form an activated RISC with a short single-stranded RNA (97, 105). This nascent siRNA transcript then binds to the homologous chromatin of the otr repeats to load an RITS complex. The RNAi pathway is eventually amplified by the RNA-dependent RNA polymerase complex (RDRC) composed of Rdp1, Hrr1, and Cid12 (85). This cyclic process makes further dsRNA and triggers synthesis of siRNA (85). Finally, the Clr4-Rik1-Cul4 (CLRC) complex is recruited. Clr4, a member of this complex, mediates H3K9 methylation on the chromatin to which it is bound, resulting in heterochromatin formation and spreading and in transcriptional silencing (58, 59, 63, 73, 127). The spreading is mediated by other chromodomain proteins such as Swi6, Chp2, and Chp1 (8, 108, 139).
Is the RNAi machinery essential for pericentric heterochromatin formation that is required for centromere function? Deletion of any of the key genes (dcr1+, ago1+, or rdp1+) results in accumulation of forward and reverse transcripts from otr, loss of centromeric heterochromatin, loss of H3K9 methylation, and increased chromosome loss (133). Thus, it was believed that the RNAi pathway is essential for recruitment of Clr4 and Swi6 for establishment, spreading, and propagation of centromeric heterochromatin. However, other studies suggest alternative mechanisms. It has been previously shown that the RNAi machinery is absolutely required for establishment of H3K9 methylation on the pericentric repeats but that propagation of an already-established heterochromatin is epigenetic and thus does not depend on RNAi (108). It has been formally demonstrated that, in the absence of Ago1, overexpression of Clr4 alone can result in H3K9 methylation at the centromere (114). Even tethering Clr4 to a noncentromeric region could induce formation of artificial heterochromatin (64). Thus, the dependency of heterochromatin formation on RNAi at the centromere in S. pombe is not yet fully understood. Nevertheless, the involvement of siRNA produced from centromeric repeats has been shown to be essential for centromere function and transcriptional silencing via heterochromatin formation in S. pombe, which completely lacks transposons at centromeres (47). However, a recent study revealed that the process of siRNA production is active in transposon-rich regions of centromeres and subtelomeric regions in another Schizosaccharomyces species, S. japonicus. This observation suggests that, in S. pombe, either the RNAi machinery targets repetitive elements instead of mobile elements or the pathway has changed from its previous role of transposon silencing to heterochromatinization (104).
In addition, the relationship between RNAi and DNA methylation, a common feature of heterochromatin observed in N. crassa and mammals but not in fission yeast or flies, has been examined. Interestingly, in N. crassa, HP1 recruitment and DNA methylation, components of heterochromatin, were found to be independent of the RNAi machinery (45). That study indicates that, in contrast to the results seen with S. pombe, Neurospora centromeric chromatin formation is independent of components of RNAi machinery.
In spite of performing a well-conserved function, centromere DNAs are substantially divergent in length, sequence, and composition as well as in the requirement for various genetic and epigenetic factors (Table 1). Some of the essential factors required for de novo assembly of a kinetochore appear to be dispensable for propagation of an established centromere in a given organism. Since new centromere seeding must occur at the time of neocentromere formation, further mechanistic insights into this process may provide us with useful information on factors required for the establishment of epigenetically determined centromeres. Rapid coevolution of centromere DNA sequences, the N-terminal domain of CENP-A, and its chaperone Scm3 is particularly striking. Thus, it is of great interest to find the biological implications associated with rapid evolution of centromeres, previously known to occur in plants and animals and now evident in fungi as well. We are now at an exciting stage of investigation of this mysterious darkly stained region of a chromosome.
We apologize to our colleagues whose important findings could not be cited in this review due to space limitations. We are grateful for the valuable comments of the members of the Sanyal laboratory.
This work was supported by grants from various funding agencies (CSIR, DST, and DBT) of the government of India to K.S.
Published ahead of print on 9 September 2011.