The minichromosome maintenance (or MCM) protein family is composed of six related proteins that are conserved in all eukaryotes. They were first identified by genetic screens in yeast and subsequently analyzed in other experimental systems using molecular and biochemical methods. Early data led to the identification of MCMs as central players in the initiation of DNA replication. More recent studies have shown that MCM proteins also function in replication elongation, probably as a DNA helicase. This is consistent with structural analysis showing that the proteins interact together in a heterohexameric ring. However, MCMs are strikingly abundant and far exceed the stoichiometry of replication origins; they are widely distributed on unreplicated chromatin. Analysis of mcm mutant phenotypes and interactions with other factors have now implicated the MCM proteins in other chromosome transactions including damage response, transcription, and chromatin structure. These experiments indicate that the MCMs are central players in many aspects of genome stability.
In the early 1980s, Bik-Kwoon Tye's laboratory carried out a general screen for Saccharomyces cerevisiae mutants that had defects in maintaining a simple minichromosome. Several of the MCM genes were shown to be paralogues and to share a role in DNA replication. They were the founding members of a conserved protein family required for DNA replication that is now called by the name of the screen: the MCM (for “minichromosome maintenance”) proteins. Data at first suggested that the MCM proteins were replication initiation factors. More recent data suggest that they have an additional essential function as a helicase required for replication elongation (see extensive reviews in references 12, 16, 58, 154, 231, and 321). They are also implicated in many other chromosome transactions including transcription, chromatin remodeling, and genome stability. MCMs thus recapitulate a pattern seen with other cell cycle genes, in which a simple model derived from analysis of a single process grows more complicated as additional functions are identified, which ultimately leads toward a holistic view of how multiple events are linked. In this review, I examine the history of MCMs and discuss recent insights into their function from structural studies. I describe evidence for their different roles in DNA replication and the identification of new functions. Finally, I highlight MCM puzzles and paradoxes that make these proteins of continuing interest 20 years after their first identification.
The MCM protein family is named for the genetic screen in budding yeast from which the founding members were originally isolated. They were defective in minichromosome maintenance, showing a high rate of loss of plasmids that contained a cloned centromere and replication origin (201, 289). Thus, mutants isolated from this screen were defective in function of the replication origin or the centromere on the target plasmid (reviewed in reference 322). Early characterization of the mcm2 and mcm3 phenotypes indicated a role in replication, but the phenotypes varied with different replication origins on the test plasmid (93, 201, 289). Cloning showed that Mcm2 and Mcm3 proteins share a region of sequence similarity called the MCM box (346), and they eventually gave their name to a protein family. Importantly, the family we call MCM proteins include only the six proteins Mcm2 through Mcm7 (with the recent addition of an Mcm8 in some species [discussed below]). Other genes from the same screen, including MCM1 and MCM10, do not have similarity with MCM2 through MCM7 despite having an MCM name. Some of these other MCM genes also act during S phase, while others are involved in other aspects of chromosome metabolism.
Additional members of the MCM family in budding yeast were identified in screens for mutants with cell division cycle (CDC) phenotypes (33, 110, 111, 219). Once the proteins were identified as members of the MCM family, the genes were renamed; however, the original genetic names are still found in the literature. A guide to these synonyms is found in Table 1. Each MCM gene is essential for viability. Conditional mcm mutants showed an extensive network of genetic interactions with one another, as well as with other genes isolated in related screens (see, e.g., references 111, 219, and 346). This suggested that a large number of proteins work together in large complexes to regulate eukaryotic DNA replication.
Using a Xenopus oocyte system, Blow and Laskey observed in the 1980s that sperm chromatin inside a nucleus could be replicated only once during each cycle but became competent again following mitosis (17). As in most eukaryotes (with the notable exception of the yeasts), the nuclear envelope in cycling Xenopus extracts breaks down during mitosis, providing access for the cytoplasmic components to the chromatin. Blow and Laskey speculated that the chromatin required an activity to mark it as capable of replicating: in effect, to license it for S phase. They further proposed that such a licensing activity would be used up or destroyed as a consequence of replication and therefore could be restored only when the nuclear envelope disintegrated during mitosis to allow fresh licensing factor access to the DNA. Licensing factor provided an appealing mechanism that could ensure that the genome was replicated once and only once in each cell cycle.
One of the most intriguing experiments early in the characterization of MCMs demonstrated that the S. cerevisiae Mcm5 (Cdc46) protein cycles in and out of the nucleus during each cell cycle (110). This observation was consistent with how a putative licensing factor would behave in an organism that maintains an intact nuclear envelope throughout the cell cycle (called a closed mitosis). To the replication community, this behavior of MCMs indicated that they would be key factors in regulating S-phase initiation. It also promised another example of synergy between cell cycle analyses in yeast and Xenopus, reminiscent of the groundbreaking studies that produced a universal model of cyclin-dependent kinase (CDK) activation in mitosis (237). This drove studies of MCM biology through the early 1990s.
As described below, cycling in and out of the nucleus is unique to budding-yeast MCMs, and further experiments eventually showed that the MCM proteins do not correspond to licensing factor as originally proposed. MCMs were nevertheless identified as key proteins in DNA replication initiation and shown to play a vital role in the licensing of origins for replication. While today we know much more about them (for example, in their role as replication proteins), in some regards we also know much less (as their roles expand beyond DNA replication). Such puzzles and paradoxes are a recurring theme in MCM biology.
Members of the MCM family have been found in all eukaryotes by genetic and biochemical methods (Table 1; Fig. 1). In fission yeast, as in budding yeast, the mutants were originally identified in screens for cell cycle defects or for chromosome missegregation (201, 219, 227, 303, 316). In Drosophila, Mcm4 corresponds to the gene disc proliferation abnormal (76), while in Arabidopsis, Mcm7 is PROLIFERA (296), stressing their role in cell division. Human Mcm2 (BM28) was first identified as a nuclear protein (318), and human Mcm3 (P1) was isolated as a DNA polymerase alpha-associated protein (314). Related proteins were identified biochemically from a variety of systems, although it was not immediately apparent which were orthologues and which were paralogues (see, e.g., references 75, 161, 294, and 314) (Table 1). With the completion of the S. cerevisiae genome sequence (96), it was evident that eukaryotes contain at least six distinct members of the family. The MCMs are not found in prokaryotes, but MCM-related proteins exist in Archaea (reviewed in references 152, 156, and 320). Our focus here is on the eukaryotic MCMs.
The MCMs are a distinct subgroup of the large AAA ATPase family, which has many cellular functions (reviewed in references 56, 187, 228, and 249). AAA ATPases generally form large ATP-dependent complexes, often heterohexamers. The MCMs are defined by a characteristic version of the ATPase domain, the MCM box, which spans about 200 residues (reviewed in references 113, 152, 164, 228, and 323) (Fig. 2). The MCM box includes two ATPase consensus motifs. The Walker A motif, including the P-loop of the active site, contains the invariant lysine residue found in all ATP-binding proteins. In the MCM family, the Walker A consensus has an alanine or serine in place of a glycine, along with several additional conserved residues, creating the MCM-specific consensus GDPxx(S/A)KS. The classic Walker B element D(D/E) is bulky and hydrophobic and is thought to contribute to ATP hydrolysis rather than binding. In the MCM proteins, the Walker B motif is part of the sequence IDEFDKM, which is conserved in all MCM proteins and defines the MCM family. Another short motif, SRFD, occurs approximately 70 residues after the Walker B element, and defines an “arginine finger.” All MCMs contain these sequences.
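The consensus motifs quoted above lend themselves to a simple pattern search. The following is a minimal sketch, not a validated annotation tool: the function name `find_motifs` and the toy sequence are invented for illustration, and each "x" in the consensus is treated as any single residue.

```python
import re

# Patterns transcribed from the consensus sequences quoted in the text.
# "x" in the consensus is written as "." (any single residue).
MCM_MOTIFS = {
    "walker_a_mcm": re.compile(r"GDP..[SA]KS"),  # MCM-specific Walker A
    "walker_b_mcm": re.compile(r"IDEFDKM"),      # MCM signature around Walker B
    "arginine_finger": re.compile(r"SRFD"),      # arginine finger motif
}


def find_motifs(seq):
    """Return {motif name: [0-based start positions]} for a protein sequence."""
    seq = seq.upper()
    return {
        name: [m.start() for m in pattern.finditer(seq)]
        for name, pattern in MCM_MOTIFS.items()
    }


# Toy sequence made up for this example; it is NOT a real MCM protein.
# The arginine finger is placed 70 residues after the Walker B element.
toy = "MAA" + "GDPGTAKS" + "QLLKAA" + "IDEFDKM" + "A" * 70 + "SRFD" + "LL"
hits = find_motifs(toy)
print(hits)
```

A hit for all three patterns is only a first filter; short motifs such as SRFD will also occur by chance in unrelated proteins, so spacing and surrounding context still need to be checked.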
Outside the MCM box, the individual MCMs within a single species are not particularly closely related to each other. However, comparison of MCMs across species allowed the definition of distinct MCM classes, so that (for example) an Mcm2 protein from humans is more similar to Mcm2 from yeast (an orthologue) than to Mcm4 from humans (a paralogue). A phylogenetic tree of known MCMs from a range of eukaryotic species is shown in Fig. 1. (For a comparison to the archaeal MCMs, see the tree in reference 152.)
All eukaryotic MCMs have zinc-binding motifs of some sort, although the precise arrangement of the cysteines varies distinctly from family to family (Fig. 2). The Mcm3 class only weakly matches a classic zinc finger consensus, but it has been argued that the residues indicated in Fig. 2 are capable of chelating zinc (81). Mutational analysis of the yeasts, in which the mutant proteins are tested for their ability to rescue nonfunctional alleles, has shown that the zinc-binding motifs in several MCMs are required for viability (83, 347). Biochemical analysis suggests that the zinc motif also contributes to complex assembly and ATPase activity (81, 256, 284, 354). Studies of an archaeal MCM homohexamer indicate that the ability to chelate zinc is required for assembly of higher-order MCM complexes (described below). Several MCMs have weak homology to a leucine zipper consensus (LX6LX6LX6L) upstream of the zinc finger motif.
Only Mcm2 and Mcm3 families have sequences suggesting nuclear localization sequences (NLS). The NLS of Mcm2 has been identified functionally in Schizosaccharomyces pombe, S. cerevisiae, and mice (132, 230, 247). The Mcm3 NLS sequence has been molecularly characterized in S. cerevisiae and humans (309, 357).
There are other subunit-specific characteristics. Mcm2 family members have an extended N terminus rich in serine residues. Consensus sites (S/T)PX(K/R) for the CDKs are found in most Mcm4 family members. CDK consensus sequences in other MCMs vary from species to species. They are found in most metazoan Mcm2 family members, in Mcm3 in the yeasts, and sporadically amongst the other subunits. The role of CDK and other kinases in regulating MCMs is discussed in “Origin firing” (below).
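As a rough illustration of how the CDK consensus (S/T)PX(K/R) can be located in a sequence, the sketch below scans for every position that matches it. The function name `find_cdk_sites` and the toy sequence are assumptions made for this example, and a match indicates only a candidate site, not demonstrated phosphorylation.

```python
import re

# (S/T)PX(K/R): serine or threonine, proline, any residue, lysine or arginine.
# The lookahead lets overlapping candidate sites be reported as well.
CDK_SITE = re.compile(r"(?=([ST]P.[KR]))")


def find_cdk_sites(seq):
    """Return (0-based position, matched tetrapeptide) for each candidate site."""
    return [(m.start(), m.group(1)) for m in CDK_SITE.finditer(seq.upper())]


# Toy sequence invented for illustration; "SPQE" fails the final K/R position.
toy = "MSPAKAATPLRGGSPQE"
sites = find_cdk_sites(toy)
print(sites)
```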
With the completion of genome sequences of numerous eukaryotes, it is now apparent that fungi have just six MCMs but that there can be additional variants in more complicated organisms. One expected addition is the potential for developmentally regulated MCM subunits, forming isoforms of the known MCM subclasses. For example, there is a zygotic form of Mcm6 in Xenopus, which might be involved in the rapid embryonic divisions of a Xenopus egg (287). A sex-specific plasmodium Mcm4 has also been identified (189). These proteins are presumed to substitute for the “normal” MCM in appropriate developmental conditions.
More surprising was the recent identification of a human Mcm8 by two groups (100, 144). Comparison of its sequence to the others (Fig. 1) clearly indicates that it defines a separate MCM family, distinct from the canonical six. Intriguingly, while human Mcm8 shares all the classic MCM features including a putative zinc finger and the IDEKFM and arginine finger motifs, it is the only MCM that has a classic GKS motif in its Walker A sequence. It is widely expressed in a variety of tissues and may not be restricted to proliferating cells (100, 144). The protein is found in the nucleus, apparently chromatin associated during S phase (100). However, the two published studies disagree on whether this MCM interacts with other members of the family (100, 144). Mcm8 may be involved in negatively regulating the MCM hexamer (144). Alternatively, Mcm8 may form a homohexamer with a distinct function. An Mcm8-related sequence is also found in the mouse but has not been identified in Drosophila or Xenopus. Two additional MCMs beyond the canonical six are also found in the Arabidopsis sequence but have not been studied molecularly. One of them is closely related to human Mcm8 and also contains the GKS sequence in the Walker A motif. The tree in Fig. 1 suggests that this is the orthologue of Mcm8. The other is a more divergent sequence which changes the Walker B IDEKFM motif to IDEKSM and lacks the arginine finger (shown by the dashed line in Fig. 1).
The first evidence that MCMs form a protein complex was provided by the discovery of genetic interactions between mcm mutants in the yeasts, including cosuppression and synthetic lethal interactions (see, e.g., references 47, 83, 111, 216, and 346). Physical interactions were identified in vivo by using two-hybrid, coimmunoprecipitation, and biochemical purification methods (see, e.g., references 2, 28, 37, 45, 53, 129, 161, 185, 198, 224, 279, 284, and 315) and reconstituted using recombinant proteins (55, 128, 181, 258, 280). These studies suggested that the bulk of MCMs in vivo associate in a heterohexamer with 1:1:1:1:1:1 stoichiometry, although there are likely to be small amounts of single MCMs and MCM subcomplexes in the cell.
Mutational analysis of the yeasts has been a powerful tool to examine minimal requirements for MCM association in vivo. As expected, complex assembly is essential for function, since no mutants that disrupt complex assembly are viable (see e.g., references 53, 184, and 284). Importantly, however, assembly into an MCM complex is not sufficient for function. For example, fission yeast Mcm2 mutants lacking an NLS, mutants with ATPase domain defects in several of the MCMs, or certain alleles of budding yeast MCM3 still assemble into a complex but are not able to function in vivo (99, 184, 280, 284). Attempts to identify a minimal binding domain in Mcm2 showed that sequences outside the MCM box including, but not limited to, the zinc finger are essential for complex assembly. Only a few mutant Mcm2 proteins with very short amino-terminal deletions are proficient for complex assembly in either S. pombe or mouse (132, 284).
As discussed above, only Mcm2 and Mcm3 have identifiable nuclear localization sequences (NLS), leading to an early suggestion that these MCMs provide nuclear targeting to the other members of the family (160). In nearly all species, the bulk of MCMs are constitutively located in the nucleus throughout the entire cell cycle, with their chromatin association, rather than nuclear localization, subject to cell cycle regulation (see, e.g., references 86, 121, 153, 159, 199, 240, 247, 278, 300, and 318). However, there is still a role for the nuclear envelope in MCM complex assembly. This has been molecularly characterized using mutational analysis with the yeasts.
A series of experiments with S. pombe, using alleles of mcm2+, suggested that MCM complex assembly occurs in the cytoplasm and is necessary both for MCM localization and for retention in the nucleus (247). First, conditional mutations that disrupt the MCM complex lead to the relocalization of the wild-type MCMs from the nucleus to the cytoplasm. Second, an Mcm2 mutant lacking a functional NLS is still able to associate with the other members of the complex and, when overproduced, can actually trap the remaining wild-type MCMs in the cytoplasm. Third, a mutant form of Mcm2 that is defective in binding the other MCMs, but contains an intact NLS, is unable to localize in the nucleus even if export is blocked by mutating crm1, the nuclear export receptor. These data suggest that transport of Mcm2 or other MCMs into the nucleus requires complex assembly in the cytoplasm. Because unassociated MCM subunits can be exported, this codependence provides a way to ensure that the bulk of soluble MCMs within the nucleus are assembled together in a complex. This preserves the stoichiometry of the subunits, at least in the soluble fraction of the nucleoplasm, and is consistent with studies suggesting that the bulk of MCMs are in a large protein complex.
Intriguingly, a recent report suggested that Crm1 (Exportin 1) may be actively involved in regulating MCM function, at least in Xenopus extracts (343). Association of the MCMs with Crm1 prevents rereplication within a single cell cycle. However, active export of the MCMs is not required for this effect (see “Chromatin binding and regulation of rereplication” below).
S. cerevisiae is a special case, because the MCM proteins cycle in and out of the nucleus during a single cell cycle, so that the bulk of the proteins are present only in the nucleus during S phase (54, 110, 347, 358). However, as in other species, localization depends on the NLS sequences of Mcm2 and Mcm3 (229, 357), and intact MCM complexes are required for MCMs to enter the nucleus and to remain there (177, 229). MCM export is regulated by CDK activity (176, 229); whether this also promotes complex disassembly is not known, but this could provide one way to facilitate export using the same mechanism observed in S. pombe.
Different MCM subunits have different relative affinities for one another, forming a distinct series of subcomplexes whether isolated from yeast, mouse, or Xenopus (45, 55, 118, 129, 181, 185, 224, 258, 280, 284). During purification, Mcm4, Mcm6, and Mcm7 subunits bind most tightly together to form a trimeric complex sometimes called the MCM core. Mcm2 binds to the core, but with reduced affinity. Mcm3 and Mcm5 together form a dimer and bind most weakly to the other MCMs, probably through Mcm7. In the absence of other MCMs during in vitro reconstitution experiments, the Mcm4,6,7 core will itself dimerize to form a dimer-trimer (Mcm4,6,7)2, which is disrupted by addition of Mcm2 (128, 181, 258).
Does this mini-MCM complex exist inside the cell, or is it an artifact of purification experiments? This question is extremely important, because only the (Mcm4,6,7)2 complex is associated with DNA helicase activity in vitro (discussed in more detail below [128, 182]). Addition of other MCMs abolishes this activity, so it has been suggested that Mcm2, Mcm3, and Mcm5 negatively regulate the active Mcm4,6,7 complex (130, 182, 355). However, the in vivo data indicate that the situation is much more complicated. As discussed above, assembly of the heterohexamer is required for proper localization in the nucleus. Most evidence suggests that the bulk of MCMs are assembled in the heterohexamer in vivo, both on and off the chromatin (see, e.g., references 2, 87, 160, 198, 258, 263, and 284). For example, Mcm3 and Mcm7 colocalize cytologically (see, e.g., reference 267). When MCM chromatin immunoprecipitation was followed by DNA-DNA hybridization to a yeast genomic DNA microarray, only about 40 sites, of a total of 600 to 700 sites with multiple MCMs, had a single MCM protein (341). One experiment with Xenopus suggests that MCMs may load independently on the chromatin (203), and individual MCMs can be differentially released from chromatin by nuclease treatment (118). However, interpretation of these results is complicated by the different relative affinities within the complex and questions of antibody accessibility within the intact complex.
Functional data obtained in vivo also suggest that the MCMs contribute to a common activity. Immunodepletion of any single MCM protein from Xenopus extracts has an equally negative effect on replication (see, e.g., references 198, 267, and 315). Mutation of any individual mcm gene in yeast yields phenotypes consistent with each subunit playing a positive role throughout replication (see, e.g., references 177, 178, 184, and 193), and there are no alleles of MCM2, MCM3, or MCM5 that identify a negative regulatory activity. There is also some inconsistency with in vivo and in vitro phenotypes of site-directed mutants. While specific mutation of the Walker A or B motif of Mcm6 has a strong, negative effect on helicase activity of the Mcm4,6,7 complex in vitro (355), mutational analysis of the yeasts suggests that Mcm6 is uniquely forgiving of mutations of these sites, and the mutant proteins are in several cases able to function in vivo (see below) (99, 280; T. Schwacha, personal communication).
These data do not preclude the possibility that a subset of MCMs rearrange inside the nucleus to form a modified complex on the chromatin, but they suggest that the (Mcm4,6,7)2 helicase is not the predominant form inside the cell. Again, this could indicate that there are functionally distinct pools of MCMs in the nucleus that play distinct roles, a model to which we will return. Further studies are required to establish the role of (Mcm4,6,7)2 in vivo, but biochemical analysis is providing some tantalizing clues.
As described previously, the MCM proteins are members of the diverse AAA ATPase family, and ATP is known to be important for assembly of AAA ATPases into hexameric complexes (reviewed in references 56, 113, 249, and 323). Given their highly conserved Walker A and B motifs, it is not surprising that MCM complexes have ATPase activity. Attempts to reconstruct MCM complex assembly in vitro by monitoring ATPase activity suggest that individual MCMs are not active as ATPases by themselves (55, 181, 280, 354, 355). Here, too, the results suggest that there are two distinct classes of MCMs, one represented by the core subunits Mcm4,6,7 and the other represented by the peripheral subunits Mcm2, Mcm3, and Mcm5. Mutations in these different subunits differently affect ATPase activity of the intact complex.
Reconstitution experiments using the intact complex from S. cerevisiae suggest that certain pairs of MCMs can associate to generate ATPase activity: Mcm3,7, Mcm4,7, and Mcm2,6 (55, 280). The order of subunits in the intact MCM complex may be inferred from determining these interactions (55) (Fig. 3); extensive two-hybrid studies using Drosophila MCMs, a more indirect approach, give the same model (48). The biochemical data suggest that the ATP moiety binds at the interface of the two subunits, with one subunit providing the ATP-binding pocket of the P-loop in the Walker A motif and the adjacent subunit providing a protruding arginine residue (the arginine finger) (55). Importantly, the sequence SRFD containing the arginine finger is conserved in all MCMs (Fig. 2). Together, these studies suggest a model where ATP hydrolysis occurs in pairs of subunits, with the intervening subunits being required for complex stabilization; thus, in the context of the intact structure, hydrolysis is linked around the ring to generate work (55, 280).
This model implies that the Walker A and B motifs and the arginine fingers are not equally required in each subunit, since not every subunit is involved equally in hydrolysis. This can be tested by site-directed mutagenesis. Davey et al. showed that for the Mcm3,7 pair to have ATPase activity in vitro, the Mcm3 subunit must have an intact arginine finger but its Walker A motif is not essential. In contrast, the Mcm7 subunit must have an intact Walker A motif but not an arginine finger (55).
Several studies have also examined the phenotypes of these mutants in vivo. Mutation of the conserved K to A in the Walker A motif of all the S. cerevisiae MCM subunits results in death in vivo, as do Walker B motif D-to-A mutations in all but S. cerevisiae Mcm2 (which has residual activity) and Mcm6 (280; Schwacha, personal communication). The arginine finger is likewise essential for viability in all MCMs (Schwacha, personal communication). There are slightly different phenotypes reported in S. pombe. The nonconservative Walker A mutation K to A is viable in S. pombe Mcm6, but the same change in S. pombe Mcm2, Mcm4, or Mcm7 is unable to complement a deletion (83, 99). Interestingly, a conservative change of K to R is tolerated in S. pombe Mcm2, Mcm6, and Mcm7 but not Mcm4. The Walker B mutation is similarly viable in S. pombe Mcm6 but lethal in Mcm2, Mcm4, and Mcm7. This suggests that the ATP-binding site of Mcm6 is dispensable, in contrast to predictions from studies of the helicase activity of the Mcm4,6,7 core in vitro (355). These in vivo results are consistent with the model proposed in Fig. 3: the arginine finger of Mcm6 should be required in combination with the Walker A (P-loop) motif of Mcm2, while the Walker A motif of both Mcm4 and Mcm7 should be essential for function.
The structure of the MCM complex and its arrangement as a heterohexameric ring suggest that the MCMs constitute the long-sought cellular replicative helicase. Electron microscopy analysis of the intact MCM hexamer from S. pombe (2) or of the (Mcm4,6,7)2 form of the helicase from mouse (275) suggests a globular complex with a central channel of 3 to 4 nm, similar to many multisubunit helicases from all kingdoms of life (reviewed in references 113 and 249). This channel is large enough to encircle either single- or double-stranded DNA. Although the structure of a eukaryotic, heterohexameric MCM complex has not been solved, striking insights are provided by three recently published studies.
The structure of the N-terminal region of the Methanobacterium thermoautotrophicum MCM protein (MtMCM) has been solved (81). This archaeal species has only one MCM protein, which assembles into two stacked homohexameric rings that have helicase activity (36, 157, 283). While the reported structure lacks the MCM homology domain including the ATPase motifs, it contains an extensive N-terminal domain that is sufficient to form a ring. This includes a zinc finger required for the head-to-head assembly of two hexamer rings (81). The central channel is 2 to 4 nm in diameter and is striking for its positive charge. Significantly, this study also demonstrates that the MtMCM is capable of binding double-stranded DNA and depends on the basic residues in the central channel to do so. Even though there is only limited primary sequence homology between the N termini of the eukaryotic MCMs and the N terminus of MtMCM, several of the charged residues that protrude into the central channel are conserved, and modification of these residues abolishes binding of the MtMCM complex to DNA (81).
The simian virus 40 large T antigen, a viral helicase, is another homohexamer ring that works in a head-to-head structure and is required for viral replication (reviewed in reference 288). The entire structure was solved (188), showing that the hexamer has two domains: one forming a smaller “head” that corresponds to the N-terminal domain solved for the MtMCM, and a larger “body” that corresponds to the ATPase domain. These two domains rotate around one another to open and close the central channel, leading the authors to propose an iris mechanism that grips double-stranded DNA to melt and unwind it. Here, too, the central channel is large enough to encompass a double strand of DNA, and a side channel provides an opening through which the unwound, single-stranded DNA may be extruded. The two linked hexamers are proposed to brace against each other and provide a ratchet mechanism to unwrap the DNA and give access to DNA replication proteins. This ratchet need not be adjacent to the replication fork but could pump open DNA from a distance; this could be consistent with theoretical speculations that MCM complexes act at a distance from replication forks (179), although there is solid evidence that MCMs are located at the fork (5, 40).
As discussed above, the reconstituted Mcm4,6,7 core complex has 3′-to-5′ DNA helicase activity in vitro, using a template with a single-stranded tail (128, 149, 182). Recent work suggests that the extended tail is occluded by steric exclusion from the center of the helicase ring (149). Intriguingly, the core helicase can also encircle double-stranded DNA that lacks an overhang and can translocate along the double-stranded DNA without unwinding it; this is sufficient to drive Holliday junction migration on naked DNA templates in vitro. Despite lacking sequence similarity to the MCMs, the bacterial 5′-to-3′ DnaB helicase shows similar characteristics and can even dislodge other proteins on the DNA (150). Thus, the Mcm4,6,7 core enzyme fulfills many of the expectations for a replicative helicase.
However, as described above, the MCM heterohexamer is the most common form in cells, while the Mcm4,6,7 core helicase is the only form purified that has activity in vitro. Thus, one of the most important puzzles to be solved regarding MCMs is the role of the core helicase in vivo. If this is the active form, how is it rearranged from the abundant heterohexamer? Does activation of proteins in the pre-replication complex by phosphorylation or binding of additional factors (discussed below) rearrange the MCM complex at the origin to generate the active helicase? This would suggest that the heterohexamer is an inert loading and spreading form of the MCM complex that moves out from the origin, leaving the modified core helicase alone at the fork to unwind the DNA. This simple model is inconsistent with in vivo data showing that the Mcm2 and Mcm3 subunits are required not only at initiation but also throughout the S phase, with phenotypes indistinguishable from Mcm4, Mcm6, or Mcm7 (178); if they were not required for activity of a rearranged helicase, this would not be the case. Perhaps other factors are required to adapt the full heterohexamer to helicase activity and their effect is mimicked in the structure of the core helicase in vitro. Supporting this model, the complete MCM complex immunoprecipitated from Xenopus lysates has processive helicase activity when Cdc45 is present (see below) (206). Clear resolution of these conflicting in vivo and in vitro results requires further experiments.
For some MCM genes, there is evidence for increased transcription during the G1/S phase in actively dividing cells; however, despite this, MCM protein levels do not fluctuate during the cell cycle (see, e.g., references 54, 83, 161, 278, 319, and 358). MCMs are highly abundant proteins, estimated at approximately 30,000 copies/cell in S. cerevisiae (65, 185). With roughly 400 replication origins (259, 341), this suggests that MCMs exceed origins by approximately 75 to 1 in budding yeast. Assuming a genome size of approximately 14 Mb (96), this further suggests about two MCM complexes per kb. This level is apparently important: mutations that reduce levels of active MCMs cause defects in genome stability (80, 83, 185, 193), suggesting that a high threshold level of MCM is required for full activity. However, even during S phase, only a fraction of the MCMs are chromatin associated (see below). This leads to the hypothesis that the MCMs are not uniform in function: there may be pools of MCMs that are able to perform distinct roles, which could be distinguished by localization, modification, or binding to other factors.
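The abundance estimates above follow from simple division, which can be checked directly. This is a back-of-the-envelope sketch using only the approximate figures quoted in the text; the variable names are illustrative, and the per-kilobase figure assumes, as the text does, that the copy number can be read as the number of complexes.

```python
# Approximate values quoted in the text for S. cerevisiae.
mcm_copies_per_cell = 30_000   # estimated MCM copies per cell
replication_origins = 400      # rough number of replication origins
genome_size_kb = 14_000        # genome of approximately 14 Mb, in kilobases

# Excess of MCMs over replication origins.
mcm_per_origin = mcm_copies_per_cell / replication_origins

# Density of MCMs along the genome, if every copy is counted as a complex.
mcm_per_kb = mcm_copies_per_cell / genome_size_kb

print(f"MCMs per origin: {mcm_per_origin:.0f}")
print(f"MCMs per kb: {mcm_per_kb:.1f}")
```

The two ratios come out at 75 per origin and a little over 2 per kb, matching the figures given in the text.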
The primary models for the function of MCM proteins rely extensively on genetics in the case of S. cerevisiae and biochemistry in the case of Xenopus. Work with other systems has provided additional detail, evidence for variation, and important insights into function. Together, these studies have resulted in an increasingly complex model for the activation of individual replication origins in a single cell cycle. This section describes our current understanding of the role of MCMs in replication initiation; additional details may be found in numerous recent reviews (13, 16, 58, 154, 321). A diagram of the assembly and activation of the replication origin is presented in Fig. 4, and the identity of genes is given in Table 1.
Although many budding yeast mcm mutants are able to synthesize substantial amounts of DNA, their phenotypes clearly implicate them in replication, including nuclear and cellular morphology at the arrest point, their interactions with other replication mutants, and their origin-specific effects on the minichromosomes (see, e.g., references 54, 93, 111, 201, and 336). Similarly, temperature-sensitive fission yeast mcm strains synthesize significant amounts of DNA, but they arrest the cell cycle with a cdc phenotype and show the classic checkpoint-dependent arrest characteristic of other replication mutants (see, e.g., references 47, 82, 192, 216, and 303). The association of MCM proteins and DNA replication in Xenopus was made very clearly by showing that extracts depleted of MCMs were unable to support any DNA synthesis (see, e.g., references 37, 167, 198, 267, and 315). This led many investigators to conclude that the ability of most yeast mcm mutants to synthesize bulk DNA, without being able to complete the S phase, occurred because these conditional temperature-sensitive alleles incompletely inactivated the mutant proteins. This model predicted that most origins were able to fire with residual MCM protein but that some were inactive and/or some regions of the genome were refractory to passive replication. Although we now know that this interpretation was incorrect, it focused considerable attention on the role of the MCMs in initiation, and it is that role that has been most closely examined.
Replication initiation depends on identification and activation of origins of replication distributed throughout the genome. Origin structure varies widely, ranging from small DNA elements with a defined consensus sequence (S. cerevisiae) to larger, degenerate elements without a consensus (S. pombe), to stochastically determined elements that may be established by spacing (Xenopus eggs) (reviewed in references 14 and 94). Regardless, the replication origin is marked by the binding of a complex of proteins called ORC (for “origin recognition complex”) (reviewed in references 12 and 295). It is useful to think of ORC as a marker of potential origins of replication. However, ORC function is not limited to replication initiation but has also been implicated in heterochromatin assembly, nucleosome remodeling, chromosome condensation, and transcriptional silencing (see, e.g., references 27, 59, 60, 84, 126, 195, 245, 253, and 257). Not surprisingly, ORC localization is not restricted to active replication origins, so it may serve a broader function as a landing platform for a range of chromosomal proteins (60, 84, 341).
At the replication origin, ORC is bound by Cdc6 and Cdt1 during G1 phase (Fig. 4A to C). These proteins are required for subsequent loading of the MCM complex (42, 61, 65, 153, 191, 204, 232, 268, 311, 337, 340). Analysis of S. cerevisiae cdc6 or the equivalent S. pombe cdc18 mutants shows a range of phenotypes, depending on the allele: while some alleles result in a complete block to DNA synthesis, others allow significant accumulation of DNA (26, 49, 57, 112, 155, 190, 196, 227, 251, 255, 276, 334). This may be related to the observation that many mcm mutants synthesize substantial amounts of DNA. Evidence suggests that Cdc6 binds directly to ORC (191, 218). It has been suggested that Cdc6 functions as a clamp loader to assemble the MCM ring on the DNA, which is consistent with evidence suggesting that only a small fraction of chromatin-bound MCM is associated with Cdc6 (56, 88, 180, 210, 251, 334). The molecular contribution of Cdt1 to this activity is unclear. In metazoans, a small protein called geminin binds Cdt1 and prevents its association with MCMs in G2 or mitosis (302, 340, 348).
Once the MCMs are loaded, the origin is enabled for activation. This assembled origin complex is called the prereplication complex (pre-RC). Assembly of the pre-RC occurs at the end of mitosis, when CDK levels start to fall (11, 46, 153, 211, 300, 311). Although biochemical experiments suggest that the ORC and Cdc6 are dispensable once MCMs are bound (65, 122), in vivo experiments with the yeasts suggest that continued expression of Cdc6 throughout G1 is required to maintain the pre-RCs (11, 42, 255, 274).
Despite its name, the Mcm10 protein is not a member of the MCM family, although it interacts physically and genetically with members of the MCM complex and other replication initiation factors (39, 43, 104, 119, 136, 151, 192). MCM10 was isolated in the same S. cerevisiae screen as the true MCM proteins and independently isolated in a screen for replication mutants (201, 293). Mcm10-containing complexes are extremely insoluble, especially during S phase, and the protein is part of a large complex (43, 104, 119, 136, 192). Although experiments with budding yeast suggested that Mcm10 is required for assembly of the pre-RC (119), studies with human cells, Xenopus extracts, and fission yeast indicate that Mcm10 acts after MCM association (101, 136, 339) (Fig. 4D). Additional data from S. pombe suggest that the S. pombe Cdc23 (Mcm10) protein may target the Dbf4-dependent kinase (DDK) to the MCM complex, which may facilitate Cdc45 binding (see below) (101, 183). There is also some evidence suggesting that S. cerevisiae Mcm10 may affect replication fork progression (213).
Origin firing is controlled by two kinases: CDK and DDK (reviewed in references 109 and 145) (Fig. 4D). Genetics and biochemistry have at times conflicted over the order in which the two kinases are required (see, e.g., references 140, 205, 236, 246, 282, 332, and 335). This may reflect the inherent limitations of genetic mutants or biochemical systems at recapitulating complex pathways or may indicate intrinsic differences in protein behavior between species. It is also possible that the kinases act at the same time on independent pathways which converge at a common point. Origin firing, culminating in DNA synthesis, results in loading an ever-expanding group of proteins. The order of these events is not completely worked out, and new players continue to be identified.
The conserved DDK consists of the catalytic subunit (Cdc7 or Hsk1) and a regulatory subunit (S. cerevisiae Dbf4; also called ASK, Dfp1, or Him1 in other systems [Table 1]) (reviewed in references 141, 145, and 281). While Dbf4 does not have the typical sequence motifs of a cyclin, it is transcriptionally and posttranslationally regulated to restrict the activity of its kinase partner to the S phase of the cell cycle (24, 35, 78, 137, 143, 173, 236, 243, 305, 335, 353). Like a cyclin, Dbf4 appears to provide Cdc7 kinase with substrate binding and specificity. Intriguingly, just as vertebrate cells have multiple cyclins, they may also separate DDK functions by using paralogous regulatory subunits: in humans and Xenopus, an alternative Dbf4 subunit called Drf1 has been identified that may be responsible for DDK activity outside of initiation (220, 350).
DDK is known from biochemical and genetic tests to associate at the origin of replication, suggesting that it must act at each pre-RC to initiate replication (64, 66, 183, 246, 264, 335). Several studies suggest that DDK binds both to ORC and to MCMs independently of ORC (48, 71, 140, 162). It phosphorylates a number of potentially relevant substrates including predominantly Mcm2 but also Mcm3, Mcm4, Mcm6, Mcm7, Cdc45, and DNA polymerase alpha (see, e.g., references 25, 143, 173, 186, 236, and 335). No consensus site for DDK phosphorylation has been identified, and despite much effort, no mcm2 phosphorylation site mutants have been identified in the yeasts, although the N terminus of Mcm2 is an excellent substrate in vitro. However, a new study has identified several sites in human Mcm2 phosphorylated by DDK that are apparently essential for DNA replication, although the sequences are not conserved in other Mcm2 proteins (T. Tsuji and W. Jiang, personal communication). Significantly, while DDK activity is limited temporally to the S phase, its function is not limited to replication: it has also been implicated in additional functions in heterochromatin assembly, gene silencing, repair, and recombination (see, e.g., references 8, 9, 44, 90, 116, 117, 234, 290, and 306). These roles probably involve substrates distinct from the replication proteins; for example, the kinase also phosphorylates the heterochromatin protein Swi6/HP1 (9).
In vivo data linking DDK and MCMs were obtained in experiments with an S. cerevisiae mutant called mcm5-bob1, which is a recessive bypass allele of Mcm5 (103, 137). In an mcm5-bob1 mutant background, the DDK is no longer essential for viability. Interestingly, Mcm5 is the only MCM subunit that is not phosphorylated by DDK in vitro, so that this allele does not simply mimic a phosphorylation event, but has a more unusual effect. An attractive model is that the mcm5-bob1 mutation results in a change in the shape of the MCM complex that mimics the effect of DDK phosphorylation on other subunits and therefore allows subsequent loading of Cdc45 (103, 282). The bob1 mutation is a proline-to-leucine change in a conserved residue of S. cerevisiae Mcm5 (103). The same proline is found in the archaeal MCM protein, and the structural effect of the bob1 mutation was tested in this context (81). It results in a modest but discernible shift in the position of one domain of the protein with respect to another, an effect referred to as a “domain push.” The authors postulated that this shift would also occur with other bulky amino acid residues and, if so, that such residues should also cause a bob1 phenotype in S. cerevisiae Mcm5. This was tested in vivo in budding yeast; the experiments confirmed that small residues had no bob1 phenotype while large residues behaved similarly to the original bob1 mutation. Importantly, however, this effect appears to be Mcm5 specific; the same proline-to-leucine changes in S. cerevisiae Mcm2, Mcm3, or Mcm4 do not bypass cdc7 in S. cerevisiae (184; L. Pessoa-Brandão, R. Leon, and R. A. Sclafani, personal communication). The bob1 mutation has not yet been constructed into Mcm5 proteins outside of S. cerevisiae. In S. pombe, one of the two lesions in the mcm2+ temperature-sensitive allele called cdc19-P1 is also a proline-to-leucine change of this residue, which does not bypass the DDK kinase (83, 290).
The CDKs are essential for replication initiation (see, e.g., references 18, 21, 62, 68, 69, 74, 122, 254, 282, and 311). Importantly, mcm5-bob1 is still dependent on CDK activity, indicating that this kinase either functions in parallel with or downstream of DDK (137, 282). If DDK provides the signal at individual origins, CDK appears to provide the link to the cell cycle.
This link may be through an essential but poorly understood protein called variously Cut5 or Dpb11 (Table 1) (Fig. 4D and E) (6, 107, 271, 344, 345; W. P. Dolan, D. A. Sherman, and S. L. Forsburg, submitted for publication). Genetic studies of the yeasts indicate that Dpb11/Cut5 is required both for initiation of DNA replication and, independently, for checkpoint responses to unreplicated or damaged DNA (6, 208, 270, 271). Temperature-sensitive S. pombe cut5 mutants are profoundly defective in DNA synthesis; this is one of the tightest initiation phenotypes of any temperature-sensitive mutant (see, e.g., references 77 and 271). S. cerevisiae Dpb11 associates with DNA polymerase epsilon, which also acts early in S phase (6). The yeast Dpb11/Cut5 binds to a CDK target protein called Drc1/Sld2, and CDK phosphorylation of Drc1 is required for replication initiation (146, 207, 235, 333).
The most significant event in origin firing appears to be binding of a conserved protein called Cdc45, which is a rate-limiting factor for replication initiation (71, 214) (Fig. 4E). CDC45 was first identified in budding yeast by some of the same screens that identified MCM genes (219, 293). cdc45 and mcm mutants interact genetically, and the proteins interact physically (53, 102, 111, 120, 172, 214, 217, 269, 324, 325, 360, 361; Dolan et al., submitted). Cell cycle analysis of S. cerevisiae indicated that Cdc45 and DDK have a similar execution point, defined as the time at which a protein functions in the cell cycle (244). Combined with the data regarding the mcm5-bob1 mutant allele, a plausible model is that DDK phosphorylation of MCMs results in an allosteric change in the MCM complex that allows binding of Cdc45 at individual origins as they are activated. This suggests that the consequence of DDK activity is Cdc45 binding, consistent with biochemical analysis (see, e.g., references 71, 140, and 339). This is also consistent with localization showing that Cdc45 does not assemble at all origins all at once but apparently does so as they are fired (4, 214, 362). Cdc45 is itself a substrate of the DDK in vitro, but it is not clear whether this influences its function in vivo (236).
Assembly of the pre-RC is not sufficient for Cdc45 loading, however. CDK activity is also required (215, 362). This may be mediated at least in part by the Dpb11/Cut5 protein (described above), which is part of a CDK-regulated complex and is also required for Cdc45 loading (7, 107, 146, 207, 235, 304, 327, 333). However, although MCMs may be liberally distributed along the chromatin and bound to Cdc7, only a fraction of MCM complexes will recruit Cdc45 (71). While the ratio of MCM complex to ORC in reconstitution experiments approaches 40:1, the ratio of Cdc45 to ORC is 2:1 (71). Thus, something limits Cdc45 binding to the MCMs at the origin.
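The two ratios reported for the reconstitution experiments can be combined to show how limiting Cdc45 is relative to MCM. The short Python sketch below takes both ratios at face value (the MCM-to-ORC figure is an upper value that the experiments approach, not a fixed constant); the variable names are illustrative.

```python
from fractions import Fraction

# Ratios per ORC reported for the reconstitution experiments (71).
mcm_per_orc = Fraction(40, 1)    # MCM complexes per ORC, approached as an upper value
cdc45_per_orc = Fraction(2, 1)   # Cdc45 per ORC

# Dividing the two gives Cdc45 per MCM complex.
cdc45_per_mcm = cdc45_per_orc / mcm_per_orc

print(f"Cdc45 per MCM complex: {cdc45_per_mcm} (about {float(cdc45_per_mcm):.0%})")
```

Under these figures, roughly one MCM complex in 20 would be matched by a Cdc45 molecule, which illustrates why only a fraction of the loaded MCM complexes can recruit Cdc45.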
Together, these data suggest that there may be two independent inputs into the replication initiation pathway: (i) the origin identification and pre-RC assembly of the MCMs, which is activated by DDK phosphorylation, and (ii) the cell cycle response provided by the CDK activation of Dpb11/Drc1. This is particularly satisfying because it provides a fail-safe mechanism for the initiation of replication by integrating two independent signals from converging pathways.
Although for many years the repertoire of players in replication initiation was thought to be complete, recent genetic experiments have identified new and unexpected players. Cdc45 functions with a protein called Sld3, which appears to be conserved (at least in the yeasts [147, 226]). Three very recent studies have identified an additional, completely new complex with a modified ring shape, called the GINS complex, which is mutually interdependent for origin binding with Cdc45 and Dpb11/Cut5 complexes (148, 169, 304) (Table 1). This complex is conserved in multiple eukaryotes and is essential for cell viability. Its role is likely to be more complicated than simply replication initiation: the S. pombe homologue of the Psf3 subunit of GINS, called Bsh3, was isolated in a screen for interactions with the passenger protein complex required for kinetochore function and was subsequently identified in human cells (H.-K. Huang and T. Hunter, personal communication). Whether this indicates a mechanistic link between replication and kinetochore assembly, or multiple independent functions associated with the GINS complex, remains to be determined.
The consequence of Cdc45 loading is the unwinding of DNA to open a single-stranded region (214, 330). Actual initiation (as measured by DNA synthesis) may be best scored by loading of replication protein A (RPA), which binds single-stranded DNA and is required for subsequent polymerase binding and activation (4, 214, 330, 361). Intriguingly, like all the MCMs, Cdc45 is required both to initiate and to complete DNA replication (178, 206, 312) (see “MCMs and replication in elongation” below).
As discussed above, MCMs in all organisms but S. cerevisiae are constitutively located in the nucleus throughout the cell cycle. MCM chromatin binding, rather than localization, is cell cycle regulated. MCMs bind chromatin only during the G1/S phase and are dislodged from the chromatin as the S phase proceeds (61, 87, 153, 159, 166, 168, 190, 199, 266, 279, 317, 347). Although there is always a soluble fraction of MCM protein in the nucleus throughout the cell cycle, access of the MCMs to chromatin is a key regulatory event necessary to control origin usage.
Cytological studies have indicated that MCMs are liberally distributed throughout the nucleus, associated with unreplicated chromatin, and are not concentrated at the origin (61, 166, 199). This suggests that only a subset of the MCMs are bound at the origin or with actively replicated DNA. This excess of MCMs distant from the origin (termed the MCM paradox [125, 179]) could suggest that MCMs load at the pre-RC and spread along the DNA, perhaps similarly to the translocating DnaB (150).
One possibility is that there is a qualitative difference in the binding of different pools of MCMs to the DNA. Early experiments suggested that MCM association with the chromatin could be stabilized by ATP (87). Genome-wide analysis using “ChIP on a chip” (chromatin immunoprecipitation and PCR, followed by microarray analysis of the products) suggests that there are 600 to 700 MCM sites per S. cerevisiae genome, of which approximately 10% do not colocalize with ORC (341). This is slightly more than the anticipated number of origins (259, 341) but substantially less than the 30,000 estimated MCM molecules per cell (65, 185). Even given the possibility that a significant fraction of the MCMs remain soluble in the nucleus, it would appear that this method underestimates the MCMs on the chromatin, either because they are not tightly associated or because they assemble into very large higher-order structures.
Several experiments suggest that there are multiple MCM complexes associated with a single origin (71, 239, 263). The ratio of MCMs bound to ORC at the origin was examined using a Xenopus in vitro system, in which origins are not defined (71). On an 80-bp fragment of DNA, ORC and MCM assembled at a 1:1 ratio. However, as the fragment length increased, the amount of MCM (but not of ORC) also increased, approaching the estimate of 20 to 40 MCMs per ORC observed for sperm chromatin in Xenopus extracts (71, 331). This observation suggests that MCMs spread away from ORC. Significantly, although Cdc7 kinase can associate with the distal MCMs, origin firing and Cdc45 loading do not occur at the position of all the MCM complexes. This implies that there is a difference between MCMs bound immediately adjacent to the origin at ORC and those distributed along the chromatin, although they seem to be bound equally tightly. Perhaps the spreading MCM complexes restrict the access of additional ORC complexes to the chromatin and thus ensure that potential origins are distributed some distance apart (71).
MCM chromatin binding is regulated by phosphorylation; hypophosphorylation of Mcm2, Mcm3, and Mcm4 in various systems correlates with MCM chromatin binding (46, 89, 108, 224, 317, 358). However, mutation of CDK sites in a single MCM, at least in yeasts, is not sufficient to disrupt replication control, indicating the presence of multiple levels of redundant control (99, 230). Moreover, phosphorylation of MCMs may also affect enzymatic function apart from chromatin binding; in vitro studies indicate that CDK-dependent phosphorylation of Mcm4 also disrupts the helicase activity of the (Mcm4,6,7)2 form of the helicase (131, 134).
Importantly, CDKs negatively regulate pre-RC assembly overall (52, 79, 123, 176, 229) (Fig. 4E). In fission yeast, inactivation of CDK has the dramatic effect of causing massive rounds of rereplication; a similar, although less dramatic, effect is observed in budding yeast (22, 52, 221, 230). When CDK activity or Cdc6 levels are manipulated to disrupt normal origin control, MCMs are reloaded onto the origins and cells rereplicate their DNA inappropriately in a single cell cycle (190, 230, 291, 351).
Other CDK targets include ORC and Cdc6 (23, 73, 79, 88, 108, 131, 134, 138, 176, 197, 225, 229, 233, 328). Phosphorylation of Cdc6 is associated with its degradation (in yeasts) or nuclear export (in mammals) (67, 72, 139, 163, 225, 233, 252, 272). Since Cdc6 is essential for MCMs to load onto chromatin (see above), this has the effect of restricting MCM loading to G1/S, when Cdc6 is present and active. Because high CDK activity blocks pre-RC assembly, this ensures that the firing of an origin also involves its inactivation, thus preventing repeated use of an origin within a single cell cycle.
Additional controls contribute to the prevention of MCM chromatin binding and inappropriate origin activation. A recent study suggests that association of a subset of MCMs with the Crm1 (exportin 1) nuclear export factor is required in Xenopus to prevent rereplication (343). However, this does not appear to involve actual MCM export, and it is likely that the Crm1 association involves only a fraction of the MCMs, which again suggests that there may be distinct pools of MCMs within the nucleus. This is a recurring theme in MCM analysis.
The role of MCMs in origin activation is clear. Biochemical analysis of Xenopus unambiguously revealed an essential function in initiation of DNA synthesis. Execution point mapping of mcm mutants from yeasts clearly implicated them in replication initiation, and studies using synchronized yeast cultures clearly established that these proteins are required for assembly and activation of the pre-RC.
However, these experiments do not eliminate the possibility that MCMs have an additional function later in S phase. In the yeasts, many temperature-sensitive mcm mutants result in arrest at the restrictive temperature with a substantial amount of DNA synthesis, similar to the phenotype associated with mutations in proteins that act later in DNA replication (see, e.g., references 93, 193, 227, and 346). While the temperature-sensitive mutants were assumed to be “leaky” for activity, allowing a subset of origins to fire, this phenotype was also consistent with a later MCM function, a geneticist's argument that was subsequently proven correct. Early experiments offered several tantalizing clues. In fission yeast, spores in which mcm has been deleted germinate and proceed through a slow S phase. This reflects residual, or “maternal,” MCM packaged in the spore. Only if the residual protein is further inactivated with a temperature-sensitive allele do the deleted spores block replication initiation (193). This was interpreted to indicate that relatively little MCM is required for initiation and bulk DNA accumulation but that significant amounts are required for completion of DNA replication, arguing for an additional and essential role later in S phase. Consistent with this, a degron allele of S. pombe mcm4ts that results in rapid and irreversible protein inactivation is the only mcm4+ allele sufficiently stringent to cause initiation defects (194).
The most compelling evidence for a role in replication elongation was provided by an elegant experiment in which an MCM degron allele in synchronized S. cerevisiae cells allowed the protein to be selectively destroyed after replication initiation (178). These cells arrested during S phase with intermediate levels of DNA, showing that the MCMs indeed function after replication initiation; importantly, this was found for five of the six MCMs, only because a functional degron allele of Mcm5 could not be constructed for the experiment. Importantly, this shows that Mcm2 and Mcm3 are indistinguishable from the core complex Mcm4,6,7 even in the elongation stage of S phase. Similar observations using the same technique were made for Cdc45 (312). Thus, the MCMs convert from an assembly factor to an active participant in replication elongation along with Cdc45. Since the structure of the MCMs clearly suggests parallels to known DNA helicases (see above), the MCM complex became the obvious candidate for the cellular replicative helicase.
As described above, biochemical proof that the MCM heterohexamer functions as a helicase has been hard to obtain. Attempts to reconstitute an active helicase using purified proteins from several species have suggested that only the Mcm4,6,7 core complex has helicase activity in vitro. However, this activity is disrupted by addition of the other MCMs, although they are equally required in vivo. One possibility is that the MCM complex requires accessory components and that Cdc45 may be the missing factor (206). In that study, the investigators attempted to purify an active helicase from Xenopus extracts and found that the activity included all six MCM proteins and Cdc45. This is consistent with the evidence from ChIP, suggesting that Cdc45 and MCMs both travel with the replication fork (5). Using a set of nested PCR primers walking outward from an origin, it was determined that MCMs associate with the expanding replication fork with similar kinetics to DNA synthesis factors, and to Cdc45. Cytological studies with Drosophila agree with this, showing that MCMs specifically associate with the origin and then move away as the fork expands (40). These data are consistent with genetic and physical interactions between MCMs and Cdc45 and between RPA and DNA polymerase alpha (83, 172, 215, 324, 325, 361) and would place the MCMs at the fork. Figure 4F presents a speculative model of MCM function at the fork.
Cytological analysis of metazoans shows that MCMs liberally decorate unreplicated chromatin during S phase but are not particularly closely associated with the proteins of the replication fork (61, 166, 199), leading to a competing suggestion that an MCM helicase may function at a distance from the replication fork (179). However, since only a small fraction of the MCMs localize at the origin (MCMs outnumber ORC by approximately 40:1 in Xenopus extracts), the majority of MCMs visualized cytologically are likely to be “remote MCMs” not located at the fork. Biochemical studies using Xenopus extracts suggest that these remote MCMs are bound as tightly as the MCM complexes at the pre-RC (71), although they are not observed in ChIP experiments with yeast. Perhaps once the MCMs are activated by Cdc45 and the GINS complex, they become tightly associated with the replication fork, allowing their identification by ChIP.
What is the purpose of this vast excess of MCMs that cannot be placed at the fork? The cell requires the remote MCMs, because even modest reductions in MCM levels that do not particularly affect replication (71) can lead to genome instability (see, e.g., references 80, 185, and 193). Whatever role they play is probably limited to S phase, because bulk MCMs lose their chromatin association as S phase proceeds (see, e.g., references 37, 46, 153, 159, 166, and 199) and in budding yeast they also leave the nucleus (see, e.g., references 176 and 229).
The abundance of MCMs and the evidence suggesting that they may affect chromatin functions outside the replication fork suggest that these proteins may have additional functions in the cell. Most evidence for this relies on physical interactions identified by coimmunoprecipitation or two-hybrid experiments. These may define functionally accessible regions of the MCMs; for example, the known interactions with Mcm2 all occur in the same region of its N terminus (27, 115, 130), while interactions with Mcm7 occur in the same region of its C terminus (31, 171, 298) (Fig. 5). Some phenotypes are suggestive of roles in chromosome structure, such as the recent observation that Drosophila mcm mutants have defects in chromosome condensation (39, 253). Increasingly, these observations suggest that MCMs, as is the case for other replication proteins, may have roles in a variety of chromosome transactions.
Many studies have suggested interactions between transcription and replication apparatus (reviewed in references 209 and 223). Perhaps not surprisingly, the MCMs have been implicated in several aspects of transcriptional control. Is this a consequence of links between transcription and replication that coordinate cell proliferation, or is it evidence for an active role for MCMs in transcription? Some investigators point to simian virus 40 large T antigen, which is both a replicative helicase and a transcription factor, as a paradigm (reviewed in reference 288).
Several MCM proteins have been shown to associate with the carboxy-terminal domain (CTD) of RNA polymerase II (RNA pol II), and antibodies to Mcm2 inhibit transcription in vitro (349). Amino acids 168 to 212 of Mcm2 associate with the CTD, and point mutations in this region of Mcm2 disrupt the interaction (115). The CTD of RNA pol II is associated with the assembly of proteins involved in transcription initiation, elongation, and chromatin remodeling (reviewed in reference 105). A genetic interaction in S. cerevisiae between CTD domain mutants and an mcm5 mutant suggests that the interaction is required for replication, and the authors suggest that RNA pol II recruited to origins may influence chromatin structure (20, 91, 297).
MCMs may play a more direct role in transcription. Several lines of evidence suggest that the MCM proteins associate with specific transcription factors. During biochemical purification, the Mcm3-Mcm5 heterodimer associates with STAT1α, a transcription factor stimulated by gamma interferon (50, 359). This association is mediated by Mcm5. Residues 250 to 401 of Mcm5 are sufficient for interaction in vitro; although a point mutation at the extreme C terminus disrupts the interaction, it also disrupts the association of Mcm5 with other factors and may cause a structural change in the protein (50, 359). Intriguingly, increasing levels of Mcm5 in cells correlate with increased levels of transcriptional activation; additionally, levels of STAT1α-dependent transcription vary in the cell cycle, reaching a maximum at G1/S (359). The mechanism of this effect is not known. Loss of mcm5 activity in budding yeast leads to increased expression and nuclease sensitivity of genes at the telomeres (70), which may reflect a role in chromatin structure rather than transcription per se (see below).
Mcm7 has also been connected with several transcriptional regulators. In yeast, Mcm7 autoregulates its expression along with the transcription factor Mcm1 (80). Although MCM1 was isolated in the same minichromosome maintenance screen, it does not encode a member of the canonical MCM family; instead, it is a MADS-box transcription factor that is required for expression of a number of cell cycle-regulated genes (248). Interestingly, purified Mcm7 stimulated the binding of purified Mcm1 to promoters in vitro (80), indicating that this effect does not depend on the complete MCM complex, although it is not clear whether Mcm7 forms a stable complex with Mcm1. Moreover, despite the increased binding of Mcm1, Mcm7 in vivo appears to down-regulate its own transcription, acting as a transcriptional repressor. The functional significance of this interaction remains to be worked out.
Mcm7 also interacts with the retinoblastoma tumor suppressor protein (Rb) (298). A major role of Rb is in regulating the activity of the E2F transcription factor family, which is in turn required for the expression of a number of genes involved in the G1/S transition, including the MCMs (reviewed in reference 299). The carboxy-terminal 137 amino acids of Mcm7 also associate with the Rb-related factors p107 and p130 (298). Expression of the interacting fragment of Rb can inhibit DNA replication in vitro, presumably by interfering with the normal function of Mcm7 at the replication origin (298). Consistent with this, the interaction is disrupted by active cyclin D/CDK4 (95). This suggests that Rb family members may actively inhibit DNA replication by sequestering MCM proteins. Whether this association influences the activity of Rb remains to be determined. The C terminus of Mcm7 also interacts with the papillomavirus E6 oncoprotein (171), which may be involved in ubiquitylation (see below) (170).
The structure of the chromatin and the pattern of histone modifications within it have been proposed to generate an epigenetic code that is crucial for regulated gene expression (reviewed in references 32, 98, 106, and 142). The replication community has only recently begun to explore the connections between histone acetylation and deacetylation and replication competence. For example, deleting the S. cerevisiae histone deacetylase (HDAC) RPD3 correlates with increased origin acetylation and earlier origin firing (329). A similar effect is observed if the Gcn5 histone acetyltransferase (HAT) is tethered next to an origin (329). Interestingly, tethering of PCAF/Gcn5 HAT adjacent to the polyomavirus replication origin also stimulates replication in animal cells (342). Early studies suggested that regions of the chromatin associated with MCM proteins are more susceptible to nuclease digestion, indicating that these domains may be less tightly compacted (118, 262). There is also a Cdc7-dependent change in origin topology, detected by genomic footprinting (92). These observations are consistent with a requirement for more open, histone-acetylated chromatin around a replication origin (reviewed in reference 209).
Budding yeast mcm5 mutants show increased expression and nuclease sensitivity of genes at telomeres (70). Telomeric gene expression in this system is modulated by the balance between the MYST family HAT called Sas2 and the HDAC called Sir2 (158, 301). Do mcm5 mutants disrupt this balance? Overexpression of a member of the Gcn5-containing SAGA complex partially suppresses the phenotypes associated with the mcm5 allele (70). This could indicate direct interactions such as those observed with polyomavirus large T antigen (342). Alternatively, MCMs may influence, or be sensitive to, the histone modifications mediated by this complex.
Evidence linking the MCMs directly to assembly or maintenance of chromatin structure is still incomplete but is nevertheless suggestive. Several interactions with chromatin-associated proteins occur through the amino terminus of Mcm2 (Fig. 5). Mcm2 binds histone H3 sufficiently tightly that it is possible to purify the MCM complex on a histone affinity column (129); this association requires residues 63 to 153 of mouse Mcm2 (130). When incubated in vitro with histones H3 and H4 and plasmid DNA, Mcm2 by itself promotes the assembly of a putative nucleosome-like structure (132). Studies in vitro and in vivo show that human Mcm2 interacts with a protein called Hbo1, a member of the MYST family of HATs that also interacts with ORC (27, 126). This association occurs through residues 73 to 223 of Mcm2, and a point mutation in a conserved leucine in this region disrupts the interaction in two-hybrid assays. This mutation is likely to be specific to the interaction and does not generally disrupt the Mcm2 structure, because it can be suppressed in vitro by a corresponding mutation in the zinc finger of Hbo1 (27). Does this interaction indicate that MCMs target Hbo1 to the replication origin to acetylate histones? Or does Hbo1 target MCMs to specific regions to modulate silencing, which is linked to the S phase (reviewed in reference 10)? Could this be part of a larger role for MCMs in genome stability? Other members of the MYST family of HATs are implicated in DNA repair (15, 127). In addition, HATs are not limited to acetylating histones. The PCAF and Gcn5 HATs that stimulate polyomavirus replication do so by acetylating the polyomavirus large T antigen, which is required for origin binding and replication (342); in human cells, Mcm3 is also acetylated (see “Modifications of MCM” below). Thus, Hbo1 might directly modify MCMs. Interactions between MYST family HATs and MCMs are also observed in fission yeast (E. B. Gómez and S. L. Forsburg, unpublished observations), providing a genetic model for their study. Together, these data suggest that MCMs may be more active contributors to chromatin structure during S phase.
Genetic analysis of the yeasts suggests that reduction of MCM activity is associated with genome instability and DNA damage (see, e.g., references 111, 185, and 193). Recent work suggests that MCMs may play a direct role as a target of the checkpoint response system. S-phase checkpoints consist of overlapping kinase cascades (reviewed in references 19, 29, 238, 242, and 261). The replication checkpoint is activated in response to replication arrest, such as that caused by the drug hydroxyurea (HU). The damage checkpoint is activated by DNA lesions or double-strand breaks (DSBs). The activated checkpoints cause cell cycle arrest, protect the replication fork from collapse, and activate repair.
Checkpoints may be active participants during S phase to identify and repair any lesions resulting from passage of the replication fork. For example, the Mrc1 protein, which acts with S. cerevisiae Rad53/S. pombe Cds1 as part of the replication checkpoint, has recently been shown to be required for efficient replication elongation, a function genetically separable from its role in checkpoint signaling (241). Its assembly at the origin requires DDK activity, reminiscent of Cdc45. In vertebrate cells, the Rad17 protein, a component of the checkpoint signaling apparatus, associates with early replication foci even in the absence of damage (51), and the Hus1 protein, which acts with Rad17, is chromatin associated during normal S phase (356). Thus, interactions between replication and checkpoint proteins may be a normal response to the mutagenic potential of S-phase progression; this is consistent with observations suggesting that S. cerevisiae Mec1 (S. pombe Rad3/human ATR) prevents replication fork stalling (30).
When the replication checkpoint is activated by HU treatment, checkpoint proteins are recruited to the replication fork (51, 241). In the absence of the replication checkpoint kinase S. cerevisiae Rad53/S. pombe Cds1, cells treated with HU suffer replication fork collapse, and replication proteins including polymerases are lost from the fork. This has led to the model that the replication kinases recruited to the fork act on replication proteins to arrest replication and promote fork stabilization until DNA synthesis can continue (41, 292, 313).
In fission yeast, mcm mutants at the restrictive temperature suffer DSBs (J. M. Bailis and S. L. Forsburg, unpublished data) and activate the DNA damage checkpoint (192, 193, 202). If loss of MCMs causes damage, protecting the MCMs at the fork is a logical target of checkpoint activation. Mouse Mcm4 undergoes HU-specific phosphorylation by the checkpoint kinase ATR and by Cdk2 (133). This phosphorylation inactivates the Mcm4,6,7 helicase in vitro, consistent with the role of the checkpoint in shutting down replication when cells are starved for nucleotides following HU treatment (see, e.g., references 64, 273, and 285). Recent isolation of an HU-sensitive allele of mcm4 from fission yeast further suggests that Mcm4 may be a checkpoint target; other mcm4 alleles are not HU sensitive (T. Nakagawa and H. Masukata, personal communication). This effect may not be limited to Mcm4; there is now evidence that Mcm7 interacts with the Rad17 checkpoint protein, again through its C terminus (C.-C. Tsao and R. Abraham, personal communication). This could indicate that MCMs recruit or anchor checkpoint proteins at stalled replication forks.
As described above, dysregulation of MCMs by reducing or increasing the levels of a single MCM leads to disruptions in genome stability in yeast (see, e.g., references 185, 193, and 346). Since MCM activity is essential for DNA replication in dividing cells and is lost in quiescence (200), MCMs are obvious markers for proliferation. Molecular studies suggest that increased levels of MCMs mark not only proliferative malignant cells (85, 97, 114, 135, 212, 260, 265, 338) but also precancerous cells and the potential for recurrence (3, 124, 310). They thus may prove to be effective markers for tumor diagnosis.
Additionally, MCMs may be contributors to cancer. Proteins implicated directly in cancers are known to modulate MCM activity. As described above, Mcm7 physically interacts with Rb, presumably to reduce proliferative capacity (298), and with the papillomavirus transforming protein E6 (171). The MYCN transcription factor, amplified in neuroblastoma, upregulates Mcm7 expression, which could contribute to hyperproliferation in these cells (286). The same region of Mcm7 also interacts with a heart-specific LIM domain protein, FHL2, which is upregulated in various cancers (31). Given the range of new roles in which MCMs are now implicated, these interactions may not be limited to effects on replication.
MCMs are known to be modified by a variety of covalent attachments including phosphorylation, acetylation, and ubiquitylation. It is likely that additional modifications and the responsible enzymes will be identified in the future, providing additional levels of regulation. These modifications may provide the mechanisms for cells to distinguish between different pools of MCMs in the nucleus and to activate distinct functions of these proteins in vivo.
As described above, the function of at least two kinases is required for S-phase initiation: DDK, which phosphorylates primarily Mcm2 but also other MCM subunits, and CDK, which phosphorylates at least Mcm4 and perhaps other subunits. Other kinases may also be involved. There is evidence for phosphorylation of Mcm4 and other MCMs that is not CDK or DDK dependent (250). Additionally, a recent study suggests that Mcm4 may be a target of the ATR-Chk1 checkpoint kinase pathway in response to replication arrest caused by HU (133). Thus, the MCM proteins are targets of both positive and negative phosphorylation events.
The identity of the phosphatase(s) that dephosphorylates MCMs remains a mystery. We can infer that there is one, because Mcm4 phosphorylation is associated with its inactivation (see, e.g., references 46 and 108), and there is no evidence that the abundant MCMs turn over significantly during the cell cycle. The only evidence for a phosphatase associated with MCM function is the observation that protein phosphatase 2A is required for binding of Cdc45 to the pre-RC (38). Since this is a positive effect, it suggests that there is an inhibitory kinase. However, the identity and substrate(s) of this kinase are unknown.
Ubiquitin is a small peptide that is covalently linked to lysine residues in the target proteins (reviewed in references 277 and 326). Chains of ubiquitin target proteins for degradation by the proteasome. More recent studies have indicated that ubiquitin and the related peptides SUMO and NEDD8 can also modify proteins to regulate them and can affect localization or protein association in addition to protein stability (reviewed in references 222, 277, and 352). Although there is not yet evidence for sumoylation or neddylation, it is likely that the MCMs will be substrates for a broad range of related modifications.
Genetic experiments with budding yeast isolated an allele of UBA1, a ubiquitin-activating enzyme, as a suppressor of an mcm3 mutant (34). This suggested that Mcm3 may be negatively regulated by ubiquitylation. Subsequent studies reveal that a fraction of wild-type S. cerevisiae Mcm3 is polyubiquitylated in vivo, although the consequences of this are not yet clear. Human Mcm7 is polyubiquitylated by the ubiquitin ligase E6-AP, which acts with the papillomavirus HPV-18 E6 protein to form a virus-specific ubiquitin ligase (170). Data suggest that this targets Mcm7 for turnover by the proteasome. A homotypic binding site for the ligase was identified between residues 633 and 654 of Mcm7, defining an “L2G box,” (S/T)xxxLLG. In vivo studies show that Mcm7 is also polyubiquitylated in the absence of the E6-AP protein, suggesting that it is a substrate for ubiquitylation even in noninfected cells.
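The “L2G box” consensus quoted above, (S/T)xxxLLG, can be read as a simple sequence pattern: a serine or threonine, any three residues, then Leu-Leu-Gly. As a minimal sketch only, assuming one-letter amino acid codes and that "x" means any residue, a scan for this pattern might look like the following. The function name `find_l2g_boxes` and the toy peptide are hypothetical illustrations; the peptide is not the Mcm7 sequence, and a pattern match by itself says nothing about whether a site binds the ligase.

```python
import re

# Regex reading of the consensus (S/T)xxxLLG from the text:
# S or T, any three residues, then L-L-G.
# The lookahead lets overlapping candidate sites be reported as well.
L2G_BOX = re.compile(r"(?=([ST].{3}LLG))")


def find_l2g_boxes(protein_seq):
    """Return (1-based start position, matched residues) for each pattern match.

    Hypothetical helper for illustration; input is a one-letter-code string.
    """
    seq = protein_seq.upper()
    return [(m.start() + 1, m.group(1)) for m in L2G_BOX.finditer(seq)]


if __name__ == "__main__":
    # Toy peptide invented for this example, NOT the real Mcm7 sequence.
    toy_peptide = "MKASAQELLGRRTPVKLLGA"
    for start, residues in find_l2g_boxes(toy_peptide):
        print(start, residues)
```

Positions are reported with 1-based numbering to match the residue numbering convention used in the text (for example, residues 633 to 654 of Mcm7).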
While the steady-state level of bulk MCMs remains fairly constant, as described above, it is possible that a fraction of MCMs are subject to regulated turnover. This could provide one mechanism to define functional pools within this abundant protein family.
Acetylation is now appreciated as a significant modification for many cellular proteins and is not limited to histones (reviewed in reference 165). As described above, the histone acetyltransferase PCAF/Gcn5 is reported to stimulate replication from a viral origin by contributing to the acetylation of polyomavirus T antigen, a viral helicase similar to MCMs (342). This suggests that the interaction of MCMs with other HATs such as Hbo1 may result in MCM acetylation. Therefore, the target of the HATs to which MCMs bind may not be histones but may be MCMs themselves (see “Chromatin remodeling” above).
Mcm3 protein is acetylated in mammalian cells by a protein called MCM3AP (308), which was originally isolated in a two-hybrid screen for Mcm3 interactors (309). The acetylated Mcm3 is associated with the chromatin and is not observed in cells arrested in G2/M, which suggests that acetylation is involved in regulating Mcm3 specifically during G1/S, when it is chromatin bound. Curiously, MCM3AP inhibits DNA replication in a cell-free system, suggesting that it is a negative regulator (307, 308); however, it clearly functions positively by promoting Mcm3 nuclear localization and chromatin binding (307, 309). This paradox has yet to be resolved. MCM3AP is distantly related to the PCAF/Gcn5 family of HATs (308), although it does not have a close homologue in the yeasts. It is a splice variant of a much larger protein called GANP that is found in B cells (1). GANP is associated with DNA primase activity, leading to the suggestion that it is specifically involved in the regulation of B-cell proliferation (174, 175).
The MCM proteins have traveled a long way from their identification nearly 20 years ago in a yeast screen for chromosome loss mutants. While for many years their role in replication initiation was thought to be their only function, it never explained the MCM paradox, i.e., their abundance and wide distribution in regions outside of replicating DNA. Thus it is gratifying that current studies implicate these factors in a range of additional chromosome transactions including transcription and chromatin remodeling. Insights derived from recent structural studies, combined with an expanding repertoire of MCM-interacting proteins, promise further advances in our understanding of this ubiquitous hexamer. However, puzzles and paradoxes remain. We still do not understand the relationship between the core helicase, which has been enzymatically characterized in vitro, and the larger heterohexamer, which is the predominant in vivo form. We have little insight into the way in which different pools of MCMs are distinguished inside the cell, whether by expression, location, modification, or activity. Finally, it is still unclear why these proteins need to be so abundant. While molecular mechanisms are still elusive, we can safely conclude that MCMs, through a variety of activities, are fundamental contributors to the integrity of the eukaryotic genome.
A distant MCM relative has recently been identified in Drosophila (H. Matsubayashi and M.-T. Yamamoto, Genes Genet. Syst. 78:363-371, 2003). However, unlike Mcm2-8, it is missing the highly conserved IDEFDKM motif and most other canonical sequences.
I am indebted to generous colleagues, including Robert Abraham, Steve Bell, Grant Brown, Brian Calvi, Mel DePamphilis, Tony Hunter, Yukio Ishimi, Wei Jiang, Tom Kelly, Hisao Masai, Hisao Masukata, Mike O'Donnell, Tony Schwacha, Robert Sclafani, Bruce Stillman, Bik Tye, and Teresa Wang, who shared new data, reprints, and helpful discussions. I also thank Julie Bailis, Eliana Gómez, and an anonymous reviewer for careful reading of the manuscript and many helpful comments.
Work in my laboratory is supported by grants from the American Cancer Society (RSG-00-132-04-CCG) and the National Institutes of Health (R01 GM59321).