Contrapodal genomic organization of human genes encoding MCM8 and GCD10
By searching the dbEST nucleotide database of sequence tags from expressed mRNAs with the protein sequence of MCM7 using the tBLASTn algorithm (19), we identified several clones containing ESTs from a sequence similar, but not identical, to MCM7. Segments of these sequences could be arranged contiguously to form an open reading frame (ORF). Selected IMAGE cDNA clones (GenBank accession nos BE278386, BG023796 and BG422937) were then ordered from the ATCC and Incyte Genomics for sequencing to confirm the limits of the ORF and to identify sequences for additional searches. The resulting deduced ORF is bounded at the 5′ end by an ATG start codon, upstream of which are TAA or TAG stop codons in all three reading frames. It is bounded at the 3′ end by a TAA stop codon, 200 nt downstream of which is a potential AATAAA polyadenylation signal. Further searching of the HTGS database of human genomic sequences with unique sequences from this ORF using the tBLASTn algorithm revealed a BAC clone containing a unique gene encoding the new ORF at chromosome band 20p12.3–13 (GenBank accession no. AL035461). This gene is composed of 19 exons, comprising 43 796 genomic bp, and it encodes precisely the ORF deduced by our sequencing of ESTs. A complete mRNA sequence encoding this protein is available. Because of protein homologies to be described in this article, this protein is considered a member of the MCM protein family. Although no data yet link this protein to minichromosome maintenance, the name MCM is given to the new protein and the new gene, since it reflects family sequence similarities. Family members MCM2–MCM7 have been identified in humans, as has protein MCM10. Since the name MCM8 has not yet been used, the new family member is given the name MCM8. While this manuscript was undergoing initial review, a paper by a separate group reported the same protein, also terming it MCM8 (20). That paper did not report on the MCM8 genomic organization.
This new MCM family member has certain aspects unique among MCM proteins. Notably, it has no clear homolog in yeast. The N-terminal 250 amino acids of MCM8 are quite different from the N-termini of other MCMs. When a database search is initiated with the N-terminal sequences of MCM8, significant homologs are found in Drosophila, roundworms and the plant Arabidopsis, but no homologs are detected in yeast or lower organisms. This suggests that MCM8 evolved, perhaps from another MCM family member, at some point in metazoan development. The central region of MCM8 is very similar to those of all the other MCMs, especially that of MCM7. Further aspects of this MCM8 region will be considered below.
Intriguingly, the MCM8 gene is oriented head-to-head with another gene in the same BAC (GenBank accession no. AL035461; Fig. A). This gene consists of 11 exons comprising 12 698 genomic bp and encodes two alternative splice forms of mRNA, as determined by comparison to multiple recorded ESTs. The protein encoded by this gene is homologous to yeast protein P41814, termed the Gcd10p protein. Gcd10 from humans has not yet been described, but in S. cerevisiae the protein is an RNA-binding protein involved in mediating mRNA translation (21). Based on the significant homology, the new human coding sequence located contrapodal to MCM8 at 20p12.3–13 is named hGCD10. The start points of several extended cDNAs representing mRNAs for MCM8 and GCD10 are indicated by circles with arrows in Figure B. The 5′ start points, sequenced by us, include that for the EST from choriocarcinoma lacking codons for 16 amino acids, AY158211 (hMCM8, submitted by us 3 October 2002). Of the start points indicated in Figure B, that for hMCM8 at +1 is represented in two clones, and that for hGCD10 at –476 is found in three clones, making these at present the most common start points for the two genes. It is not known whether start-point usage is specific to particular tissues or developmental stages. We have recently reported that certain other genes located in head-to-head fashion are each characterized by an array of transcriptional start points (22). Another GenBank entry for the MCM8 transcript, AJ439063, has an apparent start point located at –364. The entire sequence separating the transcription units represented in Figure B is highly GC-rich and contains no obvious candidates for TATA boxes for either gene. It does possess one potential CCAAT element, located less than 30 bp upstream of one hGCD10 start point. The region contains potential elements for binding transcription factors Sp1, Ets-1, E2F-1 and Purα, as indicated in Figure B. At this time there is no evidence that any of these transcriptional control elements is functional in the depicted region. The region separating the transcription units also contains multiple CpG units. In fact, the ratio of CpG to GpC in the 500 bp shown in Figure B is 1.07, a value high for mammalian GC-rich sequences and one that qualifies this region as a CpG island. The configuration of TATA-less promoters containing Sp1 elements is characteristic of several genes recently reported to be oriented in contrapodal fashion (22). No further characterization of the hGCD10 gene or its encoded protein was undertaken for the present report, which focuses on expression and preliminary functional characterization of MCM8.
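The CpG-to-GpC ratio quoted above can be illustrated with a minimal sketch. The function names and the toy sequence are assumptions made for illustration only; the 500 bp intergenic sequence itself is not reproduced here, and the exact counting convention used for the published value of 1.07 is not stated in the text. The sketch counts overlapping dinucleotides on a single strand.

```python
def count_dinucleotide(seq: str, dinuc: str) -> int:
    """Count overlapping occurrences of a dinucleotide on one strand."""
    seq = seq.upper()
    dinuc = dinuc.upper()
    return sum(1 for i in range(len(seq) - 1) if seq[i:i + 2] == dinuc)


def cpg_to_gpc_ratio(seq: str) -> float:
    """Ratio of CpG (5'-CG-3') to GpC (5'-GC-3') dinucleotide counts."""
    cpg = count_dinucleotide(seq, "CG")
    gpc = count_dinucleotide(seq, "GC")
    if gpc == 0:
        raise ValueError("sequence contains no GpC dinucleotides")
    return cpg / gpc


def gc_fraction(seq: str) -> float:
    """Fraction of bases that are G or C."""
    seq = seq.upper()
    return sum(1 for base in seq if base in "GC") / len(seq)


# Toy sequence, NOT the MCM8/hGCD10 intergenic region.
toy_seq = "ACGCGT"
toy_ratio = cpg_to_gpc_ratio(toy_seq)
toy_gc = gc_fraction(toy_seq)
```

In bulk mammalian DNA, CpG is depleted relative to GpC, so a ratio near or above 1 over a GC-rich stretch is the kind of signal the text uses to call the region a CpG island.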
Figure 1 (A) Genomic organization of human MCM8 and GCD10 genes at chromosome band 20p12.3–13. Arrows indicate direction of transcription of the two genes. Gray boxes represent MCM8 exons, black boxes GCD10 exons. Translational start codons are indicated…
An aberrant MCM8 mRNA form in choriocarcinoma
Our sequencing of IMAGE clone 3546350 (GenBank accession no. BE278386) revealed an anomaly in the encoded MCM8 ORF: a deletion of codons 342–357, encoding 16 amino acids that are present in many reported ESTs as well as in the genomic sequence of the ORF. This IMAGE clone is a cDNA from an mRNA expressed in tumor tissue in a case of placental choriocarcinoma. In the genomic sequence of BAC RP5-967N21 (GenBank accession no. AL035461), obtained from non-cancerous cells, the codons missing in the choriocarcinoma clone are present in exon 10. This anomaly could represent a genomic deletion in the tumor tissue DNA, a polymorphism found in non-cancerous or tumor DNA, or aberrant processing of the mRNA in the tumor tissue. An examination of splicing in MCM8 mRNA, as exemplified by comparison of a full-length cDNA clone with genomic DNA, offers clues as to the apparent anomaly in choriocarcinoma. In Figure , which represents genomic DNA, the nucleotides missing in the choriocarcinoma mRNA are underlined. In the usual case …gaa g of exon 9 is a splice donor site to gt tct… of exon 10, the splice acceptor site. These sequences are standard for splice donor and acceptor sites. The splice forms the codons for amino acids EGS, beginning with codon 342. In the choriocarcinoma case it is exactly as though the …gaa g of exon 9 is spliced to the sequence ca aat…, occurring 48 nt downstream of the beginning of the usual exon 10 splice acceptor site. In this case the resulting nucleotide sequence encodes amino acids EAN. The ca aat… is not a canonical splice acceptor site. Nevertheless, it is quite conceivable that it has acted as a cryptic splice acceptor site in this case. The alternative explanation, that the mRNA deletion results from a genomic DNA deletion, cannot presently be ruled out. This cDNA may be a valuable mutant in studies of MCM8 function.
Figure 2 Missing codons in an MCM8 transcript from a case of choriocarcinoma. Genomic DNA in the region of exons 9 and 10 is depicted showing codon triplets. Normal splicing, employing canonical donor and acceptor sites, removes an intron of 242 bp, as shown…
Distribution of expression of MCM8 mRNA in non-cancerous and neoplastic human tissues
Tissue distribution of MCM8 mRNA expression was examined using a cDNA panel of multiple adult human tissues, as shown in Figure (N, non-cancerous). The G3PDH controls show that overall mRNA representation is approximately the same in each lane. The amplified MCM8 bands (top) reveal considerable variation in expression in different tissues. In addition, only a band of 317 bp is seen in each lane, indicating that only appropriately spliced exons 9 and 10 are expressed in each tissue. Note that the samples in this panel, obtained from Clontech, are from non-cancerous tissues. Highest levels of MCM8 expression are seen in the placenta, lung and pancreas. Skeletal muscle and kidney lanes have little or no amplified MCM8 bands. There is a faint, but visible, MCM8 band in the brain sample lane. The levels of MCM8 expression do not necessarily correlate with levels of cell proliferation, i.e. with expected levels of DNA replication. For example, MCM8 amplification is significant in the heart sample lane, although adult heart generally does not have a high percentage of proliferating cells. Caution should be exercised, however, in interpreting this type of distribution study since differences in exsanguination of tissues could influence levels of proliferating cells. It is interesting that placenta, with relatively high levels of MCM8 expression, has only the 317 bp band and not the 269 bp band that would be generated by the aberrant mRNA. The aberrant mRNA described was from a choriocarcinoma obtained from the placenta.
Figure 3 Tissue distribution of expression of MCM8 mRNA. (N, non-cancerous) PCR amplification of an indicator segment comprising 317 bp of MCM8 mRNA or a 269 bp segment comprising the region with a deletion in eight different non-cancerous tissue types. G3PDH…
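The two band sizes follow from simple arithmetic on the deleted codons, which can be checked with a short sketch. It assumes that both PCR primers lie outside the deleted region, so that the deletion shortens the amplicon by exactly the number of missing nucleotides; the function names are illustrative and not taken from the methods.

```python
CODON_LENGTH_NT = 3


def deleted_nucleotides(first_codon: int, last_codon: int) -> int:
    """Nucleotides removed when codons first..last (inclusive) are missing."""
    n_codons = last_codon - first_codon + 1
    return n_codons * CODON_LENGTH_NT


def aberrant_amplicon_bp(normal_bp: int, first_codon: int, last_codon: int) -> int:
    """Expected amplicon size when the primers flank the deleted codons."""
    return normal_bp - deleted_nucleotides(first_codon, last_codon)


# Codons 342-357 are missing from the choriocarcinoma clone.
deleted_nt = deleted_nucleotides(342, 357)
# The normally spliced indicator segment is 317 bp.
aberrant_bp = aberrant_amplicon_bp(317, 342, 357)
```

Sixteen codons correspond to 48 nt, the same distance by which the apparent cryptic acceptor site lies downstream of the usual exon 10 acceptor, and 317 bp minus 48 bp gives the 269 bp band expected from the aberrant mRNA.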
In a panel of cDNAs representing mRNAs expressed in various tumor tissues MCM8 was detected by PCR in every sample (Fig. , T, tumor). There were only slight variations in overall expression level, although colon and lung tumor samples were relatively high. There is variability in expression levels in the lung tumor samples, but the significance of this can only be evaluated upon comparison of matched non-cancerous and tumor samples from the same individual. One notable result of this tumor tissue study is that the 269 bp aberrant mRNA found in the choriocarcinoma is not detected in these other tumors. It remains to be determined whether other alterations that would affect MCM8 gene expression occur in different neoplastic tissues or stages.
MCM8 mRNA expression was compared in matched non-cancerous (N) and tumor (T) samples from selected individuals (Fig. , bottom panel). The tumors examined were a metastatic colon adenocarcinoma and a malignant lung carcinoid tumor. These cDNA samples were normalized for total mRNA amounts. MCM8 expression in the lung tumor (T) showed no change relative to matched non-cancerous lung tissue (N), as seen by comparison to control expression of ribosomal protein S9 in each tissue. In contrast, MCM8 expression in the colon tumor was significantly reduced relative to non-cancerous tissue, as compared with S9 expression. S9 expression was increased in the tumor tissue. Previous reports have indicated enhanced expression of ribosomal protein S9 mRNA in a minor percentage of cases of colon cancer (27). In the colon tumor, MCM8 expression is several-fold lower both relative to its level in non-cancerous tissue (N) and relative to the level of S9 expression. These comparisons, shown in Figure (bottom panel) at 34 PCR cycles, were essentially the same over a range of different cycles of amplification.
To further examine the link between reduced MCM8 gene expression and colon cancer, mRNAs from three additional cases of colon adenocarcinoma and one case of rectal carcinoma were analyzed. All samples were commercially obtained (Clontech) together with matched non-cancerous colon tissue samples and all had been prepared for PCR with first strand cDNA synthesis. Control S9 ribosomal protein mRNA was assayed to standardize mRNA amounts. The comparisons of non-cancerous (N) and tumor (T) MCM8 amplified DNA and of control S9 DNA are shown in Figure . There is no change in MCM8 mRNA levels between non-cancerous and tumor rectal tissue. It can be seen that in each case of colon adenocarcinoma MCM8 gene expression is several-fold lower than in matched non-cancerous tissue. In contrast, there is no significant difference in S9 mRNA levels in non-cancerous and tumor tissue. Densitometry of the bands obtained from all four patients analyzed reveals that the average level of MCM8 mRNA in the colon tumor tissue is only 0.153 ± 0.044 that of the level in the matched non-cancerous colon tissue.
Figure 4 Reduced levels of MCM8 mRNA in colon adenocarcinoma. All samples of matched non-cancerous (N) and tumor (T) mRNAs were commercially prepared with first strand cDNA synthesis. PCR was performed as described in Figure . Samples for electrophoresis…
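A tumor-to-normal ratio of the kind summarized by densitometry can be sketched as follows. This is only an illustration under stated assumptions: that each MCM8 band is normalized to the S9 band from the same sample, and that the reported spread is a sample standard deviation (the text does not say whether it is a standard deviation or a standard error). The band intensities are invented placeholders, not the measured values.

```python
from statistics import mean, stdev


def normalized_ratio(mcm8_tumor: float, s9_tumor: float,
                     mcm8_normal: float, s9_normal: float) -> float:
    """Tumor/normal MCM8 ratio after normalizing each band to its S9 control."""
    return (mcm8_tumor / s9_tumor) / (mcm8_normal / s9_normal)


# Placeholder intensities in arbitrary units, one tuple per patient:
# (MCM8 tumor, S9 tumor, MCM8 normal, S9 normal). NOT data from the study.
placeholder_bands = [
    (20.0, 100.0, 100.0, 100.0),
    (10.0, 100.0, 100.0, 100.0),
    (30.0, 100.0, 100.0, 100.0),
]

ratios = [normalized_ratio(*bands) for bands in placeholder_bands]
ratio_mean = mean(ratios)
ratio_sd = stdev(ratios)  # sample standard deviation (n - 1 denominator)
```

Normalizing to S9 within each sample guards against differences in cDNA input between the matched tissues, which is the role the S9 control plays in the comparison described above.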
Association of MCM8 with other MCM proteins involved in initiation of DNA replication
All known MCM proteins have been detected in association with the DNA replication apparatus in various organisms (1). MCM2–MCM7 can be isolated as a distinct complex, and among these, MCM4, MCM6 and MCM7 from mammalian cells have been reported to function as a helicase complex (6). These proteins co-isolate through several purification steps culminating in DEAE cellulose chromatography (6). We sought to determine whether MCM8 associates with other MCM proteins, particularly MCM4, MCM6 and MCM7. Figure A shows that MCM8 can be co-immunoprecipitated with anti-MCM4, anti-MCM6 or anti-MCM7 antibodies. Figure B shows that MCM8 co-isolates with MCM6 and MCM7 through DEAE cellulose column chromatography. MCM8 is visualized upon SDS–PAGE and staining with anti-MCM8 antibody as a band of slightly less than 100 kDa, migrating slightly more rapidly than MCM6 and slightly retarded relative to MCM7. MCM8 has a calculated molecular weight of 92 kDa, based on its length of 840 amino acids, but its migration at nearly 100 kDa is consistent with the anomalous migration of several of the MCM proteins.
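The 92 kDa figure is consistent with a common rule of thumb, sketched below, that takes an average residue mass of about 110 Da. This is an approximation introduced here for illustration; the text does not state how the molecular weight was calculated, and an exact value would require summing the masses of the actual residues.

```python
# Rule-of-thumb average mass per amino acid residue, in daltons (an assumption).
AVERAGE_RESIDUE_MASS_DA = 110.0


def approx_mass_kda(n_residues: int) -> float:
    """Rough protein mass in kDa estimated from chain length alone."""
    return n_residues * AVERAGE_RESIDUE_MASS_DA / 1000.0


# MCM8 is 840 amino acids long.
mcm8_mass_kda = approx_mass_kda(840)
```

For 840 residues the estimate is about 92.4 kDa, in line with the stated value and somewhat below the apparent size of nearly 100 kDa on SDS–PAGE.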
Figure 5 Co-isolation and co-immunoprecipitation of MCM8 with other MCM proteins in a potential helicase complex. (A) Co-immunoprecipitation of MCM8 from a HeLa cell lysate using antibodies against either MCM4, MCM6 or MCM7 and agarose beads conjugated…
Glycerol gradient centrifugation helps in assessing the functional state of MCM protein complexes. In Figure C centrifugation of a complex isolated from the DEAE column of Figure B is compared with centrifugation of complexes in a HeLa cell soluble extract. In the HeLa extract, which is a whole cell lysate cleared by centrifugation, shown in Figure C (bottom panel), MCM8 is detected in several gradient peaks. The lowest density peak at fraction 6 does not coincide with the presence of any other MCM proteins assayed. It may represent free MCM8. It cannot be immunoprecipitated with antibodies to MCM4, MCM6 or MCM7. Another sharp peak at fraction 10 coincides precisely with a sharp peak of MCM6. MCM7 does not have a peak at this position. Another MCM8 peak at fraction 14 coincides with the sole peak of MCM7. MCM6 sediments in two relatively sharp peaks, both of which coincide or overlap considerably with MCM8 and one of which overlaps with that of MCM7. Results of centrifugation of the DEAE isolate (Fig. C, top panel) are straightforward. MCM8 sediments in one peak at fraction 14, a peak that is also seen in the bottom panel. MCM7 from the DEAE isolate also peaks at this position, almost exactly as it does in the bottom panel. MCM6 and MCM4 from the DEAE isolate also peak at fraction 14, although for clarity only MCM7 is shown in Figure C (top). Thus, this peak at fraction 14 most likely represents the complex of MCM4, MCM6 and MCM7, which at times also includes a portion of MCM8. By integrating the area under the MCM8 peaks in the bottom panel it can be estimated that the maximum portion of MCM8 present in the MCM complex peaking at fraction 14, the potential MCM4,6,7 complex, is ~32%. In contrast, in the non-synchronous HeLa lysate nearly all MCM7 is in this complex. These results also suggest a dynamic association of different MCM proteins, since the highest peak of MCM8 in the HeLa lysate coincides with one of MCM6 but not of MCM7.
More work would have to be done to establish that these coinciding peaks represent a complex. The results indicate that MCM8 associates with other MCM proteins that play a role in DNA replication. At this point, however, the results do not unequivocally implicate MCM8 itself in DNA replication since it is conceivable that the MCM proteins are involved in other cellular functions.
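The estimate of the portion of MCM8 in the fraction 14 peak rests on comparing areas under a gradient profile. A minimal sketch of that calculation follows, using trapezoidal integration. The profile values and the window boundaries are invented for illustration and are not the measured band intensities; how the peak boundaries were chosen is not described in the text.

```python
def trapezoid_area(fractions: list[float], signal: list[float]) -> float:
    """Area under a sampled profile by the trapezoidal rule."""
    area = 0.0
    for i in range(1, len(fractions)):
        width = fractions[i] - fractions[i - 1]
        area += 0.5 * (signal[i] + signal[i - 1]) * width
    return area


def fraction_in_window(fractions: list[float], signal: list[float],
                       start: float, end: float) -> float:
    """Share of the total area that falls between fractions start and end."""
    total = trapezoid_area(fractions, signal)
    keep = [i for i, f in enumerate(fractions) if start <= f <= end]
    window = trapezoid_area([fractions[i] for i in keep],
                            [signal[i] for i in keep])
    return window / total


# Invented profile with peaks at fractions 6, 10 and 14 (arbitrary units).
gradient_fractions = [float(f) for f in range(4, 17)]
mcm8_signal = [0.0, 1.0, 3.0, 1.0, 0.0, 2.0, 6.0, 2.0, 0.0, 1.0, 4.0, 1.0, 0.0]

# Share of signal in the peak centered on fraction 14 (window 12 to 16).
share_in_peak_14 = fraction_in_window(gradient_fractions, mcm8_signal, 12.0, 16.0)
```

Because neighboring peaks can overlap, a window-based share like this is best read as an upper bound, which matches the description of the published figure as a maximum portion.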