We have identified a new class of ribosomal protein (RP) genes that appear to have been retrotransposed from X-linked RP genes. Mammalian ribosomes are composed of four RNA species and 79 different proteins. Unlike RNA constituents, each protein is typically encoded by a single intron-containing gene. Here we describe functional autosomal copies of the X-linked human RP genes, which we designated RPL10L (ribosomal protein L10-like gene), RPL36AL and RPL39L after their progenitors. Because these genes lack introns in their coding regions, they were likely retrotransposed from X-linked genes. The identities between the retrotransposed genes and the original X-linked genes are 89–95% in their nucleotide sequences and 92–99% in their amino acid sequences, respectively. Northern blot and PCR analyses revealed that RPL10L and RPL39L are expressed only in testis, whereas RPL36AL is ubiquitously expressed. Although the role of the autosomal RP genes remains unclear, they may have evolved to compensate for the reduced dosage of X-linked RP genes.
Ribosomes comprise the protein synthesis machinery that is essential for all living cells. Because of the fundamental role played by ribosomes in the growth and development of organisms, their structure and function have been significantly conserved during evolution. In higher eukaryotes, ribosomes are composed of two subunits, a large 60S subunit and a small 40S subunit, which together include four ribosomal RNA (rRNA) species and 79 distinct ribosomal proteins (RPs) (1,2). The genes encoding rRNAs are clustered at a few sites in the genome (3–6), whereas the genes encoding RPs are widely dispersed (7,8). The expression of these genes has to be coordinately regulated to ensure the equimolar assembly of the components into a ribosome particle (9).
Although the complete process for assembling the functional ribosome has not yet been elucidated, the influence of the gene dosage of each ribosomal component on development has been extensively studied in Drosophila melanogaster. For example, haploinsufficiency of any one of the RP genes yields a Minute phenotype, which includes short and thin bristles, reduced body size, diminished fertility and recessive lethality (10,11). In addition, bobbed and mini mutants, which display a phenotype similar to Minute, are also caused by a quantitative deficiency of rRNA genes. These phenotypes may reflect a reduced rate of protein synthesis resulting from insufficiency of ribosomal components in early development (11,12). Recently, heterozygous mutations of the human RPS19 gene were found in 25% of unrelated patients with Diamond–Blackfan anemia (DBA), which is characterized by a decrease in or absence of erythroid precursors in the bone marrow (13,14). So far, this is the only reported case in which an RP gene mutation has been found associated with human disease, although how this RP defect causes the DBA phenotype is still unknown. Together with the above-mentioned Drosophila mutants, this implies that two copies of each RP gene are required for normal growth and development of multicellular organisms.
In mammalian female cells, one of the two X chromosomes is inactivated, which provides a dosage compensation mechanism to overcome sex differences. We have recently mapped four RP genes to the human X chromosome (7,8). If haploinsufficiency of any of these genes causes abnormal phenotypes as seen in Drosophila Minute mutants, either they must be twice as active as autosomal RP genes or there must be a second functionally redundant gene elsewhere in the genome. In fact, one of the X-linked RP genes, RPS4X, has been shown to escape X inactivation and to have a functional homolog on the Y chromosome (RPS4Y) (15). However, none of the other X-linked RP genes have Y chromosome homologs. Interestingly, several X-linked housekeeping genes, including PGK, PDHA, XAP-5, G6pd and Cent1, have functional intronless copies on autosomes (16–20). These copies are believed to have been retrotransposed from the X-linked genes and to compensate for the silenced genes during spermatogenesis. In this study, we have identified three new members of the human RP gene family, RPL10L, RPL36AL and RPL39L, that were most likely retrotransposed from X-linked genes. We also show their characteristic expression patterns in various tissues as well as in cancer cells.
The sequence data described in this paper have been submitted to the DDBJ/EMBL/GenBank DNA databases under accession numbers AB063605–AB063610.
Genomic sequences were retrieved from the human draft sequence by BLAST search using the cDNA sequences of the X-linked RP genes as the query. We determined the intron/exon structure by comparing the genomic sequences with the full-length cDNA sequences that were assembled in silico from retrieved expressed sequence tags (ESTs) or cDNA sequences of the genes. The assembled cDNA sequences will appear in the DDBJ/EMBL/GenBank DNA databases under accession numbers AB063608–AB063610.
The GeneBridge 4 and Stanford G3 radiation hybrid (RH) panels (Research Genetics) were employed for mapping the newly identified genes. We tested the panels by PCR using sequence tagged sites (STSs) generated in the draft sequence. The STS sequences were verified by re-sequencing the PCR products and deposited in the DDBJ/EMBL/GenBank DNA databases under accession numbers AB063605–AB063607. The data vectors were submitted to the RH servers at the Whitehead Institute/MIT Center for Genome Research (GeneBridge 4; http://www-genome.wi.mit.edu/cgi-bin/contig/rhmapper.pl) or the Stanford Human Genome Center (G3; http://shgc-www.stanford.edu). Sequences of the primer pairs are as follows: RPL10L, 5′-GCCTCAGGACTCTATGGTTCC and 5′-CAGGTCAAAGATGCGGATCT; RPL36AL, 5′-CAAAGTGCTGGGATTACAAGC and 5′-CAGCAGGGCTGTTTTGTCTATA; RPL39L, 5′-GGATCCTGAGTGGCAATGAG and 5′-TTCATCTGAATCCACTGGGG.
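As a quick sanity check on the STS primer pairs listed above, their length, GC fraction and an approximate melting temperature can be tabulated. The following is a minimal sketch, not part of the original analysis: the function names are hypothetical, and the melting temperature uses the Wallace rule (2°C per A/T, 4°C per G/C), a rough approximation best suited to short oligonucleotides and assumed here purely for illustration.

```python
# Primer pairs (forward, reverse) as listed in the text, written 5' to 3'.
PRIMERS = {
    "RPL10L": ("GCCTCAGGACTCTATGGTTCC", "CAGGTCAAAGATGCGGATCT"),
    "RPL36AL": ("CAAAGTGCTGGGATTACAAGC", "CAGCAGGGCTGTTTTGTCTATA"),
    "RPL39L": ("GGATCCTGAGTGGCAATGAG", "TTCATCTGAATCCACTGGGG"),
}


def clean(seq):
    """Remove any whitespace and upper-case the sequence."""
    return "".join(seq.split()).upper()


def gc_fraction(seq):
    """Fraction of G and C bases in the sequence (0.0 to 1.0)."""
    seq = clean(seq)
    return (seq.count("G") + seq.count("C")) / len(seq)


def wallace_tm(seq):
    """Approximate melting temperature in degrees C (Wallace rule; assumption)."""
    seq = clean(seq)
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    return 2 * at + 4 * gc


if __name__ == "__main__":
    for gene, pair in PRIMERS.items():
        for label, primer in zip(("forward", "reverse"), pair):
            p = clean(primer)
            print(gene, label, len(p), round(gc_fraction(p), 2), wallace_tm(p))
```

Primers within a pair would be expected to have broadly similar values; large differences might suggest a transcription error in the listed sequence.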
The expression patterns of the RP genes were analyzed on two sets of commercially available poly(A)+ RNA blots [Human multiple-tissue northern (MTN) Blot I and II, CLONTECH]. To avoid cross-reaction during hybridization, probes were generated against the 3′-non-coding region of the genes. The 5′ end of each probe was labeled with [γ-32P]ATP (Amersham Biosciences) using T4 polynucleotide kinase (MEGALABEL™, Takara). The blots were pre-treated with ExpressHyb™ hybridization solution (CLONTECH) for 30 min at 37°C, and then hybridized with the labeled probe for 1.5–2 h at 37°C. After washing according to the manufacturer’s protocol, the blots were exposed to the BAS1500 system imaging plate (FUJI FILM) overnight and analyzed by the attached ImageGauge program. Sequences of the synthesized probes are as follows: RPL10, 5′-GTGAGTATTAAGAGGGGGGCAGCACATTGG; RPL10L, 5′-GCCAGTAAACAGAATTTATTAGTAAGCATA; RPL36A, 5′-GCCAGTAAACAGAATTTATTAGTAAGCATA; RPL36AL, 5′-CGGGTAACTTTTCTATGGCTTCACCA; RPL39, 5′-GTGTTCATAACAGATTCAGAGAGGA; RPL39L, 5′-TACTAGCACAGAGCATACAGAAA. The accession numbers of cDNA sequences from which the probes were chosen are listed in Table 1.
RP gene expression patterns were also analyzed by PCR using the cDNA panels of 16 different human tissues [Human multiple tissue cDNA (MTC) Panel I and II, CLONTECH] and 8 tumor cell lines (Human Tumor MTC Panel, CLONTECH). PCR was performed in a 10 µl reaction volume containing ~20 pg of template cDNA, 5 pmol each of the forward and reverse primers (listed in Table 1), 0.1 mM dNTPs and 0.35 U of the Expand Long polymerase (Boehringer Mannheim). The thermal cycling conditions included an initial denaturation at 94°C for 3 min, followed by 35 cycles of 94°C for 0.5 min, 58°C for 1 min, 70°C for 1 min and final extension at 72°C for 5 min.
We performed BLAST searches on the human draft sequence for functional second copies of the X-linked RP genes RPL10, RPL36A and RPL39. The cDNA and EST databases were also searched to help determine whether these genes are expressed. Although a number of sequences similar to the X-linked genes were found, most of them have nonsense mutations within the open reading frames and are not expressed, suggesting that they are pseudogenes. Nonetheless, we found three genes with coding sequences that are 89–95% identical and amino acid sequences that are 92–99% identical to each of the X-linked RP genes. All of these genes were also found in the EST database, suggesting that they are functional. We therefore designated these genes RPL10L, RPL36AL and RPL39L, corresponding to their X chromosome homologs (Table 2).
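Percent identity figures of the kind quoted above are typically derived from a pairwise alignment. The following is a minimal sketch of one common convention, in which identity is the fraction of matching positions among aligned columns that contain no gap. The text does not state how gaps were treated in the reported values, so this convention, the function name and the toy sequences are assumptions made for illustration only.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity over gap-free columns of two pre-aligned sequences.

    Both inputs must already be aligned (equal length, gaps marked with `gap`).
    Columns where either sequence has a gap are skipped (an assumed convention).
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for x, y in zip(aligned_a.upper(), aligned_b.upper()):
        if x == gap or y == gap:
            continue  # ignore gapped columns
        compared += 1
        if x == y:
            matches += 1
    if compared == 0:
        raise ValueError("no gap-free columns to compare")
    return 100.0 * matches / compared


if __name__ == "__main__":
    # Toy example: 3 of 4 aligned positions match.
    print(percent_identity("ATGC", "ATGA"))
```

The same function applies to aligned amino acid sequences, since it only compares characters column by column.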
We then predicted the genomic structure of each new RP gene by comparing its cDNA sequence with the draft genome sequence (Fig. 1). Interestingly, none of these genes have introns in their coding regions, while their X chromosome homologs do have introns. This suggests that these genes are derived from their X-linked homologs by retrotransposition. In contrast, RPL36AL and RPL39L contain at least one intron in their 5′-non-coding regions (Fig. 1). Because the 5′-non-coding regions are dissimilar both in size and sequence to those of the X-linked genes, they may have evolved to express the transposed genes by acquisition of promoters. We also determined the GC contents and CpG ratios (21) along the entire regions of the genes (Fig. 1). The GC contents in the region extending from the predicted transcription start site up to –300 bp are 58% (RPL10L), 68% (RPL36AL) and 62% (RPL39L), while the CpG ratios are 0.6, 0.9 and 0.8, respectively. Because increased GC contents and CpG ratios are commonly seen in promoter regions, the predicted transcription start sites seem to be reasonable (22).
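The GC content and CpG ratio of a region can be computed along the following lines. This is a minimal sketch that assumes the widely used observed/expected definition of the CpG ratio, (number of CpG × N) / (number of C × number of G), where N is the region length. The exact formula, window size and step used for the profiles in this study are not given in the text, so the function names and the default window and step values below are hypothetical.

```python
def gc_content(seq):
    """GC content of a sequence, as a percentage."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return 100.0 * (seq.count("G") + seq.count("C")) / len(seq)


def cpg_ratio(seq):
    """Observed/expected CpG ratio (assumed definition, see lead-in)."""
    seq = seq.upper()
    c = seq.count("C")
    g = seq.count("G")
    if c == 0 or g == 0:
        return 0.0  # expected count is zero; report 0 rather than divide by zero
    observed = seq.count("CG")  # CpG dinucleotides on the given strand
    return observed * len(seq) / (c * g)


def sliding_profile(seq, window=100, step=10):
    """Yield (start, GC %, CpG ratio) for each window along the sequence.

    The window and step defaults are illustrative assumptions.
    """
    for start in range(0, len(seq) - window + 1, step):
        sub = seq[start:start + window]
        yield start, gc_content(sub), cpg_ratio(sub)


if __name__ == "__main__":
    toy = "ATGCGCGTTACGCGCGATATCGCG"  # toy sequence, not from the paper
    print(round(gc_content(toy), 1), round(cpg_ratio(toy), 2))
```

Applied to the 300 bp upstream of a predicted transcription start site, `gc_content` and `cpg_ratio` would give single values comparable in kind to those quoted above, while `sliding_profile` gives a profile along a whole gene region.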
In the case of RPS4X and RPS4Y, another X-linked RP gene and its homolog, the intron/exon structures are completely identical between the two genes (23). These genes may have branched out from a single ancestral autosomal gene at the emergence of the sex chromosomes during evolution, with RPS4Y surviving subsequent Y chromosome evolution (24).
We have localized RPL10L, RPL36AL and RPL39L to autosomes by typing two different radiation hybrid mapping panels, GeneBridge 4 and Stanford G3, using STSs specific to these genes. The data vectors on GeneBridge 4 indicated that RPL10L is located at 6 centiRay (cR) from CHLC.GATA5C11, RPL36AL at 11 cR from D14S269, and RPL39L at 10 cR from D3S1571. Similarly, the results using Stanford G3 indicated that RPL10L is located at 50 cR from SHGC-1399, RPL36AL at 7 cR from SHGC-20858, and RPL39L at 17 cR from SHGC-1745. The results obtained from these two panels are in agreement. RPL10L was thereby assigned to chromosome 14q21.2–21.3 between markers D14S288 and D14S269, RPL36AL was assigned to 14q21.3 between D14S269 and D14S66, and RPL39L was assigned to 3q27.3 between D3S1262 and D3S1580 according to the Ensembl database (25) and a cytogenetic BAC-STS map (26) of the human genome (Fig. 2).
To investigate the mRNA expression profiles of the X-linked RP genes and their autosomal homologs in various tissues, we performed northern blot analysis using oligomers designed from the 3′-non-coding regions of the genes as probes (Fig. 3). We found that RPL10, RPL36A, RPL36AL and RPL39 are expressed ubiquitously, whereas RPL10L and RPL39L are expressed only in testis. To confirm these results, we performed PCR assays on cDNAs from multiple normal tissues using the STSs listed in Table 1. The expression patterns of RPS4X and RPS4Y were also examined. The mRNA expression patterns in normal tissues correlate with those shown by northern blot analysis (Fig. 4, lanes 1–16). Because the search of EST databases suggested that RPL39L is expressed in some carcinomas, we further examined whether these genes are expressed in cancer cells using cDNAs prepared from eight different cell lines. The transcripts from the four X-linked RP genes and RPL36AL were detected in all tested cell lines, the RPL39L transcript was detected in all but two cell lines, and the RPS4Y and RPL10L transcripts were not detected in any of the lines (Fig. 4, lanes 17–24).
We compared the 5′-non-coding regions of the autosomal RP genes with those of their X-chromosome homologs and found that while the sequences extending from the ATG up to the –10 to –30 bp position are similar, the regions further upstream are completely different. This suggests that while the regions immediately adjacent to the start codon probably came from the X-linked RP genes during retrotransposition, the upstream regions seem to have arisen after the retrotransposition events occurred. In contrast, we found some sequence similarities in the upstream regions of the autosomal genes (Fig. 5). Although the function of these sequences is unknown, they might play an important role in the autosomal gene expression. Interestingly, we also found some sequences that are similar to those of the human endogenous retroviral (HERV) long terminal repeats (LTRs), though the similarities are very limited. It is known that some HERV genes are expressed in human cell lines or tissues (27), and that their expression is primarily controlled by the LTRs, which harbor multiple sequences that are recognized by cellular transcriptional machinery. Figure 5 shows an alignment of the regions upstream of the predicted 5′ ends of retrotransposed genes and the U3 region of HERV-K LTR.
We have identified functional homologs of human X-linked RP genes and have localized them to autosomes. Because these genes have no introns in their coding regions, they were most likely produced by retrotransposition of the original X-linked genes during evolution. Although each mammalian RP is typically encoded by a single gene, this functional gene also generates a large number of retroposons. However, the majority of these retroposons would not be expected to survive during evolution because, without promoters, they are inactive at integration sites and would therefore accumulate mutations in their open reading frames. In fact, there are at least a dozen processed pseudogenes for each RP gene in the genome (28–31). In this study we have identified three genes that appear to have been protected from such evolutionary pressure and to be actively transcribed. Similar retrotransposed genes have been reported, including PGK2, PDHA2, X5L and HNRNP G-T (16–18,32). All of these genes have a progenitor on the X chromosome and are highly expressed in testis. We also observed testis-specific expression of RPL10L and RPL39L, which may indicate a role of the retrotransposed genes in compensating for the inactivated X-linked genes during spermatogenesis.
Based on their findings, which included an analysis of 49 intronless paralogs of autosomal RP genes, Venter et al. (33) suggested that there was no bias toward the X chromosome origination of active retroposons during evolution. Because we have identified only three active RP retroposons despite a thorough search of the public DNA databases and have found that all of these genes originated from the X chromosome, we believe that there may actually have been a strong bias for retrotransposition of X chromosome RP genes.
Haploinsufficiency of any of the RP genes causes viable but abnormal phenotypes (Minute) in Drosophila (11), and heterozygous mutations in RPS19 are associated with DBA (13,14), suggesting that two copies of each RP gene are essential for normal growth and development. It therefore seems that genes on the sex chromosomes must increase their expression levels to compensate for gene dosage. One of these genes, RPS4X, reportedly escapes X-inactivation, and both it and its Y homolog (RPS4Y) are ubiquitously expressed (15,23). In contrast, we have shown in this study that the remaining X-linked RP genes have autosomal copies, which are expressed either ubiquitously or only in testis. Recently, we have reported a unique feature of RP gene promoters in which transcription always starts within a characteristic oligopyrimidine tract (34). We found this oligopyrimidine tract in the 5′-upstream region of RPL36AL, which is expressed ubiquitously, but not in those of RPL10L and RPL39L, which are expressed only in testis (data not shown). This oligopyrimidine tract may therefore play an important role in ubiquitous RP gene expression.
Several mechanisms have been proposed for retroposons to acquire a functional promoter (35–37). For example, (i) the retroposon may contain the original gene promoter, (ii) the insertion may occur near an existing promoter, and (iii) mutations may create a functional promoter upstream of the retroposon insertion. The three retroposons identified in this study have little similarity to the original X-linked genes in their 5′-non-coding regions but are slightly similar to HERV-K LTR in a small 70 bp region. A large number of solitary HERV LTRs have been created by homologous recombination, and their possible role in promoter control of downstream genes has been proposed (38,39). We also found such similarity in previously reported X-originated retroposons, including PGK2, PDHA2, X5L and HNRNP G-T (data not shown). Also, Chen et al. found the similarity in the 5′-non-coding region of mouse zinc-finger protein Zfp352 gene, which is thought to have arisen from retrotransposition (40). HERV-K LTR, therefore, might be involved in the active transcription of these retrotransposed genes.
Finally, we searched the DNA databases for mouse RP orthologs, and found three genes, designated Rpl10l, Rpl36al and Rpl39l, which correspond to human RPL10L, RPL36AL and RPL39L, respectively (data not shown). They are 84–92% similar to the human genes in their nucleotide sequences and are located in the syntenic regions between the two species. Moreover, the EST database search has shown that Rpl36al is expressed ubiquitously, whereas Rpl10l and Rpl39l are expressed primarily in testis, which is consistent with the human gene expression patterns. Further experiments using the mouse system will be of great interest to understand the function of these retroposons.
This work was supported in part by Grants-in-Aid for Scientific Research on Priority Areas from the Ministry of Education, Culture, Sports, Science and Technology of Japan, and the Fund for ‘Research for the Future’ Program from the Japan Society for the Promotion of Science.