Conceived and designed the experiments: JNM JR RES EJD MO. Performed the experiments: JNM ASY CBL MGF MO EJD. Analyzed the data: JNM CO CBL JR RES SN. Contributed reagents/materials/analysis tools: CO CBL JR RES SN. Wrote the paper: JNM JR CO SN. Obtained the software to run analysis: SN CBL CO.
The function of prostate-specific antigen (PSA) is to liquefy the semen coagulum so that the released sperm can fuse with the ovum. Fifteen spliced variants of the PSA gene have been reported in humans, but little is known about alternative splicing in nonhuman primates. Positive selection has been reported in sex- and reproductive-related genes from sea urchins to Drosophila to humans; however, there are few studies of adaptive evolution of the PSA gene. Here, using polymerase chain reaction (PCR) product cloning and sequencing, we study PSA transcript variant heterogeneity in the prostates of chimpanzees (Pan troglodytes), cynomolgus monkeys (Macaca fascicularis), baboons (Papio hamadryas anubis), and African green monkeys (Chlorocebus aethiops). Six PSA variants were identified in the chimpanzee prostate, but only two variants were found in cynomolgus monkeys, baboons, and African green monkeys. In the chimpanzee the full-length transcript is expressed at the same magnitude as the transcripts that retain intron 3. We have found previously unidentified splice variants of the PSA gene, some of which might be linked to disease conditions. Selection on the PSA gene was studied in 11 primate species by computational methods using the sequences reported here for African green monkey, cynomolgus monkey, baboon, and chimpanzee and other sequences available in public databases. A codon-based analysis (dN/dS) of the PSA gene identified potential adaptive evolution at five residue sites (Arg45, Lys70, Gln144, Pro189, and Thr203).
Prostate-specific antigen (PSA) is encoded by the kallikrein-3 (KLK3) gene, which belongs to the tissue kallikrein (KLK) gene family; the official name of the gene is KLK3, although it is commonly referred to as PSA. The PSA gene is composed of five exons and four introns. Part of the first exon codes for a signal peptide that targets the protein for secretion. In the PSA protein, residues His57, Asp102, and Ser195 have been reported to be important for proteolytic activity of PSA and other kallikreins, and they are commonly referred to as the catalytic triad. PSA has an activation peptide of seven amino acids that is cleaved by KLK2 protein to generate enzymatically active PSA. PSA is highly expressed by prostatic epithelial cells and is abundant in seminal plasma. After ejaculation, semen forms a coagulum through the linkage of semenogelin proteins. Then, the coagulum has to be liquefied so that the sperm can be released and fuse with the ovum. In humans, great apes, and Old World monkeys this is done by PSA.
Alternative splicing is a common mechanism in nature to enhance protein diversity, and it occurs in more than 90% of human genes. Alternative splicing of the PSA gene produces at least 15 transcripts of 0.7–6.1 kb.
Positive selection has been reported in sex- and reproductive-related genes from sea urchins to Drosophila to humans. However, there are scant data regarding the PSA gene despite its importance in reproductive success. Marques et al. (2012) recently provided evidence of adaptive evolution of the KLK3 gene toward an expanded enzyme spectrum that has an effect on the hydrolysis of semen coagulum. In another study that analyzed the PSA gene, Clark and Swanson (2005) reported an overall dN/dS value of 0.38 for the PSA gene in human-chimpanzee pairwise comparisons, suggesting purifying selection.
The aims of this study are to investigate PSA transcript heterogeneity in four nonhuman primates commonly used in biomedical research and to study the mode of selection exerted on this gene in primates. Our previous data showed that cynomolgus monkeys have high serum PSA concentrations compared to baboons, which have low concentrations. The chimpanzee was examined because of its close evolutionary relationship to humans.
We opportunistically sampled animals that presented for necropsy. Six cynomolgus monkeys (Macaca fascicularis) and baboons (Papio hamadryas anubis) and four chimpanzees (Pan troglodytes) were from the Southwest National Primate Research Center (SNPRC), located at the Texas Biomedical Research Institute in San Antonio, Texas. Three African green monkey (Chlorocebus aethiops) samples were from the Department of Pathology/Comparative Medicine, Wake Forest University of Health Sciences, Winston-Salem, North Carolina. The chimpanzees died of heart failure as a result of chronic cardiomyopathy. Three cynomolgus monkeys were removed from the colony and euthanized because of positive titers to simian retrovirus. One baboon with a sperm granuloma that tested positive for herpes was removed from the colony and euthanized. The other animals were part of research studies and were euthanized at the end of the studies.
Animals were chemically restrained with an injection of ketamine HCl and euthanized humanely under the supervision of a veterinarian using euthanasia solution (Fatal plus, Vortech Pharmaceuticals, Dearborn, MI). Confirmation of death was determined by monitoring for absence of pulse, respiration, and neural reflexes. The Institutional Animal Care and Use Committee of the Texas Biomedical Research Institute approved all procedures. Human prostate cDNAs were from cryopreserved human prostate tissues that were collected following radical prostatectomies under an Institutional Review Board approved protocol for the Department of Pathology at the University of Texas Health Science Center, San Antonio, and have been described previously.
We extracted total RNA from prostates using Trizol reagent (Life Technologies, Carlsbad, CA), reverse-transcribed it, and performed polymerase chain reaction (PCR) analysis, as described previously.
Primer sequences were based on the cynomolgus monkey and chimpanzee sequences (GenBank accession numbers AY647976, NM_001009136) and are shown in Tables 1 and 2. The forward primers were located in exon 1, and the reverse primers were located in exon 5 or in intron 3. We used primer sequences located in intron 3 because of reports indicating that some PSA variants contain sequences in the third intron. To obtain the complete coding sequences, we cloned the PCR products into the pCR2.1-TOPO vector (Life Technologies, Carlsbad, CA) and sequenced them. Cloning using the two primer sets was done at least four times for each animal species. Each time 22 colonies were picked and cultured. Positive clones were characterized by restriction enzyme analysis according to their size on an agarose gel. One clone of each size was sequenced. We predicted the presence of a signal peptide to determine whether the transcripts could be secreted outside the cell using the Signal Peptide Prediction website (http://www.cbs.dtu.dk/services/SignalP/).
A closer look at the transcripts indicated that they could be broadly grouped into those that used the sequences in intron 3 and those that did not. We performed real-time PCR to explore whether expression of transcripts that do not use intronic sequences takes precedence over those that do. We carried out quantitative reverse transcription (RT) PCR using TaqMan technology (Life Technologies, Carlsbad, CA). We designed primers and probes for the quantitation of the chimpanzee full-length and short-version transcripts using the IDTDNA SciTools software (Integrated DNA Technologies Inc., Coralville, IA); the sequences are shown in Table 2. TaqMan 18s ribosomal RNA probe and primers served as the internal control. All samples and standards were analyzed in triplicate. We conducted real-time fluorescence-based PCR using the ABI Prism 7900 real-time PCR thermal cycler (Applied Biosystems, Foster City, CA) under vendor-specified conditions. We determined the values of the unknown samples from the standard curves prepared in each assay and normalized the concentration of the PSA transcripts to 18s ribosomal RNA.
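The standard-curve quantitation and 18s normalization described above can be illustrated with a minimal sketch. It assumes a linear relationship between Ct and log10 of the input quantity, and it converts each triplicate Ct to a quantity before averaging; the function names and all numeric values are hypothetical and are not taken from the study or from the instrument software.

```python
def fit_standard_curve(log10_quantities, ct_values):
    """Least-squares line Ct = slope * log10(quantity) + intercept."""
    n = len(ct_values)
    mean_x = sum(log10_quantities) / n
    mean_y = sum(ct_values) / n
    sxx = sum((x - mean_x) ** 2 for x in log10_quantities)
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(log10_quantities, ct_values))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def quantity_from_ct(ct, slope, intercept):
    """Invert the standard curve to estimate the quantity of an unknown."""
    return 10 ** ((ct - intercept) / slope)


def mean_quantity(ct_replicates, curve):
    """Average the quantities estimated from replicate Ct values."""
    slope, intercept = curve
    quantities = [quantity_from_ct(ct, slope, intercept) for ct in ct_replicates]
    return sum(quantities) / len(quantities)


def normalized_expression(target_cts, target_curve, reference_cts, reference_curve):
    """Target quantity divided by reference (for example 18s rRNA) quantity."""
    return (mean_quantity(target_cts, target_curve)
            / mean_quantity(reference_cts, reference_curve))


# Hypothetical ten-fold dilution series (log10 quantity) and measured Ct values.
curve = fit_standard_curve([1, 2, 3, 4], [30.0, 26.7, 23.4, 20.1])
ratio = normalized_expression([26.7, 26.7, 26.7], curve, [23.4, 23.4, 23.4], curve)
```

In practice each assay would have its own standard curve for the target and for the reference; the same curve is reused here only to keep the example short.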
We assessed the evolutionary selection acting on the PSA gene using 11 PSA gene sequences. Our group produced four sequences, one each for Pan troglodytes, Macaca fascicularis, Papio hamadryas anubis, and Chlorocebus aethiops; these sequences are reported in this paper. Eight other sequences—Macaca fascicularis (AY647976), Macaca mulatta (NM_001042776.1), Macaca fuscata (KC155626.1), Homo sapiens (NM_001648.2), Pan paniscus (DQ150478.1), Gorilla gorilla (KC155627.1), Pongo pygmaeus (DQ150480.1), and Nomascus leucogenys (UCSC genome Browser, Gibbon chr10)—were retrieved from public databases.
Protein sequences were aligned using the Multiple Sequence Comparison by Log-Expectation (MUSCLE) program, and codon-guided alignments were generated using Java Codon Delimited Alignment (JCoDA). Preliminary analysis of dN/dS (nonsynonymous changes/synonymous changes) was performed on all pairwise comparisons using the coding region and sliding window (window 20, jump 5) analysis in JCoDA. Although purifying selection dominated across all comparisons, multiple regions in the C-terminal region appeared to be under relaxed or positive selection, with dN/dS (ω) values greater than 2. To identify individual residues under positive selection, we used the Phylogenetic Analysis by Maximum Likelihood (PAML) software package, version 4.7, with the following comparisons: between null hypothesis M0 (one ratio for all branches) and hypothesis M3 (discrete), between hypotheses M1 (neutral) and M2 (selection), and between hypotheses M7 (β-distributed 0 to 1) and M8 (β-distributed and estimated ω). Sites under selection were identified using the Bayes empirical Bayes (BEB) approach, the naive empirical Bayes (NEB) approach, and the likelihood ratio test (LRT) in CODEML (part of the PAML package).
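To make the sliding-window idea concrete, the following sketch walks a pairwise codon alignment in windows of 20 with a jump of 5 and tallies synonymous and nonsynonymous codon differences in each window. It assumes that the window and jump are measured in codons, which the text does not state, and it reports raw counts only: a real dN/dS estimate also normalizes by the number of synonymous and nonsynonymous sites and corrects for multiple substitutions. This is not JCoDA's algorithm, and all names are illustrative.

```python
from itertools import product

BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
# Standard genetic code, built from the conventional TCAG ordering.
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def codons(seq):
    """Split an in-frame nucleotide sequence into codons, dropping any remainder."""
    return [seq[i:i + 3] for i in range(0, len(seq) - len(seq) % 3, 3)]


def sliding_windows(n_codons, window=20, jump=5):
    """Yield (start, end) codon indices for each window."""
    for start in range(0, max(n_codons - window, 0) + 1, jump):
        yield start, min(start + window, n_codons)


def count_changes(codons_a, codons_b):
    """Count codon pairs that differ synonymously and nonsynonymously."""
    syn = nonsyn = 0
    for a, b in zip(codons_a, codons_b):
        # Identical codons and codons with gaps or ambiguity codes are skipped.
        if a == b or a not in CODON_TABLE or b not in CODON_TABLE:
            continue
        if CODON_TABLE[a] == CODON_TABLE[b]:
            syn += 1
        else:
            nonsyn += 1
    return syn, nonsyn


def window_profile(seq_a, seq_b, window=20, jump=5):
    """Return (start, end, synonymous, nonsynonymous) for each window."""
    ca, cb = codons(seq_a.upper()), codons(seq_b.upper())
    n = min(len(ca), len(cb))
    return [(s, e, *count_changes(ca[s:e], cb[s:e]))
            for s, e in sliding_windows(n, window, jump)]
```

Windows in which nonsynonymous differences dominate would be candidates for the closer codon-level analysis described in the text.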
Phylogenetic trees were constructed using neighbor-joining, maximum parsimony, maximum-likelihood, and unweighted pair group method with arithmetic mean (UPGMA) methods provided with the Phylip package, version 3.695. In each case, tree topology was tested using bootstrapping with 100 replicates.
We were cognizant of the reported chimeric nature of the KLK2/KLK3 genes in both Gorilla gorilla and Nomascus leucogenys. Marques et al. (2012) indicated that the gorilla and gibbon KLK3 gene is in fact a fusion of KLK3 and KLK2 (cKLK), in which the first four exons of the fused gene are orthologous to KLK3 and the last exon is more similar to KLK2; therefore in our analysis we excluded exon 5. However, it has also been reported that at the protein level these genomic rearrangements in exon 5 account for only minor amino acid replacements and are not predicted to alter protein structure and function and that cKLK is likely a functional KLK3-like gene.
When numbering the amino acid residues, we followed both the PSA protein numbering and also the convention used for serine proteases. The convention numbering system is based on chymotrypsin and denotes that the catalytic triad is composed of His57, Asp102, and Ser195 and that the “Kallikrein loop” is composed of an 11 amino acid insertion (95A–95K) that is specific for PSA.
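Bootstrapping a sequence alignment means resampling its columns with replacement to build pseudo-replicate alignments, rebuilding a tree from each, and counting how often each grouping recurs. The sketch below shows only the resampling step, with 100 replicates as in the text; it is an illustration of the general idea rather than the Phylip implementation, and the function name, seed, and toy alignment are hypothetical.

```python
import random


def bootstrap_alignments(alignment, replicates=100, seed=0):
    """Yield pseudo-replicate alignments built by resampling columns.

    `alignment` maps a sequence name to an aligned sequence string;
    all sequences must have the same length.
    """
    names = list(alignment)
    length = len(alignment[names[0]])
    if any(len(alignment[name]) != length for name in names):
        raise ValueError("all aligned sequences must have the same length")
    rng = random.Random(seed)
    for _ in range(replicates):
        # Draw column indices with replacement, keeping columns intact
        # so that every sequence is resampled at the same positions.
        columns = [rng.randrange(length) for _ in range(length)]
        yield {name: "".join(alignment[name][c] for c in columns)
               for name in names}


# Toy alignment used only for demonstration.
toy = {"species_a": "ACGTAC", "species_b": "ACGTTC"}
pseudo_replicates = list(bootstrap_alignments(toy, replicates=100, seed=1))
```

Each pseudo-replicate would then be passed to the tree-building method of choice, and the support for a node is the fraction of replicate trees that contain it.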
We identified two PSA transcripts in the cynomolgus monkey, baboon, and African green monkey and six PSA transcripts in the chimpanzee prostate (Table 3; Figure 1). From all four species the same two main transcripts were identified: PSA-1 has the five exons characteristic of the kallikrein (KLK) gene family, and PSA-2 is missing exons 4 and 5 and uses sequences in intron 3. Because of intron retention, a unique 50-base sequence is added to exon 3. The PSA-1 transcript is derived from five exons with an open reading frame of 786 bases encoding a 261 amino acid protein and has a predicted molecular weight of 28,839 Da (Figure 2). This sequence is orthologous to the human KLK3 transcript variant 1 (NM_001648.2). The cynomolgus monkey PSA-1 is identical to the one reported by Marshal et al. (2006) under GenBank accession number AY647976.1. The baboon sequence is identical to sequences under GenBank accession numbers NM_001112745, DQ150485, and EF676031. The chimpanzee sequence is identical to a sequence under GenBank accession number DQ150477. The African green monkey PSA-1 is novel, because no similar transcript was found in the National Center for Biotechnology Information (NCBI) GenBank.
The PSA-2 transcript has an open reading frame of 543 bases, which encodes a 180 amino acid protein, and has a molecular weight of 19,820 Da (Figure 2). The chimpanzee transcript is identical to a chimpanzee sequence available in the GenBank under accession number NM_001009136.1 and is orthologous to gorilla and orangutan sequences (accession numbers AY781395.1 and AY78139.1, respectively). It is 98% similar to a human sequence identified as PSA-RP2 reported by Heuzé-Vourc'h et al. (2001) (GenBank AJ310937, number CAC41631). The cynomolgus monkey, baboon, and African green monkey PSA-2 sequences are novel; no similar transcripts were found in the NCBI GenBank.
Four more transcripts were identified in the chimpanzee. Chimpanzee PSA-3 has an open reading frame of 456 bases. It skips 89 bases of exon 3, resulting in an early stop codon. It codes for a 151 amino acid protein and has a predicted molecular weight of 16,470 Da. It is novel, as no similar transcript was found in the NCBI GenBank.
Chimpanzee PSA-4 is a result of an in-frame deletion of 129 bases of exon 3. This transcript has an open reading frame of 657 bases, which encodes a 218 amino acid protein, and has a molecular weight of 23,768 Da. This transcript is orthologous to the human KLK3 transcript variant 4 (NCBI accession number NM_001030048) and a rhesus monkey sequence (UniProt number F6TBJ4_MACMU).
Chimpanzee PSA-5 is novel, consisting of four exons. It skips 38 bases in the 5′ part of exon 4, resulting in an early stop codon, and is therefore shorter than chimpanzee PSA-1. It has an open reading frame of 507 bases, encodes a 168 amino acid protein, and has a predicted molecular weight of 18,774 Da.
Chimpanzee PSA-6 is also novel. It is similar to the chimpanzee PSA-2 transcript, except that it skips 129 bases in the 5′ part of exon 3 as a result of use of an alternative splice acceptor site. It has an open reading frame of 414 bases, which encodes a 137 amino acid protein, and has a molecular weight of 14,730 Da.
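The open reading frame lengths and protein lengths reported for the six transcripts are related by simple arithmetic, provided the stated ORF length includes the stop codon. That reading is an assumption, but it fits every pair of numbers given above, as the following check shows; the function name is illustrative.

```python
def protein_length(orf_bases):
    """Amino acids encoded by an ORF whose length includes the stop codon."""
    if orf_bases % 3 != 0:
        raise ValueError("ORF length must be a multiple of 3")
    return orf_bases // 3 - 1  # the final codon is the stop codon


# (ORF length in bases, reported protein length in amino acids)
REPORTED = {
    "PSA-1": (786, 261),
    "PSA-2": (543, 180),
    "PSA-3": (456, 151),
    "PSA-4": (657, 218),
    "PSA-5": (507, 168),
    "PSA-6": (414, 137),
}

consistent = {name: protein_length(orf) == aa
              for name, (orf, aa) in REPORTED.items()}
```

The same arithmetic links PSA-4 to PSA-1: an in-frame deletion of 129 bases removes 43 codons, and 261 - 43 = 218 amino acids.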
We cannot rule out the possibility that other splice variants were not found by our cloning strategy, especially those expressed at much lower levels.
All the novel transcripts reported in this study for nonhuman primates were also found in human prostate cDNA, indicating greater similarity among the prostate transcriptomes across primates than has previously been appreciated. In addition, all the PSA transcripts reported have an intact signal peptide consisting of 17 amino acids. However, only the PSA-1 transcripts have an intact catalytic triad.
A comparison of the amino acid sequences of the PSA-1 and PSA-2 transcripts is shown in Figure 2. The transcripts reported in this study have been deposited in the NCBI GenBank under accession numbers KC853005 (Chlorocebus aethiops PSA-1), KC853006 (Chlorocebus aethiops PSA-2), KC853007 (Macaca fascicularis PSA-2), KC853008 (Papio anubis PSA-2), KC853009 (Pan troglodytes PSA-3), KC853010 (Pan troglodytes PSA-4), KC853011 (Pan troglodytes PSA-5), and JX445923 (Pan troglodytes PSA-6).
We next investigated quantitative differences in expression levels among the six chimpanzee transcripts. Using real-time PCR techniques, we did not find a significant difference in amplification rates between the transcripts that use sequences in intron 3 (chimpanzee PSA-2 and PSA-6) and those that do not (chimpanzee PSA-1, -3, -4, -5) (Figure 3).
We compared the PSA gene sequences among 11 primate species to analyze selection on the PSA gene. The pairwise dN/dS values across all codon sites of the PSA gene of any two species did not exceed the threshold of 1, indicating purifying selection (Table S1). Although purifying selection dominated across all comparisons, multiple regions in the C-terminal region appeared to be under relaxed or positive selection with dN/dS (ω) values greater than 2. To identify individual residues under positive selection, we used the PAML software package, version 4.7, with the following comparisons: M0 vs. M3, M1 vs. M2, and M7 vs. M8 using the BEB and NEB approaches and the likelihood ratio test. All model comparisons with CODEML identified multiple residues with p>0.5 (probability of selection greater than 0.5), indicating directional selection. The M7 vs. M8 comparisons using the NEB and BEB approaches were similar, identifying five codons with dN/dS values greater than 2 (Table 4; Figure 4). When the codon-specific dN/dS value data were compared with the alternative splicing data, we found that several splice variants originated at codons with dN/dS values greater than 2 (Table 4).
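The likelihood ratio test used for these nested model comparisons can be sketched as follows. The test statistic is twice the difference in log-likelihood between the alternative and the null model, compared against a chi-square distribution. The degrees of freedom used in the example (2 for M1 vs. M2 and for M7 vs. M8, 4 for M0 vs. M3 with three site classes) are commonly used values and are an assumption here, as are the log-likelihood numbers, which are hypothetical and not results from this study.

```python
import math


def chi2_sf_even_df(x, df):
    """Chi-square survival function, closed form valid for even df only."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("this closed form requires a positive even df")
    if x <= 0:
        return 1.0
    half = x / 2.0
    term = 1.0
    total = 1.0
    # exp(-x/2) * sum_{i=0}^{df/2 - 1} (x/2)^i / i!
    for i in range(1, df // 2):
        term *= half / i
        total += term
    return math.exp(-half) * total


def likelihood_ratio_test(lnl_null, lnl_alt, df):
    """Return (test statistic, p-value) for two nested models."""
    statistic = max(2.0 * (lnl_alt - lnl_null), 0.0)
    return statistic, chi2_sf_even_df(statistic, df)


# Hypothetical log-likelihoods for an M7 (null) vs. M8 (alternative) comparison.
statistic, p_value = likelihood_ratio_test(lnl_null=-1000.0, lnl_alt=-995.0, df=2)
```

A small p-value indicates that the model allowing a class of sites with ω greater than 1 fits better than the null model; the sites themselves are then ranked by their posterior probabilities from the BEB or NEB calculation.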
All methods generated similar consensus trees with similar bootstrap support for internal nodes. The only exception was the branching of Gorilla gorilla and Nomascus leucogenys, which varied between methods. The maximum-likelihood phylogenetic tree is shown in Figure 5, and the branching order is similar to the topology reported by others, except for the placement of Gorilla gorilla. We looked at branch-specific variation in selective pressure along phylogenetic lineages. This did not reveal any statistically significant differences, but we noted that the variations in dN/dS values ranged from 0 to 0.76 (Figure 5).
We found six alternatively spliced PSA variants in the chimpanzee but only two in the cynomolgus monkey, baboon, and African green monkey. To the best of our knowledge, this study is the first to report the African green monkey PSA mRNA sequence and also the first to report on alternative splicing of the PSA gene in nonhuman primates. It is important to note that the identification of alternative splice variants depends on the primers used, and therefore we cannot rule out the possibility that there are other splice variants not found by our cloning primers.
Alternative splicing of the PSA gene in this study involved exon truncation, intron retention, or a combination of two splicing events. In a study of human kallikrein genes, Kurlender et al. (2005) found exon skipping to be the most common splicing event, with internal exon deletion being less common in kallikrein genes. However, in our study we found retention of sequences in intron 3 to be the most common splicing event; it was found in all the species studied. Real-time PCR data indicated that in chimpanzees the expression of transcripts that do not use sequences in intron 3 did not take precedence over those that do, although we cannot rule out the possibility that under certain physiological conditions there might be differences in transcript expression. It is not clear why alternative splicing of the PSA gene is more common in chimpanzees than in cynomolgus monkeys, baboons, and African green monkeys, but chimpanzees do share a more recent common ancestor with humans than any other nonhuman primates do. All the transcripts reported here have orthologous human counterparts. The PSA-2 sequences are orthologous to the human PSA-RP2 sequence, which has been shown to be up-regulated in prostate cancer, indicating potential links of these transcripts to prostatic diseases. The link between these transcripts and human disease conditions needs further study. It is important to note that the splice variants were found in areas of the gene that are likely to be under selective pressure.
Not surprisingly, given the role of PSA, purifying selection dominates in all pairwise comparisons. Further analysis using codon-based models in CODEML identified multiple residues in the PSA gene having dN/dS values greater than 1, indicating directional selection. These data suggest that selective pressure in the PSA gene is not uniform across all codon sites but rather is focused on specific regions. Five codons were found to have dN/dS values greater than 2, although only one, His45, reached the 95% confidence level. It is important to note that another study also found codons 45, 189, and 203 to be positively selected. In addition, we note that limiting our analysis to exons 1–4 curtailed our statistical power. When the same analyses were performed using full-length sequences with all five exons, we found additional codons that had dN/dS values greater than 1. Limiting our analyses to the first four exons was due to the reported fusion of KLK3 and KLK2 in Gorilla gorilla and Nomascus leucogenys, in which the first four exons of the fused gene are orthologous to KLK3 and the last exon is more similar to KLK2.
At codon position 189 the amino acid proline is replaced by serine or phenylalanine. Substitution of proline residues in proteins is of significant interest because of proline's unique structural and functional properties. Proline has a rigid conformation with low flexibility; it can often be found in tight turns in protein structures and also introduces kinks into alpha helices because it is unable to adopt a normal helical conformation. It is likely that proline substitution at codon 189 results in secondary and tertiary structural alteration, leading to significant functional changes. Although we did not explore functional consequences of the change at codon position 189 in this study, this is an excellent question for future analysis.
Our results of codon-based selection indicate that dN/dS values for the PSA gene are low relative to other proteins from the seminal fluid and prostasome. Clark and Swanson (2005) reported a dN/dS value of 2.9 for β-microseminoprotein and a value of 14 for prostate-specific transglutaminase for the class of codons identified to be under positive selection. It is possible that the critical role for the PSA protein in copulatory plug dissolution constrains the changes tolerated and maintains low dN/dS values. Alternatively, the atypical evolutionary history of the PSA gene (e.g., fusion of KLK3 and KLK2 in some lineages) may be playing a contributing role that has yet to be defined.
Previous studies have indicated that chimpanzees have low serum PSA concentrations. In our unpublished data we also found that chimpanzee PSA concentrations range from 0.01 to 0.031 ng/ml, well below levels reported in humans and other nonhuman primate species. We propose that PSA gene evolution in the chimpanzee lineage might have functional implications. Clark and Swanson (2005) also showed that although the PSA gene does not have high pairwise human-chimpanzee dN/dS values, it does show significant variation in selective pressure during its evolution and dN/dS values along some phylogenetic lineages exceeded 1.
The placement of the gorilla in our phylogenetic tree may be due to incomplete lineage sorting, whereby the genealogy relating humans, gorillas, and chimpanzees varies across the genome and about 22% of all coding exons exhibit phylogenies that contradict the overall genomic phylogeny, in which humans are more similar to chimpanzees than to gorillas. The genomic region containing PSA may be one of the regions that has a local phylogeny in which the gorilla sequences are indeed more similar to the human sequences than the chimpanzee sequences are to either human or gorilla.
Because the function of PSA is to liquefy the sperm coagulum formed after ejaculation, we speculate that the differences in alternative splicing reported in this study might be related to the mating systems of these species. The four species studied here (chimpanzees, cynomolgus monkeys, baboons, and African green monkeys) all exhibit multimale-multifemale mating systems in which any given adult female may mate with more than one male during a single estrous cycle. However, some data suggest that chimpanzee females may have larger average numbers of mates per cycle. A single chimpanzee female frequently mates with numerous males during a single estrous cycle. Multiple males may follow, compete for, and copulate with a female during a single cycle. Such male-male competition also occurs in other species to some degree. But there is reason to believe that sperm competition is important in chimpanzees, as a rubbery and long-lasting “copulatory plug” that obstructs the sperm of subsequent mating events from accessing the ovum is produced by chimpanzee ejaculates and not by those of cynomolgus monkeys, African green monkeys, and baboons. Further studies of transcript heterogeneity comparing several species that have different mating patterns would further elucidate the possible relationship between the alternative expression of the PSA gene and mating systems.
Table S1. Species pairwise comparison of the dN/dS values of the PSA gene.
We would like to thank Dr. Matthew J. Jorgensen of the Department of Pathology/Comparative Medicine, Wake Forest University Health Sciences, Winston-Salem, North Carolina, for providing the African green monkey prostates. We gratefully acknowledge the technical assistance of Cathy Snider, Rita Sholund, Maureen Robbins, Reneé Escalona, Michaelle Hohmann, Jesse Martinez, and Jacob Martinez. We would like to thank Dr. Anthony J. Valente, of the Department of Medicine, University of Texas Health Science Center, San Antonio; Dr. Dean Troyer, of Eastern Virginia Medical School, Departments of Pathology and Microbiology and Molecular Biology; and Jeff T. Williams of the Department of Genetics, Texas Biomedical Research Institute, for their helpful comments.
Dr. Mubiru was supported by the Office of the Director, National Institutes of Health under Award # 8K01OD010973-04. This investigation used resources that were supported by the Southwest National Primate Research Center grant P51 RR013986 from the National Center for Research Resources, National Institutes of Health and that are currently supported by the Office of Research Infrastructure Programs through P51 OD013986. Nonhuman primates were housed in facilities constructed with support from Research Facilities Improvement Programs Grants C06 RR015456 and C06 RR014578 from the National Center for Research Resources, National Institutes of Health. This work received computational support from Computational Systems Biology Core, funded by the National Institute on Minority Health and Health Disparities (G12MD007591) from the National Institutes of Health. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.