Previous studies have reported the existence of eleven different single nucleotide polymorphisms (SNPs) within human PECAM-1 mRNA, several of which have recently been associated with disease. Though SNPs in the PECAM-1 gene have been known for some time, the genetic background on which they exist and their association into distinct allelic isoforms have not yet been established. To identify the major allelic isoforms of PECAM-1, we determined the nucleotide sequence of individual full-length cloned cDNAs derived from anonymous, unrelated volunteer individuals. Initial sequence analysis of 34 alleles from 17 individuals confirmed the presence of two distinct human PECAM-1 alleles (L98S536R643 and V98N536G643) within the human population. Each of these was found, upon more detailed analysis, to be superimposed on a previously unreported a2479g nucleotide polymorphism within the 3′ untranslated region (3′UTR) that occurred on both allelic isoforms, yielding a total of four major alleles. Multiplex Luminex bead analysis of an additional 259 individuals allowed identification of 117 individuals homozygous for either the L98S536 or V98N536 allele, and sequence analysis around the R643G and a2479g polymorphic sites permitted accurate determination of significant differences in the gene frequencies of LSRa, LSRg, VNGa, and VNGg among Caucasian individuals. Identification of these PECAM-1 allelic isoforms should facilitate future detailed examination of PECAM-1-related disease associations, and may help resolve previously disparate results.
PECAM-1 (CD31) is a vascular cell adhesion and signaling receptor that is expressed on the surfaces of platelets, leukocytes, and endothelial cells (Newman et al 1990), and is encoded by a ~70 kb gene near the end of the long arm of chromosome 17 (17q23) (Gumina et al 1996). PECAM-1 exists in mature form as a 130 kDa Type I transmembrane glycoprotein comprised of a 574 amino acid extracellular domain containing six Ig-like homology domains, a 19 amino acid transmembrane domain, and a cytoplasmic tail of varying length due to alternative splicing (Kirschbaum et al 1994). Ig-domain 1 mediates homophilic binding (Liao et al 1997; Newton et al 1997; Sun et al 1996), while Ig-domain 6 binds calcium (Jackson et al 1997a) and has been suggested to participate in cis interactions with integrin αvβ3 within the plane of the plasma membrane (Wong et al 2000). The cytoplasmic tail of PECAM-1 possesses two Immunoreceptor Tyrosine Inhibitory Motifs (ITIMs) (Newman 1999) which, upon tyrosine phosphorylation, recruit and activate the protein tyrosine phosphatase, SHP-2 (Jackson et al 1997b; Masuda et al 1997; Sagawa et al 1997). PECAM-1 has been demonstrated to participate in a variety of physiological events, including leukocyte adhesion and migration, angiogenesis, apoptosis, and modulation of Immunoreceptor Tyrosine Activating Motif (ITAM)-mediated cellular activation (for recent reviews, see (Ilan and Madri 2003; Newman 1997; Newman and Newman 2003)).
Like most genes, variations within the nucleotide sequence of the PECAM-1 gene have been reported, with individual polymorphic residues identified within the 5′UTR, the extracellular and cytoplasmic domains, and the 3′UTR (summarized in Table 1). While the effects of these polymorphisms on PECAM-1-mediated adhesion or signaling have not yet been determined, mismatches at PECAM-1 amino acid residues 98, 536, or 643 have often (Balduini et al 1999; Behar et al 1996; Cavanagh et al 2005; Grumet et al 2001; Maruya et al 1998), though not universally (Nichols et al 1996), been associated with an increased incidence of acute graft-versus-host disease (GVHD), and these and other PECAM-1 polymorphisms have been linked with early onset of atherosclerosis (Elrayess et al 2003), increased risk of cardiovascular disease (Elrayess et al 2004; Fang et al 2005; Listi et al 2004; Sasaoka et al 2001; Song et al 2003; Wei et al 2004), and susceptibility to malarial infection (Kikuchi et al 2001), though the latter is also controversial (Casals-Pascual et al 2001).
While the frequencies of individual PECAM-1 polymorphisms have been determined in a limited number of population studies, these polymorphisms have not, to date, been linked into distinct PECAM-1 alleles, hampering efforts to more definitively establish PECAM-1-related disease associations. The purpose of the present investigation, therefore, was to determine the major alleles bearing each of the most commonly reported SNPs within the PECAM-1 gene. This information should not only permit more precise bio-epidemiological associations to be made amongst different human populations, but also enable biochemical and cell biological studies to be performed to investigate whether functional differences between PECAM-1 allelic isoforms might be causally linked to the reported disease associations of PECAM-1 SNPs.
Human whole blood was obtained from anonymous volunteer blood donors. RNA was isolated using a QIAamp® RNA Blood Mini Kit according to manufacturer’s instructions (Qiagen, Valencia, CA). cDNA was then prepared from human RNA using the SuperScript™ First-Strand Synthesis for RT-PCR kit (Invitrogen, Carlsbad CA). Following cDNA synthesis, RNase (1 μl) (Invitrogen) was added, tubes incubated at 37°C for 20 minutes, and then put on ice or frozen at −20°C for later use. Genomic DNA was isolated from whole blood using a DNA Blood Mini Kit (Qiagen) following lysis of red blood cells with ammonium chloride.
Primers used to generate nested PCR products were designed to amplify a 2520 bp region encompassing all known individual polymorphisms within the human PECAM-1 transcript (including the 5′- and 3′UTRs), and were manufactured by Integrated DNA Technologies, Inc. (Coralville, IA). Primary PCR reactions were carried out with PfuTurbo® DNA Polymerase (Stratagene, La Jolla, CA) using primers corresponding to human PECAM-1 18–35 (sense) 5′-gccatggctgccattacc-3′ and 2573–2554 (antisense) 5′-taagaaccggcagcttagcc-3′. Amplification was performed for 35 cycles (95°C for 30 seconds, 59°C for 30 seconds with a ramping speed of 3°C/second and a 10°C gradient, 72°C for 3 minutes) in an Eppendorf Mastercycler Gradient Thermal Cycler (Brinkmann Instruments, Westbury, NY). Nested PCR reactions were performed for 25 cycles under identical amplification conditions using internally nested primers corresponding to PECAM-1 26–43 (sense) 5′-tgccattacctgaccagc-3′ and 2545–2528 (antisense) 5′-tgctgtgttctgtgggag-3′.
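As a consistency check on the primer coordinates given above, the following minimal Python sketch compares each primer's length with the span of its stated transcript coordinates and derives the expected primary and nested product sizes. The function names and the dictionary layout are illustrative assumptions, not part of the original protocol. Coordinates are taken to be inclusive positions on the PECAM-1 transcript.

```python
# Primer coordinates and sequences (5'->3') as listed in the text.
# Each entry: (5' coordinate, 3' coordinate, sequence).
PRIMERS = {
    "outer_sense": (18, 35, "gccatggctgccattacc"),
    "outer_antisense": (2573, 2554, "taagaaccggcagcttagcc"),
    "nested_sense": (26, 43, "tgccattacctgaccagc"),
    "nested_antisense": (2545, 2528, "tgctgtgttctgtgggag"),
}


def span_length(start, end):
    """Number of bases covered by an inclusive coordinate pair, in either orientation."""
    return abs(end - start) + 1


def amplicon_length(sense, antisense):
    """Product length from the 5' end of the sense primer to the 5' end of the antisense primer."""
    return antisense[0] - sense[0] + 1


def check_primers(primers):
    """Return the names of primers whose sequence length disagrees with their coordinates."""
    return [
        name
        for name, (start, end, seq) in primers.items()
        if span_length(start, end) != len(seq)
    ]


mismatched = check_primers(PRIMERS)
outer_bp = amplicon_length(PRIMERS["outer_sense"], PRIMERS["outer_antisense"])
nested_bp = amplicon_length(PRIMERS["nested_sense"], PRIMERS["nested_antisense"])
print(mismatched, outer_bp, nested_bp)
```

Under these assumptions the nested product spans positions 26 to 2545, which agrees with the 2520 bp region stated in the text.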
PCR products were separated from primers on a 1.5% agarose gel and then purified using the QIAquick PCR Purification kit (Qiagen) according to manufacturer’s instructions. They were then ligated into the pPCR-Script Amp SK(+) cloning vector and transformed into XL10-Gold Kan ultracompetent cells (Stratagene). Plasmids were isolated using the QIAprep Miniprep kit (Qiagen), analyzed by selective restriction enzyme digestion, and their PECAM-1 cDNA inserts fully sequenced in a 96-well plate on an Applied Biosystems Model 3100 Capillary Sequencer with the following primers: 355–372 (sense) 5′-aagaacctgaccctgcag-3′, 411–392 (antisense) 5′-aggcttgacgtgagaggtgg-3′, 1177–1194 (sense) 5′-ttttccaagcccgaactg-3′, 1588–1607 (sense) 5′-gcggtattcaaagacaaccc-3′, 2030–2047 (sense) 5′-tcggagtgatcattgctc-3′, and 2545–2528 (antisense) 5′-tgctgtgttctgtgggag-3′. Sequencing reactions were performed for 25 cycles (96°C for 30 seconds, 50°C for 15 seconds, and 60°C for 4 minutes). Sequence analysis of smaller defined regions surrounding the R643G and a2479g polymorphisms was performed by PCR amplification using 5′-ctagaatttcccttgtcactcaccc-3′ (sense) and 5′-gctaccttcattgacacatcggct-3′ (antisense), and 5′-gggcaatcttcaatcttgag-3′ (sense) and 5′-tgggagcagggcaggttcataaat-3′ (antisense) primers, respectively, and sequenced using (sense) 5′-ctagaatttcccttgtcactcaccc-3′, (antisense) 5′-gctaccttcattgacacatcggct-3′, (sense) 5′-gggcaatcttcaatcttgag-3′.
Large-scale population genotyping for the L98V and S536N polymorphisms was performed using a MultiCode-PLx (EraGen Biosciences, Madison, WI) multiplex assay. Fluorescently labeled allele-specific elongation products were hybridized to Luminex™ (Austin, TX) microspheres and detected on the Luminex 100 instrument using methods recently described in detail (Pietz et al 2005). Primers used for both locus-specific amplification and allele-specific extension reactions were synthesized by EraGen Biosciences (Madison, WI, see http://www.eragen.com/diagnostics/technology.html). The following primers (‘x’ denotes the IsoBase residue, isoC) were used in multiplex allele-independent amplification of PECAM-1 around the L98V polymorphism (sense) 5′-xatctatgactcagggac-3′, (antisense) 5′-gtgctcagttccaag-3′, and around the S536N polymorphic site (sense) 5′-ttctatcaaatgacctcaaat-3′, (antisense) 5′-xaggctgtgcagtaat-3′. Allele-specific elongation reactions were performed with primers specific to the L98V polymorphism 5′-caaggactcaccttccaccaacag-3′ and 5′-caaggactcaccttccaccaacac-3′, and primers specific to the S536N polymorphic site 5′-ttggaccaagcagaaggctag-3′ and 5′-tttggaccaagcagaaggctaa-3′.
Hardy-Weinberg equilibrium analysis was performed, with all deviations from Hardy-Weinberg equilibrium analyzed using a Fisher’s Exact Test contingency table. Statistically significant differences in the gene frequencies of PECAM-1 allelic isoforms were determined by Chi-square analysis.
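For readers unfamiliar with Hardy-Weinberg testing, the sketch below shows one common way to compare observed genotype counts with Hardy-Weinberg expectations. It uses a chi-square goodness-of-fit test with one degree of freedom, not the Fisher's Exact Test named above, and the genotype counts are hypothetical values chosen for illustration, not data from this study. The function name and its arguments are likewise assumptions.

```python
import math


def hwe_chi_square(n_aa, n_ab, n_bb):
    """Chi-square goodness-of-fit test of genotype counts against Hardy-Weinberg proportions.

    Returns (frequency of allele A, chi-square statistic, p-value).
    """
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of allele A estimated from the sample
    q = 1.0 - p
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # One degree of freedom: 3 genotype classes - 1 - 1 estimated allele frequency.
    # For df = 1 the chi-square upper-tail probability equals erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return p, chi2, p_value


# Hypothetical genotype counts (AA, AB, BB); NOT data from this study.
freq_a, chi2, p_value = hwe_chi_square(n_aa=30, n_ab=50, n_bb=20)
print(f"freq(A) = {freq_a:.3f}, chi2 = {chi2:.4f}, p = {p_value:.3f}")
```

A large p-value indicates no detectable departure from Hardy-Weinberg proportions. For small expected counts an exact test, such as the one used in this study, is generally preferred over the chi-square approximation.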
Single nucleotide polymorphisms within the PECAM-1 gene, including several that result in amino acid substitutions, were originally identified by comparing the sequences of PECAM-1 cDNAs cloned from different laboratories (Newman et al 1990; Simmons et al 1990; Stockinger et al 1990; Zehnder et al 1992), as well as the single sequence derived thus far from PECAM-1 genomic DNA (Kirschbaum et al 1994). The frequencies for several individual SNPs have since been determined in several small studies, and the results obtained from these are summarized in Table 1, which additionally shows the location of the SNPs relative to the exon in which they are encoded.
Because PECAM-1 SNPs have recently been implicated as risk factors for graft rejection, malarial infection, and cardiovascular disease (Table 2), we undertook identification of linked polymorphisms that might more precisely define PECAM-1 allelic isoforms. In preliminary studies, RNA was isolated from small blood samples derived from seventeen unrelated, anonymous volunteer blood donors, converted to cDNA, and nested PCR was performed to amplify full-length PECAM-1 transcripts (shown schematically in Figures 1A & 1B). The final 2500 bp amplified product (Figure 1C), encompassing all known PECAM-1 polymorphisms, including those within the 5′ and 3′UTRs, was subcloned into a plasmid vector, and five clones from each donor were sequenced in their entirety to determine the linkage of SNPs carried on each of the two PECAM-1 alleles present in that individual. Sequence analysis of 34 alleles from 17 individuals revealed the presence of two distinct human PECAM-1 alleles (L98S536R643 and V98N536G643), each of which was superimposed on an additional a2479g nucleotide polymorphism within the 3′UTR that occurred on both allelic isoforms – yielding a total of four major alleles, depicted in Figure 2. None of the other reported polymorphisms noted in Table 1 were found in any of the 34 alleles examined, indicating that they are likely to be rare variants of one or more of the four primary alleles. The genetic background on which they occur remains to be established.
To determine the gene frequencies of the four major PECAM-1 alleles (LSRa, LSRg, VNGa, and VNGg), genomic DNA from 259 healthy Caucasian blood donors was subjected to Luminex-based MultiCode-PLx bead hybridization analysis that detected fluorescently labeled allele-specific elongation products hybridized to Luminex™ microspheres in multiplex (Pietz et al 2005). Of these, 49 individuals were found to be L98S536 homozygotes and 68 were V98N536 homozygotes. PCR amplification and sequence analysis of the regions surrounding the polymorphisms at R643G and a2479g were performed on these samples, and from them the gene frequencies for LSRa, LSRg, VNGa, and VNGg were calculated and found to be 0.14, 0.28, 0.27, and 0.31, respectively. Chi-square analysis revealed significant differences in the frequencies of the LSR versus VNG allelic isoforms (p<0.025), and also between the LSRa and LSRg alleles (p<0.01). None of the other gene frequencies (e.g. between VNGa and VNGg) were found to be significantly different.
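The LSR versus VNG comparison can be illustrated with a short worked calculation. The text does not state exactly how the Chi-square test was constructed, so the sketch below rests on two assumptions: that the test was a one-degree-of-freedom goodness-of-fit test against equal allele frequencies, and that each of the 117 homozygous donors contributes two chromosomes carrying the corresponding LSR or VNG background. The function name is illustrative.

```python
import math


def chi_square_equal_split(count_1, count_2):
    """Chi-square test (df = 1) of whether two allele counts depart from a 1:1 split."""
    expected = (count_1 + count_2) / 2.0
    chi2 = ((count_1 - expected) ** 2 + (count_2 - expected) ** 2) / expected
    # For df = 1 the chi-square upper-tail probability equals erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


# 49 L98S536 homozygotes and 68 V98N536 homozygotes, two chromosomes each
# (assumes R643/G643 tracks the L98S536/V98N536 background in these donors).
lsr_chromosomes = 2 * 49
vng_chromosomes = 2 * 68
chi2, p_value = chi_square_equal_split(lsr_chromosomes, vng_chromosomes)
print(f"chi2 = {chi2:.2f}, p = {p_value:.3f}")
```

Under these assumptions the statistic is 722/117, or about 6.17, which lies between the df = 1 critical values for p = 0.025 and p = 0.01 and is therefore consistent with the reported p<0.025.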
Haplotype blocks are 10–100 kb regions of the human genome that contain SNPs sufficiently close to each other as to be nearly always inherited together - i.e. in strong linkage disequilibrium (Daly et al 2001; Gabriel et al 2002; Shifman et al 2003). Thus, though few or no recombination events occur within the “blocks” themselves, intervals between haplotype blocks are commonly sites of recombination, leading to the loss of linkage disequilibrium between blocks. The observation that two major human PECAM-1 alleles, L98S536R643 and V98N536G643, are each superimposed upon an additional a2479g nucleotide polymorphism within the 3′UTR can be explained by the fact that exons 12–16 within the PECAM-1 gene are widely dispersed amongst large introns (more than 26,000 base pairs exist between the R643G polymorphism in exon 12 and the a2479g nucleotide polymorphism within exon 16), and that the middle third of the PECAM-1 gene (e.g. exons 8–12) travels together as a haplotype block. A crossing-over event somewhere between exons 12 and 16 (depicted schematically in Figure 3) was likely responsible for generating the four major alleles of PECAM-1 that we observed in this study. Moreover, though the S536N (exon 8) and R643G (exon 12) polymorphisms have consistently been found in nearly complete linkage disequilibrium (Elrayess et al 2004; Maruya et al 1998; Sasaoka et al 2001; Wei et al 2004; this study), evidence exists that the L98V polymorphism within exon 3 can be inherited independently (Cavanagh et al 2005; Listi et al 2004; Maruya et al 1998; Sasaoka et al 2001), and may not be as tightly linked to the centrally located haplotype block within the PECAM-1 gene. Thus, a second crossing-over event (depicted with dotted lines in Figure 3) appears to be responsible for forming the less common VSR and LNG allelic isoforms. Future studies will be required to determine whether the frequencies of PECAM-1 alleles might vary within different ethnic groups.
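The degree of linkage disequilibrium between the coding haplotype (LSR or VNG) and the a2479g site can be estimated from the four allele frequencies reported in this study (0.14, 0.28, 0.27, and 0.31). The sketch below computes the standard pairwise measures D, D′ and r². It is an illustration only: the frequencies are rounded and were estimated from homozygous donors, and the function name and data layout are assumptions.

```python
# Allele frequencies reported in the text for Caucasian donors.
# Keys: (coding haplotype, nucleotide at position 2479 in the 3'UTR).
HAPLOTYPE_FREQ = {
    ("LSR", "a"): 0.14,
    ("LSR", "g"): 0.28,
    ("VNG", "a"): 0.27,
    ("VNG", "g"): 0.31,
}


def linkage_disequilibrium(freq, block_allele="LSR", utr_allele="a"):
    """Return (D, signed D', r^2) between the coding block and the 3'UTR site."""
    p_block = sum(f for (block, _), f in freq.items() if block == block_allele)
    p_utr = sum(f for (_, utr), f in freq.items() if utr == utr_allele)
    d = freq[(block_allele, utr_allele)] - p_block * p_utr
    # The maximum attainable |D| depends on the sign of D.
    if d < 0:
        d_max = min(p_block * p_utr, (1 - p_block) * (1 - p_utr))
    else:
        d_max = min(p_block * (1 - p_utr), (1 - p_block) * p_utr)
    d_prime = d / d_max if d_max > 0 else 0.0
    r_squared = d * d / (p_block * (1 - p_block) * p_utr * (1 - p_utr))
    return d, d_prime, r_squared


d, d_prime, r_squared = linkage_disequilibrium(HAPLOTYPE_FREQ)
print(f"D = {d:.4f}, D' = {d_prime:.3f}, r^2 = {r_squared:.4f}")
```

With these inputs r² is below 0.02, far from the value of 1 expected for fully linked sites. This is consistent with the interpretation that recombination between exons 12 and 16 has largely uncoupled the 3′UTR polymorphism from the central haplotype block.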
Though no functional consequences of, or disease associations with, the a2479g nucleotide polymorphism have yet been reported, 3′UTRs can influence both mRNA stability (Rousseau et al 2003) and translation efficiency, and have also been associated with disease states. For example, a polymorphism in the 3′UTR of the interleukin-12B gene has been found to associate with late onset of type 1 diabetes mellitus (Windsor et al 2004), while another study demonstrated that a polymorphism within the 3′UTR of the cyclooxygenase-2 gene contributes to lung cancer susceptibility within the Chinese population (Hu et al 2005). Interestingly, the “prioritization” of synthesis of certain mRNAs involved in the synthesis of selenium-containing proteins has been shown to be influenced by sequences within 3′UTRs, and polymorphisms within this region may also regulate expression (Hesketh 2004). Based upon these findings, it may be of future interest to examine whether the PECAM-1 3′UTR polymorphism affects PECAM-1 mRNA and protein expression levels, and thereby PECAM-1-mediated adhesion and/or signaling.
Our identification of four major allelic isoforms for PECAM-1 within the Caucasian population may permit more detailed and accurate examination of PECAM-1-related disease associations (Table 2). Thus, although Sasaoka and colleagues (Sasaoka et al 2001) found a disproportionate increase in the frequencies of the L98, S536, and R643 forms of PECAM-1 in patients admitted for myocardial infarction (MI), especially males <60 yrs old, Listi et al. (Listi et al 2004) found an increased frequency (38.6% in patients compared with 24.6% in healthy individuals) of the R643 polymorphism in Sicilian MI patients in the absence of significant differences in the frequencies of L98 or S536. Similarly, Song and colleagues (Song et al 2003) found a significant increase in the frequencies of V98 and N536 polymorphisms in patients presenting with early onset of coronary artery disease (CAD), while two other studies (Fang et al 2005; Wei et al 2004) found only V98 associated with CAD. Finally, PECAM-1 serves as an endothelial cell receptor for certain strains of Plasmodium falciparum-infected erythrocytes, and there are two reports regarding susceptibility of individuals carrying the Leu98 polymorphism to malarial infection: one showing an association (Kikuchi et al 2001) and one not showing such a connection (Casals-Pascual et al 2001). It is possible that re-genotyping these populations for PECAM-1 alleles, including the newly described 3′UTR polymorphism at nucleotide 2479 (this manuscript), rather than for individual SNPs, might help resolve the seemingly disparate results obtained in each of these studies.
Finally, although the consequences for allogeneic bone marrow transplantation of mismatching individual PECAM-1 polymorphisms have been controversial (Balduini et al 1999; Behar et al 1996; Grumet et al 2001; Maruya et al 1998; Nichols et al 1996), Cavanagh et al. (Cavanagh et al 2005) recently suggested, based on the results of a small study, that genotyping for linked PECAM-1 polymorphisms might be a better predictor of both acute graft-versus-host disease (P=0.004) and overall patient survival.
The major finding of the present study is that, rather than existing as a single invariant species, PECAM-1 is actually represented by distinct molecular isoforms that differ in functionally important regions within both the extracellular and cytoplasmic domains. In addition, a polymorphism within the 3′ untranslated region further distinguishes these two protein isoforms, yielding four primary human PECAM-1 alleles: LSRa, LSRg, VNGa, and VNGg, with frequencies 0.14, 0.28, 0.27, 0.31, respectively, within the Caucasian population. The middle third of the PECAM-1 gene displays strong linkage disequilibrium, and is inherited as a haplotype block. A crossing over event within the 26 kb region lying between the polymorphism in exon 12 and the polymorphism within exon 16 has led to the generation of these four major PECAM-1 alleles, while another recombination event within the first third of the gene led to the generation of at least two additional alleles that are relatively rare, at least in Caucasians of European descent. Because PECAM-1 polymorphisms affect residues within domains of the protein that are involved in adhesion and signal transduction, it may be important in the future to determine whether differences exist in the biochemical and cell biological properties of these distinct isoforms. Finally, identification of the primary alleles of PECAM-1 should provide both investigators and clinicians with the ability to more accurately establish correlation of these alleles with diseases in which PECAM-1 polymorphisms have been implicated, including CAD, MI, and GVHD.
The authors thank Daniel B. Rowe, Ph.D., Division of Biostatistics, Medical College of Wisconsin, for his assistance in statistical analysis. This study was funded by grant HL-40126 and Training Grant HL-07209 from the National Heart, Lung, and Blood Institute of the National Institutes of Health.