Anal Biochem. Author manuscript; available in PMC 2010 December 1.
Published in final edited form as:
PMCID: PMC2753488

Expression artifact with retroviral vectors based on pBMN


While characterizing various splice forms of p120 catenin, we observed what appeared to be a novel post-translational modification of p120 resulting in a higher molecular weight form that was dependent upon the splicing pattern. Further investigation revealed the higher molecular weight form to be a fusion protein between sequences encoded by the retroviral vector and p120. We found that the publicly available sequence of the vector we used does not agree with the experimental sequence. We caution other investigators to be aware of this potential artifact.

Keywords: p120 catenin, pBMN, LZRS, retroviral vector


The cytoplasmic tails of classical cadherins directly bind both β-catenin and p120 catenin [1]. In contrast to β-catenin, p120 [2] undergoes extensive alternative splicing leading to the expression of multiple protein isoforms [3]. There are 4 possible start codons with proteins initiating at the most upstream start codon termed isoform 1. In addition there are three exons (termed A, B and C) that are variably included. The longest possible protein would thus be termed p120-1ABC.
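The naming scheme described above can be made concrete by enumerating the nominal combinations. The following Python sketch is illustrative only. It assumes that each of the 4 start codons can be paired with any subset of exons A, B and C, which the text implies but does not state, and the name given to an isoform lacking all three exons (for example `p120-1`) is a placeholder rather than established nomenclature. Not every nominal combination need be expressed in cells.

```python
from itertools import combinations

START_CODONS = (1, 2, 3, 4)   # isoform 1 uses the most upstream start codon
EXONS = ("A", "B", "C")       # variably included exons


def nominal_isoform_names():
    """List every start-codon / exon-subset combination as a name."""
    names = []
    for start in START_CODONS:
        for k in range(len(EXONS) + 1):
            # combinations() keeps the A, B, C order within each subset
            for subset in combinations(EXONS, k):
                names.append(f"p120-{start}{''.join(subset)}")
    return names


names = nominal_isoform_names()
print(len(names))            # number of nominal combinations
print("p120-1ABC" in names)  # the longest possible protein
```

Under these assumptions the two constructs used in this study, p120-1A and p120-1AC, are two entries in the enumerated list.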

As part of an ongoing project, we obtained constructs encoding isoforms p120-1A and p120-1AC [4]. When we expressed these constructs in cells using a retroviral vector based upon LZRS [5], we observed a modification of p120 that appeared to be exon C-dependent. However, further analysis showed this was an expression vector artifact, and these data are presented below.

Materials and Methods

Cell culture

A431 cells (American Type Culture Collection, Manassas, VA) and S2-013 cells [6] were grown in Dulbecco’s Modified Eagle’s Medium (DMEM) supplemented with 10% fetal calf serum at 37°C in a 5% CO2 humidified atmosphere.

Reagents and sources of antibodies

Reagents were from Sigma-Aldrich (St. Louis, MO), or Fisher Chemicals (Fairlawn, NJ) unless otherwise indicated. Anti-p120 mouse mAb (pp120) was purchased from BD Biosciences (Franklin Lakes, NJ). Anti-HA mouse mAb (H3663) was from Sigma-Aldrich. Anti-beta-tubulin mouse mAb (E7) was from the Developmental Studies Hybridoma Bank (Iowa City, IA). Rabbit polyclonal antiserum against MoMLV integrase was kindly provided by Dr. Monica J. Roth [7]. The antiserum against the integrase was diluted 1:1000 in TBS (10 mM Tris-HCl pH 8.0 and 150 mM NaCl) for immunoblotting. The MoMLV gag p15 mouse monoclonal antibody (hybridoma supernatant from clone 34) was a kind gift from Dr. Bruce Chesebro [8].

cDNA constructs, transfections and infections

Full-length, HA-tagged, human p120 cDNAs were kind gifts from Dr. Xiang-Jiao Yang (McGill University, Montreal). Compared with p120-1A, p120-1AC contains an extra 6 amino acids encoded by exon C, as described (see Fig. 1A, [4]). The p120-1A and p120-1AC cDNAs were inserted into a derivative of the LZRS-neo retroviral vector [9]. Details of the construction are available upon request. LZRS was derived from vectors based upon pBMN [5]. Constructs were transfected into Phoenix packaging cells using TransIT-LT1 Reagent (Mirus, Madison, WI). Conditioned medium containing recombinant retrovirus was supplemented with 4 μg/ml polybrene and added to target cells as described [6]. Transfected Phoenix cells were selected with 2 μg/ml puromycin, and infected target cells were selected with 1 mg/ml G418.

Figure 1
A. The structures of the p120-1A and p120-1AC constructs are shown. Compared with p120-1A, p120-1AC contains an extra 6 amino acids encoded by exon C in the middle of the Armadillo repeats. The N-terminal HA epitope and the epitope recognized by pp120 are ...

Extraction of cells, purification of proteins, mass spectrometry and N-terminal sequencing

Confluent monolayers of cells were rinsed three times with phosphate buffered saline (PBS) and extracted on ice with TNE buffer (10 mM Tris-HCl, pH 8.0, 0.5% Nonidet P-40, 1 mM EDTA) containing 1 mM phenylmethylsulfonyl fluoride (PMSF). Extracts were centrifuged at 14,000 rpm for 15 minutes at 4°C, and the supernatant was collected. Protein concentration was determined using a Bio-Rad protein assay kit (Bio-Rad, Hercules, CA).

The protein was purified by immunoprecipitation using an anti-HA antibody agarose conjugate (Sigma) according to the manufacturer’s protocol. In brief, cells were extracted on ice with RIPA buffer (50 mM Tris-HCl [pH 8.0], 1% Nonidet P-40, 0.5% sodium deoxycholate, 0.1% SDS, 150 mM NaCl) containing protease inhibitor cocktail (CalBiochem, La Jolla, CA). The cell extracts were incubated with anti-HA agarose conjugate overnight at 4°C and the agarose beads were washed five times with RIPA buffer. The immunoprecipitated HA-tagged proteins were eluted with 200 μg/ml HA peptide (Sigma, I2149) for an hour or with SDS sample buffer. The supernatant was collected, resolved by SDS-PAGE and transferred to nitrocellulose membranes for mass spectrometry or to PVDF membranes for N-terminal sequencing. The mass spectrometry data were analyzed using the MASCOT search engine. The samples for sequencing were sent to the Protein Structure Core Facility, UNMC (Omaha, NE).

Immunoprecipitation and immunoblotting

Protein (2.0 mg) was added to 300 μl of hybridoma-conditioned medium and gently mixed at 4°C for 2 h. Packed anti-mouse IgG affinity gel (50 μl; MP Biomedicals, Solon, OH) was added, and mixing continued overnight. Immune complexes were washed five times with TBST (10 mM Tris-HCl pH 8.0, 150 mM NaCl and 0.05% Tween 20). After the final wash, the packed beads were resuspended in 50 μl of 2× Laemmli sample buffer, boiled for 5 min, and the proteins were resolved by SDS-PAGE. Proteins were electrophoretically transferred overnight to nitrocellulose membranes and immunoblotted as described previously [10].

Sequencing a portion of pBMN-Z and LZRS

The amino acid sequence in Table 2B was deduced from the nucleotide sequence of LZRS obtained with the primers CTCCGTCTGAATTTTTGC and GGTCAAGCCCTTTGTACAC that prime upstream of the gag start codon and in the p15 open reading frame, respectively. pBMN-Z (purchased from Addgene) was sequenced with the same primers and found to contain sequence identical to LZRS.

Table 2
Amino acid sequences of the Moloney murine leukemia virus polyprotein and a portion of the LZRS vector. A. The sequence of the Moloney murine leukemia virus gag-pro-pol-int polyprotein is shown with the gag, protease, polymerase, and integrase proteins ...

Results and Discussion

Constructs encoding isoforms p120-1A and p120-1AC that were N-terminally HA-tagged (YPYDVPDYA, [4]) were ligated into the retroviral expression vector LZRS. The constructs were transfected into Phoenix packaging cells [5] and the retroviral particles were used to infect target cells. Normally, p120-1A would be expected to migrate at approximately 110 kDa and this was observed (Fig. 1B). Since exon C is only 18 nt (Fig. 1A, [3]), p120-1AC is also expected to migrate at approximately 110 kDa. However, when p120-1AC was expressed in cells, we observed two bands positive with both anti-HA and the monoclonal antibody pp120 that recognizes an epitope near the C-terminus of p120 [11]. One of these two bands essentially co-migrated with isoform p120-1A at 110 kDa while another band of approximately equal intensity was observed at about 135 kDa (Fig. 1B). We observed the same two bands when p120-1AC was expressed in S2-013, A431, HeLa, MiaPaCaII, MDA-MB-468, SCC-1, MCF7 and Phoenix packaging cells.
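The expectation that exon C should not visibly change migration can be checked with simple arithmetic. The Python sketch below assumes an average residue mass of about 110 Da, a common rule of thumb that is not taken from this study, and compares the predicted contribution of exon C with the observed difference between the two bands.

```python
AVG_RESIDUE_MASS_DA = 110.0  # assumption: rule-of-thumb average residue mass

EXON_C_NT = 18               # exon C length from the text
BAND_LOW_KDA = 110           # band co-migrating with p120-1A
BAND_HIGH_KDA = 135          # unexpected upper band

# 3 nucleotides encode one amino acid
exon_c_residues = EXON_C_NT // 3

# Predicted mass added by exon C, in kDa
exon_c_shift_kda = exon_c_residues * AVG_RESIDUE_MASS_DA / 1000.0

# Difference in apparent molecular weight between the two bands
observed_shift_kda = BAND_HIGH_KDA - BAND_LOW_KDA

print(exon_c_residues, exon_c_shift_kda, observed_shift_kda)
```

Under this assumption exon C adds well under 1 kDa, so the 6 extra residues cannot by themselves account for the upper band.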

Initial experiments showed the 135 kDa band was not identified by anti-ubiquitin antibodies or antibodies against the SUMO family members (results not shown). The lack of processing intermediates between the 135 and 110 kDa bands as well as the intensity of the 135 kDa band suggested a novel post-translational modification of p120 that was exon C-dependent.

Using an anti-HA resin, we purified the 110 kDa and 135 kDa proteins from extracts of A431 and S2-013 cells that had been infected with the p120-1AC retroviral construct. Bands cut from Coomassie Blue-stained gels were digested with trypsin, the peptides processed for mass spectrometry, and the results searched against the human database. In some experiments, we obtained tryptic peptides that represented up to 70% coverage of the p120-1AC open reading frame. However, even after multiple mass spectrometry experiments, we failed to identify candidates that could have caused the shift in apparent molecular weight.

We then sent the 110 and 135 kDa bands for N-terminal sequence analysis. The N-terminal sequences of the 110 and 135 kDa proteins are shown in Table 1 along with the deduced amino acid sequence based upon the experimentally determined nucleotide sequence of the p120-1AC construct. The N-terminal sequence of the 110 kDa band clearly matched the expected sequence of methionine followed by the HA epitope. However, the N-terminal sequence of the 135 kDa band, although somewhat ambiguous, was clearly different from that of the 110 kDa band with no evidence of an HA epitope.

Table 1
Experimental and predicted amino acid sequences.

The N-terminal sequencing data together with the immunoblotting data suggested the 135 kDa protein consisted of the full-length, HA-tagged p120-1AC construct fused at its N-terminus to an unidentified protein. However, the mass spectrometry experiments failed to identify this fusion partner in the human database.

The LZRS retroviral vector system was constructed using Moloney murine leukemia virus [5]. Cells infected with Moloney murine leukemia virus express several proteins including the gag-pro-pol-int fusion protein (GenBank AAL69910) shown in Table 2A [12]. When the tryptic peptides obtained from the 135 kDa band were searched against the viral database, several peptides from the gag-pro-pol-int protein of MoMLV (highlighted in red in Table 2A) were identified.

These data prompted us to determine the nucleotide sequence of the relevant portion of the LZRS vector (Table 2B). We found the LZRS vector contains a 273 amino acid open reading frame beginning with the methionine codon of the MoMLV gag gene and continuing on through all the viral sequences that are upstream of the multiple cloning site of the expression vector. The deduced 273 amino acid protein contains the entire retroviral p15 protein that is derived from the gag protein (with 2 amino acids inserted near the N-terminus compared to AAL69910), a small portion of the retroviral p12 protein also derived from gag, a 3 amino acid sequence encoded by a linker and 128 amino acids of the viral integrase protein. Although not an exact match, the N-terminal sequence of the deduced protein is similar to the N-terminal amino acid sequence experimentally obtained from the 135 kDa band (Table 1). As fortune would have it, if we continued the conceptual translation of the p120-1AC construct downstream of the viral sequences into the multiple cloning site, the gag-int protein shown in Table 2A was in-frame with the HA tag and the p120 cDNA. However, due to differences in the 5′ end sequences between the p120-1A and p120-1AC constructs, the gag-int protein was not in-frame with HA or p120 in the p120-1A construct.
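Whether an upstream open reading frame fuses to a downstream cDNA depends only on the reading frame and on the absence of an intervening stop codon. The Python sketch below illustrates that logic. All sequences in it are short, hypothetical stand-ins and not the LZRS or p120 sequences, and the function name `in_frame_readthrough` is introduced here purely for illustration.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def in_frame_readthrough(seq, upstream_atg, downstream_atg):
    """Return True if translation from upstream_atg reaches downstream_atg
    in the same reading frame without meeting a stop codon."""
    for pos in (upstream_atg, downstream_atg):
        if seq[pos:pos + 3] != "ATG":
            raise ValueError(f"no ATG at position {pos}")
    # Different frames: the upstream ORF cannot fuse to the downstream one
    if (downstream_atg - upstream_atg) % 3 != 0:
        return False
    # Same frame: any stop codon in between terminates the upstream ORF
    for i in range(upstream_atg, downstream_atg, 3):
        if seq[i:i + 3] in STOP_CODONS:
            return False
    return True


# Hypothetical toy sequences
vector_orf = "ATG" + "GCC" * 4   # stand-in for the vector-encoded ORF
cdna = "ATG" + "TACCCATAC"       # ATG plus codons for Tyr-Pro-Tyr
linker_in_frame = "GGATCC"       # 6 nt keeps the frame
linker_shifted = "GGATCCA"       # 7 nt shifts the frame by one

construct_1 = vector_orf + linker_in_frame + cdna
construct_2 = vector_orf + linker_shifted + cdna

start_1 = len(vector_orf) + len(linker_in_frame)
start_2 = len(vector_orf) + len(linker_shifted)

print(in_frame_readthrough(construct_1, 0, start_1))
print(in_frame_readthrough(construct_2, 0, start_2))
```

In this toy example a single extra nucleotide at the 5′ end of the insert is enough to decide whether the fusion can form, which mirrors the difference reported between the p120-1AC and p120-1A constructs.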

To determine if the 135 kDa band was indeed a fusion between portions of the MoMLV gag-pro-pol-int protein with p120, we obtained antibodies against both the MoMLV integrase [13] and MoMLV p15 (monoclonal antibody 34, [8]) proteins. As shown in Fig. 2A, the anti-integrase antiserum identified the 135 kDa protein, but not the 110 kDa protein, in immunoblots. Although in our hands the monoclonal antibody recognizing p15 did not work in immunoblots, this antibody immunoprecipitated the 135 kDa, but not the 110 kDa band, from cell extracts (Fig. 2B). These data show the 135 kDa protein is a fusion between sequences encoded by the expression vector and by the HA-tagged p120 cDNA.

Figure 2
A. Extracts were prepared and immunoblotted as described in Figure 1. The 135 kDa band (lane 3), but not the 110 kDa band, was identified by the anti-integrase antiserum.

LZRS contains both the MoMLV splice donor site, located about 415 bases upstream of the gag AUG, and the splice acceptor site, located in the integrase-encoding segment [12]. If transcripts produced from LZRS are spliced using these donor and acceptor sites, the gag AUG along with all of the p15 coding region as well as much of the integrase coding region should be spliced out. However, our data suggest splicing is not complete since the 135 kDa fusion protein is as abundant as the normal-sized p120 (Fig. 2).

We have previously utilized LZRS vectors without encountering this artifact. It is interesting to note that with the integrase antiserum we also observed a 30 kDa band in cells infected with the empty LZRS vector implying the gag-int protein is typically produced (results not shown).
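The band sizes reported here are roughly consistent with the length of the vector-encoded open reading frame. The Python sketch below assumes an average residue mass of about 110 Da, a rule of thumb not taken from this study, and treats apparent molecular weights on SDS-PAGE as approximate. It ignores any residues contributed by the multiple cloning site.

```python
AVG_RESIDUE_MASS_DA = 110.0  # assumption: rule-of-thumb average residue mass

GAG_INT_RESIDUES = 273       # length of the deduced vector-encoded ORF
P120_APPARENT_KDA = 110      # apparent size of HA-tagged p120-1AC


def approx_mass_kda(n_residues):
    """Rough mass estimate from residue count, in kDa."""
    return n_residues * AVG_RESIDUE_MASS_DA / 1000.0


gag_int_kda = approx_mass_kda(GAG_INT_RESIDUES)
fusion_kda = P120_APPARENT_KDA + gag_int_kda

print(round(gag_int_kda, 2))  # estimate for the gag-int protein alone
print(round(fusion_kda, 2))   # estimate for the gag-int/p120 fusion
```

The estimate for the gag-int protein alone is close to the 30 kDa band seen with the empty vector, and the estimate for the fusion is within a few kDa of the 135 kDa band, which is reasonable agreement given the approximations involved.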

The publicly available sequences of pBMN-Z and related vectors do not agree with our sequence of LZRS. These sequences show no open reading frame across the MoMLV sequences. However, when we purchased pBMN-Z from Addgene and sequenced the relevant region, we found that pBMN-Z does contain the same open reading frame that we found in LZRS. The discrepancy is due to two nucleotide differences in the publicly available pBMN sequence. The 5′ LTR and Ψ packaging sequences in pBMN were derived from the vector MFG [5]. The sequence of MFG (US Patent 6,544,771) agrees with the sequence we found in LZRS and not with the publicly available sequence of pBMN. It is thus likely that all vectors based upon the pBMN series, such as LZRS, have the potential to express artifactual fusion proteins.

To prevent the artifact we saw with the p120-1AC construct from occurring, we recommend that investigators place a stop codon upstream of the desired start codon, either in the gag reading frame or in the reading frame of their constructs. We have done this with the p120-1AC construct and no longer see the 135 kDa band (results not shown).
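The effect of this recommendation can be illustrated with a short Python sketch. The sequences are hypothetical toy stand-ins, not the LZRS or p120 sequences, and the helper name `first_in_frame_stop` is introduced here only for illustration. The sketch locates the first stop codon in the frame set by the upstream ATG, with and without a stop codon placed ahead of the desired start codon.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def first_in_frame_stop(seq, start):
    """Index of the first stop codon in the frame defined by start,
    or None if the frame stays open to the end of seq."""
    for i in range(start, len(seq) - 2, 3):
        if seq[i:i + 3] in STOP_CODONS:
            return i
    return None


# Hypothetical toy sequences
vector_orf = "ATG" + "GCC" * 4             # stand-in for the vector-encoded ORF
insert = "ATG" + "TACCCATAC" + "TAA"       # desired ORF with its own stop

unprotected = vector_orf + insert           # upstream ORF runs into the insert
protected = vector_orf + "TAA" + insert     # stop codon added before the insert

desired_start_unprotected = len(vector_orf)
desired_start_protected = len(vector_orf) + 3

print(first_in_frame_stop(unprotected, 0), desired_start_unprotected)
print(first_in_frame_stop(protected, 0), desired_start_protected)
```

In the unprotected toy construct the first in-frame stop lies beyond the desired start codon, so translation from the upstream ATG would continue into the insert. In the protected construct the added stop codon terminates the upstream frame before the desired start codon is reached.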


We thank Dr. Laurey Steinke of the Protein Structure Core Facility in the Department of Biochemistry and Molecular Biology at the University of Nebraska Medical Center for determining the N-terminal sequences and for helpful discussions. We thank Dr. Xiang-Jiao Yang for constructs and Drs. Monica J. Roth and Bruce Chesebro for antibodies. The work was supported by GM51188 to MJW and DE12308 and CA137401 to KRJ.




[1] Wheelock MJ, Johnson KR. Cadherins as modulators of cellular phenotype. Annu Rev Cell Dev Biol. 2003;19:207–35.
[2] Reynolds AB. p120-catenin: Past and present. Biochim Biophys Acta. 2007;1773:2–7.
[3] Keirsebilck A, Bonne S, Staes K, van Hengel J, Nollet F, Reynolds A, van Roy F. Molecular cloning of the human p120ctn catenin gene (CTNND1): expression of multiple alternatively spliced isoforms. Genomics. 1998;50:129–46.
[4] Kim SC, Sprung R, Chen Y, Xu Y, Ball H, Pei J, Cheng T, Kho Y, Xiao H, Xiao L, Grishin NV, White M, Yang XJ, Zhao Y. Substrate and functional diversity of lysine acetylation revealed by a proteomics survey. Mol Cell. 2006;23:607–18.
[5] Kinsella TM, Nolan GP. Episomal vectors rapidly and stably produce high-titer recombinant retrovirus. Hum Gene Ther. 1996;7:1405–13.
[6] Fukumoto Y, Shintani Y, Reynolds AB, Johnson KR, Wheelock MJ. The regulatory or phosphorylation domain of p120 catenin controls E-cadherin dynamics at the plasma membrane. Exp Cell Res. 2008;314:52–67.
[7] Bupp K, Sarangi A, Roth MJ. Selection of feline leukemia virus envelope proteins from a library by functional association with a murine leukemia virus envelope. Virology. 2006;351:340–8.
[8] Chesebro B, Britt W, Evans L, Wehrly K, Nishio J, Cloyd M. Characterization of monoclonal antibodies reactive with murine leukemia viruses: use in analysis of strains of Friend MCF and Friend ecotropic murine leukemia virus. Virology. 1983;127:134–48.
[9] Ireton RC, Davis MA, van Hengel J, Mariner DJ, Barnes K, Thoreson MA, Anastasiadis PZ, Matrisian L, Bundy LM, Sealy L, Gilbert B, van Roy F, Reynolds AB. A novel role for p120 catenin in E-cadherin function. J Cell Biol. 2002;159:465–76.
[10] Wahl JK 3rd, Kim YJ, Cullen JM, Johnson KR, Wheelock MJ. N-cadherin-catenin complexes form prior to cleavage of the proregion and transport to the plasma membrane. J Biol Chem. 2003;278:17269–76.
[11] Xia X, Carnahan RH, Vaughan MH, Wildenberg GA, Reynolds AB. p120 serine and threonine phosphorylation is controlled by multiple ligand-receptor pathways but not cadherin ligation. Exp Cell Res. 2006;312:3336–48.
[12] Shinnick TM, Lerner RA, Sutcliffe JG. Nucleotide sequence of Moloney murine leukaemia virus. Nature. 1981;293:543–8.
[13] Puglia J, Wang T, Smith-Snyder C, Cote M, Scher M, Pelletier JN, John S, Jonsson CB, Roth MJ. Revealing domain structure through linker-scanning analysis of the murine leukemia virus (MuLV) RNase H and MuLV and human immunodeficiency virus type 1 integrase proteins. J Virol. 2006;80:9497–510.
[14] Utsumi T, Sato M, Nakano K, Takemura D, Iwata H, Ishisaka R. Amino acid residue penultimate to the amino-terminal gly residue strongly affects two cotranslational protein modifications, N-myristoylation and N-acetylation. J Biol Chem. 2001;276:10505–13.