Tumor-derived proteins may occur in the circulation as a result of secretion, shedding from the cell surface, or cell turnover. We have applied an in-depth comprehensive proteomic strategy to plasma from intestinal tumor–bearing Apc mutant mice to identify proteins associated with tumor development. We used quantitative tandem mass spectrometry of fractionated mouse plasma to identify differentially expressed proteins in plasma from intestinal tumor–bearing Apc mutant mice relative to matched controls. Up-regulated proteins were assessed for the expression of corresponding genes in tumor tissue. A subset of proteins implicated in colorectal cancer were selected for further analysis at the tissue level using antibody microarrays, Western blotting, tumor immunohistochemistry, and novel fluorescent imaging. We identified 51 proteins that were elevated in plasma with concordant up-regulation at the RNA level in tumor tissue. The list included multiple proteins involved in colon cancer pathogenesis: cathepsin B and cathepsin D, cullin 1, Parkinson disease 7, muscle pyruvate kinase, and Ran. Of these, Parkinson disease 7, muscle pyruvate kinase, and Ran were also found to be up-regulated in human colon adenoma samples. We have identified proteins with direct relevance to colorectal carcinogenesis that are present both in plasma and in tumor tissue in intestinal tumor–bearing mice. Our results show that integrated analysis of the plasma proteome and tumor transcriptome of genetically engineered mouse models is a powerful approach for the identification of tumor-related plasma proteins.
Although our understanding of colorectal cancer has substantially improved, few circulating biomarkers have emerged that have diagnostic utility. The current gold standard is screening by visual endoscopy. Newer modalities, such as computed tomographic colography or fecal DNA testing, have not achieved widespread usage (1, 2). There remains a substantial need for noninvasive diagnostic methods.
An alternative approach to screen for colorectal cancer is through blood-based testing. Proteins detectable in serum and plasma are the basis of commonly relied upon tests to detect prostate, ovarian, and pancreatic cancer through the measurement of prostate-specific antigen, CA125, and CA19.9, respectively (3–5). Current colorectal cancer circulating markers, exemplified by carcinoembryonic antigen, have poor sensitivity and specificity that preclude their usage as a population-wide screening tool (6). The development of panels of protein markers may provide the necessary sensitivity and specificity for blood-based testing of colorectal cancer.
Current proteomics technologies allow systematic interrogation of complex proteomes and identification of differentially expressed proteins, whether in cells, tissues, or body fluids (7). However, biomarker discovery in humans is challenged by extensive heterogeneity at the disease and patient sample procurement levels. Prior to embarking on a large-scale effort to identify colorectal cancer–specific biomarkers from human patients, we examined the feasibility of the approach by using a genetically modified mouse model. Genetically engineered mouse models of human cancer can be interrogated at defined stages of tumor development, under homogenized breeding and environmental conditions, and with standardized blood sampling, thereby reducing biological and nonbiological heterogeneity and permitting the application of proteomics to the identification of cancer markers found in the circulation.
Colorectal cancer, whether it is sporadic or the result of cancer predisposition syndromes, is associated with a mutation in the Apc gene (8). Several mouse lines, each carrying a different mutation in the Apc gene, have been described (9). Most of these genetically modified mice show an intestinal tumor predisposition phenotype and develop few to many adenomas and adenocarcinomas. In this study, we have investigated a mouse model of intestinal tumorigenesis, Apc Δ580, to determine the spectrum of protein changes that occur in mouse plasma with tumor development and the extent to which these observed changes reflect the tumor tissue of origin versus inflammation and other nonspecific disease processes (10).
Heterozygous Apc Δ580 mice on the C57BL/6 (B6) background were mated with wild-type B6 mice (10). The resulting offspring were screened by PCR of tail DNA using standard methods. Heterozygous Apc Δ580 mice were used for the studies. Wild-type age-matched and sex-matched littermates were used as controls.
Plasma pools (1 mL) from 10 tumor-bearing Apc Δ580 mice (10–12 weeks) and from 10 non–tumor-bearing wild-type littermates were individually immunodepleted of the three most abundant proteins (albumin, IgG, and transferrin) using an Ms-3 column (4.6 × 250 mm; Agilent). Briefly, columns were equilibrated with buffer A at 0.5 mL/min for 13 min, and 75-µL aliquots of the pooled plasma were injected after filtration through a 0.22-µm syringe filter. The flowthrough fractions were collected for 10 min using buffer A at a flow rate of 0.5 mL/min, combined, and stored at −80°C until use. The column-bound material was recovered by elution for 8 min with buffer B at 1 mL/min. Subsequently, immunodepleted samples were concentrated using Centricon YM-3 devices (Millipore) and re-diluted in 8 mol/L of urea, 30 mmol/L of Tris (pH 8.5), and 0.5% octyl-β-d-glucopyranoside (Roche). Samples were reduced with DTT in 50 µL of 2 mol/L Tris-HCl (pH 8.5; 0.66 mg DTT/mg protein), and isotopic labeling of intact proteins at cysteine residues was done with acrylamide. Controls received the light acrylamide isotope (D0-acrylamide, >99.5% purity; Fluka), whereas cases received the heavy 2,3,3′-D3-acrylamide isotope (D3-acrylamide, >98% purity; Cambridge Isotope Laboratories). Alkylation with acrylamide was done for 1 h at room temperature by the addition of 7.1 mg of D0-acrylamide or 7.4 mg of D3-acrylamide per milligram of protein, diluted in a small volume of 2 mol/L of Tris-HCl (pH 8.5; ref. 11).
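The per-milligram reagent amounts quoted above scale linearly with the protein mass of each pool. The following is a minimal sketch of that arithmetic, not part of the published protocol; the function name `labeling_reagents` and the 2 mg example input are hypothetical.

```python
# Reagent amounts per milligram of protein, as quoted in the protocol text.
DTT_MG_PER_MG_PROTEIN = 0.66
D0_ACRYLAMIDE_MG_PER_MG_PROTEIN = 7.1  # light label, used for control pools
D3_ACRYLAMIDE_MG_PER_MG_PROTEIN = 7.4  # heavy label, used for case pools


def labeling_reagents(protein_mg: float, heavy: bool) -> dict:
    """Return the mg of DTT and acrylamide for a given protein mass.

    `heavy=True` selects the D3-acrylamide amount (cases);
    `heavy=False` selects the D0-acrylamide amount (controls).
    """
    if protein_mg <= 0:
        raise ValueError("protein_mg must be positive")
    acrylamide_per_mg = (
        D3_ACRYLAMIDE_MG_PER_MG_PROTEIN if heavy else D0_ACRYLAMIDE_MG_PER_MG_PROTEIN
    )
    return {
        "dtt_mg": protein_mg * DTT_MG_PER_MG_PROTEIN,
        "acrylamide_mg": protein_mg * acrylamide_per_mg,
    }


if __name__ == "__main__":
    # Hypothetical example: a pool containing 2 mg of protein after depletion.
    print("control:", labeling_reagents(2.0, heavy=False))
    print("case:   ", labeling_reagents(2.0, heavy=True))
```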
After isotopic labeling, the case and control pools were mixed in a 1:1 ratio, diluted to 10 mL with 20 mmol/L of Tris in 6% isopropanol, and 4 mol/L of urea (pH 8.5). The combined sample pools were then separated using a two-dimensional protein fractionation strategy (anion exchange followed by reverse phase chromatography). For the first dimension, samples were injected onto a Mono-Q 10/100 column (Amersham Biosciences). The buffer system consisted of solvent A: 20 mmol/L of Tris in 6% isopropanol and 4 mol/L of urea (pH 8.5); and solvent B: 20 mmol/L of Tris in 6% isopropanol, 4 mol/L of urea, and 1 mol/L of NaCl (pH 8.5). The separation was done at 4.0 mL/min in a gradient of 0% to 35% solvent B for 44 min; 35% to 50% solvent B for 3 min; 50% to 100% solvent B for 5 min, and 100% solvent B for an additional 5 min. A total of nine pools were collected. For the second dimension, each of the anion exchange pools was further fractionated using a Poros R2 column (4.6 × 50 mm; Applied Biosystems) with TFA/acetonitrile as the buffer system (solvent A: 95% H2O, 5% acetonitrile + 0.1% TFA; and solvent B: 90% acetonitrile, 10% H2O + 0.1% TFA) at 2.7 mL/min. The gradient used was 5% solvent B until absorbance reached baseline (desalting step) and then 5% to 50% solvent B for 18 min; 50% to 80% solvent B for 7 min, and 80% to 95% solvent B for 2 min. Sixty fractions of 900 µL were collected during each run, corresponding to a total of 540 fractions across the nine pools. Fractions were pooled to normalize masses, resulting in 117 discrete fractions.
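The first-dimension gradient and the fraction bookkeeping above can be written down as a small piecewise-linear program. This is only an illustrative sketch of the numbers stated in the text; the names `ANION_EXCHANGE_STEPS` and `percent_b` are hypothetical and do not correspond to any instrument software.

```python
# Each step is (duration in min, starting %B, ending %B), taken from the text:
# 0-35% B over 44 min, 35-50% over 3 min, 50-100% over 5 min, hold 100% for 5 min.
ANION_EXCHANGE_STEPS = [(44, 0, 35), (3, 35, 50), (5, 50, 100), (5, 100, 100)]

ANION_EXCHANGE_POOLS = 9          # pools collected in the first dimension
REVERSE_PHASE_FRACTIONS = 60      # fractions collected per reverse phase run


def percent_b(t_min: float, steps=ANION_EXCHANGE_STEPS) -> float:
    """Return the programmed % solvent B at time t_min by linear interpolation."""
    if t_min < 0:
        raise ValueError("time must be non-negative")
    elapsed = 0.0
    for duration, start, end in steps:
        if t_min <= elapsed + duration:
            fraction_of_step = (t_min - elapsed) / duration
            return start + fraction_of_step * (end - start)
        elapsed += duration
    raise ValueError("time is beyond the end of the gradient program")


def total_run_time(steps=ANION_EXCHANGE_STEPS) -> float:
    """Sum of all step durations in minutes."""
    return float(sum(duration for duration, _, _ in steps))


if __name__ == "__main__":
    for t in (0, 22, 44, 47, 52, 57):
        print(f"t = {t:>2} min -> {percent_b(t):.1f}% B")
    print("total fractions:", ANION_EXCHANGE_POOLS * REVERSE_PHASE_FRACTIONS)
```

With nine pools and 60 fractions per run, the product reproduces the 540 fractions reported before pooling to 117.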
The final dimension of separation was done online with a 75-µm internal diameter × 20 cm column packed with C18 media running at a 225 nL/min flow rate provided by a Surveyor MS pump, using a gradient of 5% to 60% acetonitrile (mobile phases: water with 0.1% formic acid and acetonitrile with 0.1% formic acid) over the course of 100 min with a total run time of 150 min. The samples were run on a high-resolution LTQ-FT-MS (Thermo-Fisher Scientific) in a top nine configuration (one MS 100K resolution full scan and nine MS/MS scans). Dynamic exclusion was set to 1 with a limit of 180 s and early expiration set to six full scans. To ascertain column performance and to observe any potential carryover that might have occurred, a 2.5-h standard mixture of five Angio mix peptides (Michrom BioResources) was run after every 12 experimental samples. The acquired mass spectral data were automatically processed by the Computational Proteomics Analysis System (12). Searches were done considering cysteine alkylation with the light form of acrylamide as a fixed modification and the heavy form of acrylamide (+3.01884 Da) as a variable modification. For the identification of proteins with a false discovery rate of <5%, liquid chromatography MS/MS spectra were subjected to tryptic searches against a database consisting of forward and reverse mouse IPI databases released in January 2006 (v.3.12) using X!Comet (13). The database search results were then analyzed by PeptideProphet (14) and ProteinProphet (15), and those proteins that resulted in a ProteinProphet error rate of ≤5% were retained.
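Searching a combined forward and reverse database allows the false discovery rate to be estimated from the number of reverse-sequence (decoy) hits that pass a score threshold. The text does not state the exact formula used, so the sketch below assumes one common estimate, decoy hits divided by forward hits; the function name `estimate_fdr` and the toy scores are hypothetical, and this is not the PeptideProphet or ProteinProphet model.

```python
from typing import List, Tuple


def estimate_fdr(hits: List[Tuple[float, bool]], threshold: float) -> float:
    """Estimate FDR at a score threshold from a forward/reverse search.

    `hits` holds (score, is_decoy) pairs, where is_decoy is True for a match
    to a reversed sequence. Assumed estimate: passing decoys / passing targets.
    """
    passing = [is_decoy for score, is_decoy in hits if score >= threshold]
    n_decoy = sum(passing)
    n_target = len(passing) - n_decoy
    if n_target == 0:
        raise ValueError("no forward-database hits pass the threshold")
    return n_decoy / n_target


if __name__ == "__main__":
    # Toy data only: 40 forward hits and 1 reverse hit above the threshold,
    # plus 5 low-scoring reverse hits that are filtered out.
    toy_hits = [(0.95, False)] * 40 + [(0.95, True)] + [(0.20, True)] * 5
    fdr = estimate_fdr(toy_hits, threshold=0.90)
    print(f"estimated FDR: {fdr:.3f}")
    print("passes 5% cutoff:", fdr < 0.05)
```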
Quantitative information was extracted from acrylamide-labeled peptides using the q3 script to obtain the relative quantification for each pair of peptides identified by MS/MS that contains cysteine residues (11). All peptide ratios for a specific protein present in a particular fraction were normalized and log-averaged to obtain the local relative protein ratio.
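Log-averaging heavy/light peptide ratios amounts to taking their geometric mean, which treats a 2-fold increase and a 2-fold decrease symmetrically. The sketch below illustrates only that averaging step under this assumption; it is not the q3 script, the normalization step is omitted because its details are not given here, and the name `log_average_ratio` is hypothetical.

```python
import math
from typing import Sequence


def log_average_ratio(peptide_ratios: Sequence[float]) -> float:
    """Combine heavy/light peptide ratios into one protein ratio.

    The ratios are averaged in log2 space and transformed back, which is
    equivalent to the geometric mean of the input ratios.
    """
    if not peptide_ratios:
        raise ValueError("at least one peptide ratio is required")
    if any(r <= 0 for r in peptide_ratios):
        raise ValueError("ratios must be positive")
    mean_log2 = sum(math.log2(r) for r in peptide_ratios) / len(peptide_ratios)
    return 2.0 ** mean_log2


if __name__ == "__main__":
    # Toy ratios for two cysteine-containing peptides of one protein.
    print(log_average_ratio([2.0, 8.0]))   # geometric mean of 2 and 8
    print(log_average_ratio([0.5, 2.0]))   # opposite 2-fold changes cancel
```

An arithmetic mean of 0.5 and 2.0 would give 1.25, whereas the log-average gives 1.0, which is why ratios are usually averaged on a log scale.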
To assign biological significance to differentially labeled proteins, we accessed the Expert Protein Analysis System proteomics server to query Swiss-Prot and TrEMBL for ontological information regarding protein function, domain-family taxa, tissue specificity, posttranslational modifications, disease associations, etc. (16). We also interrogated biological text analysis databases such as MedMiner and XplorMed to determine whether these candidates were known plasma proteins, secreted or membrane proteins, or were associated with the intestinal epithelium, cancer, or colorectal cancer (17, 18). Lastly, we used the Database for Annotation, Visualization, and Integrated Discovery 2.1, FatiGO, and Ingenuity (19, 20). Proteins that were known to be associated with colorectal cancer, cancer biology, or the intestinal epithelium were prioritized for further validation. Thus, integration of relative protein abundance, gene expression data, and functional and ontologic information provided a basis for further validation.
Paraffin-embedded tissue sections (5 µm) were deparaffinized in xylene, followed by alcohol rehydration. After quenching endogenous peroxidases in 3% H2O2 in methanol, the slides were rinsed in distilled water, and an antigen retrieval step was carried out in a microwave oven for a total of 10 min in preheated citrate buffer (pH 6.0). The slides were then incubated with primary antibodies at room temperature overnight. The following primary antibodies were used: cullin-1 (Abcam), cathepsin B and cathepsin D (Santa Cruz Biotechnology), Ran (Santa Cruz Biotechnology), PKM2 (Cell Signaling Technology), and DJ-1 (Covance). After washing in PBS, slides were incubated with either anti-rabbit (Jackson Immunoresearch Laboratories) or anti-goat secondary antibody for 1 h at room temperature. The Vectastain Elite ABC kit (Vector Laboratories) was used for detection as per the instructions of the manufacturer. The slides were stained with 3,3′-diaminobenzidine and counterstained with Mayer's hematoxylin.
A protease-activatable probe, Prosense 680 (VisEn Medical), was injected i.v. at a dose of 2 nmol/mouse 24 h prior to ex vivo imaging. The imaging probe is a long-circulating graft copolymer that markedly increases its fluorescence intensity after selective enzymatic cleavage of lysine-lysine bonds, primarily by cathepsin B in vivo. Prior to imaging, the mice were sacrificed. The small bowel was removed, splayed, and washed in saline. White light imaging, immediately followed by near-IR imaging using a filter set optimized for Cy5.5, was done on a prototype OV-100 small animal fluorescence imaging system (Olympus).
Antibody microarray validation was done as previously described (21). The clusterin, cathepsin B, and cathepsin D antibodies were purchased from R&D Systems.
Mouse plasma samples were separated on 4% to 12% Bis-Tris 26-well gels (Criterion XT, Bio-Rad), transferred to a nitrocellulose membrane (0.45 µm, Bio-Rad), and blocked overnight. For sample detection, the nitrocellulose was incubated with a rabbit anti–DJ-1 polyclonal antibody (Santa Cruz Biotechnology). A donkey anti-rabbit IgG conjugated with horseradish peroxidase (Pierce) was used as the secondary antibody. Antigen-antibody bands were visualized using the SuperSignal West Pico Chemiluminescent substrate (Pierce), followed by exposure to Kodak Biomax XAR film (Sigma-Aldrich). The film was scanned and the intensities of the bands were quantified using the image analysis software Quantity One (version 4.4.1, Bio-Rad).
The study was designed to test directly whether in-depth quantitative proteomic analysis of plasma from cases relative to matched controls yields identification of increased levels of plasma proteins derived from intestinal tumor tissue versus nonspecific protein changes related to the host response. An overview of this workflow is presented in Fig. 1. Plasma was obtained from 10 tumor-bearing Apc Δ580 mice at 10 to 12 weeks of age and 10 age-matched and sex-matched controls. Tumor status of individual mice was assessed by histopathology. These mice presented with 30 to 50 adenomas in both the small and large intestine. For quantitative analysis, differential isotopic labeling was applied to pooled plasma from tumor-bearing and control animals, respectively. The pools were combined in a 1:1 ratio, followed by extensive fractionation of intact proteins using anion exchange and reverse phase chromatography, yielding a total of 117 distinct plasma protein fractions. After tryptic digestion, individual fractions were analyzed separately by tandem mass spectrometry (MS/MS). Approximately 1,400,000 mass spectra were produced and analyzed in this study. Collectively, the fractions analyzed resulted in a primary list of 1,213 distinct proteins identified with high confidence (<5% false discovery rate, based on reverse-database searches). Of this total, 1,132 proteins were identified with two or more peptides. Interestingly, 304 proteins were increased by ≥1.5-fold in case pools compared with control pools, whereas 20 proteins were decreased by ≥1.5-fold, as may be expected with the release of tumor-derived proteins into the circulation. These proteins were broadly distributed in relation to their representation as secreted, membrane, cytoplasmic, or nuclear proteins based on gene ontology (Fig. 2A).
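The ≥1.5-fold criterion can be expressed as a simple classification of case/control protein ratios. The sketch below assumes that "decreased by ≥1.5-fold" means a ratio of at most 1/1.5; the function name `classify_ratio` and the example ratios are hypothetical and are not values from this study.

```python
FOLD_THRESHOLD = 1.5  # fold-change cutoff stated in the text


def classify_ratio(case_over_control: float, threshold: float = FOLD_THRESHOLD) -> str:
    """Label a case/control protein ratio as increased, decreased, or unchanged.

    Assumption: a decrease of at least `threshold`-fold corresponds to a
    ratio less than or equal to 1 / threshold.
    """
    if case_over_control <= 0:
        raise ValueError("ratio must be positive")
    if case_over_control >= threshold:
        return "increased"
    if case_over_control <= 1.0 / threshold:
        return "decreased"
    return "unchanged"


if __name__ == "__main__":
    # Hypothetical ratios for illustration only.
    for ratio in (2.3, 1.5, 1.2, 0.9, 0.5):
        print(f"{ratio:>4} -> {classify_ratio(ratio)}")
```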
Plasma proteins whose levels were increased in the plasmas from cases versus controls may have originated either from overexpression in tumor tissue, or from a systemic host response. To identify those plasma proteins that may have originated from intestinal tumor tissue, we correlated protein levels in plasma with expression of corresponding genes at the RNA level in tumor tissue. Reliance on gene expression, although limited by potential discordance between RNA and protein levels, takes advantage of the wealth of transcriptomic data already available for various tumor types and is intended to assess the potential tumor tissue origin of candidate markers. One such transcriptome study yielded 2,545 genes that were overexpressed in adenomas and 2,576 genes that were overexpressed in adenocarcinomas (22). Based on this analysis, 51 proteins that were found to have increased levels in plasma from tumor-bearing mice also exhibited overexpression at the RNA level in tumor tissue (Table 1).
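Conceptually, this integration step is an intersection between the set of proteins increased in plasma and the set of genes overexpressed in tumor tissue. The following sketch shows that idea on toy inputs; the function name `concordant_candidates`, the case-insensitive symbol matching, and the `PLACEHOLDER_*` entries are assumptions for illustration, and real analyses would need proper protein-to-gene identifier mapping.

```python
from typing import Iterable, List


def concordant_candidates(
    plasma_increased: Iterable[str], tumor_rna_overexpressed: Iterable[str]
) -> List[str]:
    """Return gene symbols present in both lists, sorted alphabetically.

    Symbols are compared case-insensitively, a crude simplification that
    stands in for a real identifier-mapping step.
    """
    plasma = {symbol.strip().upper() for symbol in plasma_increased}
    tumor = {symbol.strip().upper() for symbol in tumor_rna_overexpressed}
    return sorted(plasma & tumor)


if __name__ == "__main__":
    # Toy inputs: three symbols named in this paper plus placeholders that
    # appear in only one list each.
    plasma_up = ["Ctsb", "Ctsd", "Park7", "PLACEHOLDER_PLASMA_ONLY"]
    tumor_up = ["CTSB", "CTSD", "PARK7", "PLACEHOLDER_TUMOR_ONLY"]
    print(concordant_candidates(plasma_up, tumor_up))
```

Proteins that are increased in plasma but absent from the tumor list, such as the plasma-only placeholder here, would be the ones more likely to reflect a systemic host response.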
To further determine the relationship of these 51 proteins to tumorigenesis, we annotated these proteins using knowledgebases such as MedMiner, XplorMed, Database for Annotation, Visualization, and Integrated Discovery 2.1, and FatiGO (17–20). Ingenuity Pathway Analysis was also used to determine their cancer relevance from information in the Ingenuity Pathways Knowledge Base (Fig. 2B). The 51 proteins were found to be involved in major processes such as cell death (28 proteins), cellular growth and proliferation (25 proteins), cell to cell signaling and interactions (19 proteins), and cell cycle control (12 proteins). One protein was related to the Wnt signaling pathway associated with inactivation of the Apc gene.
The finding that many of the proteins that occurred at increased levels in plasma were overexpressed in the tumor transcriptome suggested that these circulating proteins were tumor-derived. To confirm this relationship, we examined protein expression in tumor tissue by immunohistochemistry. We focused our validation efforts on proteins for which antibodies with the prerequisite specificity were available. Six proteins that met these criteria were cathepsin B, cathepsin D, Cul1, Park7, Pkm2, and Ran, all of which were found to be overexpressed in tumor tissue with low to undetectable expression in normal intestinal epithelium (Fig. 3). Cathepsins B and D, Park7, and Pkm2 were present in tumor cells in a cytoplasmic distribution. Cul1 and Ran were also found in tumor cells, but in both nuclear and cytoplasmic compartments. Increased levels of Park7, Pkm2, and Ran were also observed in human colon adenoma tissue (Fig. 4).
Using available antibodies, we examined the levels of a subset of up-regulated proteins in the plasma of an independent set of 40 individual tumor-bearing mice and 40 matched controls using antibody microarrays and Western blotting. Cathepsin B, cathepsin D, clusterin, and Park7 were found to occur at increased levels in plasma from individual tumor-bearing mice (Fig. 5). The P values were highly significant, ranging from P = 0.002 for cathepsin B and Park7 to P = 6.66 × 10⁻⁷ for cathepsin D.
Cathepsin B and cathepsin D were two of the proteins that were identified at increased levels in plasma from tumor-bearing mice by MS and antibody microarray, and in tumor tissue by immunohistochemistry. Both of these proteins are proteases that have been implicated in cancer pathogenesis. To assess their biological activity in tumors compared with normal mucosa, we injected tumor-bearing mice with the Prosense imaging agent, a normally inert fluorophore that is activated through cleavage by cathepsins (23). Comparison of white light imaging with near-IR imaging, which detects the activated Prosense probe, revealed that only the tumor and not the surrounding normal mucosa was positive, indicating that cathepsin activity occurred at the tumor site (Fig. 6).
Genetically modified mouse models are attractive as a source of markers because of their genetic homogeneity. In a prior study of an ApcMin mouse model, we showed that plasma proteome analysis of relatively abundant proteins could distinguish between tumor-bearing mice and their wild-type counterparts (21). The proteins that provided such discrimination were largely acute phase reactants, which have poor specificity for intestinal tumors. To identify intestinal tumor markers with higher specificity, we have conducted an in-depth quantitative multidimensional MS analysis of plasma proteins from tumor-bearing mutant Apc Δ580 mice (cases) and non–tumor-bearing wild-type littermates (controls).
Previous MS-based biomarker discovery studies have focused analysis on either the peripheral blood compartment or directly on the colorectal cancer tissue (24–27). Whereas the former approach ensures that the identified candidate markers are present in blood, the tumor origin of such proteins is uncertain. Conversely, the latter approach assures that the candidate markers are tumor-derived; however, it is unclear if they occur in serum or plasma. In this study, we implemented an approach in which quantitative tandem MS-based plasma proteome data were integrated with tumor tissue transcriptome analysis. To this effect, we have taken advantage of differential isotopic labeling of plasma proteins with acrylamide, coupled with extensive fractionation to achieve both quantitative analysis and high sensitivity (11). We have also integrated proteomic data with tumor transcriptomic data and biological annotation data to assess the intestinal tumor relevance of up-regulated plasma proteins.
The depletion of high-abundance plasma proteins, coupled with extensive fractionation prior to mass spectrometry, allowed us to identify >1,200 proteins in plasma, most of which were identified with two or more unique peptides with an overall accuracy of detection estimated to be 96.5% based on reverse database search. The predominance of up-regulated proteins reflects the contributions of secretion, protein shedding, and cell turnover to the plasma proteome. During the development of the tumor, some genes and their corresponding proteins were overexpressed whereas others lost their expression or were underexpressed relative to normal cells. Examination of plasma would only reveal those that are more abundantly expressed, and therefore, it is not surprising that most of the differentially expressed proteins that we detected in plasma were up-regulated.
To determine the relevance of up-regulated proteins to intestinal tumors, we integrated findings from the plasma proteome with intestinal tumor transcriptomic and biological annotation data. This integrated analysis highlighted the relevance to colorectal cancer pathogenesis of several of the up-regulated proteins. Antibody microarrays and Western blot analysis of plasma provided a confirmation of MS-based findings. Intestinal tumor immunohistochemistry and Prosense imaging showed that the proteins examined were indeed expressed at the tumor site, and in some cases, were overexpressed specifically in tumor cells. Below, we discuss potential candidate markers for colorectal cancer.
Progression through the cell cycle is based on precisely regulated levels of cyclins and cyclin-dependent kinases through the ubiquitin-proteasome pathway. The SCF E3 ubiquitin ligase is responsible for the attachment of ubiquitin and contains Rbx1, Cul1, and Skp1. Variable F-box proteins, such as Skp2, provide target specificity (28, 29). Skp2 has been shown to be critical for the degradation of p27, thereby allowing progression through the G1-S checkpoint (30, 31). Degradation of p27 has been shown to be important for colorectal cancer in both mouse models and in humans (32, 33). As such, circulating Cul1 might serve as a surrogate marker for increased p27 degradation, reflecting the characteristic rapid progression through the cell cycle.
Cathepsins are involved in the degradation of the basement membrane and extracellular matrix, a key step in tumor invasion and metastasis. Two such proteases, cathepsin B and cathepsin D, have been shown to be associated with colorectal cancer (34–37). In accordance with the antibody microarray data, cathepsin B and cathepsin D were determined to be of intestinal tumor origin by both immunohistochemistry and ex vivo fluorescence imaging using Prosense.
In normal cells, pyruvate kinase is a key glycolytic enzyme that is responsible for the production of ATP through the conversion of phosphoenolpyruvate to pyruvate. However, in tumor cells, pyruvate kinase is converted from its active tetrameric form into an inactive dimeric form, resulting in a shunting of the upstream phosphometabolites towards synthesis of nucleic acids, amino acids, and phospholipids (38). In accordance with our findings above, Pkm2 has been shown to be increased in cancer including colorectal cancer and has shown potential as a tissue-based biomarker (39–41). Furthermore, the switch to the Pkm2 isoform in tumors has been recently shown to be necessary for aerobic glycolysis and is responsible for their subsequent selective growth advantage (42).
Park7 or Dj-1 is a gene that has been implicated in autosomal recessive, early-onset Parkinsonism (43). However, Dj-1 has also been shown to have transforming activity in conjunction with Ras signaling and to be a negative regulator of Pten (44). Furthermore, its expression is increased in both breast and lung carcinoma (45). Lastly, this gene has been identified as a novel circulating tumor antigen in patients with breast cancer (46). Taken together, circulating Dj-1 might act as a surrogate marker for PI3K activation in colorectal cancer.
Ran is a GTPase that is involved in protein trafficking across the nuclear membrane, formation of the mitotic spindle, and nuclear envelope assembly (47). These roles in nuclear transport and mitosis may explain the elevated Ran levels in tumor cells.
Our confirmation studies were limited based on the availability of robust antibodies. Therefore, additional proteins from the identified set may also have relevance as plasma-based markers for colorectal cancer. Our findings support the view that genetically modified mice provide useful models for biomarker discovery as recently reported for a mouse model of pancreatic cancer (48). Our results also suggest that examination of plasma from tumor-bearing individuals and comparing the profile with those of age-matched and sex-matched controls would yield interesting biomarkers that might be useful for the detection of human colorectal cancer.
Grant support: National Cancer Institute (CA084301), National Institute of Diabetes and Digestive and Kidney Diseases (DK780332), American Gastroenterological Association (Research Scholar Award), and a contract from SAIAC (F012593).
Disclosure of Potential Conflicts of Interest
No potential conflicts of interest were disclosed.