We present an integrated, biologically targeted approach to protein biomarker discovery and validation. This innovative approach is easily adaptable to studying other diseases, and addresses several major obstacles which have impeded successful protein biomarker development. By using a comparative cell line secreted proteome approach, a large number of biologically relevant candidates have been identified. Integration of a SILAP standard allows for a seamless transition from discovery in vitro to validation in serum. The same peptides used to identify the candidate biomarkers in vitro can be used to perform relative quantitation in serum by SILAP standard LC-MRM/MS. Our approach bypasses the daunting task of characterizing the entire serum proteome, an important advantage, as the vast majority of proteins present in serum are likely to be unrelated to CRC, whereas potential biomarkers are likely to be present at low levels, making direct identification difficult. Validation in a mouse model is another important aspect of our approach, allowing for initial screening of biomarker candidates against a uniform genetic and environmental background. Finally, an LC-MRM/MS approach allows for validation of biomarkers against which high quality antibodies are not readily available.
Using a SILAP standard throughout the discovery and validation phases is crucial to our approach, and offers several advantages over a standard shotgun approach. As with any method integrating a labeled internal standard, the method allows for relative quantitation of the corresponding unlabeled serum proteins, while controlling for nonspecific losses during extensive sample processing.31,32
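The reasoning behind the loss correction can be sketched with a short worked relation. This is an illustrative idealization, not a formula taken from the study: the symbols L, H, r and k are assumptions introduced here, and the sketch assumes that the labeled standard is added before processing and that labeled and unlabeled forms of a peptide are recovered and detected identically.

```latex
% Assumed symbols (not from the original text):
%   L = amount of endogenous (unlabeled) protein in the sample
%   H = amount of labeled SILAP standard protein added to the sample
%   r = fraction recovered after sample processing (0 < r <= 1)
%   k = instrument response factor for the monitored peptide
\begin{align*}
S_{\mathrm{light}} &= k\,r\,L, \qquad S_{\mathrm{heavy}} = k\,r\,H \\
\frac{S_{\mathrm{light}}}{S_{\mathrm{heavy}}} &= \frac{k\,r\,L}{k\,r\,H} = \frac{L}{H} \\
\frac{\left(S_{\mathrm{light}}/S_{\mathrm{heavy}}\right)_{\mathrm{disease}}}
     {\left(S_{\mathrm{light}}/S_{\mathrm{heavy}}\right)_{\mathrm{control}}}
  &= \frac{L_{\mathrm{disease}}}{L_{\mathrm{control}}}
  \quad \text{(same } H \text{ added to both samples)}
\end{align*}
```

Under these assumptions the recovery term r cancels within each sample, so the ratio of ratios reports the relative abundance of the unlabeled protein even when recovery differs between samples.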
Limiting the analysis to proteins secreted and over-expressed in colon cancer cells made it possible to exclude acute phase proteins and other abundant serum proteins, while simultaneously focusing on proteins with biological relevance to CRC. A large proportion of the proteins identified in the CT26 secreted proteome were differentially expressed, confirming the findings of other studies investigating secreted proteomes in cancer cell lines.33
To increase the number of candidate biomarkers identified in the discovery phase, IEF was integrated at the intact protein level prior to standard 2D LC-MS/MS analysis of tryptic peptides. IEF represents an attractive, orthogonal approach for deconvoluting complex biological samples. We have described many of its advantages in a recently published study.32
One tradeoff of multiple dimensions of separation is the exponential increase in MS data acquisition time that results from increased fractionation. In this study, the amount of time required for data acquisition from a sample processed by IEF-2D-LC/MS/MS was approximately 13 days. Replicate analysis becomes impractical given these long analysis times. This problem only deepens as more layers of fractionation are added. Regardless, our IEF-2D-LC/MS/MS approach allowed for identification of 1125 proteins from both cell lines, including 418 common proteins.
Pathway analysis showed that the common proteins participate in a wide variety of important cellular processes, such as protein metabolic process, cell motility, cell growth and death, and cell communication. Ultimately, our approach should allow for systematic validation of all 418 common proteins identified. For the initial validation, 20 proteins were chosen based on criteria to maximize the likelihood of identifying meaningful biomarkers. These proteins were abundant in the SILAP standard to facilitate MRM development, and they were over-expressed in the CT26 secreted proteome, suggesting biological relevance. Eight of the proteins localized in the extracellular compartment, with the other 12 localizing to membrane-bound organelles, supporting the possibility that these proteins could be released into serum for detection.
To develop a high throughput clinical assay, it was anticipated that serum samples containing the SILAP standard could be analyzed directly by LC-MRM/MS. Even with immunoaffinity removal of the 3 most abundant serum proteins and simplification of the SILAP standard by excluding the YAMC secreted proteome, this was not possible because of ion suppression. However, when combined with an initial LC-SCX separation, 18 of 20 SILAP standard peptides could be detected in mouse serum together with 11 endogenous tryptic peptides. This made it possible to conduct relative quantitation of these peptides in normal mouse and Apcmin mouse serum. Quantitation by LC-MRM/MS was reproducible and, where antibodies were available, validated by Western blot analysis. Importantly, this approach made it possible to conduct relative quantitation of proteins such as procollagen C-proteinase enhancer, nucleobindin 1, and others for which high quality antibodies were not available. We took care to validate the specificity of the MRM transitions by concurrently obtaining a full MS/MS spectrum of each precursor ion. Monitoring of 2 or more peptides per protein could be performed; however, such an approach would severely reduce the number of biomarkers that could be monitored using ion trap methodology. It will be possible in the future to translate this methodology to higher sensitivity and throughput instrumentation, for example by using MRM retention time segmentation in combination with an ultra performance LC instrument coupled to a high sensitivity triple quadrupole mass spectrometer. Taken together, these innovative methods have facilitated the characterization of a large number of proteins secreted by a murine colon cancer cell line when compared with a normal murine colon epithelial cell line. Using the SILAP standard approach, it was possible to interrogate Apcmin and normal mouse serum for 18 of these proteins, a small subset of those identified.
Six of the 11 proteins that could be monitored were over-expressed in Apcmin mouse serum by more than 2-fold. Future work to screen a larger number of the candidates identified will yield a large panel of proteins, and will serve as a template for translating our findings to the human disease.
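The fold-change criterion described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the analysis pipeline used in the study: the function names, the data layout, the protein labels and all peak areas are hypothetical, and averaging the light/heavy ratios within each group is one simple choice among several.

```python
from statistics import mean


def light_to_heavy_ratio(light_area, heavy_area):
    """Ratio of the endogenous (light) MRM peak area to the labeled (heavy) standard."""
    if heavy_area <= 0:
        raise ValueError("heavy standard peak area must be positive")
    return light_area / heavy_area


def fold_change(case_pairs, control_pairs):
    """Mean light/heavy ratio in case samples divided by that in control samples."""
    case = mean(light_to_heavy_ratio(l, h) for l, h in case_pairs)
    control = mean(light_to_heavy_ratio(l, h) for l, h in control_pairs)
    return case / control


def flag_over_expressed(fold_changes, threshold=2.0):
    """Names of proteins whose fold change exceeds the threshold, sorted by name."""
    return sorted(name for name, fc in fold_changes.items() if fc > threshold)


# Hypothetical (light_area, heavy_area) pairs; these are NOT measurements from the study.
peak_areas = {
    "protein_A": {
        "case": [(300.0, 100.0), (330.0, 110.0)],
        "control": [(100.0, 100.0), (120.0, 120.0)],
    },
    "protein_B": {
        "case": [(150.0, 100.0)],
        "control": [(100.0, 100.0)],
    },
}

fold_changes = {
    name: fold_change(groups["case"], groups["control"])
    for name, groups in peak_areas.items()
}
flagged = flag_over_expressed(fold_changes)
```

Because each sample is normalized to its own heavy standard before the groups are compared, differences in recovery between samples do not enter the fold change.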
A number of proteins found in this study to be over-expressed in Apcmin mouse serum have been implicated in essential processes responsible for tumor growth and spread. Secreted phosphoprotein 1 (osteopontin, OPN) is a multifunctional, secreted glycoprotein implicated in a number of malignancies including breast, stomach, lung, prostate, liver and colon.34
OPN has been implicated in a variety of biological pathways crucial to tumorigenesis, including cell adhesion, chemotaxis, apoptosis, invasion, migration and anchorage-independent growth of tumor cells.35-37
OPN has been previously shown to be increased in serum of patients with CRC and other cancers compared to normal serum.34
Cathepsins are a class of globular proteases, initially described as intracellular peptide hydrolases, although several cathepsins also have extracellular functions. Many cancer cells have been found to secrete cathepsin L to degrade the components of extracellular matrices and basement membranes, thus promoting tumor invasion and metastasis.38-42
Cystatin C is a secreted member of the cystatin superfamily of cysteine protease inhibitors. By inhibiting protease activity, cystatins act to modulate extracellular matrix degradation. Increased levels of cystatin C have been found in several malignancies,43-45 and increased serum levels are associated with poorer prognosis in CRC.46
Nucleolin is an abundant RNA- and protein-binding protein. Nucleolin has not been described in the literature as a serum biomarker of CRC. On the cell surface, nucleolin serves as an attachment protein for several ligands from growth factors to virus particles.47-52
Enhanced surface expression of nucleolin has been found in numerous malignancies and on endothelial cells within the tumor vasculature. Interestingly, nucleolin has been shown in CRC cells to modulate cell adhesion and spreading on fibronectin substrates.49
Fibronectin, a multifunctional glycoprotein involved in cell-matrix interactions, is best known as one of the crucial proteins involved in wound healing; however, its expression is also altered during neoplastic transformation.53
Pyruvate kinase 3 is a key enzyme involved in glycolysis and gluconeogenesis, processes often up-regulated in cancer cells. Pyruvate kinase 3 has not been previously associated with CRC; however, a recent study demonstrated that a related pyruvate kinase, M2-PK, could be detected at higher levels in stool samples from patients with large colonic polyps and CRC.54
Procollagen C-proteinase enhancer (PCPE) also has not been previously described in CRC. PCPE is an extracellular matrix glycoprotein which binds to the C-propeptide of procollagen I and acts to enhance procollagen C-proteinase activity.55
PCPE appears important for regulation of extracellular matrix, an important pathway for tumor invasion, angiogenesis and metastasis. Nucleobindin 1 (calnuc) has been described as a calcium binding protein involved with signal transduction events. A recent study demonstrates over-expression of calnuc in colon cancer tissue, and a significant minority of CRC patients with autoantibodies against calnuc.56
Heat shock protein 1 alpha (HSP90) is a molecular chaperone over-expressed in many malignancies57 and has been identified as a therapeutic target in CRC.58
This is the first study to demonstrate an association of increased serum HSP90 levels with CRC.
Only two proteins, profilin 1 and heat shock protein 8, had increased expression in the CT26 secreted proteome but decreased expression in Apcmin mouse serum, providing evidence that such in vitro modeling is a promising strategy for biomarker discovery. The discordance for these two proteins likely reflects proteomic changes induced in cancer cells forced to grow in culture and differences in the genetic background of CT26 cells and Apcmin mice. One example is profilin 1, a widely expressed protein which has been found to act as a tumor suppressor. Interestingly, down-regulation of profilin 1 has been studied in breast cancer cells, and is associated with enhanced motility and invasiveness.59
No studies of profilin 1 in CRC have been published. Proteins under-expressed in disease states are potentially as valuable as their over-expressed counterparts when designing clinically useful biomarker panels.
The integrated MS-based discovery and validation approach presented here provides a workflow for identifying disease biomarkers, and more importantly, a platform for measuring a panel of disease biomarkers. Many CRC candidate biomarkers have been identified. This study has only explored a small fraction of the differentially expressed proteins identified as part of the discovery phase. Current work is focused on systematically characterizing all candidate biomarkers in serum, a process made possible by the SILAP standard and the MRM approach. Obstacles to a high throughput, clinically useful LC-MRM/MS assay remain. Direct analysis of even abundant proteins in serum is difficult without time and labor intensive sample processing. Potential solutions include more extensive immunoaffinity removal of abundant serum proteins or synthesis of heavy isotope peptide analogs for absolute quantitation. Translation of candidate biomarkers identified and validated in our mouse studies would be straightforward. Human cancer cell lines could be rapidly characterized, SILAC labeled and used as an internal standard for interrogating human serum samples. Development of a biomarker panel for the early detection of CRC would lead to an earlier stage of diagnosis, and therefore a greater chance of cure.