In the search for clinically relevant biomarkers, the low mass range of the serum
proteome, particularly peptides with a molecular mass below 3,000 Da, has not
received the same attention as higher molecular weight peptides and proteins. Small,
preexisting peptides are not readily picked up by high-throughput liquid
chromatography/liquid chromatography–MS/MS (LC/LC-MS/MS) analyses of
whole-proteome tryptic digests and have also been underrepresented in
surface-enhanced laser desorption/ionization–TOF (SELDI-TOF) MS-based
screens that seem to favor polypeptides in the 5- to 15-kDa mass range (19–24).
The current study and a recent analysis by Koomen et al. (17) provide the first details on the
composition of the peptide pool in serum and plasma. Overall, it appears that a
large part of the human serum peptidome as detected by MALDI-TOF MS is produced ex
vivo by degradation of endogenous substrates by endogenous proteases. As illustrated
in Figure , peptides are generated during
the proteolytic cascades that occur in the intrinsic pathway of coagulation and
complement activation (50). Some of these are
known bioactive molecules, others represent cleaved propeptides, and still others
are seemingly random internal fragments of the precursor proteins. However, the
observed cleavage sites are generally consistent with trypsin- and chymotrypsin-like
activities of known serine proteases (kallikreins, plasmin, thrombin, factor I,
etc.). Once generated, the founder peptides are trimmed down by exoproteases into
ladder-like clusters.
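The trimming step described above can be illustrated with a minimal sketch. It assumes a simplified model in which an exoprotease removes one residue at a time from a single terminus; the function name `trim_ladder`, the `min_len` cutoff, and the example sequence are hypothetical and are not taken from the study.

```python
def trim_ladder(founder, min_len=5, end="N"):
    """Return the ladder of peptides produced by stepwise trimming.

    founder: amino acid sequence of the founder peptide (one-letter code).
    min_len: shortest peptide to keep (illustrative cutoff, an assumption).
    end:     "N" for aminopeptidase-like trimming, "C" for carboxypeptidase-like.
    """
    if end not in ("N", "C"):
        raise ValueError("end must be 'N' or 'C'")
    ladder = []
    peptide = founder
    while len(peptide) >= min_len:
        ladder.append(peptide)
        # Remove one residue from the chosen terminus.
        peptide = peptide[1:] if end == "N" else peptide[:-1]
    return ladder


# Made-up founder sequence, used only to show the ladder shape.
for member in trim_ladder("ACDEFGHIKL", min_len=7, end="N"):
    print(member)
```

In this toy model each ladder member differs from its neighbor by a single terminal residue, which is the pattern that appears as a ladder-like cluster in a mass spectrum.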
Exoproteases form a heterogeneous group of enzymes that play a role in the regulation
of biologically active peptides (51–53). For instance,
leucine aminopeptidase (LAP), aminopeptidase A (AP-A), aminopeptidase N (AP-N),
carboxypeptidase N (CP-N), and the kininase I family of carboxypeptidases are
involved in the production of angiotensin, bradykinin, and vasopressin (53),
and TAFI (a carboxypeptidase B enzyme) in the regulation of fibrinolysis (54). Several
exoproteases are transmembrane proteins, anchored in the plasma membrane of vascular
endothelial cells. Heterogeneous distribution results in the production of a wide
variety of proteolytic peptides in different tissues and contexts (51). In
addition, some exoproteases like AP-N and placental LAP (P-LAP) are shed from
cells through the action of ADAM family proteases (55) and end up in the
bloodstream in soluble form (55, 56), thereby degrading resident polypeptides in the
blood, plasma, and serum.
Depending on the analytical approach and the objectives of a diagnostic marker
search, there are opposing views on the presence of a vast peptide pool (degradome)
in plasma or serum generated from blood proteins as described above (Figure ). It can be considered background noise in
peptide marker discovery efforts, making it all but impossible to find any naturally
occurring, true biomarkers in the peptidome or to obtain mechanistic insights into
specific activities of tumor-associated proteases. Those who subscribe to this view
believe that exoprotease activity, or all protease activity for that matter, should
be blocked at the time of sample collection. However, it has been correctly pointed
out (17) that the protein degradome is the
only segment of the serum peptidome that can be readily interrogated by direct
MALDI-TOF MS. Fragments of bona fide marker proteins (for example, PSA in sera of
prostate cancer patients), if present, are currently undetectable because of
sensitivity, ion suppression, and mass resolution issues inherent in the technology.
It can therefore be argued that precisely this degradome offers the best opportunity
at this point for biomarker or surrogate biomarker discovery.
Whereas the only comprehensive, high-resolution MS analysis of the plasma/serum
peptides to date aimed at providing an inventory (17), we undertook to find
peptides and patterns with marker potential for
specific types of solid tumor cancers. In the discovery phase of our studies, we
sorted through hundreds of features to identify several that were most predictive of
outcome and showed that reduction in the number of key peptides to a few (i.e., the
signatures) that were easily recognized between samples did not adversely affect
class predictions. We then demonstrated that this signature could be used to
discriminate between cancer and control in an independent validation set composed
of serum samples obtained from patients with advanced prostate cancer. Strikingly,
all 46 sequence-identified peptides from the initial set of 68 rigorously selected
discriminant peptide signals were part of the serum degradome. With two-thirds of
the initial marker group now characterized, we trust that these findings can be
generalized.
The small number of blood proteins that are the source of nearly all the peptides in
prostate, bladder, and breast cancer signatures are naturally not biomarkers but
simply serve as an endogenous substrate pool for the real biomarkers, i.e.,
proteases. There is no actual relationship between the substrate concentrations and
the MS-ion intensities of many of the degradation products. Highly abundant serum
proteins such as albumin and immunoglobulins were not represented, and fragments of
proteins with a more than 10-fold difference in concentration had comparable ion
intensities. On the other hand, whereas full-length C3f produced nearly identical
ion intensities in all cancer groups and controls, several of its truncated forms
did not. In fact, 2 or more patient sera peptides (say, x and y) that derived
from the same protein often had opposite
relative ion intensities (i.e., the ion intensity divided by that of the
corresponding peptide in the control group); for instance, the signal of peptide
x was higher and that of peptide y lower than
that of their counterparts in control sera. Finally, several of the protein
degradome peptides that we observed and that had high surrogate marker value were
virtually absent from the controls (e.g., several entries in Figure that list a median normalized intensity value of 1 for
the control). In fact, 7 such peptides (Figures and ; m/z = 998,
1278, 2053, 2409, 2565, 2704, and 3971), each unique to 1 or more types of cancer,
were not reported in the high-resolution blanket analyses of plasma peptides,
possibly because that blood sample was obtained from a healthy individual (17).
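The relative ion intensity used in this paragraph (patient intensity divided by that of the corresponding peptide in the control group) can be sketched as follows. This is only an illustration: summarizing each group by its median is an assumption, the function name `relative_intensity` is hypothetical, and all intensity values are invented.

```python
from statistics import median


def relative_intensity(patient_values, control_values):
    """Median patient intensity divided by median control intensity.

    Using the median as the group summary is an assumption made for this sketch.
    """
    control_median = median(control_values)
    if control_median <= 0:
        raise ValueError("control median must be positive")
    return median(patient_values) / control_median


# Two hypothetical peptides, x and y, derived from the same precursor protein.
# All numbers below are invented for illustration.
ratio_x = relative_intensity([40, 50, 60], [20, 25, 30])
ratio_y = relative_intensity([5, 10, 15], [30, 40, 50])

for name, ratio in (("x", ratio_x), ("y", ratio_y)):
    direction = "higher" if ratio > 1 else "lower"
    print(f"peptide {name}: ratio {ratio:.2f}, {direction} than control")
```

A ratio above 1 for one fragment and below 1 for another fragment of the same precursor is the situation described above, where substrate concentration alone cannot explain the signals.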
The 2-step proteolytic process depicted in Figure that generates the most abundant layer of the serum peptidome is subject
to changes in enzyme panels, cofactors, inhibitors, and various other controlling
elements and conditions, which make for a virtually unlimited combinatorial
variability to produce peptides of different sizes and composition. Direct MALDI-TOF
MS–based serum peptide profiling is thus a form of activity-based
proteomics, monitoring surrogate biomarkers in the form of proteome metabolomic
products. This can be exploited for diagnostic and predictive purposes as a
phenotypic read-out of catalytic and other metabolic activities in body fluids or
tissues, utilizing endogenous (or exogenous) substrates and quantitative product
analysis. It also makes this approach particularly well suited for detection of
cancer, as proteases are well-established components of cancer progression and
invasiveness (57–60). We provide evidence here that exoprotease activities
superimposed on the ex vivo coagulation and complement-degradation pathways
contribute to generation of not only cancer-specific but also cancer
type–specific serum peptides.
Exoproteases have been previously implicated in cancer (58). For instance,
AP-N/CD13 is highly expressed in
bladder, gastric, thyroid, and hepatic carcinomas (61–64), and the
concentration of its soluble form is also increased in cancer patients (56).
Similarly, increased concentration of a
lysosomal dipeptidyl-aminopeptidase (DAP II) has been observed in sera of
tumor-bearing animals and cancer patients (65). LAP, aminopeptidase P (AP-P),
and enkephalin-degrading tyrosyl
aminopeptidase (EDA) have been associated with breast cancer (57, 66–68) and AP-A,
methionine aminopeptidase 2 (Met-AP2), and glycylproline dipeptidyl aminopeptidase
(GPDA) with various other types of cancers (69–71). Increased
activity and expression of AP-N and Met-AP2 have been functionally correlated with
metastasis of cancer cells by promotion of angiogenesis (72–75). As for
carboxypeptidases, carboxypeptidase D (CP-D) is selectively more
highly expressed in hematopoietic tumor cells (76), and PSMA is overexpressed in
prostate cancer and has been implicated in
tumor invasion (14, 77).
How all the above and other, currently unidentified enzymes may contribute
mechanistically to the observed differences in serum peptide patterns among the 3
different cancers remains unexplained and may require a great deal of future study
to understand. Nonetheless, the differences are statistically significant. It is
also important to note some of the overlaps between the groups. Despite the sex
difference, the breast and prostate cancer signatures overlapped by 8 peptide ions
that deviated in median intensities from the corresponding control ions in a similar
manner; only 1 peptide ion (m/z 1865) showed diametrically opposed (up- versus downregulated)
intensities. Breast and bladder (85% males in the study cohort; see Supplemental
Table 1) cancer shared 7 peptide ions with similarly up- or downregulated
intensities; 7 others were either higher in breast cancer but lower in bladder
cancer or vice versa, relative to the control. Finally, 23 out of the 26 prostate
cancer marker peptides were also part of the larger bladder cancer signature.
However, 19 of these 23 had markedly better P values for bladder
cancer, and 4 were better for prostate cancer, relative to the controls. We think it
unlikely that the overlaps or differences are sex related, as a preliminary
comparison of serum peptide profiles from healthy men and women indicated only
statistically insignificant differences (J. Villanueva and P. Tempst, unpublished
observations). Furthermore, most peptide ion markers for each cancer type were
equally well separated from both male and female subsets of the control group
(Supplemental Figure 1). A more likely explanation for the bladder/prostate cancer
overlap is that the prostate gland and bladder (partially) are derived
embryologically from endodermal tissues in the urogenital sinus and likely share
biological features not seen in tissues from outside the genitourinary tract. For
instance, tissue recombination studies have shown that urogenital mesenchyme can
actually induce differentiation of bladder epithelium toward a prostatic
epithelial–differentiated phenotype, but this property is restricted to
endodermal epithelia (as in the bladder) with similar embryonic origin to the
prostate (78). Overall, the prostate cancer
signature was sufficiently robust to predict the class of members of an independent
validation set with 97.5% sensitivity in multiclass SVM analysis (Table ).
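The pairwise signature comparisons discussed above (shared peptide ions that change in the same direction versus in opposite directions relative to control) amount to a simple set comparison. The sketch below assumes each signature is stored as a mapping from a peptide ion label to "up" or "down"; the function name `compare_signatures` and all m/z labels and directions are hypothetical and do not reproduce the study data.

```python
def compare_signatures(sig_a, sig_b):
    """Split the ions shared by two signatures by direction of change.

    sig_a, sig_b: dicts mapping an ion label (e.g., nominal m/z) to "up" or
    "down" relative to control. Returns (same_direction, opposite_direction).
    """
    shared = sig_a.keys() & sig_b.keys()
    same = sorted(mz for mz in shared if sig_a[mz] == sig_b[mz])
    opposite = sorted(mz for mz in shared if sig_a[mz] != sig_b[mz])
    return same, opposite


# Invented signatures, for illustration only.
cancer_a = {1000: "up", 1200: "down", 1500: "up"}
cancer_b = {1200: "down", 1500: "down", 1800: "up"}

same, opposite = compare_signatures(cancer_a, cancer_b)
print("same direction:", same)
print("opposite direction:", opposite)
```

Counting the two lists for each pair of cancers gives the kind of overlap summary reported in the text, without implying anything about statistical significance.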
In conclusion, it is our view that proteolytic degradative patterns in the serum
peptidome hold important information that may have direct clinical utility as a
surrogate marker for detection and classification of cancer. Our findings also
suggest that future work to optimize serum peptidomics for clinical practice should
be carried out with the recognition that endogenous proteolytic activities
contribute important cancer type–specific information. Use of protease
inhibitors and, as we have previously cautioned (29), even the slightest
deviation from standard protocol for specimen
collection, storage and handling, analytical chemistry, and MS signal processing are
particularly ill advised. We anticipate that as we scale up these efforts using the
same general methodology, we will expand and refine our definition of key
discriminatory peptides for prediction of each cancer type. The patterns may also
have diagnostic value for identifying cancer subtype and stage, may mark a given
clinical outcome of interest, or may reliably distinguish clinically insignificant
from significant cancer. Such a blood test could, for example, identify patients
with newly diagnosed prostate cancer who might safely avoid surgery or radiation.
Focused MS quantitation of key peptides derived from either endogenous or custom
synthetic substrate and utilizing isotopically labeled standards should then
facilitate introduction of this technology into clinical practice.