There is a critical need for the discovery of novel biomarkers for early detection and targeted therapy of cancer, a major cause of deaths worldwide. In this respect, proteomic technologies, such as mass spectrometry (MS), enable the identification of pathologically significant proteins in various types of samples. MS is capable of high-throughput profiling of complex biological samples including blood, tissues, urine, milk, and cells. MS-assisted proteomics has contributed to the development of cancer biomarkers that may form the foundation for new clinical tests. It can also aid in elucidating the molecular mechanisms underlying cancer. In this review, we discuss MS principles and instrumentation as well as approaches in MS-based proteomics, which have been employed in the development of potential biomarkers. Furthermore, the challenges in validation of MS biomarkers for their use in clinical practice are also reviewed.
Cancer remains a major life-threatening disease with about 14.1 million new cases and 8.2 million cancer-associated mortalities reported in 2012 1. The global demographic and epidemiologic transitions signal an ever-increasing cancer burden over the next decades 2. Cancer is a multigene disease and each tumor is composed of a variety of cell populations with distinct morphologies and behaviors 3. Biomarkers such as proteins or biomolecular chemical modifications are quantifiable indicators of a specific biological state. In this respect, cancer-associated biomarkers are useful for studying disease, identifying patients at different clinical stages, and developing adaptive therapies 4. For example, recent studies have demonstrated that long noncoding RNAs, circular RNAs 5, circulating tumor DNAs 6, and non-essential amino acids that support numerous metabolic processes crucial for the growth and survival of proliferating cells 7 can serve as biomarkers for cancers. Also, epidermal growth factor receptor, which is associated with the development of certain types of cancers 8, is regarded as a useful tool for cancer detection (Figure 1).
Cancer biomarkers can be classified into two categories: disease-related biomarkers and drug-related biomarkers 9. A biomarker should be (i) a mediator of the disease pathology, (ii) present at low and stable expression levels in healthy individuals and higher expression levels in patients, and (iii) simple and quick to evaluate 10. Such a biomarker can be assayed and linked to cancer using a defined mechanism 11.
Recently, advanced molecular methods have been used in clinical diagnostic laboratories. Most novel techniques are based on transcriptional profiling and DNA methylation. However, compared with the genome and transcriptome, the proteome is more complex and dynamic 12. The term “proteome” was first used in 1994 to indicate all time- and condition-specific proteins that are simultaneously produced by a cell or a tissue 12. Proteins are often subject to proteolytic cleavage or post-translational modifications. Although genomics and transcriptomics can provide valuable information, they do not always reflect the variation of encoded proteins. Also, the association between mRNAs and protein expression levels is low compared with that of cell surface proteins 13. Since proteins are the functional molecules in an organism and may be most ubiquitously affected in disease, therapy response, and recovery, proteomics holds special promise in detecting pathological conditions, predicting the efficacy of treatment, and tailoring personalized medicine (Figure 2) 14.
In a typical clinical proteomic study for diagnostic biomarker discovery, measurement of a large number of proteins in various samples is the first step. The initial protein candidates are proteins that are differentially expressed in patient and control samples 15. By confirmation of differential protein abundance in clinically useful samples, candidates can be progressively credentialed to yield a few specific proteins 15. Candidate biomarker verification should be included in the biomarker development pipeline (Figure 3) to provide reproducible and sensitive quantitative assays 16.
Because of the limited availability and accessibility of suitable reagents, most proteins in a species cannot be detected and quantified by affinity-based assays 17. Therefore, almost all currently available proteomic procedures and strategies use mass spectrometry (MS) techniques, which are capable of high-throughput profiling of complex samples. Non-targeted MS methods have emerged as suitable tools for relative quantitation of a large number of proteins to discover novel protein biomarker candidates, while targeted MS modes are applied to identify peptides of interest 18, 19. A variety of MS-based proteomic methods have been developed to identify and quantify proteins in biological and clinical samples 20-23 to obtain biomarker candidates. The present review describes various currently used MS-based proteomic approaches and their applications. The challenges of validating biomarkers for use in clinical practice are also discussed.
MS analysis utilizes electromagnetic fields in a vacuum, where the molecular mass of the charged particle is determined 3. MS is used to evaluate the molecular mass of a polypeptide or to determine additional structural features 17. Tandem MS/MS is performed in the latter case to determine detailed structural features of peptides. Moreover, MS-based proteomic methods can also be applied to characterize protein complexes 22. For example, protein conformation in solution and structural characterization of therapeutic proteins can be studied by hydrogen/deuterium exchange mass spectrometry (HDX-MS) 23.
In general, during MS analysis, the analyte is ionized in the gas phase, and the ions are subsequently separated according to their mass-to-charge ratio (m/z). Electrospray ionization (ESI) and matrix-assisted laser desorption ionization (MALDI) are two methods widely used to perform the protein ionization. Both techniques hold great potential for the characterization of biomolecules.
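As a small illustration of the mass-to-charge ratio mentioned above, the following Python sketch computes the m/z of a protonated ion, assuming positive-mode ionization in which each unit of charge comes from one added proton. The function name `mz_of_ion` and the 1500.0 Da peptide mass are hypothetical and chosen only for illustration.

```python
PROTON_MASS = 1.007276  # approximate mass of a proton in Da


def mz_of_ion(neutral_mass: float, charge: int) -> float:
    """Return m/z of an [M + zH]z+ ion (assumes charge comes from protonation)."""
    if charge < 1:
        raise ValueError("charge must be a positive integer")
    return (neutral_mass + charge * PROTON_MASS) / charge


# Hypothetical peptide with a neutral monoisotopic mass of 1500.0 Da.
# The same molecule appears at different m/z values depending on its charge.
for z in (1, 2, 3):
    print(z, round(mz_of_ion(1500.0, z), 4))
```

Because ESI tends to produce multiply charged ions, a single peptide can give several peaks of this kind, whereas MALDI spectra are typically dominated by singly charged ions.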
A mass analyzer is an instrument that determines the m/z of ions and the number of ions corresponding to a particular m/z is recorded by a detector. Quadrupole (QD), ion trap (IT), time-of-flight (TOF), orbitrap, and Fourier transform ion cyclotron resonance (FTICR) are common types of mass analyzers. Numerous mass analyzers are often combined to achieve maximum performance 24. For example, Muntel et al. used a quadrupole orbitrap instrument for urine protein biomarker discovery 25. Moreover, the workflow of a MALDI imaging mass spectrometer (MALDI IMS) enables the histology-directed analysis of the mass spectra using tissues 26, 27. In addition, optical density mass analyzers, known for their tolerance of high pressure, are particularly suited to the pulsed nature of ESI.
There are two main approaches to identifying proteins in gel-based proteomics: bottom-up and top-down proteomics. In the former approach, proteins separated by two-dimensional polyacrylamide gel electrophoresis (2D-PAGE), or in some instances left unfractionated, as in shotgun proteomics, are digested and then analyzed by MS 28, 29. In other words, the proteins are digested using chemicals or enzymes before they are introduced into the mass spectrometer. This strategy has several drawbacks, including the occurrence of modifications on disparate peptides. In the top-down approach, on the other hand, the masses of both the intact proteins and their fragment ions can be measured 30 (Figure 4).
2-DE has been applied in proteomic research since its introduction in 1975. For example, Klein et al. used the 2DE-MS approach to analyze the nuclear proteome of human gastric cancer cell lines with and without inactivation of hypoxia-inducible factor 1 31, 32. The shortcomings of this strategy include a limited dynamic range and low-throughput analysis 3. Although 2-D gel is still a powerful technique in proteomic analyses 33, 34, such as alternative detection for modification of specific proteins 35, attempts have been made to alleviate these drawbacks by using other techniques such as three-dimensional gel electrophoresis 36.
Shotgun proteomics, also referred to as discovery proteomics, is a widely and successfully used method 37. It is based on liquid chromatography-tandem MS (LC-MS/MS) operated in data-dependent acquisition (DDA) mode or, in some cases, data-independent acquisition (DIA) mode. In DDA mode, peptide fragmentation is guided by the abundance of peptide ions detected in a survey scan. The recorded information of specific ions is searched against a protein database to determine the peptide sequence and protein identity 38. In addition to its exquisite specificity, DDA-based proteomics has other advantages, including being unbiased and hypothesis-free 39. DIA offers advantages over conventional DDA methods as it overcomes the stochastic, intensity-based selection of peptide precursors 40.
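The intensity-guided precursor selection used in DDA can be sketched as follows. This is a simplified illustration, not the acquisition logic of any particular instrument: the function `select_precursors`, the exclusion tolerance, and the peak list are all hypothetical. The sketch picks the N most intense peaks of a survey scan and skips m/z values that were already fragmented (a simple form of dynamic exclusion).

```python
from typing import List, Set, Tuple

Peak = Tuple[float, float]  # (m/z, intensity)


def select_precursors(survey_scan: List[Peak], top_n: int,
                      excluded_mz: Set[float], tol: float = 0.01) -> List[float]:
    """Return the m/z of the top_n most intense peaks not on the exclusion list."""
    def is_excluded(mz: float) -> bool:
        return any(abs(mz - ex) <= tol for ex in excluded_mz)

    candidates = [p for p in survey_scan if not is_excluded(p[0])]
    candidates.sort(key=lambda p: p[1], reverse=True)  # most intense first
    return [mz for mz, _ in candidates[:top_n]]


# Hypothetical survey scan with five peptide ions of very different abundance.
scan = [(445.12, 1.0e5), (512.30, 8.0e6), (623.84, 3.5e6),
        (700.41, 2.0e4), (812.95, 9.0e5)]

first_cycle = select_precursors(scan, top_n=2, excluded_mz=set())
second_cycle = select_precursors(scan, top_n=2, excluded_mz=set(first_cycle))
print(first_cycle)   # [512.3, 623.84]
print(second_cycle)  # [812.95, 445.12]
```

In this toy example the least abundant ion (m/z 700.41) is not fragmented in either cycle, which illustrates why intensity-based selection tends to under-sample low-abundance peptides.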
One of the applications of the shotgun approach is to generate spectral libraries for mass spectrometric reference maps 41, 42. It has also been used for the analysis of unique types of samples with biological and clinical importance including serum 43 and plasma 44, 45. In a previous study, shotgun proteomics was applied to detect changes in protein profiles related to lung cancer 46.
Although many MS-based proteomic studies were performed using shotgun proteomics, the stochastic sampling of this technique markedly affects reproducible detection 47. Furthermore, in traditional shotgun proteomics experiments, a large number of MS/MS spectra are collected. Peptide sequences are assigned using database searching algorithms, such as Sequest and PepExplorer, which use rigorous pattern recognition to assemble a list of homologous proteins 48. However, not all spectra acquired are matched to peptides. To investigate this problem, Chick et al. identified unassigned peptides and demonstrated that at least one-third of unmatched spectra arise from peptides with substoichiometric modifications 49.
The adaptation of targeted data acquisition in the form of selected reaction monitoring (SRM), approximately a decade ago, was initially motivated by the requirement for robust and sensitive quantification of proteins 50. Numerous LC-MS workflows employ shotgun LC-MS; however, many others require the significantly higher reproducibility, sensitivity, accuracy, and precision of SRM 51. SRM, also known as multiple reaction monitoring, uses a triple quadrupole (QD) instrument (Figure 5), where molecular ions are selected in Q1, collision-activated dissociation fragmentation is performed in Q2, and unique fragment ions are evaluated in Q3 52. SRM is an attractive choice for sample analysis due to its sensitivity 53.
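The Q1/Q3 filtering described above can be illustrated with a minimal Python sketch. It treats each detected fragment as a (precursor m/z, fragment m/z, intensity) record and sums the intensity of records that pass both filters. The function `srm_signal`, the 0.7 m/z isolation widths, and the example values are assumptions made for illustration only.

```python
from typing import List, Tuple

Event = Tuple[float, float, float]  # (precursor m/z, fragment m/z, intensity)


def srm_signal(events: List[Event], q1_mz: float, q3_mz: float,
               q1_width: float = 0.7, q3_width: float = 0.7) -> float:
    """Sum the intensity of events that pass both the Q1 and Q3 m/z filters."""
    total = 0.0
    for precursor, fragment, intensity in events:
        passes_q1 = abs(precursor - q1_mz) <= q1_width / 2
        passes_q3 = abs(fragment - q3_mz) <= q3_width / 2
        if passes_q1 and passes_q3:
            total += intensity
    return total


# Hypothetical fragment-ion events from one chromatographic time point.
events = [
    (522.3, 718.4, 1200.0),  # target transition
    (522.3, 605.3, 800.0),   # same precursor, different fragment
    (522.5, 718.5, 300.0),   # near-isobaric ion that falls inside both windows
    (640.8, 718.4, 5000.0),  # unrelated precursor sharing the fragment m/z
]

print(srm_signal(events, q1_mz=522.3, q3_mz=718.4))                # 1500.0
print(srm_signal(events, q1_mz=522.3, q3_mz=718.4, q1_width=0.2))  # 1200.0
```

With the wider Q1 window, the near-isobaric ion contributes to the measured signal; narrowing the window removes it in this toy example.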
Advances in SRM have led to the discovery of numerous allergens in food complexes and cancer-related proteins 54, 55, 56. Recently, the accuracy of SRM analysis was increased by adding an isotopically labeled protein (15N-α-S1-casein) 57. In addition, absolute quantitation (AQUA), which offers linearity over four orders of magnitude 58 and inter-laboratory comparability, is also well suited to allergen quantitation 59.
SRM has also been applied in biological fields 60, metabolic processes 61, signaling pathways 62, and validation of potentially interesting proteins 63. As protein-protein interaction networks are significantly important in biological processes, it is essential to develop a computational method to predict protein-protein interactions. For example, Huang et al. proposed an efficient strategy that used a weighted sparse representation-based classifier model and novel feature extraction to sequence proteins for construction of protein-protein networks 64. Since investigation of phosphorylation events may serve an important role in biological research, Angeleri et al. developed an efficient strategy to obtain information regarding the phosphorylated sites 65.
Targeted data acquisition by SRM has been successful; however, the technique has intrinsic limitations. For example, the sensitivity of SRM is currently insufficient to cover the entire proteome space of all organisms. Furthermore, the isolation width of Q1 can lead to false positive identifications 66. Recent improvements, including time-scheduled SRM and intelligent SRM, have increased the scale and improved the quality of SRM evaluations 67. In addition, parallel reaction monitoring has advanced markedly in both instrumentation and software 68, 69.
SWATH is a recently developed methodology 70, 71, 72 that relies on peptide spectral libraries, which can be established by shotgun proteomics or obtained from community data repositories. Therefore, in contrast to SRM, SWATH-MS can quantify an unlimited number of peptides, provided that they are included in the spectral libraries.
SWATH-MS can be used in quantitative interaction proteomics 73, 74, 75. For example, Ortea et al. provided evidence that LC-MS/MS combined with sample pre-treatment and SWATH-MS was effective in identifying lung cancer biomarker candidates 76. SWATH-MS is also useful for the identification of candidate biomarkers, which will be further discussed in the following section 77, 78.
Additionally, there have been attempts to optimize the SWATH-MS workflow. The generation of a reference assay library is one of the key challenges and limitations of this approach 79. It has been demonstrated that combined assay libraries can be used for SWATH data extraction 78, and certain software tools have been proposed for creating combined assay libraries 80, 81. The parameters of MS detection were also optimized to increase the size of the library and decrease systematic errors 82. These developments have broadened the application of SWATH.
In SWATH and other DIA approaches, peptides and their modified forms are difficult to distinguish because of the width of the window used for the isolated precursor. Egertson et al. introduced and improved the DIA framework, multiplexed MS/MS, to overcome the constraint on the scanning speed of the instrument 83. The authors also suggested that this method may exploit other strengths of DIA 84.
Multiplexed MS/MS has certain disadvantages. It is more suitable for complex samples rather than simple mixtures due to its likely effect on the detection of low abundance peptides. Furthermore, the de-multiplexing and reconstruction of multiplexed MS/MS data may be a time-consuming process 85.
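The window-width issue noted above for SWATH and other DIA approaches can be illustrated with a short Python sketch. It assumes a fixed scheme of 32 contiguous 25 m/z isolation windows starting at m/z 400, which is one commonly described configuration but is used here only as an assumption; the peptide mass and the helper names `window_index` and `mz` are hypothetical.

```python
PROTON_MASS = 1.007276   # approximate proton mass in Da
OXIDATION = 15.9949      # approximate mass shift of oxidation in Da
PHOSPHORYLATION = 79.9663  # approximate mass shift of phosphorylation in Da


def mz(neutral_mass, charge):
    """m/z of a protonated ion with the given charge."""
    return (neutral_mass + charge * PROTON_MASS) / charge


def window_index(precursor_mz, start=400.0, width=25.0, n_windows=32):
    """Index of the fixed-width isolation window containing the precursor, or None."""
    if precursor_mz < start or precursor_mz >= start + width * n_windows:
        return None
    return int((precursor_mz - start) // width)


peptide = 1210.0  # hypothetical neutral mass of an unmodified peptide (Da)
forms = {
    "unmodified": peptide,
    "oxidized": peptide + OXIDATION,
    "phosphorylated": peptide + PHOSPHORYLATION,
}
for name, mass in forms.items():
    print(name, round(mz(mass, 2), 3), window_index(mz(mass, 2)))
```

In this example the doubly charged unmodified and oxidized forms fall into the same isolation window and are therefore fragmented together, while the phosphorylated form lands in the next window. Narrower windows reduce such co-isolation at the cost of a longer cycle time.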
Gastric cancer has one of the highest mortality rates worldwide 86, 87 urgently requiring its early detection 88, 89. Studies of gastric cancer biomarkers mainly focus on tissues 90, blood 91, and biological fluids to identify protein, RNA 92, and DNA 93. MS-based proteomics can aid in the identification of protein biomarkers and help study the mechanisms underlying gastric cancer 94. Using MALDI-TOF-MS, Yang et al. analyzed serum samples obtained from 70 patients with gastric cancer and 72 healthy volunteers and identified two peptides (P <0.001) related to gastric cancer 95. Quantitative MS-based proteomic approaches include SCI techniques or label-free strategies in gastric cancer research. A variety of sources have been used to identify gastric cancer biomarkers, such as serum, gastric fluid 96, 97, cells obtained from tumor sections 98, cancer stem cells, circulating tumor cells 99, plasma membrane 100, saliva, plasma 101 and cancer tissues 102.
Cancer stem cells (CSCs) have been suggested to be extremely resistant to chemotherapy 103, 104. Therefore, the identification of CSC markers has become a novel therapeutic perspective. Yashiro et al. used CSC-like side population cells to identify novel biomarkers of gastric CSCs 105.
Pancreatic cancer has been described as one of the most lethal tumors 106 with 45220 new cases and 38460 mortalities reported in the US in 2013 107, 108. There is a critical need for developing clinically useful biomarkers for pancreatic cancer detection. Carcinoma antigen 19-9 (CA19-9) is a biomarker which has been shown to be significant in the diagnosis, prognosis, and management of pancreatic ductal adenocarcinoma 109. However, CA19-9 reacts with the sialylated Lewis a blood group antigen present in the glycoprotein serum fraction 110. Between 5 and 10% of the general population has the Lewis (a-b-) phenotype; therefore, CA19-9 is not an appropriate biomarker for these individuals 111. To overcome this problem, Yoneyama et al. identified insulin-like growth factor-binding protein 2 (IGFBP2) (AUC value of 0.706) and IGFBP3 (AUC value of 0.766) as plasma biomarkers for early detection of invasive ductal adenocarcinoma of the pancreas 112. In another biomarker study, Zhong et al. described a 2D-MALDI-TOF-TOF-MS/MS combined strategy for isolating and identifying membrane proteins. An immunohistochemical staining experiment demonstrated that the biomarker candidate they discovered was downregulated in pancreatic cancer tissue (P<0.05) 113. In another example, Tatsuyuki et al. identified novel prognostic markers by applying MS-based proteomic analysis 114.
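The AUC values quoted above refer to the area under the receiver operating characteristic curve, which can be read as the probability that a randomly chosen patient has a higher marker level than a randomly chosen control. The following Python sketch computes this quantity directly from that definition. The marker values are invented for illustration and are not data from the cited studies; the function name `roc_auc` is hypothetical.

```python
def roc_auc(case_values, control_values):
    """AUC as the fraction of case/control pairs ranked correctly (ties count 0.5)."""
    wins = 0.0
    for case in case_values:
        for control in control_values:
            if case > control:
                wins += 1.0
            elif case == control:
                wins += 0.5
    return wins / (len(case_values) * len(control_values))


# Invented plasma marker levels (arbitrary units), for illustration only.
cases = [3.1, 2.4, 4.0, 1.9, 3.6]
controls = [1.2, 2.4, 1.8, 2.9, 1.5]

print(roc_auc(cases, controls))  # 0.86
```

An AUC of 0.5 corresponds to a marker that does not separate the groups at all, and 1.0 to perfect separation, so values such as 0.706 and 0.766 indicate moderate discriminatory power.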
Hepatocellular carcinoma (HCC) is the most common primary liver malignant disease 115. HCC-associated mortality is high due to numerous contributing factors 116. Therefore, there is an urgent need to develop clinical biomarkers that enable early detection of HCC 117. Megger et al. performed 2-DE and label-free ion intensity-based quantification by applying MS and LC to identify differential protein abundance in HCC and control tissues 118. Later, the same group combined previous results with label-free analysis 119. In another study, Wang et al. analyzed five HCC subline variants using 2-DE coupled with MALDI-TOF MS 120.
Colorectal cancer (CRC) is the third most common cancer diagnosed and one of the leading causes of cancer-related deaths in the US and 121. The survival of patients with CRC is primarily associated with the stage of cancer 122. However, only a limited number of CRC biomarkers have been developed 123. Prognostic biomarkers could help the management of CRC 124. Tomonaga et al. used the isobaric tags for relative and absolute quantitation (iTRAQ) shotgun method to discover biomarker candidates, which were subsequently validated by SRM 125. Bosch et al. identified potential cancer markers to improve the diagnostic accuracy of the fecal immunochemical test, which detects small traces of the blood protein hemoglobin 126. In another study investigating CRC, Peltier et al. combined iTRAQ technology 128, 129, 130 with reversed-phase liquid chromatography and MALDI-TOF/TOF to perform quantitative proteomic analysis of adenoma, CRC, and healthy control serum samples 127.
Glycosylation is important in many biological processes, such as immune surveillance for tumors 131, 132, 133. Protein glycosylation commonly occurs with the addition of specific glycan residues to asparagine (N-linked glycosylation) 134. Sethi et al. utilized LC-MS/MS-based N-glycoproteomics to map the N-glycome landscape associated with a panel of colorectal cell lines and described a novel method to identify disease-associated markers 135. In another study investigating CRC, a fluorogenic derivatization-LC-MS/MS approach was utilized to perform a differential proteomic analysis of normal and cancer cells 136.
Lung cancer can be classified into small cell (SCLC) and non-small cell lung cancer (NSCLC) 137, 138. Numerous previous studies have demonstrated that pleural effusions contain proteins of potential diagnostic value 139, 140. Recently, the proteome of pleural effusion in patients with NSCLC was investigated using pleural fluid from 20 patients with NSCLC and 10 patients with tuberculosis (Figure 6a) 141. The homodimeric glycoprotein stanniocalcin 2 was reported to serve numerous roles in a variety of cancer subtypes. By applying MS/MS analysis on tissue samples from 53 cancer patients, Na et al. revealed that stanniocalcin 2 was upregulated in lung cancer cells 142.
MALDI-TOF-MS has been used in numerous cancer studies 143. It has been shown that endothelial cells (ECs) play an important role in the tumor microenvironment 144, 145 and the properties of tumor-derived ECs are different from normal ECs 146. Zhuo et al. isolated ECs from lung squamous cell carcinoma using magnetic beads (Figure 6b) 147. Using the same method, Jin et al. discovered a protein candidate which was related to the histological presence of lymph node metastasis and neural invasion (p < 0.01) 148.
Toxicity and drug resistance remain major challenges facing cancer therapy. Efforts have been made to discover ideal biomarkers to improve the treatment efficiency. Rovithi et al. developed a serum peptide algorithm to classify cancer patients with regard to their clinical outcome 149. To guide the radiotherapeutic method and avoid severe toxicity, Walker et al. investigated the alterations in blood during therapy 150.
The collection of saliva is less invasive than the collection of blood 151 or tissue, making it an attractive biological fluid for diagnosis. Xiao et al. used 2D-MS to analyze two pooled samples. The results indicated that saliva analyses might be established for lung cancer detection 152.
Advances in MS help in mapping a large number of mass spectrometric peaks to reference libraries 153. Using LC-SRM, 17 circulating proteins could be identified as potential cancer biomarkers in plasma samples collected from 72 patients 154. However, despite extensive efforts in lung cancer diagnosis, it remains challenging to move protein candidates into the clinic 155-158.
Melanoma is a skin cancer with a high mortality rate 159. Besides serum, urine, and cell lines, proteomics can also be used for quantitative analysis on formalin-fixed paraffin-embedded (FFPE) tissues. For example, Byrum et al. used label-free quantitative MS to analyze FFPE to identify potential targets for the therapy of melanoma 160. Qendro et al. performed LC-MS/MS to profile five melanoma cell lines, a tissue sample of metastatic melanoma, and a benign melanocyte cell strain 161. Bioinformatics analysis was performed with each group of proteins to assign over-represented Gene Ontology terms.
Extracellular vesicles, including exosomes, are one of the mechanisms used for cell-cell communication. Exosomes were initially defined as reticulocyte-secreted vesicles and are now known to be secreted by many cell types 161, 162. Exosomes play an important role in cancer progression 163. Previous studies demonstrated that melanoma exosomes may influence disease progression by enhancing immunosuppression 164, angiogenesis, and tumor metastasis 165, 166. Lazar et al. performed proteomic analysis of seven melanoma cell lines and demonstrated that exosomes may be a potential biomarker for melanoma classification 167.
Uveal melanoma (UM) is a primary malignancy of the eye, and its etiology remains poorly understood. According to clinical, histopathological, and genetic features of these tumors, patients with UM can be classified into low-risk and high-risk metastatic groups 168. Crabb et al. performed global quantitative proteomic analysis of UM to increase our understanding of UM metastasis processes and to identify biomarkers of UM metastasis 169. MS-based proteomics, using the untargeted MS method to discover novel protein biomarker candidates and the targeted MS mode to identify peptides of interest, has been a useful tool in melanoma research 170.
Breast cancer contributes to approximately 14% of cancer-associated mortality 171. Although 5-year survival rates have improved, ≥20% of all patients continue to develop metastatic disease with an associated poor outlook 172. Hormone receptor positive, erb-b2 receptor tyrosine kinase 2 (ErbB2) positive, and hormone (estrogen or progesterone) receptor and ErbB2 negative breast cancers are the main types of this aggressive disease 173.
Breast milk is an appropriate cancer microenvironment for identifying breast cancer biomarkers. Aslebagh et al. used nanoLC-MS/MS to analyze breast milk samples collected from patients with cancer and controls. The results demonstrated that sample-specific bands were present between the two groups 174. Besides milk, serum is also used for identifying breast cancer-specific markers 175-182. Dowling et al. combined metabolomics and proteomics platforms to analyze cancer and non-cancer serum samples 175. High mobility group protein HMG-I/HMG-Y (HMGA1) abundance level was found to be associated with breast cancer clinicopathological features. Maurizio et al. utilized label-free shotgun MS to analyze the proteins extracted from HMGA1-silenced cells and control breast cancer cell line MDA-MB-231 176. Ning Qing Liu et al. evaluated numerous approaches for global proteome quantification and identified proteins involved in a signaling pathway in breast cancer tissues (Figure 6c) 177. Yang et al. collected serum samples from 183 breast cancer patients and 64 healthy controls, extracted peptides using magnetic beads, and analyzed them by MALDI-TOF-MS 178. Besides serum, urine was also used in proteomic studies to analyze its feasibility as a potential source for breast cancer biomarkers 183.
Ovarian cancer consists of numerous distinct subtypes 184, 185. However, the gold-standard biomarker, CA125, only performs well in one of these. A number of novel protein biomarkers relevant to ovarian cancer have been identified using MS-based proteomics 186. Nepomuceno et al. applied LC-MS/MS on tissues obtained from chickens that developed ovarian tumors spontaneously as an emerging experimental model to investigate the ovarian cancer proteome and reported the upregulation of an inhibitor in tumors (p = 0.0005) 187. Also, Poersch et al. performed LC-MS/MS on tumor fluids to identify ovarian cancer-associated protein biomarkers 188.
Drug resistance is a major challenge for ovarian cancer chemotherapeutic treatments. Therefore, it is essential to discover biomarkers that can distinguish chemosensitive and chemoresistant ovarian cancer patients 189. Based on the LC-MS/MS results acquired from epithelial ovarian cancer, Chappell et al. hypothesized that mitochondrial proteome changes were required to develop chemotherapy drug cisplatin resistance 190. In another study, Zhang et al. analyzed the protein abundance level in chemotherapy drug paclitaxel-resistant ovarian cancer cells and tissues 191.
Using iTRAQ and LC-MS platform, Shetty et al. revealed that major histocompatibility complex class 1 (p < 0.01) may be related to ovarian cancer drug resistance 192. In addition, the mechanism underlying somatic genome effects on the cancer proteome and associations between post-translational modification levels of proteins and clinical outcomes in high-grade serous carcinomas have been investigated 193.
Since the Papanicolaou (Pap) test was approved by the US Food and Drug Administration (FDA) in 1996, the vast majority of cervical cancer screening has used the liquid-based Pap test 194, 195. Boylan et al. examined the proteins present in residual Pap test fixative samples from females with normal cervical cytology by 2-D-MS/MS and created a “Normal Pap test Core Proteome” 196. More recently, the same group used iTRAQ to quantify the proteins in Pap test samples from patients with ovarian cancer compared with healthy controls or patients with benign gynecological disease 197. The labeled samples were analyzed by 2D-LC-MS/MS. The results demonstrated that Pap test samples may be a valuable source for the identification of ovarian cancer biomarkers 197.
Urinary cancers include kidney, bladder, prostate, and testicular cancers 198. Sensitive and accurate MS quantitative analyses have been introduced for biomarker discovery in these cancers 199. Zhao et al. performed quantitative proteomic analysis on clear cell renal cell carcinoma (RCC) and adjacent kidney tissues using LC-MS/MS 200. Urine is a rich resource for the investigation of kidney physiology as well as diagnosis of glomerulonephritis, hypertensive nephropathy, and renal cancer 201. Sandim et al. investigated the proteins in urine samples collected from 64 patients with clear cell RCC and compared them with the healthy controls 202, whereas Neely et al. combined proteo-transcriptomic analysis and investigated alterations in protein abundance 203.
Prostate cancer is among the most common types of adult malignancies with an estimated 220,000 American males diagnosed with the disease annually 204. Sensitive biomarkers would improve the efficiency of diagnosis, prognosis, and personalized therapy of prostate cancer. Øverbye et al. identified proteins with differential abundance in 16 prostate cancer patients compared to 15 healthy controls by MS-based proteomics (Figure 6d) 205. Kim et al. developed SRM-MS assays in post-digital rectal examination urine samples. The results demonstrated that this strategy may accurately identify non-invasive biomarkers 206.
Urine is also considered to be an attractive source for bladder cancer biomarker identification 204, 208. Guo et al. proposed a strategy to identify urine proteins associated with bladder cancer 209. In Europe and North America, a majority of bladder cancers are urothelial carcinomas 210. Lin et al. used MALDI-TOF spectrometry on urinary exosomes for the determination of urothelial biomarkers 211.
MS-based proteomics has also been used to identify testicular cancer biomarkers. Liu et al. used the proteomics platform to identify proteins that participate in spermatogenesis and can, therefore, serve as novel targets for the treatment of male infertility and cancer 212. The proteins they identified may also be used for personalized therapy for patients with testicular cancer.
Cancer progression is a comprehensive event that makes biomarker development a challenging task. Despite rapid advances in academia and industry, not many biomarkers move on to clinical practice 213. Failure of cancer biomarkers appears to be due to several distinct challenges depicted in Figure 7 214.
The first category, fraud, is quite rare 215. False discovery is a major reason for the failure of biomarkers to reach the clinic. These biomarkers fail at independent reproduction in the validation phase 216, 217. Small sample sizes, as well as control samples that are not matched for age, sex, and race, can lead to deceptive results 218. Other important issues to be considered include criteria for the selection and inclusion of samples, strict standards for collecting and handling samples, suitability of the methodology for the analysis of the data obtained, and, most importantly, independent validation of the identified biomarkers 219, 220.
Although few cancer biomarkers have entered clinical use, there are numerous ways to improve the situation. For biomarkers with low specificity and sensitivity that are not suitable for clinical use, it is possible to combine a panel of different biomarkers to identify clinical scenarios 214. For example, a novel ovarian cancer biomarker, human epididymis protein 4, is not superior to CA125, which is an FDA-approved marker for ovarian cancer 221. However, by combining human epididymis protein 4 and CA125, diagnosis of malignant versus benign pelvic masses can be improved 222. For false discovery or artefactual biomarkers, understanding of the biological and molecular heterogeneity of disease states is required to guide the experimental design 223. In addition, efforts should be made to improve MS technologies so that proteins with lower abundance can be explored 224.
Because of recent advances in MS-based proteomics together with streamlined sample preparation, improved instrumentation, and combination of various analytical platforms, numerous cancer biomarkers have been identified with diagnostic and prognostic values. The challenge is to realize the diagnostic and prognostic potential of these biomarkers in the clinical practice.
This research was financially supported by NSFC (61527806, 61401217 and 61471168), Chinese 863 Project (2015AA020502), the National Key Research and Development Program of China (2017YFA0205300), Open Funding of State Key Laboratory of Oral Diseases (SKLOD2017OF04), China Postdoctoral Science Foundation (2016T90403), and the Economical Forest Cultivation and Utilization of 2011 Collaborative Innovation Center in Hunan Province [(2013) 448].