The current study explored the hydrazide method as an effective way to preselect N-linked glycoproteins from cell membranes before applying the resolving power of coupled microbore chromatography-mass spectrometry to identify the proteins. The results indicate that this approach was successful in enriching, identifying, and site-mapping 27 N-linked glycosylation sites in 25 glycoproteins from crude membrane samples. The method allowed us to dig deeper into the glycosylated membrane proteome than is possible with conventional cellular fractionation, as can be seen in all three human breast cancer cell lines examined. For example, in MCF-7 cell membrane isolates, only 3 N-linked glycoproteins overlapped with the 357 total membrane proteins detected by LC-MS/MS alone, while 12 N-linked glycoproteins were uncovered only after hydrazide enrichment. Similarly, only 3 glycoproteins were detectable by LC-MS/MS among the 388 total membrane proteins in MDA-MB-453, but hydrazide enrichment revealed an additional 10 N-linked glycoproteins. Finally, in MDA-MB-468 cells, only 2 N-linked glycoproteins were seen by LC-MS/MS among the 338 total membrane proteins identified. Both of these N-linked glycoproteins are signature proteins of the MDA-MB-468 cell line. Clearly, the hydrazide system significantly enriched the N-linked glycoprotein proteome. A recently published modification of the original hydrazide method [32] significantly increased the selection and identification of cell surface glycoproteins by chemically tagging glycoproteins on intact living cells [38]. Applying this method would significantly reduce contamination by proteins from intracellular membranes found in crude membrane extracts.
N-glycosylation typically occurs at the NXS/T motif on glycoproteins, where X is any amino acid except proline. Interestingly, we also detected a number of non-NXS/T motif glyco-sites with the hydrazide method. We cannot rule out that these peptides are contaminants that underwent deamidation during the hydrazide procedure, nor can we dismiss them, given the covalent isolation of the glyco-residues and the stringent washing. We therefore included them in this manuscript (Supplemental Table 1), together with representative MS/MS data (Supplemental Figure 2). In some of these glyco-sites, a serine or threonine is located immediately next to the asparagine (NS/T) or is separated from it by two residues (NXXS/T).
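The motif classes described above can be illustrated with a short Python sketch that labels each asparagine in a peptide as canonical (NXS/T, X not proline), adjacent (NS/T), spaced (NXXS/T), or other. The function name, the labels, and the order in which the classes are tested are assumptions made for illustration only; this is not the search procedure used in the study.

```python
def classify_asparagines(seq):
    """Label every asparagine in `seq` by the motif class it falls into.

    Returns a dict mapping the 1-based position of each N to one of
    "NXS/T", "NS/T", "NXXS/T", or "other". The canonical sequon is
    tested first, so an N matching several classes is reported as "NXS/T".
    (Hypothetical helper for illustration; not from the original study.)
    """
    seq = seq.upper()
    labels = {}
    for i, residue in enumerate(seq):
        if residue != "N":
            continue
        window = seq[i:i + 4]  # N plus up to three downstream residues
        if len(window) >= 3 and window[1] != "P" and window[2] in "ST":
            labels[i + 1] = "NXS/T"   # canonical sequon, X is not proline
        elif len(window) >= 2 and window[1] in "ST":
            labels[i + 1] = "NS/T"    # S/T immediately after N
        elif len(window) == 4 and window[3] in "ST":
            labels[i + 1] = "NXXS/T"  # S/T separated from N by two residues
        else:
            labels[i + 1] = "other"
    return labels


# Two peptides reported in this paper (glycosite asterisk removed)
egfr_sites = classify_asparagines("DSLSINATNIK")
brca1_sites = classify_asparagines("EENQGKNESNIK")
```

Positions are counted within the peptide, not within the full-length protein, so they would need to be offset by the peptide start position to recover protein-level site numbers.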
The differences and similarities in N-linked glycoproteins between the three cell lines are especially interesting, since out of 291 shared membrane proteins there were only 3 glycoproteins in common. In addition, these three proteins were not detectable in the crude membrane fraction, providing further support for the utility of selective glycoprotein enrichment by the hydrazide method. MCF-7 and MDA-MB-453 cells shared three NXS/T motif glycoproteins. Cathepsin D is reported to be secreted in more aggressive breast cancer cell lines [39], and clusterin, when elevated, promotes tumor cell survival through anti-apoptotic effects in chemotherapy-treated breast cancer [40]. On the other hand, pigment epithelium-derived factor (PEDF) expression was found to be dramatically decreased in breast cancer [41]. PEDF is one of the most potent inhibitors of angiogenesis and is a candidate tumor suppressor in a variety of cancers [42]. Further study may show whether increased expression of PEDF in hormone and receptor positive breast cancer cell lines, compared to triple negative cell lines, correlates with the prognosis of this cancer. We also detected another angiogenesis inhibitor protein, thrombospondin-1, in MCF-7 cells. Interestingly, the expression of thrombospondin-1 is inversely correlated with vascularization in breast cancer [43]. Further study in HER2 positive and HER2 negative cell lines will be necessary to determine the significance of the presence of these glycoproteins in breast cancer cell lines. Three other interesting MCF-7 glycoproteins that did not have the traditional NXS/T motif were keratin and hypothetical protein LOC9768 isoform 1, KIAA0101 or p15(PAF). Keratin KRT8/18 expression is known to differentiate distinct subtypes of grade 3 invasive ductal carcinoma of the breast [44]. We also saw abundant keratin 8 in the membrane fraction of the MCF-7 breast cancer cell line, making it an apparent candidate biomarker for that particular breast cancer cell type. Hypothetical protein LOC9768 isoform 1, KIAA0101 or p15(PAF), is a 15-kDa protein containing a conserved proliferating cell nuclear antigen (PCNA)-binding motif and is involved in the regulation of DNA repair, cell cycle progression, and cell proliferation [45, 46]. KIAA0101 is over-expressed in tumors of the breast, uterine cervix, brain, kidney, esophagus, lungs, and colon [45, 46].
A single non-NXS/T motif glycoprotein, tankyrase 1, was shared by the triple negative MDA-MB-468 breast cancer cells and the HER2 negative MCF-7 breast cancer cells. Tankyrase 1 is known to be involved in the positive regulation of telomeres, which play a critical role in cellular senescence, apoptosis, and tumorigenesis [47, 48]. When its expression is increased, tankyrase 1 interacts with and releases a negative regulator of telomerase, telomere repeat binding factor 1 (TRF1), inducing telomere elongation [47]. Tankyrase 1 is also elevated in tumor tissues, and its expression is higher in estrogen and progesterone negative tumors [49]. The discovery of an N-linked glycosylation site may be important for understanding the regulation and function of tankyrase 1, and the site may also be a useful post-translational modification (PTM) for biomarker detection.
After hydrazide enrichment, LC-MS/MS identified breast cancer 1, early onset isoform 1 (BRCA1) as an N-linked glycoprotein in MCF-7 cells. Point mutations in BRCA1 and BRCA2 have been well documented in 30% of hereditary breast cancers [50]. Previously (based on the “Keil rule”) it was believed that trypsin cleaves after lysine or arginine but not before proline; however, Rodriguez et al. reported that many peptides are in fact cleaved by trypsin before proline residues [51]. In addition, one of the N-linked sites on the BRCA1 glycopeptide is a predicted N-linked site (https://db.systemsbiology.net) based on the N-linked glycosylation consensus sequence NXS/T at amino acid site 913 (EENQGKN*ESNIK). The enrichment and identification of BRCA1 as an N-linked glycoprotein also provides support for the utility of the hydrazide method in discovering additional glycoproteins as potential breast cancer biomarkers.
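The practical difference between the “Keil rule” and cleavage before proline can be sketched as a simple in silico digestion. The Python function below is a minimal illustration that assumes complete digestion with no missed cleavages; the function name, the `keil_rule` flag, and the example sequence are hypothetical and are not taken from the search settings used in this study.

```python
def tryptic_peptides(seq, keil_rule=True):
    """Split `seq` after every K or R, assuming complete digestion.

    With keil_rule=True, cleavage is suppressed when the next residue
    is proline; with keil_rule=False, trypsin is allowed to cut before
    proline as well. (Hypothetical helper for illustration only.)
    """
    peptides = []
    start = 0
    for i, residue in enumerate(seq):
        if residue not in "KR":
            continue
        next_is_proline = i + 1 < len(seq) and seq[i + 1] == "P"
        if keil_rule and next_is_proline:
            continue  # Keil rule: no cleavage before proline
        peptides.append(seq[start:i + 1])
        start = i + 1
    if start < len(seq):
        peptides.append(seq[start:])  # C-terminal remainder
    return peptides


# Hypothetical sequence containing a K-P bond
strict = tryptic_peptides("AAKPGGRCC", keil_rule=True)
relaxed = tryptic_peptides("AAKPGGRCC", keil_rule=False)
```

Under the strict rule the K-P bond is left intact, so the sequence yields two peptides. Allowing cleavage before proline yields three, which changes the set of peptides a database search would consider.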
N-linked glycosylation plays a key role in the expression levels and receptor activity of several receptor tyrosine kinases (RTKs) such as EGFR and HER2 [52]. N-linked glycosylation of EGFR positively affects the conformation necessary for receptor-receptor self-dimerization. Four hypothetical N-linked glycosylation sites located in domain III of EGFR have been proposed as potential key contributors to EGFR self-dimerization, which is necessary for ligand activation of the receptor [53]. We have identified and site-mapped an N-linked glycosylation site in the domain III region of EGFR in MDA-MB-468 breast cancer cells. The EGFR peptide DSLSINATNIK (residues 347-357) contains an N-linked glycosylation consensus site (NXS/T) that was also detected in purified EGFR from the A431 epidermal cancer cell line [54]. In addition, the hydrazide method identified an N-linked glycosylation site at N57 of CD44, which matched one of five N-linked glycosylation sites necessary for the conformational interaction of CD44 with hyaluronic acid that leads to CD44 activation [55]. Both O-glycosylation and N-glycosylation of CD44 play a critical role in its function [55]. Interestingly, the presence of CD44 cell surface receptors and the absence of CD24 is a phenotype found in breast cancer stem cells and in basal-like breast tumors, otherwise known as triple negative cancers (lacking PR, ER, and HER2), such as the MDA-MB-468 breast cancer cell line [56–58]. The CD44+/CD24- phenotype correlates with metastasis and invasiveness [59]. Identification of post-translational modifications (PTMs) that play key roles in receptor function and activity may be important for identifying drug targets as well as for developing PTM-specific biomarkers.
The high sensitivity of LC-MS/MS allowed detection of progesterone receptor membrane component 1 (PGRMC1) in both of the estrogen receptor (ER) negative breast cancer cell lines tested but not in the ER positive MCF-7 cell line (Supplemental Table 2). PGRMC1 is reported to be highly expressed in ER negative cell lines compared to ER positive cell lines [60]. This agreement between our mass spectrometry data and previously published reports supports the sensitivity and reliability of the LTQ Orbitrap as a biomarker discovery tool.
A primary goal of this study was to enrich for N-linked glycoproteins in crude cancer cell membrane fractions as a prelude to the discovery of potential biomarkers similar to the currently known biomarkers HER2, MUC1, ER, and PR. Since we were able to isolate 25 N-linked glycoproteins, many of which play major roles in cancer, we also analyzed the proteins differentially expressed in the membrane fraction that may also be secreted or sloughed off into the interstitial fluid of tumors in vivo. Significant overlaps and differences between the membrane proteins were seen across the tumor cell lines, but quantitatively each cell line appeared to have a unique proteome signature. Comparison of the normalized spectral counts of membrane proteins in MCF-7, MDA-MB-453, and MDA-MB-468 cells using the Scaffold software program revealed signature differences between the three breast cancer cell lines. MCF-7 had high quantities of keratin 8 and 18, solute carrier protein 3, HSP 27, and ErbB-binding protein, while MDA-MB-453 had high quantities of enolase 1, nucleolin, RAB1B Ras oncogene, stomatin, filamin A, valosin, and cytokeratin 7. MDA-MB-468 cells had high quantities of EGFR, CD44, filamin A, progesterone receptor membrane component 1, and valosin. All three cancer cell lines also shared high expression of STIP1, keratin 19, chaperonin, and HSP-alpha. Any one of these proteins could represent a new candidate signature biomarker of cancer for a blood assay. Further studies of breast cancer cell lines, proximal fluid, tissue samples, and patient-matched serum samples using affinity-based assays will be necessary to validate the reproducibility and functionality of each of these candidate biomarkers.
This preliminary study of the N-linked glycoprotein proteome in breast cancer cell lines has led to the identification of 27 glycosylation sites in 25 proteins. An important extension of this work will be to document the correlation between protein quantity and breast cancer type and severity. An overlap between the glycoproteins found in cancer membrane fractions and the proteins secreted or released into the culture media of the breast cancer cell lines has already been detected (data not shown). Another important extension of this work will be to compare both the conventional O-linked glycoproteins of the cell membrane and the monosaccharide O-linked N-acetylglucosamine (O-GlcNAc) [61] of the nucleocytoplasmic fraction between the different breast cancer cell lines, tissue, and the interstitial fluid around the tumor sites. The discovery of unique candidate biomarkers will help with the stratification of breast tumor types. Future development of antibodies that specifically recognize changes in glycosylation PTMs of candidate biomarkers would further the stratification of breast cancer. In addition, discovery of highly abundant proteins secreted into the proximal fluid of breast cancer cell lines or the interstitial fluid of tumor tissue may lead to a plausible affinity-based assay capable of identifying these trace-level biomarkers in the large volume of circulating blood proteins of breast cancer patients.