While milk proteins have been studied for decades, strikingly little effort has been applied to determining how the post-translational modifications (PTMs) of these proteins may change during the course of lactation. PTMs, particularly glycosylation, can greatly influence protein structure, function, and stability and can particularly influence the gut where their degradation products are potentially bioactive. In this work, previously undiscovered temporal variations in both expression and glycosylation of the glycoproteome of human milk are observed. Lactoferrin, one of the most abundant glycoproteins in human milk, is shown to be dynamically glycosylated during the first ten days of lactation. Variations in expression or glycosylation levels are also demonstrated for several other abundant whey proteins, including tenascin, bile salt-stimulated lipase, xanthine dehydrogenase, and mannose receptor.
Breast milk is the remarkable product of the mammary gland emerging from evolutionary pressure as the exclusive food for the growing infant 1-11. Because it serves as the sole source of nutrition in early life, milk has been widely studied to better understand the quantities and bioavailabilities of essential nutrients. Milk not only delivers nutrients, but also provides the infant with a number of physiological advantages such as enhanced immunity 9, 12 and the development and maintenance of a complex microflora in the gastrointestinal tract 11, 13-15. In light of these varied and extensive benefits, it is critical to study the structure-function relationship of different components in milk and how they confer advantages to the infant.
Proteins as a whole are the fourth most abundant component of human milk after lactose, lipids and oligosaccharides. Total protein concentrations are typically in the 10-20 mg/mL range and have been shown to decrease during lactation, reaching a minimum after approximately one week. This change is correlated with a drastic increase in total milk production and with a change in overall protein composition. A more detailed study of these dynamic behaviors is especially interesting because milk proteins and their degradation products have been associated with a number of important biological activities, including immunomodulatory 16, antibacterial 17, and pathogen-binding 18 activities.
The rapidly developing infant requires a dynamic and complex source of nutrients, and the overall health, development and protection of the infant are enhanced by a variety of bioactivities also present in milk. To understand these activities, research in this group has examined the behavior of important components in human milk during the course of lactation. We recently quantified human milk oligosaccharides (HMOs) during the lactation period in these samples and found the levels of all components to remain relatively constant 19. One proposed role of HMOs is as a prebiotic 15, 20, and the constant production of oligosaccharides is consistent with that role.
Variation of protein expression is similarly important given the many roles of proteins in the infant diet. While thorough conventional proteomic studies have been performed on both the whey 4, 21 and the milk fat globule membrane 22-23 fractions of human milk, there are very few studies that address any variation in post-translational modifications. Even fewer studies address milk protein glycosylation, although there have been a few notable exceptions. In 2002, Charlwood et al. 23 described the N-glycosylation of four abundant milk fat globule membrane proteins. This study interrogated samples from a wide range of lactation stages, yet reported neither any temporal variations in glycan abundance that are apparent in this study nor variations in protein abundances.
While there have been studies establishing the variation in both total protein quantity and protein composition in human milk, little progress has been made in determining any qualitative or quantitative variations in glycosylation of the major milk glycoproteins. Protein glycosylation is one of the most commonly occurring post-translational modifications, and directly affects protein structure, function and recognition. Protein glycosylation is particularly relevant to milk, as it has been shown to have direct implications in pathogen binding 8, infant development and proteolytic susceptibility 24.
In this work, glycosylation of specific milk proteins is shown to vary during lactation. The work presented here both outlines a strategy for rapidly determining gross protein glycosylation profiles over time, and demonstrates that there are dynamically glycosylated glycoproteins in human milk. Dynamic protein glycosylation suggests many interesting biological roles for individual proteins. An extensive review has been published on the potential roles of protein-linked glycosylation 25. As changes in protein glycosylation can significantly affect protein function, stability, and structure 26-28, tracking these changes could facilitate discovery of additional bioactive species in milk.
The degradation products of these glycoproteins may be as important as the intact proteins, as they have been shown to perform various biological functions both in vivo and in vitro 8, 12, 29-30. Because glycosylation will greatly influence the function, identity and quantities of these degradation products, a more thorough understanding of temporal variations in protein glycosylation is clearly necessary. As the neonate’s intestinal microflora, immune system, and digestive system change most rapidly during the first few days of life, understanding changes in protein expression and glycosylation during this period is critically important.
Organic solvents (all HPLC grade or higher) were purchased from Burdick and Jackson. Sequencing-grade trypsin (modified by reductive methylation to reduce autolysis) was purchased from Promega. Bisacrylamide gels, Bradford reagent and Coomassie brilliant blue G-250 were purchased from Bio-Rad. All water used was 18 MΩ deionized water. Pro-Q emerald 300 was purchased from Invitrogen. Glycerol-free Peptide: N-glycosidase F (PNGaseF) was purchased from New England Biolabs. Porous graphitized carbon cartridges were obtained from Glygen. All other reagents were purchased from Sigma-Aldrich.
Human milk samples were obtained from four healthy women. Overall, samples from the first, second, fifth, tenth, fifteenth, sixteenth, seventeenth, and thirtieth, thirty-first or thirty-second days of lactation were interrogated in this study although none of the individuals provided milk for all time points. Quantitative comparisons were made with a minimum of three samples for each time point, and days 15, 16, and 17 were grouped for this comparison as were days 30, 31, and 32. All milk samples were manually expressed and immediately frozen. Samples were then transferred to a -80° C freezer within three hours and stored there until analysis.
One half milliliter of raw milk was centrifuged at 4° C for 30 minutes and the fat and cellular layers removed. Residual lipids were removed by the method of Wessel and Flugge 31. Briefly, three volumes (1.5 mL) of 2:1 chloroform / methanol were added, agitated and the supernatant was retained. An ethanol precipitation of protein from the supernatant was performed overnight at 4° C by adding 5 mL of HPLC grade ethanol, and following centrifugation the supernatant was removed. Precipitates were resuspended in 50 mM ammonium bicarbonate buffer (pH 7.5), and protein quantities were determined by the Bradford method. The precipitated protein was stored at -20° C until analysis.
An aliquot containing 10 μg of protein was mixed 1:1 (v/v) with Laemmli sample buffer containing 350 mM dithiothreitol 32, and was denatured and reduced by 2 minutes of heating at 95° C. SDS-PAGE separations were achieved with discontinuous gradient 10-20% bisacrylamide gels, and constant 20 mA running conditions. Following separations, gels were washed four times for 10 minutes with water. Subsequent staining with Coomassie brilliant blue and the glycoprotein-specific Pro-Q Emerald 300 were performed in parallel on duplicate gels according to manufacturer’s instructions. Gels were scanned using an HP ScanJet 4890 scanner (in the case of Coomassie stained gels) or a Bio-Rad transilluminating scanner (in the case of Pro-Q Emerald stained gels).
Selected gel bands were thoroughly washed with deionized water and gently agitated for a total of 1 hour. Gel pieces were cut into 1 mm wide cubes, and dried in a vacuum centrifuge. Washed, dehydrated gel pieces were incubated with 10 mM dithiothreitol (DTT) at 55° C for 1 hour and 55 mM iodoacetamide (IoAA) at room temperature in the dark for 45 minutes. Gel pieces were then washed with 100 mM ammonium bicarbonate with gentle agitation for 10 minutes, and briefly dehydrated with acetonitrile. This wash step was repeated twice, and gel pieces were dried. Proteins were then digested with 0.5 μg of trypsin in 100 mM ammonium bicarbonate at 37° C for 16 hours. Following digestion, peptides were extracted with acetonitrile, water, and 50% ethanol and dried in a vacuum centrifuge. Samples were reconstituted with deionized water, desalted using a zip-tip C18 microtip and eluted into 5 μL of 50% acetonitrile. One microliter was used for subsequent matrix-assisted laser desorption/ionization (MALDI) mass spectrometry (MS) analysis, as described below.
Tryptic peptides were analyzed on an IonSpec ProMALDI 7.0 Tesla Fourier transform ion cyclotron resonance mass spectrometer (FT-ICR MS) equipped with an external MALDI source and capable of external ion accumulation with vibrational cooling. All spectra were internally calibrated using the InCAS technique 33 and a mixture of standard peptides to increase mass accuracy. The peptides used were Bradykinin fragment peptide 1-7, angiotensin II, P14R, human adrenocorticotropic hormone fragment peptide 18-39, and oxidized B chain of bovine insulin. Protonated, monoisotopic masses for the calibrant peptides were m/z 757.39915, 1046.54179, 1533.85765, 2465.19833, and 3494.65077 respectively. Spectra were deisotoped using the IonSpec PeakHunter software, and the monoisotopic mass lists were then relieved of known contaminants and non-peptide masses using the Mass Sieve approach 34.
Processed mass lists were submitted to MASCOT 35 for protein identification. Carbamidomethylation of cysteine residues was selected as a fixed modification. No variable modifications were included in the searches. Peptide mass tolerances were set to 10 ppm. All proteins were identified above the 95% confidence interval (CI), and mass errors for all assigned peptides were less than 5 ppm. Each band identified only one protein (discounting splice variants) at that confidence level.
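The mass tolerances quoted above are relative errors expressed in parts per million. As a minimal sketch of that arithmetic (the function names are illustrative and are not part of MASCOT or the IonSpec software), the error of an observed peak against a theoretical m/z and the corresponding acceptance window can be computed as follows:

```python
# Calibrant m/z values quoted in the text (protonated, monoisotopic).
CALIBRANT_MZ = [757.39915, 1046.54179, 1533.85765, 2465.19833, 3494.65077]


def ppm_error(observed_mz: float, theoretical_mz: float) -> float:
    """Signed mass error of an observed peak, in parts per million."""
    return (observed_mz - theoretical_mz) / theoretical_mz * 1e6


def within_tolerance(observed_mz: float, theoretical_mz: float,
                     tol_ppm: float = 10.0) -> bool:
    """True if the observed peak lies within +/- tol_ppm of the theoretical m/z."""
    return abs(ppm_error(observed_mz, theoretical_mz)) <= tol_ppm


def mass_window(theoretical_mz: float, tol_ppm: float = 10.0) -> tuple:
    """Lower and upper m/z bounds that correspond to the ppm tolerance."""
    delta = theoretical_mz * tol_ppm / 1e6
    return theoretical_mz - delta, theoretical_mz + delta


if __name__ == "__main__":
    # Width of the 10 ppm search window at each calibrant mass.
    for mz in CALIBRANT_MZ:
        low, high = mass_window(mz)
        print(f"{mz:.5f}: {low:.5f} to {high:.5f}")
```

Because the tolerance is relative, the absolute window widens with mass: 10 ppm corresponds to roughly ±0.008 at m/z 757 and roughly ±0.035 at m/z 3495.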
A total of eleven samples were examined via LC MS/MS. One hundred micrograms of extracted milk proteins were dried and solubilized with 60 μL 8M urea. Samples were then reduced with DTT and alkylated with IoAA. After dilution in 180 μL of water, an overnight digestion with trypsin was performed. Peptides were then concentrated and desalted using C18 zip-tip before LC separation and online tandem mass spectrometry.
A nanoLC-2D system (Eksigent, Dublin, CA) coupled with an LTQ ion trap mass spectrometer (Thermo Finnigan) was used with a home-made fritless reverse phase microcapillary column (75 μm×180 mm; packed with Magic C18AQ, 3 μm 100 Å: Michrom Bio Resources) and vented column configuration. Digested samples were transferred from the autosampler to the on-line trap column (0.15 mm×20 mm; packed with Magic C18AQ, 3 μm 100 Å) and desalted. Peptides were eluted from the trap and separated on the capillary column using a reverse-phase gradient at a flow rate of 300 nL/min, and directly electrosprayed into the mass spectrometer. A cycle of one MS survey scan followed by ten MS/MS scans was repeatedly acquired over the LC gradient. Dynamic exclusion for 1 min duration was utilized. Buffers were 0.1% formic acid in water (buffer A) and 0.1% formic acid in acetonitrile (buffer B). A 107 min gradient (2-40% B for 95 minutes, followed by 40-80% B for 12 minutes) was used.
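The two-segment gradient can be written as a piecewise function of time. The sketch below assumes that each segment is linear, which the description above does not state explicitly; the function name is illustrative.

```python
def percent_b(t_min: float) -> float:
    """Percentage of buffer B at time t_min (minutes) into the 107 min gradient.

    Assumes linear ramps: 2-40% B over the first 95 min,
    then 40-80% B over the final 12 min.
    """
    if t_min < 0 or t_min > 107:
        raise ValueError("time must lie within the 0-107 min gradient")
    if t_min <= 95:
        return 2.0 + (40.0 - 2.0) * t_min / 95.0
    return 40.0 + (80.0 - 40.0) * (t_min - 95.0) / 12.0


if __name__ == "__main__":
    for t in (0, 47.5, 95, 101, 107):
        print(f"{t:6.1f} min: {percent_b(t):.1f}% B")
```

Under this assumption the first segment ramps at 0.4% B per minute, and the second, steeper segment at about 3.3% B per minute.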
Protein identification based on LC-MS/MS was performed using X! Tandem with a fragment ion mass tolerance of 0.40 Da and a parent ion tolerance of 1.8 Da. Iodoacetamide derivatization of cysteine was specified as a fixed modification. Deamidation of asparagine and glutamine, oxidation of methionine and tryptophan, sulphonation of methionine, oxidation of tryptophan to formylkynurenine, acetylation of lysine and the N-terminus, and phosphorylation of serine, threonine and tyrosine were specified as variable modifications.
Scaffold software was used to validate MS/MS based peptide and protein identifications. Peptide identifications were accepted if they could be established at greater than 95.0% probability as specified by the Peptide Prophet algorithm 36. Protein identifications were accepted if they could be established at greater than 99.0% probability and contained at least 2 identified peptides. Protein probabilities were assigned by the Protein Prophet algorithm 37.
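The acceptance criteria above amount to a simple filter. The following sketch is not Scaffold's implementation; the `ProteinHit` record and its fields are hypothetical, and it assumes that the peptides counted toward the two-peptide minimum are those that pass the peptide-level threshold.

```python
from dataclasses import dataclass, field


@dataclass
class ProteinHit:
    """Hypothetical record of one protein identification."""
    name: str
    protein_probability: float                                   # 0..1
    peptide_probabilities: list = field(default_factory=list)    # 0..1 each


def accepted_peptides(hit: ProteinHit, min_peptide_prob: float = 0.95) -> list:
    """Peptide probabilities that exceed the peptide-level threshold."""
    return [p for p in hit.peptide_probabilities if p > min_peptide_prob]


def is_accepted(hit: ProteinHit, min_protein_prob: float = 0.99,
                min_peptide_prob: float = 0.95, min_peptides: int = 2) -> bool:
    """Apply the protein-level probability and minimum-peptide criteria."""
    enough_peptides = len(accepted_peptides(hit, min_peptide_prob)) >= min_peptides
    return hit.protein_probability > min_protein_prob and enough_peptides


if __name__ == "__main__":
    # Illustrative values only, not data from this study.
    hits = [
        ProteinHit("protein A", 0.999, [0.99, 0.97, 0.60]),
        ProteinHit("protein B", 0.999, [0.99, 0.80]),
        ProteinHit("protein C", 0.950, [0.99, 0.98]),
    ]
    print([h.name for h in hits if is_accepted(h)])
```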
To verify changes in lactoferrin glycosylation, 10 μg of milk protein corresponding to lactation days 2 and 5 were reduced, alkylated and loaded onto 10% polyacrylamide gels (86×68×1 mm). After separation, protein visualization was performed using Bio-Safe coomassie blue staining. Lactoferrin gel bands were excised and destained with 5% methanol-7% acetic acid in water and frozen overnight. After extensive washing with acetonitrile and 50 mM NH4HCO3, glycans were released using 750 units of PNGaseF at 37° C for 16 hours. Released glycans were extracted from gel pieces by washing them with 200 μL of water, followed by 200 μL of acetonitrile, with a 10 minute sonication step following each addition. Supernatants obtained in each step were recovered and combined. This wash cycle was repeated 3 times and supernatants were dried down in a vacuum centrifuge.
Samples were reconstituted in deionized water and oligosaccharides purified by solid phase extraction using porous graphitized carbon cartridges. Hypercarb carbon cartridges were conditioned with 3 column-volumes of deionized water, followed by 3 volumes of 80% acetonitrile in 0.10% aqueous trifluoroacetic acid (TFA) (v/v) and with another 3 volumes of deionized water. The samples were loaded onto the cartridge, incubated for 10 min at room temperature, washed with 3 volumes of deionized water, eluted with 3 volumes of 40% acetonitrile, 0.1% TFA in water (v/v) and dried in vacuo. Glycans were reconstituted in 5 μL of water. MALDI-FTICR MS was then performed as described previously, using a mixture of maltooligosaccharides as an external calibrant 38. All oligosaccharide compositions were identified by accurate mass, utilizing a 10 ppm tolerance, and five replicate spectra were acquired for each sample.
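Composition assignment by accurate mass can be illustrated with a short calculation. The sketch below is not the procedure or software used in this work: it assumes that the released glycans are observed as singly sodiated ions, [M+Na]+, which is not stated above, and it uses standard monoisotopic residue masses. The candidate list and function names are illustrative.

```python
# Monoisotopic residue masses (Da) of common N-glycan building blocks.
RESIDUE_MASS = {
    "Hex": 162.05282,     # hexose, C6H10O5
    "HexNAc": 203.07937,  # N-acetylhexosamine, C8H13NO5
    "Fuc": 146.05791,     # fucose (deoxyhexose), C6H10O4
    "NeuAc": 291.09542,   # N-acetylneuraminic acid, C11H17NO8
}
WATER = 18.01056       # added once for the free reducing end
SODIUM_ION = 22.98922  # Na+ (assumed adduct; see the note above)


def glycan_mz(composition: dict, adduct: float = SODIUM_ION) -> float:
    """Theoretical m/z of a singly charged glycan ion of the given composition."""
    neutral = WATER + sum(RESIDUE_MASS[res] * n for res, n in composition.items())
    return neutral + adduct


def assign(observed_mz: float, candidates: list, tol_ppm: float = 10.0) -> list:
    """Candidate compositions whose theoretical m/z lies within tol_ppm."""
    hits = []
    for comp in candidates:
        theoretical = glycan_mz(comp)
        if abs(observed_mz - theoretical) / theoretical * 1e6 <= tol_ppm:
            hits.append(comp)
    return hits


if __name__ == "__main__":
    candidates = [
        {"Hex": 5, "HexNAc": 4},            # biantennary complex type
        {"Hex": 5, "HexNAc": 4, "Fuc": 1},  # same, with one fucose
    ]
    # Hypothetical observed peak, for illustration only.
    print(assign(1809.645, candidates))
```

In practice the candidate list would cover all biologically plausible compositions, and more than one composition may fall inside the tolerance window at higher masses.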
Gel electrophoresis was used to determine semi-quantitative changes in expression of the major milk proteins. Samples from the first, fifteenth, and thirtieth days of lactation were examined in these studies. Total protein amount was determined, and ten microgram aliquots were investigated via SDS-PAGE. Shown in Figure 1 are representative SDS-PAGE analyses. The ten abundant proteins indicated were identified using in-gel digestion and peptide mass fingerprinting. The assignments are summarized in Figure 1 and Table 1, along with pertinent identification parameters such as the number of peptides matched (nP), the probability of a stochastic assignment, and the score of the next best protein identification (ΔSM). For reference, a significant match at the 95% CI corresponds to a score of 65. All proteins were identified with scores of 69 or higher under the Mascot protocol. In addition, these results are characterized by large values of ΔSM which is indicative of a high level of discrimination and a low likelihood of false positive identification afforded by accurate mass analysis 39.
To explore the glycosylation status of the abundant milk proteins, gels were run in duplicate and stained with colloidal coomassie brilliant blue G-250 (for total protein) and Pro-Q Emerald 300 (for glycan content) stains separately. To obtain quantitative information, each gel lane was examined by an in-house image processing program. Several proteins were shown to change in total coomassie signal intensity during lactation. These trends are demonstrated in Figure 2, which shows an overlay of three SDS-PAGE electropherograms generated from a single individual at differing lactation times. Upon examination it is clear that the expression level of the heavy chain of IgA (immunoglobulin alpha-1 chain C, IGHA1) is decreasing for this mother over the first lactation month, as expected. In addition to IGHA1, there are four other proteins detailed below whose abundances are found to vary during the first month of lactation.
All quantitative comparisons of protein and glycan-level variations were made using background-subtracted integrals of coomassie or pro-Q stained electrophoretic peaks, respectively. The dynamic behaviors of five glycoproteins including tenascin, xanthine dehydrogenase (XD), BSSL, lactoferrin, and IGHA1 protein are shown in Figure 3. These proteins were selected for this comparison because they each signaled under both the coomassie and the glycoprotein-specific stains. The upper bar graph shows the averaged glycoprotein-specific stain intensity and the lower graph the averaged coomassie response at days 1, 15, and 30 for each protein. Proteins were identified as differentially glycosylated or expressed if the response to the appropriate stain was significantly different between two time points at greater than the 90% CI. Remarkably, tenascin, IGHA1 protein, XD and lactoferrin expression levels all decrease significantly from colostrum to mature milk in these samples. Interestingly, XD expression peaks at two weeks lactation time and drops to approximately half of peak concentration by one month lactation time, while lactoferrin, IGHA1, and tenascin decrease from colostrum to mature milk. Mannose receptor and Ig kappa light chain both increase from the first to the thirtieth lactation days (data not shown). Mannose receptor and XD are each potential N-glycoproteins by virtue of the presence of the consensus sequence for N-glycosylation. XD responds to the glycoprotein-specific stain used in this study. There is no corresponding response from mannose receptor.
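The in-house image processing program is not described in detail here. As one plausible sketch of a background-subtracted peak integral, the function below draws a straight baseline between the two edges of a peak in a lane intensity profile and integrates the signal above it with the trapezoid rule. The baseline model, the unit pixel spacing and the function name are all assumptions.

```python
def background_subtracted_integral(profile: list, start: int, end: int) -> float:
    """Area of profile[start..end] above a straight baseline joining its end points.

    profile : stain intensity along the gel lane, one value per pixel
    start, end : indices of the peak edges (inclusive)
    """
    if not (0 <= start < end < len(profile)):
        raise ValueError("peak edges must satisfy 0 <= start < end < len(profile)")
    n = end - start
    corrected = []
    for i in range(start, end + 1):
        # Linear interpolation of the background between the two peak edges.
        baseline = profile[start] + (profile[end] - profile[start]) * (i - start) / n
        corrected.append(profile[i] - baseline)
    # Trapezoid rule with unit spacing between pixels.
    return sum((corrected[k] + corrected[k + 1]) / 2.0 for k in range(n))


if __name__ == "__main__":
    # Synthetic lane profile: a triangular peak on a flat background of 1.
    lane = [1, 1, 3, 5, 3, 1, 1]
    print(background_subtracted_integral(lane, 1, 5))
```

Applying the same function to the coomassie and Pro-Q Emerald profiles of a band gives the paired protein and glycan responses that are compared across time points.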
Interestingly, there are also glycoproteins that exhibit disparate behavior under protein- or glycan-specific staining. A small section of the SDS-PAGE gel band is shown for illustrative purposes in Figure 4, with samples from the first 10 days of lactation shown. This detailed examination was intended to determine whether any interesting dynamic behavior was missed between days 1 and 15 as examined previously. The Coomassie brilliant blue (CBB) stain intensities suggest that the gross quantity of lactoferrin remained relatively constant over the first ten days of lactation. Surprisingly, the total glycan stain intensity decreases by approximately 60% over that time. By day 15, glycosylation is restored to near its previous level. Because lactoferrin is one of the most abundant glycoproteins in milk, this trend represents a significant change in the total glycoconjugate content present in the infant’s digestive system. Additionally, while the expression level of XD peaks at two weeks and decreases afterwards (days 15 and 30 are different at the 95% CI), the glycosylation level is largely unchanged from two to four weeks, suggesting an increase in the degree of glycosylation of this glycoprotein during this time.
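The reasoning in this paragraph compares the glycan-stain response with the protein-stain response of the same band. One way to make that comparison explicit, offered here as a sketch rather than as the analysis actually performed, is to take the glycan-to-protein signal ratio at each time point and scale it to the first time point. The function name and the numbers in the example are hypothetical.

```python
def relative_glycosylation(glycan_signal: list, protein_signal: list) -> list:
    """Glycan-to-protein stain ratio per time point, scaled so the first point is 1.0."""
    if not glycan_signal or len(glycan_signal) != len(protein_signal):
        raise ValueError("signals must be non-empty and of equal length")
    ratios = [g / p for g, p in zip(glycan_signal, protein_signal)]
    return [r / ratios[0] for r in ratios]


if __name__ == "__main__":
    # Hypothetical integrals: constant protein signal, glycan signal falling by 60%.
    glycan = [100.0, 40.0]
    protein = [50.0, 50.0]
    print(relative_glycosylation(glycan, protein))
```

A ratio that falls while the protein signal is constant, as for lactoferrin, indicates reduced glycosylation per protein; a glycan signal that holds steady while the protein signal falls, as for XD, indicates an increased degree of glycosylation.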
Although the CBB and glycan-specific staining methods have both previously been shown to be valid quantitative platforms 40-41, the changes in lactoferrin glycosylation were also verified by another method. Enzymatic release of N-glycans followed by analysis by mass spectrometry was used both to verify the gross glycosylation trends that were observed via the glycan-specific visualization and to identify the compositions of the N-glycans. Shown in Figure 5 are the mass spectra obtained from enzymatic release of electrophoretically separated lactoferrin from different lactation days. The glycosylation trends obtained from the Pro-Q Emerald 300 staining approximate the total glycan intensity changes observed via high-resolution, high mass accuracy FTICR-MS. The glycans observed were primarily biantennary, complex-type and fucosylated, and compositions for each peak are labeled on the spectra. A decrease in both the total level of glycosylation and the degree of fucosylation was observed between days 2 and 5, although this change was largely nullified by day 15.
LC-MS/MS analyses were performed to determine the gross protein and glycoprotein content in milk. As expected, the shotgun analyses identified significantly more proteins than the gel-based analyses. The LC-MS/MS analyses identified a total of 55 proteins in the 11 samples analyzed. Identified proteins were searched in Swiss-Prot to determine their annotated glycosylation status. Shown in Table 2 are the proteins identified along with their potential for glycosylation. Of the 55 proteins identified, 27 are glycosylated, and an additional 12 are potentially N-glycosylated based on the presence of the consensus sequence NXS or NXT, where X is any amino acid except proline. O-glycosylation can occur at essentially any serine or threonine residue, so no sequence-based filter exists and there is no analogous classification for potentially O-glycosylated proteins.
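The consensus-sequence criterion can be checked directly on a protein sequence. The sketch below is illustrative and is not the search performed against Swiss-Prot; it uses a regular-expression lookahead so that overlapping sequons are all reported, and the example sequence is invented.

```python
import re

# N-X-S/T where X is any residue except proline. The lookahead keeps
# matches zero-width so that overlapping sequons are not skipped.
SEQUON = re.compile(r"(?=N[^P][ST])")


def find_sequons(sequence: str) -> list:
    """1-based positions of asparagines in N-X-S/T consensus sequences."""
    return [m.start() + 1 for m in SEQUON.finditer(sequence.upper())]


def is_potential_n_glycoprotein(sequence: str) -> bool:
    """True if the sequence contains at least one N-glycosylation sequon."""
    return bool(find_sequons(sequence))


if __name__ == "__main__":
    # Invented sequence: N2 (NGS) and N9 (NKT) qualify; N6 (NPT) does not.
    print(find_sequons("MNGSANPTNKT"))
```

The presence of a sequon is necessary but not sufficient for N-glycosylation, which is why such proteins are classed above only as potentially N-glycosylated.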
While there are a small number of published studies (<10) that detail the proteomics of the milk fat globule membrane, there have been surprisingly few contemporary reports on the proteomic analysis of the whey proteins in human milk. However, there are a few notable exceptions. Using a combination of casein removal, immunodepletion, PMF analysis, and MudPIT 42, Palmer et al. identified 151 proteins in colostrum, 83 of which were identified for the first time. Interestingly, the authors identified 27 proteins that are implicated in defense, 10 that have growth-modulating activities, and several that are involved in vitamin or mineral transport, supporting the hypothesis proposed by many researchers that milk is a multifunctional food. Regrettably, due to the difficulty in attaining a sufficient quantity of high quality colostrum samples, they were limited to a single pooled sample from one hundred donors and could not examine any temporal or individual variations of milk protein. However, their study was fundamental to the emerging view of milk as a vanguard of human health and development.
The work presented here shifts the focus from what has historically been done in mass spectrometry-based analyses of milk (identifying a large number of proteins) to identifying the dynamic behavior of the most abundant proteins and glycoproteins during lactation. Glycosylation has been shown to influence protein function, stability, susceptibility to proteolysis and cell-cell interactions; changes in glycosylation could therefore greatly influence both the function and degradation state of glycoproteins in the infant’s gut. As the degradation of the most abundant glycoproteins will have the greatest impact on the identity of potentially bioactive glycopeptides, tracking glycosylation changes in these species is essential.
Of the 10 abundant proteins identified via PMF in this study, eight are glycosylated or potentially glycosylated. Five proteins (tenascin, BSSL, Lactoferrin, Ig A and kappa-casein) are well characterized glycoproteins. In addition, alpha-lactalbumin is highly abundant but seldom (~1%) glycosylated 43 and xanthine dehydrogenase was only recently identified as a glycoprotein 44. Most of these abundant glycoproteins are also found to change in concentration during the first month of lactation. Only one of these has previously been shown to vary in glycosylation during the course of lactation, namely BSSL 45. Its glycosylation was shown to vary greatly between the first and the sixth month of lactation, both in absolute quantity of monosaccharide residues and in identity of glycans. BSSL was found to eventually be replaced by a completely nonglycosylated analog during late lactation.
In this study, we examine glycosylation changes of BSSL during the first month, and show that there is dynamic glycosylation during this time as well. While the protein amount does not change significantly in the samples examined here, the glycosylation amount does, showing a two-fold increase from the first to the fifteenth lactation day. This increase is maintained until the thirtieth day, despite an apparent decrease in the total amount of BSSL expression at the thirtieth day of lactation. Because BSSL glycosylation facilitates transport of the active form of the enzyme through the infant’s stomach, these changes may influence the efficiency of digestion and absorption of lipids, a critical energy source for the neonate. BSSL glycans contain Lewis epitopes and have been shown to contribute to host defense both through the presence of these epitopes and by virtue of the enzyme’s lipolytic activity 46. Therefore, BSSL glycosylation affects host defense both directly (by acting as a receptor analogue) and indirectly (by preserving enzymatic activity of BSSL in the infant’s digestive tract).
Ig A is one of the most abundant glycoproteins in colostrum, but decreases to much lower levels a few days after the onset of lactation. Ig A is commonly believed to augment the infant’s own nascent immune system. Gross glycosylation profiles of Ig A vary in the same manner as the total protein expression, meaning that the degree of Ig A glycosylation shows little variation. Despite the apparent lack of dynamic Ig A glycosylation, the decrease in protein amount leads to a dramatic reduction of the total glycoconjugate during early lactation.
Lactoferrin presents a compelling glycosylation transformation. It has been well established that, during the first ten days of lactation, both the concentration of lactoferrin and the total protein concentration decrease in similar fashion. In the present study, a constant mass of protein was analyzed for each sample, and the amount of lactoferrin was unchanged. Despite a constant amount of lactoferrin analyzed, we have found a significant change in both the identity and the extent of glycosylation present. These changes in glycosylation provide another dimension to this well studied glycoprotein that has not been examined in any detail, and at this point biological implications of this change can only be hypothesized. Compellingly, lactoferrin’s digestion products include peptides that have been shown to be antimicrobial 30, and as glycosylation has been shown to influence lactoferrin’s susceptibility to proteolysis 47, the modification could directly influence the generation of these antimicrobial species. It has also been shown that the identity of oligosaccharide present on lactoferrin can affect the efficiency of brush border binding and iron transport in vitro 48, so changes in glycosylation during lactation could affect both the identity of the microbiota and the availability of essential minerals in the infant’s digestive system by regulating the degree of digestion of lactoferrin during early life. In addition to demonstrating an overall drop in degree of glycosylation, FTICR-MS based analyses show a decrease in N-glycan fucosylation between days 2 and 5.
Based on global LC-MS/MS analyses we estimate that 70% of the abundant proteins identified here are likely glycosylated, which is larger than estimates of overall glycosylation in human serum proteins 49. What sets human milk apart, however, is the large amount of total glycosylation in the proteins. With eight of the ten most abundant proteins being glycosylated or potentially glycosylated, the absolute quantity of protein-linked oligosaccharide is extremely high. Combined with the highly abundant HMOs (present at 10-20 g/L), there is a predominance of indigestible carbohydrate present in early lactation. Evolutionary pressures suggest that these compounds are present to increase the survival of the infant, as the mother is synthesizing and excreting them at significant personal cost. While models have been established for the presence of HMOs, more research is needed to elucidate all the roles of protein-linked glycosylation. However, the dynamic nature of this glycosylation suggests that it adjusts to meet the needs of the infant and may indicate what roles these glycans play biologically.
In summary, milk glycoproteins are both prevalent and abundant. Contrary to the view of milk proteins functioning solely as a source of amino acids for digestion and mineral absorption, the predominance of protein glycosylation implies a more structure-specific role. Research to date has provided evidence that these glycoproteins and their digestive products provide an important source of bioactive compounds with diverse beneficial properties. We have demonstrated in this work that milk presents a dynamic glycoprotein component to the infant. Compellingly, several abundant glycoproteins are shown to vary in expression and glycosylation in an independent fashion.
Funding provided by UC Discovery, California Dairy Research Foundation and Dairy Management Incorporated is gratefully acknowledged. This research was also supported in part by NIEHS Superfund (grant P42 ES04699), and by the CHARGE study (grant P01 ES11269). The authors would also like to thank the UC Davis proteomics staff for assistance with the shotgun experiments. Also, thanks to Dr. Milady Niñonuevo, Caroline Chu, and Larry Lerno for helpful advice. Thanks to Scott Kronewitter for assistance with the image analysis software.