Conceived and designed the experiments: JB MS AS. Performed the experiments: AB. Analyzed the data: KS JF QRA ML. Contributed reagents/materials/analysis tools: JF SS VS QRA. Wrote the paper: KS JB ML. Coordinated the study: KS. Performed the data analysis: KS. Was responsible for protein isolation, proteomic analyses, and Western blotting: AB. Was responsible for creation of the relational database: JF. Contributed to the development of the relational database: SS VS. Wrote custom software for and participated in proteomic data analysis: QRA. Contributed to the study design: AYK JB. Contributed to the writing of the manuscript: AYK JB ML. Contributed to the data analysis: JF ML. Contributed to conception of the study and design of the proteomics experiments: MS. Conceived the study and oversaw all aspects of the study: AS.
Although prior studies have demonstrated a smoking-induced field of molecular injury throughout the lung and airway, the impact of smoking on the airway epithelial proteome and its relationship to smoking-related changes in the airway transcriptome are unclear.
Airway epithelial cells were obtained from never (n=5) and current (n=5) smokers by brushing the mainstem bronchus. Proteins were separated by one-dimensional polyacrylamide gel electrophoresis (1D-PAGE). After in-gel digestion, tryptic peptides were processed via liquid chromatography/tandem mass spectrometry (LC-MS/MS) and proteins identified. RNA from the same samples was hybridized to HG-U133A microarrays. Protein detection was compared to RNA expression in the current study and a previously published airway dataset. The functional properties of many of the 197 proteins detected in a majority of never smokers were similar to those observed in the never smoker airway transcriptome. LC-MS/MS identified 23 proteins that differed between never and current smokers. Western blotting confirmed the smoking-related changes of PLUNC, P4HB1, and uteroglobin protein levels. Many of the proteins differentially detected between never and current smokers were also altered at the level of gene expression in this cohort and the prior airway transcriptome study. There was a strong association between protein detection and expression of its corresponding transcript within the same sample, with 86% of the proteins detected by LC-MS/MS having a detectable corresponding probeset by microarray in the same sample. Forty-one proteins identified by LC-MS/MS lacked detectable expression of a corresponding transcript and were detected in ≤5% of airway samples from a previously published dataset.
1D-PAGE coupled with LC-MS/MS effectively profiled the airway epithelium proteome and identified proteins expressed at different levels as a result of cigarette smoke exposure. While there was a strong correlation between protein and transcript detection within the same sample, we also identified proteins whose corresponding transcripts were not detected by microarray. This noninvasive approach to proteomic profiling of airway epithelium may provide additional insights into the field of injury induced by tobacco exposure.
Cigarette smoking, the leading cause of preventable death in the United States, is responsible for 440,000 deaths per year. Smoking is the single most important risk factor in the development of lung cancer, the leading cause of cancer-related death in the U.S., and of chronic obstructive pulmonary disease (COPD), the fourth leading cause of death overall. Although smoking is strongly associated with diseases such as lung cancer and COPD, the mechanisms by which smoking contributes to their pathogenesis are not completely understood.
Cigarette smoke creates a field of molecular injury in the epithelial cells lining the entire respiratory tract. Changes include cellular atypia, allelic loss, and promoter hypermethylation. Using oligonucleotide arrays and candidate gene approaches, our group and others have previously identified a number of mRNA expression changes that occur in the histologically normal airway epithelium in response to smoking and in association with disease. Furthermore, we have recently described smoking-induced changes in airway microRNA expression and their potential role in regulating the mRNA response to tobacco smoke. In this study, we sought to extend this field of molecular injury to the protein level and characterize the effect of smoking on the airway epithelium proteome.
Prior studies have analyzed lung tissue from never, current and former smokers using two-dimensional electrophoresis (2DE) coupled with mass spectrometry, leading to the hypothesis that smoke exposure induces an unfolded-protein-like response. Other studies identified lung-cancer-specific proteomic differences in bronchial epithelium obtained by biopsy from both “healthy” smokers and smokers with a history of lung cancer. Though studies have been performed using pooled nasal lavage samples and pooled exhaled breath condensate samples, little is known about either the effects of smoking on the proteome of airway epithelial cells or the variability in this response between individuals. In the current study we examined the effects of smoking on the airway epithelial proteome by analyzing individual samples collected by bronchoscopy from the mainstem bronchus. The ability to collect data from individual samples lays the groundwork for understanding variation in the proteomic response to cigarette smoke between individuals, which may ultimately be useful for determining why only a subset of smokers develop lung cancer or COPD.
Although studies have tried to address the large-scale correlation between protein production and mRNA expression in both cell lines and human tissues, the findings have been variable. Studies of yeast and human liver tissue have yielded moderate correlation of protein abundance to mRNA expression. A strong correlation has been reported for abundant proteins in an epithelial cell line model of ErbB-2 overproduction in breast cancer; however, protein abundance and levels of mRNA expression have correlated poorly in resected lung adenocarcinomas. The relationship between protein production and mRNA expression in normal airway epithelium remains unclear, as does the impact of smoking on this relationship.
In this study, we profiled proteins and genes expressed within the same bronchial epithelium of never and current smokers via 1D-PAGE with LC-MS/MS and DNA microarrays, respectively. The relationship between protein detection and mRNA expression was explored both globally and for individual proteins of interest. We found that the majority of airway proteins detected by mass spectrometry have their corresponding transcripts detected at measurable levels by microarray, and that changes at the protein level in response to cigarette smoke parallel smoking-induced changes in mRNA. This approach also detected proteins whose corresponding transcript expression was not detected by microarrays. This study represents the first application of this approach to the simultaneous proteomic and transcriptomic profiling of airway epithelium within the same individual, providing insight into the normal and smoking-affected airway proteome and the relationship between protein changes and the previously described changes in airway gene expression.
The demographics for subjects recruited into this study are shown in Table 1. The never and current smokers differed in age and cumulative tobacco exposure (as measured by pack-years of smoking) (p<0.05), but were similar for other demographics. None of the subjects were using inhaled medications.
A total of 652 proteins were detected in one or more never smokers, with 197 proteins found in the majority of never smokers (Figure 1). Proteins with molecular functions related to airway biology were over-represented among this list (Table 2). The functional categorization of the normal airway proteome was compared to over-represented functional categories of the normal airway transcriptome among transcripts detected by microarray both in these same five never smoker samples as well as a larger previously described cohort of 22 never smokers. mRNAs and proteins associated with nucleotide binding and pyrophosphate activity were over-represented in both datasets (PDAVID-BH<0.05).
A total of 613 proteins were detected in one or more current smokers, and 169 proteins were detected in the majority of current smokers (Figure 1). Three proteins differed in their rate of detection between current and never smokers at PFisher≤0.05. Aldehyde dehydrogenase 3B1 (ALDH3B1, NP_000685), a gene highly expressed in lung, was detected in all five never smokers and only one current smoker (PFisher=0.048). Palate, lung and nasal epithelium carcinoma associated protein precursor (PLUNC, NP_570913), a secretory protein in the upper respiratory tract, was detected in four never smokers and absent in all current smokers (PFisher=0.048). Hypothetical protein DKFZP586A0522 (NP_054752) was also detected in four never smokers and absent in all current smokers (PFisher=0.048).
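The detection-rate comparison above can be checked with a short calculation. The following is a minimal sketch in Python, assuming a two-sided Fisher exact test on a 2×2 table of detected versus not-detected counts in each group of five subjects. The function name `fisher_exact_two_sided` is illustrative and is not taken from the software used in the study.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact p-value for the 2x2 table [[a, b], [c, d]].

    Rows are groups (never, current); columns are (detected, not detected).
    The p-value sums the probabilities of all tables with the same margins
    that are no more likely than the observed table.
    """
    row1, row2 = a + b, c + d
    col1 = a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x detections in the first row.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    p_observed = prob(a)
    # Small relative tolerance guards against floating-point ties.
    return sum(
        prob(x)
        for x in range(lowest, highest + 1)
        if prob(x) <= p_observed * (1 + 1e-9)
    )


# ALDH3B1: detected in 5/5 never smokers and 1/5 current smokers.
print(round(fisher_exact_two_sided(5, 0, 1, 4), 3))
# PLUNC: detected in 4/5 never smokers and 0/5 current smokers.
print(round(fisher_exact_two_sided(4, 1, 0, 5), 3))
```

Both tables give 10/210, or approximately 0.048, which agrees with the values reported in the text. With five subjects per group, this is close to the smallest two-sided p-value such a table can produce short of complete separation.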
Due to the small sample size, a second list of differentially detected proteins was defined using a qualitative criterion: proteins were included if present in three or more samples of one class compared to the other. Twenty-three proteins differed between never and current smokers based on these criteria (Table 3).
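One reading of this qualitative criterion is sketched below in Python. It assumes that "three or more samples of one class compared to the other" means the number of samples in which a protein was detected differs by at least three between the two groups; that interpretation, the function name, and the entry `hypothetical_protein_X` are assumptions made for illustration. The counts for ALDH3B1 and PLUNC are those given in the text.

```python
def qualitatively_different(never_detected, current_detected, min_gap=3):
    """Return True if detection counts differ by at least min_gap samples."""
    return abs(never_detected - current_detected) >= min_gap


# (never smokers with detection, current smokers with detection), out of 5 each.
detection_counts = {
    "ALDH3B1": (5, 1),                 # from the text
    "PLUNC": (4, 0),                   # from the text
    "hypothetical_protein_X": (3, 2),  # invented example, does not qualify
}

candidates = sorted(
    name
    for name, (never, current) in detection_counts.items()
    if qualitatively_different(never, current)
)
print(candidates)
```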
We validated mass spectrometry findings by immunoblot for three of the proteins that differed between never and current smokers (Figure 2). PLUNC, uteroglobin and P4HB were selected from the list of twenty-three candidates based on their biologic interest, molecular weight, and antibody availability. Of these, PLUNC also had a Fisher exact p-value<0.05. Decreased levels of PLUNC and uteroglobin were confirmed among current smokers, although there was heterogeneity for uteroglobin among current smokers (Figure 2). P4HB levels were elevated in two of the current smokers as compared to two never smokers.
An average of 93% of proteins detected by mass spectrometry had at least one matching probe set on the HG-U133A array. Of these, an average of 86% had detectable gene expression (Pdetection<0.05) in samples collected from the same participants, demonstrating a significant level of co-detection (χ²=347, p=2.2×10⁻¹⁶). There was not a significant difference in the rate of co-detection between never and current smokers.
For select proteins where detection varied between never and current smokers, we examined the expression of the corresponding mRNA for smoking-related differential expression. PLUNC (NP_570913), ALDH3B1 (NP_000685), and hypothetical protein DKFZP586A0522 (NP_054752) were selected based on the results of the Fisher exact test. Uteroglobin (NP_003348) and the prolyl 4-hydroxylase beta subunit (P4HB) (NP_000909) were selected based on their qualitative differences between never and current smokers. Within this cohort, mRNA expression positively correlated with protein detection for PLUNC, uteroglobin, and P4HB (Figure 3).
The association between smoking and gene expression was also examined in a previously published cohort, from which we excluded a sample that overlapped with the samples used in this study (Figure 3). Consistent with the protein detection data and the gene expression data from the present study, in this independent group of never and current smokers, ALDH3B1, hypothetical protein DKFZP586A0522, PLUNC and uteroglobin mRNA expression were higher in never smokers and P4HB gene expression was higher in current smokers. Additionally, we used this cohort to assess the potential confounding effects of age on the smoking-induced changes in candidate proteins identified in the current study. Within the previously published cohort, we identified 12 never and 12 current smokers matched within 1 year for age. A t-test performed on these age-matched 12 never smokers and 12 current smokers confirmed differential gene expression of ALDH3B1 (211004_s_at, p=0.03), hypothetical protein DKFZP586A0522 (207761_s_at, p=0.03), PLUNC (220542_s_at, p=0.02), uteroglobin (205725_at, p=0.0005), and P4HB1 (200654_at, p=0.03).
Differences in protein detection by mass spectrometry and transcript detection by microarray were also explored. In the matched samples, there was no expression by microarray of transcripts corresponding to 41 proteins that were detected in ≥50% of samples by mass spectrometry (Table 4). Additionally, expression of these transcripts was detected in ≤5% of the never and current smokers in the larger previously published dataset of never and current smokers. Ten of these 41 proteins have been previously described in the erythrocyte proteome, which is not surprising given that brushings contain small numbers of red blood cells that lack nucleic acids.
We applied 1D-PAGE coupled with LC-MS/MS to the study of the airway epithelium proteome and its response to cigarette smoke exposure. This study presents the first proteomic profile of a relatively pure population of bronchial epithelial cells obtained from bronchoscopy brushings. We also used differences in the rate of protein detection between never and current smokers to identify candidates for proteins that vary in abundance in response to tobacco-smoke exposure. The effect of smoking on several of these proteins was confirmed by Western blot. We also found that for many candidates, smoking similarly affected expression of the mRNA transcripts that gave rise to these proteins. This was accomplished by measuring gene expression in the same samples that were profiled at the proteomic level and in an independent data set. The majority of proteins identified by LC-MS/MS had detectable levels of their corresponding transcript by microarray. Differing methodologies may account for the stronger relationship between protein and gene expression reported here relative to prior studies.
Analysis of the proteome using 1D-PAGE coupled with LC-MS/MS resulted in the detection of 41 proteins for which expression of corresponding transcripts was not detected by microarray. Some of these failures to detect transcript expression could represent technical limitations of the microarray platform. However, we were intrigued that several of the proteins whose transcripts were not detected by microarray represent erythrocyte-specific proteins. This suggests that: 1) the airway epithelial samples collected for this study were likely contaminated with erythrocytes, and 2) that more generally, stable proteins may be detected by proteomic methods long after the mRNA which encodes for them has disappeared.
Using habitual smoking as a paradigm for inhalational exposures affecting airway epithelium, we have identified changes in protein among smokers by LC-MS/MS and validated select changes with Western blotting. A decrease in the short isoform of PLUNC has previously been described in the pooled nasal lavage fluid of current smokers when compared with nonsmokers. Although the exact function of this protein is unclear, it is thought to act in the inflammatory response to inhaled irritants such as tobacco smoke. Other studies have demonstrated decreased levels of uteroglobin, an anti-inflammatory protein secreted by Clara cells, in the BAL, pooled nasal lavage fluid, and serum of healthy smokers and in the bronchial epithelium of former smokers with COPD undergoing lung transplantation. P4HB has been detected in a proteomic analysis of cell surface proteins of a lung adenocarcinoma cell line and in the 2DE-proteomic analysis of resected lung adenocarcinomas. This protein may function in the anti-oxidant response to cigarette smoke. Other proteins with oxidoreductase activity identified by this approach, such as ALDH3B1, have not previously been linked to cigarette smoking at the protein level but may function in the airway epithelial response to the toxins in cigarette smoke. None of the proteins differentially detected in smokers in this study overlapped with proteins previously described as differentially expressed in the lungs of Wistar rats exposed to cigarette smoke, or proteins differentially detected by 2DE/MALDI-TOF in a human pneumocyte cell line exposed to cigarette smoke extract.
This study was limited by a relatively small sample size, the sensitivity of the proteomic technique, and challenges in the quantification of proteins. While age was a confounding variable in this study, the gene expression changes in the airway epithelium of never and current smokers were validated using age-matched samples from current and never smokers in a previously published gene-expression study, suggesting that the association between smoking status and both gene and protein expression is unlikely to be due to differences in patient age. The amount of time elapsed between last smoking a cigarette and bronchoscopy was not recorded, and some of the variability of protein levels in Western blotting might relate to differences in the acute versus chronic effects of cigarette smoke. Although the small sample size limited the statistical analysis, Western blotting validated differences in protein detection identified by LC-MS/MS, suggesting the method's potential specificity. However, the power of our study to detect additional proteomic changes that occurred in response to cigarette smoke exposure was limited. The sensitivity of this technology allowed detection of 859 proteins with a false positive rate of 1%. While this represents a small percentage of the total proteins present in epithelial cells, we have identified a greater number of proteins than previously used methods of sample collection and proteomic analysis for smokers and nonsmokers. Because of the uncertainties associated with label-free quantification methods for the determination of protein expression levels, this platform serves mainly as a discovery tool. However, promising efforts in this area, including correlation of peak intensity or spectral counts with protein abundance, may soon remove this limitation.
In summary, we have described the proteomic profile of normal bronchial epithelial cells using 1D-PAGE coupled with LC-MS/MS and linked this profile to smoking-induced transcriptional changes in these same cells. This approach has the potential to provide additional insight into host response to tobacco smoke and the pathogenesis of smoking-related lung disease.
Never (n=5) and current smokers (n=5) were recruited for fiberoptic bronchoscopy at Boston Medical Center. Detailed medical and smoking histories were obtained including number of cigarettes smoked per day, cumulative tobacco exposure measured in pack-years, and an estimation of second-hand smoke exposure. Screening prior to bronchoscopy included an electrocardiogram, chest radiograph and spirometry. Participants with a history of underlying lung disease, significant second hand smoke exposure, an abnormal baseline EKG, or evidence of obstructive lung disease on spirometry (defined as an FEV1/FVC<0.7) were excluded from the study. This study was approved by the Institutional Review Board at Boston Medical Center, and all subjects provided written informed consent.
Bronchial epithelial cell brushings from the right mainstem bronchus were obtained at the time of bronchoscopy with an endoscopic cytology brush (Cellebrity Endoscopic Cytology Brush, Boston Scientific, Natick, MA). Cytokeratin staining has demonstrated that this method results in the collection of a greater than 90% pure population of bronchial epithelial cells. Airway brushings obtained for proteomics were immediately placed in PBS (Invitrogen, Carlsbad, CA). Additional brushes were collected for gene expression profiling and stored in TRIzol (Invitrogen). Samples in PBS were pelleted at 3500 rpm for 3 minutes, washed with PBS, and stored at −80°C until processing for mass spectrometry. The airway brushings in TRIzol were stored at −80°C until processing.
After cell lysis with 2% SDS, proteins were separated on a 4–20% polyacrylamide minigel by electrophoresis and stained with Coomassie Blue (Supporting Figure S1). Each gel lane was cut into 35–70 sections. Proteins were reduced with DTT, alkylated with iodoacetamide, and digested with trypsin using a DigestPro 96 robot (Intavis Bioanalytical Instruments, Cologne, Germany). Extracted peptides were dried and resuspended in 0.5% acetic acid in preparation for mass spectrometry.
All samples were analyzed by LC-MS/MS using an LTQ ProteomeX ion trap mass spectrometer (ThermoFinnigan, Waltham, MA). Peptides from each gel slice were serially injected onto a home-packed C18 reverse-phase column (Magic C18AQ, 15 cm×100 micron ID, Michrom Bioresources, Inc., Auburn, CA) interfaced directly to the mass spectrometer. Peptides were separated using short, biphasic, 20-minute gradients of 0–90% acetonitrile in the presence of 0.5% acetic acid. From each parent ion scan (MS scan), the ten most intense ions were selected for collision-induced dissociation, and the spectra of the peptide fragments were recorded (MS2 scan).
The data were analyzed using SEQUEST software. Spectra were queried against the curated entries of the NCBI RefSeq database and Xcorr values adjusted for an empiric false positive identification rate of 1% for forward-sequence proteins as determined by the inclusion of reversed protein sequences. Positive identification of a protein required observation of at least two matching peptides from the same or adjacent gel slices.
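The two identification rules in this paragraph can be sketched in Python as follows. The text does not give the exact formula used to estimate the false positive rate, so this sketch assumes the common target-decoy estimate (reversed-sequence matches divided by forward-sequence matches at or above a score cutoff). It also assumes that "two matching peptides" means two distinct peptide sequences. All function names and example values are hypothetical and are not part of SEQUEST.

```python
def xcorr_threshold(matches, max_fdr=0.01):
    """Lowest score cutoff whose estimated false positive rate is <= max_fdr.

    matches: iterable of (xcorr, is_decoy) pairs, where is_decoy marks a hit
    against a reversed protein sequence. Returns None if no cutoff qualifies.
    """
    targets = decoys = 0
    threshold = None
    # Walk from the best score downward, tracking the running decoy ratio.
    for score, is_decoy in sorted(matches, key=lambda m: m[0], reverse=True):
        if is_decoy:
            decoys += 1
        else:
            targets += 1
        if targets and decoys / targets <= max_fdr:
            threshold = score
    return threshold


def has_two_peptides_in_adjacent_slices(peptide_hits):
    """True if two distinct peptides come from the same or adjacent gel slices.

    peptide_hits: list of (peptide_sequence, gel_slice_index) pairs.
    """
    for i, (pep_a, slice_a) in enumerate(peptide_hits):
        for pep_b, slice_b in peptide_hits[i + 1:]:
            if pep_a != pep_b and abs(slice_a - slice_b) <= 1:
                return True
    return False


# Invented scores for illustration only.
example = [(5.0, False), (4.0, False), (3.0, True),
           (2.0, False), (1.0, True), (0.5, True)]
print(xcorr_threshold(example))               # cutoff at a 1% rate
print(xcorr_threshold(example, max_fdr=0.5))  # a looser rate admits more hits
```

In this sketch ties in score are not treated specially; a production pipeline would normally handle them and would apply separate cutoffs per charge state.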
Residual protein lysates from two never and five current smoker samples were quantified by 1D-PAGE and Coomassie blue staining (Supporting Figure S2). Of these samples, sufficient material was available for Western blotting of two never smoker samples and four current smoker samples. One current smoker sample was excluded due to lack of signal from the loading control, lamin A/C. Samples were incubated at 86°C in SDS-sample buffer and electrophoresed on a 4–20% SDS-PAGE gel. Proteins were transferred to nitrocellulose and stained with Ponceau Red. The membrane was blocked with 5% nonfat milk in TBS-Tween and incubated with the appropriate primary and secondary antibodies. Mouse anti-human prolyl 4-hydroxylase beta subunit was obtained from Chemicon (Temecula, CA). Mouse anti-human PLUNC and goat anti-mouse-HRP affinity purified antibodies were purchased from R&D Systems (Minneapolis, MN). Rabbit anti-uteroglobin was obtained from Abcam (Cambridge, MA). Lamin A/C, a nuclear matrix protein, was used as a loading control.
Six to eight micrograms of RNA obtained from five of the never smoker and four of the current smoker participants was processed and hybridized to an Affymetrix HG-U133A GeneChip (Affymetrix Inc., Santa Clara, CA) containing ~22,215 probesets as previously described.
Expression Console Version 1.0 (Affymetrix Inc.) was used to generate a MAS5 weighted-mean expression level for each transcript and a detection p-value (Pdetection), which indicates the reliability of detection of that transcript above background on the array. The mean intensity for each array was scaled to 100. Each array included in the final analysis had at least 30% of the probesets detected above background (percent present >30%) and a 3′ to 5′ ratio of signal intensity for GAPDH of less than or equal to 5. One never smoker microarray was excluded based on these quality control filters (low percent present, high 3′ to 5′ GAPDH ratio), leaving four never and four current smoker arrays for analysis.
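The array-level quality filters described above amount to two threshold checks, sketched here in Python. The sketch assumes that a probeset counts as detected above background when its detection p-value is below 0.05 (the cutoff used for Pdetection elsewhere in this paper) and follows the parenthetical "percent present >30%" rather than "at least 30%". The function names and signal values are hypothetical.

```python
def percent_present(detection_pvalues, alpha=0.05):
    """Percentage of probesets with a detection p-value below alpha."""
    called_present = sum(1 for p in detection_pvalues if p < alpha)
    return 100.0 * called_present / len(detection_pvalues)


def passes_quality_control(detection_pvalues, gapdh_3prime, gapdh_5prime,
                           min_percent_present=30.0, max_gapdh_ratio=5.0):
    """Apply the percent-present and GAPDH 3'/5' signal-ratio filters."""
    ratio = gapdh_3prime / gapdh_5prime
    return (percent_present(detection_pvalues) > min_percent_present
            and ratio <= max_gapdh_ratio)


# Invented arrays: 40% present with ratio 2 passes; 20% present fails;
# a 3'/5' ratio of 6 fails even when enough probesets are present.
good_array = [0.01] * 40 + [0.5] * 60
sparse_array = [0.01] * 20 + [0.5] * 80
print(passes_quality_control(good_array, 200.0, 100.0))
print(passes_quality_control(sparse_array, 200.0, 100.0))
print(passes_quality_control(good_array, 600.0, 100.0))
```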
Sample contamination with significant numbers of non-epithelial cells was evaluated, as described previously, by analyzing arrays for the presence of transcripts known to be present in airway epithelium and by confirming the absence of transcripts specific to non-epithelial cell types. No arrays were excluded based on these criteria.
For each protein, we queried the microarray data from the same patient for expression (Pdetection<0.05) of a matching transcript. The significance of the overlap between detected proteins and transcripts was determined using Pearson's Chi-squared test with Yates' continuity correction.
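The overlap test named above can be sketched in Python without external libraries. The sketch assumes a 2×2 table in which rows indicate whether a protein was detected and columns indicate whether a matching transcript was detected; the study's actual table is not given in the text, so the counts in the example are invented. The p-value uses the identity that, for one degree of freedom, the chi-squared upper tail equals `erfc(sqrt(x / 2))`.

```python
from math import erfc, sqrt


def yates_chi_squared(a, b, c, d):
    """Pearson chi-squared with Yates' correction for the table [[a, b], [c, d]].

    Returns (statistic, p_value) with one degree of freedom.
    """
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    # Yates' correction subtracts n/2 from |ad - bc|, floored at zero.
    corrected = max(0.0, abs(a * d - b * c) - n / 2)
    statistic = n * corrected ** 2 / (row1 * row2 * col1 * col2)
    p_value = erfc(sqrt(statistic / 2))
    return statistic, p_value


# Invented counts for illustration only.
statistic, p_value = yates_chi_squared(10, 20, 30, 40)
print(round(statistic, 4), round(p_value, 3))
```

The continuity correction makes the test more conservative, which matters most when some cells of the table are small.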
A comparison of protein detection and transcript expression level was also performed for individual proteins of interest using the microarray data generated in this study and a previously published cohort of 23 never smokers and 34 current smokers, excluding one never smoker in common to this cohort. The transcript expression data for these samples was obtained from http://pulm.bumc.bu.edu/aged and log normalized. The association between smoking status and gene expression was determined as previously described.
Functional enrichment analysis was performed using DAVID (http://david.abcc.ncifcrf.gov/). A modified Fisher exact test (PDAVID) was calculated for all analyses, and the Benjamini-Hochberg method was used to correct for false discovery (PDAVID-BH). To determine the molecular functions that were over-represented within the never smoker proteome, the Gene Ontology (GO) molecular functions of the U133A probes corresponding to the proteins detected in the majority of never smokers were compared to the GO molecular functions of all probe sets on the U133A array. A similar analysis was also performed for the never smoker transcriptome. Genes expressed at Pdetection<0.05 in all never smokers with good quality microarrays were compared to a background of all genes represented by probe sets on the U133A microarray. A parallel analysis was performed in DAVID using the genes expressed at Pdetection<0.05 in the 22 unique never smokers from a previously published data set. Over-represented gene ontology categories for proteins changed by smoking and for proteins that were not detectably expressed by microarray were determined by comparing the corresponding RefSeq identification numbers for these proteins against the complete set of 859 proteins detected by mass spectrometry in this set of experiments.
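The Benjamini-Hochberg correction mentioned above can be illustrated with a short Python sketch. This shows only the standard step-up adjustment applied to a list of raw p-values; it does not reproduce DAVID's modified Fisher exact test, and the function name and example p-values are chosen for illustration.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        index = order[rank - 1]
        running_min = min(running_min, p_values[index] * m / rank)
        adjusted[index] = running_min
    return adjusted


# Invented raw p-values for four functional categories.
raw = [0.01, 0.04, 0.03, 0.005]
print([round(p, 3) for p in benjamini_hochberg(raw)])
```

A category would then be reported as over-represented when its adjusted value falls below the chosen cutoff, which is 0.05 in this paper.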
Additional information, including clinical data for all of the study participants, the complete list of proteins detected in each sample, percent peptide coverage for each protein and the expression levels for all genes in all samples are stored in a relational MySQL database that is available at http://pulm.bumc.bu.edu/parce/parce.html. Microarray data from this study has been deposited in the National Center for Biotechnology Information Gene Expression Omnibus (GSE4635). Proteomic data has been deposited at Proteome Commons (http://www.proteomecommons.org/).
1D-PAGE of a current smoker sample prior to mass spectrometry. Proteins from each sample were separated by 1D-PAGE prior to mass spectrometry. A representative sample is shown. MW indicates the molecular weight marker. BSA indicates a bovine serum albumin standard. CS indicates current smoker.
1D-PAGE for approximation of protein yield prior to Western Blot. A small amount of material from each sample was retained for Western blotting. To roughly normalize the protein contribution from each sample, a small amount of material from the remaining samples was analyzed on 1D-PAGE and stained with Coomassie blue. MW indicates a molecular weight standard. NS indicates never smokers, and CS indicates current smokers.
The authors thank Yves-Martine Dumas for her assistance with sample collection.
Competing Interests: The authors have declared that no competing interests exist.
Funding: Doris Duke Charitable Foundation (AS), NIH/NCI R01CA124640 (AS and MEL), NIH/NIEHS U01ES016035 (AS and MEL), ATS Fellow Career Development Award (KS). The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.