Current diagnostic tools limit a clinician’s ability to discriminate between many possible causes of sensorineural hearing loss. This constraint leads to the frequent diagnosis of the idiopathic condition, leaving patients without a clear prognosis and only general treatment options. As a first step toward developing new diagnostic tools and improving patient care, we report the first use of liquid chromatography-tandem mass-spectrometry (LC-MS/MS) to map the proteome of human perilymph. Using LC-MS/MS, we analyzed four samples, two collected from patients with vestibular schwannoma (VS) and two from patients undergoing cochlear implantation (CI). For each cohort, one sample contained pooled specimens collected from five patients and the second contained a specimen obtained from a single patient. Of the 271 proteins identified with high confidence among the samples, 71 proteins were common in every sample, and used to conservatively define the proteome of human perilymph. Comparison to human cerebrospinal fluid and blood plasma, as well as murine perilymph, showed significant similarity in protein content across fluids; however, a quantitative comparison was not possible. Fifteen candidate biomarkers of VS were identified by comparing VS and CI samples. This list will be used in future investigations targeted at discriminating between VS tumors associated with good versus poor hearing.
Hearing is initiated when sound-induced vibrations of the eardrum and middle-ear ossicles are transmitted to the fluids of the inner ear, leading to stimulation of sensory hair cells and excitation of the auditory nerve. Hearing loss is broadly classified into two categories: (1) conductive hearing loss (CHL), in which mechanical energy transfer from the air to the inner ear is impeded, and (2) sensorineural hearing loss (SNHL), in which tissue pathology in the inner ear or central auditory pathways hampers signal transduction or neural conduction. SNHL can be the result of damage to almost any of the inner ear’s approximately 30 different cell types, as well as the nerves connecting the cochlea to the brain. Although SNHL can be routinely differentiated from CHL, diagnosing the specific pathology causing an individual patient’s SNHL is a major challenge. No method currently exists to directly evaluate the inner ear, leaving the clinician blind to the underlying pathologic condition in SNHL. Indirect measures, such as audiograms, word recognition scores1, auditory brainstem responses and otoacoustic emissions2, can provide some insight, but are unable to make a specific determination.
Clinically, the limitation described above results in a frequent diagnosis of idiopathic SNHL, leaving the patient and physician with no clear prognosis and only general, often ineffective, treatment options. A diagnostic tool capable of providing insight into inner ear pathology would enable the formulation of individualized treatment strategies (“personalized medicine”) tailored to each patient’s specific pathology. The need for such a tool is immediately apparent from the large number of patients who have unsatisfactory results with general treatment options. Furthermore, a diagnostic platform is critical for successful implementation of preservative and restorative therapies that may emerge from ongoing research in inner ear development and regeneration3. Positive patient outcomes will only result if specific disease states are known and targeted, making a diagnostic tool absolutely critical to treatment success.
To lay the groundwork for development of a diagnostic platform capable of filling this clinical need, we present the first analysis of the human perilymph proteome using mass spectrometric (MS) techniques. Perilymph, a proximal fluid of the inner ear, bathes spiral ganglion cell bodies of the auditory nerve and nearly all of the tissues vital to sound transduction. Due to its localization, any protein secreted by a damaged cell or released during an apoptotic or necrotic event will be found in perilymph at higher concentrations than in more peripheral fluids such as blood or cerebrospinal fluid (CSF).
Previous work has demonstrated the utility of perilymph as a diagnostic fluid. Prior to modern imaging technologies, vestibular schwannoma (VS), an important cause of SNHL in clinical practice, was diagnosed via a significant, i.e. >2.5-fold increase in the total protein content of perilymph4,5. This example demonstrates the potential for significant change in perilymphatic protein levels as a result of pathology, encouraging our search for relevant diagnostic information. Identification and validation of a specific set of biomarkers coupled with the development of a refined collection technique, minimizing the risk to hearing, may facilitate the collection and analysis of perilymph for diagnostic purposes in a clinical setting.
The present investigation is, to the best of our knowledge, the first attempt to define the proteome of human perilymph using mass spectrometry. Previous knowledge of the perilymphatic proteome is derived mainly from 2-D gel electrophoresis work aimed at diagnosis of perilymphatic fistula6–9. These early investigations established the presence of over 100 proteins in the fluid, nearly 30 of which were subsequently identified. Our work extends the existing proteomic characterization and compares the protein profile of perilymph to other bodily fluids. We have compared protein content in different pathologic states as a first step towards the discovery of disease biomarkers. The similarity of human and mouse perilymph has also been analyzed to explore the potential applicability of mouse models for discovery of biomarkers relevant for human SNHL.
Beyond the previously mentioned diagnostic possibilities, the knowledge generated in this investigation will be of significant utility to basic auditory scientists and inner ear pharmacologists. Characterization of the protein content of perilymph may provide insight into the molecular mechanisms that function to maintain the inner ear’s unique environment. Knowledge of the fluid content will also allow better understanding and prediction of protein-drug interactions, aiding in assessment of pharmacological efficacy and drug delivery within the highly specialized organ.
Perilymph specimens were obtained from 12 patients undergoing clinically indicated surgeries. All patients had profound sensorineural hearing loss with speech discrimination score < 40%, where 100% is normal. Although detailed medical histories are not available, all patients were healthy enough to undergo general anesthesia without complications. All procedures, six translabyrinthine craniotomies for resection of vestibular schwannomas (VS) and six cochlear implantations (CI), involved surgical opening of the cochlea and collection of approximately 1 μl of perilymph prior to it being displaced. Perilymph samples obtained from CI patients generally have protein concentrations near 2 μg/μl and those from VS patients 15–30 μg/μl, as previously reported4,5. The specimens were placed in 0.2 ml phosphate buffered saline (PBS) and immediately stored at −80°C. The study was approved by the Institutional Review Board of the Massachusetts Eye and Ear Infirmary. Five VS specimens were a generous gift from Dr. Jose Fayad (House Ear Clinic, Los Angeles).
Utilizing these 12 specimens, four samples were prepared for MS/MS analysis. Sample CI_1 contained the perilymph specimen from one CI patient and CI_2 contained the pooled specimens from the remaining five CI patients (Fig. 1). Similarly, sample VS_1 contained the perilymph specimen from one VS patient and VS_2 contained the pooled specimens of the five remaining VS patients. Samples were grouped in this manner to determine if pooling samples would result in more protein identifications per analysis. The CI samples, representing a heterogeneous disease group, served as controls for the VS biomarker search (a homogeneous disease group), as hearing loss in VS may be associated with elevated protein content of perilymph, presumably due to an unknown toxic substance produced by the VS10,11.
Samples were fractionated using polyacrylamide gel electrophoresis and subjected to in-gel tryptic digestion followed by reversed-phase liquid chromatography in-line with a tandem mass spectrometer (GeLC-MS/MS) (Fig. 1). In brief, entire gel lanes were divided into 7–10 sections (VS_1: 8, VS_2: 10, CI_1: 7, CI_2: 10) and proteins in each gel section were digested with trypsin12,13. Peptides extracted from each gel section were analyzed by nanoflow reversed-phase high-performance liquid chromatography (HPLC) system (Eksigent) hyphenated with an LTQ-Orbitrap mass spectrometer (Thermo Scientific) (samples CI_1 and VS_1) or microscale capillary HPLC (Surveyor, Thermo Scientific) hyphenated with an LTQ mass spectrometer (Thermo Scientific) (samples CI_2 and VS_2). The LC columns (15 cm × 100 μm ID, New Objective) were packed in-house (Magic C18, 5 μm, 100 Å, Michrom BioResources). Samples were analyzed with a 60-minute linear gradient (0–35% acetonitrile with 0.2% formic acid) and data were acquired in a data-dependent fashion, with six MS/MS scans for every full scan spectrum.
All data generated from the gel sections were searched against the IPI-human database (v3.61)14 using the Paragon Algorithm15 integrated into the ProteinPilot search engine (v.3; AB/Sciex). Search parameters were set as follows: sample type, identification; Cys alkylation, iodoacetamide; Instrument, Orbitrap/FT (1–3 ppm) (CI_1 and VS_1) or LTQ (CI_2 and VS_2); special factors, gel-based ID; ID focus, none; database, international protein index (IPI) human (v.3.61); detection protein threshold, 99.0%; and search effort, thorough ID. To be considered a valid identification, proteins were required to score above a threshold established by a 1% global false discovery rate (corresponding to “Unused” scores in sample VS_1 ≥ 2.01, VS_2 ≥ 7.23, CI_1 ≥ 2.47, and CI_2 ≥ 5.58). This criterion enforced a confidence level exceeding 99% for each identification. Additionally, each valid protein identification was required to have a minimum of two high-confidence (≥95%) peptide identifications attributed to it.
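The two acceptance criteria above can be sketched as a simple filter. This is only an illustration, assuming each protein record provides its "Unused" score and a list of per-peptide confidences in percent; the record layout and the names `UNUSED_THRESHOLDS` and `is_valid_identification` are hypothetical and do not reflect ProteinPilot's actual export format.

```python
# Per-sample "Unused" score cutoffs reported in the text.
UNUSED_THRESHOLDS = {"VS_1": 2.01, "VS_2": 7.23, "CI_1": 2.47, "CI_2": 5.58}

MIN_CONFIDENT_PEPTIDES = 2   # at least two high-confidence peptides
PEPTIDE_CONFIDENCE = 95.0    # percent; a peptide counts if its confidence is >= 95%


def is_valid_identification(sample, unused_score, peptide_confidences):
    """Return True if a protein passes both criteria for the given sample.

    `peptide_confidences` is a hypothetical list of confidences (in percent)
    for the peptides attributed to the protein.
    """
    passes_score = unused_score >= UNUSED_THRESHOLDS[sample]
    n_confident = sum(1 for c in peptide_confidences if c >= PEPTIDE_CONFIDENCE)
    return passes_score and n_confident >= MIN_CONFIDENT_PEPTIDES
```

Because the score cutoff differs by sample, the same protein evidence could pass in one sample and fail in another, which is one reason identification counts are not directly comparable across the two instrument platforms.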
Quantitative comparison of protein expression was accomplished via analysis of spectral count data using the statistical framework QSPEC16. QSPEC employs hierarchical Bayes estimation of a generalized linear mixed effects model to identify proteins with differential expression across data sets. A protein was considered differentially expressed across disease state if QSPEC returned a Bayes factor larger than 10 and the fold change exceeded 1.5.
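As a minimal sketch of this decision rule, the function below applies the two cutoffs to values assumed to have already been obtained from QSPEC. It does not call QSPEC itself. The function name, the representation of fold change as a positive ratio, and the symmetric treatment of increases and decreases are assumptions made for illustration.

```python
BAYES_FACTOR_CUTOFF = 10.0
FOLD_CHANGE_CUTOFF = 1.5


def is_differentially_expressed(bayes_factor, fold_change):
    """Apply the Bayes factor > 10 and fold change > 1.5 criteria.

    `fold_change` is assumed to be a positive ratio (one disease state
    relative to the other); a ratio of 0.5 is treated as a 2-fold change
    in the opposite direction.
    """
    if fold_change <= 0:
        raise ValueError("fold_change must be a positive ratio")
    magnitude = max(fold_change, 1.0 / fold_change)
    return bayes_factor > BAYES_FACTOR_CUTOFF and magnitude > FOLD_CHANGE_CUTOFF
```

Requiring both conditions means a large Bayes factor alone is not sufficient: a protein with strong statistical support but a change of less than 1.5-fold is not called.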
QSPEC analysis of a 2 disease state data set, each with 2 replicates, would typically couple the spectral count data of all 4 samples into one analysis to maximize the amount of information utilized and increase the statistical power. However, the differences in sample preparation and instrumentation (described previously) make the optimal analytic approach ambiguous. To gain maximal insight into the data, three different implementations of the QSPEC package were tested and the results compared. The three analysis paradigms studied were: (1) spectral counts from all 4 samples were coupled into a single 2-disease-state/2-replicate analysis, (2) spectral counts were summed within each disease state, prior to analysis, to generate a single 2-disease-state/1-replicate data set, and (3) two 2-disease-state/1-replicate QSPEC analyses were conducted and the results were combined using a rudimentary scoring metric to determine expression characteristics. Samples prepared/analyzed similarly were directly compared such that the first QSPEC analysis compared samples CI_1 and VS_1 and the second compared CI_2 and VS_2.
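The difference between the three paradigms lies in how the spectral counts are arranged before analysis. The sketch below shows only that data arrangement, assuming counts are held as a dictionary of per-sample dictionaries mapping protein name to spectral count; the function names and data layout are hypothetical, and no QSPEC call is made.

```python
GROUPS = {"CI": ["CI_1", "CI_2"], "VS": ["VS_1", "VS_2"]}
PAIRS = [("CI_1", "VS_1"), ("CI_2", "VS_2")]  # samples analyzed on the same platform


def replicate_layout(counts, groups=GROUPS):
    """Paradigm 1: keep every sample as a separate replicate of its disease state."""
    return {state: [counts[s] for s in samples] for state, samples in groups.items()}


def summed_layout(counts, groups=GROUPS):
    """Paradigm 2: add spectral counts across the samples of each disease state."""
    summed = {}
    for state, samples in groups.items():
        totals = {}
        for sample in samples:
            for protein, n in counts[sample].items():
                totals[protein] = totals.get(protein, 0) + n
        summed[state] = totals
    return summed


def paired_layout(counts, pairs=PAIRS):
    """Paradigm 3: one CI-versus-VS data set per platform, analyzed separately."""
    return [{"CI": counts[ci], "VS": counts[vs]} for ci, vs in pairs]
```

Under the summed layout, per-sample information is lost: a protein seen in only one of the two samples of a disease state is indistinguishable from one seen at half that count in both.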
These three paradigms have different benefits and drawbacks. Method 1 is generally the most favorable as it utilizes all of the spectral count data generated in the 4 MS/MS analyses along with the spectra distribution characteristics of each sample. However, this method can fail when replicates have significantly different statistical properties.
Method 2 combines all of the available spectral data into one analysis but fails to utilize information contained within the individual statistical properties of each sample. The creators of QSPEC16 analyzed algorithm performance with summed vs. distinct replicates and demonstrated that, when replicates have similar statistical distributions, this type of combination yields results satisfactorily equivalent to those of method 1.
Method 3 has the benefit of utilizing all of the available spectral and sample distribution information but suffers because the data are divided between two separate analyses. To merge the results of method 3, a simple scoring metric was employed. Proteins found to be over-expressed in the VS sample of either analysis were given a score of +1, those over-expressed in the CI sample a score of −1, and those without differential expression a score of 0. Scores were summed across the two analyses, resulting in individual protein scores ranging from −2 to +2.
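The scoring metric can be written out in a few lines. This is a minimal sketch, assuming each of the two analyses has been reduced to a dictionary mapping a protein to the disease state in which it was called over-expressed (`"VS"` or `"CI"`), with proteins lacking a significant call simply absent; the function and variable names are hypothetical.

```python
SCORE_BY_CALL = {"VS": 1, "CI": -1, None: 0}


def merge_scores(calls_first, calls_second):
    """Sum the per-analysis scores (+1, -1 or 0) for every protein seen in either analysis."""
    proteins = set(calls_first) | set(calls_second)
    return {
        p: SCORE_BY_CALL[calls_first.get(p)] + SCORE_BY_CALL[calls_second.get(p)]
        for p in proteins
    }


def conflicting(calls_first, calls_second):
    """Proteins called in opposite directions by the two analyses (summed score 0)."""
    shared = set(calls_first) & set(calls_second)
    return {p for p in shared if calls_first[p] != calls_second[p]}
```

A summed score of 0 is ambiguous on its own: it can mean that neither analysis reached significance or that the two analyses disagreed, so the second helper tracks the conflicting case separately.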
A total of 271 proteins were identified at high confidence within the four perilymph samples (Fig. 2). The number of proteins identified (MS/MS spectra generated) per sample was: CI_1: 225 (35,027), VS_1: 167 (21,272), CI_2: 106 (112,465), VS_2: 90 (111,127). As none of the samples can be considered clinically normal, we began our analysis with the conservative requirement that a protein must be present in all four samples to be considered part of the core proteome. This analysis generated a list of 71 proteins identified with high confidence in all four samples (Supplementary Table 1). Comparison of this list to previous work9, using 2-D gel electrophoresis of perilymph obtained during stapedectomy procedures, shows strong similarity. Our list of 71 proteins contains 92% of those previously identified, and adds another 46 to the knowledgebase.
If the ‘normal’ proteome inclusion requirement is loosened to include proteins identified in only three of the four samples, the number of identifications increases to 99 (lower panel of Supplementary Table 1). This number is similar to the number of protein spots separated by Thalmann et al.9 and includes 100% of the proteins previously identified. Such a reduction of inclusion criteria may be warranted based on the low levels of reproducibility observed in MS analyses (<60% similarity across repeated trials17), suggesting the requirement that a protein be present in all four samples is overly stringent.
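Both inclusion rules, presence in all four samples and presence in at least three, amount to counting the number of samples in which each protein was confidently identified. The sketch below illustrates this with placeholder identifiers; the function name and input layout (a dictionary of sample name to set of protein identifiers) are assumptions.

```python
from collections import Counter


def proteins_in_at_least(identifications, min_samples):
    """Return proteins identified in at least `min_samples` of the samples.

    `identifications` maps a sample name to the collection of proteins
    confidently identified in that sample.
    """
    sample_counts = Counter(
        protein for proteins in identifications.values() for protein in set(proteins)
    )
    return {p for p, n in sample_counts.items() if n >= min_samples}


# Placeholder identifiers, not real data.
example = {
    "CI_1": {"A", "B", "C"},
    "CI_2": {"A", "B"},
    "VS_1": {"A", "B", "D"},
    "VS_2": {"A", "C"},
}
strict_core = proteins_in_at_least(example, 4)    # {"A"}
relaxed_core = proteins_in_at_least(example, 3)   # {"A", "B"}
```

The strict list is always a subset of the relaxed one, so loosening the threshold can only add proteins.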
Comparing the 71 common protein identifications with published proteomes of human cerebrospinal fluid (CSF) and blood plasma, we note a strong similarity across the fluids (Supplementary Table 1). The CSF database downloaded from the Max-Planck Unified Proteome Database (MAPU)18,19 contained 56 of the 71 proteins we identified in perilymph, while the plasma database obtained from the Plasma Proteome Institute (PPI)20,21 also contained 56 of the 71. In each case, the majority of the non-common proteins are suspected contaminants (keratins, carbonic anhydrases). This similarity among human fluids was expected, as animal models of the fluids contain similar protein complements. However, the quantity of individual proteins and the total protein content vary widely8.
Comparison of our perilymph dataset with the proteome of mouse perilymph published by Swan et al.22 revealed moderate similarity across the two fluids (Supplementary Table 1). Orthologues of 31 of the 52 proteins identified in that investigation were found in our list of human perilymphatic proteins (orthology determined using OrthoDB23).
The first iteration of QSPEC analysis, employing method 1, resulted in the identification of 104 proteins with differential expression between CI and VS samples. As mentioned above, this method was favorable due to its utilization of all available spectral count data in one analysis. However, the high degree of dissimilarity in spectral distribution properties, owing to the large difference in the number of spectra acquired by each MS/MS platform, resulted in poor model behavior and unreliable Bayes factors. The normalizing constant (Nj) employed in the generalized linear mixed effects model within QSPEC can adequately account for small differences in total spectral counts between replicates, but in this comparison the difference was large (>4 fold). At the protein level, this causes heterogeneous counts within and across disease states. This results in poor model fit and inflation of the Bayes factor due to increased model flexibility when the differential expression term is included. Such false identifications can be partly filtered by only considering fold changes in excess of 50% as valid16, as this reduces the chances of over-fitting, but the extreme deviation in these data sets may render such methods insufficient.
QSPEC analysis using method 2 resulted in identification of 324 differentially expressed proteins. Summing counts within disease state had the benefit of generating two data sets with similar statistical properties (resulting in better stability of the model), but did so at the cost of information about each sample’s spectral distribution statistics. This informational loss induced the large discrepancy between the results of methods 1 and 2 and caused very different model behavior than QSPEC’s authors experienced with their statistically more similar replicates16. To illustrate a common conflict, consider the protein hornerin (HRNR). HRNR was identified based on 22 spectra in sample CI_1 and was not found in any other sample. Using method 1, HRNR was not determined to be differentially expressed because the zero spectra from CI_2 suppressed the Bayes factor below significance. However, in method 2, QSPEC compared 22 spectra from the CI samples vs. 0 spectra from the VS samples and determined it to be differentially expressed. Furthermore, because the number of spectra associated with samples CI_2 and VS_2 was significantly greater than that associated with CI_1 and VS_1, this analysis paradigm gave heavy weighting to the second set of samples.
The final QSPEC analysis, method 3, identified 25 proteins scoring ±2 (indicating up-regulation in both VS samples or both CI samples, respectively) and 271 proteins scoring ±1 (up-regulated in one VS or CI sample while the other analysis did not reach significance). In addition to the hundreds of proteins which scored ‘0’ because neither analysis reached significance, twenty proteins scored ‘0’ due to conflicting expression patterns across the two sample groups. Despite the drawback of splitting the data, this approach worked well, as each analysis compared data sets with similar statistical parameters and took full advantage of information embedded within sample spectral distributions. This resulted in improved model stability and more confident determinations of differential expression. As such, the results of method 3 were selected for further analysis.
It is important to note that all of the spectra generated in the MS/MS analysis of the 4 samples, with no limitations on spectra confidence, were analyzed by QSPEC. In cases where a protein was evidenced by many low-confidence (but no high-confidence) spectra, this resulted in a QSPEC report of differential expression, even though the protein in question did not meet the requirements for a confident identification.
When the results of method 3 were filtered to remove these ‘non-significant’ returns, isoform conflicts and immunological proteins, 3 proteins remained that scored ±2 (CRYM, FN1, and KRT10) and 65 that scored ±1. Of these 68 proteins, 14 were selected as particularly interesting due to their differential expression and biological function (Table 1).
Interestingly, while the total protein content of the VS samples, as determined by SDS-PAGE analysis, was greater than the CI samples (due to higher protein concentration in perilymph from VS patients, data not shown), more identifications were made within the latter group based on our search criteria (combined VS samples: 171; combined CI samples: 237). This difference may be real, indicating decreased complexity of the VS perilymph proteome due to reduced protein secretion (possibly because of tissue loss), or may be an artifactual result from limitations of the dynamic range of detection. Tumor induced up-regulation of a small number of proteins may have suppressed the signal of lower abundance proteins, thus impeding their detection. Separate analysis of each section of the electrophoresis gel likely reduced, but did not eliminate, the potential signal suppression due to high abundance proteins.
Comparison across samples demonstrates that more proteins were identified in each of the samples derived from the specimen of one patient (CI_1 and VS_1) than in the samples derived from the specimens of five patients (CI_2 and VS_2). This disparity may have a similar cause as above in that pooling of samples led to suppression of the signal from low abundance proteins. However, the higher resolution and mass accuracy of the LTQ/Orbitrap MS used to analyze samples CI_1 and VS_1 is the most likely reason for the observed increase in identifications. In the context of the experiment, this discontinuity would likely serve to reduce the number of identifications common across all samples and individual disease groups.
Beyond the obvious limitations induced by the use of pathologic specimens, additional evidence for the need to perform a deeper characterization of perilymph comes from the comparison with published proteomes of human CSF and blood plasma. Both databases are significantly larger than the currently observed perilymph proteome, as each is composed of several hundred proteins. However, these body fluids are plentiful and large volumes in the ml-range are available in contrast to perilymph, of which only single digit μl-amounts can be obtained. Thus, multidimensional separation strategies that are optimized for minimized sample losses would be highly advantageous for the comprehensive analysis of the perilymph proteome.
Direct and quantitative comparison of perilymph with CSF and/or plasma, similar to previous work with mice22, would improve our understanding of human perilymph as a functional fluid. Such work would also be useful to assess the possibility that the similarity between perilymph and plasma is due to blood contamination of the perilymph sample.
It is worth noting that the list of 71 common proteins contains several that may not be of perilymphatic origin. The robust presence of carbonic anhydrases (common in red blood cells), multiple keratins and hemoglobins suggests that samples have suffered contamination from blood and cellular debris. However, carbonic anhydrase is known to be abundant in the cochlea, comprising about 1% of the protein of the membranous lateral wall24. In this analysis, we tried to minimize blood contamination by using visibly transparent samples of perilymph and excluding obviously pink specimens. Standardized methodologies must be carefully implemented in the future to limit sources of contamination and establish whether these proteins are in fact typical components of perilymph.
Exact mechanisms of hearing loss associated with VS are not known, as some VS induce hearing loss while others do not, a phenomenon that shows poor correlation with tumor size or growth10,25,26. Evidence suggests that these tumors cause hearing loss not only by compressing the auditory nerve but also by secreting a substance toxic to the cochlear nerve or inner ear11,27. In line with this hypothesis, we compared the VS and CI samples to generate an initial list of candidate biomarkers for VS associated with poor hearing (all samples were collected from patients with severe to profound deafness in the affected ear). Candidates were identified using a uniqueness criterion, i.e. identified within both VS samples but neither CI sample, and differential expression analysis using the QSPEC algorithm for spectral counting.
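The uniqueness criterion is a set operation on the per-sample identification lists: intersect the two VS lists, then remove anything seen in either CI list. The sketch below uses placeholder identifiers; the function name and input layout are assumptions for illustration.

```python
def unique_to_vs(identifications):
    """Proteins identified in both VS samples and in neither CI sample.

    `identifications` maps each of the four sample names to a set of
    confidently identified proteins.
    """
    in_both_vs = identifications["VS_1"] & identifications["VS_2"]
    in_any_ci = identifications["CI_1"] | identifications["CI_2"]
    return in_both_vs - in_any_ci


# Placeholder identifiers, not real data.
example_ids = {
    "VS_1": {"P1", "P2", "P3"},
    "VS_2": {"P1", "P2", "P4"},
    "CI_1": {"P2", "P5"},
    "CI_2": {"P5"},
}
vs_only = unique_to_vs(example_ids)  # {"P1"}
```

This criterion is independent of spectral counts, so it can flag a protein detected at low abundance in both VS samples that a count-based test would not call differentially expressed.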
Two proteins were identified in both VS samples but neither CI sample (Fig. 2). The first, μ-crystallin (CRYM, encoded by CRYM), is also known as NADPH-regulated thyroid hormone binding protein. CRYM is found in the cytoplasm where it binds and promotes transcriptional activity of the thyroid hormone triiodothyronine (T3)28. Expression occurs within many cells of the spiral ligament29 and mutation results in autosomal dominant deafness through changes in intracellular localization and the loss of its T3 binding ability, potentially leading to impaired K+ recycling30,31.
The second protein found within both VS samples and neither CI sample was low density lipoprotein-related protein 2 (LRP2, encoded by LRP2, also known as megalin). LRP2 is a trans-membrane receptor protein found primarily in absorptive epithelial cells, including those of the ear and kidney32–35. LRP2 can bind a wide variety of structurally dissimilar ligands and is a key player in mediating endocytosis of many substances including lipoproteins, sterols, vitamin binding proteins, and hormones. Mutations in LRP2 result in Donnai-Barrow and facio-oculo-acoustico-renal (FOAR) syndromes36, both of which present with sensorineural hearing loss. Surprisingly, LRP2 was not found to be differentially expressed by QSPEC because its spectral count in each VS sample was small. Despite this, LRP2 has been included in the list of potential candidates because of its detection pattern; LRP2 will be carefully validated in future work.
Analysis of the remaining proteins identified in Table 1 shows enrichment for many serine protease inhibitors. This class of molecules plays a critical role in regulating the inflammation response. Similarly, PARK7 and SOD3 both play active roles in regulating oxidative stress while CTSD and VCAN have been implicated in both cancer and neurological disorders. These molecules therefore appear particularly relevant to our interest in illuminating mechanisms of inner ear damage and will be investigated in future work.
Analysis of human perilymph specimens has provided novel insight into the protein content of the inner ear fluid microenvironment. However, the specimens originate from clinically diseased ears, and thus may not represent the complete proteome of normal perilymph. Using mass spectrometry, we have more than doubled the known proteome to include over 70 proteins. This number is less than the approximately 100 proteins previously separated using 2D gel electrophoresis9, although the gap may be narrowed by loosening the stringency of our inclusion criteria, as supported by known limitations of MS analysis. The proteins in perilymph were found to be similar to those in CSF and plasma, but quantitative comparison is not possible at this time. Human perilymph proteins are also similar to those in mouse perilymph, lending potential support to the use of a mouse model as a vehicle for biomarker discovery. Finally, a list of 15 candidate biomarkers of VS was generated, with some candidates having known roles in hearing and deafness, and other candidates showing significant up-regulation as determined via spectral count analysis with QSPEC.
The authors would like to thank Dr. Jose Fayad for the kind donation of five perilymph specimens. This project was supported by the American Otologic Society (AOS), Massachusetts Life Sciences Center (MLSC), NIDCD grants T32 DC00038 and K08 DC010419, NIH grant U24 DC008559 and Shore Fellowship at Harvard Medical School. This work is solely the responsibility of the authors and does not necessarily represent the official views of the AOS, MLSC, NIDCD or NIH.