Author contributions: A.F.R. and J.N.S. designed research; A.E.H., A.C.Y.W., K.P., C.S., M.R., A.F.R., and J.N.S. performed research; A.E.H., A.C.Y.W., K.P., J.R.Y., A.F.R., and J.N.S. contributed unpublished reagents/analytic tools; A.E.H., A.C.Y.W., K.P., C.S., A.F.R., and J.N.S. analyzed data; A.E.H., A.F.R., and J.N.S. wrote the paper.
The mammalian inner ear (IE) subserves auditory and vestibular sensations via highly specialized cells and proteins. Sensory receptor hair cells (HCs) are necessary for transducing mechanical inputs and stimulating sensory neurons by using a host of known and as yet unknown protein machinery. To understand the protein composition of these unique postmitotic cells, in which irreversible protein degradation or damage can lead to impaired hearing and balance, we analyzed IE samples by tandem mass spectrometry to generate an unbiased, shotgun-proteomics view of protein identities and abundances. By using Pou4f3/eGFP-transgenic mice in which HCs express GFP driven by Pou4f3, we FACS purified a population of HCs to analyze and compare the HC proteome with other IE subproteomes from sensory epithelia and whole IE. We show that the mammalian HC proteome comprises hundreds of uniquely or highly expressed proteins. Our global proteomic analysis of purified HCs extends the existing HC transcriptome, revealing previously undetected gene products and isoform-specific protein expression. Comparison of our proteomic data with mouse and human databases of genetic auditory/vestibular impairments confirms the critical role of the HC proteome for normal IE function, providing a cell-specific pool of candidates for novel, important HC genes. Several proteins identified exclusively in HCs by proteomics and verified by immunohistochemistry map to human genetic deafness loci, potentially representing new deafness genes.
SIGNIFICANCE STATEMENT Hearing and balance rely on specialized sensory hair cells (HCs) in the inner ear (IE) to convey information about sound, acceleration, and orientation to the brain. Genetically and environmentally induced perturbations to HC proteins can result in deafness and severe imbalance. We used transgenic mice with GFP-expressing HCs, coupled with FACS sorting and tandem mass spectrometry, to define the most complete HC and IE proteome to date. We show that hundreds of proteins are uniquely identified or enriched in HCs, extending previous gene expression analyses to reveal novel HC proteins and isoforms. Importantly, deafness-linked proteins were significantly enriched in HCs, suggesting that this in-depth proteomic analysis of IE sensory cells may hold potential for deafness gene discovery.
Sensory receptor hair cell (HC) proteins regulate a wide range of specialized sensory, amplification, and synaptic functions in the inner ear (IE) (Housley et al., 2006; Kazmierczak and Mu, 2012; Wichmann and Moser, 2015). Despite the increasingly rapid rate of deafness gene discovery, approximately one-third of human deafness loci remain uncharacterized and it is estimated that hundreds of human deafness genes remain unidentified (Vona et al., 2015). Recent transcriptome analyses of purified specific IE cell types and single cells have provided important new insights into HC developmental processes and critical gene expression for HC versus supporting cell fates (Elkan-Miller et al., 2011; Burns et al., 2015; Cai et al., 2015; Scheffer et al., 2015). Although mRNA provides a sensitive measure of gene expression, proteomic analysis represents gene product maturation and a measure of functioning pathways. Moreover, mRNA and protein levels do not strictly correlate, splice variants may not be detected, posttranslational processing alters many proteins, and mRNA for highly stable proteins may be missed (Sharma et al., 2015; Liu et al., 2016). Therefore, we set out to establish an initial draft of the mammalian IE HC proteome. To achieve this goal, we conducted in-depth analysis of protein expression of multiple mouse IE cell extracts using high-resolution tandem mass spectrometry (MS)-based shotgun proteomics. We hypothesize that protein expression patterns are discretely regulated in specific highly specialized IE cells that play distinct roles for auditory and balance senses.
To achieve a detailed and confident proteomic characterization, we examined a series of IE extracts with progressive enrichment for HCs: whole IEs, sensory epithelia (SE), and HCs that were FACS purified from dissociated SE on the basis of the HC-specific expression of GFP (GFP+) in our Pou4f3/eGFP reporter mice (Masuda et al., 2011). To confirm protein expression specifically in HCs, we also analyzed FACS-purified GFP− cells (presumed supporting cells) from the SE by MS. Through comparisons between these proteomic datasets, we defined hundreds of proteins and associated genes highly enriched in sensory HCs. By further comparisons with existing HC transcriptome data and with annotations for genes associated with deficits in auditory and vestibular function, we also identified novel HC proteins and isoforms and candidate genes for currently uncharacterized human deafness.
All experiments performed were approved by the animal care committees of the Veterans Administration San Diego Healthcare System, University of California–San Diego, and Northwestern University in accordance with National Institutes of Health and the Society for Neuroscience guidelines for the care and ethical use of animals for scientific research. In all studies, mice of either sex were used. Postnatal day 4 (P4) to P7 Pou4f3/eGFP-transgenic mice were used for all proteomic studies. In these mice, 8.5 kb of DNA 5′ to the Pou4f3 start codon drives the selective expression of eGFP in all neonatal IE HCs (Masuda et al., 2011). Additional validation of targets via immunolabeling of IE tissue was performed using P4–P8 Pou4f3/eGFP mice or wild-type FVB mice. Validation of HC gene expression by qRT-PCR was performed using P3–P5 Pou4f3/eGFP mice. Cells from the IE of three C57BL/6 wild-type mice (RRID:IMSR_JAX:000664) were used to set the FACS collection fluorescence and cell size collection gates.
For analysis of the whole IE, IEs were dissected from temporal bones and the bony/cartilaginous capsule removed by microdissection. For analysis of SE, cochlear and vestibular sensory organs (organ of Corti, utricular and saccular maculae, and semicircular canal ampullae) were extracted into Leibovitz's buffer (Invitrogen, #2183–027) in 60 mm culture dishes for microdissection. Otoconial membranes were removed from the maculae. The dissected cochlear and vestibular preparations were incubated separately with 0.5 mg/ml thermolysin (Sigma-Aldrich, #T7902) in Leibovitz's buffer for 25–30 min in a 37°C/5% CO2 humidified tissue culture incubator to dissociate the extracellular matrices. The thermolysin was then aspirated, extracellular matrix tissue removed, the samples rinsed, and the cochlear epithelia (including the organ of Corti, the spiral limbus, and basilar membrane) and vestibular epithelia (utricular and saccular maculae and cristae of the semicircular canals) were pooled. For analysis of purified HCs and supporting cells, SEs isolated as above were first subjected to enzymatic dissociation. The cochlear and vestibular SEs were incubated separately with FACSMax cell dissociation solution (Genlantis, #T200100). The cell mixture was triturated with a pipette and further dissociated into single cells mechanically by passing through a 23 G blunt-ended needle. The dissociation was monitored by fluorescence microscopy. Dissociated cells were passed through a 40 μm cell strainer (BD Biosciences) to eliminate clumps before sorting and collected into a FACS tube on ice containing Leibovitz's buffer with 5% fetal calf serum. 
Cochlear and vestibular GFP+ and GFP− cells were sorted with a BD Biosciences FACSAria II cell sorter using a 100 μm nozzle at 488 nm and only cells of high and very low fluorescence, respectively, and of large scatter size (indicative of cell integrity) were collected into 0.01 M PBS (Invitrogen) with protease inhibitors (cOmplete protease inhibitor cocktail tablet, Roche) and lyophilized. Lyophilized samples of 199,894 cochlear and vestibular GFP+ HCs (ratio of 0.38:0.62) or 313,808 cochlear and vestibular GFP− cells (ratio of 0.74:0.26) were pooled and reconstituted into 500 μl of RIPA lysis buffer (150 mM NaCl, 5 mM EDTA, pH 8.0, 50 mM Tris, pH 8.0, 1% NP-40, 0.5% sodium deoxycholate, 0.1% SDS) for liquid chromatography tandem MS (LC-MS/MS).
The IE, SE, and HC samples were dissected and, when present, the temporal bones and the bony/cartilaginous capsule were pulverized with microscale Dounce homogenizers and solubilized for 30 min with ice-cold RIPA buffer (components described above) with protease inhibitor cocktail tablet (cOmplete, Roche). The entire extract was then subjected to methanol and chloroform precipitation, the precipitated protein pellets were solubilized in 100 μl of 8 M urea for 30 min, 100 μl of 0.2% ProteaseMAX (Promega) was added, and the mixture was incubated for an additional 2 h. The protein extracts were reduced and alkylated as described previously (Chen et al., 2008), followed by the addition of 300 μl of 50 mM ammonium bicarbonate, 5 μl of 1% ProteaseMAX, and 20 μg of sequence-grade trypsin (Promega). Samples were digested overnight in a 37°C thermomixer (Eppendorf). Up to 100 μg of protein was loaded for analysis with an Orbitrap Velos or Elite MS and up to 3 μg for analysis with an Orbitrap Fusion MS.
IE samples were analyzed by LC-MS/MS and resulting spectral files were searched against a protein database, as described below, as single or pooled MS analysis. For IE samples, three biological replicates (each consisting of both ears from one mouse) were each analyzed independently by LC-MS/MS and the spectral files from all replicates were pooled for a single database search. For SE samples, pooled cochlear epithelia (organ of Corti) were analyzed independently from pooled vestibular epithelia (utricle, saccule, and ampullae, all extracted from the same 25 mice). Cochlear and vestibular SE spectral files were searched both independently and also pooled for one single database search. Two additional replicates of cochlear SE (organ of Corti) were each analyzed and searched independently, consisting of pooled samples from 35 and 70 mice. For HC samples, GFP+ HCs were sorted from all SE types (organ of Corti, utricle, saccule, and ampullae) from a total of 132 mice, pooled into two replicates that were each analyzed independently, and the spectral files from both replicates were pooled for a single database search. GFP− supporting cell samples from a total of 25 mice were collected and analyzed similarly to GFP+ cells in three pooled replicates.
For multidimensional chromatography (Orbitrap Velos or Orbitrap Elite MS) the protein digest was bomb-pressure loaded onto a Kasil frit 250 μm inner diameter capillary packed with 2.5 cm of 10 μm Jupiter C18 reversed-phase resin (Phenomenex), followed by an additional 2.5 cm of 5 μm Partisphere strong cation exchanger (Whatman) (Link et al., 1999; Washburn et al., 2001). The column was washed with buffer A containing 95% water, 5% acetonitrile (ACN), and 0.1% formic acid (FA). After washing, a 100 μm inner diameter capillary with a 5 μm pulled tip packed with 15 cm of 3 μm Jupiter C18 reversed-phase resin (Phenomenex) was attached to the filter union and the entire split-column (desalting column–union–analytical column) was placed in line with an Agilent 1200 quaternary HPLC and analyzed using a modified 11-step separation described previously (Savas et al., 2012). The buffer solutions used were as follows: 5% ACN/0.1% FA (buffer A), 80% ACN/0.1% FA (buffer B), and 500 mM ammonium acetate/5% ACN/0.1% FA (buffer C). Step 1 consisted of a 90 min gradient from 0–100% buffer B. Steps 2–11 had a similar profile with the following changes: 5 min in 100% buffer A, 3 min in X% buffer C, a 10 min gradient from 0–15% buffer B, and a 108 min gradient from 15–100% buffer B. The 3 min buffer C percentages (X) were 10%, 20%, 30%, 40%, 50%, 60%, 70%, 80%, 90%, and 100%, respectively, for the 11-step analysis. As peptides eluted from the microcapillary column, they were electrosprayed directly into an LTQ Orbitrap Velos or Elite MS (Thermo Finnigan) with the application of a distal 2.4 kV spray voltage. A cycle of one full-scan mass spectrum (400–1800 m/z) at a resolution of 60,000 followed by 15 data-dependent MS2 spectra at a 35% normalized collision energy was repeated continuously throughout each step of the multidimensional separation. Maximum ion accumulation times were set to 500 ms for survey MS scans and to 100 ms for MS2 scans.
Charge state rejection was set to omit singly charged ion species and ions for which a charge state could not be determined for MS2. Minimal signal for fragmentation was set to 1000. Dynamic exclusion was enabled with the following settings: repeat count, 1; repeat duration, 20.00 s; list size, 300; exclusion duration, 30.00 s; exclusion mass width (high/low), 1.5 m/z. Application of MS scan functions and HPLC solvent gradients was controlled by the Xcalibur data system.
For Orbitrap Fusion Tribrid MS analysis, the tryptic peptides were purified with Pierce C18 spin columns and fractionated with increasing ACN concentrations (15%, 20%, 30%, 40%, 60%, and 70%). Three micrograms of each fraction was auto-sampler loaded with a Thermo Fisher EASY nLC 1000 UPLC pump onto a vented Acclaim Pepmap 100, 75 μm × 2 cm, nanoViper trap column coupled to a nanoViper analytical column (Thermo Fisher 164570, 3 μm, 100 Å, C18, 0.075 mm, 500 mm) with stainless steel emitter tip assembled on the Nanospray Flex Ion Source with a spray voltage of 2000 V. Buffer A contained 94.785% H2O with 5% ACN and 0.125% FA, and buffer B contained 99.875% ACN with 0.125% FA. The chromatographic run was for 4 h in total with the following profile: 0–7% for 7 min, 10% for 6 min, 25% for 160 min, 33% for 40 min, 50% for 7 min, 95% for 5 min, and 95% again for 15 min, respectively. Additional MS parameters included: ion transfer tube temp = 300°C, Easy-IC internal mass calibration, default charge state = 2, and cycle time = 3 s. Detector type was set to Orbitrap, with 60 K resolution, with wide quad isolation, mass range = normal, scan range = 300–1500 m/z, max injection time = 50 ms, AGC target = 200,000, microscans = 1, S-lens RF level = 60, without source fragmentation, and datatype = positive and centroid. MIPS was set as on, included charge states = 2–6 (reject unassigned). Dynamic exclusion was enabled with n = 1 for 30 s and 45 s exclusion duration at 10 ppm for high and low. Precursor selection decision = most intense, top 20, isolation window = 1.6, scan range = auto normal, first mass = 110, collision energy 30%, CID, Detector type = ion trap, Orbitrap resolution = 30K, IT scan rate = rapid, max injection time = 75 ms, AGC target = 10,000, Q = 0.25, inject ions for all available parallelizable time.
Peptide spectral files from pooled samples or from biological replicates were combined for database searching. Spectrum raw files were extracted into MS1 and MS2 files using the in-house program RawXtractor or RawConverter (http://fields.scripps.edu/downloads.php) (He et al., 2015) and the tandem mass spectra were searched against the UniProt mouse protein database (downloaded on 03-25-2014; UniProt Consortium, 2015) and matched to sequences using the ProLuCID/SEQUEST algorithm (ProLuCID version 3.1; Eng et al., 1994; Xu et al., 2006) with 50 ppm peptide mass tolerance for precursor ions and 600 ppm for fragment ions. An eGFP sequence (below) was added manually to the mouse protein database to identify eGFP from IE samples of our Pou4f3/eGFP mice: MVSKGEELFTGVVPILVELDGDVNGHKFSVSGEGEGDATYGKLTLKFICTTGKLPVPWPTLVTTLTYGVQCFSRYPDHMKQHDFFKSAMPEGYVQERTIFFKDDGNYKTRAEVKFEGDTLVNRIELKGIDFKEDGNILGHKLEYNYNSHNVYIMADKQKNGIKVNFKIRHNIEDGSVQLADHYQQNTPIGDGPVLLPDNHYLSTQSALSKDPNEKRDHMVLLEFVTAAGITLGMDELYK.
The search space included all fully and half-tryptic peptide candidates that fell within the mass tolerance window, with no miscleavage constraint. Search results were assembled and filtered with DTASelect2 (version 2.1.3) (Tabb et al., 2002; Cociorva et al., 2007) through the Integrated Proteomics Pipeline (IP2 version 3, Integrated Proteomics Applications, http://www.integratedproteomics.com). To estimate peptide probabilities and false-discovery rates (FDR) accurately, we used a target/decoy database containing the reversed sequences of all the proteins appended to the target database (Peng et al., 2003). Each protein identified was required to have a minimum of one peptide with a minimal length of six amino acid residues; however, this peptide had to be an excellent match, with an FDR <0.001. After the peptide/spectrum matches were filtered, we estimated that the protein FDRs were ≤1% for each sample analysis. Resulting protein lists include subset proteins to allow for consideration of all possible protein forms implicated by a given peptide identified from the complex IE protein mixtures.
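The target/decoy logic can be illustrated with a minimal sketch. It assumes the simplest estimator, in which the number of reversed-sequence (decoy) matches above a score cutoff approximates the number of false target matches, so that FDR ≈ decoys/targets. The function names and the toy scores are hypothetical, and the sketch does not reproduce the statistical model that DTASelect2 actually applies.

```python
from typing import List, Optional, Tuple

# A peptide/spectrum match (PSM) is modelled as (score, is_decoy).
PSM = Tuple[float, bool]


def decoy_fdr(psms: List[PSM], threshold: float) -> float:
    """Estimate the FDR among PSMs scoring at or above `threshold`.

    Assumption: with a concatenated target + reversed database, decoy hits
    above the cutoff estimate the false target hits (FDR ~ decoys / targets).
    """
    targets = sum(1 for score, is_decoy in psms if score >= threshold and not is_decoy)
    decoys = sum(1 for score, is_decoy in psms if score >= threshold and is_decoy)
    if targets == 0:
        # No target hits survive: treat as uninformative unless decoys remain.
        return 0.0 if decoys == 0 else float("inf")
    return decoys / targets


def min_threshold_for_fdr(psms: List[PSM], max_fdr: float) -> Optional[float]:
    """Return the lowest observed score whose estimated FDR is <= max_fdr."""
    for threshold in sorted({score for score, _ in psms}):
        if decoy_fdr(psms, threshold) <= max_fdr:
            return threshold
    return None


# Toy data (hypothetical scores; True marks a decoy match).
toy_psms = [(5.0, False), (4.5, False), (4.0, False), (3.5, True),
            (3.0, False), (2.5, True), (2.0, False)]
print(decoy_fdr(toy_psms, 3.0))               # 1 decoy / 4 targets
print(min_threshold_for_fdr(toy_psms, 0.01))  # cutoff that removes all decoys
```

In this toy set, a cutoff of 3.0 retains four target and one decoy match, whereas a stringent 1% limit pushes the cutoff up to the first score at which no decoys remain.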
The complete MS search results, search parameters, and MS raw files have been submitted to MassIVE (accession number: MSV000079756) and ProteomeXchange (accession number: PXD004210). Upon acceptance, the data (project title: IE hair cell proteome) can be accessed by FTP download (URL: ftp://MSV000079756@massive.ucsd.edu).
Each protein identified with the IP2 pipeline was associated with several different measures of abundance used in our analyses, including: peptide counts, spectral counts, and normalized spectral abundance factor (NSAF) (Zybailov et al., 2006), which takes into account protein length and number of proteins identified in the experiment. When comparing abundances of a given protein across samples, we used NSAF rank rather than abundance to minimize the effects of differences in sample sizes and stochastic differences between MS analyses. Unless otherwise stated, all following analyses were performed, and all plots generated, with custom scripts in MATLAB (Release 2015b; The MathWorks). Venn diagrams were plotted using the venn script (MATLAB Central File Exchange, retrieved 08-06-15). To assess whether a protein was significantly enriched or depleted in a given sample, we devised a two-part algorithm using rank abundances and a criterion defined by a set of “control” enrichment patterns. Of the 3351 proteins identified in common across IE, SE, and HC samples, each was assigned to an enrichment profile based on rank abundance: HC-enriched (IE < SE < HC); SE-enriched (SE > average of IE and HC); HC-depleted (IE > SE > HC); or SE-depleted (SE < average of IE and HC). The overall change in rank (absolute difference maximum − minimum) across proteins varied widely (1–2920). Whereas an HC-enriched protein with a large change in rank between IE and HC samples likely represents a biologically meaningful protein enrichment in HCs, a SE-depleted protein with a moderate or low overall change in rank represents a less interpretable profile that may result from technical differences between samples. We thus defined a conservative criterion for significant enrichment/depletion based on the distribution of overall changes in rank observed in the SE-depleted “control” group: the 95th percentile value, equating to a change in rank abundance of at least 1488.
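The abundance measure and the two-part enrichment criterion can be sketched as follows. This is an illustrative Python reimplementation, not the custom MATLAB scripts used in the study. It assumes that NSAF follows the standard definition (spectral count divided by protein length, normalized to the sum over all proteins in the experiment), that a higher rank denotes higher abundance, that the monotonic profiles are tested before the SE-versus-average profiles, and that the percentile is computed by linear interpolation. All function names are hypothetical.

```python
from typing import Dict, List, Tuple


def nsaf(spectral_counts: Dict[str, int], lengths: Dict[str, int]) -> Dict[str, float]:
    """NSAF_i = (SpC_i / L_i) / sum_j (SpC_j / L_j) for one MS experiment."""
    saf = {p: spectral_counts[p] / lengths[p] for p in spectral_counts}
    total = sum(saf.values())
    return {p: value / total for p, value in saf.items()}


def abundance_ranks(values: Dict[str, float]) -> Dict[str, int]:
    """Rank proteins within a sample; rank 1 = least abundant (assumed convention)."""
    ordered = sorted(values, key=lambda p: (values[p], p))  # ties broken by name
    return {p: i + 1 for i, p in enumerate(ordered)}


def enrichment_profile(ie: int, se: int, hc: int) -> str:
    """Assign one of the four rank-abundance profiles described in the text."""
    if ie < se < hc:
        return "HC-enriched"
    if ie > se > hc:
        return "HC-depleted"
    average = (ie + hc) / 2
    if se > average:
        return "SE-enriched"
    if se < average:
        return "SE-depleted"
    return "unassigned"


def rank_change(ie: int, se: int, hc: int) -> int:
    """Overall change in rank: maximum minus minimum across the three samples."""
    return max(ie, se, hc) - min(ie, se, hc)


def percentile(values: List[float], q: float) -> float:
    """Percentile (q in 0-100) of a non-empty list, by linear interpolation."""
    ordered = sorted(values)
    position = (len(ordered) - 1) * q / 100
    low = int(position)
    high = min(low + 1, len(ordered) - 1)
    return ordered[low] + (ordered[high] - ordered[low]) * (position - low)


def significant_hc_enriched(ranks: Dict[str, Tuple[int, int, int]]) -> List[str]:
    """HC-enriched proteins whose rank change reaches the 95th percentile of
    rank changes in the SE-depleted control group. `ranks` maps each protein
    to its (IE, SE, HC) ranks."""
    control = [rank_change(*r) for r in ranks.values()
               if enrichment_profile(*r) == "SE-depleted"]
    if not control:
        return []
    cutoff = percentile(control, 95.0)
    return sorted(p for p, r in ranks.items()
                  if enrichment_profile(*r) == "HC-enriched" and rank_change(*r) >= cutoff)
```

Working on ranks rather than raw NSAF values keeps the comparison insensitive to differences in total protein load between samples, which is the motivation stated above.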
Exemplar tandem mass spectra were extracted from raw files using Xcalibur (version 3.0; Thermo Fisher Scientific) and b- and y-ion peaks were identified with the IP2 spectrum viewer using average mass mode. To illustrate the position of the identified peptides within the linearized protein sequences, protein domain schematics were created based on alignments made with the UniProt alignment tool (http://www.uniprot.org/align) (UniProt Consortium, 2015) and on domains identified with Pfam (http://pfam.xfam.org) (Finn et al., 2014).
To facilitate comparisons of our proteomic data with existing transcriptomes, transgenic animal phenotype databases, and human deafness genes, proteins were first mapped to Mouse Genome Informatics identifiers (MGI IDs). Using the batch query tool on UniProt (http://www.uniprot.org/uploadlists, accessed 08-06-15), 9000 of 9071 UniProt accession numbers (99.2%) were converted to 6394 MGI IDs. One additional protein was successfully matched to an MGI ID using the MGI batch query tool (http://www.informatics.jax.org/batch, accessed 08-06-15). In constructing a gene-centered Venn diagram, we assigned a given gene to a category if all of its associated proteins identified by MS also fell within the same category (e.g., a HC-only gene represents one or more gene products that were only identified in the HC sample). Gene names displayed in tables were derived from the associated UniProt entry information. For UniProt entries lacking gene names, we instead used gene names from the appropriate MGI entry information.
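The gene-centered assignment rule can be expressed compactly. The sketch below assumes each protein has already been mapped to a gene identifier and to a single Venn category; the accessions, gene labels, and the function name are placeholders. Genes whose proteins fall into different categories are returned as `None`, because the rule above does not specify how they are handled.

```python
from collections import defaultdict
from typing import Dict, Optional


def gene_categories(protein_to_gene: Dict[str, str],
                    protein_to_category: Dict[str, str]) -> Dict[str, Optional[str]]:
    """Assign a gene to a category only if all of its proteins share that category."""
    categories_by_gene = defaultdict(set)
    for protein, gene in protein_to_gene.items():
        categories_by_gene[gene].add(protein_to_category[protein])
    return {
        gene: next(iter(categories)) if len(categories) == 1 else None
        for gene, categories in categories_by_gene.items()
    }


# Placeholder identifiers for illustration only.
proteins = {"prot1": "geneA", "prot2": "geneA", "prot3": "geneB", "prot4": "geneB"}
categories = {"prot1": "HC-only", "prot2": "HC-only", "prot3": "HC-only", "prot4": "IE+SE+HC"}
print(gene_categories(proteins, categories))
```

Here `geneA` is an HC-only gene because both of its products were found only in the HC sample, whereas `geneB` is left unassigned.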
Transcriptomic data from Scheffer et al. (2015) were selected for protein–mRNA comparison because this study used a similar approach: FACS sorted HCs (GFP+) and presumed supporting cells (GFP−) from cochlear and utricular SE from Pou4f3/eGFP mice. Processed data, as described in Scheffer et al. (2015), were downloaded as a single database from the Shared Harvard Inner-Ear Laboratory Database (Shen et al., 2015). Importantly, we considered only data from P4 and P7 mice to mirror the age range used in the current study. Using the minimum read count criterion (>15) established in Scheffer et al. (2015) and ignoring entries with no reads at either P4 or P7, we ultimately used 18,101 of the total 20,207 genes for our analyses. As in Scheffer et al. (2015), we calculated the fold change in read counts for each gene as GFP+/GFP− and used cutoffs of >2 and <0.5 to define “HC-enriched” and “HC-depleted” genes, respectively.
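The fold-change classification can be sketched as below. The cutoffs (>2 and <0.5) and the read-count criterion (>15) come from the text; how the read-count criterion is applied across the two populations is an assumption here (a gene is retained if either population exceeds it), as are the function names and the handling of a zero GFP− count.

```python
import math
from typing import Optional


def fold_change(gfp_pos_reads: float, gfp_neg_reads: float) -> float:
    """Fold change in read counts, computed as GFP+/GFP-."""
    if gfp_neg_reads == 0:
        return math.inf  # assumption: no GFP- reads is treated as infinite enrichment
    return gfp_pos_reads / gfp_neg_reads


def classify_gene(gfp_pos_reads: float, gfp_neg_reads: float,
                  min_reads: float = 15.0) -> Optional[str]:
    """Return 'HC-enriched', 'HC-depleted', 'unchanged', or None if filtered out."""
    # Assumed filter: keep the gene if either population exceeds the minimum.
    if max(gfp_pos_reads, gfp_neg_reads) <= min_reads:
        return None
    fc = fold_change(gfp_pos_reads, gfp_neg_reads)
    if fc > 2.0:
        return "HC-enriched"
    if fc < 0.5:
        return "HC-depleted"
    return "unchanged"


print(classify_gene(100, 20))  # fold change 5
print(classify_gene(10, 100))  # fold change 0.1
```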
We then used MGI IDs to identify genes across proteomic and transcriptomic datasets. For plotting proteomic-derived genes against corresponding transcripts, genes were matched by MGI ID, ordered by mRNA rank abundance, binned, counted, and expressed as a percentage of the total genes identified through the proteomic dataset. For plotting cumulative count of gene products versus transcript abundance across GFP+ samples, we defined the total abundance of a given transcript as summed read counts across both P4 and P7 and both cochlear and utricular GFP+ datasets. Fourteen transcripts appeared multiple times in the dataset and were excluded from analysis for simplicity. For plotting cumulative count of gene products versus protein abundance, we used protein NSAF values and treated each protein sequence as a separate entity to allow for differing abundances across isoforms or alternative sequences.
We compiled auditory and vestibular phenotypes identified across various transgenic mouse lines using Mammalian Phenotype (MP) ontology terms within the Mouse Genome Database (MGD) (http://www.informatics.jax.org, accessed 11-30-15) (Eppig et al., 2015), comprising information from large consortium studies as well as primary literature. From this hierarchically organized database, we first extracted data (transgenic mouse lines and MP terms) from the following higher-level categories: “abnormal ear physiology,” “abnormal ear morphology,” “abnormal pinna reflex,” “abnormal postural reflex,” “abnormal startle reflex,” “abnormal vestibulocollic reflex,” “abnormal vestibuloocular reflex,” and “abnormal eye physiology.” Then, we filtered the resulting MP terms to those representing either negative outcomes on a particular assay (e.g., “decreased startle reflex,” “increased threshold for auditory brainstem response”) or negative changes in anatomical loci (e.g., “decreased cochlear hair cell number”). These MP terms were each assigned to one of the following groups of impaired phenotypes: (1) behavior, (2) IE physiology, (3) IE morphology, (4) impaired hearing or increased susceptibility to hearing loss, and (5) eye physiology, used as a negative control group to test the sensitivity of our analyses (e.g., comprises assays analogous to hearing assays, such as electroretinograms instead of auditory brainstem response recordings).
The affected genes (and corresponding MGI IDs) of the transgenic mouse lines associated with the phenotypes in these five groups were filtered by genes identified by our proteomic approach. The number of genes represented in each phenotype group ranged from 232 to 372. Only mouse lines with one gene manipulation (e.g., single gene knock-out or point mutation) were included to simplify interpretation. To predict potential new causative IE genes underlying auditory/vestibular impairment in mice, we filtered the genes in the “behavior” phenotype group to those that did not additionally appear in the “physiology,” “morphology,” or “hearing impairment” groups (i.e., filtered the gene list to those without well characterized roles in the IE or hearing/balance).
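The candidate filter described here reduces to set operations on gene identifiers. The following sketch assumes each phenotype group is available as a set of identifiers; the group keys, gene labels, and function name are placeholders rather than values from the study.

```python
from typing import Dict, Set


def behavior_only_candidates(groups: Dict[str, Set[str]],
                             proteome_genes: Set[str]) -> Set[str]:
    """Genes in the behavior group that were detected by proteomics and that
    do not appear in the physiology, morphology, or hearing impairment groups."""
    characterized = (groups["physiology"]
                     | groups["morphology"]
                     | groups["hearing impairment"])
    return (groups["behavior"] & proteome_genes) - characterized


# Placeholder gene labels for illustration only.
phenotype_groups = {
    "behavior": {"geneA", "geneB", "geneC"},
    "physiology": {"geneB"},
    "morphology": set(),
    "hearing impairment": {"geneD"},
}
detected = {"geneA", "geneB", "geneD"}
print(behavior_only_candidates(phenotype_groups, detected))  # only geneA remains
```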
Data from the Hereditary Hearing Loss home page (http://hereditaryhearingloss.org, accessed 10-07-15) were used to generate a list of known human nonsyndromic and syndromic deafness genes and a list of human nonsyndromic deafness loci without identified causative genes. The latter list was further verified, and updated where necessary, with data compiled by OMIM (http://omim.org, accessed 10-08-15) to ensure current chromosomal locations for deafness loci. Deafness genes were mapped to mouse orthologs and MGI IDs using the MGI batch query tool (accessed 10-08-15).
To predict potential novel deafness genes, we mapped genes identified only in the HC sample to human deafness loci as follows. MGI IDs were matched to human genes and human chromosomal locations using the GRCm38.p4 mouse and GRCh38.p3 human assemblies within Ensembl BioMart (Ensembl Release 82) (Kinsella et al., 2011; Flicek et al., 2014). The matching human chromosome locations were mapped to cytobands using annotations from UCSC Genome Bioinformatics (GRCh38 assembly, http://genome.ucsc.edu, accessed 10-08-15) (Rosenbloom et al., 2015). Then, only for mouse genes with high orthology confidence (orthology = 1, BioMart), corresponding human cytobands were filtered to those matching deafness loci with unknown causative genes. All resulting HC-only genes residing in deafness loci remain possible candidates for deafness genes because none could be ruled out based on previous studies (verified using OMIM, 10-08-15).
Statistical overrepresentation tests of gene ontology (GO) terms were performed with PANTHER gene analysis tools (http://pantherdb.org) (Mi et al., 2013) using UniProt accession numbers of canonical isoforms as inputs and Bonferroni correction for multiple testing. When multiple significantly overrepresented GO terms arose from identical sets of proteins, the term associated with the highest fold enrichment value was used. Two-sample Kolmogorov–Smirnov tests were performed in MATLAB (The MathWorks) using kstest2. Fisher's exact tests and inspection of Pearson's residuals from post hoc χ2 tests were performed using R (version 3.2.2) (R Core Team, 2015).
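For readers unfamiliar with these tests, the sketch below shows a one-sided Fisher's exact test for a 2 × 2 table together with a Bonferroni adjustment. It is written in Python for illustration and is not the R code used in the study; in particular, it computes only the upper-tail (enrichment) probability, whereas a two-sided test may have been applied. Function names are hypothetical.

```python
from math import comb


def fisher_exact_greater(a: int, b: int, c: int, d: int) -> float:
    """One-sided p value for the table [[a, b], [c, d]].

    Returns P(X >= a), where X follows the hypergeometric distribution
    implied by the fixed row and column totals.
    """
    row1 = a + b
    col1 = a + c
    n = a + b + c + d
    denominator = comb(n, col1)
    upper = min(row1, col1)
    numerator = sum(comb(row1, k) * comb(n - row1, col1 - k)
                    for k in range(a, upper + 1))
    return numerator / denominator


def bonferroni(p_value: float, n_tests: int) -> float:
    """Bonferroni-adjusted p value, capped at 1."""
    return min(1.0, p_value * n_tests)


# Toy table: all 3 items in the first row fall in the first column.
p = fisher_exact_greater(3, 0, 0, 3)
print(p, bonferroni(p, 3))
```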
IEs from Pou4f3/eGFP mice (P4, both sexes) or FVB wild-type mice (P6–P8, both sexes) were dissected from the temporal bone, perfused with 4% paraformaldehyde (PFA) through oval and round windows, and postfixed either overnight at 4°C or for 1–2 h at room temperature. For sections, ears were decalcified with 8% EDTA in PB for 7–14 d at 4°C and then cryopreserved in 30% sucrose in PB overnight, followed by sequential replacement to 100% Tissue-Tek Optimal Cutting Temperature compound (Sakura Finetek) and snap-frozen in liquid nitrogen before cryosectioning at 30 μm thickness. For whole-mount preparations, microdissection of the bony capsule was performed to extract cochlear SE or vestibular macular epithelia. Whole mounts of vestibular epithelia were gently brushed with an eyelash to remove the gelatinous otoconial layer. Tissues were blocked and permeabilized with either 10% goat serum or 5% bovine serum with 1% Triton X-100. The following primary antibodies were used, followed by species-appropriate Alexa Fluor-conjugated secondary antibodies: anti-Casz1 at 1:100, NovusBio, CO, NBP1–86618 RRID:AB_11011305; anti-Cfap36 at 1:100, Bioss, MA, bs-812404R; anti-Fjx at 1:100, Bioss, MA, bs-8103R; anti-Pak3 at 1:50, Sigma-Aldrich, WH0005063M8 RRID:AB_1842823; anti-Nlgn3 at 1:50, R&D Systems, MAB6088; anti-Myo7a at 1:50, Proteus Biosciences, 25-6790 RRID:AB_10015251; anti-Ctbp2 at 1:100, BD Biosciences, 612044 RRID:AB_399431. Tissues were counterstained with DAPI nucleic acid stain and/or Alexa Fluor-conjugated phalloidin actin stain. Images were captured with confocal laser scanning microscopy (either an Olympus FV1000 or a Leica DMI4000).
To validate selection of GFP+ cells, FACS sorted GFP+ (cochlear and vestibular HCs) and GFP− cells (supporting cells) were collected into six-well cell culture plates and incubated at 37°C in a humidified 5% CO2 chamber overnight to allow the cells to adhere. The cells were then washed with PBS, fixed in 4% PFA, permeabilized, and blocked with 10% fetal bovine serum/5% Triton X-100. Anti-Myo7a (Proteus Biosciences, 25-6790 RRID:AB_10015251) was applied at 1:400, detected with Alexa Fluor 594 goat anti-rabbit IgG at 1:100 with DAPI DNA counterstain. The immunolabeled cells were imaged with a FSX100 fluorescence microscope (Olympus).
Total RNA was extracted from FACS sorted GFP+ HCs and GFP− supporting cells using TRIzol and RNeasy Kit (Qiagen). RNA was reverse transcribed using Superscript III First-Strand cDNA synthesis kit (Invitrogen). Real-time PCR was performed with StepOnePlus Real-Time PCR system (Applied Biosystems) using Power SYBR Green Master Mix (Applied Biosystems). A total of 500 ng of cDNA was used for each reaction. Cycling parameters were as follows: 95°C for 10 min; 40 cycles of 95°C for 15 s, and 60°C for 60 s. Triplicates of each primer were performed with a no-template control. Three biological replicates were used for each target. Primers were sourced from Qiagen (QuantiTect primer assays) and were reconstituted to 10× concentration: Pou4f3 (QT00278957), Erich3 (“BC007180”) (QT00154028), Rsph10b (QT00319914), Wipf3 (QT01780695), 4930407I10Rik (QT00263487), Acad12 (QT00304948), Krt81 (QT00306656), 5430421N21Rik (QT02434544), Hes1 (QT00313537), and Coch (QT00116774). Normalized gene expression levels in GFP+ cells relative to GFP− cells are expressed as log10 of the mean relative quantification (RQ) across biological replicates.
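The final expression measure can be illustrated with a short sketch. It assumes that RQ is obtained by the comparative Ct (2^−ΔΔCt) method with a reference gene, which is a common default but is not stated above; the Ct values, the reference gene, and the function names are hypothetical.

```python
import math
from statistics import mean
from typing import Sequence


def rq_from_ct(ct_target_pos: float, ct_ref_pos: float,
               ct_target_neg: float, ct_ref_neg: float) -> float:
    """Relative quantification of a target in GFP+ versus GFP- cells.

    Assumption: comparative Ct method, RQ = 2 ** -(dCt_GFP+ - dCt_GFP-),
    where dCt = Ct(target) - Ct(reference gene) within each population.
    """
    delta_pos = ct_target_pos - ct_ref_pos
    delta_neg = ct_target_neg - ct_ref_neg
    return 2.0 ** -(delta_pos - delta_neg)


def log10_mean_rq(rq_values: Sequence[float]) -> float:
    """log10 of the mean RQ across biological replicates."""
    return math.log10(mean(rq_values))


# Hypothetical Ct values for one biological replicate.
rq = rq_from_ct(20.0, 18.0, 24.0, 18.0)
print(rq)                                 # target is 2**4-fold higher in GFP+ cells
print(log10_mean_rq([rq, 12.0, 20.0]))    # summary across three replicates
```

On this scale, positive values indicate higher expression in GFP+ HCs and negative values indicate higher expression in GFP− cells.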
To investigate the HC proteome, and to globally investigate auditory/vestibular SE and IE proteomes, we used transgenic HC reporter mice that express eGFP under the control of the Pou4f3-8.5 promoter (Pou4f3/eGFP) (Masuda et al., 2011). In these mice, GFP is robustly and selectively expressed in HCs of auditory and vestibular SE throughout the first three postnatal weeks, as schematized in Figure 1A. We isolated GFP+ HCs by microdissection of SE followed by cell dissociation and FACS analysis (Fig. 1B–D). GFP+ cells, but not GFP− cells, express the HC protein myosin-VIIa (Fig. 1D), consistent with selective eGFP expression in HCs among IE cell types in this transgenic line. FACS gating parameters were chosen to ensure capture of the brightest GFP+ cells from Pou4f3/eGFP mice and no cells from wild-type mice (Fig. 1E,F). We performed MS-based shotgun proteomic analysis across three IE sample types (cochlear and vestibular components) at ~1 week of age: whole IEs (6 ears from 3 mice), microdissected SE (50 ears from 25 mice), and HCs (199,894 HCs from 132 mice) that were FACS sorted from SE to provide a purified HC population for in-depth protein discovery (Fig. 2A–C).
Using semiquantitative analysis based on protein rank abundance, we defined two sets of HC proteins: those found in the HC sample but not IE or SE samples (“HC-only”) and those with significantly higher abundance in the HC sample (“HC-enriched”) compared with SE and IE samples (HC abundance > SE > IE). Of 12,712 total proteins identified, we found evidence for approximately half (6333) in the purified HCs, with 934 proteins identified as “HC-only” (Fig. 2D, Table 1). Among these 934 proteins are many proteins known to be specifically expressed in HCs and known to support HC function critical for audition and balance, such as Espn, Myo3a, Ocm, Prestin (Slc26a5), and Strc (Zheng et al., 2000; Schneider et al., 2006; Sekerková et al., 2006; Verpy et al., 2011; Tong et al., 2016). Although present in all three sample types, HCs comprise only a fraction of the total protein content in the IE and SE samples; therefore, HC-only proteins likely represent both HC-specific proteins and some low-abundance proteins of other cell types that can only be accessed by MS when sample complexity is reduced. We addressed this latter possibility in part by additionally analyzing a purified population of GFP− cells from the same transgenic line and, in Table 1 of HC-only proteins grouped by gene, we indicate with double asterisks the genes for which proteins were identified in both the purified GFP+ and GFP− samples: only 8.3% (38 of 458 genes). Moreover, results of a GO term overrepresentation test (Fig. 
2E) support the former possibility that HC-only proteins are largely HC specific: two of the three significantly enriched biological process categories among HC-only proteins are related to ciliated cells (binomial test with Bonferroni correction: “cilium assembly”: p = 0.0097; “cilium organization”: p = 0.0211; “cellular component assembly involved in morphogenesis”: p = 0.0443), based in part on identification of HC and stereocilia bundle proteins such as Dync2h1 and Rab3ip (Shin et al., 2013; Krey et al., 2015). We note that scarce HC proteins that reside primarily extracellularly may be underrepresented in the HC datasets due to the proteinase digestion step needed to dissociate the SE cells before FACS purification. This may explain the absence of certain proteins from our HC dataset such as cadherin-23 and protocadherin-15, elements of the stereocilia tip links (Kazmierczak and Mu, 2012).
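As a rough illustration of the kind of overrepresentation test named above, the following minimal sketch computes a one-sided binomial tail probability for a single GO term and applies a Bonferroni correction. The function names and all counts are hypothetical; the background fraction, the number of terms tested, and the exact tool settings used in the study are not given in this section and are assumed here for illustration only.

```python
from math import comb

def binomial_upper_tail(k: int, n: int, p: float) -> float:
    """Return P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))

def bonferroni(p_value: float, n_tests: int) -> float:
    """Bonferroni-adjusted p value, capped at 1."""
    return min(1.0, p_value * n_tests)

# Hypothetical inputs (not taken from the study):
n_query_genes = 100         # genes in the query list
background_fraction = 0.02  # fraction of reference genes annotated to the term
n_annotated_in_query = 8    # query genes annotated to the term
n_terms_tested = 500        # number of GO terms tested

raw_p = binomial_upper_tail(n_annotated_in_query, n_query_genes, background_fraction)
adjusted_p = bonferroni(raw_p, n_terms_tested)
print(f"raw p = {raw_p:.3g}, Bonferroni-adjusted p = {adjusted_p:.3g}")
```

A term would be reported as overrepresented only if the adjusted value falls below the chosen significance threshold.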
We next sought to extract “HC-enriched” proteins: those identified across multiple sample types but with likely specificity for HCs because the HC proteome is enriched as HCs increasingly dominate the sample, as demonstrated by increasing eGFP protein abundance (Fig. 2F). Using a conservative criterion for significant enrichment based on rank abundance (see Materials and Methods), we defined 92 HC-enriched proteins (Fig. 2G, Table 2), which include known HC proteins such as otoferlin (Otof), Stard10, Calb1, Myo6, Twf2, and Calb2 (Moser et al., 2006; Peng et al., 2009; Herget et al., 2013). HC-enriched proteins, largely structural/cytoskeletal proteins, drove overrepresentation of several GO biological process categories related to cell development and morphogenesis (Fig. 2H, binomial test with Bonferroni correction: “cellular component morphogenesis”: p < 0.0001; “developmental process”: p < 0.0001; “cellular component organization”: p = 0.0001), none of which were overrepresented among the 35 SE-enriched (SE abundance > average of IE and HC) or 63 HC-depleted (IE abundance > SE > HC) proteins (Fig. 2I,J). We also identified 80 proteins enriched in HCs relative to the SE sample that were not identified in the more complex IE sample (Table 2).
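The ordering rules quoted in parentheses above can be written as a small classifier. This is a simplified sketch: it encodes only the abundance orderings stated in the text and omits the significance criterion described in Materials and Methods. The function name, the precedence of the checks, and the "unclassified" fallback are assumptions made for illustration.

```python
def classify_by_abundance(ie: float, se: float, hc: float) -> str:
    """Assign a category from abundance values in the IE, SE, and HC samples.

    Only the orderings stated in the text are encoded; the study's
    significance threshold is not reproduced here.
    """
    if hc > se > ie:
        return "HC-enriched"   # HC abundance > SE > IE
    if ie > se > hc:
        return "HC-depleted"   # IE abundance > SE > HC
    if se > (ie + hc) / 2:
        return "SE-enriched"   # SE abundance > average of IE and HC
    return "unclassified"

# Hypothetical abundance values for three placeholder proteins.
for name, values in {"P1": (1.0, 2.0, 3.0), "P2": (3.0, 2.0, 1.0), "P3": (1.0, 5.0, 2.0)}.items():
    print(name, classify_by_abundance(*values))
```

Because a protein with HC > SE > IE can also satisfy the SE-enriched inequality, the order of the checks matters; the order used here is a choice of this sketch, not of the study.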
The ~12,000 identified IE proteins represent thousands of genes, including hundreds of genes with products identified only in the HC sample (“HC-only” genes). The 934 HC-only proteins (Fig. 2D) map to 458 genes; however, 351 of these genes are considered “HC-only” because all associated proteins identified by MS were identified only in the HC sample (Table 1, single asterisks), whereas other genes among the 458 have at least one protein form identified only in the HC sample and another form identified in a different sample. We compared the resulting gene products identified by our MS approach with recent cochlear and utricular RNA-sequencing (RNA-seq) analysis performed by Scheffer et al. (2015), in which the investigators FACS purified GFP+ HCs from the same Pou4f3/eGFP mouse line used in the current study and examined differential gene expression in pooled HCs versus pooled GFP− supporting cells (Scheffer et al., 2015; Fig. 3A). To facilitate protein versus transcript comparisons, we mapped our proteomic data to MGI gene identifiers (genes quantified in Figure 3B) and used only transcriptome data from matched ages (P4 and P7) (Fig. 3A). Whereas Scheffer et al. (2015) examined differential gene product expression between GFP+ and GFP− cell populations, we performed progressive enrichment for GFP+ cells by comparing IE, SE, and HC samples. Therefore, to provide a more direct comparison between the two studies, we first compared overlap of gene products between GFP+ and GFP− cells in each study (Fig. 3C). We reanalyzed the RNA-seq data and found that 2.9% of the genes were found exclusively in the GFP+ dataset (Fig. 3C, left). In contrast, the MS results showed that 40.3% of the identified genes were found only in the GFP+ dataset (Fig. 3C, right, top). We then compared our “HC-only” genes that were identified based on enrichment for HC proteins/genes by comparing IE, SE, and HC datasets with those found in the GFP− MS results. 
As the overlap of genes in the MS “HC-only” and GFP− datasets is minimal, our HC-only gene population is relatively unchanged: 89.2% of the HC-only genes identified by the enrichment strategy are also identified as “HC-only” by the GFP+ versus GFP− comparison (Fig. 3C, right, bottom). Importantly, MS analysis of GFP− cell extracts failed to identify any eGFP peptides, which shows that these cell extracts have very little or no HC contamination.
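The gene-level definition used in this paragraph (a gene counts as “HC-only” when every protein form identified for it was found only in the HC sample) can be sketched as follows. The record layout, function name, and placeholder gene and protein names are hypothetical; the actual data structures used in the study are not described in this section.

```python
from collections import defaultdict

def split_hc_only_genes(records):
    """Split genes by whether all, or only some, protein forms are HC-only.

    records: iterable of (gene, protein_id, samples), where samples is the
    set of sample types ("IE", "SE", "HC") in which that protein was seen.
    """
    samples_by_gene = defaultdict(list)
    for gene, _protein_id, samples in records:
        samples_by_gene[gene].append(set(samples))

    all_forms_hc_only, some_forms_hc_only = set(), set()
    for gene, sample_sets in samples_by_gene.items():
        hc_only_flags = [s == {"HC"} for s in sample_sets]
        if all(hc_only_flags):
            all_forms_hc_only.add(gene)
        elif any(hc_only_flags):
            some_forms_hc_only.add(gene)
    return all_forms_hc_only, some_forms_hc_only

# Hypothetical records; gene and protein names are placeholders.
toy_records = [
    ("GeneA", "A-1", {"HC"}),
    ("GeneA", "A-2", {"HC"}),
    ("GeneB", "B-1", {"HC"}),
    ("GeneB", "B-2", {"HC", "SE"}),
    ("GeneC", "C-1", {"IE", "SE", "HC"}),
]
strict, mixed = split_hc_only_genes(toy_records)
print(sorted(strict), sorted(mixed))
```

In these terms, the 458 genes with at least one HC-only protein correspond to the union of the two returned sets, and the 351 “HC-only” genes correspond to the first set alone.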
We then related our “singleton” genes (those with gene products identified only in IE or SE or HC samples) to gene expression enrichment categories defined by transcriptomic data to test the idea that an HC-only protein corresponds to an mRNA transcript defined as “HC-enriched” by differential gene expression (GFP+/GFP− > 2). We arranged our proteomic-derived genes according to corresponding mRNA rank abundance (Fig. 3D) and, as expected, a large fraction of the 351 HC-only genes (46.2%) were highly enriched in HCs as shown by both proteomic and transcriptomic approaches. Complementing this result, genes encoding IE-only proteins were enriched in the transcriptomic-defined “HC-depleted” category (GFP+/GFP− < 0.5).
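The two transcript categories defined above by the GFP+/GFP− expression ratio can be expressed as a short function. The thresholds (> 2 and < 0.5) come from the text; the function name, the label for the intermediate range, and the handling of zero expression values are assumptions of this sketch.

```python
def transcript_category(gfp_pos: float, gfp_neg: float) -> str:
    """Categorize a gene from its expression in GFP+ and GFP- cells."""
    if gfp_neg == 0:
        # Assumption: a gene with no GFP- signal is treated as HC-enriched
        # if it is detected at all in GFP+ cells.
        return "HC-enriched" if gfp_pos > 0 else "undetected"
    ratio = gfp_pos / gfp_neg
    if ratio > 2:
        return "HC-enriched"
    if ratio < 0.5:
        return "HC-depleted"
    return "not differential"

# Hypothetical expression values.
print(transcript_category(30.0, 10.0))
print(transcript_category(4.0, 10.0))
print(transcript_category(10.0, 10.0))
```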
Our FACS purification of HCs yielded enrichment of HC gene products that was in many ways similar to previous mRNA studies. However, the abundance levels of HC transcripts versus proteins were not tightly linked. Compared with HC gene products identified by both approaches, HC gene products identified only by RNA-seq tended to have lower abundance (Fig. 3E, top), whereas HC gene products identified only by proteomics tended to have higher abundance (Fig. 3E, bottom). Within each comparison, the abundance distributions were significantly different (Kolmogorov–Smirnov test: p < 0.0001 for top and bottom panels).
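For readers unfamiliar with the two-sample Kolmogorov–Smirnov test, the sketch below computes its test statistic D, the largest vertical gap between the empirical cumulative distributions of two samples. It does not compute a p value, and the abundance values are hypothetical; the software and settings used for the reported test are not stated in this section.

```python
from bisect import bisect_right

def ks_statistic(sample_a, sample_b) -> float:
    """Two-sample KS statistic: max |ECDF_a(x) - ECDF_b(x)| over all x."""
    a_sorted, b_sorted = sorted(sample_a), sorted(sample_b)
    d = 0.0
    # The maximum gap between two step functions occurs at an observed value.
    for x in a_sorted + b_sorted:
        ecdf_a = bisect_right(a_sorted, x) / len(a_sorted)
        ecdf_b = bisect_right(b_sorted, x) / len(b_sorted)
        d = max(d, abs(ecdf_a - ecdf_b))
    return d

# Hypothetical abundance values for two groups of gene products.
both_methods = [1.0, 2.0, 3.0, 4.0]
one_method_only = [3.0, 4.0, 5.0, 6.0]
print(ks_statistic(both_methods, one_method_only))
```

A larger D indicates a larger separation between the two abundance distributions; the p value then depends on D and on the two sample sizes.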
We next examined unique peptides identified only in the HC sample (and not in the GFP− sample) that reveal expression of known and novel HC-specific protein isoforms, which are not always accessible in transcriptome analysis (Table 3). Among these peptides, five map uniquely to the ribeye domain of C-terminal binding protein 2 (Ctbp2) isoform-2 (Fig. 4A), a component of specialized presynaptic ribbons found at HC synapses (Khimich et al., 2005), which is distinct from the more widely expressed, nuclear-localized isoform-1 (Verger et al., 2006). We also repeatedly observed one peptide uniquely mapping to isoform-2 of the HC protein Otof (Fig. 4B), a protein that is critical for normal synaptic exocytosis (Roux et al., 2006). In addition, we identified the canonical isoform-1 of DNA methyltransferase 1 (Dnmt1) (Fig. 4C), based on an HC-only peptide mapped to the domain that distinguishes isoform-1 from isoform-2.
We selected several proteins for validation of HC-specific expression by immunolabeling, based on criteria such as novelty to specific tissue type, novelty to IE, and/or candidacy for deafness genes. Several peptide sequences for Pak3 (serine/threonine-protein kinase PAK 3) were identified as HC-only (Table 1). We validated vestibular expression of Pak3 via immunolabeling of utricular whole mounts and observed highly specific expression in the HC cuticular plate region (Fig. 5A), consistent with the reported role for PAKs in development of stereocilia bundles in the cochlea (Grimsley-Myers et al., 2009). Neuroligin-3 (Nlgn3), a glutamatergic and GABAergic synaptic adhesion protein (Budreck and Scheiffele, 2007), was identified as an HC-only protein (Table 1) and, when mutated, can result in impaired auditory/vestibular phenotypes in mice (Chadman et al., 2008). We observed punctate expression of Nlgn3 at the base of cochlear inner and outer HCs (Fig. 5B), areas served by both afferent and efferent innervation that use a variety of neurotransmitters (Goutman et al., 2015). Zinc finger protein castor homolog 1 (Casz1), four-jointed box protein 1 (Fjx1), and cilia- and flagella-associated protein 36 (Cfap36, alias Ccdc104) were all identified as HC-only proteins of potential novel deafness genes (described below). We observed specific cytoplasmic labeling of Casz1 in vestibular and cochlear HCs (Fig. 5C), consistent with previous evidence for HC specificity at the transcript level (Cai et al., 2015). We also found Fjx1 expression in cochlear HC stereocilia (Fig. 5D) and Cfap36 expression in kinocilia and/or basal bodies in developing cochlear HCs (Fig. 5E).
Because multiple HC-only proteins identified by MS were not reported previously in HCs by P4–P7 RNA-seq analysis (Fig. 3D, bottom, purple), we used qRT-PCR to validate HC-specific or HC-enriched gene expression for several of these corresponding genes (Fig. 5F): Erich3 (“BC007180”), Rsph10b, Wipf3, 4930407I10Rik, Acad12, Krt81, and 5430421N21Rik.
Hearing loss or vestibular impairment can arise from damage or mutations across different IE cell types; however, we hypothesized that changes in HCs are more likely to produce a measurable phenotype. To test this possibility, we compiled phenotype–genotype associations for transgenic mouse assays related to IE structure and function from the MGD (Eppig et al., 2015) and examined the distribution of these genes across our datasets. Genes related to transgenic mouse hearing/balance behavioral deficits, such as Espn, Grxcr1, and Tomt, were significantly overrepresented by gene products in the HC-only sample compared with IE- or SE-only samples (Fig. 6A) when referenced to the total number of genes found in each of these groups (Fig. 3B; Fisher's exact test, p = 0.0014). Similarly, HC-only genes related to aberrant IE physiology and to overall impaired hearing in transgenic mice, such as Ocm, Slc26a5, and Strc, were significantly overrepresented (Fisher's exact test, p = 0.0019 and p = 0.0015, respectively). In contrast, genes related to impaired eye physiology (used as a negative control), such as Bbs4, were not statistically overrepresented in the HC-only group (Fisher's exact test, p = 0.8127). Overall, proteins associated with impaired hearing/balance phenotypes are relatively more likely to derive from HCs based on their enrichment in the HC sample compared with IE or SE samples.
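The overrepresentation comparisons above rest on Fisher's exact test applied to a 2 × 2 table of gene counts. The following sketch shows one way such a test can be computed from the hypergeometric distribution. The counts are hypothetical, and the choice of a one-sided ("greater") alternative is an assumption; the section does not state the contingency tables or the sidedness used.

```python
from math import comb

def fisher_exact_greater(a: int, b: int, c: int, d: int) -> float:
    """One-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Returns the probability, with all margins fixed, of observing a count
    of at least `a` in the top-left cell.
    """
    row1, col1, total = a + b, a + c, a + b + c + d
    upper = min(row1, col1)
    favorable = sum(
        comb(row1, k) * comb(total - row1, col1 - k) for k in range(a, upper + 1)
    )
    return favorable / comb(total, col1)

# Hypothetical counts (not from the study):
# rows: HC-only genes vs. IE- or SE-only genes
# columns: phenotype-associated vs. not phenotype-associated
p_value = fisher_exact_greater(12, 339, 10, 1490)
print(f"one-sided p = {p_value:.3g}")
```

A small p value indicates that phenotype-associated genes make up a larger share of the first row than expected if group membership and phenotype association were independent.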
Based on these results, we identified several HC-only genes, previously undercharacterized in the mammalian ear, as potential novel sources of mouse auditory/vestibular deficits when mutated, signifying gene products potentially uniquely important for HC function. HC-only genes Evl, Otud7b, and Pex5l were associated with Mammalian Phenotype (MP) “decreased startle reflex” (MP:0001489); Nlgn3 was associated with “decreased startle reflex” and “altered righting response” (MP:0002862); and Shank2 and Ugcg were associated with “impaired righting response” (MP:0001523). Although the HC sample comprises more vestibular than cochlear HCs, the group of 351 HC-only genes (Fig. 3B) is not preferentially enriched for vestibular-HC-specific genes when compared with differential expression of cochlear versus utricular HC transcripts (Scheffer et al., 2015) (cf. Fig. 6B,C), suggesting that these HC-only genes are equally likely to play roles in audition and/or balance.
More directly related to human health, we examined the distribution of known human deafness genes identified by MS. Deafness gene orthologs found with our proteomic approach were significantly overrepresented in the HC-only group compared with the SE and IE groups (Fig. 6D) when referenced to the total number of genes found in each of these groups (Fig. 3B; Fisher's exact test, p = 0.0024), representing genes with products enriched in or exclusive to HCs, including: HC differentiation transcription factor Pou4f3, outer HC somatic motility protein Slc26a5, and stereocilia proteins Myo3a, Grxcr1, Grxcr2, Espn, and Strc (Erkman et al., 1996; Schneider et al., 2006; Sekerková et al., 2006; Verpy et al., 2008; Peng et al., 2011; Takahashi et al., 2016). Other examples of deafness genes known to be enriched in or specific to HCs were found in all sample types, indicating a relatively high protein abundance in HCs to be accessible by MS in analyzing the more complex SE and IE samples, including: Serpinb6, Actg1, Gipc3, Rdx, Myo6, and Otof (Fig. 6D) (Avraham et al., 1997; Kitajiri et al., 2004; Roux et al., 2006; Sirmaci et al., 2010; Vona et al., 2015). Additional deafness gene products were identified in organ of Corti replicates (Fig. 6E), including genes with known HC-specific expression (Yoon et al., 2011).
Because of the significant enrichment for deafness-related genes in the HC-only group and the fact that the proportion of deafness genes identified in GFP+ HCs through proteomics (37 of 2853 genes, 1.30%) is approximately double that identified in age-matched transcriptomic data (Scheffer et al., 2015) (117 of 17,742 genes, 0.66%), we sought to identify potential novel deafness genes by mapping our GFP+ only, HC-only genes (Fig. 3C, right, bottom) to deafness loci with unknown causative genes. We found 30 such gene products that potentially underlie 19 documented forms of nonsyndromic deafness in humans (Fig. 7, Table 4). At least two of these genes have recently been proposed as possible deafness genes (e.g., Mrpl9 and Mrps11; Sylvester et al., 2004). However, the majority of these candidate deafness genes are proposed for the first time here based on our HC-only proteomic data. HC and HC-stereocilia expression of several candidate deafness genes (Casz1, Fjx1, and Cfap36) (Fig. 5C–E) supports their putative roles in HC function and, ultimately, when mutated, potential roles as sources of human deafness.
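The two proportions compared at the start of this paragraph can be checked directly from the counts given in the text. The variable names below are illustrative only.

```python
# Counts as stated in the text.
proteomic_deafness, proteomic_total = 37, 2853
transcriptomic_deafness, transcriptomic_total = 117, 17742

proteomic_fraction = proteomic_deafness / proteomic_total
transcriptomic_fraction = transcriptomic_deafness / transcriptomic_total
fold_difference = proteomic_fraction / transcriptomic_fraction

print(f"proteomics:      {100 * proteomic_fraction:.2f}%")
print(f"transcriptomics: {100 * transcriptomic_fraction:.2f}%")
print(f"fold difference: {fold_difference:.2f}")
```

The percentages round to 1.30% and 0.66%, and their ratio is about 1.97, consistent with the roughly twofold difference described.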
The senses of hearing and balance each rely on specialized sensory HCs in the IE that respond to sound, acceleration, and orientation and faithfully convey these signals to afferent sensory fibers of the eighth cranial nerve. Both acute and life-long accumulation of damage to the protein machinery of HCs, including the stereocilia transduction apparatus, presynaptic signaling complexes, and intracellular mechanisms for supporting highly metabolically active processes, can result in significant auditory or vestibular impairment. Genetic bases for hearing loss are continually being discovered with the maturation of high-throughput genomic approaches, although many documented but poorly understood forms of hereditary deafness remain uncharacterized (Vona et al., 2015). The molecular bases of acquired hearing loss, such as through overexposure to noise or through ototoxins, are similarly poorly understood and an area of active research, and many HC-specific genes and gene products likely remain to be identified. Because HCs are necessary for auditory and vestibular sensation, a thorough understanding of HC gene expression at the proteomic level is critical to clarifying normal and aberrant HC structure and function (Ebrahim et al., 2016), sources of dysfunction in hereditary deafness, and developing potential therapeutic treatments (Alagramam et al., 2016).
Here, we used HC reporter mice to produce a population of FACS-purified cochlear and vestibular HCs, as well as the SE, whole IEs, and GFP− control cells to define the most complete IE hair cell proteome to date. Although we have used multiple strategies to minimize the number of proteins incorrectly assigned to being expressed in HCs, there is no way to be completely confident that our datasets lack false-positives. However, we hope that our results can provide a strategic starting point for other investigators to build on. Previous MS-based investigations of IE tissues have produced excellent proteomic characterization of chicken SE (Spinelli et al., 2012), as well as chicken and mouse vestibular HCs and stereocilia bundles (Shin et al., 2013; Krey et al., 2015) or mouse organ of Corti (Peng et al., 2012; Darville and Sokolowski, 2013). We analyzed several tiers of mammalian IE tissue with progressive enrichment for HCs to generate, not only whole-ear and SE proteomes, but ultimately a characterization of the mammalian cochlear/vestibular HC proteome defined by proteins unique to or specifically enriched in HCs. Overall, we identified thousands of proteins expressed in HCs, hundreds of which were uniquely expressed or highly enriched in HCs. However, it is important to point out that this description of the IE proteome is far from complete and many important proteins remain to be identified. This is due to several factors, including very low abundances of many proteins, potentially poor extraction of multipass transmembrane proteins, proteins with amino acid sequences lacking the appropriately sized tryptic peptide fragments, and the absence of protein amino acid sequences in the reference protein database. Our description also lacks any mention of the posttranslational modifications that decorate nearly all IE proteins. Proteomic MS also admittedly is much less sensitive than RNA-seq-based analyses. 
However, proteomic and RNA-seq-based analyses are together beginning to determine the comprehensive gene and protein expression of all IE cell types.
We found strong correspondence between proteins unique to HCs identified by MS and genes highly differentially expressed in HCs compared with supporting cells identified by RNA-seq (data from Scheffer et al., 2015) and additionally found many HC gene products not previously identified in this transcriptome. Several of these newly identified HC gene products are based on observations of peptides with as few as one to three spectral counts, indicating a high level of sensitivity in the MS-based approach for assessing low abundance proteins. This suggests that, although we cannot rule out a small degree of contamination in our HC sample by proteins from adjacent cell types, it is highly likely that low-abundance proteins in our HC dataset are truly associated with HCs. Among the 170 genes identified only through MS-based gene-product identification (corresponding to 250 gene products; Fig. 3E, bottom, purple), one-third code for histones or structural proteins. This suggests that one reason for detection of gene products by MS but not by RNA-seq may be identification of long-lived proteins (Savas et al., 2012; Zhang et al., 2012) of high abundance that may not require high mRNA levels to maintain protein abundance (Liu et al., 2016). Together, these comparisons reflect the complex nature of transcript–protein expression relationships and demonstrate the contribution of proteomic characterization to fuller understanding of specialized cell populations such as HCs.
Key HC proteins can have distinct localization and roles within the sensory receptor cell related to expression of different isoforms (Ebrahim et al., 2016), underscoring the importance of understanding splice variant expression within HCs. With our combined approach of analyzing purified HCs with high-resolution MS, the resulting HC proteome provides isoform-specific information of gene expression. Among the specific HC protein isoforms that we identified (Table 3) is the synaptic ribbon-associated form of Ctbp2 known to be expressed in HCs, suggesting that other HC isoforms in this subgroup may have specific roles in these sensory cells. For example, we identified isoform 2 of the HC protein otoferlin based on several observations of a unique peptide sequence (Fig. 4B). This peptide sequence partially overlaps an alternative sequence near the transmembrane domain, where several documented missense mutations and deletions reside (Pangršič et al., 2012). This particular isoform may contribute to the unique characteristics of IE afferent synapses, which are characterized by rapid release of neurotransmitter (Jung et al., 2015). We also identified the canonical isoform 1 of Dnmt1, a protein with a role in the IE that is not yet characterized. Mutations in DNMT1 are virtually always associated with progressive hearing loss in two related neurodegenerative diseases: hereditary sensory autonomic neuropathy with dementia and hearing loss and cerebellar ataxia, deafness, and narcolepsy (Baets et al., 2015). Our results suggest for the first time that at least one specific isoform of Dnmt1 exists in HCs and that the frequently observed hearing loss in these syndromes may have cochlear as well as neural origins.
Toward an understanding of the functional relevance of HC-specific proteins to auditory and vestibular function, we examined HC proteins in the context of genes that, when mutated, are known to lead to impaired hearing or balance in mice (through the Mouse Genome Informatics database) or in humans (through compilation of known deafness genes). In each case, we found that genes implicated in aberrant IE function were overrepresented in the proteomic-derived dataset of genes identified only in HCs compared with predictions based on the total number of HC-only genes. Although this is not surprising, it does suggest that the pool of HC-only proteins may provide a resource for discovery of novel proteins critical for normal audition and balance. We localized one such protein, Nlgn3, to the base of cochlear HCs at presumed glutamatergic synapses (based on colocalization with anti-Ctbp2-labeled synaptic ribbons; Fig. 5B). Because Nlgn3 is conventionally a postsynaptic adhesion protein, its identification in the HC sample could reflect the unintended capture of synaptic boutons attached to the HC basolateral membrane. Alternatively, it is possible that Nlgn3 may be expressed in the HC membrane, postsynaptic to GABAergic innervation from medial olivocochlear efferent fibers that transiently innervate inner HCs during development (Wedemeyer et al., 2013). In either case, we suggest that Nlgn3, largely studied for its putative role in autism, may play a previously unappreciated role at HC synapses that potentially underlies auditory and vestibular behavioral anomalies reported as a consequence of an Nlgn3 point mutation (Chadman et al., 2008). We further propose a significant role for many HC proteins in human hearing through association with deafness loci. By mapping our HC-only, GFP+ only genes to corresponding human genes and chromosomal locations, we propose 30 genes as candidates for sources of hereditary nonsyndromic deafness. 
We confirmed expression of three candidates in cochlear HCs: expression of Casz1 in HC cytoplasm (Fig. 5C), as well as localization of Fjx1 and Cfap36 to HC stereocilia (Fig. 5D,E), a necessary structure for HC function that is often the site of perturbation in genetic deafness. Together, our results suggest that HC proteomic data, in particular the HC-only dataset, provide an opportunity to use cell-specific expression patterns to reveal potential deafness genes.
In summary, by combining HC reporter mice, FACS, and semiquantitative proteomic analysis, we compiled the most complete mammalian IE protein expression catalog to date (MASSIVE, accession number MSV000079756, and ProteomeXchange, accession number PXD004210). In total, we found evidence for protein expression from >5000 genes within the IE. Proteomic analysis of purified HCs revealed key details on isoform-specific protein expression, novel HC gene products, and, overall, products from >2500 genes, 313 of which were identified exclusively in GFP+ HCs. Based on our finding that a disproportionately high number of deafness genes are identified only in HCs, other, as yet unrealized deafness genes are likely present in our datasets. We propose that proteins expressed exclusively in HCs represent a previously underused source of vulnerable deafness-causing substrates.
This work was supported by the National Institute on Deafness and Other Communication Disorders–National Institutes of Health (Grant R00 DC-013805 to J.N.S.), the Veterans Administration Research Service (BLS Grant BX001295 to A.F.R.), and the Garnett Passe and Rodney Williams Memorial Foundation (Research Fellowship to A.C.Y.W.). J.R.Y. is supported by the National Institutes of Health (Grants P41 GM103533 and R01 MH067880). We thank Jaime Garcia-Añoveros, Ann Hogan, Kazuaki Homma, and Jing Zheng for helpful comments and feedback on manuscript content and clarity.
The authors declare no competing financial interests.