An online metal-free weak cation exchange-hydrophilic interaction LC/RPLC system has been developed for sensitive, high-throughput top-down MS. Here, we report results for analyzing PTMs of core histones, with a focus on histone H4, using this system. With just ~24 μg on-column of core histones (H4, H2B, H2A, and H3) purified from human fibroblasts, 41 H4 isoforms were identified, with the type and location of PTMs unambiguously mapped for 20 of these variants. Compared to corresponding offline studies reported previously, the online weak cation exchange-hydrophilic interaction LC/RPLC platform offers significant improvement in sensitivity, with several orders of magnitude reduction in sample requirements and a reduction in the overall analysis time. To the best of our knowledge, this study represents the first online 2-D LC-MS/MS characterization of a core histone mixture at the intact protein level.
LC-MS/MS has become an effective and frequently used analytical platform for proteome characterization over the past decade [1, 2]. High-resolution separations prior to MS minimize ion suppression and under-sampling challenges associated with the analysis of highly complex proteomes, which often span several orders of magnitude in protein abundance. For this reason, developments and applications of multidimensional (MD) LC (MDLC) have accompanied and facilitated advances in proteomics, especially top-down (TD) proteomics where proteins are separated and fragmented directly (i.e. without enzymatic digestion as in the bottom-up approach). One advantage of TD proteomics is the characterization of multiple (i.e. combinatorial) PTMs dispersed along the primary protein sequence. Hence, differentiating between multiple protein isoforms, intractable with bottom-up proteomic approaches, becomes possible.
Historically, several types of MDLC systems, including heart-cutting MDLC, directly coupled-column MDLC, and column-switching MDLC, have been used. Heart-cutting MDLC has seen only limited use in proteomics, since typically only a single desired range of the eluates from the first dimension is chosen (according to foreknowledge of the sample’s composition) to be transferred to the second dimension for further separation and characterization. Both directly coupled-column MDLC and column-switching MDLC are comprehensive MD separations with high peak capacity, where every component of a sample is subjected to separation in both dimensions. Multidimensional protein identification technology (MudPIT) has become the most prominent directly coupled-column MDLC for proteomic research.
In MudPIT, a sample is fractionated in the first dimension via strong cation exchange chromatography and further separated in the second dimension using RPLC. This configuration is relatively simple, but the first dimension can only be run in a stepwise fashion, which limits resolution. In addition, possible combinations of separation modes are restricted by the limited combinations of mobile phases that could be used for effective orthogonal separation. MudPIT-like strategies can be used with a continuous gradient, given some additional methodological considerations.
In column-switching MDLC, fractions from the first dimension are transferred online to, or stored in, a series of loops for the second dimension, where the separation is accomplished using two or more columns. This setup allows greater flexibility in terms of combinations of separation modes, and it is the basis of the work presented here. With this setup, Unger et al. resolved about 1000 peaks in 96 min in the analysis of proteins and peptides with a molecular weight below 20 kDa.
Characterization of histone PTMs and their dynamics has been the most notable application of TD MS to date, primarily due to their relatively low mass and high abundance. There is great interest in characterizing the many histone isoforms, as the specific combination of PTMs forms the so-called histone code, which regulates many aspects of chromatin function ranging from transcriptional regulation to chromosome separation during mitosis. Due to the high complexity of histone mixtures, there is a need for separation of both core histone families (H4, H2B, H2A, and H3) and protein isoforms within each family. RPLC has been regularly used to resolve the four histone families as well as some H2A and H3 variants, although H2A and H4 often co-elute [9, 10].
RPLC, however, cannot separate many of the post-translationally modified isoforms (e.g. methylated, acetylated, and phosphorylated) from the unmodified proteins and from each other. (This limited resolution in RPLC mode can actually be advantageous in that all of the variants with different PTMs for each core histone are present in the same fraction, allowing for direct comparison of their relative abundance.) Weak cation exchange-hydrophilic interaction LC (WCX-HILIC) has proven to be an excellent complementary orthogonal mode to RPLC and has been successfully used to separate acetylated isoforms of histone H4, methylated isoforms of histone H4, phosphorylated isoforms of histone H1, as well as sequence variants of histone H1.
WCX-HILIC, a mixed-mode chromatography introduced by Alpert in 1990, features the simultaneous presence of dominant hydrophilic interaction (due to the use of a higher percentage of organic solvent in the mobile phase than traditional WCX, the mechanism for separation of differentially methylated isoforms) and electrostatic interaction (due to an ionic stationary phase, the mechanism for separation of differentially acetylated isoforms) between the stationary phase and the analyte. Because WCX-HILIC normally uses a salt (NaClO4) gradient that is not compatible with MS, fractions have to be collected, recovered, and desalted before they are individually introduced into the mass spectrometer using direct infusion ESI.
Using an RPLC/HILIC approach, the Kelleher laboratory identified 42 histone H4 isoforms by offline infusion of 93 HILIC fractions from 150 μg of histone H4 collected from multiple RPLC runs of crude HeLa S3 histone proteins. While this is certainly a landmark work in the area of TD MS, such fraction collection and infusion steps are extremely time consuming and labor intensive, and they require a large amount of starting material due to the sample losses during the offline transfer, recovery, and desalting processes. Insufficient sensitivity and relatively low throughput significantly limit the applicability of this methodology for biological applications.
To overcome these limitations of offline MDLC and enable sensitive characterization of protein isoforms using TD MS, a new online 2-D WCX-HILIC/RPLC platform was developed. RPLC was used for the second dimension due to its compatibility with MS, and WCX-HILIC was implemented as the first dimension. Fractions from the first dimension were desalted and transferred to the second dimension online to achieve higher sensitivity and throughput. The 2-D LC platform was coupled with Fourier-transform (FT) MS and tested by analyzing histones purified from human fibroblasts. Using ~24 μg of core histone mixture, 41 H4 isoforms were identified, and the type and locations of PTMs were unambiguously mapped for 20 of these variants. Compared to corresponding offline studies reported previously, the online WCX-HILIC/RPLC platform offered higher sensitivity and throughput for characterization of histones.
The experimental workflow begins with a core histone sample (details following) separated in the first dimension using WCX-HILIC. Online fractions from this separation were collected in a series of 20 capillary loops with the start and stop time of each fraction guided by the chromatographic profile generated by an integrated UV detector. Each of the 20 WCX-HILIC fractions was then further separated in the second dimension using RPLC coupled by ESI to an LTQ Orbitrap mass spectrometer (Thermo Electron, San Jose, CA, USA). Mass spectra were acquired in high-resolution MS and high-resolution MS/MS (CID) mode. A total of 20 LC-MS/MS data sets were acquired, one for each of the 20 fractions collected from the first dimension separation. LC-MS/MS data sets were then subjected to database search using ProSightPC 2.0 (Thermo Electron) for modified protein identification. Protein identifications from all fractions were combined into a final list of uniquely identified proteins (see Supporting Information).
The schematic diagram of the metal-free 2-D WCX-HILIC/RPLC system is shown in Fig. 1. All valves from V1 to V8 (except V4 and V5) were Cheminert 5000 psi nanovolume valves with polyether ether ketone (PEEK) stators and rotors, 1/32″ PEEK nanovolume fittings, and 100 μm bores (Valco Instruments, Houston, TX, USA). The column selectors, valves V4 and V5, are discussed in detail in the following section. The entire LC path exposed to analyte, including a silica capillary sample loop, was free of metal to ensure enhanced sensitivity for phosphoproteins by avoiding analyte losses associated with metal surface interactions.
All flexible fused-silica capillary tubing (TSP, 360 μm od and 75, 150, or 200 μm id) was purchased from Polymicro Technologies (Phoenix, AZ, USA). ACN (HPLC grade) and isopropyl alcohol (IPA, HPLC grade) were purchased from Fisher Scientific (Pittsburgh, PA, USA); triethylammonium phosphate buffer solution (1 M in H2O) was purchased from Fluka (Milwaukee, WI, USA); NaClO4 (minimum 99%), TFA (≥99%), and glacial acetic acid (HOAc, 99.99%) were purchased from Sigma (Bellefonte, PA, USA).
Primary neonatal Normal Human Dermal Fibroblasts were purchased from Lonza (Walkersville, MD, USA) and grown in Fibroblast Growth Media-2 (Lonza). Core histones were purified from confluent Normal Human Dermal Fibroblasts cultures using a histone purification kit from Active Motif (Carlsbad, CA, USA). Briefly, cultured cells were centrifuged, washed with serum-free media, and homogenized in the extraction buffer; the cell extracts were then micro-centrifuged separating the crude histones in the supernatant from other proteins. After neutralization, the crude histones were further purified using a resin column to give the final core histone mixtures including H4, H2B, H2A, and H3.
The core histone mixture was first separated on a PolyLC (Columbia, MD, USA) PolyCAT A (5-μm particles; 1000 Å pore size) column (800 mm × 200 μm id) packed in house. The separation was carried out under constant pressure (i.e. 3000 psi) using two ISCO (Lincoln, NE, USA) Model 100 DM 10 000 psi syringe pumps (with a Series D Pump Controller). Mobile phase A consisted of 15 mM triethylammonium phosphate containing 70% v/v ACN, and mobile phase B consisted of mobile phase A plus 0.68 M NaClO4. An aqueous solution of the core histone mixture (1.2 μg/μL) was loaded into the silica capillary sample loop (20 μL, 112 cm × 150 μm id) using a 100-μL Hamilton (Reno, NV, USA) syringe. The sample was then transferred to the first dimension SPE column (SPE1) using mobile phase A at 1000 psi with a flow rate of ~5 μL/min for 10 min. After loading, the separation gradient was started. An exponential gradient was generated by adding mobile phase B (3000 psi) to a stirred mixer (2.5 mL of 100% mobile phase A initially), with the split flow rate controlled at ~10 μL/min. The column flow rate was ~1 μL/min at the beginning of the gradient. An online UV detector (SPECTRA100, Thermo Separation Products, 214 nm) was used to monitor the elution profile of proteins. Protein fractions were collected based on chromatographic peaks into the storage capillaries as described below. Upstream of Mixer 1, three Cheminert 15 000 psi UHPLC nanovolume valves (not shown in Fig. 1) with stainless steel stators and Valco E3 rotors were used to control buffer refill and selection. All buffers were subjected to online degassing using a Phenomenex (Torrance, CA, USA) DEGASSEX DG-4400.
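The shape of the exponential gradient described above can be approximated with the standard model of a well-stirred, constant-volume mixer. The following is only a sketch under stated assumptions: ideal mixing, a constant flow through the mixer taken as the sum of the split flow (~10 μL/min) and the column flow (~1 μL/min), and hypothetical function names that are not part of any instrument software. Under constant-pressure operation the real flow will drift, so the numbers are indicative only.

```python
import math

MIXER_VOLUME_UL = 2500.0      # 2.5 mL of mobile phase A initially (from the text)
FLOW_UL_PER_MIN = 10.0 + 1.0  # ASSUMPTION: split flow + column flow both pass through the mixer
SALT_IN_B_MOLAR = 0.68        # NaClO4 concentration in mobile phase B (from the text)


def fraction_b(t_min, flow=FLOW_UL_PER_MIN, volume=MIXER_VOLUME_UL):
    """Fraction of mobile phase B leaving an ideal stirred mixer after t_min minutes."""
    return 1.0 - math.exp(-flow * t_min / volume)


def salt_molar(t_min):
    """Approximate NaClO4 concentration (M) delivered to the column at time t_min."""
    return SALT_IN_B_MOLAR * fraction_b(t_min)


if __name__ == "__main__":
    for t in (0, 30, 60, 120, 180):
        print(f"{t:4d} min  B = {fraction_b(t):.3f}  NaClO4 = {salt_molar(t):.3f} M")
```

The model shows why the gradient is steep at first and flattens later: the salt concentration approaches the 0.68 M of mobile phase B asymptotically rather than linearly.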
Two Cheminert column selector valves were used in parallel to alternate between silica capillary tubes (56 cm × 150 μm id) to individually store fractions. Each end of a fraction storage capillary was connected to one port of a stream selection valve. The volume of eluent stored in each fraction storage capillary is fully customizable and is mainly determined by the column flow and fractionation rate. Each selector comes with an electric control panel for simple and fast port selection. All the selection valves have PEEK stators and Valco E rotors. Each pair of column selector valves has ten fraction storage capillaries, and here two such systems were used to store a total of 20 fractions. Fractions were collected alternately between the two fraction collection systems. This arrangement permitted concurrent fraction collection and initiation of the second RPLC dimension.
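As a quick consistency check on the capillary dimensions quoted above, the internal volume of a cylindrical capillary follows from its length and inner diameter. The sketch below uses a hypothetical helper name and assumes that eluent enters a storage capillary at the ~1 μL/min column flow, which under constant-pressure operation is only approximate.

```python
import math


def capillary_volume_ul(length_cm, inner_diameter_um):
    """Internal volume of a cylindrical capillary in microliters."""
    radius_cm = (inner_diameter_um * 1e-4) / 2.0  # 1 um = 1e-4 cm
    volume_cm3 = math.pi * radius_cm ** 2 * length_cm
    return volume_cm3 * 1000.0                     # 1 cm^3 = 1000 uL


sample_loop_ul = capillary_volume_ul(112, 150)  # sample loop, quoted as 20 uL in the text
storage_ul = capillary_volume_ul(56, 150)       # one fraction storage capillary

# ASSUMPTION: the fraction fills the storage capillary at the ~1 uL/min column flow.
COLUMN_FLOW_UL_PER_MIN = 1.0
max_window_min = storage_ul / COLUMN_FLOW_UL_PER_MIN

print(f"sample loop:        {sample_loop_ul:.1f} uL")
print(f"storage capillary:  {storage_ul:.1f} uL")
print(f"longest fraction without overfilling: {max_window_min:.1f} min")
```

The computed sample loop volume agrees with the nominal 20 μL, and each storage capillary holds roughly half of that, which bounds the collection window of a single fraction at a given column flow.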
Each stored fraction from the first dimension WCX-HILIC was further separated in the second dimension RPLC using a 600 mm × 75 μm id column packed in house with Phenomenex Jupiter C5 particles (5 μm; 300 Å pore size). The separation was carried out under a constant pressure of 4000 psi using two ISCO Model 100 DM 10 000 psi syringe pumps (with a Series D Pump Controller). Mobile phase A consisted of 20% ACN aqueous solution with 5% IPA, 0.6% HOAc, and 0.01% TFA; mobile phase B consisted of 45% ACN, 45% IPA, and the same percentages of HOAc and TFA as mobile phase A.
A Cheminert ten-port nanovolume injection valve was used to alternately equilibrate/load and run the analytical gradient between two capillary RPLC columns to increase sample throughput in the second dimension. Once a first dimension fraction was collected, RPLC Buffer A (1000 psi, about 5 μL/min) was turned on (valve not shown) for 10 min to push and load the stored fraction into the SPE column of the second dimension. The SPE columns (SPE2 and SPE3, 5 cm × 150 μm id) were packed with the same stationary phase as the RPLC column with online sol–gel frits (5 mm) at both ends. As the first dimension fractions have a high percentage of organic solvent (70% ACN) and cannot be loaded into the second dimension hydrophobic SPE column directly, a tee was used to split the Buffer A stream into two sub-streams, with one stream (about 1 μL/min, 75 μm id) pushing the stored fraction out and the other stream (about 4 μL/min, 150 μm id) diluting the fraction before it enters the SPE column. Once a fraction was loaded, RPLC Buffer B was switched on to initiate the analytical gradient and separate the loaded proteins through the capillary RPLC column. High-resolution ESI-MS/MS acquisition using an LTQ Orbitrap was started simultaneously with the gradient and continued throughout the 180-min RPLC separation. ESI was performed by connecting the end of the RPLC column to a 20 μm id chemically etched capillary emitter with a PEEK union. The high voltage required for ESI was applied through a metal union coupled in the split/purge line, which was not exposed directly to the analyte, so there was no contact between the sample and the metal union. This design has been demonstrated to improve sensitivity and limit of detection for phosphoprotein analysis.
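The effect of the dilution tee can be estimated as a flow-weighted average of the two merging streams. This is a back-of-the-envelope sketch that assumes ideal mixing and additive volumes, takes the stored fraction as 70% ACN and the diluting Buffer A as 20% ACN (both from the text), and uses a hypothetical helper name.

```python
def mixed_percent(streams):
    """Flow-weighted average composition of merged streams.

    streams: list of (flow_ul_per_min, percent_component) tuples.
    Assumes ideal mixing and additive volumes.
    """
    total_flow = sum(flow for flow, _ in streams)
    return sum(flow * pct for flow, pct in streams) / total_flow


# Stored fraction (~70% ACN) pushed at about 1 uL/min, diluted with
# RPLC Buffer A (20% ACN) at about 4 uL/min.
acn_after_tee = mixed_percent([(1.0, 70.0), (4.0, 20.0)])
print(f"ACN entering the SPE column: {acn_after_tee:.0f}% v/v")
```

Under these assumptions the organic content reaching the SPE column is reduced from 70% to about 30% ACN, which is the purpose of the 1:4 split.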
The temperature of the MS inlet capillary was set at 300°C to reduce TFA adducts often seen for core histones, and especially pronounced for H4. Both the MS and MS/MS acquisitions had a resolving power setting of 60 000. Automatic Gain Control targets of 1E6 and 3E5 were used for FTMS full scans and MSn scans, respectively. Single microscans were used for both FTMS full scans and MSn scans. Fragmentation of the five most abundant protein ions, which were isolated in a ±1.5 m/z window, was performed by CID with a normalized collision energy of 35% and an activation time of 30 ms. Dynamic exclusion was implemented with an exclusion duration of 200 s and an exclusion list size of 500. A charge state greater than 3 for the precursor ion was required to trigger MS/MS data acquisition in order to increase the number of tandem MS scans obtained for intact protein precursor ions.
Intact protein RPLC-FTMS mass spectra were first processed using in-house developed software (ICR-2LS, available for download at http://ncrr.pnl.gov/software/) to deisotope and deconvolute both the parent and fragment ion spectra and to derive monoisotopic masses for both the intact proteins and their fragments. Protein isoforms and PTMs were identified by searching each acquired data set against an annotated TD human database (117 059 basic sequences and 7 563 274 protein forms) using ProSightPC 2.0. The Thrash algorithm option was chosen to calculate neutral mass for both precursor and fragment ions. The mass and retention time tolerances were set to 0.50 m/z and 2.0 min, respectively. The minimum S/N, minimum RL, maximum charge, and maximum mass were set to 1.0, 0.9, 40, and 25 kDa, respectively. The minimum number of fragments was set to 6 and the minimum intact mass to 5000 Da. After filtering, the protein database search in the “Absolute Mass” mode was carried out for each tandem mass spectrum.
All candidate protein isoforms in the database with monoisotopic masses within ±1 Da of the precursor observed in the FTMS acquisition were retrieved, and all experimental fragment ions derived from this precursor ion were then matched to in silico fragments of each candidate protein isoform individually with 10 ppm mass tolerance. The dynamic PTMs considered were methylation (mono-, di-, and tri-), acetylation, and phosphorylation. The best match for each precursor ion (based on P score) was imported into the Repository Report to generate an identification list for each raw data set. All identification lists were then combined and grouped by theoretical mass, accession number, and PTMs in Microsoft Access. The identification with the lowest P score in each group was chosen as the final identification.
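The two bookkeeping steps described above, matching fragments within a ppm tolerance and keeping the lowest P score per group, can be sketched as follows. This is an illustration only and not the ProSightPC or Microsoft Access implementation; the record layout, function names, accession placeholder, and all numeric values in the example are hypothetical.

```python
from collections import namedtuple

# Hypothetical record layout for one identification from one data set.
Identification = namedtuple(
    "Identification", "theoretical_mass accession ptms p_score dataset"
)


def ppm_error(observed, theoretical):
    """Signed mass error in parts per million."""
    return (observed - theoretical) / theoretical * 1e6


def count_matches(observed_masses, theoretical_masses, tol_ppm=10.0):
    """Count observed fragments within tol_ppm of any theoretical fragment."""
    return sum(
        any(abs(ppm_error(obs, theo)) <= tol_ppm for theo in theoretical_masses)
        for obs in observed_masses
    )


def best_per_group(identifications):
    """Keep the lowest-P-score identification per (mass, accession, PTMs) group."""
    best = {}
    for ident in identifications:
        key = (ident.theoretical_mass, ident.accession, ident.ptms)
        if key not in best or ident.p_score < best[key].p_score:
            best[key] = ident
    return list(best.values())


# Made-up example values, for illustration only.
ids = [
    Identification(11300.0, "ACC_X", "aS1-aK16", 1e-20, "fraction_17"),
    Identification(11300.0, "ACC_X", "aS1-aK16", 1e-35, "fraction_18"),
    Identification(11314.0, "ACC_X", "aS1-mK20", 1e-12, "fraction_17"),
]
final_ids = best_per_group(ids)
n_matched = count_matches([1000.005, 2000.1], [1000.0, 2000.0])
print(len(final_ids), "final identifications;", n_matched, "fragment matched")
```

In the example, the same isoform found in two fractions collapses to the entry with the lower P score, and only the fragment within 10 ppm of a theoretical mass is counted.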
To compare 2-D versus 1-D separations, 2.4 μg of the core histone sample was initially analyzed using RPLC only (second dimension in Fig. 1). The total ion current chromatogram and plot of the neutral mass versus RPLC elution time are shown in Fig. 2. Clearly, the RPLC separation alone was able to resolve the four core histone families. The acquired 1-D data set was then subjected to database search using ProSightPC as described in the experimental section and the final list of identified H4 isoforms is shown in Table 1. Detailed information regarding the amino acid sequence for each isoform, identified b/y ion cleavage locations, and identified PTMs highlighted in different colors are provided as Supporting Information.
In total, 23 histone H4 isoforms were identified confidently with P scores below 1E−4 utilizing the search criteria previously specified. Of these 23 isoforms, 13 H4 isoforms were unique (i.e. the amino acid sequence and the type, number, and location of all PTM(s) were assigned unambiguously with Number of Best Hits as 1). If there were insufficient fragments to uniquely locate PTM(s), one representative isoform was selected and the total number of all other possible isoforms was reported. Therefore, the number of best hits for the remaining 10 isoforms is greater than 1. It should be noted that the current ProSightPC protein database is built from UniProt, previously Swiss-Prot, databases and there is no specific relationship between accession number and sequence (i.e. two sequences with mutation of amino acids may share the same accession number). In this study, all identified H4 isoforms have the same accession number, but some isoforms have a mutation of amino acid 77 from alanine to proline. Amino acid sequences for each identified H4 isoform can also be found in the Supporting Information.
Twenty-four micrograms of core histones were initially separated using WCX-HILIC (first dimension in Fig. 1). Protein elution was monitored with an online UV detector (214 nm) placed right after the column. The UV signal was captured and recorded (Fig. 3) using in-house software embedded in ICR-2LS, and 20 fractions were collected for subsequent RPLC separation. Each fraction from the first dimension WCX-HILIC was then separated and analyzed using RPLC coupled online to an LTQ Orbitrap for high-resolution MS and high-resolution MS/MS acquisition. A total of 20 LC-MS/MS data sets were acquired, one for each of the first dimension fractions. These data sets were then processed and subjected to ProSightPC search to identify (modified) proteins.
Following the second dimension separation and MS analyses, fractions 9, 12, 8, and 17 were found to contain mainly H2A, H2B, H3, and H4 histones, respectively (see Fig. 3). The total ion current chromatograms (MS only) and plots of monoisotopic mass (MS only) versus elution time for these fractions are shown in Fig. 4. In these examples, each fraction from the first dimension was further resolved into 2–3 groups of isoforms in the second dimension. Besides the four fractions mentioned above, fractions 2 (H2B), 5 (H2A, H4), 13 (H2B), 15 (H4), 18 (H4), 19 (H4), and 20 (H4) also contained histone proteins. The remaining fractions consisted mainly of contaminant proteins (i.e. abundant proteins not completely removed in the histone enrichment step). For example, fraction 10 contained mainly 60S ribosomal protein L35, while fraction 15 contained mainly ubiquitin.
The complete list of H4 isoforms identified from all fractions (and sorted by P score) is shown in Table 2. The detailed information (including fragment name, m/z, monoisotopic and theoretical mass, and mass measurement error) for the matching b and y ions for the top-ranked isoform (aS1-aK16-2mK20-H4) is given in Table 3 to illustrate high data quality. The amino acid sequence with PTMs and identified b/y ions is shown in Fig. 5C. The corresponding mass spectra (parent ion and tandem MS) are shown in Fig. 5A and B. The same information, as shown in Table 3 and Fig. 5C, for all the other histones listed in Table 2 is provided in the Supporting Information.
Histone modifications (acetylation, methylation, phosphorylation, etc.) play an important role in transcriptional and epigenetic regulation, DNA repair processes, DNA synthesis, and cell division. Distinct patterns of PTMs result in a histone code that guides chromatin transitions during these cellular processes. Because of their relatively small mass (<25 kDa) and high abundance, histones are ideal targets for TD MS studies and have been extensively analyzed using various MS approaches over the last decade. In this work, 41 H4 isoforms were identified with high confidence (i.e. with P scores below 1E−4) from 24 μg of core histones (an equivalent of approximately 6 μg of H4), and the type and location of PTMs were unambiguously mapped for 20 of these variants. This represents a significant improvement in sensitivity when compared to the corresponding offline system in terms of the total number of identifications per unit of sample. Namely, in a previously published work, 42 histone H4 isoforms were identified by offline infusion of HILIC fractions from 150 μg of histone H4 collected from multiple RPLC runs of crude HeLa S3 histone proteins. Fraction collection and infusion steps are time consuming and labor intensive, and they require a large amount of starting material due to sample losses during the offline transfer, recovery, and desalting processes. Insufficient sensitivity and relatively low throughput significantly limit the usefulness of this methodology for biological applications. The new online 2-D WCX-HILIC/RPLC platform described herein overcomes the limitations of offline MDLC and enables sensitive characterization of protein isoforms using TD MS.
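The comparison in terms of identifications per unit of sample can be made explicit with the figures quoted above. This is a rough estimate only: it uses the approximate 6 μg H4 equivalent for the online run, takes the 150 μg of H4 for the offline study at face value, and ignores differences in cell type, instrumentation, and data processing.

```python
# Figures taken from the text.
online_ids, online_h4_ug = 41, 6.0      # this work: ~6 ug H4 equivalent in 24 ug core histones
offline_ids, offline_h4_ug = 42, 150.0  # previously published offline study

online_rate = online_ids / online_h4_ug      # H4 isoforms identified per ug of H4
offline_rate = offline_ids / offline_h4_ug
fold = online_rate / offline_rate

print(f"online:  {online_rate:.2f} identifications per ug H4")
print(f"offline: {offline_rate:.2f} identifications per ug H4")
print(f"ratio:   {fold:.1f}-fold")
```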
Compared to 1-D RPLC (Table 1), 2-D WCX-HILIC/RPLC (Table 2) yielded approximately a two-fold increase in the number of identified H4 isoforms. For instance, isoform aS1-aK12-aK16-mK20-H4 was identified in the 2-D LC-FTMS but not in the 1-D LC-FTMS analysis primarily because it was well separated from other much more abundant isoforms (see Fig. 6). With the additional dimension of separation, this isoform appeared as the most abundant peak in the spectrum (Fig. 6D), and was chosen for fragmentation by data-dependent acquisition, which led to confident identification. In the case of 1-D separation, this particular isoform co-eluted with more abundant isoforms and hence was not chosen for fragmentation due to the limited time window for MS/MS (i.e. under-sampling). Thus in essence, adding a second stage of separation extended the attainable dynamic range and reduced the under-sampling problem. Depletion of the most abundant proteins using antibodies prior to MD separation, which has been a regular practice for bottom-up proteomics, will further boost the dynamic range and will be incorporated in our future work.
Basically, the conventional data-dependent strategy, where precursor selection is typically based on the ion intensity, does not work effectively for intact protein MS/MS due to the presence of multiple charge states of one (or few) highly abundant (and typically related) proteins at any particular elution time. Hence, smarter MS/MS strategies, such as selection of a precursor ion based on charge state instead of intensity (e.g. fragmenting a single charge state per protein), will be required for improved TD MS.
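The under-sampling effect and the proposed remedy can be illustrated with a toy comparison of two precursor selection rules. This sketch is not the instrument's acquisition logic: the peak list, masses, intensities, function names, and the 1 Da grouping tolerance are all hypothetical, and peaks are assumed to arrive with charge states already assigned.

```python
PROTON = 1.007276  # Da


def mz_of(mass, charge):
    """m/z of a protonated ion with the given neutral mass and charge."""
    return mass / charge + PROTON


def neutral_mass(mz, charge):
    """Neutral mass recovered from m/z and charge."""
    return (mz - PROTON) * charge


def top_n_by_intensity(peaks, n=5, min_charge=4):
    """Conventional rule: the n most intense peaks with charge > 3."""
    eligible = [p for p in peaks if p[1] >= min_charge]
    return sorted(eligible, key=lambda p: p[2], reverse=True)[:n]


def one_charge_state_per_protein(peaks, n=5, min_charge=4, mass_tol_da=1.0):
    """Alternative rule: at most one charge state per neutral mass."""
    chosen = []
    for mz, z, intensity in sorted(peaks, key=lambda p: p[2], reverse=True):
        if z < min_charge:
            continue
        mass = neutral_mass(mz, z)
        if any(abs(mass - neutral_mass(c[0], c[1])) <= mass_tol_da for c in chosen):
            continue  # this protein already has a selected charge state
        chosen.append((mz, z, intensity))
        if len(chosen) == n:
            break
    return chosen


# Hypothetical co-eluting species: one abundant protein seen in seven charge
# states and one minor protein seen in a single charge state.
abundant = [(mz_of(11300.0, z), z, 1e6 - 1e4 * abs(z - 13)) for z in range(10, 17)]
minor = [(mz_of(11342.0, 12), 12, 5e4)]
peaks = abundant + minor

conventional = top_n_by_intensity(peaks)
alternative = one_charge_state_per_protein(peaks)
print("conventional picks masses:", sorted({round(neutral_mass(m, z)) for m, z, _ in conventional}))
print("alternative picks masses: ", sorted({round(neutral_mass(m, z)) for m, z, _ in alternative}))
```

With the conventional rule, all five MS/MS slots go to charge states of the abundant protein, whereas the alternative rule spends one slot on it and still reaches the minor species.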
Furthermore, the fact that the elution pattern in the first dimension WCX-HILIC is not governed exclusively by core histone family or by acetylation status, as is true in the corresponding offline 2-D separation scheme (RPLC-HILIC), complicates the downstream second dimension RPLC separation and LC-MS data analysis. However, this difficulty should disappear if the salt gradient WCX-HILIC is replaced with a pH gradient WCX-HILIC. Recently, Young et al. reported a novel approach for the high-throughput characterization of histones that uses a salt-free pH gradient WCX-HILIC, which can be directly coupled to an ion trap acquiring data-dependent ETD tandem mass spectra. The 2-D LC platform described herein is highly versatile and can be easily adapted for reversed operation. In fact, we are currently investigating the possibility of applying this mode of chromatography (i.e. RPLC followed by WCX-HILIC) to first separate major histone families, and then differentially modified forms within each family, to ease data analysis and interpretation and improve our ability to perform comparative LC-MS measurements.
In summary, an online metal-free 2-D WCX-HILIC/RPLC system has been set up for sensitive high-throughput protein characterization using TD MS and applied to the characterization of core histones purified from human fibroblasts. Simultaneous separation of both core histones and their post-translationally modified isoforms presents a great challenge for any 1-D separation. Addition of a second orthogonal separation mode greatly increased the dynamic range and enabled identification of low-abundance isoforms, which otherwise are often missed due to insufficient intensity, ionization suppression, and under-sampling. When coupled with FTMS, the 2-D WCX-HILIC/RPLC platform enabled confident identification (i.e. with P scores below 1E−4) of 41 H4 isoforms starting from 24 μg of core histones (an equivalent of approximately 6 μg of H4). This represents a significant improvement in sensitivity when compared to the corresponding offline system considering the total number of identifications per unit of sample.
The authors thank Professor Neil L. Kelleher for providing the ProSightPC and Dr. Paul Thomas for help with running the program. Portions of this work were supported by the William R. Wiley Environmental Molecular Sciences Laboratory (EMSL) Intramural Research and Capability Development Program, the U.S. Department of Energy (DOE) Office of Biological and Environmental Research, and the NIH National Center for Research Resources (grant RR018522). The research was performed using EMSL, a national scientific user facility sponsored by the Department of Energy’s Office of Biological and Environmental Research and located at Pacific Northwest National Laboratory in Richland, Washington. PNNL is a multi-program national laboratory operated by Battelle for the DOE under Contract DE-AC05-76RLO 1830.
The authors have declared no conflict of interest.