A typical mass spectral analysis of a complex protein sample involves an initial fragmentation step, a separation method to fractionate the complex mixture, and a subsequent mass spectral analysis of the resulting fractions. Commonly, proteins in a complex mixture are fragmented with proteolytic enzymes such as trypsin. After digestion, a typical sample from a whole cell lysate contains thousands to millions of peptides. These complex mixtures need to be fractionated before further analysis. Early proteomic separation techniques were based on gel electrophoresis, either one-dimensional (1-DE) or two-dimensional (2-DE), in which proteins or peptides are separated based on their charge and molecular weight (19). The separated proteins are usually visualized with dyes. The 1-DE or 2-DE gel bands are excised and digested in-gel, and the resulting peptides are analyzed by either MALDI-MS or ESI-MS. Other methods to fractionate the peptide mixture include liquid chromatography and capillary electrophoresis (CE) (51). While successful protein and peptide separation and fractionation are essential to a mass spectral analysis, this review focuses on the analytical aspects and challenges of MS for proteomics. Other reviews discuss the commonly used approaches and challenges of fractionation (see Ref. 69a).
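The proteolytic fragmentation step can be illustrated with a minimal in silico digest. The sketch below assumes the commonly used simplified trypsin rule (cleavage C-terminal to lysine or arginine, except when the next residue is proline); the function name `tryptic_digest`, the `missed_cleavages` parameter, and the example sequence are hypothetical and chosen only for illustration. Real digests are less clean than this rule suggests.

```python
def tryptic_digest(sequence, missed_cleavages=0):
    """Return peptides from an idealized tryptic digest of a protein sequence."""
    if not sequence:
        return []
    # Cut positions along the sequence; 0 and len(sequence) bound the protein.
    sites = [0]
    for i in range(len(sequence) - 1):
        # Simplified rule (assumption): cut after K or R unless followed by P.
        if sequence[i] in "KR" and sequence[i + 1] != "P":
            sites.append(i + 1)
    sites.append(len(sequence))

    peptides = []
    for start in range(len(sites) - 1):
        # Allow up to `missed_cleavages` skipped cut sites per peptide.
        for extra in range(missed_cleavages + 1):
            end = start + 1 + extra
            if end >= len(sites):
                break
            peptides.append(sequence[sites[start]:sites[end]])
    return peptides


# Hypothetical sequence, chosen only to exercise the rule.
print(tryptic_digest("MKWVTFRPSLLKAA"))
# -> ['MK', 'WVTFRPSLLK', 'AA']
```

The arginine followed by proline is not cleaved, which is why the second peptide spans it. Applied to every protein in a lysate, a digest of this kind readily produces the very large peptide counts described above.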
MS is a powerful approach to obtain protein sequence data from unknown samples and to correlate the experimental data with sequence information in public databases (“bottom-up proteomics”). Such protein or peptide sequence information can be obtained from tandem mass spectral analysis. In a typical tandem MS analysis, the first step is the detection of the intact peptide ion. Subsequently, the peptide ions are fragmented by collision-induced dissociation (CID), which breaks the polypeptide backbone at the amide bond and thereby creates a ladder of fragment ions that reflects the peptide's amino acid sequence (11). The resulting spectra are then compared with publicly available protein sequence information: the observed masses of the proteolytic fragments are compared with masses predicted in silico from database sequences to identify the peptide sequence and thereby the protein (14). With the advancements in separation techniques, up to 2,000 proteins can now be identified in a standard experiment.
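The fragment ion ladder produced by backbone cleavage at the amide bond can be sketched numerically. The example below computes singly charged b-type (N-terminal) and y-type (C-terminal) fragment m/z values for an unmodified peptide. The b/y nomenclature, the function name `fragment_ladder`, and the test peptide are assumptions introduced here for illustration; the residue masses are standard monoisotopic values rounded to five decimals and should be checked against a reference table before serious use.

```python
# Monoisotopic residue masses in Da (rounded; unmodified residues only).
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.010565   # H2O, monoisotopic
PROTON = 1.007276   # mass of a proton


def fragment_ladder(peptide):
    """Return (b_ions, y_ions) as lists of singly charged m/z values."""
    masses = [RESIDUE_MASS[aa] for aa in peptide]
    b_ions, y_ions = [], []

    # b ions: N-terminal prefixes (b1 .. b(n-1)), plus one proton.
    running = 0.0
    for m in masses[:-1]:
        running += m
        b_ions.append(running + PROTON)

    # y ions: C-terminal suffixes (y1 .. y(n-1)), plus water and one proton.
    running = 0.0
    for m in reversed(masses[1:]):
        running += m
        y_ions.append(running + WATER + PROTON)

    return b_ions, y_ions


b, y = fragment_ladder("PEPTIDE")  # hypothetical test peptide
for i, (bi, yi) in enumerate(zip(b, y), start=1):
    print(f"b{i} = {bi:.4f}   y{i} = {yi:.4f}")
```

Differences between consecutive ions in either series equal residue masses, which is what allows a sequence to be read from the ladder. Database search tools compare such predicted ladders with observed spectra, typically with scoring schemes far more elaborate than this sketch.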
Schematic representation of the common top-down and bottom-up approaches for the identification and characterization of proteins using mass spectrometry (MS). GE, gel electrophoresis; HPLC, high performance liquid chromatography.
As an alternative to the bottom-up approach, in which the original protein is inferred from sequence information of proteolytic peptides, “top-down” approaches analyze the intact protein directly and obtain amino acid sequence information by dissociation. In this approach, the intact proteins are separated by gel electrophoresis or offline liquid chromatography before MS analysis. The major obstacle is the determination of product ion masses from the multiply charged species of intact proteins: because the protein precursor ions carry many charges, top-down fragmentation spectra are difficult to interpret. This limitation can be circumvented by reducing the charge states of the product ions, introducing gas-phase anions that strip protons from the product ions through ion-ion proton transfer reactions (106).
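The arithmetic behind the charge-state problem can be sketched as follows. The example assumes that all charge comes from added protons, so that m/z = (M + z·m_p)/z; the function names and the 10 kDa neutral mass are hypothetical. It shows how the same ion appears at different m/z values as protons are removed, and how the charge can be inferred from two peaks that differ by exactly one proton.

```python
PROTON = 1.007276  # mass of a proton in Da


def mz_from_mass(neutral_mass, charge):
    """m/z of an ion carrying `charge` protons (protonation only, by assumption)."""
    return (neutral_mass + charge * PROTON) / charge


def mass_from_mz(mz, charge):
    """Neutral mass recovered from an observed m/z and a known charge."""
    return charge * (mz - PROTON)


def charge_from_adjacent_peaks(mz_high, mz_low):
    """Charge of the ion at `mz_high`, assuming `mz_low` is the same species
    carrying exactly one additional proton."""
    return round((mz_low - PROTON) / (mz_high - mz_low))


# Hypothetical 10 kDa product ion observed over a range of charge states.
neutral = 10000.0
for z in (10, 5, 2, 1):
    print(f"z = {z:2d}  m/z = {mz_from_mass(neutral, z):10.4f}")

z = charge_from_adjacent_peaks(mz_from_mass(neutral, 10), mz_from_mass(neutral, 11))
print(z, mass_from_mz(mz_from_mass(neutral, 10), z))
```

Lower charge states move ions to higher, more widely separated m/z values, which is one way to see why stripping protons simplifies assignment of product ion masses.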
Another approach of choice for top-down proteomics is the use of a Fourier transform ion cyclotron resonance (FT-ICR) mass spectrometer or an orbitrap mass spectrometer, both of which offer high mass accuracy (<2 ppm). The product ion charge state can be determined from the isotope spacing of the multiply charged species, which facilitates identification (27). The dissociation techniques favored in top-down proteomics are electron-capture dissociation (ECD), as implemented on FT-ICR-MS, and electron-transfer dissociation (ETD), as used in orbitrap instruments.
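Charge assignment from isotope spacing, and the meaning of a mass accuracy stated in ppm, can be made concrete with a short sketch. It assumes that neighboring isotope peaks differ by one 13C-for-12C substitution (about 1.00336 Da), so that their spacing on the m/z axis is roughly 1.00336/z. The function names and peak values are hypothetical, and real isotope envelopes require peak picking and averaging models that are not shown here.

```python
C13_C12_DELTA = 1.003355  # mass difference between 13C and 12C in Da


def charge_from_isotope_spacing(peak_mzs):
    """Estimate charge from at least two consecutive isotope peaks (sorted m/z)."""
    spacings = [b - a for a, b in zip(peak_mzs, peak_mzs[1:])]
    mean_spacing = sum(spacings) / len(spacings)
    # Isotope peaks of a z-fold charged ion are ~1.00336 / z apart in m/z.
    return round(C13_C12_DELTA / mean_spacing)


def ppm_error(observed, theoretical):
    """Relative mass error in parts per million."""
    return (observed - theoretical) / theoretical * 1e6


# Hypothetical isotope cluster with peaks ~0.2007 m/z apart.
cluster = [1000.0000, 1000.2007, 1000.4013, 1000.6020]
print(charge_from_isotope_spacing(cluster))

# An error of 0.002 at m/z 1000 corresponds to about 2 ppm.
print(round(ppm_error(1000.002, 1000.000), 3))
```

Resolving peaks that lie only a fraction of an m/z unit apart at high charge is where high-resolution analyzers become necessary, since the spacing shrinks in proportion to 1/z.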
Although the top-down proteomics approach has some limitations, namely the formation of very complex spectra, the need for expensive instrumentation, and difficulties with proteins of high molecular mass (>50 kDa), it has advantages over classical bottom-up approaches: it provides access to the complete protein sequence, allows PTMs to be located owing to the gentle fragmentation methods, and avoids lengthy protein digestion procedures.