The ability to dissociate peptides and proteins during mass spectrometry experiments, generating sequence-dependent information, has revolutionized primary structure determination. While the earliest studies were aimed at de novo structure determination, the availability of complete genomic information has led to a shift toward matching experimental peptide mass spectrometry datasets to sequences translated from DNA sequences, enabling the advent of high-throughput proteomics. Peptides with post-translational modifications (PTMs) and protein sequence polymorphisms (PSPs) that lead to disagreement between experimental data and sequences in protein databases are consequently ignored, unless the software algorithms used for matching are instructed to consider modifications that alter mass. As a result, our ability to fully understand the diversity of the human proteome is inevitably limited.
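The effect of instructing a matching algorithm to consider mass-altering modifications can be illustrated with a minimal sketch. It is not the algorithm of any particular search engine: the function names, the short modification table, and the peptide "SPEK" are hypothetical and chosen only for illustration, while the residue and modification masses are standard monoisotopic values.

```python
# Standard monoisotopic residue masses (Da) for the 20 common amino acids.
MONO = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.010565  # mass of H2O added for the free N- and C-termini

# A few common mass-altering modifications (illustrative, not exhaustive).
MODS = {"phospho": 79.96633, "acetyl": 42.01056, "oxidation": 15.99491}


def peptide_mass(sequence):
    """Monoisotopic mass of an unmodified, neutral peptide."""
    return sum(MONO[residue] for residue in sequence) + WATER


def match(observed_mass, sequence, tol_ppm, allowed_mods=()):
    """Return the name of the first candidate form whose mass agrees with
    observed_mass within tol_ppm, or None if no candidate agrees.

    Only the unmodified form and single modifications listed in
    allowed_mods are considered (a deliberate simplification).
    """
    base = peptide_mass(sequence)
    candidates = [("unmodified", base)]
    candidates += [(name, base + MODS[name]) for name in allowed_mods]
    for name, mass in candidates:
        if abs(observed_mass - mass) / mass * 1e6 <= tol_ppm:
            return name
    return None


# A phosphorylated form of the hypothetical peptide "SPEK".
observed = peptide_mass("SPEK") + MODS["phospho"]
print(match(observed, "SPEK", tol_ppm=10))                            # no match
print(match(observed, "SPEK", tol_ppm=10, allowed_mods=("phospho",)))  # matched
```

In this sketch the modified peptide goes unmatched until the modification is explicitly allowed, which is the situation described above for PTMs and PSPs absent from the database.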
Top-down mass spectrometry and proteomics center on intact protein mass measurements and dissociation of the intact protein to yield tandem mass spectrometry datasets that can be used to reconcile the primary structure with the measured mass of the protein [1]. Thus an essential difference between bottom-up peptide-based strategies and top-down is that the latter approach must fully describe the primary structure of not only the parts of the protein that match the sequence in the protein database, but also the parts that do not. Consequently, data interpretation is more demanding and is typically divided into protein identification followed by detailed structural characterization [3]. Top-down mass spectrometry of proteins started with the observations that large intact proteins could be subjected to collisionally activated dissociation (CAD) [5] and identified using sequence tags [6], and it has benefited from the discovery of a second fragmentation mechanism, electron-capture dissociation (ECD) [7]. The importance of high resolution and high mass accuracy for data interpretation in top-down experiments has long been recognized [8], and the Fourier-transform ion cyclotron resonance (FT-ICR) mass analyzer is unsurpassed in these respects [9]. The recent introduction of hybrid linear ion-trap (and quadrupole) FT-ICR instruments has further benefited top-down because of dramatic improvements in product-ion mass accuracy, achieved by maintaining optimal pressure in the ICR cell and by carefully regulating the number of ions transmitted to it [10].
The human salivary proteome has long interested protein chemists; indeed, Peter Roepstorff and colleagues used mass spectrometry to define the blocked N-terminus of salivary amylase over twenty-five years ago [11]. More recently, attempts have been made to fully understand the variety of proteins found in saliva, and the lists of identified proteins grow ever longer [12]. The bulk of saliva protein is constituted by amylase and a small group of proline-rich proteins (PRPs) in the mass range 5-30 kDa. Intact protein maps have been developed and have led to the recognition that, although there are only a few parent genes, there is a tremendous diversity of protein/peptide products depending upon post-translational processing [15]. Comparisons between individuals have further shown a variety of human alleles with both sequence and downstream processing diversity [16]. The net result is that a single liquid chromatography mass spectrometry experiment on a quadrupole instrument (100 ppm mass accuracy) reveals a wide range of proteins, each with an apparently unique intact mass tag (IMT). In this report, two IMTs are subjected to detailed top-down analysis with a hybrid FT-ICR mass spectrometer. The data illustrate how high resolution and high mass accuracy are essential for understanding the protein sequence polymorphisms and post-translational modifications inherent in the expression of these proteins. The analysis of isomeric forms of PRP3 demonstrates that in some instances IMTs are ambiguous, and consequently that the presence of ‘unique’ ions in dissociation spectra is necessary to define ‘distinct’ molecular species.
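The arithmetic behind a mass tolerance stated in ppm can be sketched briefly. The snippet below is illustrative only: the function names, the 10 kDa example mass, and the 2 ppm comparison value are assumptions rather than figures from this study, and the 0.984 Da difference is the standard mass shift of a deamidation, used here as an example of a small modification.

```python
from typing import Tuple


def ppm_error(measured: float, theoretical: float) -> float:
    """Signed mass error in parts per million."""
    return (measured - theoretical) / theoretical * 1e6


def mass_window(mass: float, tol_ppm: float) -> Tuple[float, float]:
    """Lower and upper mass bounds implied by a ppm tolerance."""
    delta = mass * tol_ppm / 1e6
    return (mass - delta, mass + delta)


def within_tolerance(measured: float, theoretical: float, tol_ppm: float) -> bool:
    """True if the measured mass cannot be told apart from the
    theoretical mass at the given ppm tolerance."""
    return abs(ppm_error(measured, theoretical)) <= tol_ppm


# Hypothetical 10 kDa protein and a variant that is 0.984 Da heavier.
protein = 10000.0
variant = protein + 0.984

print(mass_window(protein, 100.0))                 # window at 100 ppm
print(within_tolerance(variant, protein, 100.0))   # 100 ppm tolerance
print(within_tolerance(variant, protein, 2.0))     # 2 ppm tolerance
```

At 100 ppm the window around a 10 kDa species is about ±1 Da, so the two hypothetical forms fall within the same tolerance, whereas at 2 ppm they are resolved. Isomeric forms have identical masses and therefore cannot be separated at any mass tolerance, which is why unique fragment ions are required to distinguish them.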