High-resolution top-down mass spectrometry was used to characterize eleven integral and five peripheral subunits of the 750 kDa Photosystem II (PSII) complex from the eukaryotic red alga, Galdieria sulphuraria. The primary separation used liquid chromatography mass spectrometry with concomitant fraction collection (LC-MS+) yielding around 40 intact mass tags (IMTs) at 100 ppm mass accuracy on a low-resolution electrospray-ionization mass spectrometer, whose retention and mass were used to guide subsequent high-resolution top-down nano-electrospray Fourier-transform ion-cyclotron resonance mass spectrometry experiments (FT-MS). Both collisionally activated and electron capture dissociation (CAD, ECD) were used to confirm the presence of eleven small subunits to mass accuracy within 5 ppm; PsbE, PsbF, PsbH, PsbI, PsbJ, PsbK, PsbL, PsbM, PsbT, PsbX and PsbZ. All subunits showed covalent modifications that fall into three classes including retention of initiating formyl-methionine, removal of methionine at the N-terminus with or without acetylation, and removal of a longer N-terminal peptide. Peripheral subunits identified by top-down analysis included oxygen evolving complex (OEC) subunits PsbO, PsbU, PsbV, as well as Psb28 (PsbW) and Psb27 (‘PsbZ-like’). Top-down high-resolution mass spectrometry provides the necessary precision, typically less than 5 ppm, for identification and characterization of polypeptide composition of these important membrane protein complexes.
Top-down proteomics depends upon high-resolution, accurate mass measurements of intact proteins and their dissociation-derived product ions, and can provide complete primary structure and post-translational modification assignment [1, 2]. In recent years, with the maturation of FT-MS coupled with electrospray ionization and fragmentation options including collisionally activated (CAD), electron capture (ECD), electron transfer (ETD) and infrared multi-photon (IRMPD) dissociation, top-down mass spectrometry has developed into an extremely powerful method in protein science and proteomics and has gained wide acceptance in the community, culminating in the award of the Biemann medal of the American Society for Mass Spectrometry to Neil Kelleher in 2009. For wider applicability, top-down mass spectrometry must address the entire proteome including the membrane proteins that constitute 20 to 30% of the open reading frames (ORFs) in fully sequenced genomes, with their important functions ranging from photosynthesis, respiration and cellular communication to small molecule transport. Integral membrane proteins have transmembrane regions that pose major challenges to mass spectrometry as they are deficient in amino acids that carry charge and are insoluble in aqueous solvents without detergents. Thus it can be argued that the most effective way to characterize these domains is to include them as part of the intact protein in a top-down experiment. To date, studies have shown that integral membrane proteins can be analyzed by ESI-MS with mass accuracies comparable to those achievable for soluble proteins, and top-down high resolution FT-MS has been demonstrated for both polyhelix bundle and transmembrane β-barrel motifs [5–7].
Galdieria sulphuraria is a photosynthetic unicellular red alga that grows in acidic (pH 0.5 to 3.0), high-temperature (50 to 55 °C) environments. Galdieria, Cyanidium caldarium and Cyanidioschyzon merolae are the only photosynthetic eukaryotes populating the Cyanidiales. Molecular phylogenetic studies have revealed that the Cyanidiales are one of the most ancient groups of algae, presumed to be at the base of eukaryotic lineages. Galdieria is unique among the Cyanidiales as it propagates by endospores, has a cell wall and is highly tolerant of toxic metal ions (aluminium, cadmium and mercury). Galdieria can survive on more than fifty different carbon sources, with the ability to grow photoautotrophically, heterotrophically and mixotrophically. The conditions under which Galdieria performs photosynthesis are extreme, and the proteins involved are therefore of interest.
PSII is a large membrane protein complex that catalyzes the light-driven electron transfer from water to plastoquinone, thereby oxidizing two water molecules to produce 4 H⁺, 4 e⁻ and molecular oxygen. Because of its importance, several biochemical, biophysical, proteomic and structural studies of PSII from cyanobacteria, green algae and higher plants have been published. PSII is composed of more than 20 subunits, with most of these encoded in the chloroplast and the remainder in the nucleus. The core integral subunits D1 (PsbA) and D2 (PsbD) bind most of the redox co-factors forming the electron transport chain; together with the antenna proteins CP43 (PsbC) and CP47 (PsbB) and the alpha and beta subunits of cytochrome b559 (PsbE and PsbF), they are present in PSII of all organisms. Along with the conserved core, there are several small integral membrane subunits (< 10 kDa) that are somewhat more diverse. Fully functional PSII requires assembly of the oxygen-evolving complex (OEC), which is stabilized by peripheral subunits. While the nature of the water-splitting reaction is conserved across organisms, the subunits that stabilize the OEC differ. In cyanobacteria, the OEC stabilizing proteins consist of PsbO, PsbU and PsbV subunits, while PsbO, PsbP and PsbQ are used in green algae and higher plants. The eukaryotic Cyanidiales differ in their OEC composition: C. caldarium has a special subunit PsbQ’ along with cyanobacterial PsbO, PsbU and PsbV subunits, while C. merolae is reported to have PsbO, PsbU, PsbP and PsbQ [13, 14]. Considering Galdieria’s divergence from other Cyanidiales, it is important to characterize its peripheral subunits. Recently, others have shown that there are additional peripheral subunits associated with PSII (Psb27 and Psb28), though they are not thought to be associated with stabilization of the active OEC [15, 16].
Mass spectrometry has now made major contributions to the structural characterization of the PSII complexes from cyanobacteria, green algae and higher plants [5, 15, 17–20]. Recently, high-resolution FT-MS was applied to a number of different integral membrane proteins including the cyanobacterial cytochrome b6f complex. The extension of this work to an ever-increasing array of membrane proteins is exemplified here, where we have systematically applied FT-MS to study the composition of PSII from the eukaryotic red alga Galdieria. A total of sixteen peripheral and small integral subunits were analyzed by high resolution FT-MS using both CAD and ECD, allowing assignment of precursor and product ions to mass accuracies typically better than 5 ppm. The results show that while the Galdieria PSII integral core complex is very similar to that of other eukaryotes, the peripheral OEC composition is more similar to that of cyanobacteria. The detection of five peripheral subunits including Psb27 and Psb28 in the red algal preparation suggests that it contains a functional and structurally specialized population of PSII complexes that may be important for survival in the harsh environments in which this algal species thrives.
Galdieria sulphuraria cells were grown at 42 °C in 11 L capacity flasks containing a 10X medium at pH 2.0, with a constant supply of air, CO2 and light at an irradiance of 25 μmol photon m⁻² s⁻¹ (Li-cor, model LI-189). Cells were harvested by centrifugation (2000 × g; 2 minutes; 25 °C). Cells were homogenized in 20 mM MES pH 6.0, 10 mM CaCl2, 10 mM MgCl2, 500 mM mannitol (MMCM buffer) with 1 mM PMSF added before cell breakage. Following breakage of the cells, PSII was isolated as described previously (Fig. 1). The original specimen, Galdieria sulphuraria strain 074, collected at Mount Lawu, Java, Indonesia was provided by Dr. Christine Oesterhelt, Max-Planck-Institut für Molekulare Pflanzenphysiologie, Am Mühlenberg 1, 14476 Golm, Germany. It was thought to be a single strain, but growth capacity tests revealed that there were two strains in the sample; under heterotrophic conditions, strain 074W (W, white) lost pigment whereas 074G (G, green) remained green on all substrates. Galdieria sulphuraria 074W was used to isolate PSII, which is the same strain that was used for the genome sequence information. The small differences in the sequences that were revealed by our study may be due to a diversification of the initial strain under the different growth conditions used in the laboratories. The genome was sequenced from the laboratory strain that had been grown under photoheterotrophic conditions, while the Photosystem II was isolated from the strain that had been maintained under photoautotrophic growth conditions since it was originally isolated.
Samples were precipitated using acetone. The suspension was split into two centrifuge tubes (125 μL each) and 1 mL 80% acetone in water (−20 °C stock) was added to each tube, prior to vortex mixing (1 minute) and incubation at −20 °C for 1 hour. Precipitated protein was recovered by centrifugation (10,000 × g) and the supernatant removed. Pellets were dried briefly to allow evaporation of residual acetone (5 minutes, room temperature and pressure) and dissolved in 90% formic acid (total 100 μL) for immediate injection onto an HPLC system prepared for reversed phase chromatography [23, 24]. A column (5 μm, 300 Å PLRP/S, 2 × 150 mm; Varian) was equilibrated in 95% buffer A (0.1% TFA in water), 5% B (0.05% TFA; 50% acetonitrile; 50% isopropanol) at 100 μL/minute at 40 °C for 30 minutes prior to sample injection. The column was eluted with a stepped linear gradient of increasing buffer B as previously described and the eluent passed through a UV detector (280 nm) prior to a liquid flow splitter delivering 50 μL/minute to a low resolution mass spectrometer and 50 μL/minute to a fraction collector (1 minute/fraction). Fractions collected into microcentrifuge tubes were stored at −80 °C prior to off-line high-resolution nanospray analysis. Mass spectrometry was performed using a triple quadrupole instrument (API III+, Applied Biosystems) tuned and calibrated using a PEG mixture as previously described. Mass spectra were recorded by scanning from m/z 600 – 2300 with orifice voltage ramped with mass (60 – 120) using a 0.3 Da step size and a scan speed of 6 s. Data were processed using BioMultiview 1.3.1 software (Applied Biosystems).
Selected HPLC fractions collected during LC-MS+ were subjected to direct infusion nanospray analysis. Samples were individually loaded into 2 μm i.d. externally-coated nanospray emitters (Proxeon, Cambridge, MA) and desorbed using a spray voltage of 1.7 – 1.9 kV (versus the inlet of the mass spectrometer) using the nanospray source supplied by the manufacturer. These conditions produced a flow rate of 20 – 50 nL/min. All samples were analyzed using a hybrid linear ion-trap/FT-ICR mass spectrometer (7 T, LTQ FT Ultra, Thermo Scientific, Bremen, Germany) operated with standard (up to m/z 2000) or extended mass range (up to m/z 4000). Ion transmission into the linear trap and further to the FT-ICR cell was automatically optimized for maximum signal. The ion count targets for the full scan FT-ICR and MS2 FT-ICR experiments were 2 × 10⁶. The m/z resolving power of the FT-ICR mass analyzer was set at 100,000 (defined by m/Δm50% at m/z 400) unless otherwise stated. Individual charge states of the multiply protonated protein molecular ions were selected for isolation and CAD in the linear trap followed by the detection of resulting fragments in the ICR cell. The precursor ions were activated using normalized collision energy settings in the range 10 – 15 at the default activation q-value of 0.25. For ECD studies precursor ions were transmitted to the FT-ICR cell and electrons introduced with normalized energy in the range 5 – 10, 10 ms duration and 50 ms delay. Spectra were derived from an average of 50 – 200 transient signals. In addition, the analysis of Psb27 was performed on a hybrid linear ion-trap Fourier-transform orbitrap mass analyzer (LTQ XL Orbitrap ETD; Thermo Scientific, Bremen, Germany) similarly to the LTQ-FT experiments. High-energy collisional dissociation (HCD) was performed in the C-trap and ETD was performed using fluoranthene anions produced in a chemical ionization source according to the manufacturer’s instructions. 
Orbitrap product ion spectra were recalibrated using prominent y- and z- ions.
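As an illustration of how individual charge states of a multiply protonated protein relate to the scan range, the following minimal sketch computes the m/z of [M + zH]z+ ions and lists the charge states that fall inside a given window. The function names are hypothetical, the example mass is the monoisotopic Psb28 value reported in the Results, and the 600 to 2000 window is an assumption (the lower limit of the quadrupole scan combined with the standard upper limit of the FT-ICR instrument); it is not a description of the acquisition software.

```python
import math

PROTON_MASS = 1.007276466  # Da, mass of a proton

def mz_of_charge_state(neutral_mass, z):
    """Return m/z of the [M + zH]z+ ion for a neutral mass in Da."""
    return (neutral_mass + z * PROTON_MASS) / z

def charge_states_in_window(neutral_mass, mz_min, mz_max):
    """List charge states whose m/z lies between mz_min and mz_max."""
    # m/z <= mz_max  implies  z >= M / (mz_max - proton)
    z_low = max(1, math.ceil(neutral_mass / (mz_max - PROTON_MASS)))
    # m/z >= mz_min  implies  z <= M / (mz_min - proton)
    z_high = math.floor(neutral_mass / (mz_min - PROTON_MASS))
    return list(range(z_low, z_high + 1))

if __name__ == "__main__":
    psb28_mass = 13237.7755  # Da, monoisotopic value quoted in the text
    # Assumed window: m/z 600 to 2000
    for z in charge_states_in_window(psb28_mass, 600.0, 2000.0):
        print(z, round(mz_of_charge_state(psb28_mass, z), 4))
```

Under these assumptions, any of the listed charge states could in principle be isolated for CAD or ECD; which ones are actually abundant depends on the protein and the spray conditions.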
FT-ICR and FT-Orbitrap mass spectra were processed using ProSightPC software (ProSightPC 2.0, Thermo Scientific, Bremen, Germany) to produce monoisotopic mass lists (s/n = 1.1, fit 0%, remainder 0%, average table set to averagine) that were then assigned to protein sequences with various post-translational modifications. Protein identifications were achieved by generating sequence tags (sequence tag compiler and sequence tag searching tool) and matching these tags to a Galdieria database. Product ion assignments for known proteins were made using the software operated in single protein mode with a 10 ppm mass accuracy threshold and with the Deltamass feature deactivated. ProSightPC reported mass accuracy of assigned monoisotopic mass values of precursor and product ions and the probability that some other database entry might match the same dataset (P score).
All raw data files (.raw) and ProSightPC files (.puf) will be deposited at tranche (https://proteomecommons.org/) upon manuscript acceptance.
For the identification and annotation of chloroplast-encoded PSII subunits, a collection of all open reading frames (ORFs) identified on the two supercontigs stig_35 and stig_158 from the Galdieria sulphuraria Genome Project [8, 10] (http://genomics.msu.edu/galdieria) was prepared after closure of all gaps (K. Krause, A. Weber et al., publication in preparation). This ORF collection was searched for sequences with homology to the amino acid sequences of PSII subunits from Thermosynechococcus elongatus (T. elongatus), Arabidopsis thaliana and other higher plants. The nuclear-encoded PSII subunit genes were identified by searching the translated Galdieria nucleotide databases for sequences orthologous to the remaining subunits. The presence of predicted target peptides was analysed using the TargetP 1.1 Server (http://www.cbs.dtu.dk/services/TargetP). All nucleotide sequences were conceptually translated, and nucleotide as well as protein sequences were prepared for annotation in Genbank by using Sequin Version 9.50 (provided by Genbank). The relevant accession numbers are: psbA - GU474525; psbB - GU474526; psbC - GU474527; psbD - GU474528; psbE - GU474529; psbF - GU474530; psbH - GU474531; psbI - GU474532; psbJ - GU474533; psbK - GU474534; psbL - GU474535; psbM - GU474520; psbN - GU474536; psbO - GU474521; psbP - GU474522; psbQ - GU971652; psbQ’ - GU721104; psbT - GU474537; psbU - GU474523; psbV - GU474538; psbW - GU474539; psbX - GU474540; psbY - GU474541; psbZ - GU474542; psb27 - GU474524. Nuclear encoded subunits include PsbM, PsbO, PsbP, PsbQ, PsbQ’, PsbU, Psb27; chloroplast encoded subunits include PsbA, PsbB, PsbC, PsbD, PsbE, PsbF, PsbH, PsbI, PsbJ, PsbK, PsbL, PsbN, PsbT, PsbV, Psb28 (PsbW), PsbX, PsbY and PsbZ.
Separate Galdieria PSII preparations were analyzed by LC-MS+ on four different occasions over a twelve-month period (four biological replicates). Each top-down mass spectrometry experiment typically included CAD analyses of at least two different charge states and one ECD and/or ETD experiment. Results from the experiment that yielded the most product ions are presented.
The primary reversed phase LC-MS+ separation yielded a retention map that was annotated with protein identities where they are known, and intact masses where they remain unknown (Fig. 2). Mass accuracy on the low-resolution quadrupole instrument was typically 100 ppm (1 Da at 10,000 Da) and the spectra were used to guide targeted top-down high-resolution experiments on selected fractions from the LC-MS+ experiment in order to fully define primary structure and post-translational modifications of PSII subunits. The physical basis of the separation relies upon hydrophobicity, and thus the first proteins to elute are the hydrophilic peripheral PSII subunits PsbO, PsbU, PsbV, Psb28 (PsbW) and Psb27 (‘PsbZ-like’). These were followed by a set of proteins in the 16 kDa class, identified as phycobiliproteins (ApcA, ApcB, ApcD, ApcF, CpcA, CpcB, CpcG) and a pair of 55 kDa proteins identified as AtpA and AtpB of ATP synthase (data not shown). Association of phycobiliproteins with PSII is expected, while ATP synthase is probably a contaminant of the preparation. The remainder of the chromatogram was dominated by elution of the small and large integral subunits of PSII spanning a substantial range of hydrophobicity, with the PsbE and PsbF subunits that bind cytochrome b559 eluting around 70 minutes, and PsbZ not eluting until 133 minutes. Several of the small subunits (PsbF, PsbM, PsbT, PsbI, PsbH) had singly oxidized isoforms that eluted a little earlier than the unmodified population. The intensities of different peaks in the total ion chromatogram shown in Fig. 2 depend on both abundance and ionization efficiency, and thus stoichiometry should not be inferred. Table 1 lists intact mass tags (IMTs) that were assigned to PSII subunits based upon coincidence of measured average mass with calculated average mass based upon primary structure and limited post-translational modification. These assignments remain tentative, with little statistical confidence, until a high-resolution top-down analysis is completed (Table 2). Out of around 40 detected IMTs, 20 PSII subunits were assigned with different PTMs. The 3409 Da IMT eluting at 78 minutes is potentially a PSII subunit but so far remains unidentified.
The four large subunits (PsbA, PsbB, PsbC and PsbD), which account for twenty-two of the total thirty-four transmembrane alpha helices in PSII, eluted between 88 and 106 minutes along with other smaller subunits (Table 1). The large subunits PsbA, PsbB, PsbC and PsbD measured 38184, 56551, 50927 and 39350 Da respectively, largely consistent with their gene sequences and known N- and C-terminal post-translational modifications (Table 1). The experimentally determined average mass of 38184 Da for PsbA was consistent with the removal of the initiating methionine at the N-terminus with acetylation of Thr 2, and removal of 15 amino acids at the C-terminus (Table 1), as conserved in other species. The difference of 23 Da relative to the calculated average mass (38160.7 Da) could be due to minor DNA sequence differences between the lines of strain 074W used for PSII preparation and for sequencing. In PsbD, the difference between the calculated (39344.1 Da) and experimental (39346 Da) average masses is within experimental measurement error (Table 1). For subunit PsbB, the experimentally derived mass (56551 Da) is in good agreement with the calculated mass (56558.8 Da) taking into account the loss of the initial methionine and N-terminal acetylation (Table 1). In PsbC, if 14 amino acids are removed to form a processed PsbC with a free N-terminus, as conserved in higher plants, the calculated mass (50588.3 Da) is lower than measured (50888 Da) by 300 Da, indicating either divergent N-terminal processing or some other DNA sequence discrepancy. Despite minor inconsistencies, the intact mass tags of the larger integral subunits can be assigned with confidence. Since the main focus of this study was to identify and characterize the smaller integral (< 10 kDa) and the peripheral subunits constituting Galdieria PSII, further experiments on the large subunits were not performed.
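The comparison of measured and calculated average masses above can be made explicit with a short calculation. The sketch below is illustrative only: the function names are hypothetical, the mass pairs are those quoted in the text for PsbA, PsbB and PsbD, and the 100 ppm window is simply the typical accuracy quoted for the quadrupole instrument, used here as an assumed threshold rather than the acceptance criterion applied in the study.

```python
def ppm_error(measured, calculated):
    """Relative mass error in parts per million."""
    return (measured - calculated) / calculated * 1e6

def tolerance_da(mass, ppm):
    """Absolute width in Da of a ppm tolerance at a given mass."""
    return mass * ppm / 1e6

# (measured, calculated) average masses in Da, as quoted in the text
LARGE_SUBUNITS = {
    "PsbA": (38184.0, 38160.7),
    "PsbB": (56551.0, 56558.8),
    "PsbD": (39346.0, 39344.1),
}

def summarize(subunits, ppm=100.0):
    """Return (name, delta in Da, error in ppm, within window) per subunit."""
    rows = []
    for name, (measured, calculated) in subunits.items():
        delta = measured - calculated
        err = ppm_error(measured, calculated)
        rows.append((name, delta, err, abs(err) <= ppm))
    return rows

if __name__ == "__main__":
    print("100 ppm at 10,000 Da =", tolerance_da(10000.0, 100.0), "Da")
    for name, delta, err, ok in summarize(LARGE_SUBUNITS):
        print(f"{name}: delta = {delta:+.1f} Da, {err:+.1f} ppm, within 100 ppm: {ok}")
```

Expressing each deviation in ppm as well as in Da makes it easier to compare subunits of different size, since a fixed ppm tolerance corresponds to a larger absolute window at higher mass.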
Of the three OEC stabilizing subunits identified (PsbO, PsbU, PsbV), PsbO and PsbU are encoded by the nucleus and carry an N-terminal chloroplast target peptide. PsbO has a target peptide consisting of the first 21 amino acids, so the mature form has 242 amino acids. The experimentally determined monoisotopic mass of 28796.7520 Da could be reconciled (Δ = 2.6387 ppm) with the calculated monoisotopic mass of 28796.67415 Da with inclusion of the formation of a single disulfide bond (−2.01565 Da) between the only two Cys residues, Cys23 and Cys47 (Figs. 2 and 3A). Assignment of product ions from a CAD experiment of the disulfide-linked mature protein yielded 11 y- and 9 b- ion matches within a 10 ppm tolerance and a consequent P score of 1.41E-26 (Fig. 3A, Table 2). In the case of PsbU, the first 81 amino acids, constituting the transit peptide, are cleaved from the N-terminus, forming a mature protein of 93 amino acids with a calculated monoisotopic mass of 10579.4347 Da (Table 2). Top-down CAD analysis of the N-terminally truncated protein confirmed the primary structure of PsbU with an experimentally determined monoisotopic mass of 10579.4648 Da, within 5 ppm of the calculated monoisotopic mass (Δ = 2.8361 ppm), and a CAD experiment yielded 32 b- and 23 y- product ions for a P score of 5.26E-75 (Fig. 3B, Table 2). The third extrinsic subunit, PsbV, was assigned to an intact mass tag of 15704 Da with proposed modifications including removal of 30 amino acid residues from the N-terminus, removal of 10 residues from the C-terminus and attachment of a covalently bound heme group (615.16947 Da) (Fig. 2, Table 1). Analysis of the high-resolution CAD dataset for PsbV gave a mass difference of −2.0602 Da (−131.2860 ppm) with 11 b- and 15 y- ions matched. 
Further optimization of the match between measured and calculated mass was reached by changing Thr 80 to Val (mature protein numbering) significantly increasing coverage with 19 b- and 24 y- product ions matched and a P score of 3.32E-45 (Δ = 5.1595 ppm) (Fig. 3C, Table 2).
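The role of the disulfide bond in reconciling the PsbO precursor mass can be checked with simple arithmetic. The sketch below is a hedged illustration: the function names are hypothetical, the masses are the rounded values quoted in the text, and the recomputed ppm values therefore need not reproduce the Δ reported by ProSightPC exactly. It compares the measured mass against the calculated mass with and without the loss of two hydrogen atoms that accompanies disulfide formation.

```python
HYDROGEN_MONO = 1.00782503  # Da, monoisotopic mass of hydrogen (1H)

def disulfide_shift(n_bonds=1):
    """Mass change in Da on forming n disulfide bonds (loss of 2 H each)."""
    return -2.0 * HYDROGEN_MONO * n_bonds

def ppm_error(measured, calculated):
    """Relative mass error in parts per million."""
    return (measured - calculated) / calculated * 1e6

# Monoisotopic PsbO masses in Da, as quoted in the text
MEASURED = 28796.7520
CALC_WITH_DISULFIDE = 28796.67415
# Undo the disulfide shift to obtain the fully reduced calculated mass
CALC_REDUCED = CALC_WITH_DISULFIDE - disulfide_shift(1)

if __name__ == "__main__":
    print("shift per bond:", round(disulfide_shift(1), 5), "Da")
    print("with disulfide:", round(ppm_error(MEASURED, CALC_WITH_DISULFIDE), 2), "ppm")
    print("fully reduced: ", round(ppm_error(MEASURED, CALC_REDUCED), 2), "ppm")
```

With these inputs the reduced form falls well outside a 10 ppm tolerance while the disulfide-linked form falls inside it, which is the kind of discrimination that a roughly 2 Da difference on a 29 kDa protein requires high-resolution measurement to make.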
Besides the 3 OEC stabilizing subunits, 2 other peripheral subunits were identified, Psb27 and Psb28. In the primary LC-MS+ analysis, a hydrophilic protein eluted at 39 minutes with an intact mass tag of 13247 Da and was assigned as Psb28 with removal of Met1 (Fig. 2, Table 1). This protein is similar to the PsbW identified in Synechocystis and will be referred to as Psb28 hereafter (see discussion). Psb28 was assigned to a monoisotopic mass peak of 13237.7755 Da (Table 2), confirming the identity and processing with 4 b- and 17 y- product ions and the precursor matched below 5 ppm (Δ = 4.6250 ppm) for a P score of 9.55E-28 (Table 2). The identity of Psb28 was also confirmed by ECD (Fig. 4) with 57 c- and 39 z- product ions, and agreement between calculated and measured monoisotopic masses for the precursor ion within 5 ppm (Δ = 4.6250 ppm) for a P score of 1.02E-152 (Fig. 4A, Table 2). An IMT of 12820 Da was assigned to Psb27 (‘PsbZ-like’) using top-down mass spectrometry after attempts to assign it with low-resolution data were unsuccessful. Fig. 5 shows the assignments for the dissociation experiments performed on an LTQ-Orbitrap (HCD and ETD). The two experiments yield b- and y- ions (HCD) and c- and z- ions (ETD) that support complete agreement with the database sequence over most of the polypeptide chain, extending to the C-terminus. However, the data strongly suggest that the sequence reported for Psb27 has some errors towards its N-terminus, and we have therefore represented this region as an unknown modification of mass 1375.65 Da on the Ala residue shown in Fig. 5 to optimize b- and c- ion matches. Using this strategy we matched the experimental monoisotopic mass of 12811.3742 Da to the calculated value at <1 ppm, and matched 12 b- and 23 y- ions in the CAD (HCD) experiment and 36 c- and 44 z- ions in the ETD experiment. The P scores for the two experiments confirm the accuracy of the identification (Fig. 5). The nature of the N-terminus is under further study.
Considerable effort was expended in searching for the PsbQ, PsbQ’ and PsbP proteins, the genes for which can be identified in the Galdieria genome. Attempts to match intact mass tags and top-down MS experiments in the 37 – 76 minute retention range to these sequences were unsuccessful, so one-minute fractions across this range were digested with trypsin and analyzed by nano-liquid chromatography with tandem mass spectrometry. While these experiments successfully identified many peptides from PsbO, PsbU, PsbV, Psb27, Psb28, ApcA, ApcB, ApcD, ApcF, CpcA, CpcB, AtpA and AtpB, as well as some of the integral subunits, no peptides were detected for PsbQ, PsbQ’ or PsbP (data not shown).
The analysis confirmed the presence of core subunits PsbE, PsbF and PsbI, and the remaining integral subunits PsbH, PsbJ, PsbK, PsbL, PsbM, PsbT, PsbX and PsbZ. The results of top-down CAD and, for PsbF and PsbL, ECD, are shown in Fig. 6, with coverage and P scores described in the legend and measured/calculated masses in Table 2. P scores ranged from 2.81E-16 for PsbE to 6.95E-88 for PsbF, demonstrating the statistical confidence of the assignments achieved using high-resolution FT-MS with CAD. Even better P scores were achieved for some of the ECD/ETD experiments. The small subunits could be grouped into three categories based upon post-translational processing. PsbI, PsbT, PsbX and PsbZ were unprocessed, with the initiating formyl-Met residue intact. PsbE, PsbF, PsbH, PsbJ and PsbL had Met1 removed, though only PsbF, PsbJ and PsbL were subsequently acetylated. PsbK and PsbM had longer signal peptides removed from their N-termini and were not acetylated.
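The three processing categories correspond to characteristic mass shifts relative to the unprocessed translation product, which is how a precursor mass can point to one category before product ions confirm it. The following sketch is a simplified illustration under stated assumptions: the function name, the category labels and the 10 ppm default are hypothetical choices, the example mass of 4000 Da is invented for demonstration and does not correspond to any subunit in this study, and removal of a longer N-terminal peptide is lumped into a catch-all category because its shift depends on the sequence.

```python
# Monoisotopic mass shifts in Da
FORMYL = 27.994915         # addition of CO (N-formylation)
ACETYL = 42.010565         # addition of C2H2O (N-acetylation)
MET_RESIDUE = 131.040485   # methionine residue, C5H9NOS

# Expected shift relative to the unmodified translation product (Met1 present)
N_TERMINAL_SHIFTS = {
    "unmodified": 0.0,
    "formyl-Met retained": FORMYL,
    "Met1 removed": -MET_RESIDUE,
    "Met1 removed, N-acetylated": -MET_RESIDUE + ACETYL,
}

def classify_n_terminus(observed, unprocessed, ppm=10.0):
    """Return the label whose expected mass matches `observed` within `ppm`.

    `unprocessed` is the calculated monoisotopic mass of the full-length,
    unmodified translation product. Returns "other" if nothing matches,
    e.g. when a longer N-terminal peptide has been removed.
    """
    for label, shift in N_TERMINAL_SHIFTS.items():
        expected = unprocessed + shift
        if abs(observed - expected) / expected * 1e6 <= ppm:
            return label
    return "other"

if __name__ == "__main__":
    hypothetical_mass = 4000.0  # Da, invented value for illustration only
    observed = hypothetical_mass - MET_RESIDUE + ACETYL
    print(classify_n_terminus(observed, hypothetical_mass))
```

A precursor-mass match of this kind only narrows the possibilities; localizing the modification to the N-terminus still relies on the b- and c- type product ions.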
Top-down mass spectrometry has been used to characterize integral and peripheral thylakoid membrane proteins for several years now. Top-down high-resolution FT-MS was first applied to an integral thylakoid protein, demonstrating the potential benefits of using ECD. Recently, all 8 subunits of the cytochrome b6f complex from Nostoc were analyzed in this way for full characterization of post-translational modifications. Here we have used top-down FT-MS and orbitrap FT-MS to characterize the smaller integral and the peripheral subunits of PSII from Galdieria. A significant challenge was the lack of a highly annotated Galdieria proteome database, and considerable effort was devoted to the search for PSII homologues. We were able to define the identity of all the subunits with high confidence. In total, Galdieria PSII appears to have 4 large subunits, 11 smaller integral subunits and 3 peripheral OEC subunits for a total of 18 subunits. This total gives 2 fewer subunits and 2 fewer transmembrane helices than the cyanobacterial complex, in agreement with the 34 transmembrane α-helices identified in spinach and in contrast to the 36 transmembrane helices identified in the PSII X-ray structure of T. elongatus. This is perhaps a little surprising given the similarity between the Galdieria OEC and that of the cyanobacteria. It remains possible that the small 3409 Da intact mass tag observed in the LC-MS+ experiment is a PSII subunit, but we were unable to match top-down FT-MS data to any sequence in our Galdieria database. PsbN and ycf12 of the cyanobacterial complex were found in the Galdieria genome by homology but were not found in the protein analysis despite their inclusion in the database used for interpretation of the data. Final resolution of the issue will require a crystal structure of sufficient quality to identify the number and subunit assignment of the transmembrane helices in Galdieria PSII.
Top-down mass spectrometry allowed characterization of PTMs in molecular detail. Modification of the small subunits shows a distinct pattern based on the exposure of their N-termini to the stroma or lumen. Subunits PsbI, PsbK, PsbM, PsbT, PsbX and PsbZ have their N-termini exposed to the lumen, with PsbI, PsbT, PsbX and PsbZ retaining their initiating formyl-Met residue unmodified. PsbK and PsbM had transit peptides removed (Table 2). The other subunits, PsbE, PsbF, PsbH, PsbJ and PsbL, have their N-termini exposed to the stroma and in all cases have their initiating Met residue removed. PsbF, PsbJ and PsbL have their N-termini acetylated. The observed differences in modification for different subunits may be due to the availability of different processing enzymes in stroma and lumen and the accessibility of the N-termini to these enzymes.
Inspection of the Galdieria genome allowed identification of genes for OEC stabilizing subunits of both prokaryotic and eukaryotic origin (PsbO, PsbU, PsbV, PsbP, PsbQ, PsbQ’), which are important for the assembly of the lumenal side of PSII that catalyzes the water-splitting reaction. PsbO is the only extrinsic protein conserved from cyanobacteria to higher plants. As in green algae and higher plants, PsbO is nuclear encoded in Galdieria, with a transit peptide sequence comprising the first 21 residues, which is shorter than the 34 residues predicted by TargetP. In the cyanobacterium T. elongatus 26 residues are removed, while transit peptide sequences of 52, 76 and 84 residues are predicted for Chlamydomonas reinhardtii (green alga), Cyanidium caldarium (red alga) and Spinacia oleracea (higher plant) respectively. PsbO from Arabidopsis was shown by mass spectrometry to have 85 N-terminal residues removed during processing [26, 30]. Processing of Galdieria PsbO leaves the mature N-terminus conserved compared with other organisms. Finally, the two cysteine residues in Galdieria are conserved across all organisms, so it is likely that the disulfide bond is present in all organisms. PsbU was processed by cleavage of the first 81 residues of the transit peptide at the consensus sequence A-X-A (79 – 81), a recognition site for the thylakoid processing peptidase. The Galdieria transit sequence is longer than that of T. elongatus (30 residues) and C. caldarium (61 residues). In Galdieria, the mature PsbV N-terminus is obtained after removal of the transit peptide (residues 1 – 30), in agreement with the other red algae, Cyanidium caldarium and Cyanidioschyzon merolae, and the cyanobacterium T. elongatus [13, 28]. 
Alignment of the Galdieria PsbV sequence with PsbV from Cyanidium caldarium (CCPsbV - Q9TLW2-1) and Cyanidioschyzon merolae (CMPsbV - Q85FS3) showed Galdieria to be extended by 10 amino acids at the C-terminus, and their removal allowed matches of more product ions. A change of Thr80 to Val further optimized measured and calculated masses to within 10 ppm and maximized the number of product ions matched. Such discrepancies might be explained by small differences between the laboratory strains that were used for genome sequencing and protein isolation (see methods).
Two other peripheral proteins were identified in the preparation, Psb27 and Psb28. Both are in the 12 – 13 kDa range and hydrophilic, being the first to elute in reversed phase chromatography. The gene for Psb28 was included in the Galdieria database based upon homology to cyanobacterial PsbW, also a hydrophilic 13 kDa protein. It is noted that this is different from the PsbW of higher plants, which is a hydrophobic 6.1 kDa protein. We prefer the Psb28 designation to avoid confusion with PsbW. Top-down MS demonstrated removal of the initiating Met residue and no other PTMs. Psb27 was included in the Galdieria database based upon homology to the cyanobacterial ‘PsbZ-like’ protein. As with Psb28, we note that Psb27 is not like PsbZ, which is a smaller hydrophobic integral subunit. Top-down MS using both CAD (HCD) and ETD gave numerous y- and z- product ions matched to better than 5 ppm, confirming the accuracy of the C-terminal sequence. However, there appears to be disagreement toward the N-terminus that could be due to alternative splicing of the primary transcript, sequencing errors or differences between the lines of the 074W strain used for PSII preparation and for sequencing. Only a sequence analysis of the mature psb27 transcript will be able to resolve the genetic basis of this discrepancy. The problem appears to lie on the N-terminal side of Ala19 of the database sequence, and we therefore placed a modification of mass 1375.65 Da on that residue to account for the unmatched stretch of polypeptide. In this way many b- and c- ions could be matched, confirming with confidence the sequence beyond Ala19. Psb27 is lipidated at the N-terminus in cyanobacteria, and our ongoing work will reveal the situation in Galdieria.
In Galdieria, the larger subunit of cytochrome b559, PsbE, has the initiating Met residue removed, while the smaller subunit, PsbF, loses Met1 and becomes acetylated, as also seen in cyanobacteria (T. elongatus and Synechocystis sp. PCC 6803) and higher plants (Hordeum vulgare and Nicotiana tabacum) [20, 28, 31]. The PsbH subunit showed removal of Met1 with a free N-terminus, as was reported for other organisms [20, 28, 31, 32]. Although light-sensitive phosphorylation at one or two sites close to the N-terminus has been identified in higher plants, a sequence comparison of PsbH between a higher plant (Spinacia oleracea) and Galdieria revealed a lack of conservation of these phosphorylation sites, as also seen in T. elongatus. No evidence of phosphorylation of PsbH (or any other subunit of Galdieria) was detected. PsbI retained the initiating formyl-Met residue, as also observed for cyanobacteria and higher plants [20, 28, 31]. Galdieria PsbJ contains 38 amino-acid residues and is located in a gene cluster containing PsbE, PsbF and PsbL. PsbJ has Met1 removed and the N-terminus acetylated, consistent with cyanobacteria. While PsbJ was confirmed as a part of PSII in Synechocystis sp. PCC 6803 and T. elongatus, it was not detected in PSII of T. vulcanus by N-terminal sequencing [28, 32], probably because of blockage of Edman degradation by N-acetylation. In Galdieria, the first 8 residues are removed from the N-terminus of PsbK, while in T. elongatus 9 residues are removed and in N. tabacum the first 24 [28, 31]. A free amino terminus was retained in all cases. In Galdieria, PsbL is processed by removal of the initiating Met and N-terminal acetylation. In T. elongatus and H. vulgare, PsbL is unmodified, while in N. tabacum it has lost its initiating Met [20, 28, 31]. In cyanobacteria and higher plants, the PsbM subunit retains the initiating formyl-Met residue [20, 28, 31, 33], and in Chlamydomonas, Edman sequencing showed a blocked N-terminus.
While higher plant and green algal PsbM are chloroplast encoded, the Galdieria PsbM is nuclear encoded and has a 38 amino-acid transit peptide removed, leaving a free amino-terminus. Galdieria PsbT retains the initiating formyl-Met residue, in agreement with PsbTc in P. sativum and N. tabacum, and PsbT in T. elongatus [28, 31, 33]. In H. vulgare, however, N-terminal acetylation is reported. In recent structural models of dimeric PSII complexes, PsbT, along with PsbL and PsbM, is located at the PSII monomer interface. It will be of great interest to determine the oligomeric state of PSII in Galdieria, as this organism performs photosynthesis under extreme conditions. Galdieria PsbX retains the initiating formyl-Met residue and is otherwise unmodified. Mass spectrometric analysis of a protein assigned as PsbX in T. elongatus was concluded to show cleavage of the first 10 amino acids due to the presence of a lumen-targeting pre-sequence [28, 35]. The current version of T. elongatus PsbX (Uniprot Q9F1R6) does not have this extension, requiring removal of just Met1 to achieve the match. PsbZ is conserved across photosynthetic organisms, and our analysis confirmed retention of the initiating formyl-Met residue in Galdieria PsbZ, in good agreement with T. elongatus, P. sativum and N. tabacum [20, 28, 31].
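The N-terminal processing classes discussed above each produce a characteristic mass shift relative to the mass calculated from the translated gene sequence. The following is a minimal sketch of how an intact mass might be assigned to one of these classes; it is an illustration under stated assumptions, not the procedure used in this study. The mass deltas are standard monoisotopic values, the translated mass is a hypothetical placeholder, and removal of a longer N-terminal peptide (PsbK, PsbM) is omitted because it requires the actual sequence. The dictionary of Galdieria assignments simply restates the results described in the text.

```python
# Standard monoisotopic mass deltas (Da).
MET = 131.04049     # methionine residue, C5H9NOS
ACETYL = 42.01056   # N-terminal acetylation, C2H2O
FORMYL = 27.99491   # N-terminal formylation, CO

# Mass shift of each processing class relative to the translated sequence.
SHIFTS = {
    "unmodified": 0.0,
    "formyl-Met retained": FORMYL,
    "Met removed": -MET,
    "Met removed + N-acetyl": -MET + ACETYL,
}

def classify(measured, translated_mass, tol_ppm=5.0):
    """Return the processing classes consistent with a measured mass."""
    hits = []
    for name, shift in SHIFTS.items():
        expected = translated_mass + shift
        if abs(measured - expected) / expected * 1e6 <= tol_ppm:
            hits.append(name)
    return hits

# N-terminal processing of Galdieria small subunits as described in the text.
GALDIERIA_OBSERVED = {
    "PsbE": "Met removed",
    "PsbF": "Met removed + N-acetyl",
    "PsbH": "Met removed",
    "PsbI": "formyl-Met retained",
    "PsbJ": "Met removed + N-acetyl",
    "PsbK": "first 8 residues removed",
    "PsbL": "Met removed + N-acetyl",
    "PsbM": "38-residue transit peptide removed",
    "PsbT": "formyl-Met retained",
    "PsbX": "formyl-Met retained",
    "PsbZ": "formyl-Met retained",
}

# Hypothetical translated mass, NOT a value from this study.
translated = 4400.0000
formylated = classify(translated + FORMYL, translated)
acetylated = classify(translated - MET + ACETYL, translated)

print(formylated)
print(acetylated)
```

Because the four shifts differ from one another by tens of daltons, a 5 ppm tolerance on a subunit of a few kilodaltons separates the classes unambiguously at the intact-mass level; product ions are still needed to localize the modification to the N-terminus.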
Top-down high-resolution MS is clearly the technology of choice for absolute characterization of protein identity and modification. The reversed phase LC-MS+ platform presents an attractive primary separation that avoids problems associated with gels. Low-resolution MS data from the primary LC-MS+ experiment are used to drive user-directed, data-dependent top-down MS experiments. Interpretation of top-down data relies upon accurate annotation of high-quality genomic data, though it is now very easy to see where the actual primary structure diverges from that predicted by translation of the DNA sequence. The small integral and the peripheral subunits of Galdieria PSII have been characterized in unprecedented detail, emphasizing the amenability of membrane proteins to analysis by top-down MS.
The authors congratulate Neil Kelleher and Fred McLafferty for the development of top-down mass spectrometry and NK for his award of the 2009 Biemann medal from the American Society for Mass Spectrometry. NIH is thanked for funds for instrument purchase (S10 RR023045). This work was supported by funds from the National Science Foundation (NSF award 0417142) to PF. APMW acknowledges support from DFG SFB TR1. We thank Dr. C. Oesterhelt for providing us with the photoautotrophically grown strains 074W and 074G.
1 In this article, "peripheral subunits" refers to all of the subunits PsbO, PsbP, PsbQ, PsbQ’, PsbU, PsbV, Psb27 and Psb28, whereas "OEC stabilizing subunits" refers specifically to these subunits excluding Psb27 and Psb28.