This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
The α-Proteobacteria are capable of interaction with eukaryotic cells, with some members, such as Ochrobactrum anthropi, capable of acting as human pathogens. O. anthropi has been the cause of a growing number of hospital-acquired infections; however, little is known about its growth, physiology and metabolism. We used proteomics to investigate how protein expression of this organism changes with time during growth.
This first gel-based liquid chromatography-mass spectrometry (GeLC-MS) temporal proteomic analysis of O. anthropi led to the positive identification of 131 proteins. These were functionally classified and physiochemically characterized. Utilizing the emPAI protocol to estimate protein abundance, we assigned molar concentrations to all proteins, and thus were able to identify 19 with significant changes in their expression. Pathway reconstruction led to the identification of a variety of central metabolic pathways, including nucleotide biosynthesis, fatty acid anabolism, glycolysis, TCA cycle and amino acid metabolism. In late phase growth we identified a number of gene products under the control of the oxyR regulon, which is induced in response to oxidative stress and whose protein products have been linked with pathogen survival in response to host immunity reactions.
This study identified distinct proteomic profiles associated with specific growth points for O. anthropi, while the use of emPAI allowed semi-quantitative analyses of protein expression. It was possible to reconstruct central metabolic pathways and infer unique functional and adaptive processes associated with specific growth phases, thereby resulting in a deeper understanding of the physiology and metabolism of this emerging pathogenic bacterium.
The α-Proteobacteria are a biologically diverse group with many members capable of interaction with eukaryotic cells and able to function as intracellular symbionts or as pathogens of plants and animals. Some members are important human pathogens, some can establish asymptomatic chronic animal infections, and others are agriculturally important, assisting plants with nitrogen fixation. The α-2 subgroup of the Proteobacteria contains the well-known genera Rhizobacteria, Agrobacterium, Rickettsia, Bartonella and Brucella, which include species of widespread medical and agricultural importance. A less well known member of this group is the genus Ochrobactrum, which is genetically most closely related to the genus Brucella.
Until 1998, Ochrobactrum anthropi was considered to be both the sole and type species of the genus Ochrobactrum, despite the genetic and phenotypic heterogeneity visible within isolates of the species. Subsequent analysis by Velasco et al. resulted in the description of O. intermedium as a second species. Two new species, O. grignonense and O. tritici, were isolated from soil and wheat rhizoplane systems by Lebuhn et al., and most recently, O. gallinifaecis was isolated from a chicken fecal sample, O. cytisi from nodules of Cytisus scoparius and O. pseudintermedium from clinical isolates [7,8].
Ochrobactrum species have been described as being environmentally abundant free-living α-Proteobacteria. A number of reports exist in the literature describing the use of Ochrobactrum species as either a source of biotechnologically useful enzymes [9-11] or in the detoxification of xenobiotic compounds such as halobenzoates [12-16]. The ability of Ochrobactrum species to act as legume endosymbionts in temperate genera such as Lupinus, Musa and Acacia has also recently been demonstrated [17-19].
O. anthropi has been identified in clinical samples  and has been the cause of a growing number of hospital-acquired infections usually, but not always, in immunocompromised hosts [21-25]. The organism has been found to adhere, possibly as a result of biofilm formation, to the surface of catheters, pacemakers, intraocular lenses and silicon tubing, thus representing potential sources of infection in the clinical environment [26,27]. Upon infection, O. anthropi has been shown to cause pancreatic abscess, catheter-related bacteremia, endophthalmitis, urinary tract infection and endocarditis . O. anthropi strains usually are resistant to all β-lactams, with the exception of the antibiotic imipenem. Nadjar and co-workers  demonstrated that in at least one isolate, such resistance was due to an extended spectrum β-lactamase. Other than imipenem, the most effective antimicrobial agents for treating human infection that have thus far been reported are trimethoprim-sulfamethoxazole and ciprofloxacin [23,24].
As with its closest genetically related genus, Brucella, the genomes of O. intermedium and O. anthropi are composed of two independent circular chromosomes . Recent work by Teyssier et al.  revealed an exceptionally high level of genomic diversity within Ochrobactrum species, possibly reflecting their adaptability to various ecological niches. Whilst there is currently no publicly available genome sequence data for any Ochrobactrum species, genome information does exist for 20 α-Proteobacteria species, including four species of Brucella. The availability of such information not only offers an excellent model system to study the forces, mechanisms and rates by which bacterial genomes evolve  but also to carry out functional genomic and proteomic investigations of these and closely related organisms.
Beynon identified a number of phases in the proteomic study of an organism or disease process. In the initial 'identification' phase, scientists are predominantly concerned with gaining insight into the identities of proteins present within the system with which they are working. Recently, we reported such a study of the soluble sub-proteome of O. anthropi. This allowed the identification of 249 proteins involved in a variety of essential cellular pathways, including nucleic acid, amino and fatty acid anabolism and catabolism, glycolysis, TCA cycle, pyruvate and selenoamino acid metabolism. In addition, we identified a number of potential virulence factors of relevance to both plant and human disease. This previous study is a valuable reference point for the proteome of this emerging pathogen. These types of 'identification' studies, whilst useful, tell us very little about the functional role of these proteins within cellular networks. Further developmental phases were described by Beynon, including 'characterization' proteomics, and finally 'quantitative' proteomics in which the emphasis is on the pair-wise comparison of two proteomes and the quantifying of specific proteins present. To develop further our understanding of O. anthropi we have performed a comparative and semi-quantitative proteomic analysis to identify the temporal changes in expression and abundance of proteins during growth of this organism. The soluble sub-proteome of O. anthropi grown aerobically in nutrient broth was compared at early phase and late phase growth, with 19 proteins having significant changes in their observed expression. Pathway reconstruction analysis was carried out and led to the identification of a variety of core metabolic processes, thus giving insights into the underlying physiology and biochemistry of this organism. During the late phase of growth of O. anthropi a number of gene products normally induced in response to oxidative stress were identified. These expressed gene products, part of the OxyR regulon, have been linked with pathogen survival in the host environment.
In this study we report the first gel based comparative proteomic analysis of the α-Proteobacterium O. anthropi at two distinct phases of growth. This multidimensional analysis involved the soluble sub-proteome being first separated by one-dimensional PAGE. The resultant gel was then cut into nine fractions based on the SeeBlue™ Plus 2 molecular mass markers. Each gel fraction was then trypsinized and the extracted peptides separated on a reversed phase C18 column over a 60 minute time period prior to being introduced onto the mass spectrometer. This methodology allowed the identification of a total of 131 proteins from the soluble sub-proteome under the two growth phases. This expressed gene product subset represents an estimated 3% of the total O. anthropi proteome, employing data based upon the typical predicted genome size . No data are currently available in the literature on the expected distribution of proteins within sub-proteomic fractions of O. anthropi. As a benchmark, however, a study concentrating mainly on the analysis of the cytosolic proteins of Brucella melitensis 16M, a phylogenetically closely related organism, identified 187 proteins equating to 6% of its predicted proteome [33,34].
As previously reported, due to the complex nature of the peptide mixtures to be analyzed, the separation capabilities of the liquid chromatography (LC)-mass spectrometry (MS) systems are often exceeded. In this study all peptide fractions were therefore analyzed three separate times in order to increase overall peptide identifications. Automated curation of our initial dataset by the heuristic bioinformatic tool PROVALT, along with manual curation, led to the positive identification of 89 proteins at early phase and 95 proteins at late phase growth.
Within the protein subset identified from the soluble sub-proteome, 34 proteins were uniquely identified in the early phase of growth, 55 proteins were found under both growth conditions and 40 were found to be unique to the later growth phase. The identified proteins had a wide range of physicochemical properties with respect to pI and molecular mass (Mr) (Figure 1). This two-dimensional visualization showed that the smallest protein identified in early growth was the 30S ribosomal protein S17 (Mr = 9,123 Da) whilst at the late growth condition it was the cold shock protein CSPA (Mr = 8,963 Da). The largest protein identified under both conditions was DNA directed RNA polymerase beta chain (Mr = 153,688 Da). The most acidic protein identified under both conditions was the 30S ribosomal protein S1 (pI = 4.28) while the most basic in the early growth condition was the 30S ribosomal protein S5 (pI = 10.49) and in the late growth condition was the 30S ribosomal protein S20 (pI = 11.63).
Proteins identified within the two growth conditions were quantified using the Exponentially Modified Protein Abundance Index (emPAI) and can be seen in Table 1 (for those proteins unique to early phase growth), Table 2 (for those proteins common to both growth conditions) and Table 3 (for those proteins unique to late phase growth). This method allows the quantification of individual identified proteins by utilizing database and Mascot output information, in order to give an emPAI value. The emPAI value can then be used to estimate the protein content within the sample mixture in molar fraction percentages. In addition, the fold change in expression level of proteins identified under both growth conditions can be estimated, thus giving further insights into cellular processes. The most abundant protein as calculated by molar fraction percentages under both conditions was the 30S ribosomal protein S1 (Table 2). The least abundant protein under early growth conditions was 30S ribosomal protein S17 (Table 1) and under late phase growth conditions was Valyl-tRNA synthetase (Table 3).
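The emPAI arithmetic referred to above can be sketched in a few lines. This is a minimal illustration based on the published definition of the index (emPAI = 10^PAI − 1, where PAI is the ratio of observed to observable peptides, and the molar fraction is each protein's emPAI divided by the sum over all proteins); the function names and peptide counts below are hypothetical and are not taken from the tables of this study.

```python
def empai(n_observed: int, n_observable: int) -> float:
    """Return emPAI = 10**PAI - 1, with PAI = observed / observable peptides."""
    if n_observable <= 0:
        raise ValueError("n_observable must be positive")
    return 10 ** (n_observed / n_observable) - 1


def molar_fraction_percent(empai_values: dict) -> dict:
    """Convert per-protein emPAI values into molar fraction percentages."""
    total = sum(empai_values.values())
    if total <= 0:
        raise ValueError("sum of emPAI values must be positive")
    return {name: 100 * value / total for name, value in empai_values.items()}


# Hypothetical (observed, observable) peptide counts, for illustration only.
counts = {"protein_A": (6, 12), "protein_B": (2, 20), "protein_C": (1, 25)}
values = {name: empai(obs, possible) for name, (obs, possible) in counts.items()}
mol_percent = molar_fraction_percent(values)

for name, pct in mol_percent.items():
    print(f"{name}: emPAI = {values[name]:.3f}, molar fraction = {pct:.1f}%")
```

Because the molar fractions are normalized to the sum of emPAI values in each sample, they are only comparable between samples processed and searched in the same way, which is why the approach is described as semi-quantitative.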
Proteomic analysis of the origin of the identified proteins in this study supports previous genomic studies showing that, phylogenetically, the genus Ochrobactrum is most closely related to Brucella, with 93.9% of the proteins identified having closest match to this genus. The remaining proteins were matched to other members of the α-2 subgroup of the Proteobacteria (Rhizobacteria (3.8%), Bartonella (1.5%) and Agrobacterium (0.8%)).
Of the 131 proteins detected in this study, functional roles for 125 proteins (95.4%) were known or could be predicted from database analysis. Proteins within this soluble sub-proteome were assigned to functional categories utilizing methodologies as previously described by Takami et al. and Wasinger et al. Figure 2 shows that the largest category of identified proteins under both growth conditions comprised those involved in protein synthesis (ribosomal proteins), followed by those involved in metabolism of nucleotides and nucleic acids, then those involved in metabolism of amino acids and related molecules. The remaining proteins were distributed amongst the other functional categories. The functional categories of Metabolism of nucleotides, DNA replication, RNA synthesis (elongation), Protein modification and Protein folding were present at higher levels in early growth phase compared to late phase growth. In the late phase of growth, Transport proteins, Specific pathways, Metabolism of amino acids, Protein synthesis (ribosomal proteins) and Protein synthesis (tRNA synthetases) were better represented. Furthermore, the late growth phase was the only one to have proteins present from the Adaptation to atypical conditions (2.1%) and Detoxification (4.2%) functional categories. It is worth noting that assignment of proteins to functional categories is complicated, as exemplified in the case of the Metabolism of nucleotides category, by the anaplerotic nature of bacterial enzymes, with a number of proteins that could also have been classified within the Metabolism of amino acids category.
The rapid increase in genomic data over the past decade has revealed many important aspects of microbial cellular processes; however, there are still a significant number of potential gene products for which we know nothing, save that they are classified as 'hypothetical proteins'. Indeed, within the genome sequence of B. melitensis strain 16M, the closest relative phylogenetically of O. anthropi for which genomic data are available, some 716 predicted gene products, equivalent to 22% of the total genome, are predicted to be either hypothetical proteins or proteins of unknown function. In previous work we have underlined the necessity to assign, where possible, an element of biological functionality to such gene products in order to develop both systems biology and our understanding of cellular processes within these organisms. Within the current study we have established the presence of six proteins that had previously been annotated as hypothetical conserved proteins. The identification of such proteins within the cell-extract of O. anthropi establishes the biological functionality of these 'hypothetical' predicted protein coding sequences, and once more elegantly demonstrates the potential of proteomics to validate bioinformatics predictions.
Having established the presence of such proteins and wishing to understand how they contribute to functional processes, we further examined them using NCBI BLASTp . Such an approach allows conserved domains within protein sequences to be identified and thereby enables a degree of inferred functionality. Using this methodology, however, allowed us to assign putative function to only one of these proteins, NCBI:23463995. The search identified two conserved domains, pfam 01480, GFO_IDH_MocA; Oxidoreductase family involved in utilization of NADP or NAD and COG 1748; Saccharopine dehydrogenase and related proteins involved in amino acid transport and metabolism.
Sub-cellular localization prediction tools have been used for many years to identify those proteins that are retained by and exported from cells. They may also have uses in identifying possible diagnostic and therapeutic targets as well as providing information on the functionality of a protein. In the current study a number of bioinformatics tools, including PSortB [41,42], SignalP [43,44] and SecretomeP [45,46] were utilized. These bioinformatics tools endeavor to assign a sub-cellular location for each protein. These tools use a set of descriptor rules and a variety of computational algorithms and networks to analyze a protein's amino acid composition in an attempt to identify known motifs or cleavage sites. The proteins identified in this study were separated into three groups and analyzed using the above bioinformatics tools. The groups were: those proteins only identified in early growth (bioinformatics search results can be seen in Table 1); those proteins found to be common to both growth conditions (bioinformatics search results can be seen in Table 2); and those proteins identified only at late growth phase (bioinformatics search results can be seen in Table 3). Overviews of the bioinformatic analysis on the proteins from the soluble sub-proteome of O. anthropi are shown for early growth (Figure 3), for both growth conditions (Figure 4) and for late growth (Figure 5).
Within the protein subset identified only in early growth, nine proteins were predicted to be secreted (26.5%), with six of those identified as possessing an amino-terminal signal peptide (Table 1); of those proteins common to both growth conditions, 15 were predicted to be secreted (27.3%), with five of those identified as possessing an amino-terminal signal peptide (Table 2); and of those identified only in late growth, 15 were predicted to be secreted (37.5%), with six of those identified as possessing an amino-terminal signal peptide (Table 3).
The subset of 17 proteins identified as possessing an amino-terminal signal peptide were further analyzed for the presence of lipobox, RR-motif, and signal peptide cleavage sites to allow assignment, where possible, to a particular secretion pathway [31,32] (Table 4). Of these 17 proteins, only seven had the required architecture that would allow them to be assigned to the Sec pathway (NCBI:17982453, 17984491, 17982015, 17982340, 17982154, 17983192 and 86283673). The remainder of the proteins, whilst containing the correct cleavage site for a signal peptide, did not, in fact, have the full amino-terminal architecture that would be required to allow us to classify them as secreted proteins [47,48]. This once again highlights the limitations of some of the present generation of bioinformatic tools, which presently are concentrated largely on motif-based predictors. This aptly demonstrates the absolute necessity of manual interpretation of results in order to gain any level of biological significance.
Utilizing the emPAI calculation for measuring protein abundance within our proteomic investigation allowed us to use the molar fraction percentage values for proteins common to both growth conditions; this enabled us to compare the fold change in protein expression that occurs under the two different conditions [37,49,50]. Two ranges are generally used in comparative proteomics to ascertain if the fold change in expression is significant. In the isobaric labeling technology iTRAQ™, a ≥20% change is considered significant and sufficient to take account of systematic errors; therefore, fold changes of ≥1.2 or ≤0.8 are significant, with a fold change value of 1 representing no difference in protein levels between the two states [51,52]. In the comparative two-dimensional PAGE technologies, a ≥50% change is considered significant and sufficient to take account of systematic errors; therefore, fold changes of ≥1.5 or ≤0.5 are significant, again with a fold change value of 1 representing no difference in protein levels between the two states [53,54]. The fold change in protein expression between proteins from the two growth conditions can be seen in Table 2. Taking the ≥20% cut-off value, 44 proteins significantly changed in expression; at the ≥50% cut-off value this is reduced to 19 proteins that significantly changed in expression between the two growth conditions. Utilizing the more stringent ≥50% value as a measure of differential protein expression, it can be seen that 11 proteins have much higher expression levels in the early growth condition and 6 have higher expression levels in the later growth condition (Figure 6).
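The two cut-offs described above amount to a simple classification rule around a fold change of 1. The sketch below is illustrative only: the function names, the choice of expressing the ratio as late phase over early phase, and the example values are assumptions and are not drawn from Table 2.

```python
def fold_change(early_mol_percent: float, late_mol_percent: float) -> float:
    """Fold change expressed as late / early molar fraction (assumed direction)."""
    if early_mol_percent <= 0:
        raise ValueError("early_mol_percent must be positive")
    return late_mol_percent / early_mol_percent


def classify_fold_change(fc: float, cutoff: float = 0.5) -> str:
    """Label a fold change using a symmetric cut-off around 1.

    cutoff=0.5 mirrors the >=1.5 / <=0.5 rule cited for two-dimensional PAGE;
    cutoff=0.2 mirrors the >=1.2 / <=0.8 rule cited for iTRAQ.
    """
    if fc >= 1 + cutoff:
        return "higher in late phase"
    if fc <= 1 - cutoff:
        return "higher in early phase"
    return "unchanged"


# Hypothetical fold changes, for illustration only.
for fc in (2.4, 1.3, 0.4):
    print(fc, classify_fold_change(fc), "|", classify_fold_change(fc, cutoff=0.2))
```

A value such as 1.3 shows why the choice of cut-off matters: it counts as a change under the ≥20% rule but not under the more stringent ≥50% rule used for the final list of differentially expressed proteins.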
Using the available genome sequence of B. melitensis 16M, the closest relative of O. anthropi, and assuming a high degree of synteny between the genomes of these organisms, we investigated the genomic context of each gene found to be differentially expressed in this study. In this manner, we hoped that predicted transcriptional units for these proteins would be identified, thus elevating our functional understanding of the processes occurring within the organism.
Of the 30S ribosomal proteins identified as differentially expressed, all were predicted to be transcribed independently , and four (30S ribosomal proteins S3, S5, S7 and S17) were found within the same region of the B. melitensis 16M genome. The reported differential expression of ribosomal proteins is not unusual in proteomics investigations; however, little information is available as to why certain component proteins of the 30S ribosome should be present at different levels. Of the remaining 12 proteins that were differentially expressed, the available in silico evidence suggests they are independently transcribed within the B. melitensis 16M genome .
Previously, Djordjevic et al.  reported the necessity to identify within a proteomic study three enzymes present in a particular biochemical pathway in order to definitively state that such a pathway is present and active within the system under study. Ergo, in conjunction with the pathway reconstruction tool BioCyc , we have been able to identify the following pathways: superpathway of glycolysis, pyruvate dehydrogenase and TCA cycle (10 proteins) (Additional data file 1); superpathway of glyoxylate cycle (3 proteins) (Additional data file 1); fatty acid elongation (4 proteins) (Additional data file 2); de novo purine nucleotide biosynthesis (9 proteins) (Additional data file 3); arginine biosynthesis (3 proteins). Lying outside of our stringent rules for pathway identification but nonetheless worthy of note are two additional short pathways for which two out of four proteins (non-oxidative branch of the pentose phosphate pathway) and two out of three proteins (phenylalanine biosynthesis II) were identified in the current study.
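The three-enzyme rule used here is easy to state as a filter over per-pathway counts. The following sketch applies it to the counts quoted in the paragraph above; the dictionary layout and function name are illustrative assumptions and do not represent the BioCyc tool or its interface.

```python
MIN_ENZYMES = 3  # minimum number of identified enzymes, as stated in the text

# Number of enzymes identified per pathway, as quoted in the paragraph above.
identified = {
    "glycolysis, pyruvate dehydrogenase and TCA cycle": 10,
    "glyoxylate cycle": 3,
    "fatty acid elongation": 4,
    "de novo purine nucleotide biosynthesis": 9,
    "arginine biosynthesis": 3,
    "pentose phosphate pathway (non-oxidative branch)": 2,
    "phenylalanine biosynthesis II": 2,
}


def supported_pathways(counts: dict, minimum: int = MIN_ENZYMES) -> list:
    """Return the pathways with at least `minimum` identified enzymes."""
    return sorted(name for name, n in counts.items() if n >= minimum)


for name in supported_pathways(identified):
    print(name)
```

Under this rule the two short pathways with only two identified enzymes are excluded, which matches their treatment in the text as noteworthy but outside the stringent criterion.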
At both time points it is clear, as would be expected, that central metabolic pathways such as the TCA cycle and fatty acid biosynthesis are active in O. anthropi. In addition, two key enzymes of the non-oxidative branch of the pentose phosphate pathway, transketolase and transaldolase, are also found. These enzymes are essential in the recycling of excess pentose phosphate, formed when there is high demand on NADPH2-dependent biosynthetic pathways. In addition, this pathway also provides intermediates for nucleotide biosynthesis, and indeed nucleotide biosynthetic pathways are also apparently active, as might be expected, at both sampling points. In early phase growth, enzymes necessary for peptidoglycan and lipid A biosynthesis were specifically detected, presumably due to the high demand for new cell wall and outer membrane layer components at this growth point. Whilst the enzymes essential for ribonucleotide biogenesis were present at both conditions, only in early phase growth was nucleoside diphosphate kinase, the key component for deoxyribonucleotide synthesis, detectable, indicative of a demand for components involved in DNA replication. In late phase growth, evidence for the activation of gluconeogenesis was found as a likely result of nutrient depletion, with expression of malate dehydrogenase upregulated by 2.4-fold, and the two enzymes phosphoenol pyruvate (PEP) carboxykinase and pyruvate phosphate dikinase, essential for the production of PEP from oxaloacetate and pyruvate, respectively, detectable for the first time. It is of note that this pathway is considered to be important for the establishment of host infection by certain bacteria, such as Mycobacterium bovis and Xanthomonas campestris [60,61]. Similarly, the presence of serine hydroxymethyltransferase, which converts glycine to L-serine prior to its potential conversion to pyruvate, was found only in late phase growth. Intriguingly, the enzymes involved in biosynthesis of the amino acids arginine, lysine and phenylalanine were also unique to late phase growth. Whilst this may reflect a cellular demand for these amino acids, it is of note that the co-product of arginine synthesis is fumarate, which has the dual role of being an intermediate in both the TCA cycle and gluconeogenesis.
One additional feature of late phase growth is the presence of a number of stress response proteins, particularly those of importance in oxidative stress resistance. The proteins thioredoxin (TrxC), alkyl hydroperoxide reductase (AhpC) and thiol peroxidase have been reported to be important for the survival of pathogenic bacteria within a host organism [62,63]. Indeed, these proteins have also been detected in both transcriptomic and proteomic studies investigating the role of oxidative stress in a number of important pathogenic organisms that include Escherichia coli, Candida albicans and Porphyromonas gingivalis [64-68]. The trxC and ahpC genes are subject to control by the oxyR regulon, which is induced in response to oxidative stress resulting from hydrogen peroxide and other free oxygen radicals. During this process, the regulatory protein OxyR becomes oxidized by these reactive oxygen species to form an intramolecular disulphide bond, thus allowing the activation of expression from trxC, grxA, gorA and hence other genes of the OxyR regulon (Figure 7). Glutathione is an essential element in the regulatory cycle of OxyR, which may explain the presence of N-methylhydantoinase in late phase growth, as it is an integral part of the γ-Glutamyl pathway, which is responsible for the generation of glutathione from 5-oxoproline. During oxidative stress thioredoxin is produced and in its reduced form it acts as an acceptor of oxygen radicals generated as a result of the catalytic activity of thiol peroxidase on H2O2, thus scavenging and detoxifying the damaging oxygen radicals and concomitantly forming H2O and thioredoxin disulphide. Thioredoxins may also be involved in a cascade that triggers the transcription of other detoxifying genes, as they have also been shown to interact with DNA gyrase and thus influence a multitude of transcriptional responses in the cell by increasing or decreasing DNA supercoiling. This strongly suggests that the gyrase-mediated effect of thioredoxins on gene expression is a common redox-dependent signaling pathway in bacterial adaptation. Intriguingly, AhpC has also been found to be an important conserved bacterial allergen that interacts with mammalian Toll-like receptors, specifically the MyD88 protein, thus activating innate and adaptive immune responses within the host.
The popularity of 'identification' proteomics is evident from the abundant studies reported in the literature, and their contribution to our understanding of the diversity of protein expression in cells and organisms has been immense; however, these studies have clear limitations with regard to the amount of insight they can give on the function of a system. The trend within the proteomics community has, therefore, moved from this cataloguing approach towards the development of comparative and quantitative analyses that, as outlined in the present study, give greater insights into the functional processes occurring. Combining both of these techniques with rigorous data curation and interpretation, coupled with the vast array of bioinformatics tools available to the life scientist is the only way to ensure that these processes are accurately described such that meaningful data can be provided for the wider scientific community.
In this study we were able to identify distinct proteomic profiles associated with specific growth points for the emerging nosocomial pathogen O. anthropi. For those proteins common to both growth phases, the use of emPAI allowed a semi-quantitative analysis of protein expression to be made and it was possible to reconstruct core metabolic pathways functioning within this organism. It was also possible to infer unique functional and adaptive processes associated with specific growth phases and, therefore, gain a much deeper understanding of the physiology and metabolism of this pathogenic bacterium. Of particular interest was the identification of a number of protein products involved in oxidative stress response that are known to be regulated as part of the oxyR regulon and have previously been shown to be key in pathogen survival within host environments.
All reagents were purchased from Sigma-Aldrich (Poole, UK) with the exception of mass spectrometry grade water and acetonitrile, which were purchased from Romil (Cambridge, UK) and trypsin, which was purchased from Promega (Southampton, UK).
O. anthropi UU551 was routinely maintained at 37°C on nutrient agar. Routine growth of the organism involved the inoculation of nutrient broth (50 ml in 250 ml Erlenmeyer flasks) with a loop of fresh, actively growing (16 h) culture from agar plates. Flasks were incubated aerobically at 37°C with orbital shaking at 200 rpm in an Innova™ 4230 refrigerated incubator shaker (New Brunswick Scientific, Edison, NJ, USA). Growth was monitored by the increase in culture attenuance at 600 nm.
O. anthropi cultures were harvested under two separate growth conditions, at early phase (D600 = 0.3) and at late phase (D600 = 1.2) growth, by centrifugation at 9000 × g for 10 minutes at 3-5°C. The cell pellet was weighed and resuspended in 10 mM phosphate-buffered saline (pH 7.8) at a ratio of 1 g cells to 2 ml buffer. The cells were then broken using sonication as described previously by Graham et al. . The soluble proteome fraction was isolated by centrifugation of the homogenate at 25,000 × g for 30 minutes at 3-5°C (Beckman J2-HS, Beckman Instruments, Fullerton, CA, USA) followed by ultracentrifugation at 150,000 × g for 2 hours at 3-5°C (Beckman L8-M, Beckman Instruments) to sediment the insoluble fraction. The supernatant was decanted and stored frozen in 1 ml aliquots at -70°C until required. The total soluble protein content was measured using the Bradford assay .
An aliquot of the supernatant was diluted ten-fold with deionised water; 10 μl of this diluted sample was added to 10 μl Tris-Glycine SDS sample loading buffer (Invitrogen, Paisley, Renfrewshire, UK) and boiled for 5 minutes. The samples (20 μl; 100 μg total protein) were loaded onto a 1 mm thick Nu-Page 4-12% Bis-Tris gel (Invitrogen). SeeBlue™ Plus 2 (Invitrogen) was used as a protein molecular mass marker. The gel was electrophoresed, using MES SDS running buffer, in an X-Cell II mini gel system (Invitrogen) at 200 V, 120 mA, 25 W per gel for 35 minutes. Proteins were visualized using SimplyBlue™ Safestain (Invitrogen). The entire lane was excised from the gel and cut into nine fractions based on molecular mass as previously described by Graham et al. .
Excised gel fractions were washed for 30 minutes in 200 mM NH4HCO3, pH 7.8 at 37°C. These fractions were then dehydrated by incubation for 30 minutes in 200 mM NH4HCO3 pH 7.8/MeCN (4:6 v/v) at 37°C, followed by rehydration for 30 minutes in 50 mM NH4HCO3, pH 7.8 at 37°C. Following incubation in 100% acetonitrile for 2 minutes, 0.1 μg trypsin in 50 mM NH4HCO3, pH 7.8 was added to each sample, which was then incubated overnight at 37°C. The supernatant was subsequently recovered into microcentrifuge tubes and a second peptide extraction from these gel pieces was carried out (0.1% trifluoroacetic acid (TFA) in 60% acetonitrile for 5 minutes). Peptide-containing liquid fractions were pooled, dried under vacuum and re-suspended in 20 μl 0.1% formic acid in 2% acetonitrile prior to storage at -70°C until required.
Mass spectrometry was performed using a 3200 Q-TRAP Hybrid ESI Quadrupole linear ion trap mass spectrometer, ESI-Q-q-Qlinear ion trap-MS/MS (Applied Biosystems/MDS SCIEX, Toronto, Canada) with a nanospray interface, coupled with an online Ultimate 3000 nanoflow liquid chromatography system (Dionex/LC Packings, Amsterdam, The Netherlands). A μ-Precolumn™ Cartridge (300 μm × 5 mm, 5 μm particle size) was placed prior to the C18 capillary column (75 μm × 150 mm, 3 μm particle size) to enable desalting and filtering. Both columns contained the reversed phase material PepMAP™ 100 (C18 silica-based) with a 100 Å pore size (Dionex/LC Packings). The elution buffers used in the gradient were Buffer A (0.1% formic acid in 2% acetonitrile) and Buffer B (0.1% formic acid in 80% acetonitrile). The nanoLC gradient used was 60 minutes in length: 0-55% B in 45 minutes, 10 minutes at 90% B followed by 5 minutes at 100% A. The flow rate of the gradient was 300 nl min-1. The detector mass range was set at 400-1,800 m/z. MS data acquisition was performed in positive ion mode. During MS acquisition, peptides with 2+ and 3+ charge states were selected for fragmentation.
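The gradient programme described above can be written out as a simple piecewise function. The following Python sketch is illustrative only: it assumes a linear ramp from 0 to 55% B over the first 45 minutes and instantaneous step changes between segments, neither of which is stated explicitly in the text, and the function names are hypothetical. It also shows the nominal acetonitrile content implied by mixing Buffer A (2% acetonitrile) with Buffer B (80% acetonitrile).

```python
# Illustrative sketch of the 60-minute nanoLC gradient described in the text.
# Assumptions (not stated in the paper): the 0-55% B segment is linear and
# the changes between segments are instantaneous steps.

GRADIENT_MINUTES = 60.0
ACN_IN_BUFFER_A = 2.0   # % acetonitrile in Buffer A
ACN_IN_BUFFER_B = 80.0  # % acetonitrile in Buffer B


def percent_b(t_min: float) -> float:
    """Return the nominal percentage of Buffer B at time t_min (minutes)."""
    if not 0.0 <= t_min <= GRADIENT_MINUTES:
        raise ValueError("time must lie within the 60-minute gradient")
    if t_min <= 45.0:
        return 55.0 * t_min / 45.0  # 0-55% B in 45 minutes
    if t_min <= 55.0:
        return 90.0                 # 10 minutes at 90% B
    return 0.0                      # 5 minutes at 100% A


def acetonitrile_percent(b_percent: float) -> float:
    """Nominal % acetonitrile of the A/B mixture for a given % B."""
    return ACN_IN_BUFFER_A + (ACN_IN_BUFFER_B - ACN_IN_BUFFER_A) * b_percent / 100.0


if __name__ == "__main__":
    for t in (0.0, 22.5, 45.0, 50.0, 58.0):
        b = percent_b(t)
        print(f"t = {t:5.1f} min  B = {b:5.1f}%  MeCN = {acetonitrile_percent(b):5.1f}%")
```

Under these assumptions the end of the ramp (55% B) corresponds to roughly 45% acetonitrile, and the final segment returns the column to the starting conditions of Buffer A.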
Protein identification was carried out using an internal MASCOT server (version 1.9; Matrix Science, London, UK) searching against the bacteria sub-set of the MSDB database (latest version at the time of processing). Peptide tolerance was set at ± 1.2 Da with MS/MS tolerance set at ± 0.6 Da and the search set to allow for one missed cleavage. In order to expedite the curation of the identified protein list from MASCOT, the result files were re-analyzed against an extracted database comprising eleven α-proteobacterial genome databases downloaded from NCBI, using the heuristic protein validation tool PROVALT. This automated program takes large proteomic MS datasets and reorganizes them by taking multiple MASCOT results and identifying those peptides that match. Redundant peptides are removed and related peptides are grouped together with their predicted matching protein; the program thus dramatically shortens this portion of the curation process. For identification purposes the minimum peptide length was set at 6 amino acids, the minimum peptide MOWSE score was set at 25 and the minimum high quality peptide MOWSE score was set at 40. PROVALT also uses peptide matches from a random database (in this case the extracted α-proteobacterial database was randomized) to calculate false-discovery rates (FDRs) for protein identifications as previously described by Weatherley et al. Briefly, identifications from searching the normal and random databases are used to calculate the FDRs and set score thresholds, so as to identify as many 'actual' proteins as possible while encountering a minimal number of false-positive protein identifications. Rather than calculate error rates at the peptide level, the FDR calculations employed by PROVALT provide a reasonable balance between the number of correct and incorrect protein assignments. In this study the FDR was set at 1%, meaning that 99% of the reported proteins identified should be correct.
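The general idea of estimating an FDR from normal and randomized database searches can be sketched as follows. This Python example is not PROVALT's implementation, whose exact calculation is not given here; it assumes the simplest common estimate, namely the number of protein hits from the random database divided by the number from the normal database at a given score threshold. The function names and all scores are hypothetical.

```python
# Minimal sketch of a protein-level FDR estimate from normal ("target") and
# randomized ("decoy") database searches. Assumption: FDR is approximated as
# (random-database hits) / (normal-database hits) at a score threshold.
# This is NOT the PROVALT code; names and scores are hypothetical.
from typing import Optional, Sequence


def protein_fdr(target_scores: Sequence[float],
                decoy_scores: Sequence[float],
                threshold: float) -> float:
    """Estimated FDR among target proteins scoring at or above threshold."""
    n_target = sum(1 for s in target_scores if s >= threshold)
    n_decoy = sum(1 for s in decoy_scores if s >= threshold)
    if n_target == 0:
        return 0.0  # nothing reported, so nothing can be false
    return n_decoy / n_target


def lowest_threshold_for_fdr(target_scores: Sequence[float],
                             decoy_scores: Sequence[float],
                             max_fdr: float = 0.01) -> Optional[float]:
    """Lowest observed score that keeps the estimated FDR <= max_fdr."""
    candidates = sorted(set(target_scores) | set(decoy_scores))
    for threshold in candidates:  # ascending: most permissive first
        n_target = sum(1 for s in target_scores if s >= threshold)
        if n_target > 0 and protein_fdr(target_scores, decoy_scores, threshold) <= max_fdr:
            return threshold
    return None  # no threshold satisfies the requested FDR


if __name__ == "__main__":
    # Hypothetical protein scores, for illustration only.
    normal_db = [30.0, 40.0, 50.0, 60.0, 70.0]
    random_db = [25.0, 35.0]
    print(lowest_threshold_for_fdr(normal_db, random_db, max_fdr=0.01))
```

Choosing the lowest threshold that satisfies the FDR limit mirrors the stated aim of retaining as many 'actual' proteins as possible while keeping false positives at or below 1%.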
Proteins within the two growth conditions were quantified utilizing emPAI [37,49,50]. This method allows the quantification of individual identified proteins by utilizing database and Mascot output information, in order to give an emPAI value. The emPAI value can then be used to estimate the protein content within the sample mixture in molar fraction percentages. Also, the fold change in expression levels of proteins identified under both growth conditions can be estimated, allowing further insights into cellular processes.
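As a worked illustration of the calculation, emPAI is defined as 10^PAI - 1, where PAI is the number of observed peptides divided by the number of observable peptides for a protein, and the protein content in mol % is the emPAI of a protein divided by the sum of all emPAI values, multiplied by 100. The Python sketch below applies these definitions; the function names, protein labels and peptide counts are hypothetical and are not data from this study.

```python
# Sketch of the emPAI calculation: PAI = N_observed / N_observable,
# emPAI = 10**PAI - 1, mol % = emPAI / sum(emPAI) * 100.
# Protein labels and peptide counts below are hypothetical.
from typing import Dict


def empai(n_observed: int, n_observable: int) -> float:
    """Exponentially modified protein abundance index for one protein."""
    if n_observable <= 0:
        raise ValueError("n_observable must be positive")
    return 10 ** (n_observed / n_observable) - 1


def molar_percent(empai_values: Dict[str, float]) -> Dict[str, float]:
    """Convert emPAI values into molar fraction percentages of the mixture."""
    total = sum(empai_values.values())
    if total <= 0:
        raise ValueError("sum of emPAI values must be positive")
    return {name: 100.0 * value / total for name, value in empai_values.items()}


def fold_change(late_percent: float, early_percent: float) -> float:
    """Late/early ratio for a protein identified under both conditions."""
    if early_percent <= 0:
        raise ValueError("early_percent must be positive")
    return late_percent / early_percent


if __name__ == "__main__":
    # (observed peptides, observable peptides) per protein; illustrative only.
    early = {"protein_A": (4, 20), "protein_B": (10, 25)}
    late = {"protein_A": (12, 20), "protein_B": (10, 25)}
    early_pct = molar_percent({k: empai(*v) for k, v in early.items()})
    late_pct = molar_percent({k: empai(*v) for k, v in late.items()})
    for name in early_pct:
        print(name, round(fold_change(late_pct[name], early_pct[name]), 2))
```

Because the molar percentages are normalized within each sample, a fold change computed this way reflects a change in a protein's share of the identified proteome rather than an absolute concentration.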
Pathways were reconstructed utilizing the BioCyc database, a collection of 160 pathway/genome databases for most eukaryotic and prokaryotic species whose genomes have been completely sequenced. The BioCyc collection provides a unique resource for computational systems biology by enabling global and comparative analyses of genomes and metabolic networks. Identified proteins can be entered into the database and searched against specific species, allowing scientists to visualize combinations of gene expression maps of these organisms and thereby reconstruct the pathways that are present.
PSORTb version 2.0.4 [41,42] was used for the prediction of bacterial protein subcellular localization. SignalP 3.0 [43,44] was used to predict the presence and location of signal peptide cleavage sites in amino acid sequences, for classically secreted proteins. SecretomeP 2.0 [45,46] was used for the prediction of non-classical protein secretion (that is, protein secretion that is not triggered by signal peptides).
The following additional data are available with the online version of this paper. Additional data file 1 is a figure illustrating the superpathway of glycolysis, pyruvate dehydrogenase, TCA and the superpathway of the glyoxylate cycle. Additional data file 2 is a figure illustrating fatty acid elongation. Additional data file 3 is a figure illustrating de novo purine nucleotide biosynthesis. Within these pathways proteins unique to the early growth phase are boxed in green, those identified in both growth conditions are boxed in blue and those unique to the late growth phase are boxed in yellow.
RLJ Graham was supported by the Northern Ireland Centre of Excellence in Functional Genomics, with funding from the European Union (EU) Programme for Peace and Reconciliation, under the Technology Support for the Knowledge-Based Economy.