At its most ambitious, untargeted metabolomics aims to characterize and quantify all of the metabolites in any system. Metabolites are often present at a broad range of concentrations and possess diverse physical properties, complicating this task. Performing multiple sample extractions, concentrating sample extracts, and using several separation and detection methods are common strategies to overcome these challenges but require a great amount of resources. This protocol describes the untargeted metabolic profiling of polar and non-polar metabolites with a single extraction and a single analytical platform.
Metabolomics can be defined as the comprehensive analysis of metabolites, or small molecules, in a system. A metabolomics experiment can be either targeted, where several metabolites are chosen for analysis, or untargeted, where the metabolites analyzed are not predetermined, but instead all metabolites are analyzed, limited only by their propensity to be detected. While nuclear magnetic resonance (NMR) is often successfully employed for metabolomics studies (Smolinska et al., 2012; Sumner et al., 2010), this protocol describes a metabolomics platform based on high-pressure liquid chromatography coupled to mass spectrometry (LC-MS), and this technique will be the focus of the following discussion (Dove et al., 2012).
Targeted metabolomics platforms are typically performed on triple quadrupole mass spectrometers using multiple reaction monitoring to assess the relative levels and sometimes quantify a set of predetermined metabolites (Dudley et al., 2010). Choosing which metabolites to analyze is a subjective process and driven by hypotheses formed from previous experimental evidence. Though highly focused, targeted platforms are often quite sensitive and circumvent the need for metabolite identification post data acquisition, increasing maximum analytical throughput.
Untargeted metabolomics experiments offer a more comprehensive and unbiased approach to metabolite analysis. Analysis is often performed using separation techniques in combination with high resolution accurate mass spectrometers with the goal of detecting as many metabolites as possible. The data are then statistically analyzed to uncover the greatest and most significant metabolic perturbations. In contrast to the targeted approach, which is driven by existing hypotheses, untargeted metabolomics is used to develop new ones. Sensitivity is typically lower than that of targeted platforms, but untargeted metabolomics platforms offer unbiased insight into metabolism and the chance for discovery of novel small molecules and functions (Patti et al., 2012b).
Besides metabolite detection, the extraction and the separation of metabolites prior to MS analysis are of equal importance and both offer a unique set of challenges. Metabolite extraction is complicated by the extreme concentration range and great physical diversity of metabolites often present in the metabolome. Concentrating sample extracts and developing separate extractions specific for polar and non-polar metabolites are common strategies to overcome these challenges and increase metabolome coverage (Everroad et al., 2012; Patti, 2011). Many studies do however show that extraction of metabolites of diverse polarity can be achieved quite efficiently using solvent mixtures (Bruce et al., 2009; El Rammouz et al., 2010; Yanes et al., 2011).
Separating metabolites by high-pressure liquid chromatography (LC) prior to analysis by mass spectrometry (MS) is a widely used and effective tool for metabolomics analysis and is the technique described in this protocol. Separation by gas chromatography (GC) is also fairly common but is specific for volatile molecules and chemical derivatization may be required (Sun et al., 2012). The major challenge of metabolite separation stems mainly from their great physical diversity. The separation of non-polar metabolites, such as fatty acids and membrane lipids, is relatively straightforward, often performed with C18 LC columns (Gregory et al., 2012). The separation of polar metabolites, such as amino acids and polyamines, is a much more difficult task and more focused LC columns with polar functional groups are being employed (Patti, 2011). Mixed mode LC columns are being used to separate polar and non-polar metabolites in a single LC-MS run and this technology warrants further development (Patti, 2011; Yanes et al., 2011).
Taking into consideration the complexity of the metabolome and associated challenges, it is clear that detecting all metabolites in a system will likely require a complex platform involving several different sample extractions, methods of separation, and multiple detectors. This degree of comprehensiveness is necessary for maximum metabolome coverage, but significantly increases the time and cost associated with the analysis of each sample. In this protocol, we describe the streamlined, untargeted metabolic profiling of polar and non-polar metabolites with a single extraction, one LC separation, and one detector.
This protocol describes the simultaneous extraction of polar and non-polar metabolites from several different types of biological samples. Because the physical properties of metabolites vary greatly, we have chosen not to use an internal standard for extraction in this protocol. Without an internal standard, it is critical to treat each sample equally to reduce technical variation, increasing the demand for technical precision. It is also important to note that enzyme mediated reactions can occur very quickly and lengthy extraction procedures may affect metabolite levels. To reduce this source of variation, extractions are initially carried out quickly and with cold, organic solvents. Organic solvents are also important here to precipitate proteins and produce a clean sample. LC-MS is a highly flexible platform allowing for the analysis of a wide variety of sample types. The following protocol provides details for metabolite extractions from rat plasma, mouse C2C12 muscle cells, and whole zebrafish.
This protocol describes the simultaneous separation and detection of polar and non-polar metabolites by LC-MS using a quadrupole time-of-flight (Q-TOF) mass spectrometer. The TripleTOF 5600 (AB SCIEX) mass spectrometer used here can acquire MS/MS spectra on an information-dependent basis (IDA) during an LC-MS experiment. In this mode, the acquisition software continuously evaluates the full scan survey MS data as they are collected and triggers the acquisition of MS/MS spectra depending on preselected criteria. Both the survey data and the MS/MS spectra are acquired with high resolving power (~35,000) and high mass accuracy (≤ 2 ppm). IDA acquisitions allow highly accurate mass analysis and MS/MS experiments on thousands of ions in a short period of time, a valuable trait for metabolomic analysis. Each sample is analyzed once in positive ion mode and once in negative ion mode to maximize coverage of the metabolome. For the described applications, polarity switching during an LC run was omitted to avoid the loss of ions during the switching event.
Metabolite identification is one of the most important aspects of a metabolomics platform and greatly affects the subsequent generation of hypotheses. To minimize misidentification, metabolites are rigorously identified by accurate mass, MS/MS fragmentation pattern, isotope pattern, and when standards are available, retention time. When new chemical standards are acquired they are run on our LC-MS system and system-specific identification parameters, such as retention time and MS/MS fragmentation, are experimentally determined and stored in our in-house metabolite database. Larger scale metabolomics facilities often contain hundreds or thousands of metabolites in a database, but even at a modest 150 metabolites, our in-house library allows for the streamlined identification of many central metabolites.
Genes, transcripts, and proteins often undergo chemical modifications, such as methylation, phosphorylation and acetylation, which can greatly affect function. Post-translational modifications are dynamic and have a profound effect on protein localization, functional activity, and molecular interactions. Multi-site modifications govern cellular processes and homeostasis and are cell-type specific (Jensen, 2006). Tracking all of the regulatory modifications for the thousands of known genes, transcripts, and proteins is currently an incomprehensible feat, but to make an accurate correlation to phenotype, these changes must be taken into consideration. Metabolites are the end products of cellular metabolism and lack regulatory modifications and, as such, may offer a much easier and more accurate reflection of phenotype (Patti et al., 2012c).
While metabolomics is not a new concept, recent advances in analytical technology have allowed for the detection of thousands of m/z features in a short period of time. Each of these m/z features represents an ion with a unique m/z value and retention time. Some of these m/z features do indeed represent metabolites, but a large portion of the data obtained in a typical metabolomics LC-MS experiment are MS artifacts. In addition, many metabolites may produce more than one m/z feature, many are still uncharacterized, and the unequivocal identification of metabolites remains a major bottleneck in contemporary metabolomics (Bowen and Northen, 2010).
Because metabolites have extremely diverse physical properties, a comprehensive metabolomics platform often consists of multiple separation methods (for polar and non-polar metabolites) and multiple detection systems (Suhre et al., 2010). Multiple sample extractions are also employed to accommodate the different separation techniques for polar and non-polar metabolites. Such a platform has amazing potential for metabolome coverage and detection of novel, low-level metabolites, but is incredibly restrictive due to high cost and manpower. In this protocol, we describe a streamlined yet comprehensive metabolomics platform consisting of a single metabolite extraction, separation, and detection.
The major goal of a metabolomics-oriented sample extraction is to minimize technical variation and maximize metabolome coverage. Variation is kept to a minimum in this protocol by providing similar extraction conditions for all samples (e.g. same vortex time and speed, pipette speed, sample thaw time) and by stopping enzyme activity (quenching metabolism) quickly with cold, organic solvents. Organic solvents also serve to precipitate proteins resulting in a clean, particle-free extract. Internal standards can be highly effective when used for targeted metabolomic studies where a single class or small group of metabolites is being analyzed. We have decided not to use internal standards for several reasons. The process of adding an internal standard introduces potential sources of analytical variation such as pipetting volume. Also, it is difficult to choose a single internal standard adequate for normalizing the extraction of thousands of metabolites of great physical diversity. Metabolome coverage is maximized in this protocol by using mixtures of mild non-polar solvents (EtOH, MeOH, water) and a low solvent-to-sample ratio.
Analytical run-to-run reproducibility is usually high. To monitor run-to-run variation over the course of multiple LC-MS runs, a QC sample should be run periodically. To account for potential matrix effects stemming from different sample types, the QC sample should be composed of an equal mixture of all the extracts that will be analyzed together. Samples in the same batch should always be run in a random order to prevent artificial significance that can result from metabolite degradation or a change in instrument sensitivity over time.
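The run-order advice above can be sketched in a few lines of Python. This is only an illustration, not part of the protocol: the function name `build_run_order`, the sample labels, the QC interval of five injections, and the random seed are all assumptions chosen for the example.

```python
import random


def build_run_order(samples, qc_label="QC", qc_every=5, seed=None):
    """Shuffle the samples and insert a pooled QC injection at regular intervals.

    qc_every is a hypothetical interval; the protocol only says 'periodically'.
    """
    rng = random.Random(seed)          # seeded so the order can be reproduced
    shuffled = list(samples)
    rng.shuffle(shuffled)

    order = [qc_label]                 # begin the batch with a QC injection
    for i, sample in enumerate(shuffled, start=1):
        order.append(sample)
        if i % qc_every == 0:          # QC after every qc_every samples
            order.append(qc_label)
    if order[-1] != qc_label:          # always finish the batch with a QC
        order.append(qc_label)
    return order


# Hypothetical batch: two groups of five samples each
samples = [f"{group}_{i}" for group in ("control", "treated") for i in range(1, 6)]
run_order = build_run_order(samples, qc_every=5, seed=1)
print(run_order)
```

Recording the seed alongside the batch makes it possible to reconstruct the injection order later if drift in the QC injections needs to be compared against sample position.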
When extracting metabolites from sample types not described here, optimization may be necessary to reduce variation and find an acceptable metabolite concentration. As mentioned above, consistency in the extraction procedure and the rate of metabolic quenching can have a great impact on the metabolome and should be main points of focus when developing a procedure for metabolite extraction. Metabolite concentration is also important and a balance must be struck. A highly concentrated sample is ideal for the detection and characterization of low-level metabolites, but metabolites present at high levels may saturate the detector of the mass spectrometer, masking potentially important changes. Also, higher level metabolites may suppress the ionization of other molecules. Extracting metabolites with a mixture of solvents is a commonly used technique to increase metabolite extraction, and the extraction solvent used in this protocol (50:50 (v:v) MeOH:EtOH) was adapted from Bruce et al., where this mixture was determined optimal for the extraction of metabolites from plasma (Bruce et al., 2009). Also of note regarding plasma sample preparation, EDTA may suppress metabolite ionization and citrate is a major biological metabolite, so both should be avoided when choosing an anticoagulant. The sodium or lithium salt of heparin is a suitable choice.
LC conditions including gradient and solvent composition for metabolomic experiments are usually very general and not aimed towards enhancing the separation of any specific metabolites. The LC method described here was developed in the search for a sensitive method that was suitable for the separation of both polar and non-polar metabolites. Well-characterized metabolites of diverse polarities were used to optimize LC conditions including column, solvent composition, and elution gradient. Many metabolomics platforms use solvent modifiers, such as ammonium acetate, during positive ion analysis to increase sensitivity. We determined that with our system, 0.1% formic acid provided optimal chromatography for several polar molecules of interest and the increase in sensitivity provided by ammonium was not worth the sacrifice in separation. Figure 1 displays an experimental total ion chromatogram (TIC) obtained from rat plasma and below, normalized extracted ion chromatograms for several polar and non-polar metabolites are given.
Metabolites are initially identified based on measured mass and for this reason, consistently accurate mass measurements are crucial to the success of a metabolomics experiment. Time-of-flight (TOF) mass spectrometers are sensitive to temperature fluctuations and need to be calibrated periodically to maintain accuracy over the course of a metabolomics experiment. As the degree of temperature fluctuation is lab-dependent, so is the recommended number of calibrations per hour. For the system described here, external calibration is used. To maintain the highest degree of mass accuracy possible, this protocol describes instrument calibration before each LC-MS run, about once every hour. This process requires about two minutes. The TripleTOF 5600 (AB SCIEX) used here is specified to maintain an error in mass measurements below 2 ppm and, under this calibration regimen, we averaged 1.36 ppm error.
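For readers unfamiliar with the ppm unit, mass error is the difference between measured and theoretical m/z, divided by the theoretical m/z and multiplied by one million. The sketch below applies this to the m/z value of 166.0864 used in the phenylalanine example of this protocol. The theoretical [M+H]+ value is computed from the standard monoisotopic mass of phenylalanine and is not taken from the source; the function name and the 2 ppm check are illustrative.

```python
def ppm_error(measured_mz, theoretical_mz):
    """Signed mass error in parts per million."""
    return (measured_mz - theoretical_mz) / theoretical_mz * 1e6


# Theoretical [M+H]+ of phenylalanine (C9H11NO2): 165.07898 + 1.00728 (proton)
PHE_MH_THEORETICAL = 166.08626
# Measured value quoted in the identification example of this protocol
PHE_MH_MEASURED = 166.0864

error = ppm_error(PHE_MH_MEASURED, PHE_MH_THEORETICAL)
within_spec = abs(error) <= 2.0   # instrument specification quoted above

print(f"mass error: {error:.2f} ppm, within 2 ppm: {within_spec}")
```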
In IDA mode, the maximum number of ions to be monitored for triggering MS/MS events can greatly influence the information content of the dataset. While it is desirable to maximize the number of MS/MS events to increase the amount of product ion data gathered, the duty cycle time must be kept low to ensure a sufficient number of data points across chromatographic peaks. For the LC conditions provided in this protocol, we used an IDA method composed of a survey scan of 0.25 seconds and 4 MS/MS events with product ion accumulation times of 0.17 seconds each, resulting in a duty cycle time of 0.98 seconds. It should be noted that mass accuracy is impacted by ion accumulation times. When the accumulation time is set too low, few spectra are averaged, leading to poor mass accuracy. Using the above described IDA settings, 8–10 data points across chromatographic peaks were achieved, providing quantitative MS data of sufficiently high quality for downstream statistical evaluation and concurrent metabolite identification.
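The trade-off between MS/MS events and data points per peak is simple arithmetic, sketched below. Note that the accumulation times alone sum to 0.93 seconds; the remaining 0.05 seconds of the reported 0.98 second cycle is assumed here to be instrument overhead, which the protocol does not state explicitly. The peak widths of 8 and 10 seconds are likewise assumptions, chosen because they are consistent with the 8–10 data points reported.

```python
def summed_accumulation(survey_s, n_msms, msms_s):
    """Sum of survey and MS/MS accumulation times for one IDA cycle (seconds)."""
    return survey_s + n_msms * msms_s


def points_across_peak(peak_width_s, cycle_s):
    """Number of complete cycles (survey data points) that fit across one peak."""
    return int(peak_width_s // cycle_s)


accumulation = summed_accumulation(survey_s=0.25, n_msms=4, msms_s=0.17)
REPORTED_CYCLE_S = 0.98   # duty cycle time reported in the protocol

print(f"summed accumulation time: {accumulation:.2f} s")
for width in (8.0, 10.0):  # assumed chromatographic peak widths in seconds
    n = points_across_peak(width, REPORTED_CYCLE_S)
    print(f"peak width {width:.0f} s -> {n} data points")
```

Doubling the number of MS/MS events at the same accumulation time would lengthen the cycle and roughly halve the number of points across each peak, which is the constraint described above.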
The analysis of untargeted metabolomics datasets should be unbiased and aim to uncover the largest and most statistically significant changes. Principal component analysis (PCA) is a mathematical procedure allowing for the analysis of multiple groups and is commonly performed on metabolomic datasets to visualize and compare sources of variation (e.g. technical vs. biological, inter- vs. intra-group) and to reveal groupings. Principal component analysis-discriminant analysis (PCA-DA) takes into account sample groupings and acts to minimize intragroup variation. Figure 2 displays the scores plot from a PCA-DA of muscle cells at several different timepoints after treatment with xanthohumol, a flavonoid from hops. The three timepoints, 15, 45, and 90 minutes, are clearly separated suggesting that xanthohumol is acutely affecting the muscle cell metabolome (Fig. 2). Also of note, the quality control (QC) replicates are grouped together indicating little analytical variance (Fig. 2). There are several data pretreatment methods used for PCA which can greatly affect the results and the most suitable choice is dependent on experimental design and properties of the dataset (van den Berg et al., 2006). Pareto scaling is preferentially used here as it tends to reduce the relative importance of large values relative to other data pretreatment methods (van den Berg et al., 2006).
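As a brief illustration of the pretreatment step, Pareto scaling mean-centers each feature and divides by the square root of its standard deviation, whereas autoscaling divides by the standard deviation itself (van den Berg et al., 2006). The sketch below uses only the Python standard library; the two intensity vectors are invented for the example and do not come from the dataset shown in Figure 2.

```python
import math
import statistics


def pareto_scale(values):
    """Mean-center a feature and divide by the square root of its standard deviation."""
    mean = statistics.mean(values)
    sd = statistics.stdev(values)
    if sd == 0:
        return [0.0 for _ in values]   # constant feature carries no variation
    return [(v - mean) / math.sqrt(sd) for v in values]


def autoscale(values):
    """Mean-center a feature and divide by its standard deviation (unit variance)."""
    mean = statistics.mean(values)
    sd = statistics.stdev(values)
    if sd == 0:
        return [0.0 for _ in values]
    return [(v - mean) / sd for v in values]


# Hypothetical peak intensities for a low-level and a high-level feature
low_feature = [10.0, 12.0, 11.0, 13.0]
high_feature = [1000.0, 1400.0, 1200.0, 1600.0]

for name, feature in (("low", low_feature), ("high", high_feature)):
    raw_sd = statistics.stdev(feature)
    pareto_sd = statistics.stdev(pareto_scale(feature))
    auto_sd = statistics.stdev(autoscale(feature))
    print(f"{name}: raw sd={raw_sd:.2f}, pareto sd={pareto_sd:.2f}, autoscaled sd={auto_sd:.2f}")
```

After Pareto scaling, the standard deviation of each feature equals the square root of its original standard deviation, so high-level features still carry more weight than low-level ones, but far less than in the raw data.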
It should be noted, however, that the data pretreatment method utilized is less important to the metabolomics platform described here, where the importance or significance of metabolites is determined not by PCA, but instead by univariate statistics. When comparing two groups, performing a Student’s t-test for all peaks and first identifying those with low p-values and large fold-changes is a simple but effective strategy for analyzing metabolomics datasets. To visualize these data, a volcano plot can be produced by plotting p-value against log(fold-change) for all peaks (Fig. 3). A recently developed visualization tool for metabolomics datasets, termed the cloud plot, also incorporates m/z value and retention time information for all peaks (Patti et al., 2012a).
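The coordinates of a volcano plot can be computed as sketched below. The sketch assumes the common convention of plotting -log10(p-value) against log2(fold-change), which may differ from the axes used in Figure 3, and it takes p-values as given (for example, from a t-test routine such as `scipy.stats.ttest_ind`). The peak names, group means, p-values, and cutoffs are hypothetical.

```python
import math


def volcano_point(mean_control, mean_treated, p_value):
    """Return (log2 fold-change, -log10 p-value) for one peak."""
    log2_fc = math.log2(mean_treated / mean_control)
    neg_log10_p = -math.log10(p_value)
    return log2_fc, neg_log10_p


def is_candidate(log2_fc, neg_log10_p, fc_cutoff=2.0, p_cutoff=0.05):
    """Flag peaks with both a large fold-change and a low p-value (illustrative cutoffs)."""
    return abs(log2_fc) >= math.log2(fc_cutoff) and neg_log10_p >= -math.log10(p_cutoff)


# Hypothetical peaks: (name, mean intensity control, mean intensity treated, p-value)
peaks = [
    ("peak_A", 100.0, 400.0, 0.001),   # large change, significant
    ("peak_B", 100.0, 110.0, 0.001),   # significant but small change
    ("peak_C", 100.0, 25.0, 0.20),     # large change but not significant
]

for name, ctrl, trt, p in peaks:
    x, y = volcano_point(ctrl, trt, p)
    print(f"{name}: log2FC={x:.2f}, -log10(p)={y:.2f}, candidate={is_candidate(x, y)}")
```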
Some metabolites appear more abundant as adducts (+Na, +NH4) or in-source fragments (loss of H2O, loss of CO2) than as parent ions and this should be kept in mind during the metabolite identification process. METLIN online metabolite database allows the user to choose which modifications to include for metabolite searches and HMDB automatically searches for multiple modifications to a chosen mass.
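To make the adduct and in-source fragment forms concrete, the sketch below computes the m/z values expected in positive ion mode for a singly charged ion of a given neutral monoisotopic mass. The mass shifts are standard monoisotopic values rather than values taken from this protocol, the table of ion forms is illustrative and far from exhaustive, and phenylalanine is used only as a familiar example.

```python
PROTON = 1.007276      # mass of a proton (Da)
WATER = 18.010565      # monoisotopic mass of H2O (Da)
CO2 = 43.989829        # monoisotopic mass of CO2 (Da)

# Mass shift added to the neutral monoisotopic mass M for each singly charged form
POSITIVE_ION_SHIFTS = {
    "[M+H]+": PROTON,
    "[M+Na]+": 22.989218,        # sodium minus one electron
    "[M+NH4]+": 18.033823,       # ammonium
    "[M+H-H2O]+": PROTON - WATER,
    "[M+H-CO2]+": PROTON - CO2,
}


def expected_mz(neutral_mass):
    """Expected m/z for each ion form of a compound with the given neutral mass."""
    return {form: neutral_mass + shift for form, shift in POSITIVE_ION_SHIFTS.items()}


PHENYLALANINE = 165.078979   # neutral monoisotopic mass of C9H11NO2
ions = expected_mz(PHENYLALANINE)
for form, mz in ions.items():
    print(f"{form:12s} {mz:.4f}")
```

Searching a detected m/z against each of these forms, rather than against the protonated ion alone, is what the modification options in METLIN and HMDB automate.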
The potential for metabolite misidentification is actually quite high, especially when the identification is based only on accurate mass. The greater the number of parameters used to identify a small molecule, the greater the confidence, and including MS/MS fragmentation pattern as a metabolite identification parameter greatly enhances the level of confidence. Initially matching metabolites in a database to an experimentally measured mass is not very subjective. After a mass error cutoff value is determined (here, 5 ppm), a quick search often returns a list of several metabolites and the first part of metabolite ID is complete. Unlike matching experimental m/z values, matching experimental MS/MS spectra to those in an online database is a highly subjective process and is difficult to standardize. How many fragments are required for a positive ID? How closely should fragment intensities match? These are just a couple of the many questions that remain largely unaddressed. Additionally, different instrument parameters such as collision energy and alternate solvent conditions can greatly affect MS/MS fragmentation patterns, increasing the complexity of standardized identification. The METLIN online metabolite database is arguably the most comprehensive of its kind, often containing MS/MS spectra for a single metabolite at several different fixed collision energies in both positive and negative ionization modes. In this protocol, MS/MS spectra are gathered on the fly, on an information-dependent basis. To acquire information-rich MS/MS data across the wide variety of metabolites typically present in a metabolomics sample in a single LC-MS run, we use a collision energy spread instead of a fixed value.
This allows for the collection of quality MS/MS spectra from metabolites that fragment well only under low or high energies, but since METLIN provides MS/MS spectra at fixed collision energies with no spread, there is an additional degree of complexity when comparing MS/MS spectra. Once a metabolite has been identified by accurate mass and MS/MS spectra, confidence in the identity can be increased slightly by comparing the experimental isotopic ratio to the theoretical, determined by chemical formula. PeakView software (AB SCIEX) automates this process and to compensate for error associated with low-level metabolites, we accept isotope ratio error less than 20%. Retention time is a powerful metabolite identification parameter but requires a chemical standard. When identifying metabolites of importance or MS/MS spectra matches are less than clear, the use of chemical standards for validation may be necessary.
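The isotope ratio check can be approximated by hand, as sketched below. The sketch estimates the theoretical intensity of the M+1 isotope peak relative to the monoisotopic peak with a first-order approximation (the sum over elements of atom count times the heavy-to-light isotope abundance ratio) and compares it with a measured ratio using the 20% acceptance limit given above. It is not the algorithm used by PeakView, whose internals are not described here. The abundance ratios are standard natural-abundance values, only C, H, N, and O are covered, and the measured ratios are hypothetical.

```python
# Natural-abundance ratios of the +1 isotope to the lightest isotope
M1_ISOTOPE_RATIO = {
    "C": 0.010816,   # 13C / 12C
    "H": 0.000115,   # 2H / 1H
    "N": 0.003653,   # 15N / 14N
    "O": 0.000381,   # 17O / 16O
}


def theoretical_m1_ratio(formula):
    """First-order estimate of the M+1 / M intensity ratio for an elemental formula."""
    return sum(count * M1_ISOTOPE_RATIO[element] for element, count in formula.items())


def isotope_ratio_error(measured, theoretical):
    """Relative error of the measured isotope ratio against the theoretical one."""
    return abs(measured - theoretical) / theoretical


# Protonated phenylalanine ion, C9H12NO2+
phe_ion = {"C": 9, "H": 12, "N": 1, "O": 2}
theoretical = theoretical_m1_ratio(phe_ion)

for measured in (0.095, 0.130):          # hypothetical measured M+1 / M ratios
    error = isotope_ratio_error(measured, theoretical)
    accepted = error < 0.20              # 20% limit used in this protocol
    print(f"measured {measured:.3f}, theoretical {theoretical:.3f}, "
          f"error {error:.1%}, accepted: {accepted}")
```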
An example of the metabolite identification process is displayed in Figures 4–6. Starting with an m/z value of 166.0864 in positive ion mode, a search for this value in the METLIN online database yields 12 matches within 5 ppm. Phenylalanine is the only metabolite of the 12 matches with MS/MS data. Comparing the experimental MS/MS spectrum of 166.0864, where collision energy (CE) was spread between 25 and 55 volts (V), to the MS/MS spectra of phenylalanine at fixed CEs of 20 and 40 V, a match appears evident (Fig. 4). Though the relative intensities of the MS/MS fragments do not match very well, more importantly, most of the fragments produced by CEs of 20 and 40 V are also present in the experimental spectrum (Fig. 4). After a metabolite has been matched by MS and MS/MS, confirming that the experimental isotope ratio matches the theoretical adds an extra, though minor, degree of confidence to the metabolite identification (Fig. 5). To further identify a metabolite by retention time, a chemical standard must be acquired and run on the LC-MS system (Fig. 6A). An experimental MS/MS spectrum of the phenylalanine standard, now with a CE spread, is a very close match to that obtained from the plasma sample (Fig. 6B).
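The first step of this example, the 5 ppm mass search, amounts to a window filter on candidate m/z values. The sketch below shows the arithmetic on a small in-memory list; it does not query METLIN, and apart from phenylalanine the candidate entries are hypothetical placeholders rather than the actual 12 database matches.

```python
def ppm_window(mz, ppm):
    """Lower and upper m/z bounds for a given ppm tolerance."""
    delta = mz * ppm / 1e6
    return mz - delta, mz + delta


def search_by_mass(candidates, measured_mz, ppm=5.0):
    """Return the names of candidates whose theoretical m/z lies within the window."""
    low, high = ppm_window(measured_mz, ppm)
    return [name for name, mz in candidates.items() if low <= mz <= high]


# Theoretical [M+H]+ values; only phenylalanine is a real entry
candidates = {
    "phenylalanine": 166.08626,
    "hypothetical_compound_A": 166.08700,
    "hypothetical_compound_B": 166.08850,
}

low, high = ppm_window(166.0864, 5.0)
matches = search_by_mass(candidates, 166.0864, ppm=5.0)
print(f"window: {low:.5f} to {high:.5f}")
print(matches)
```

At m/z 166 a 5 ppm window is less than 0.001 wide on either side, which is why accurate mass alone narrows the list considerably but, as the 12 matches above show, does not resolve isomers or near-isobaric compounds.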
The number of peaks detected in a single LC-MS run is highly variable, depending on the origin of the sample, method of extraction, and peak picking parameters. As a result of adducts and in-source fragmentation, most metabolites produce several peaks and anywhere up to 10,000 features is not unexpected for a single sample. After the significant metabolites have been identified, it is the researcher’s task to interpret the data, develop hypotheses, and design experiments to confirm hypotheses.
The time required for metabolite extraction is dependent on the complexity of the procedure and number of samples. For example, any one of the three extraction procedures described above can be easily completed with 20 samples in less than one day. LC-MS/MS analysis usually consumes more time and one should reserve approximately 2.8 hours/sample, 1.4 hours for positive ion mode and 1.4 hours for negative ion mode, or about 56 hours for 20 samples. Identifying metabolites is often a lengthy process and will likely require significantly more time than sample preparation or analysis, perhaps weeks.
This work was supported by the National Institutes of Health grants R21AT005294, S10RR027878 and P30ES000210.