The ‘omics’ approaches – genomics, proteomics and metabolomics – are based on high-throughput, high-information-content analysis. Rather than targeting one or a few analytes, these approaches provide a holistic understanding of the composition of a sample. They have revolutionized sample-analysis and data-processing protocols. In metabolomic studies, hundreds of small molecules are simultaneously analyzed using analytical platforms (e.g., gas chromatography-mass spectrometry (GC-MS) or liquid chromatography coupled to tandem mass spectrometry (LC-MS2)). This philosophy of holistic, high-throughput, high-information-content analysis offers several advantages. In this article, we compare the conventional analytical approach of one or a few analytes per sample to the LC-MS2-based metabolomics-type approach in the context of pharmaceutical and environmental analysis.
Genomics and proteomics studies have become a common tool in many research areas. In recent years, a new ‘omics’ discipline – metabolomics – has been gaining popularity due to its applications in diverse fields (e.g., functional genomics, drug discovery, toxicology, environmental science, nutritional science and disease diagnosis) [1–4]. In addition to their significance in understanding biology, the advances in ‘omics’ technologies have changed the way we now obtain, analyze and present data from biological samples. These approaches have significantly increased the rate of sample screening as well as the information content in the data obtained. These holistic approaches, applied together or individually, can provide novel biological insights that can often be missed with conventional highly-targeted sample-analysis approaches.
Currently, holistic analysis is mainly restricted to biochemical research. However, no matter the discipline, the advantages of screening a sample for hundreds of molecules simultaneously are enormous. Hence, the concept needs to be tested and applied to other fields, such as pharmaceutical and environmental analysis.
Though the analytical platforms used for genomics and proteomics are relatively specialized, metabolomics uses widely available analytical platforms such as gas chromatography with mass spectrometry (GC-MS), liquid chromatography with MS (LC-MS) and nuclear magnetic resonance (NMR) spectroscopy. Hence, the application of metabolomics to pharmaceutical and environmental analyses, which utilize the same analytical platforms, should be relatively easy and would provide several advantages (Table 1). An excellent example of the application of a metabolomics-type study to pharmaceutical analysis is the LC with tandem MS (LC-MS2) method developed by Kirchherr et al. for simultaneous analysis of 48 drugs in human serum.
We provide a critical overview of the metabolomics approach and its potential for pharmaceutical and environmental analysis. The metabolomics approach discussed here is related to the determination of the pharmaceuticals and environmental chemicals in a sample and should not be confused with, for example, environmental metabolomics, which deals with the effect of environment on the biochemical processes in organisms. Thus, the metabolomics applications to analyze the pharmaceutical composition of a sample and environmental metabolomics differ but can be complementary.
Metabolomics and its applications in various research areas have been extensively reviewed [6,7]. Here, we provide just a brief overview in order to lay down a foundation for our discussion of applications of metabolomics approaches to pharmaceutical and environmental analyses. Metabolomics involves analysis of all or a large number of cellular metabolites. It is usually used to compare the metabolite levels in organisms under a given condition (e.g., comparison of mutants to wild-type strains). A typical metabolomics study involves sample collection and quenching, metabolite extraction, instrumental analysis, and data processing and statistical interpretation.
Sample preparation for metabolomics analysis requires unbiased extraction of metabolites so that those with a wide range of physicochemical properties are detected and quantitated. For determination of metabolite levels two major approaches are commonly used:
As the name indicates, targeted analysis involves quantitation of a predefined set of metabolites whose chemical identities are known. A well-defined method is required for performing targeted analysis. This approach is based on a specific selected reaction monitoring (SRM) transition developed for each targeted metabolite. Hundreds of these SRMs are then monitored by MS in a single separation.
On the other hand, non-targeted analysis involves analysis of metabolites without prior knowledge of their identities.
Both approaches have advantages and disadvantages, which will be discussed in the following sections. For both approaches, several analytical platforms, such as NMR, GC-MS, LC-MS and capillary electrophoresis-MS (CE-MS), are used. However, LC-MS is the most commonly used, due to its sensitivity, its suitability for detecting highly polar molecules (e.g., nucleotides) and its ability to provide structural information.
The raw data obtained in metabolomic studies normally require pre-processing before being subjected to statistical analysis. Software tools are used to correct the data for background noise, align peaks and normalize intensities. The corrected data are then subjected to supervised and unsupervised multivariate statistical analysis (e.g., Analysis of Variance (ANOVA), Principal Component Analysis (PCA), hierarchical clustering (HC), self-organizing maps (SOMs), partial least squares (PLS), and Genetic Algorithm-Discriminant Function Analysis (GA-DFA)).
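As a concrete illustration, two of the pre-processing steps mentioned above — intensity normalization and transformation — can be sketched in a few lines of code. This is a minimal, hypothetical example (the peak table and function names are invented for illustration), not a substitute for dedicated metabolomics software:

```python
# Minimal sketch of two common pre-processing steps applied before
# multivariate analysis: total-intensity normalization (to correct for
# differences in sample loading) and log transformation (to reduce the
# dominance of high-abundance features). Data are invented.

import math

# rows = samples, columns = detected features (peak intensities)
peak_table = [
    [1000.0, 250.0, 50.0],
    [2000.0, 480.0, 120.0],  # e.g., a sample twice as concentrated overall
]

def normalize_total_intensity(table):
    """Scale each sample so that its intensities sum to 1."""
    return [[x / sum(row) for x in row] for row in table]

def log_transform(table, eps=1e-9):
    """Apply a log10 transform; eps avoids log(0) for absent features."""
    return [[math.log10(x + eps) for x in row] for row in table]

normalized = normalize_total_intensity(peak_table)
transformed = log_transform(normalized)
```

After normalization, the two samples become directly comparable despite their overall concentration difference, which is the point of this step before PCA or clustering.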
The main objective of metabolomic studies is to identify the differences in metabolite content of biological samples. Typically, the relative quantitation of a large number of metabolites belonging to diverse chemical classes (e.g., amino acids, carbohydrates, carboxylic acids, nucleotides and their precursors and co-factors, such as glutathione) is achieved from a single sample. Similarly, in pharmaceutical and environmental studies, complex matrices containing diverse chemical components (e.g., drugs and their metabolites, toxic chemicals, and pesticides) are analyzed. Hence, performing metabolomics-type analysis of these complex matrices would present significant challenges for sample preparation, analysis, data processing and interpretation. We will discuss each of the sample-processing steps in the context of their applications to pharmaceutical and environmental analysis.
There are several differences in sample collection and processing that have to be considered for conventional versus metabolomics-type analysis (Table 2).
A metabolomics study requires rapid quenching of a sample to stop the biochemical processes, in order to obtain a snapshot of the biochemical composition at the time of sampling. For this purpose, cold (−20°C or less) organic solvent (e.g., buffered or unbuffered methanol or acetonitrile) is usually added to the sample. The stability of the collected sample is also of utmost importance, as several molecules (e.g., nucleotides and thiols) are very unstable. A metabolomics-type approach for pharmaceutical and environmental analysis would require careful consideration of the nature of the matrix and the analytes of interest. Microbes are commonly present in environmental samples and are capable of degrading or metabolizing organic chemicals, including proteins. Hence, filtration followed by storage at −20°C to −80°C, or immediate analysis, might be required. Quenching the sample with cold (−20°C or lower) organic solvent after filtration should also be considered, as enzymes from dead microorganisms might be present in the sample. The presence of reactive components (e.g., oxidizing agents, reducing agents or metals) may change the composition of the sample after collection. Proper handling of the sample is therefore necessary to avoid sample-to-sample variability and the loss of unstable and volatile components. Proper sample collection and handling is especially critical in a clinical environment, given that blood and urine samples contain numerous reactive components, including drug-metabolizing enzymes. When analyzing several analytes together, the stability of each individual analyte and its reactivity in the mixture have to be considered, and steps should be taken to protect each analyte from transformation or degradation. For example, as reported by Kirchherr et al., simultaneous analysis of 48 antidepressant and antipsychotic drugs in human serum required protection of the sample from light during collection, processing and storage, as some of the drugs were light sensitive.
A great deal of care is also required during transport of samples from the collection site to the analysis site, as is often necessary in environmental studies.
Direct analysis of a sample without any extraction can be tried in some cases to avoid loss of analytes during extraction. However, such an approach is not suitable for samples containing proteins, and matrix effects might distort the analytical make-up of the sample.
Sample-processing techniques that can be used for simultaneous analysis of large number of pharmaceuticals and environmental chemicals are summarized in Table 3.
For plasma and serum samples, protein precipitation with organic solvent might work well, as demonstrated for the analysis of a large number of pharmaceuticals in human plasma. Solid-phase extraction (SPE) or liquid-liquid extraction (LLE) might be a good choice if the analytes of interest have similar physicochemical properties. However, a simple one-step extraction may not be possible if the targeted compounds include polar, non-polar and neutral species. Such cases might require a more complex extraction scheme (e.g., sequential SPE).
Considering that there is no universal extraction method, a compromise is needed between the number of molecules to be extracted versus efficiency and reproducibility of extraction. The suitability of the extraction solvent for the MS analysis also has to be taken into account.
An analytical platform capable of detecting a large number of molecules with diverse chemical functionalities is required for metabolomics analysis. It must combine high sensitivity with a wide dynamic range. Several analytical techniques routinely used in metabolomics studies can be applied to pharmaceutical and environmental analysis, including: NMR; direct infusion MS (diMS); CE-MS; GC-MS or LC-MS [18,19]; and LC with electrochemical detection (LC-ECD). Each of these techniques can simultaneously detect hundreds of molecules in a sample and has certain advantages and disadvantages.
Careful selection of the platform is required, depending upon the nature of the sample and the targeted set of metabolites (e.g., NMR may not be a good choice for low-abundance analytes, while CE or LC combined with high-resolution MS (HRMS) would be the platform of choice if the sample were expected to contain a number of isobaric or isomeric compounds). Over the years, LC-MS has emerged as the most popular option for metabolomics due to several advantages (e.g., selectivity, sensitivity, versatility and the ability to identify unknown analytes). Due to these features, this technique may be the obvious choice for metabolomics-type pharmaceutical and environmental analysis. In the following sections, we will review the LC-MS-based approaches for highly-parallel analysis.
Though direct infusion of samples into a mass spectrometer without prior separation has been tried for the analysis of hundreds of molecules, the simultaneous presence of many analytes in the MS source usually causes ion suppression. With this approach, many low-abundance analytes may not be detected due to competition or ion suppression caused by high-abundance molecules. The purpose of the chromatographic separation in metabolomics is to reduce analyte crowding in the MS source and to separate as many analytes as possible, including isomeric or isobaric analytes, using a single analytical method.
So far, few methods capable of simultaneous analysis of a large number of analytes have been reported in pharmaceutical and environmental studies. Most of the reported analytical methods focus on one or a few analytes, and the majority use reversed-phase chromatography. Considering the relatively simple (hydrophobic) mode of interaction between the analyte and the sorbent, one would assume that a generic method using a C18 column and a solvent gradient from aqueous to organic would work for a large number of drugs. However, the literature abounds with not only different methods for the same class of drugs but also varied methods for a single analyte. A careful look at many of these methods reveals that they are not very different after all: they differ mainly in column dimensions, particle size, column packing, column manufacturer, solvent and gradient.
With the appeal and the availability of individual methods, a laboratory involved in analysis of several different classes of analytes, such as a clinical-biochemical laboratory in a hospital setting, tends to adopt numerous methods. The disadvantages of this approach are listed in Table 1. This can be exemplified with anti-HIV drugs as model analytes. There are numerous methods available in the literature for the analysis of different classes of anti-HIV drugs, either individually or simultaneously, using different column chemistries (e.g., C18, C8 or phenyl). Even with the C18 column, there are many different methods that use different solvents, different pH, different ion-pairing reagents and columns from different manufacturers [27–29]. Recently, a simple method was developed using a C18 column with a simple solvent system and a linear gradient (Fig. 1), which could be used to separate 16 different anti-HIV drugs simultaneously. This new method has replaced multiple assays previously used for individual drugs, simplifying the process and making it more cost effective. Similarly, 27 benzodiazepines and related drugs in human plasma have been analyzed simultaneously for detection of poisoning and screening for driving under the influence of alcohol and narcotics.
To increase the analytical efficiency of pharmaceutical and environmental analyses, generic separation methods aimed at simultaneous profiling of numerous compounds have to be developed. Selected reports of this type of simultaneous analysis of drugs and environmental chemicals are listed in Table 3 along with the sample-preparation and analytical methods used.
Phan-Tuan and colleagues developed a generic HPLC method for high-throughput profiling of bio-fluids. They used a short monolithic column with a rapid gradient for fast profiling of a large number of samples. If a particular sample showed interesting peaks, the same method was run with a slower gradient to identify the peaks.
In another study, a simple reversed-phase separation was used by Gutteck and Rentsch for baseline separation of 18 CNS-active drugs in serum samples.
Garratt et al. used diethylhexylamine as an ion-pairing agent for separating 119 folates, while Piraud et al. tried perfluorinated carboxylic acids for separating 76 underivatized amino acids of biological interest.
A sample expected to contain a number of stereoisomers would necessitate the use of a chiral column (e.g., as used by Desai and Armstrong for baseline separation of enantiomers of 25 amino acids and peptides).
For the separation of analytes with diverse chemical functionality and polarity, complex and complementary separation modes (e.g., hydrophilic-interaction chromatography, anion-cation exchange/hydrophilic interaction, or two-dimensional normal-phase/reversed-phase separation) can be used.
Separation methods for simultaneous analysis of hundreds of analytes from a single sample are now being developed. However, successful development and application of such methods also requires detectors capable of fast data-acquisition rates combined with high sensitivity and specificity. Targeted and non-targeted detection are the two major modes used in MS-based metabolomics.
In the targeted detection mode, the analytes to be detected are predetermined and a specific mass filter is assigned to each analyte, making it a very specific, sensitive approach. Triple-quadrupole (QqQ) instruments are very popular for targeted analysis of pharmaceuticals and environmental chemicals due to their high sensitivity and specificity. Conventionally, they are used to quantitate one analyte or a few analytes using selected ion monitoring (SIM) or SRM. As SIM is not very specific, and a metabolomics-type analysis of a large number of analytes requires a high level of specificity, SIM has been largely neglected for complex analysis. However, SRM offers high specificity and has been successfully used for quantitation of hundreds of analytes simultaneously. Modern QqQ instruments allow simultaneous measurement of as many as 300 analytes in a sample using SRM. We reported analysis of 141 analytes in a single sample using an SRM-based method. A similar approach was used by Garratt et al. to monitor 119 folate species. Fig. 2 shows the application of a targeted approach for the analysis of 17 glucocorticoids in commercial milk and eggs. However, unlike conventional analysis, developing an LC-MS2 method with so many SRMs is complicated and involves substantial pre-development work. The first step is to obtain fragmentation patterns for all the analytes of interest in both positive and negative ionization modes. These data are then used to select at least two SRMs for each analyte. Selection of multiple SRMs is required in order to find the SRM that gives the highest signal-to-noise ratio in the real sample. Careful selection of SRMs is also required for analysis of isobaric or isomeric compounds (e.g., glutamine and lysine, both with a molecular mass of 146), as these molecules often produce very similar product ions. Once the SRMs are selected, the mixture of standard compounds is analyzed to obtain the retention time for each analyte, which is necessary to develop the LC-MS2 method, as shown in Fig. 3.
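The final step above — pairing each SRM with a retention time so that only the relevant transitions are monitored in each time window — can be sketched as a simple scheduling routine. All analyte names, m/z values and retention times below are invented for illustration:

```python
# Hypothetical sketch: grouping SRM transitions into retention-time
# segments so that the instrument monitors only the transitions whose
# analytes elute in each window. Values are invented for illustration.

from collections import defaultdict

# (analyte, precursor m/z, product m/z, retention time in min)
transitions = [
    ("glutamine", 147.1, 84.0, 2.1),
    ("lysine",    147.1, 84.0, 3.4),  # isobaric with glutamine: same SRM
    ("ATP",       508.0, 136.0, 1.2),
    ("drug_X",    310.2, 148.1, 8.7),
]

def schedule(transitions, segment_min=2.0):
    """Assign each SRM to the time segment containing its retention time."""
    segments = defaultdict(list)
    for name, q1, q3, rt in transitions:
        segments[int(rt // segment_min)].append((name, q1, q3))
    return dict(segments)

for seg, srms in sorted(schedule(transitions).items()):
    print(seg, srms)
```

Note that the two isobaric amino acids share the same transition and can only be distinguished chromatographically, mirroring the glutamine/lysine example in the text.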
Good chromatographic separation and consideration of data-acquisition rates and chromatographic peak widths are required for successful development of such a method. To obtain a reproducible peak shape, at least 10 data points per peak are required. For example, if the mass spectrometer spends 0.1 s per SRM, then a 1-min-wide chromatographic peak in a segment with 60 SRMs will receive 10 data points. If the peak width is less than 1 min (which is often the case for chromatography columns with smaller diameters and smaller particle sizes), reproducibility is compromised unless the chromatographic method provides good separation. With instruments and columns now available with particle sizes of less than 2 microns, peak widths of less than 30 s can be achieved. In this case, a tandem mass spectrometer capable of scan times as short as 10 ms per SRM is required. A sample normally has to be analyzed in both positive and negative polarity modes to cover a wide range of analytes. With hundreds of analytes being monitored in each mode, it is hard to obtain a chromatographic run time of less than 30 min. Hence, each sample requires at least 1 h for analysis when scanned separately in positive and negative modes. If a study involves a large number of samples, stability becomes an issue, as it often takes more than 24 h to analyze the set while samples sit in an autosampler. A combination of a faster-scanning instrument, good chromatographic separation and automated positive/negative switching can significantly reduce the sample-analysis time. In order to account for MS-detection as well as extraction variability, at least one internal standard (preferably a stable-isotope-labeled analog) per chromatographic segment (in both positive and negative ionization modes) is required.
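The points-per-peak arithmetic in the preceding paragraph is easy to verify with a short helper. The function and numbers are illustrative only:

```python
# Back-of-the-envelope check of points-per-peak for a scheduled SRM
# method, using the numbers from the scenario in the text:
# 0.1 s per SRM, 60 SRMs in the segment, a 1-min (60 s) wide peak.

def points_per_peak(peak_width_s: float, n_srms: int, dwell_s: float) -> float:
    """Data points acquired across one chromatographic peak.

    One full cycle through all SRMs in the segment takes
    n_srms * dwell_s seconds, so the peak is sampled
    peak_width_s / cycle_time times.
    """
    cycle_time_s = n_srms * dwell_s
    return peak_width_s / cycle_time_s

print(points_per_peak(60.0, 60, 0.1))   # 60 s peak, 100 ms dwell: ~10 points
print(points_per_peak(30.0, 60, 0.01))  # 30 s peak, 10 ms dwell: ~50 points
```

The second call shows why faster scan times matter: halving the peak width while keeping a 100 ms dwell would drop below the 10-points-per-peak threshold, whereas a 10 ms dwell restores ample sampling.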
A semi-targeted approach using a constant neutral-loss scan, which is more sensitive and more specific than full-scan mode, was recently tried to detect glucuronides in human urine. This approach would be suitable if the analytes belong to a certain chemical class and lose the same neutral fragment (e.g., phosphate, sulfate or glucuronide) in the MS2 stage.
The drawback of targeted analysis is that it is not a truly holistic approach. For a sample containing numerous analytes, it is not possible to prepare SRMs for all the molecules of interest in the sample, so the analytes for which there are no SRMs in the method cannot be detected. Also, it is often not possible to predict the composition of the sample, or the sample could have unknown analytes. In such cases, a non-targeted screening approach is commonly used in metabolomics to cover a broad range of analytes and also to detect and identify unknown or novel molecules [40,41].
In this detection mode, the mass spectrometer is scanned over a set m/z range (e.g., m/z 100–1000 in both positive and negative ionization modes) to detect the majority of the molecules within the specified molecular-mass range. Data obtained in full-scan mode with low-resolution instruments are generally used for comparative profiling; for example, one can determine whether samples differ in analyte levels and composition. However, the identities of many analytes in the sample are not known, and further analysis is required. Data acquired in full-scan mode with automated MS-to-MS2 switching, using an accurate-mass instrument (e.g., qTOF or Fourier-transform MS), provide additional information on the elemental composition as well as the fragmentation pattern of the analytes, including unknown molecules. The data obtained can then be subjected to a library search (e.g., NIST or in-house libraries) to identify the compounds, or used for de novo structure elucidation. As discussed in section 3.2.2., automated positive/negative switching in full-scan mode can significantly reduce the analysis time and increase the information content of the data. Full-scan chromatographic data have to be aligned in terms of peak-retention times, and the aligned data can then be subjected to various statistical analyses. For more reproducible peak alignment, several retention-time markers should be added to each sample.
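Retention-time alignment against spiked markers, as suggested above, can be as simple as a linear correction fitted between observed and reference marker times. The sketch below is a minimal illustration with invented marker times; real alignment tools typically use more flexible (e.g., piecewise or warping) corrections:

```python
# Hedged sketch of retention-time alignment using spiked marker
# compounds: fit observed marker times to reference times with a
# least-squares line, then map any observed RT onto the reference scale.
# All retention times are invented for illustration.

def fit_linear(xs, ys):
    """Least-squares fit of y = a*x + b through the marker pairs."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    a = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / \
        sum((x - mx) ** 2 for x in xs)
    return a, my - a * mx

# observed vs. reference retention times (min) for three markers
observed  = [2.10, 10.30, 20.55]
reference = [2.00, 10.00, 20.00]

a, b = fit_linear(observed, reference)

def align(rt):
    """Map an observed retention time onto the reference scale."""
    return a * rt + b
```

Applying `align` to every detected peak in a run puts all samples on a common time axis, after which features can be matched across samples for statistical comparison.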
The unbiased detection of non-targeted analysis provides a holistic insight into the chemical nature of the sample. Such an approach is particularly important in pharmaceutical and environmental analysis as unexpected or unknown potentially toxic contaminants or drug metabolites could be detected and identified.
There are several examples of a non-targeted approach being applied in pharmaceutical and environmental analysis. Using non-targeted analysis with SPE-LC-qTOF-MS, Ibanez et al. detected six unknown compounds in environmental waters. A successful application of non-targeted analysis for rapid detection of drug metabolites in pharmaceutical studies has also been reported. The advantage of such an approach is emphasized by the fact that it can detect previously unknown metabolites of even well-studied drugs.
The utility of the UPLC-TOF-based metabolomics approach was also shown for quality control and standardization of phytopharmaceuticals.
Fig. 4 demonstrates the application of non-targeted LC-MS analysis using a reversed-phase monolithic capillary column for the detection of thousands of components in a plant-tissue extract.
The application of a non-targeted approach was also shown in screening 400 component libraries generated by combinatorial synthesis. The information obtained is helpful in optimizing drug-synthesis procedures.
Finally, a non-targeted approach is also used in the ‘General Unknown Screening’ (GUS) procedure in clinical and forensic toxicology, which involves identification of drugs, poisons and intoxicating agents in samples without prior knowledge of the analytes to be monitored.
There are therefore several areas where non-targeted profiling offers great advantages over a conventional highly-focused approach.
The handling and proper interpretation of the enormous amount of data generated by the metabolomics approach are very challenging and require sophisticated statistical and bioinformatics tools. LC-MS2 data processing involves background-noise subtraction, peak alignment, and intensity normalization and transformation before statistical analysis. Efficient and accurate analysis requires a comprehensive approach, such as that used by Bijlsma et al. for metabolomic analysis of 600 human samples. Fortunately, advances in metabolomics have promoted the development of several software algorithms for automated analysis of large data sets. The data-analysis approaches used for metabolomics have been reviewed [49,50] and will not be discussed in this article. Selected freely available bioinformatics tools, listed in Table 4, could easily be applied to pharmaceutical and environmental analysis.
With advances in technology and pressure for accelerated development of new drugs, drug-candidate screening has become a high-throughput process. Application of metabolomics combined with multivariate statistical data analysis can help to move drug candidates quickly through the development pipeline. For example, Plumb et al. reported the utility of multivariate statistical analysis (PCA) of metabolomics data (Fig. 5) for efficient detection of drug metabolites in bio-fluids without prior knowledge of the drug administered. Multivariate analysis can enhance the information extracted from metabolomics data and help in rapid, correct detection of drug metabolites, including unknown metabolites.
Application of various highly sophisticated statistical methods to the analysis of complex analytical data sets is growing rapidly [52,53], and careful selection of the proper statistical tool needs to be based on the objectives of the analysis.
Metabolomics, while providing tangible advantages over traditional approaches, also poses significant challenges. Metabolomics aims at quantitation, either targeted or non-targeted, of a large number of analytes in the same sample. This task has inherent difficulties because many metabolites in the sample are unknown and, for many known metabolites, no authentic standard is available. Targeted analysis requires availability of all the analytes of interest, which, depending on the study, could number 50–500. Obtaining the pure compounds can be difficult, as some analytes may not be commercially available while others are very costly. Targeted analysis also requires substantial method-development and validation work. Stock solutions, working solutions and mixtures of all the analytes must be prepared within a short period (2–3 days) so that all the metabolites are simultaneously available for method development, validation or quality control. If some of the analytes are unstable, fresh solutions and mixtures need to be prepared periodically.
Careful selection of optimized chromatography conditions is necessary and may be time consuming. For example, analysis of acidic, basic and neutral molecules, as well as polar and non-polar molecules present in the sample, requires screening of a number of chromatography solvents, columns and gradients to select the most efficient combination of factors affecting separation. Hence, a compromise in terms of chromatographic performance is needed, as not all the analytes may show acceptable chromatography.
Validation of the method can be complicated as all the analytes need to be validated simultaneously and the presence of reactive analytes could affect reproducibility. Multiple internal standards are needed to represent each class of analyte.
With a non-targeted approach, it is often difficult to identify the analytes; identification requires considerable complementary analytical work and, again, the availability of a large number of standard compounds.
Another challenge for metabolomics is the presence of unknowns in the sample, so a complementary research effort needs to focus on structural elucidation of the unknowns. With hundreds of analytes being analyzed in a single run, it is difficult, with current technology, to obtain faster separation times, so it is hard to achieve the high throughput required for large studies. This raises sample stability and storage issues. Manual data analysis can be complicated, and fast, proper data analysis requires novel, sophisticated computer algorithms.
Metabolomics is a promising and rapidly developing area of research. Despite current limitations, it still provides an opportunity to re-think current analytical methodology and offers researchers in pharmaceutical and environmental sciences an additional research platform that is robust and powerful.
High-throughput and high-information-content analysis is common in genomics, proteomics and now metabolomics. Metabolomics and pharmaceutical and environmental analysis currently share analytical techniques, which suggests a natural partnership.
A metabolomics approach applied to pharmaceutical and environmental analysis offers several advantages, such as low cost of analysis, high analytical efficiency and high information content in the data obtained. It also allows monitoring of unexpected and unpredicted metabolic changes regularly omitted by traditional analysis.
With constant technological advances in both analysis and data processing, metabolomics will provide an alternative platform for pharmaceutical and environmental analysis. In return, pharmaceutical and environmental research will add to metabolomics by providing more data on a wide range of known metabolites. As libraries grow and more metabolites are catalogued, a targeted metabolomics approach will take precedence.
The research in the authors’ laboratory is financially supported by the NIH/NIGMS grant R01 GM068947-01, NIH/NIAID grant 2R01AI045774-06A2, and NSF grants MCB-0312857 and MCB-0520140. The authors thank Jim Walke for reading the manuscript critically.
Sunil Bajad is a Metabolomics Specialist in Vladimir Shulaev’s laboratory at the Virginia Bioinformatics Institute, Blacksburg, VA, USA. He is working on the development of highly-parallel analytical methods for metabolomics and lipidomics. He holds a Ph.D. in pharmaceutical sciences with postdoctoral experience in pharmacogenomics and metabolomics. He is the author of a patent, a book chapter and several peer-reviewed research articles. His interests are the development of highly-parallel LC-MS-based analytical methods and the application of these methods to metabolomics and lipidomics of the malaria parasite and cancer cells.
Vladimir Shulaev is Associate Professor at the Virginia Bioinformatics Institute, Virginia Polytechnic Institute and State University, Blacksburg, VA, USA. He holds dual Ph.Ds. in Biological Sciences and Plant Biology, and heads the Biochemical Profiling Group, which focuses on developing methods for high-throughput metabolite profiling and application of metabolomics to systems biology, host-pathogen interactions and to study stress response in microorganisms, plants and animals.