Living systems contain a complex network of interactions that, until recently, have been studied reductively and analytically rather than with an integrative systems approach. The field of systems biology takes up this systemic approach, combining information from different domains to find new insights into the operation of biological systems. As shown in , information from genomics, transcriptomics, and proteomics is now supplemented by information provided by the relatively new field of metabolomics.
The goal of metabolomics is to provide a qualitative and quantitative description of metabolites and their connected pathways. The metabolites synthesised in a biological system are known collectively as the metabolome, and the field aims at the ‘unbiased simultaneous identification and quantification’ of the elements and pathways in the metabolome (
Fiehn 2002). The definitions for metabolomics and metabonomics have some subtle distinctions but are used interchangeably here (
http://en.wikipedia.org/wiki/Metabolomics). Metabolites are generally in the small molecule class, with current data showing a bimodal distribution in size with more than 30% in the 100–400 Da range and a similar number in the 700–900 Da range, and with physico-chemical properties that are generally distinct from xenobiotics such as drugs and toxins (
Khanna and Ranganathan 2009). An overview of the metabolomics of toxins and xenobiotics is provided in two recent reviews (
Patterson and Idle 2009,
Patterson et al. 2010a), while a third review provides a perspective on the use of mass spectrometry (
Patterson et al. 2010b). To minimise overlap with these publications, we provide here only a brief overview of the metabolomics field, data acquisition methods and data handling, and focus in later sections on current results related to radiation response.
The goals of metabolomics for identification and quantification of the metabolome are challenging, but the promise of the new science is great, largely because, as noted by (
Fiehn 2002), “Metabolites are the end products of cellular regulatory processes, and their levels can be regarded as the ultimate response of biological systems to genetic or environmental changes”. It is this suggestion that metabolomics may be among the most sensitive and accessible windows into biology that fuels current interest. Metabolomics is now one of the most rapidly evolving fields in science, as illustrated by the nearly exponential growth in the number of peer-reviewed papers shown in .
The comprehensive characterisation of the metabolome, however, is a daunting task since the endogenous metabolites vary widely in their physical and chemical properties, occurring across the range of chemical classes, which in turn makes their concurrent extraction, separation, and detection a major challenge (
Dettmer and Hammock 2004,
Dettmer et al. 2007). The explosive growth of new results is important from a comprehensive scientific perspective; however, an ultimate goal of metabolomics in a systems biology context is to be of use in a clinical and screening setting, in our case to identify exposed individuals so that informed decisions can be made on treatment for radiation exposure. Reaching this goal requires a two-step process. In the discovery phase, key metabolite signatures are determined and validated by pathway analysis. This is a top-down, non-targeted analysis that acquires and interprets huge data sets using sophisticated mathematical techniques and pathway analysis. Some of the recent promising results in the area of radiation exposure are described in the following section. When the identities of biomarkers or related groups of biomarkers are validated and in hand, it is possible to design targeted assays for diagnostic purposes, in our case, for radiation biodosimetry based on minimally invasive sampling. These bottom-up, targeted analyses can use new technologies and instrumentation that can be simpler and lower in cost than the complex state-of-the-art instrumentation used in the discovery phase. We discuss promising developments for targeted analysis in the section entitled
Large-scale screening for radiation exposure based on metabolomic biomarkers.
Recent technology advancements have led to the development of several analytical platforms for qualitative and quantitative metabolomics experimentation. While the nuclear magnetic resonance (NMR)-based approach has traditionally made important contributions to the development of the metabolomics field, it is relatively insensitive and slow and typically measures only the relatively abundant metabolites in a given biofluid (
Want et al. 2005). In some recent work, microdroplet microcoil NMR has been combined with liquid chromatography (LC) mass spectrometry (MS) (
Lin et al. 2008) to form an LC-MS-NMR platform. In this case the slower NMR analysis achieves high sensitivity by being performed off-line. Another technique, capillary electrophoresis (CE), is a fast, high-resolution method that has been employed for targeted analysis of metabolites because the separation is based on the difference in their mass-to-charge (
m/z) ratio in solution (
Schauer et al. 2005), but it cannot easily be used in discovery. The workhorse platforms remain techniques involving gas chromatography or liquid chromatography and high-resolution and/or multiple-reaction-monitoring (MRM) mass spectrometry, both for discovery and in designs for diagnostic instrumentation.
Gas chromatography coupled with mass spectrometry (GC-MS) is ideal for thermally stable metabolites, including sugars, fatty acids, amino acids, aromatic amines, and sugar alcohols, as well as many lipids. This technology has the advantage of unambiguous identification of compounds because of the availability of a comprehensive spectral library for small molecule metabolites (
Schauer et al. 2005,
Issaq et al. 2008). However, the samples must first be derivatised to be amenable to ionisation and detection by GC-MS, and this chemical process can prevent the detection of some metabolites.
LC-MS provides versatility and high selectivity by combining different chromatographic techniques, including reverse-phase, normal-phase, size-exclusion and hydrophilic interaction chromatography, with different ionisation methods, unit or accurate mass resolution, and fragmentation analysis. The use of liquid chromatography makes chemical derivatisation unnecessary (enabling the ‘dilute and shoot’ approach) and, with respect to ionisation and detection, is compatible with compounds of varying polarity such as lipids, peptides and nucleotides; it is hence a good method of choice for global metabolomics. When combined with ultra performance liquid chromatography (UPLC), MS instrumentation can typically yield 5,000 ions or more in each of the positive and negative electrospray ionisation (ESI) modes. Each of these ions will have a retention time value on the LC column, a mass-to-charge ratio (m/z), and an intensity value. Each biological sample may therefore generate about 30,000 data points (5,000 ions × 2 modes × 3 values). A typical metabolomic experiment with such methodologies can easily generate a million data points, so informatics approaches, including multivariate data analysis (MDA) methods, are utilised for data reduction.
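The data volumes quoted above follow from simple arithmetic, which may be sketched as below. The sketch assumes that the figure of 5,000 ions applies to each ESI polarity separately and that exactly three values (retention time, m/z, intensity) are stored per ion; the constant and function names are illustrative only.

```python
# Back-of-envelope data volume for a UPLC-MS metabolomics experiment.
# All figures are the approximate values quoted in the text, not measurements.
IONS_PER_MODE = 5_000   # assumed: ions detected per ESI polarity
ESI_MODES = 2           # positive and negative
VALUES_PER_ION = 3      # retention time, m/z, intensity


def points_per_sample(ions=IONS_PER_MODE, modes=ESI_MODES, values=VALUES_PER_ION):
    """Number of stored values produced by a single biological sample."""
    return ions * modes * values


def samples_to_reach(total_points, per_sample):
    """Smallest number of samples whose combined output reaches total_points."""
    return -(-total_points // per_sample)  # ceiling division on integers


per_sample = points_per_sample()
print(per_sample)                               # 30000
print(samples_to_reach(1_000_000, per_sample))  # 34
```

Under these assumptions, an experiment of only a few dozen samples already reaches the million-point scale, which is why data reduction is treated as an integral part of the workflow.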
For complete access to metabolic constituents and pathways, it is necessary to use a combination of techniques. The platform used by the analytical services company Metabolon is detailed in (
Evans et al. 2009). They broaden the range of detected metabolites by using three separate protocols on each sample: GC-MS with derivatisation, LC-MS for positive ions, and LC-MS for negative ions. The three sets of results are then critically evaluated and combined prior to data analysis.
First, preprocessing methods are used to process the spectral data through peak filtering, detection, alignment, and normalisation into a discrete peak list in which each peak is represented as a function of m/z, retention time, and ion abundance (peak intensity). This aids noise and data reduction as well as removal of systematic bias for downstream analysis. Several software tools have been developed for LC-MS metabolomic data preprocessing. These include MarkerLynx, Met-Align, XCMS, and MZmine. Other packages, some of them specific for LC-MS-based metabolomics, have been reviewed (
Katajamaa and Oresic 2007).
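The preprocessing steps can be illustrated with a deliberately simplified sketch. It does not reproduce the algorithms of MarkerLynx, Met-Align, XCMS, or MZmine; the `Peak` record, the greedy alignment rule, the tolerances, and the toy peak lists are all assumptions made for illustration, and peak detection from raw spectra is taken as already done.

```python
from collections import namedtuple

# Hypothetical peak record: one detected ion in one sample.
Peak = namedtuple("Peak", "mz rt intensity")


def filter_peaks(peaks, min_intensity):
    """Peak filtering: drop peaks below an assumed noise threshold."""
    return [p for p in peaks if p.intensity >= min_intensity]


def normalise(peaks):
    """Total-intensity normalisation: intensities sum to 1 within a sample."""
    total = sum(p.intensity for p in peaks)
    if total == 0:
        return list(peaks)
    return [Peak(p.mz, p.rt, p.intensity / total) for p in peaks]


def align(samples, mz_tol=0.01, rt_tol=0.2):
    """Greedy alignment across samples.

    A peak joins the first existing feature whose reference peak lies within
    mz_tol (Da) and rt_tol (min) and that has no entry yet for this sample;
    otherwise it starts a new feature.
    Returns a list of (reference_peak, {sample_index: intensity}).
    """
    features = []
    for idx, peaks in enumerate(samples):
        for p in peaks:
            for ref, row in features:
                close = abs(ref.mz - p.mz) <= mz_tol and abs(ref.rt - p.rt) <= rt_tol
                if close and idx not in row:
                    row[idx] = p.intensity
                    break
            else:
                features.append((p, {idx: p.intensity}))
    return features


def preprocess(samples, min_intensity):
    """Filter, normalise, then align a list of per-sample peak lists."""
    cleaned = [normalise(filter_peaks(s, min_intensity)) for s in samples]
    return align(cleaned)


# Toy peak lists (made-up values) for two samples.
sample_a = [Peak(181.0707, 1.20, 800.0), Peak(114.0662, 0.85, 200.0), Peak(250.0000, 3.00, 5.0)]
sample_b = [Peak(181.0712, 1.25, 300.0), Peak(114.0660, 0.90, 100.0)]

table = preprocess([sample_a, sample_b], min_intensity=10.0)
for ref, row in table:
    print(round(ref.mz, 4), ref.rt, row)
```

The result is a feature table with one row per aligned ion and one intensity per sample, which is the form of input expected by the statistical methods applied downstream.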
Following data preprocessing, multivariate data analysis and statistical methods are generally used to identify significant differences in metabolic changes between different biological groups. Since preprocessed LC-MS and microarray data share several common features, statistical methods previously developed for microarray data analysis have been utilised for difference detection. An overview of our methods for LC-MS data is presented in
Varghese et al. (2010). Notable techniques for data interpretation and visualisation include the Gene Expression Dynamics Inspector (GEDI) (
Eichler et al. 2003) originally developed for microarray data analysis, which has been used for UPLC time-of-flight mass spectrometry (TOFMS) radiation studies in (
Patterson et al. 2008) to identify subsets of dose-dependent metabolites in the cellular response to radiation. Other methods, such as Significance Analysis of Microarrays (SAM), Empirical Bayes Analysis of Microarrays (EBAM), MeltDB, the t-test, principal component analysis (PCA), independent component analysis (ICA), (orthogonal) projection to latent structures discriminant analysis ((O)PLS-DA), and soft independent modelling of class analogy (SIMCA), are widely used to reduce the dimensions of metabolomics data and identify relevant metabolites (
Yin et al. 2006,
Bao et al. 2009,
Kim et al. 2009,
Ramautar et al. 2009). Use of these techniques in discovery requires a comparative evaluation of performance, as is done in (
Bylesjo et al. 2006) for OPLS, partial least squares (PLS), OPLS-DA, partial least squares discriminant analysis (PLS-DA), and SIMCA. Each of these techniques has its relative strengths, resulting in a sometimes confusing landscape, but one in which results are supported by views from different perspectives. The overview in shows the role of these techniques in discovery. The techniques bifurcate into supervised and unsupervised classes. Supervised analysis makes use of information about the treatment that generated each sample (dose, cohort properties, etc.), while unsupervised analysis searches for clusters and discriminating properties using only the data itself. Supervised and unsupervised methods support each other in reducing the list of thousands of ions to a much smaller list with a dose response characteristic for radiation exposure.
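The distinction between supervised and unsupervised analysis can be made concrete with a small sketch. Here a Welch t-statistic, which uses the group labels, stands in for the supervised methods, and the first principal component, computed by power iteration without labels, stands in for the unsupervised ones. The intensity values are invented toy data, and neither function is meant to represent the implementations used in the cited studies.

```python
import math
from statistics import mean, variance


def welch_t(a, b):
    """Supervised: Welch's t-statistic between two labelled groups."""
    se = math.sqrt(variance(a) / len(a) + variance(b) / len(b))
    return (mean(a) - mean(b)) / se


def first_pc(rows, iterations=200):
    """Unsupervised: first principal component by power iteration.

    rows holds one list of ion intensities per sample.
    Returns (loadings, scores) for the leading component.
    """
    n, m = len(rows), len(rows[0])
    col_means = [mean(r[j] for r in rows) for j in range(m)]
    centred = [[r[j] - col_means[j] for j in range(m)] for r in rows]
    cov = [[sum(c[i] * c[j] for c in centred) / (n - 1) for j in range(m)]
           for i in range(m)]
    v = [1.0] * m
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(m)) for i in range(m)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    scores = [sum(c[j] * v[j] for j in range(m)) for c in centred]
    return v, scores


# Toy intensities (made-up): rows are samples, columns are three ions.
# Ions 0 and 2 respond to exposure; ion 1 does not.
control = [[1.0, 5.0, 2.0], [1.2, 5.3, 2.1], [0.9, 4.8, 1.9]]
exposed = [[3.0, 5.1, 4.0], [3.3, 4.9, 4.2], [2.9, 5.2, 3.8]]
n_ions = 3

# Supervised ranking: ions ordered by |t| between exposed and control.
t_stats = [welch_t([r[j] for r in exposed], [r[j] for r in control])
           for j in range(n_ions)]
ranked = sorted(range(n_ions), key=lambda j: abs(t_stats[j]), reverse=True)

# Unsupervised view: PC1 scores computed with no knowledge of the labels.
loadings, scores = first_pc(control + exposed)

print("ions ranked by |t|:", ranked)
print("PC1 loadings:", [round(x, 3) for x in loadings])
print("PC1 scores:", [round(s, 2) for s in scores])
```

In this toy case both views point to the same ions: the non-responding ion ranks last by t-statistic and carries the smallest loading on the first component, while the scores on that component separate the two groups without the labels having been supplied.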
The final step in the metabolomics workflow involves mass-based metabolite identification and validation. Several databases are available to retrieve putative identifications (
Smith et al. 2005,
Wishart et al. 2007,
2009,
Cui et al. 2008). However, a mass-based match alone is insufficient for unambiguous metabolite identification. Therefore, further validation is needed, involving comparison of fragmentation spectra and retention time with those of an authentic standard, combined with an understanding of metabolic pathways, to select the correct chemical identity and to eliminate variable responses such as those related to gut flora and diet.
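Why a mass match yields only a putative identification can be seen in a minimal sketch of a database lookup. The four-entry dictionary below is a hypothetical stand-in for the databases cited above, the masses are approximate monoisotopic values, the 10 ppm tolerance is an assumed setting, and only the protonated ion [M+H]+ is considered.

```python
PROTON_MASS = 1.007276  # Da, approximate mass of a proton

# Hypothetical mini-database: name -> approximate neutral monoisotopic mass (Da).
TOY_DATABASE = {
    "glucose": 180.06339,
    "fructose": 180.06339,   # isomer of glucose, identical elemental formula
    "citric acid": 192.02700,
    "creatinine": 113.05891,
}


def ppm_error(observed, theoretical):
    """Signed mass error in parts per million."""
    return (observed - theoretical) / theoretical * 1e6


def putative_ids(observed_mz, database=TOY_DATABASE, tol_ppm=10.0):
    """Names whose [M+H]+ m/z lies within tol_ppm of the observed m/z."""
    hits = []
    for name, neutral_mass in database.items():
        theoretical_mz = neutral_mass + PROTON_MASS
        if abs(ppm_error(observed_mz, theoretical_mz)) <= tol_ppm:
            hits.append(name)
    return sorted(hits)


print(putative_ids(181.0712))  # ['fructose', 'glucose']
print(putative_ids(114.0662))  # ['creatinine']
print(putative_ids(150.0000))  # []
```

The first query returns two isomers that no mass tolerance can separate, which is the situation in which fragmentation spectra and retention time from an authentic standard become necessary.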
Because of the explosion of results in metabolomics shown in , it would be counter-productive to attempt a survey of the entire field. Some notable examples of successful use of metabolomics are given in (
Spratlin et al. 2009) that range from discovery to clinical applications in oncology. They indicate that metabolomic analyses can often be performed non-invasively in vivo and usefully imaged by techniques such as [18F] 2-fluoro-2-deoxy-D-glucose positron emission tomography (FDG-PET) and magnetic resonance spectroscopy. We believe that the same utility is within reach for radiation exposure biodosimetry.