Figure 1 depicts the superimposition of two representative 1H NMR spectra (projections of JRES spectra), which show the average signals from healthy (blue line) and diseased (red line) subjects. The spectra are dominated by the resonances of carbohydrates, in particular both anomeric forms of glucose (~3–5.5 ppm, Gluc), and of some intermediate metabolites of the glycolytic pathway, such as lactate (δ 1.33 and 4.11 ppm, Lac). Other compounds, such as the amino acids valine (Val, 0.9–1.1 ppm) and alanine (Ala, 1.46 ppm), show large methyl signals in the spectrum. Smaller contributions to the spectrum are often more relevant to the analysis and are enhanced by applying a generalized log transformation [32
Figure 1 Superimposition of two representative 1H NMR spectra (projections of JRES spectra), which show the average signals from the serum samples of healthy subjects (blue line) and patients with disease (red line) after application of a generalized log transformation.
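As an illustration of how a generalized log transformation lifts small signals relative to dominant ones, the following is a minimal sketch. It assumes the commonly used form glog(x) = ln(x + sqrt(x² + λ)); the exact variant and the value of the transformation parameter used in this study are not stated here, so the function name `glog` and the default `lam` are hypothetical.

```python
import math


def glog(intensities, lam=1e-8):
    """Generalized log transform, assuming the form ln(x + sqrt(x^2 + lam)).

    `lam` is a hypothetical default; in practice it is tuned to the noise
    level of the spectra. Unlike a plain logarithm, the transform is defined
    for zero and negative intensities whenever lam > 0.
    """
    if lam <= 0:
        raise ValueError("lam must be positive")
    return [math.log(x + math.sqrt(x * x + lam)) for x in intensities]


# Hypothetical intensities spanning several orders of magnitude.
spectrum = [0.0, 0.5, 10.0, 1000.0]
transformed = glog(spectrum, lam=1.0)
```

For intensities much larger than sqrt(λ) the transform behaves like ln(2x), whereas near zero it is approximately linear, which is why baseline noise is not inflated the way it would be by a plain logarithm.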
One-dimensional and 2D JRES NMR spectra were recorded for filtered blood sera from patients with head and neck cancers and from a control group of healthy volunteers. Two-dimensional JRES spectra were used to visualize chemical shifts and scalar couplings along different spectral dimensions, which increases peak dispersion and therefore metabolite specificity in the 1D projections [34]. Principal component analysis applied to the 1D projections is depicted in Figure 2 and shows a clear separation between samples from healthy subjects and OSCC patients. Samples from the disease group tend toward smaller or negative values of PC2; samples from patients with higher stage disease also spread toward higher positive values of PC1. Furthermore, the PCA result shows clustering according to the stage of disease; samples B06, B07, B08, B09, and B12 are from patients with stages III and IV disease, with associated metastasis, and are observed at higher absolute values of PC1 and PC2. Samples B01 and B15 are exceptions to the otherwise good separation of low- and high-stage disease. In the case of B01, this can be attributed to the small size of the tumor, which was classified as stage IV because it had invaded the locally adjacent bony structures, although there was no indication of metastatic disease. Therefore, PCA not only differentiates disease and control samples but also shows significant potential to distinguish early and late stage disease with high specificity.
Figure 2 Principal components analysis scores plot of 1H NMR spectra of human blood sera from OSCC patients (●) and from healthy controls ().
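The scores shown in such a plot can be obtained from a mean-centered data matrix by singular value decomposition. The sketch below illustrates this under stated assumptions: the data are synthetic, the function name `pca_scores` is hypothetical, and the scaling and preprocessing used for the actual spectra are not reproduced.

```python
import numpy as np


def pca_scores(X, n_components=2):
    """PCA via SVD of the mean-centered matrix (rows = samples, columns = bins)."""
    X = np.asarray(X, dtype=float)
    Xc = X - X.mean(axis=0)                      # center each spectral bin
    U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
    scores = U[:, :n_components] * S[:n_components]   # sample coordinates (PC1, PC2, ...)
    loadings = Vt[:n_components].T                    # contribution of each bin to each PC
    explained = (S ** 2) / np.sum(S ** 2)             # fraction of variance per component
    return scores, loadings, explained[:n_components]


# Synthetic stand-in for binned spectra: 10 samples x 20 bins, where the
# last five samples have elevated intensity in the first three bins.
rng = np.random.default_rng(0)
X = rng.normal(size=(10, 20))
X[5:, :3] += 5.0

scores, loadings, explained = pca_scores(X)
```

Plotting `scores[:, 0]` against `scores[:, 1]` gives a scores plot of the kind described; because PCA is unsupervised, any grouping that appears is not imposed by class labels.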
To obtain a more objective statistical estimation and specific loadings, we used PLS-DA to build a model discriminating samples from cancer patients and controls (Figure 3). In this case, the sensitivity and specificity for oral cancer detection are both >95% (see receiver operating characteristic curves in Figure W1) after applying the Venetian blinds algorithm for cross-validation. Despite limited statistical significance owing to the small number of samples, this result is remarkable considering that some early stage tumors were below 0.005% of the overall body mass.
Figure 3 PLS-DA scores plot of 1H NMR spectra of human blood sera using two classes: OSCC patients (● early and ■ late stage diseases) and healthy controls ().
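Venetian blinds cross-validation assigns every k-th sample to the same test fold, so that each fold is interleaved across the ordered data set. The following is a minimal sketch of the index splitting only; the number of splits used in the study is not stated, and the function name `venetian_blinds` is hypothetical. Fitting the PLS-DA model on each training set is left out.

```python
def venetian_blinds(n_samples, n_splits):
    """Yield (train, test) index lists for Venetian blinds cross-validation.

    Fold j tests samples j, j + n_splits, j + 2 * n_splits, ... and trains
    on all remaining samples.
    """
    if not 2 <= n_splits <= n_samples:
        raise ValueError("n_splits must be between 2 and n_samples")
    for fold in range(n_splits):
        test = list(range(fold, n_samples, n_splits))
        held_out = set(test)
        train = [i for i in range(n_samples) if i not in held_out]
        yield train, test


# Seven samples split into three interleaved folds.
folds = list(venetian_blinds(7, 3))
```

Because the folds depend on sample order, the scheme mixes classes across folds when samples are sorted by class, but it can give optimistic estimates if neighboring samples are replicates of one another.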
To check for a possible bias arising from other factors, we examined the ensemble of data for the influence of age and sex. Neither factor produces clustering in the unsupervised PCA. Therefore, PLS-DA analyses were used to build models for disease, age, and sex. In this analysis, substantially less pronounced clustering and lower specificity and sensitivity were observed for age and sex than for disease versus control (see receiver operating characteristic curves in Figures W1–W3).
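For reference, sensitivity and specificity at a single classification threshold follow directly from the confusion counts; a receiver operating characteristic curve traces these two quantities as the threshold is varied. The sketch below uses hypothetical labels and predictions, not data from this study.

```python
def sensitivity_specificity(y_true, y_pred, positive=1):
    """Return (sensitivity, specificity) for binary class labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("both classes must be present in y_true")
    sensitivity = tp / (tp + fn)   # fraction of true cases detected
    specificity = tn / (tn + fp)   # fraction of controls correctly rejected
    return sensitivity, specificity


# Hypothetical example: 1 = disease, 0 = control.
y_true = [1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 0, 0, 0, 0, 1]
sens, spec = sensitivity_specificity(y_true, y_pred)
```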
Loadings plots were calculated from the PLS-DA models to identify the discriminatory metabolites for each model. Figure 4, A and B, shows different regions of the loadings for the first latent variable of PLS-DA, with positive values representing healthy controls and negative values representing patients with disease. In the samples from cancer patients, the levels of valine, ethanol, lactate, alanine, acetate, citrate, phenylalanine, tyrosine, methanol, formaldehyde, and formic acid were reduced compared with those of healthy controls, whereas signals arising from glucose, pyruvate, acetone, acetoacetate, 3-hydroxybutyrate, 2-hydroxybutyrate, choline, betaine, and, to a lesser degree, dimethylglycine, sarcosine, asparagine, and ornithine showed enhanced loadings. In the aromatic region, additional contributions arise from as yet unidentified metabolites. These contributions are summarized in the metabolic pathways shown in Figure 5. To avoid possible errors arising from applying PLS-DA to a small sample size, we compared these results with the differences between the average late and early stage spectra, which gave the same results.
Figure 4 Loadings plots for the first principal component for different regions of the spectrum. A representative NMR spectrum of human serum from an OSCC patient is shown above the loadings plots. Signals representing the most relevant discriminatory metabolites (more ...)
Figure 5 Schematic representation of the most relevant metabolic differences between oral cancer patients and healthy controls.
To identify the metabolites that are responsible for the distinction between different stages of disease, PLS-DA models using only samples from patients with disease were built (not shown). Loadings plots from latent variable 1 are displayed in Figure 6, A, and show patterns similar to those for the distinction between patients with disease and healthy controls (Figure 4, A). In samples from patients with late stage disease, the levels of 2-hydroxybutyrate, 3-hydroxybutyrate, acetone, acetate, acetoacetate, creatinine, asparagine, glucose, dimethylglycine, betaine, and choline were markedly increased compared with those of patients with early stage disease, whereas valine, lactate, alanine, pyruvate, lysine, creatine, acetyl-L-carnitine, and carnitine showed reduced concentrations (Figure 6, A). In addition, a model using three classes (healthy controls and patients with early and late stage diseases) is shown in Figure W4. The application of this multivariate chemometric model shows a clear separation between the samples from healthy controls and patients with disease. Very narrow groupings are observed for the control samples and the patients with early stage disease. In contrast, the sera of patients with late stage disease are more widely spread in the scores plot, highlighting the greater heterogeneity of their metabolic profiles.
Figure 6 Loadings plots (latent variables) from a PLS-DA analysis of OSCC patients showing metabolites that discriminate between early (positive loadings) and late stage diseases (negative loadings). A representative NMR spectrum of human serum from an OSCC patient (more ...)