We found that the SVM with DMLLE shows better generalizability than SVM with all features, SVM with PCA or DM Nystrom dimensionality reduction methods. This result supports our conjecture that we can learn a better classification function in the low dimensional embedding space than in the original high dimensional space. That is also one reason why feature selection methods work. In addition, the advantage of DMLLE over PCA reveals that the colonic polyp data may have a low dimensional manifold structure [

4,

5] which cannot be discovered by linear dimensionality reduction methods. Although we can not describe the low dimensional structure definitely, we can still find some clues from the embeddings and figures of true polyps shown in . That means the key factors to discern a true polyp from false polyps are not as many as the features extracted.

In addition, as we mentioned in previous section, the embedding mechanism of LLE for new samples is in essence a variation of the Nystrom extension. The basic Nystrom extension will use all landmarks to infer the embedding of a new sample, whereas the LLE will just use the nearest K neighbors. If we view the changing of the size of neighbor region for a new sample as the changing of scale, then it is obvious that the Nystrom extension and the LLE compute the embedding for a new sample at different scales. The basic Nystrom extension operates at a larger scale and LLE operates at a finer scale. Although LLE works better than the basic Nystrom extension, is it the best scale to infer the coordinates of a new sample based on the embeddings of landmarks? It is still an open question and worthy of further exploration.

Since current detection algorithms perform very well on true polyps which are greater than 9 mm [

3], we focused on the detection of true polyps whose sizes are in the range 6 mm to 9 mm. To some extent, improving the detection probability of 6-9 mm polyps is the main target of current CTC CAD research. In order to compare the performance of different methods, we censored other than 6-9 mm polyps in most our experiments. In addition, according to the comparisons of our DMLLE method with other methods on polyps which are greater than 9 mm, the DMLLE method also outperformed other methods on bigger polyps. Because the resolution of CTC is limited, the size of polyps will affect the significance of features on the classification task (good features for big polyps may be not significant for smaller polyps [

13]). In practical CTC CAD systems, one could train different classifiers for different sizes of polyps.

In our current experiments, we do not combine the results from supine and prone scans in order to compare the performance of different dimensionality reduction methods. In practical CTC CAD systems, one can get higher sensitivity if the classification of a true polyp is based on supine and prone scans [

3]. We also do not focus on the detection of adenomatous polyps only but instead consider all histologic types of polyps in our database. Interested readers may find the performance of an SVM committee based on joint classification of adenomatous polyps on supine and prone scans in our previous work [

3].

In this paper, we propose a nonlinear dimensionality reduction method based on Diffusion Map and Local Linear Embedding for large-scale data. By using Markov transition probability as the measure of distance between samples, Diffusion Map shows high resistance to noise. Because Diffusion Map needs to do eigenvalue decomposition for Markov matrix, it limits the application of Diffusion Map to large-scale data. Inspired by the idea of landmark of ISOMAP algorithm, we first randomly select partial data as landmarks. The embeddings of landmarks in low dimensional space is constructed by using Diffusion Map. These embeddings constitute the global skeleton of whole data in embedding space. Then by using Local Linear Embedding, non-landmark points are embedded into the same low dimensional space by using nearest landmarks. For any non-landmark point, the linear geometry relationship with its nearest landmarks is maintained in both high dimensional space and embedding space. By combining global and local dimensionality reduction methods, DMLLE provides a faithful representation of original high-dimensional data at coarse and fine scales. As a consequence, it can capture the intrinsic distance relationship between samples and reduce the influence of noisy features. Both of these characteristics are crucial to the performance of classifiers. PCA can be regarded as a global dimensionality reduction method and LLE is a local dimensionality reduction method. In the experiments in our paper, we compared the DMLLE with PCA and LLE; the comparison results support our findings.

We applied the proposed DMLLE method to the colonic polyp detection problem on virtual colonoscopy. The embedding of all polyp candidates in three-dimensional space shows that most true polyps concentrate in the valley of two Gaussian-like distributions formed by false polyps. The clustering of true polyps enables the supervised learning method to determine a good decision boundary and achieve higher generalization ability than could be achieved in the original high dimensional space. We compared the test errors of the SVM classifier trained in a low dimensional space by using DMLLE with that of the SVM classifier using all available features. FROC analysis showed that with DMLLE, the SVM classifier shows higher generalization ability than that using all features. DMLLE also shows higher generalization ability compared with an SVM committee using feature selection technology and SVM using PCA linear dimensionality reduction.