3.1. Evaluation of Support Vector Machine Classification
We chose images that depict one of nine specific patterns (an example of a pattern is shown in ). These pattern classes are: centrosome, cytoskeleton, ER, Golgi apparatus, punctate patterns (which include lysosomes, peroxisomes, and endosomes), mitochondria, nucleoli, nucleus, and plasma membrane (PM). Of the 1902 proteins, 834 showed just one of these patterns in this dataset. Each class has at least 10 proteins, and the resulting dataset comprises 3557 image fields. Images from all three cell lines are included.
Example of a segmented single cell region: a protein (Atlas ID 1915) exhibiting a mitochondrial pattern (A), along with the parallel nuclear (B), microtubule (C), and endoplasmic reticulum (D) channels. Unprocessed image fields contain multiple cells.
We first classified images using field features with 10-fold cross validation. Classification accuracies ranged from 30.0–96.3% across the classes, with an overall accuracy of 84.8% (, column 4). Classes with fewer samples had lower accuracies. 19.4% of the nucleolar samples and 20.0% of the centrosome samples were confused with the nuclear class (data not shown), and the centrosome class was also frequently confused with the Golgi class. The cytoskeletal, ER, Golgi, and PM classes each had more than 10% of their samples confused with the mitochondria class (data not shown).
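The field-level classification step can be sketched as follows. This is a minimal illustration using scikit-learn with a synthetic stand-in feature matrix; the actual feature set, kernel, and SVM parameters used in the experiment are not specified here, so the settings below are assumptions.

```python
import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

rng = np.random.default_rng(0)
# Hypothetical stand-in: 200 image fields, 20 field-level features,
# labels drawn from 4 location classes (the real data has 9).
X = rng.normal(size=(200, 20))
y = rng.integers(0, 4, size=200)

# Standardize features, then evaluate an RBF-kernel SVM under
# 10-fold cross validation, as in the field-level experiment.
clf = make_pipeline(StandardScaler(), SVC(kernel="rbf"))
fold_accuracies = cross_val_score(clf, X, y, cv=10)
overall = fold_accuracies.mean()
```

With real features, `overall` corresponds to the overall accuracy reported above, and per-class accuracies would come from a confusion matrix over the pooled fold predictions.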
Comparison of classification approaches.
We next applied classification at the single cell level. Since each image contains multiple cells, we first segmented the images into single cell regions, increasing the number of samples to 29099 regions. Classification of these samples using 5-fold cross validation yielded an overall accuracy of 80.6%, with class accuracies ranging from 11.4–95.7% (, column 5). Using the classifier probability outputs to choose a single, maximum probability label for all cells belonging to the same image, we boosted classification accuracy to 87.4%. With this voting, every class accuracy except that of the centrosome class improved over the single cell classifier (, column 6). Moreover, classification accuracy improved over the simple field level analysis for seven of the nine location classes.
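One plausible implementation of the per-image voting step is sketched below. The exact aggregation rule is not spelled out here, so this sketch makes an assumption: it averages the per-cell posterior probabilities within each image and takes the argmax as the single image label.

```python
import numpy as np

def vote_image_labels(cell_probs, image_ids):
    # cell_probs: (n_cells, n_classes) posterior probabilities from the
    # single-cell classifier; image_ids: (n_cells,) parent-image index.
    # Assumption: average the probabilities over the cells of each image
    # and assign the class with the highest mean probability.
    labels = {}
    for img in np.unique(image_ids):
        mean_probs = cell_probs[image_ids == img].mean(axis=0)
        labels[int(img)] = int(np.argmax(mean_probs))
    return labels

# Toy example: 4 cells from 2 images, 3 classes.
probs = np.array([[0.7, 0.2, 0.1],
                  [0.6, 0.3, 0.1],
                  [0.2, 0.2, 0.6],
                  [0.1, 0.3, 0.6]])
ids = np.array([0, 0, 1, 1])
image_labels = vote_image_labels(probs, ids)  # {0: 0, 1: 2}
```

Averaging lets several moderately confident cells outvote one overconfident outlier, which is one reason pooling across cells can beat per-cell classification.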
We performed precision-recall analysis on these latter results, sorting the labeled samples by the magnitude of each sample's maximum probability value. As only more confident assignments are retained, classification accuracy generally increases (). At a recall of 60%, the classification accuracy is 98.5%.
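The analysis above amounts to sorting samples by classifier confidence, keeping only the most confident fraction (the recall), and measuring accuracy on the retained set. A minimal sketch with toy data (`max_probs` stands in for the maximum class probabilities):

```python
import numpy as np

def precision_at_recall(max_probs, correct, recall):
    # Sort samples by confidence (descending), keep the most confident
    # `recall` fraction, and report accuracy within the retained set.
    order = np.argsort(-max_probs)
    k = max(1, int(round(recall * len(order))))
    return correct[order[:k]].mean()

# Toy example: confident predictions are correct, uncertain ones mixed.
conf = np.array([0.99, 0.95, 0.90, 0.60, 0.55, 0.40])
ok = np.array([1, 1, 1, 1, 0, 0])
p50 = precision_at_recall(conf, ok, 0.5)  # 1.0: top half all correct
```

Sweeping `recall` from 0 to 1 traces out the precision-recall curve described above.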
Precision-recall curve for cell labels with voting across each image field. The black profile denotes SVM performance, while gray shows RF. At a recall of 60%, the precision for both approaches is 98.5%.
3.2. Evaluation of Random Forest Classification
We next applied the RF classification framework to the field and cell level images. The overall accuracy using field features was 87.5%, with class accuracies ranging from 45.0–98.0% (, column 7). Compared to SVM using field features, RF performed better on seven of the nine classes. At the cell level, RF outperformed SVM on all classes, achieving 85.8% accuracy (, column 8). Finally, RF with voting outperformed SVM with voting in overall accuracy and in eight of the nine classes (, column 9). As more confident assignments are considered using field features, classification accuracy increases (). At a recall of 60%, accuracy is 98.5%.
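A head-to-head comparison like this one is only meaningful if both classifiers see identical cross-validation folds. A minimal sketch, again with a synthetic stand-in feature matrix and assumed model settings (not the authors' configuration):

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

rng = np.random.default_rng(0)
X = rng.normal(size=(150, 10))       # stand-in feature matrix
y = rng.integers(0, 3, size=150)     # stand-in class labels

# Fixing the fold generator ensures both models are scored
# on exactly the same train/test splits.
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
svm = make_pipeline(StandardScaler(), SVC(kernel="rbf"))
rf = RandomForestClassifier(n_estimators=200, random_state=0)
svm_acc = cross_val_score(svm, X, y, cv=cv).mean()
rf_acc = cross_val_score(rf, X, y, cv=cv).mean()
```

On random features neither model should do much better than chance; the point is the shared-fold protocol, not the numbers.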
3.3. Comparison of Feature Selection Methods
SDA has proved to be an effective method for feature selection in subcellular pattern recognition [12]. One drawback of SDA, however, is that it is highly sensitive to the training samples: adding or removing even a few samples can change the ranking and selection of features. RF is less sensitive to this issue. We compared the features selected by RF to those selected by SDA, applying both methods to all of the field level data. RF identified 16 features as especially discriminative, while SDA returned 107 features. All 16 RF features appear among the top 64 ranked SDA features, and the top-ranked features from the two methods match (). Of these 16 features, nine are related to the nucleus, ER, or tubulin reference channels.
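RF's built-in impurity-based feature importances provide the discriminative ranking referred to above. A minimal sketch on synthetic data where, by construction, only the first three features carry signal; the above-the-mean cutoff for "especially discriminative" is an assumption for illustration:

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)
n_features = 30
X = rng.normal(size=(300, n_features))
# Make only the first three features informative for the label.
y = (X[:, 0] + X[:, 1] - X[:, 2] > 0).astype(int)

rf = RandomForestClassifier(n_estimators=300, random_state=0).fit(X, y)
importances = rf.feature_importances_          # sums to 1 over features
ranking = np.argsort(-importances)             # most important first
# Hypothetical cutoff: keep features with above-average importance.
selected = np.where(importances > importances.mean())[0]
```

Unlike SDA, this ranking is averaged over hundreds of bootstrapped trees, which is why it tends to be more stable under small changes to the training set.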
Comparison of feature selection methods on field level features. The first two columns show rankings by different selection methods. RF returned 16 features while SDA with SVM classifier tuning returned 107 features.