We have examined two estimators, the nonlinear least squares estimator and the maximum likelihood estimator, with respect to their performance for estimating the location of single molecules or other point sources. Amongst a selection of estimators that did not contain the maximum likelihood estimator, the nonlinear least squares estimator had been identified as the most accurate in a comparative study [11]. This was done in the context of fitting a Gaussian profile to data generated with an Airy model. The maximum likelihood estimator, however, is a classical estimator [12] that has also been investigated in the context of single molecule localization [13]. It was therefore important to extensively compare these two estimation approaches.
The first criterion for evaluation is whether or not an estimator recovers the correct parameter, in this case, the coordinates of the location of the single molecule. Considering the stochastic nature of the data, it cannot be expected that the correct parameter can be exactly recovered. However, it is shown here with simulations that both estimators, on average, can retrieve the location of single molecules without any obvious bias. The performance of estimators for the localization of single molecules is typically evaluated based on the standard deviation or variance of the resulting estimates [17]. This is consistent with the desire that an estimator, which on average produces the correct results, does so with a small standard deviation. Our results show that in the absence of model misspecifications, the maximum likelihood estimator consistently produces estimates with lower standard deviation than the nonlinear least squares estimator.
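The two evaluation criteria discussed above, absence of bias and a small standard deviation, can be illustrated with a small simulation. The following is a minimal sketch and not the procedure used in this study: it assumes a one-dimensional Gaussian profile sampled at pixel centers, Poisson-only data, and a grid search in place of a numerical optimizer. The function names (`expected_counts`, `estimate_location`, `run_trials`) and all parameter values are hypothetical.

```python
import numpy as np


def expected_counts(x0, pixels, n_photons, sigma, background):
    """Mean count per pixel: 1D Gaussian profile sampled at pixel centers (an assumption)."""
    profile = np.exp(-((pixels - x0) ** 2) / (2.0 * sigma**2))
    profile /= sigma * np.sqrt(2.0 * np.pi)
    return n_photons * profile + background


def estimate_location(data, models, candidates, method):
    """Return the candidate location that minimizes the chosen objective function."""
    if method == "mle":
        # Poisson negative log-likelihood, without the data-only log(k!) term.
        cost = np.sum(models - data * np.log(models), axis=1)
    else:
        # Unweighted nonlinear least squares.
        cost = np.sum((data - models) ** 2, axis=1)
    return candidates[np.argmin(cost)]


def run_trials(true_x0=7.3, n_photons=200.0, sigma=1.5, background=1.0,
               n_trials=300, seed=0):
    """Simulate repeated images and return (bias, standard deviation) per estimator."""
    rng = np.random.default_rng(seed)
    pixels = np.arange(15, dtype=float)
    candidates = np.linspace(true_x0 - 1.0, true_x0 + 1.0, 401)
    # One model image per candidate location; only the location is estimated.
    models = np.stack([expected_counts(c, pixels, n_photons, sigma, background)
                       for c in candidates])
    mean_image = expected_counts(true_x0, pixels, n_photons, sigma, background)
    estimates = {"mle": [], "nls": []}
    for _ in range(n_trials):
        data = rng.poisson(mean_image)
        for method in estimates:
            estimates[method].append(
                estimate_location(data, models, candidates, method))
    return {m: (np.mean(v) - true_x0, np.std(v, ddof=1))
            for m, v in estimates.items()}


if __name__ == "__main__":
    for method, (bias, spread) in run_trials().items():
        print(f"{method}: bias = {bias:+.4f} px, standard deviation = {spread:.4f} px")
```

The sample mean of the estimates minus the true location gives the empirical bias, and the sample standard deviation is the quantity that would be compared against a theoretical limit. The grid spacing bounds the resolution of the estimates, so a finer grid or a proper optimizer would be needed for a quantitative comparison.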
In any practical situation, the effect of extraneous noise (background autofluorescence, camera readout noise, etc.) on the performance of estimation algorithms is an important consideration. We have shown that as the extraneous noise component in the data increases, irrespective of whether the noise is Gaussian (camera noise) or Poisson (background autofluorescence) in nature or a combination of both, the standard deviations of the estimates from the nonlinear least squares algorithm become comparable to those from the maximum likelihood estimator and converge to the PLAM. This behavior of the estimators is not completely surprising, as similar results have been observed in deconvolution studies [15]. We know from the statistical literature that the nonlinear least squares algorithm is better suited for data with a Gaussian distribution while the maximum likelihood algorithm is better suited for data with a Poisson distribution. Consider the scenario when the data contains only signal from the point source and Poisson noise. At high signal levels, i.e., when there is a large Poisson noise component in this scenario and consequently a non-trivial signal measurable at every pixel, we know that data with a Poisson distribution can be approximated well by a Gaussian distribution. Since the nonlinear least squares algorithm is suited for data that exhibits Gaussian behavior, it is not surprising that for data with a large Poisson noise component the nonlinear least squares algorithm produces results with accuracies comparable to those from the maximum likelihood estimator. In the case when there is a large readout noise component, the Gaussian nature of the readout noise dominates the nature of the overall signal. Therefore, it can again be expected that the performance of the nonlinear least squares algorithm would be comparable to that of the maximum likelihood algorithm. However, in the realm of quantum limited data, i.e., data where the overall signal level is low at a significant number of pixels in the image, as is the case with single molecule data, the Poisson nature of the data dominates and the maximum likelihood estimator is better suited to this situation. Consequently, at low extraneous noise levels, the maximum likelihood algorithm produces more accurate estimates.
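The argument that Poisson data become approximately Gaussian at high signal levels can be checked numerically. The sketch below is only an illustration under stated assumptions: it compares a Poisson distribution with a Gaussian of the same mean and variance, and the discrepancy measure (a sum of absolute differences over the integer counts) and the name `poisson_gaussian_discrepancy` are choices made here, not quantities from this study.

```python
import math


def poisson_pmf(k, mean):
    """Poisson probability of observing k counts, evaluated in log space for stability."""
    return math.exp(k * math.log(mean) - mean - math.lgamma(k + 1))


def gaussian_pdf(x, mean, variance):
    """Gaussian density with the given mean and variance."""
    return (math.exp(-((x - mean) ** 2) / (2.0 * variance))
            / math.sqrt(2.0 * math.pi * variance))


def poisson_gaussian_discrepancy(mean):
    """Sum over counts of |Poisson pmf - Gaussian density| with matched mean and variance."""
    upper = int(mean + 10.0 * math.sqrt(mean)) + 10  # covers essentially all of the mass
    return sum(abs(poisson_pmf(k, mean) - gaussian_pdf(k, mean, mean))
               for k in range(upper + 1))


if __name__ == "__main__":
    for mean in (2.0, 20.0, 200.0):
        print(f"mean count {mean:6.1f}: discrepancy {poisson_gaussian_discrepancy(mean):.4f}")
```

The discrepancy shrinks as the mean count per pixel grows, which is the regime in which a least squares objective and a Poisson likelihood are expected to behave similarly. At a few counts per pixel the two distributions differ visibly, in line with the advantage of the maximum likelihood estimator for quantum limited data.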
From a practical point of view, the question of robustness is of great importance. In a concrete experimental situation the acquired data may not precisely match the theoretical assumptions. For example, the image profile might deviate from the assumptions due to the presence of unmodeled optical effects or because some of the parameters that specify the profile are not accurately known. The robustness question therefore concerns the extent to which the behavior of the estimator changes if the image profile is not perfectly modeled. Importantly, for the model mismatches that were considered here, both estimators, on average, recover the correct location parameters. For the nonlinear least squares estimator the standard deviations can be significantly larger than in the ideal case. If the model is misspecified, this estimator can even produce better results than when the model is correctly specified, as is the case when the specified width parameter produces a profile that is considerably wider than the single molecule image profile. In contrast, the performance of the maximum likelihood estimator does not depend very significantly on the nature and size of the misspecification.
It should, however, be emphasized that the estimators cannot be expected to behave properly under arbitrary distortions. For example, if the actual data and the fitted profiles no longer exhibit radial symmetry, different results should be expected. If very precise results are desired, our analysis suggests that great care has to be taken that the model that is used for fitting is highly accurate. In general, if a certain type of modeling uncertainty is expected and highly accurate results are required, it might be important to carry out an analysis of the type performed here for the specific modeling uncertainty. In a number of scenarios it was shown that the maximum likelihood estimator achieves the best results, especially in the ideal case, i.e., when no model mismatch is present. Importantly, in this case, it attains the best possible performance as predicted by the PLAM [13]. In addition, it does not exhibit the potentially very large performance deteriorations that are seen with the nonlinear least squares estimator. Based on the analysis carried out here, the maximum likelihood estimator might therefore be a better algorithm choice.
The objective function of the maximum likelihood estimator is chosen based on the nature of the underlying data. In this study, we have used two objective functions with the maximum likelihood estimator, one for data with only Poisson characteristics, and a computationally more complex one for Poisson distributed data with additive Gaussian noise. The objective function of the nonlinear least squares estimator, however, remains the same irrespective of the nature of the data for which it is being used. The execution speed of the maximum likelihood estimator using the objective function for data with only Poisson characteristics is comparable to the execution speed of the nonlinear least squares estimator. However, the maximum likelihood estimator using the objective function for Poisson distributed data with additive Gaussian noise is computationally more demanding than the nonlinear least squares estimator because of the complexity of the objective function. We have shown that for a number of important scenarios, the maximum likelihood estimator using the more complex objective function produces more accurate results than the nonlinear least squares estimator. In such scenarios, it is for the microscopist to decide whether the increased computational burden of using the maximum likelihood estimator with the more complex objective function is justified by the increased accuracy.
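To make the difference in computational cost concrete, the two maximum likelihood objective functions can be sketched as follows. This is an illustrative sketch, not the implementation used in this study: it assumes that a pixel value in the second case is the sum of a Poisson count with mean `nu` and independent Gaussian noise with mean `eta` and standard deviation `sigma`, so that its density is a Poisson-weighted sum of Gaussians. The names and the truncation point `l_max` of the infinite sum are hypothetical choices.

```python
import math

import numpy as np


def poisson_nll(data, mu):
    """Negative log-likelihood for Poisson data, without the data-only log(k!) term."""
    data = np.asarray(data, dtype=float)
    mu = np.asarray(mu, dtype=float)
    return float(np.sum(mu - data * np.log(mu)))


def poisson_gaussian_logpdf(z, nu, eta, sigma, l_max=100):
    """Log density of z = Poisson(nu) + Normal(eta, sigma^2), sum truncated at l_max."""
    z = np.atleast_1d(np.asarray(z, dtype=float))
    nu = np.broadcast_to(np.asarray(nu, dtype=float), z.shape)
    counts = np.arange(l_max + 1)
    log_factorial = np.array([math.lgamma(k + 1) for k in counts])
    # Log of each term: Poisson weight times Gaussian density centered at count + eta.
    log_poisson = counts * np.log(nu[:, None]) - nu[:, None] - log_factorial
    log_gauss = (-0.5 * ((z[:, None] - counts - eta) / sigma) ** 2
                 - math.log(sigma * math.sqrt(2.0 * math.pi)))
    terms = log_poisson + log_gauss
    # Log-sum-exp over the count index to avoid numerical underflow.
    peak = terms.max(axis=1, keepdims=True)
    return peak[:, 0] + np.log(np.exp(terms - peak).sum(axis=1))


def poisson_gaussian_nll(data, mu, eta, sigma, l_max=100):
    """Negative log-likelihood for Poisson data with additive Gaussian noise."""
    return float(-np.sum(poisson_gaussian_logpdf(data, mu, eta, sigma, l_max)))
```

The Poisson-only objective needs one logarithm per pixel, whereas the second objective evaluates a sum of `l_max + 1` terms per pixel at every iteration of the optimizer. The least squares objective, a sum of squared residuals, is the same in both cases.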
The comparison between the two analytical expressions for the localization accuracy showed that differences can occur in the predicted levels of accuracy. The expressions based on the Fisher information matrix had been derived using a rigorous statistical framework. In contrast, the approach in [17] relied on ad hoc derivations. Both approaches lead to identical results when the image profile is Gaussian and no pixelation is assumed. The presence of pixelation will also, in many cases, not lead to significant differences. Both approaches can, however, lead to significantly different results when the image profile is not Gaussian. The Fisher information matrix based approach can be used for any point spread function model. In the approach by Thompson et al. [17], an approximation is typically carried out, whereby the best Gaussian approximation to the actual point spread function profile is determined. Assuming that the actual point spread function is an Airy profile, the two approaches were compared. It was seen that in this situation the two approaches typically no longer predict the same accuracy with which the location of the single molecule can be estimated. The lowest standard deviations are predicted by the PLAM, i.e., by the Fisher information based approach, and are also attained by the maximum likelihood estimator. Thus the difference between the localization accuracy predicted by the approach utilizing the Fisher information matrix and that predicted by the approach of Thompson et al. arises from the assumptions made about the underlying data model. Since these two approaches produce different results, a microscopist should take this difference into consideration when comparing the experimentally observed localization error with the theoretically predicted one. Many factors can lead to differences between the experimentally observed and theoretically determined localization accuracies, e.g., inadequate correction for stage drift or optical aberrations. Our comparison of the two approaches to predict the localization error suggests that the approach used to determine the localization error can also contribute to these differences. Our results show that the standard deviations of estimates from the maximum likelihood estimator are consistent with the PLAM in the absence of model mismatches. Therefore, using the Fisher information matrix based approach to calculate localization accuracies can help minimize differences between the experimentally observed and theoretically determined localization errors.
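The kind of discrepancy described above can be illustrated with closed-form expressions. The sketch below assumes the commonly cited forms of both results, which are not restated in this section: for an Airy profile without pixelation or extraneous noise, a Fisher information based limit of λ / (2π · NA · √N), and for the approach of Thompson et al., a variance of (s² + a²/12)/N + 8π s⁴ b² / (a² N²), where s is the standard deviation of the Gaussian profile, a the pixel size, b the background noise and N the photon count. The function names, the numerical values, and in particular the width s = 0.21 λ/NA used for the Gaussian approximation of the Airy profile are assumptions made for illustration only.

```python
import math


def fisher_limit_airy(wavelength, numerical_aperture, n_photons):
    """Assumed Fisher-information limit for an Airy profile (no pixelation, no noise)."""
    return wavelength / (2.0 * math.pi * numerical_aperture * math.sqrt(n_photons))


def thompson_accuracy(s, n_photons, pixel_size=0.0, background_sd=0.0):
    """Assumed Thompson et al. expression for a Gaussian profile of standard deviation s."""
    variance = (s**2 + pixel_size**2 / 12.0) / n_photons
    if background_sd > 0.0:
        if pixel_size <= 0.0:
            raise ValueError("the background term requires a positive pixel size")
        variance += (8.0 * math.pi * s**4 * background_sd**2
                     / (pixel_size**2 * n_photons**2))
    return math.sqrt(variance)


if __name__ == "__main__":
    wavelength, na, n_photons = 520.0, 1.4, 1000.0  # nm, NA, photons (illustrative values)
    s = 0.21 * wavelength / na  # assumed width of the Gaussian approximation (nm)
    print("Fisher information based limit (nm):",
          fisher_limit_airy(wavelength, na, n_photons))
    print("Thompson et al. prediction (nm):", thompson_accuracy(s, n_photons))
```

Under these assumptions the Fisher information based value is the smaller of the two, while for a truly Gaussian profile without pixelation or background both expressions reduce to s/√N. The size of the gap depends on how the Gaussian width is chosen, so the numbers are indicative only.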
In the present study, we are primarily interested in the performance of the nonlinear least squares and maximum likelihood algorithms with respect to localization accuracy. Therefore, in most cases we have estimated only the location coordinates and fixed various other parameters (e.g., width, photon detection rate, background). In our experience, floating some of these parameters in addition to the location parameters can cause significant problems with the estimation. For example, we have observed that floating the background parameter along with the width parameter can lead to significant deterioration of the results. Estimating fewer parameters also has advantages with regard to computational efficiency. However, it requires more parameters to be independently specified. Previously, we have theoretically shown, and numerically confirmed here, that the estimation of the photon detection rate is uncorrelated with the estimation of the location of the single molecule or point source [14]. This means that the photon detection rate and location parameters can be estimated independently without any change in the accuracy with which either parameter is estimated. We have here also explored how errors in specifying the width parameter affect the accuracy of location estimates. From the results, we see that misspecifications in the width parameter in one direction do not affect the accuracy of the estimates from either algorithm in some scenarios. This implies that care must be taken to make sure that if misspecifications occur they are such that they do not produce severely deteriorated estimates.
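The statement that the photon count and the location can be estimated independently corresponds to a vanishing off-diagonal entry in the Fisher information matrix. The following sketch illustrates this for a simplified setting chosen here, not taken from this study: a one-dimensional Gaussian profile sampled at pixel centers, Poisson-only data, and a pixel window that is symmetric about the molecule. For off-center positions on a finite pixel grid the cross term need not vanish exactly. The function name `fisher_matrix` and all parameter values are hypothetical.

```python
import numpy as np


def fisher_matrix(x0, n_photons, sigma, pixels):
    """Fisher information for (x0, n_photons) with independent Poisson pixel counts."""
    profile = (np.exp(-((pixels - x0) ** 2) / (2.0 * sigma**2))
               / (sigma * np.sqrt(2.0 * np.pi)))
    mu = n_photons * profile                 # mean count per pixel
    d_x0 = mu * (pixels - x0) / sigma**2     # derivative of mu with respect to x0
    d_n = profile                            # derivative of mu with respect to n_photons
    grads = np.stack([d_x0, d_n])
    # I_ij = sum over pixels of (1 / mu) * d_i(mu) * d_j(mu)
    return (grads / mu) @ grads.T


if __name__ == "__main__":
    pixels = np.arange(-7.0, 8.0)            # window symmetric about the molecule at 0
    info = fisher_matrix(0.0, 500.0, 1.5, pixels)
    bound_joint = np.sqrt(np.linalg.inv(info)[0, 0])   # photon count estimated as well
    bound_fixed = 1.0 / np.sqrt(info[0, 0])            # photon count assumed known
    print("cross term:", info[0, 1])
    print("location bound, joint vs fixed:", bound_joint, bound_fixed)
```

Because the cross term is zero up to rounding, the bound on the location accuracy is the same whether the photon count is estimated jointly or held fixed.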
The computations required to estimate the location of a single molecule and to evaluate the limit of the localization accuracy can sometimes be fairly complex. To make these methodologies available to users who do not wish to write their own code, we have developed software packages to accomplish these tasks. The EstimationTool [23] allows users to perform various estimation tasks including single molecule location estimation and resolution/distance measurements in 2D and 3D. It offers different choices of estimation models (Airy, Gaussian, Born-Wolf 3D point spread function model [25]), the ability to use either the nonlinear least squares or the maximum likelihood estimator, and supports various models for extraneous noise sources (Poisson only, Poisson and Gaussian noise). The tool also provides access to advanced calculation parameters, for example parameters involved in performing the numerical integrations. The EstimationTool also allows results of estimations to be visualized and exported for further analysis. All calculations of the limit of the localization accuracy can be performed with the FandPLimitTool [24]. Using the FandPLimitTool both localization and resolution measures can be calculated in 2D and 3D for various estimation models. User-friendly graphical user interfaces are available for both packages [24