IEEE Trans Nucl Sci. Author manuscript; available in PMC 2017 September 19.
Published in final edited form as:
IEEE Trans Nucl Sci. 2015 Oct–Nov; doi: 10.1109/NSSMIC.2015.7582178.
PMCID: PMC5603291
NIHMSID: NIHMS846808

The ML-EM Algorithm is Not Optimal for Poisson Noise

Gengsheng L. Zeng, Fellow, IEEE

Abstract

The ML-EM (maximum likelihood expectation maximization) algorithm is the most popular image reconstruction method when the measurement noise is Poisson distributed. This short paper considers whether, for a given noisy projection data set, the ML-EM algorithm can provide an approximate solution that is close to the true solution. It is well known that the ML-EM algorithm converges towards the true solution at early iterations and then diverges away from it at later iterations. A good approximate solution can therefore only be obtained by early termination. This short paper argues that the ML-EM algorithm is not optimal in providing such an approximate solution. To show that the ML-EM algorithm is not optimal, it suffices to exhibit a different algorithm that performs better. Such an alternative algorithm is suggested in this paper, and it is able to outperform the ML-EM algorithm.

Index Terms: Computed tomography, expectation maximization (EM), iterative reconstruction, maximum likelihood (ML), noise weighted image reconstruction, Poisson noise, positron emission tomography (PET), single photon emission computed tomography (SPECT)

I. Introduction

The ML-EM (maximum likelihood expectation maximization) algorithm became popular in the field of medical image reconstruction in the 1980s [1][2]. This algorithm is based on the Poisson noise model and finds wide application in PET (positron emission tomography) and SPECT (single photon emission computed tomography). Besides Poisson noise modeling, it also guarantees that the resultant image is non-negative and that the total photon count of the forward projections of the resultant image equals the total photon count of the measured projections.

It is known that as the iteration number of the ML-EM algorithm increases, the likelihood objective function monotonically increases, and the ML-EM algorithm converges to a maximum likelihood solution. However, convergence does not imply that the converged image is close to the true image. In fact, the ML-EM algorithm first converges towards the true solution and then diverges away from it; the final maximum likelihood solution is too noisy to be useful. One remedy is to apply a lowpass post-filter to the converged noisy image, but the filter must be chosen ad hoc. Another remedy is to stop the algorithm early: if the iteration is terminated at the right time, a good approximate solution close to the true solution is obtained.

This short paper argues that the ML-EM algorithm is not optimal in providing such an approximate solution. To show that the ML-EM algorithm is not optimal, it suffices to exhibit a different algorithm that performs better. Such an alternative algorithm is suggested in this paper, and it is able to outperform the ML-EM algorithm.

II. Methods

Let $x_i^{(n)}$ be the $i$th image pixel at the $n$th iteration, $p_j$ be the $j$th line-integral (ray-sum) measurement value, $a_{ij}$ be the contribution of the $i$th image pixel to the $j$th measurement, and $q_j^{(n)} = \sum_i a_{ij} x_i^{(n)}$ be the forward projection of the image at the $n$th iteration. Then the ML-EM algorithm can be expressed as [1][2]

x_i^{(n+1)} = \frac{x_i^{(n)}}{\sum_j a_{ij}} \sum_j a_{ij} \frac{p_j}{q_j^{(n)}},
(1)

where the summation over the index j is the backprojection. This algorithm can be extended into a more general algorithm by introducing a new parameter α:

x_i^{(n+1)} = x_i^{(n)} \frac{\sum_j a_{ij}\, p_j / [q_j^{(n)}]^{\alpha}}{\sum_j a_{ij}\, q_j^{(n)} / [q_j^{(n)}]^{\alpha}}.
(2)

When α = 1, Algorithms (2) and (1) are identical. It is obvious that Algorithm (2) produces non-negative images. The motivation for introducing the parameter α is to change the noise weighting factor for each measurement $p_j$. This can be seen more clearly by expressing (2) in the additive form [3]

\begin{aligned}
x_i^{(n+1)} &= x_i^{(n)} \frac{\sum_j a_{ij}\, p_j / [q_j^{(n)}]^{\alpha}}{\sum_j a_{ij}\, q_j^{(n)} / [q_j^{(n)}]^{\alpha}}
 = x_i^{(n)} \frac{\sum_j a_{ij} \left( p_j - q_j^{(n)} + q_j^{(n)} \right) / [q_j^{(n)}]^{\alpha}}{\sum_j a_{ij}\, q_j^{(n)} / [q_j^{(n)}]^{\alpha}} \\
&= x_i^{(n)} \frac{\sum_j a_{ij}\, q_j^{(n)} / [q_j^{(n)}]^{\alpha}}{\sum_j a_{ij}\, q_j^{(n)} / [q_j^{(n)}]^{\alpha}}
 + x_i^{(n)} \frac{\sum_j a_{ij} \left( p_j - q_j^{(n)} \right) / [q_j^{(n)}]^{\alpha}}{\sum_j a_{ij}\, q_j^{(n)} / [q_j^{(n)}]^{\alpha}} \\
&= x_i^{(n)} + \frac{x_i^{(n)}}{\sum_j a_{ij}\, q_j^{(n)} / [q_j^{(n)}]^{\alpha}} \sum_j a_{ij} \frac{1}{[q_j^{(n)}]^{\alpha}} \left[ p_j - q_j^{(n)} \right].
\end{aligned}
(3)

The last equation in (3) is in the form of an additive gradient descent algorithm, which is used to minimize an objective function:

F = \sum_j W_j \Big( \sum_i a_{ij} x_i - p_j \Big)^2.
(4)

In (3), the step-size is

\text{Step size} = \frac{x_i^{(n)}}{\sum_j a_{ij}\, q_j^{(n)} / [q_j^{(n)}]^{\alpha}}
(5)

and the noise weighting factor is

W_j = \text{Noise weighting factor} = \frac{1}{[q_j^{(n)}]^{\alpha}}.
(6)

Here $q_j^{(n)}$ is the forward projection of the image and approximates the mean value of the corresponding projection measurement. For a Poisson distribution the mean equals the variance. Therefore, changing the parameter α changes the noise model, and the step-size is affected as well. The “correct” model for Poisson noise is α = 1. We point out that algorithm (3) is not exactly the traditional gradient descent algorithm, because the weighting factor $W_j$ is allowed to vary from iteration to iteration, as indicated by (6).
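
To make the update concrete, the following Python code implements one iteration of Algorithm (2) for a dense system matrix. It is a minimal sketch, not the author's implementation: the names A, p, x, and eps, and the dense-matrix representation, are illustrative assumptions. Setting alpha = 1.0 recovers the conventional ML-EM update (1).

import numpy as np

def generalized_em_step(x, A, p, alpha=1.0, eps=1e-12):
    """One iteration of Algorithm (2); alpha = 1.0 recovers ML-EM, Eq. (1).

    x : current image estimate (non-negative), shape (pixels,)
    A : system matrix with A[j, i] = a_ij, shape (rays, pixels)
    p : measured projections, shape (rays,)
    """
    q = A @ x                                # forward projection q_j^(n)
    w = 1.0 / np.maximum(q, eps) ** alpha    # weighting 1/[q_j^(n)]^alpha, Eq. (6)
    numerator = A.T @ (w * p)                # backprojection of weighted data
    denominator = A.T @ (w * q)              # backprojection of weighted estimate
    return x * numerator / np.maximum(denominator, eps)

Note that, unlike ML-EM, the denominator depends on the current estimate, so each iteration performs two backprojections; this matches the implementation cost noted in Section III.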

The ML-EM algorithm is guaranteed to converge to the noisy maximum likelihood solution, but when it is terminated early at a certain iteration number it is not guaranteed to reach the best possible solution (i.e., the one closest to the true solution for the given data set).

The computer simulations in this paper used analytically generated projections, and the object was not pixelized. The phantom was a large uniform ellipse containing 5 small hot uniform discs and 1 small cold uniform disc, with activity ratios ellipse : hot : cold = 1 : 2 : 0.5. Poisson noise was incorporated in the projections at 5 different noise levels, with scaling factors of 0.1, 1, 10, 100, and 1000; the corresponding total photon counts at these 5 noise levels were approximately $10^6$, $10^7$, $10^8$, $10^9$, and $10^{10}$, respectively. A parallel-beam imaging geometry was assumed, with 120 views over $360^\circ$ and 128 detector bins at each view. The images were reconstructed in a 128 × 128 array.
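
A hedged sketch of how such noisy data sets could be generated is given below. The flat stand-in sinogram is a placeholder assumption (the paper uses analytic projections of the ellipse/disc phantom), but the scaling-and-Poisson-sampling step mirrors the described noise levels.

import numpy as np

rng = np.random.default_rng(0)

# Placeholder for the analytic noiseless sinogram of the phantom:
# 120 views over 360 degrees, 128 detector bins per view.
sinogram = np.full((120, 128), 500.0)   # hypothetical baseline intensities

# Five noise levels; at scaling factor c the data are Poisson counts with
# mean c * sinogram, so the total count grows from roughly 10^6 to 10^10.
scaling_factors = [0.1, 1.0, 10.0, 100.0, 1000.0]
noisy_data = {c: rng.poisson(c * sinogram).astype(float) for c in scaling_factors}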

A series of computer simulations was conducted in this paper to investigate how to reach the best solution by varying the parameter α and the iteration number n. By “best” we mean that the mean-square error (MSE) between the true image and the reconstructed image reaches the minimum among the sampled values of α and n. The mean-square error is defined as

\text{MSE} = \frac{1}{N} \sum_{i \in \text{support}} \left( x_i - \text{true}_i \right)^2,
(7)

where $\text{true}_i$ is the $i$th pixel of the true image and N can be any fixed number; in this paper, N was the product of the number of image pixels and the phantom scaling factor. In addition to the MSE, a rectangular uniform region in the phantom was selected and the image standard deviation was calculated in this rectangular region, normalized by the phantom scaling factor. The standard deviation ignores the image bias, while the MSE does not.
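
The two figures of merit can be computed as in the sketch below, following Eq. (7) and the stated normalization N = (number of image pixels) × (phantom scaling factor). The function names and the boolean masks are illustrative assumptions.

import numpy as np

def mse(recon, truth, scaling_factor, support=None):
    """Mean-square error of Eq. (7) over the support region."""
    if support is None:
        support = np.ones(truth.shape, dtype=bool)
    N = truth.size * scaling_factor          # N = pixels * scaling factor
    return np.sum((recon[support] - truth[support]) ** 2) / N

def normalized_std(recon, region, scaling_factor):
    """Standard deviation in a uniform rectangular region, normalized by
    the phantom scaling factor; unlike the MSE, it ignores the bias."""
    return np.std(recon[region]) / scaling_factor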

III. Results

The implementation of Algorithm (2) is almost the same as that of the conventional ML-EM algorithm (1), except that each iteration requires two backprojections instead of one. Since the true image is available, the algorithm is stopped when the MSE starts to increase for a given parameter α. The parameter α was sampled from 0.1 to 1.9 with a sampling interval of 0.1. The results for the 5 noise levels are listed in Tables I through V, respectively. For each noise level, results of 5 noise realizations are shown. The images shown are from one noise realization.
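
The protocol just described can be summarized by the following sketch, which reuses generalized_em_step and mse from the earlier sketches. The loop structure is an assumption consistent with the text: each α is run until its MSE turns upward, and the best (α, n) pair overall is kept. Because the stopping rule requires the true image, it is only applicable in simulation.

import numpy as np

def best_reconstruction(A, p, x0, truth, scaling_factor, max_iter=5000):
    """Sweep alpha over 0.1, 0.2, ..., 1.9; stop each run when the MSE
    starts to increase; return the overall best result."""
    best_err, best_alpha, best_n, best_img = np.inf, None, None, None
    for alpha in np.arange(0.1, 2.0, 0.1):
        x, prev_err, n_used = x0.copy(), np.inf, 0
        for n in range(1, max_iter + 1):
            x_new = generalized_em_step(x, A, p, alpha)
            err = mse(x_new, truth, scaling_factor)
            if err > prev_err:               # MSE turned upward: stop early
                break
            x, prev_err, n_used = x_new, err, n
        if prev_err < best_err:
            best_err, best_alpha, best_n, best_img = prev_err, alpha, n_used, x
    return best_err, best_alpha, best_n, best_img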

Table I
Best Results for Noise Level 1 where the Data Scaling Factor is 0.1
Table V
Best Results for Noise Level 5 where the Data Scaling Factor is 1000

Each image in this paper is displayed individually from its minimum to its maximum using a linear gray scale. The images at each noise level look similar, but their numerical MSE values differ. In each table, central horizontal line profiles are provided to compare the image resolution. To make the resolution information in the line profiles visible, the random noise was removed from the projection data while the parameter α and the number of iterations were kept the same.

Figs. 1(A) through 5(A) show the well-known phenomenon that the MSE initially decreases as the iteration number increases and then increases at later iterations. Figs. 1(B) through 5(B) show the trend that the projection data discrepancy error decreases as the iteration number increases. The data discrepancy is calculated as the squared error between the forward projection of the image and the measured data.
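
For completeness, the data-discrepancy measure just described admits a one-line implementation (same illustrative names as in the earlier sketches):

import numpy as np

def data_discrepancy(A, x, p):
    """Squared error between the forward projection A @ x and the data p."""
    return float(np.sum((A @ x - p) ** 2))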

Fig. 1
(A) MSE vs. iteration number curves for the ML-EM algorithm and Algorithm (2), when the phantom scaling factor is 0.1. (B) Data fidelity error vs. iteration number curves for the ML-EM algorithm and Algorithm (2), when the phantom scaling factor is 0.1.
Fig. 5
(A) MSE vs. iteration number curves for the ML-EM algorithm and Algorithm (2), when the phantom scaling factor is 1000. (B) Data fidelity error vs. iteration number curves for the ML-EM algorithm and Algorithm (2), when the phantom scaling factor is 1000.

IV. Discussion and Conclusions

When the projection measurement noise is Poisson distributed, the ML-EM algorithm is able to provide the maximum likelihood solution, which is noisy and has a large MSE with respect to the true image. If the ML-EM algorithm is stopped early, it can provide solutions with a smaller mean-square error (MSE) with respect to the true image. This paper investigates whether the ML-EM algorithm provides the best approximate solution (among all algorithms) when early stopping is allowed. The answer is negative: the ML-EM algorithm is not the optimal choice in this situation. The extended Algorithm (2) can outperform the conventional ML-EM algorithm (1). The unique feature of the extended Algorithm (2) is the new parameter α, which controls the noise model. When α = 1, the noise model is Poisson and the extended algorithm is identical to the conventional ML-EM algorithm (1).

A trend observed from Tables I through V is that when the total data count is lower, the optimal number of iterations is smaller and the optimal parameter α is larger (it can be greater than 1); when the total data count is higher, the optimal number of iterations is larger and the optimal α is smaller (it can be less than 1). This observation can serve as general guidance for users to choose a suitable α in clinical cases based on the count level of their study.

Numerical analysis, a branch of computational mathematics, focuses on whether an algorithm converges, how fast it converges, and how stable the solutions are; little attention is paid to the intermediate solutions obtained by stopping an algorithm early. In ML-EM reconstruction, however, one of those intermediate solutions (not the final converged solution) has a small MSE with respect to the true image, and an intermediate solution from the different algorithm (2), with a slightly different (and “incorrect”) noise model, can be even closer to the true solution than the intermediate ML-EM solution.

This paper presents a novel concept: there is no universal optimal weighting function for Poisson noise. The so-called “correct” weighting function (α = 1) is sub-optimal in almost all cases. The optimal weighting (i.e., the parameter α), as well as the iteration number, is data-count dependent. Algorithm (2) is better than Algorithm (1). We cannot even say that the proposed Algorithm (2) achieves the “ultimate” optimal solution with the smallest possible MSE for the given data set, because other noise models or algorithms may outperform Algorithm (2).

Fig. 2
(A) MSE vs. iteration number curves for the ML-EM algorithm and Algorithm (2), when the phantom scaling factor is 1. (B) Data fidelity error vs. iteration number curves for the ML-EM algorithm and Algorithm (2), when the phantom scaling factor is 1.
Fig. 3
(A) MSE vs. iteration number curves for the ML-EM algorithm and Algorithm (2), when the phantom scaling factor is 10. (B) Data fidelity error vs. iteration number curves for the ML-EM algorithm and Algorithm (2), when the phantom scaling factor is 10.
Fig. 4
(A) MSE vs. iteration number curves for the ML-EM algorithm and Algorithm (2), when the phantom scaling factor is 100. (B) Data fidelity error vs. iteration number curves for the ML-EM algorithm and Algorithm (2), when the phantom scaling factor is 100.
Table II
Best Results for Noise Level 2 where the Data Scaling Factor is 1
Table III
Best Results for Noise Level 3 where the Data Scaling Factor is 10
Table IV
Best Results for Noise Level 4 where the Data Scaling Factor is 100

References

1. Lange K, Carson R. EM reconstruction algorithms for emission and transmission tomography. J Comput Assist Tomogr. 1984;8:306–316.
2. Shepp LA, Vardi Y. Maximum likelihood reconstruction for emission tomography. IEEE Trans Med Imag. 1982;1:113–122.
3. Zeng GL. Medical Imaging Reconstruction, A Tutorial. Beijing, China: Higher Education Press, Springer; 2009.