Home | About | Journals | Submit | Contact Us | Français |

**|**HHS Author Manuscripts**|**PMC2872793

Formats

Article sections

- Abstract
- 1 Introduction
- 2 The randomized Kaczmarz’s algorithm
- 3 The results of Strohmer and Vershynin
- 4 Theory and practice
- References

Authors

Related links

J Fourier Anal Appl. Author manuscript; available in PMC 2010 August 1.

Published in final edited form as:

J Fourier Anal Appl. 2009 August 1; 15(4): 431–436.

doi: 10.1007/s00041-009-9077-xPMCID: PMC2872793

NIHMSID: NIHMS154925

Yair Censor: li.ca.afiah.htam@riay; Gabor T. Herman: moc.oohay@namrehtrobag; Ming Jiang: nc.ude.ukp@gnaij-gnim

See other articles in PMC that cite the published article.

In a recent paper by T. Strohmer and R. Vershynin [“A Randomized Kaczmarz Algorithm with Exponential Convergence”, Journal of Fourier Analysis and Applications, published online on April 25, 2008] a “randomized Kaczmarz algorithm” is proposed for solving systems of linear equations ${\{\langle {a}^{i},x\rangle ={b}_{i}\}}_{i=1}^{m}$
. In that algorithm the next equation to be used in an iterative Kaczmarz process is selected with a probability proportional to ‖*a ^{i}*‖

In this note we point out that the reported success of the algorithm of Strohmer and Vershynin in their numerical simulation depends on the specific choices that are made in translating the underlying problem, whose geometrical nature is “find a common point of a set of hyperplanes”, into a system of algebraic equations. If this translation is carefully done, as in the numerical simulation provided by Strohmer and Vershynin for the reconstruction of a bandlimited function from its nonuniformly spaced sampling values, then indeed good performance may result. However, there will always be legitimate algebraic representations of the underlying problem (so that the set of solutions of the system of algebraic equations is exactly the set of points in the intersection of the hyperplanes), for which the selection method of Strohmer and Vershynin will perform in an inferior manner.

Kaczmarz’s algorithm [21] is a sequential projection method for the solution of linear systems of equations of the form *Ax* = *b*. It was rediscovered in the field of image reconstruction from projections in [13] where it is called the Algebraic Reconstruction Technique (ART), see, e.g., [5, 7, 17], and it is obtained also by applying to the hyperplanes described by the linear system the method of successive projections onto convex sets. The latter is called in the literature POCS (for projections onto convex sets), see, e.g., [25], and was originally published by Bregman [3] and further studied by Gubin, Polyak and Raik [15], see, e.g., Bauschke and Borwein [1] and Combettes [9] for reviews. It belongs also to the class of *row-action methods* described in [6], see also [7]. The literature on Kaczmarz’s algorithm is vast and ranges from beautiful mathematical theoretical results to many successful applications in important real-world problems. Kaczmarz’s method is also a special case of several other algorithmic structures, see, e.g., [4, 8, 12, 20].

In a recent paper [24] (see also [23]) Strohmer and Vershynin propose a randomized Kaczmarz’s algorithm and claim that “it outperforms all previously known methods on general extremely overdetermined systems.” In this note we point out that the reported success of the algorithm of [24] in their numerical simulation depends on the specific choices that are made in translating the underlying problem, whose geometrical nature is “find a common point of a set of hyperplanes”, into a system of algebraic equations. If this translation is carefully done, as in the numerical simulation provided by [24] for the reconstruction of a bandlimited function from its nonuniformly spaced sampling values, then indeed good performance may result. However, there will always be legitimate algebraic representations of the underlying problem (so that the set of solutions of the system of algebraic equations is exactly the set of points in the intersection of the hyperplanes), for which the selection method of [24] will perform in an inferior manner.

Each equation of the *m*× *n* system *Ax* = *b* gives rise to a hyperplane

$${H}_{i}=\{x\phantom{\rule{thinmathspace}{0ex}}\in \phantom{\rule{thinmathspace}{0ex}}{R}^{n}\phantom{\rule{thinmathspace}{0ex}}|\phantom{\rule{thinmathspace}{0ex}}\langle {a}^{i},x\rangle ={b}_{i}\}$$

(1)

in the *n*-dimensional Euclidean space *R ^{n}*, where

*x*^{0} *R ^{n} is arbitrary.*

*Given the current iterate x ^{k}, calculate the next iterate x^{k+1} by*

$${x}^{k+1}={x}^{k}+\frac{{b}_{r(i)-\langle {a}^{r(i)},{x}^{k}\rangle}}{\parallel {a}^{r(i)}{\parallel}^{2}}{a}^{r(i)},$$

(2)

* where r (i) is chosen from the set {1, 2, …, m} at random with probability proportional to ‖a ^{r (i)}‖ ^{2}*.

Obviously, the equations can be scaled, i.e., both sides of each equation *a ^{i}*,

Strohmer and Vershynin use the *scaled condition number* of a matrix *A*, introduced by Demmel [10],

$$\kappa \left(A\right):=\parallel A{\parallel}_{F}\parallel {A}^{-1}{\parallel}_{2},$$

(3)

where ‖*A*‖_{2} and ‖*A*‖ _{F} are the spectral norm and the Frobenius norm of *A*, respectively. Their main result [24, Theorem 2] states that any iterative sequence ${\{{x}^{k}\}}_{k=o}^{\mathrm{\infty}},$
, generated by Algorithm 1, converges to the solution of the system *Ax* = *b* with an expected exponential rate, and that the rate of convergence depends on the scaled condition number κ (*A*).

In their numerical experiments they compare what they call the standard Kaczmarz method (in which the rows of A are selected in their “natural” order 1, 2, …) with a simple randomized Kaczmarz method (in which the rows of A are selected at random with equal probability and with the *randomized Kaczmarz* method of Algorithm 1 (where the rows of *A* are selected at random with probability proportional to the 2-norm of the rows). They plot the least squares error versus the number of projections for each algorithm and claim that “Algorithm 1 significantly outperforms the other Kaczmarz methods, demonstrating not only the power of choosing the projections at random, but also the importance of choosing the projections according to their relevance.”

Clearly, Kaczmarz’s method is a *geometric algorithm* (this term was used by Gordon and Mansour [14]) in the sense that the sequence of iterates generated by it depends only on the hyperplanes defined by the equations and not on any particular algebraic representation of the hyperplanes. One can always use real nonzero numbers, say *c _{i}*,

A scaling of the equations will change the system matrix *A* and its scaled condition number κ(*A*) and, in the light of [24], it might be tempting to think that it is possible to control in such a way the convergence rate of Kaczmarz’s method. However, the geometric nature of Kaczmarz’s method precludes such a possibility. Given any probability distribution whatsoever to be used for selecting the next hyperplane in Algorithm 1, we can always use real nonzero numbers to divide through the equations so that the rule for selecting the rows in Algorithm 1 will be exactly according to that probability distribution.^{1} Thus, the rule proposed in Algorithm 1 cannot be in general optimal (or even in any sense superior).

Thus, the results shown in Figure 1 of [24], cannot be fully explained by the presented theory there. It is true that the convergence behavior depends on the order-selection of hyperplanes (as is well-known), but this dependence is determined by the geometry of the hyperplanes (the angles among them) and not by their algebraic representation, see, e.g., [2, 11, 16, 18, 19, 22].

Stated differently, if, for any reason whatsoever, we believe that at iteration *k* hyperplane *H _{i}* should be picked with higher probability then hyperplane

To avoid inferior behavior by the randomized Kaczmarz algorithm, it is essential that the system of algebraic equations that represent the set of hyperplanes be carefully chosen. Indeed, this was done by Strohmer and Vershynin [24] in their numerical simulation in Section 4.1, see their equation (18). Had they selected a different algebraic representation, they would have obtained different convergence behavior. In particular, if they scaled their equations so that ‖*a ^{i}*‖ = 1 after scaling, then the reported difference between the behaviors of simple randomized Kaczmarz and of Algorithm 1 would have disappeared. And, if by the luck of the draw, the algebraic representation happened to be such that the norm associated with one equation is very much larger than the norms associated with the other equations, then the progress made by Algorithm 1 towards a solution would be poor due to the fact that the random selection would keep selecting the same equation most of the time.

This work was supported by grant No. 2003275 of the United States-Israel Binational Science Foundation (BSF), by a Chinese NKBRSF grant No. 2003CB716101 and by National Science Foundation of China (NSFC) grants No. 60325101, 60532080, and 60628102 and by Award Number R01HL070472 from the National Heart, Lung, And Blood Institute. The content is solely the responsibility of the authors and does not necessarily represent the official views of the National Heart, Lung, And Blood Institute or the National Institutes of Health.

’This is an author-created, un-copyedited version of an article accepted for publication in Journal of Fourier Analysis and Applications. The definitive publisher authenticated version is available online at http://www.springerlink.com/content/t861180p278v/?p=9c1df8db5f2348dab0f861d9c4ee911d&pi=0

^{1}For any distribution λ * _{i}* > 0,

1. Bauschke HH, Borwein JM. On projection algorithms for solving convex feasibility problems. SIAM Review. 1996;38:367–426.

2. Bauschke HH, Borwein JM, Lewis AS. The method of cyclic projections for closed convex sets in Hilbert space. Contemporary Mathematics. 1997;204:1–38.

3. Bregman LM. The method of successive projections for finding a common point of convex sets. Soviet Mathematics Doklady. 1965;6:688–692.

4. Butnariu D, Davidi R, Herman GT, Kazantsev IG. Stable convergence behavior under summable perturbations of a class of projection methods for convex feasibility and optimization problems. IEEE Journal of Selected Topics in Signal Processing. 2007;1:540–547.

5. Byrne CL. Applied Iterative Methods. Wellesley, MA, USA: A. K. Peters Ltd.; 2008.

6. Censor Y. Row-action methods for huge and sparse systems and their applications. SIAM Review. 1981;23:444–466.

7. Censor Y, Zenios SA. Parallel Optimization: Theory, Algorithms, and Applications. New York, NY, USA: Oxford University Press; 1997.

8. Censor Y, Elfving T, Herman GT, Nikazad T. On diagonallyrelaxed orthogonal projection methods. SIAM Journal on Scientific Computing. 2008;30:473–504.

9. Combettes PL. The convex feasibility problem in image recovery. Advances in Imaging and Electron Physics. 1996;95:155–270.

10. Demmel JW. The probability that a numerical analysis problem is difficult. Mathematics of Computation. 1988;50:449–480.

11. Deutsch F, Hundal H. The rate of convergence for the cyclic projections algorithm I: Angles between convex sets. Journal of Approximation Theory. 2006;142:36–55.

12. Eggermont PPB, Herman GT, Lent A. Iterative algorithms for large partitioned linear systems, with applications to image reconstruction. Linear Algebra and Its Applications. 1981;40:37–67.

13. Gordon R, Bender R, Herman GT. Algebraic reconstruction techniques (ART) for three-dimensional electron microscopy and x-ray photography. Journal of Theoretical Biology. 1970;29:471–481. [PubMed]

14. Gordon D, Mansour R. A geometric approach to quadratic optimization: an improved method for solving strongly underdetermined systems in CT. Inverse Problems in Science & Engineering. 2007;15:811–826.

15. Gubin L, Polyak B, Raik E. The method of projections for finding the common point of convex sets. USSR Computational Mathematics and Mathematical Physics. 1967;7:1–24.

16. Hamaker C, Solmon D. The angles between the null space of X-rays. Journal of Mathematical Analysis and Applications. 1978;62:1–23.

17. Herman GT. Image Reconstruction From Projections: The Fundamentals of Computerized Tomography. New York, NY, USA: Academic Press; 1980.

18. Herman GT. Algebraic reconstruction techniques in medical imaging. In: Leondes CT, editor. Medical Imaging Systems Techniques and Applications: Computational Techniques. Amsterdam, The Netherlands: Gordon and Breach, Overseas Publishers Association (OPA); 1998. pp. 1–42.

19. Herman GT, Meyer LB. Algebraic reconstruction techniques can be made computationally efficient. IEEE Transactions on Medical Imaging. 1993;12:600–609. [PubMed]

20. Jiang M, Wang G. Convergence studies on iterative algorithms for image reconstruction. IEEE Transactions on Medical Imaging. 2003;22:569–579. [PubMed]

21. Kaczmarz S. Angenäherte Auflösung von Systemen linearer Gleichungen. Bulletin de l’Académie Polonaise des Sciences et Lettres. 1937;A35:355–357.

22. Kayalar S, Weinert HL. Error bounds for the method of alternating projections. Mathematics of Control, Signals, and Systems. 1988;1:43–59.

23. Strohmer T, Vershynin R. A randomized solver for linear systems with exponential convergence. In: Diaz J, Jansen K, PRolimand JD, Zwick U, editors. Approximation, Randomization and Combinatorial Optimization: Algorithms and Techniques, Lecture Notes in Computer Science (LNCS) Vol. 4110. Berlin, Heidelberg, Germany: Springer-Verlag; 2006. pp. 499–507.

24. Strohmer T, Vershynin R. A randomized Kaczmarz algorithm with exponential convergence. Journal of Fourier Analysis and Applications, to appear. The OnlineFirst version is available at: http://www.springerlink.com/content/802836566684l332/

25. Youla DC. Mathematical theory of image restoration by the method of convex projections. In: Stark H, editor. Image Recovery: Theory and Applications. Orlando, Florida, USA: Academic Press; 1987. pp. 29–77.

PubMed Central Canada is a service of the Canadian Institutes of Health Research (CIHR) working in partnership with the National Research Council's national science library in cooperation with the National Center for Biotechnology Information at the U.S. National Library of Medicine(NCBI/NLM). It includes content provided to the PubMed Central International archive by participating publishers. |