Proc SPIE Int Soc Opt Eng. Author manuscript; available in PMC 2010 September 29.
Published in final edited form as:
Proc SPIE Int Soc Opt Eng. 2006 January 1; 6272: 62721W.
doi:  10.1117/12.672798
PMCID: PMC2947459
NIHMSID: NIHMS233983

Task Performance in Astronomical Adaptive Optics

Abstract

In objective or task-based assessment of image quality, figures of merit are defined by the performance of some specific observer on some task of scientific interest. This methodology is well established in medical imaging but is just beginning to be applied in astronomy. In this paper we survey the theory needed to understand the performance of ideal or ideal-linear (Hotelling) observers on detection tasks with adaptive-optical data. The theory is illustrated by discussing its application to detection of exoplanets from a sequence of short-exposure images.

Keywords: Adaptive optics, image quality, covariance, detection, exoplanets, Hotelling observer

1. INTRODUCTION

Adaptive optical (AO) systems have reached a high level of sophistication, and many ingenious system configurations and data-processing algorithms have been proposed. Methods for specifying system performance and image quality, on the other hand, seldom go beyond Strehl ratio. Indeed, it is often said that the goal of adaptive optics is to improve the Strehl ratio.

In this paper we take a different view, one adopted from the medical-imaging literature. In this view, the goal of a medical imaging system is to perform a specific task of clinical interest, and the system performance is determined by how well the task can be performed. Similarly, astronomical AO systems are used for specific tasks of scientific interest, and it is reasonable to define image quality in terms of performance on these tasks. This approach has proven valuable for evaluating and optimizing radiological imaging systems,1 and it is the premise of this paper that it will also be useful in adaptive optics.

In both fields, the tasks of interest are either classification or estimation. In a classification task, the goal is to assign the object that produced the image to one of two or more classes or hypotheses. In an estimation task, the goal is to determine numerical values of one or more parameters describing the object. The method by which the task is performed is called the observer.

An example of a classification task in medical imaging is detection of a tumor in a mammogram. The observer for this task can be either a human (a radiologist) or an automated detection algorithm, and in either case task performance can be specified by an ROC (receiver operating characteristic) curve, described briefly in Sec. 2.

An example of an estimation task in medical imaging is determination of the uptake of a targeted radiotracer that binds preferentially to a specific neuroreceptor. The data set for this task is a reconstructed tomographic image obtained by PET (positron emission tomography) or SPECT (single-photon emission computed tomography), and various mathematical observers are used to determine the uptake from the reconstructed image. Performance in this case can be defined either by accuracy of the estimate (including both bias and variance) or by the usefulness of the estimated value in performing a subsequent diagnostic classification.

Detection of a faint object such as an asteroid on a cluttered star field has much in common with detection of a small tumor in a mammogram, where normal anatomical structures obscure the tumor in the same way as background astronomical objects obscure the asteroid. Similarly, detection of an exoplanet is analogous to detection of a metastatic bone tumor on a rib in a SPECT image when the tumor is adjacent to the heart; like the host star in astronomy, the heart appears very bright since it is large and emits photons of the same energy as does the tumor, and it can be challenging to separate the two in a noisy, low-resolution image.

A common estimation task in astronomy is determination of the brightness of a star. Known as photometry, this task is completely analogous to assaying the local uptake of a radiotracer in SPECT or PET imaging.

For both estimation and classification tasks, calculation of task performance (in either medicine or astronomy) requires detailed knowledge of all statistical effects that influence the image or the observer. In radiology, the images are random because of photon noise from the discrete x rays or gamma rays, but also because the objects themselves are random. The photon noise in raw projection data is well described by independent Poisson statistics, but tomographic reconstruction algorithms lead to spatial correlations and more complicated image statistics. Object randomness includes normal anatomical structures as well as randomness in the tumor or other object of diagnostic interest, and these stochastic processes must also be transformed through the image-forming elements and any post-processing such as tomographic reconstruction. The images are said to be doubly stochastic1 since there is randomness from both object variability and measurement noise.

In AO systems, there is an additional source of randomness, the point-spread function (PSF) of the imaging system itself. If the adaptive optics functioned perfectly, this PSF would be a nonrandom Airy pattern, but residual speckle and unknown residual aberrations make it imperative to treat the PSF as random, so AO images are triply stochastic. Even though one of the three stochastic components, the Poisson noise from discrete photoelectric events, is statistically independent from measurement to measurement, the contributions from object and PSF randomness produce correlations and complicated statistical distributions.

In a paper recently submitted for publication,2 we presented expressions for the mean data and spatiotemporal covariance matrices for general AO systems with all three sources of randomness. We then related these expressions to three specific tasks of interest in astronomical adaptive optics: (1) detection of a faint point object on a complex background with random spatial and temporal structure; (2) detection of a faint companion at an unknown location around a bright star, and (3) photometry of a given star in a crowded field. In all three cases, expressions were derived for the form of the optimal linear observer for performing the task as well as for the final performance achievable. Practical ways of calculating or estimating these covariance components were discussed, and ways of using them to compute task-based measures of image quality were presented.

The present paper concentrates on detection tasks, but there is a brief discussion of joint detection and estimation. It expands considerably on the analysis of Ref. 2 by looking at the Bayesian ideal observer, which almost always requires nonlinear operations on the data. The example of detection of an exoplanet from a sequence of short-exposure images is treated in some detail. Previous work on Bayesian detection of exoplanets3, 4 has considered only a single long-exposure image and a much simpler stochastic model.

Section 2 introduces the notation and some important definitions relating to triply stochastic spatiotemporal data, and Sec. 3 briefly reviews some key concepts from statistical decision theory. Section 4 summarizes results from ref. 2 on triply stochastic covariance matrices for AO systems and adds some discussion of probability density functions. Section 5 shows how these ideas can be applied to detection of an exoplanet.

2. DEFINITIONS AND NOTATION

This section introduces the notation that will be used in this paper, reviews the basic principles of multiply stochastic data, and discusses the nontrivial concept of linearity.

2.1. Doubly stochastic spatial data

Consider a digital imaging system viewing a time-independent object f(r), where r = (x, y) is a 2D position vector. We shall often think of this object as a vector in a Hilbert space, in which case we shall denote it as f.

A single digital image consists of M pixel values, {g_m, m = 1, …, M}, which can be considered as the components of an M × 1 vector g. As discussed above, this vector is random because of measurement noise and because different objects, selected at random from some ensemble of objects, will be imaged on different repetitions of the experiment. We can define the statistical expectation of this image with respect to both sources of randomness as

\bar{\bar{\mathbf{g}}} \equiv \int d\mathbf{g}\, \mathrm{pr}(\mathbf{g})\, \mathbf{g},
(2.1)

where pr(g) is the overall M-dimensional probability density function (PDF) for the data, and the integral runs over all values that can be assumed by each of the components of g.

We can indicate the doubly stochastic nature of the average by writing

\mathrm{pr}(\mathbf{g}) = \int d\mathbf{f}\, \mathrm{pr}(\mathbf{g} \mid \mathbf{f})\, \mathrm{pr}(\mathbf{f}),
(2.2)

where pr(f) is an abstract way of indicating the PDF on all parameters needed to express the statistics of the random process f(r). Thus,

\bar{\bar{\mathbf{g}}} = \int d\mathbf{g} \int d\mathbf{f}\, \mathrm{pr}(\mathbf{g} \mid \mathbf{f})\, \mathrm{pr}(\mathbf{f})\, \mathbf{g} = \big\langle \langle \mathbf{g} \rangle_{\mathbf{g} \mid \mathbf{f}} \big\rangle_{\mathbf{f}}.
(2.3)

We can also define a conditional average over the measurement noise alone by

\bar{\mathbf{g}}(\mathbf{f}) \equiv \langle \mathbf{g} \rangle_{\mathbf{g} \mid \mathbf{f}} = \int d\mathbf{g}\, \mathrm{pr}(\mathbf{g} \mid \mathbf{f})\, \mathbf{g}.
(2.4)

The imaging system is said to be linear if the conditional mean ḡ(f) is a linear function of f, in which case the components of ḡ can be expressed as1

\bar{g}_m = \int_\infty d^2r\, h_m(\mathbf{r})\, f(\mathbf{r}),
(2.5)

where the kernel h_m(r) is the response at pixel m for a point source at r. The subscript ∞ indicates that the integral runs over the entire x–y plane.

The overall average image, in component form, is given by

\bar{\bar{g}}_m = \int_\infty d^2r\, h_m(\mathbf{r})\, \bar{f}(\mathbf{r}),
(2.6)

where f̄(r) is the ensemble-average object.

2.2. Triply stochastic spatiotemporal data

Now consider an AO system that produces a temporal sequence of J images while viewing a time-dependent incoherent object f(r, t). The jth image in the sequence is indicated by the M × 1 vector g(j), and the entire sequence is denoted G.

By analogy to the doubly stochastic case, we shall use a single overbar to indicate an average over the measurement noise alone, conditional on a specific object and PSF. For incoherent imaging, this average is a linear function of the spatiotemporal object, given by

\bar{g}_m^{(j)} = \int d^2r_d\, d_m(\mathbf{r}_d) \int_{t_j}^{t_j+T} dt \int_\infty d^2r\, p(\mathbf{r}_d, \mathbf{r}, t)\, f(\mathbf{r}, t),
(2.7)

where rd is a 2D vector in the image plane, T is the frame time, dm(rd) describes the sensitivity of the mth detector pixel, and p(rd, r, t) is the random, anisoplanatic, time-dependent, incoherent PSF of the system.

As in ref. 2, we shall assume that the object is a slowly varying function of time, essentially constant over one frame of the science camera, in which case (2.7) becomes

\bar{g}_m^{(j)} = \int_\infty d^2r\, h_m^{(j)}(\mathbf{r})\, f^{(j)}(\mathbf{r}), \qquad m = 1, \ldots, M, \quad j = 1, \ldots, J,
(2.8)

where f(j)(r) = f(r, tj),

h_m^{(j)}(\mathbf{r}) = \int d^2r_d\, d_m(\mathbf{r}_d)\, p^{(j)}(\mathbf{r}_d, \mathbf{r}),
(2.9)

and

p^{(j)}(\mathbf{r}_d, \mathbf{r}) = \int_{t_j}^{t_j+T} dt\, p(\mathbf{r}_d, \mathbf{r}, t).
(2.10)

The sequence of object functions, {f(j)(r), j = 1, …, J}, will be denoted F, and the sequence of PSFs, {p(j)(rd, r), j = 1, …, J}, will be denoted P.

A second overbar will be used to indicate an average over the random PSFs P conditional on F. In component form,

\bar{\bar{g}}_m^{(j)} = \bar{\bar{g}}_m^{(j)}\big(f^{(j)}\big) = \int_\infty d^2r\, \bar{h}_m^{(j)}(\mathbf{r})\, f^{(j)}(\mathbf{r}),
(2.11)

where the average kernel is related to the average incoherent PSF by

\bar{h}_m^{(j)}(\mathbf{r}) = \int d^2r_d\, d_m(\mathbf{r}_d)\, \bar{p}^{(j)}(\mathbf{r}_d, \mathbf{r}).
(2.12)

The final average, over the object variability, yields

\bar{\bar{\bar{g}}}_m^{(j)} = \int_\infty d^2r\, \bar{h}_m^{(j)}(\mathbf{r})\, \big\langle f^{(j)}(\mathbf{r}) \big\rangle_{\mathbf{F}} = \int_\infty d^2r\, \bar{h}_m^{(j)}(\mathbf{r})\, \bar{f}^{(j)}(\mathbf{r}),
(2.13)

where the second form holds only if p(j) is statistically independent of F.

Some special cases of these averages will be discussed in Sec. 4.
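As a concrete, deliberately simplified illustration of this triply stochastic data model, the following Python sketch generates a frame sequence G according to a discretized version of (2.8): a fixed object is blurred by a different randomly drawn PSF in each frame and the result is corrupted by Poisson noise. The Gaussian PSF with a randomly fluctuating width is only a stand-in for a true residual-speckle PSF, and the grid size, frame count and flux level are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

N = 64          # object and image grid is N x N (illustrative)
J = 10          # number of frames in the sequence G
flux = 5e3      # mean photons from the object per frame (assumed)

# Static object: a single off-axis point source on a faint uniform background.
f = np.full((N, N), 0.01)
f[40, 24] += 1.0
f *= flux / f.sum()

def random_psf(N, r0_pix):
    """Toy random PSF: a Gaussian blur whose width fluctuates from frame to frame,
    standing in for the random residual-speckle PSF p^(j)(r_d, r)."""
    x = np.arange(N) - N // 2
    X, Y = np.meshgrid(x, x)
    sigma = N / (2.0 * r0_pix)
    psf = np.exp(-(X**2 + Y**2) / (2.0 * sigma**2))
    return psf / psf.sum()

def frame_mean(f, psf):
    """Discrete analogue of (2.8): noise-free mean counts for one frame,
    computed as a (circular) convolution of the object with that frame's PSF."""
    otf = np.fft.fft2(np.fft.ifftshift(psf))
    return np.real(np.fft.ifft2(np.fft.fft2(f) * otf))

# Triply stochastic data: a random PSF per frame, then Poisson measurement noise.
G = np.empty((J, N, N))
for j in range(J):
    r0 = rng.uniform(6.0, 10.0)                     # random "seeing" parameter per frame
    gbar_j = frame_mean(f, random_psf(N, r0))       # conditional mean, eq. (2.8)
    G[j] = rng.poisson(np.clip(gbar_j, 0.0, None))  # measurement noise

print(G.shape, G.sum())
```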

3. CONCEPTS FROM STATISTICAL DECISION THEORY

In this section we briefly review some basic concepts from statistical decision theory, relating them specifically to triply stochastic spatiotemporal data.

3.1. Detection problems and ROC curves

In any classification task, the goal is to assign the object that produced an image to one of two or more classes. If the hypothesis that the object belongs to the kth class is denoted Hk, then the probability law for the data when hypothesis Hk is true is denoted pr(G|Hk). In a signal-detection task, the hypotheses are signal-absent and signal-present.

If we assume that each image must be assigned without equivocation either to hypothesis H0 (signal absent) or to H1 (signal present), the decision on a detection task can be made in complete generality by computing some scalar test statistic t(G) from the data; the observer then decides H1 if the test statistic is greater than a decision threshold and decides H0 otherwise. The value of the threshold controls the tradeoff between true positive decisions (correctly choosing H1) and false positive decisions (choosing H1 when H0 is true). In signal-detection problems, the true-positive fraction (TPF) is called the probability of detection, and the false-positive fraction (FPF) is called the false-alarm rate.

A plot of TPF vs. FPF as the threshold is varied is called a receiver operating characteristic (ROC) curve. Meaningful figures of merit for binary classification include the true positive fraction at a specified false-positive fraction (the Neyman-Pearson criterion), the area under the ROC curve (AUC), and certain detectability indices derived from the ROC curve. The probability of detection alone is not a meaningful metric since it can always be made large, even unity, simply by choosing a low threshold.
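The ROC curve and its area can be estimated directly from samples of any observer's test statistic under the two hypotheses. The short Python sketch below does this empirically; the normally distributed test statistics are placeholders for the output of an actual observer applied to simulated AO data.

```python
import numpy as np

def empirical_roc(t0, t1):
    """Empirical ROC curve from test-statistic samples under H0 (t0) and H1 (t1):
    sweep the decision threshold and record (FPF, TPF) pairs."""
    thresholds = np.sort(np.concatenate([t0, t1]))[::-1]
    fpf = np.array([(t0 > thr).mean() for thr in thresholds])
    tpf = np.array([(t1 > thr).mean() for thr in thresholds])
    return fpf, tpf

def auc(t0, t1):
    """Area under the ROC curve via the Wilcoxon-Mann-Whitney statistic,
    Pr(t1 > t0) + 0.5 Pr(t1 = t0)."""
    diff = t1[:, None] - t0[None, :]
    return (diff > 0).mean() + 0.5 * (diff == 0).mean()

rng = np.random.default_rng(1)
t0 = rng.normal(0.0, 1.0, 2000)   # signal-absent test statistics (toy stand-in)
t1 = rng.normal(1.5, 1.0, 2000)   # signal-present test statistics (toy stand-in)
fpf, tpf = empirical_roc(t0, t1)
print("AUC =", auc(t0, t1))
```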

3.2. Random signals, backgrounds and PSFs

In the medical imaging literature, it is common to divide detection tasks further depending on whether the background object, the signal to be detected or both are random. The most tractable (and least realistic) problem is one where the background is specified exactly and the signal, if present, is also specified exactly; such problems are known as SKE/BKE (signal known exactly, background known exactly). The only remaining randomness in an SKE/BKE problem is the measurement noise and, of course, whether or not the signal is present. Though SKE/BKE problems are never realistic models for the actual clinical tasks of interest, we shall see that they are necessary stepping stones to the analysis of more practical problems.

In clinical practice, the background is not known. In mammography, for example, the normal anatomical structures vary randomly from patient to patient and must be treated as a spatial random process. To make any progress on this problem, we must have at least partial knowledge of the statistics of this random process, so in this case we say that the problem is BKS (background known statistically).

Similarly, in real clinical applications, the signal to be detected (e.g., a possible tumor) is also random in size, shape and location. Again, we must have at least partial statistical knowledge of these parameters in order to define the signal part of the image, so this situation is referred to as SKS (signal known statistically).

These distinctions are needed in astronomy as well. In planet detection, for example, the planet brightness and location are not known ahead of time, so the problem is inherently SKS. Sky background and other astronomical objects contribute to a random background, so real astronomical problems are also BKS.

In astronomy, however, there is the additional complication of the random PSF. In a perfect AO system or in a long-exposure image where the PSF fluctuations average out, the PSF is nonrandom and the problem is designated PKE (PSF known exactly). On the other hand, if the residual speckle is important as in a sequence of short-exposure images, the problem is PKS (PSF known statistically).

3.3. Ideal Bayesian observer

The ideal observer on a detection task is defined variously as one that maximizes the area under the ROC curve, maximizes the TPF at all specified FPFs or minimizes a cost function defined in terms of TPF and FPF. By any of these criteria, the test statistic used by the ideal observer is the likelihood ratio,

\Lambda(\mathbf{G}) \equiv \frac{\mathrm{pr}(\mathbf{G} \mid H_1)}{\mathrm{pr}(\mathbf{G} \mid H_0)}.
(3.1)

Equivalently, the ideal observer can use the logarithm of the likelihood ratio, λ(G) ≡ ln Λ(G), as the test statistic. In either case, the ideal observer computes the test statistic and compares it to a threshold, deciding that H1 is true if the statistic exceeds the threshold. Some optimization criteria fix the value of the threshold according to the prevalences of the two hypotheses or costs assigned to erroneous decisions, but in every case the ideal test statistic is the likelihood ratio or its logarithm.

In practice, the relevant probabilities pr(G|Hk) in (3.1) are seldom known directly. Instead, they are given by

\mathrm{pr}(\mathbf{G} \mid H_k) = \int d\mathbf{F}\, \mathrm{pr}(\mathbf{G} \mid \mathbf{F})\, \mathrm{pr}(\mathbf{F} \mid H_k) = \big\langle \mathrm{pr}(\mathbf{G} \mid \mathbf{F}) \big\rangle_{\mathbf{F} \mid H_k}, \qquad k = 0, 1.
(3.2)

For a PKE problem, pr(G|F) describes the measurement noise, and it is easy to compute from the basic physics of the problem. For detectors with no significant readout noise, for example, pr(G|F) describes independent Poisson random variables at each pixel in each frame. With no PSF randomness and the specific object F, knowledge of the mean image at each pixel and frame is sufficient to determine pr(G|F).

If the PSF is random, it is useful to write

\mathrm{pr}(\mathbf{G} \mid H_k) = \int d\mathbf{P} \int d\mathbf{F}\, \mathrm{pr}(\mathbf{G} \mid \mathbf{P}, \mathbf{F})\, \mathrm{pr}(\mathbf{P} \mid \mathbf{F})\, \mathrm{pr}(\mathbf{F} \mid H_k) = \Big\langle \big\langle \mathrm{pr}(\mathbf{G} \mid \mathbf{P}, \mathbf{F}) \big\rangle_{\mathbf{P} \mid \mathbf{F}} \Big\rangle_{\mathbf{F} \mid H_k}.
(3.3)

The conditioning of the average over PSF P on the object F is needed in an AO system if the control signals for the adaptive element are derived from the object of interest; if a separate guide star is used and not treated as a part of F, then the object and PSF are statistically independent. In either case, pr(G|P, F) is easily derived from the physics of the problem. Markov-chain Monte Carlo (MCMC) methods can in many cases be used to perform the two averages.1
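As a minimal illustration of the marginalization in (3.2), the Python sketch below estimates ln pr(G|Hk) by plain Monte Carlo averaging of a Poisson likelihood over samples of the nuisance parameters and then forms the log-likelihood ratio. The samplers sample_mean_H0 and sample_mean_H1 are hypothetical toy models (a random uniform background with or without a known signal), and plain Monte Carlo is used here in place of the MCMC machinery discussed in ref. 1.

```python
import numpy as np
from scipy.special import gammaln

rng = np.random.default_rng(2)

def log_poisson(G, gbar):
    """ln pr(G | F, P): independent Poisson counts with mean gbar at each measurement."""
    gbar = np.clip(gbar, 1e-12, None)
    return np.sum(G * np.log(gbar) - gbar - gammaln(G + 1.0))

def log_marginal(G, sample_mean, n_samples=500):
    """Monte Carlo estimate of ln pr(G | Hk) = ln < pr(G | F, P) >, as in (3.2)/(3.3).
    sample_mean() draws one mean data vector under hypothesis Hk, i.e. one joint
    sample of the object and PSF randomness."""
    logs = np.array([log_poisson(G, sample_mean()) for _ in range(n_samples)])
    m = logs.max()                                   # log-sum-exp for numerical stability
    return m + np.log(np.mean(np.exp(logs - m)))

M = 32                                               # number of measurements (illustrative)
def sample_mean_H0():
    b = rng.gamma(shape=50.0, scale=1.0)             # random background level (toy BKS model)
    return np.full(M, b)

def sample_mean_H1():
    gbar = sample_mean_H0()
    gbar[M // 2] += 30.0                             # known signal at a fixed pixel (toy SKE)
    return gbar

G = rng.poisson(sample_mean_H1())                    # one simulated data realization
lam = log_marginal(G, sample_mean_H1) - log_marginal(G, sample_mean_H0)
print("log-likelihood ratio:", lam)
```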

3.4. Ideal linear observer

Often the likelihood ratio is difficult to compute, and in these cases a useful alternative to the ideal observer is the ideal linear observer, called the Hotelling observer1, 5–7 in the medical literature.

Linear observers compute linear discriminants, so the test statistic for an image sequence has the form t(G) = WᵗG, where W is another image sequence called the template. The notation WᵗG denotes a pixel-by-pixel, frame-by-frame scalar product of the template with the spatiotemporal data.

The Hotelling discriminant uses a template that maximizes a certain class separability measure.8 Linear test statistics are usually normally distributed by dint of the central limit theorem, and in this case maximizing this class separability is equivalent to maximizing the AUC among linear observers.

It can also be shown that the Hotelling test statistic is equal to the log-likelihood ratio if the raw data are normal with the same covariance under both hypotheses, so the Hotelling observer is identical to the ideal observer in this case and thus maximizes AUC among all observers, not just linear ones.

Computation of the Hotelling test statistic requires only the overall mean vectors and the covariance matrices of the data under the two hypotheses. The test statistic is given by

t_{\mathrm{Hot}}(\mathbf{G}) = \mathbf{W}^t \mathbf{G} = \big[\bar{\bar{\mathbf{G}}}_1 - \bar{\bar{\mathbf{G}}}_0\big]^t \mathbf{K}_{\mathrm{av}}^{-1}\, \mathbf{G}, \qquad \text{where} \quad \mathbf{K}_{\mathrm{av}} \equiv \tfrac{1}{2}\big[\mathbf{K}_{\mathbf{G} \mid H_1} + \mathbf{K}_{\mathbf{G} \mid H_0}\big].
(3.4)

The inverse of the average covariance matrix is related to the familiar signal-processing operation of prewhitening, and for this reason, the Hotelling observer is sometimes called a prewhitening matched filter; unless the noise is stationary, however, the prewhitening and matched filtering cannot be carried out in the Fourier domain.

A figure of merit for the Hotelling observer is the Hotelling signal-to-noise ratio, sometimes called the Hotelling trace; it is given by

\mathrm{SNR}_{\mathrm{Hot}}^2 \equiv \big[\bar{\bar{\mathbf{G}}}_1 - \bar{\bar{\mathbf{G}}}_0\big]^t \mathbf{K}_{\mathrm{av}}^{-1} \big[\bar{\bar{\mathbf{G}}}_1 - \bar{\bar{\mathbf{G}}}_0\big] = \mathrm{tr}\Big\{ \mathbf{K}_{\mathrm{av}}^{-1} \big[\bar{\bar{\mathbf{G}}}_1 - \bar{\bar{\mathbf{G}}}_0\big] \big[\bar{\bar{\mathbf{G}}}_1 - \bar{\bar{\mathbf{G}}}_0\big]^t \Big\},
(3.5)

where tr{·} denotes the trace (sum of the diagonal elements) of the matrix.

Computational approaches to evaluating the Hotelling test statistic and SNR are discussed in refs. 1 and 2.
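For orientation, a minimal numerical sketch of (3.4) and (3.5) follows. The covariance matrix and mean signal used here are synthetic placeholders; in an AO application they would be the triply stochastic covariance of Sec. 4 and the mean planet signal.

```python
import numpy as np

def hotelling(mean0, mean1, K0, K1):
    """Hotelling template, test statistic and SNR, following eqs. (3.4)-(3.5).
    mean0, mean1: mean data vectors under H0 and H1 (length MJ, frames stacked).
    K0, K1: the corresponding MJ x MJ covariance matrices."""
    Kav = 0.5 * (K0 + K1)
    ds = mean1 - mean0                            # mean signal
    W = np.linalg.solve(Kav, ds)                  # template W = Kav^{-1} (mean1 - mean0)
    snr2 = ds @ W                                 # Hotelling SNR^2, eq. (3.5)
    t = lambda G: W @ G.ravel()                   # test statistic t(G) = W^t G
    return W, t, snr2

# Toy example with correlated data of illustrative size.
rng = np.random.default_rng(3)
MJ = 200
A = rng.normal(size=(MJ, MJ)) / np.sqrt(MJ)
K = A @ A.T + 0.5 * np.eye(MJ)                    # same covariance under both hypotheses
s = np.zeros(MJ); s[90:110] = 0.3                 # mean signal (placeholder)
W, t, snr2 = hotelling(np.zeros(MJ), s, K, K)
print("Hotelling SNR =", np.sqrt(snr2))
```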

3.5. Detection with location uncertainty

In most detection problems, we want to know not only that a signal is present but where it is. A useful alternative to the ROC curve in these cases is the LROC (localization ROC) curve, which plots the probability of detection and correct localization (within some preset tolerance) against the false-alarm rate. In the medical literature, the common figure of merit for joint detection/localization is the area under the LROC curve (ALROC).

As developed so far, neither the ideal nor the Hotelling observer gives any information about signal location, but both are readily modified to do so. A well-known way of using the likelihood for any detection problem with parameter uncertainty is the generalized likelihood ratio test (GLRT), in which the numerator in (3.1) is modified to

\Lambda(\mathbf{G}; \boldsymbol{\theta}) \equiv \frac{\mathrm{pr}(\mathbf{G} \mid \boldsymbol{\theta}, H_1)}{\mathrm{pr}(\mathbf{G} \mid H_0)},
(3.6)

where the vector θ specifies the location or any other random parameter of the signal. In a GLRT, the parameter-dependent likelihood ratio Λ(G; θ) is first maximized with respect to θ, and the maximized result is compared to a threshold to decide whether the signal is present; if that decision is positive, the maximizing value of θ is taken as the estimate of the signal parameters.

For the case where θ is signal location, Khurd and Gindi9 have shown that a modified GLRT, where the likelihood ratio is weighted by the probability of occurrence of each location, maximizes ALROC.

The Hotelling observer alone is ineffective with location uncertainty because the mean signal difference that appears in (3.4) and (3.5) tends to a small constant, independent of location in the image, when the double-bar average includes random signal location. As with the GLRT, however, a simple modification salvages the concept. The Hotelling template can be defined for an SKE problem with a specified location. The template can then be scanned over all possible locations, and the resulting maximum value becomes the final test statistic for the detection decision. This strategy, called the scanning Hotelling observer, is discussed in more detail in refs. 1 and 2.
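A skeletal version of the scanning strategy is sketched below. The per-location templates are random placeholders here; in practice each row would be the SKE Hotelling template of (3.4) computed for a planet at that location.

```python
import numpy as np

def scanning_hotelling(G, templates):
    """Scanning Hotelling observer: apply the location-specific SKE template for each
    candidate location and take the maximum response as the detection statistic.
    templates: array of shape (n_locations, MJ); G: data with MJ elements."""
    responses = templates @ G.ravel()
    i_max = int(np.argmax(responses))
    return responses[i_max], i_max        # test statistic and estimated location index

rng = np.random.default_rng(4)
MJ, n_loc = 200, 16
templates = rng.normal(size=(n_loc, MJ))  # placeholders for Kav^{-1} times each mean signal
G = rng.normal(size=MJ)                   # placeholder data
t_max, loc = scanning_hotelling(G, templates)
print(t_max, loc)
```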

4. STATISTICAL PROPERTIES OF AO IMAGES

As we saw in the previous section, the covariance matrices under both hypotheses are needed for the Hotelling or scanning Hotelling observer, and the full PDFs of the data are needed for the ideal observer or the GLRT. In this section we discuss some approaches to obtaining these statistical descriptions.

4.1. Covariance matrix

The overall covariance matrix of a triply stochastic image sequence is defined as

\mathbf{K}_{\mathbf{G}} \equiv \big\langle [\mathbf{G} - \bar{\bar{\bar{\mathbf{G}}}}][\mathbf{G} - \bar{\bar{\bar{\mathbf{G}}}}]^t \big\rangle_{\mathbf{G},\mathbf{P},\mathbf{F}} = \Big\langle \big\langle \big\langle [\mathbf{G} - \bar{\bar{\bar{\mathbf{G}}}}][\mathbf{G} - \bar{\bar{\bar{\mathbf{G}}}}]^t \big\rangle_{\mathbf{G} \mid \mathbf{P},\mathbf{F}} \big\rangle_{\mathbf{P} \mid \mathbf{F}} \Big\rangle_{\mathbf{F}}.
(4.1)

To be explicit, KG is an MJ × MJ matrix with components given by

[\mathbf{K}_{\mathbf{G}}]_{mm'}^{(j,j')} = \Big\langle \big\langle \big\langle \big[g_m^{(j)} - \bar{\bar{\bar{g}}}_m^{(j)}\big]\big[g_{m'}^{(j')} - \bar{\bar{\bar{g}}}_{m'}^{(j')}\big] \big\rangle_{\mathbf{G} \mid \mathbf{P},\mathbf{F}} \big\rangle_{\mathbf{P} \mid \mathbf{F}} \Big\rangle_{\mathbf{F}}.
(4.2)

Adding and subtracting terms in each factor of (4.2), we obtain

\mathbf{K}_{\mathbf{G}} = \Big\langle \big\langle \big\langle \big[\mathbf{G} - \bar{\mathbf{G}} + \bar{\mathbf{G}} - \bar{\bar{\mathbf{G}}} + \bar{\bar{\mathbf{G}}} - \bar{\bar{\bar{\mathbf{G}}}}\big]\big[\mathbf{G} - \bar{\mathbf{G}} + \bar{\mathbf{G}} - \bar{\bar{\mathbf{G}}} + \bar{\bar{\mathbf{G}}} - \bar{\bar{\bar{\mathbf{G}}}}\big]^t \big\rangle_{\mathbf{G} \mid \mathbf{P},\mathbf{F}} \big\rangle_{\mathbf{P} \mid \mathbf{F}} \Big\rangle_{\mathbf{F}}.
(4.3)

Even without any assumptions of independence, the cross terms vanish identically, and we can write

\mathbf{K}_{\mathbf{G}} = \bar{\bar{\mathbf{K}}}_{\mathbf{G}}^{\mathrm{noise}} + \bar{\mathbf{K}}_{\bar{\mathbf{G}}}^{\mathrm{PSF}} + \mathbf{K}_{\bar{\bar{\mathbf{G}}}}^{\mathrm{obj}},
(4.4)

where

\bar{\bar{\mathbf{K}}}_{\mathbf{G}}^{\mathrm{noise}} \equiv \Big\langle \big\langle \big\langle [\mathbf{G} - \bar{\mathbf{G}}][\mathbf{G} - \bar{\mathbf{G}}]^t \big\rangle_{\mathbf{G} \mid \mathbf{P},\mathbf{F}} \big\rangle_{\mathbf{P} \mid \mathbf{F}} \Big\rangle_{\mathbf{F}};
(4.5)
\bar{\mathbf{K}}_{\bar{\mathbf{G}}}^{\mathrm{PSF}} \equiv \Big\langle \big\langle [\bar{\mathbf{G}} - \bar{\bar{\mathbf{G}}}][\bar{\mathbf{G}} - \bar{\bar{\mathbf{G}}}]^t \big\rangle_{\mathbf{P} \mid \mathbf{F}} \Big\rangle_{\mathbf{F}};
(4.6)
\mathbf{K}_{\bar{\bar{\mathbf{G}}}}^{\mathrm{obj}} \equiv \big\langle [\bar{\bar{\mathbf{G}}} - \bar{\bar{\bar{\mathbf{G}}}}][\bar{\bar{\mathbf{G}}} - \bar{\bar{\bar{\mathbf{G}}}}]^t \big\rangle_{\mathbf{F}}.
(4.7)

Thus the overall covariance matrix for a triply stochastic image sequence can be rigorously decomposed into three terms representing, respectively, the contributions from measurement noise, from the random PSF and from randomness in the object being imaged.

The noise term in the covariance is almost always diagonal: measurements at different detector pixels and/or different frames are uncorrelated. The other two terms, however, exhibit significant spatial and temporal correlations; approaches to estimating them by simulation are discussed in ref. 2.
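The sketch below indicates one way such a nested simulation can be organized, assuming Poisson measurement noise and user-supplied samplers for the object and the PSF sequence. The toy samplers at the end are purely illustrative and are not the models of ref. 2.

```python
import numpy as np

def covariance_decomposition(sample_object, sample_psf_seq, mean_counts,
                             n_obj=50, n_psf=50):
    """Nested Monte Carlo estimate of the three terms in eq. (4.4).
    sample_object()    -> one object realization F
    sample_psf_seq(F)  -> one PSF sequence P (which may depend on F)
    mean_counts(F, P)  -> noise-free mean data, flattened to a length-MJ vector
    Poisson noise is assumed, so the noise term is diagonal with the overall mean
    counts on the diagonal (a simplifying assumption made for this sketch)."""
    psf_samples, obj_means = [], []
    for _ in range(n_obj):
        F = sample_object()
        means_P = np.array([mean_counts(F, sample_psf_seq(F)) for _ in range(n_psf)])
        psf_samples.append(means_P)                # samples of Gbar(F, P) for this object
        obj_means.append(means_P.mean(axis=0))     # double-bar average, conditional on F
    psf_samples = np.concatenate(psf_samples)
    obj_means = np.array(obj_means)

    K_noise = np.diag(obj_means.mean(axis=0))                     # eq. (4.5), Poisson case
    resid = psf_samples - np.repeat(obj_means, n_psf, axis=0)     # Gbar - Gbarbar
    K_psf = resid.T @ resid / (psf_samples.shape[0] - 1)          # eq. (4.6)
    K_obj = np.cov(obj_means, rowvar=False)                       # eq. (4.7)
    return K_noise, K_psf, K_obj

# Toy usage (illustrative model only): a random uniform background level, with a
# multiplicative per-measurement fluctuation standing in for the PSF sequence.
rng = np.random.default_rng(5)
MJ = 25
Kn, Kp, Ko = covariance_decomposition(
    sample_object=lambda: rng.gamma(20.0),
    sample_psf_seq=lambda F: rng.uniform(0.8, 1.2, size=MJ),
    mean_counts=lambda F, P: F * P,
    n_obj=40, n_psf=40)
print(Kn.shape, Kp.shape, Ko.shape)
```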

4.2. Probability density functions

Multivariate normal random processes are fully specified by their mean vector and covariance matrices, but normal statistics cannot accurately describe incoherent images since irradiance cannot go negative. Thus, if we wish to compute the performance of an ideal Bayesian observer on a detection task, we need more information than just the mean and covariance. This section briefly discusses some approaches to obtaining relevant probability density functions.

4.2.1. Background models

The simplest model for a random sky background is that it is spatially constant over the field of view and does not vary with time over the observation period, but that the absolute level of the background is not known a priori. In that case, the background is described by a single scalar random variable fb. The ideal observer needs information on the PDF pr(fb), and the Hotelling observer needs its mean and variance.

The next level of sophistication is to allow the spatially uniform background level to vary with time, denoting it fb(t). In many cases, this random process might be treated as temporally stationary. Its point statistics could be established experimentally for a particular observing situation.

Various background models with random spatial variation are used in medical imaging, and many of these yield images similar to astronomical scenes. In many cases, the models are not only realistic but also mathematically tractable; analytic expressions can be given for the characteristic functional, which in principle contains all statistical information about the random process. For a review of this topic, see Barrett and Myers.1

One particular spatial model that should accurately describe globular clusters and other aggregations of point-like objects is a nonstationary Poisson point process.2, 3

4.2.2. PSF models

How one models a random PSF depends on the exposure time T relative to the speckle correlation time of the PSF, denoted τc, and on how many images are included in the data set.

If the data consist of a single very long exposure, the PSF may not be known accurately, but it is fixed for that exposure and does not have to be treated as random. It can be estimated from measured data10 or calculated from knowledge of the atmosphere, telescope and AO system, and this estimated PSF can then be used for performing the tasks of interest.

If the whole observing period is divided into shorter exposures but each individual frame is still long compared to the speckle correlation time, then the PSF for one frame can be considered as random but with a form parameterized by the local atmospheric conditions during that exposure. It may even be possible to specify the form of the PSF adequately by knowing the average Fried parameter r0 for that frame. In that case the sequence P is temporally correlated but fully specified by the instantaneous Fried parameter for each frame. The problem of modeling the PSF boils down to understanding the statistics of the scalar random process r0(t).

The more interesting problem is when the data G consist of many short exposures, each with T < τc, so that P is a full spatiotemporal random process. More details on this case are given in Sec. 5.

5. EXAMPLE: DETECTION OF EXOPLANETS

One way to formulate an interesting planet-detection problem is to assume that the astronomical scene of interest consists of a uniform nonrandom background, a single star, and possibly a single planet, and that the data are a sequence of short-exposure images. The randomness in the data then comes from measurement noise, the residual speckle in the PSF and the random size and location of the planet. The problem can be classified as SKS/BKE/PKS in the language of Sec. 3.2. We shall use this problem to illustrate the statistical approaches described above.

Numerous authors have analyzed the deterministic and statistical properties of residual speckle in AO systems.11–14 None of these authors, however, considers either the full multivariate PDF needed by the ideal observer or even the spatiotemporal covariance needed by the Hotelling observer.

5.1. Pinned speckle

For simplicity, assume that the star is located at r = 0, and assume that the light is narrowband with mean wavelength λ. If the instantaneous pupil function, after the AO correction, is given by a_ap(r) exp[iφ(r, t)], where a_ap(r) is the aperture function of the telescope and φ(r, t) is the residual uncorrected phase, then the complex scalar field at point r_d in the image (or detector) plane is given, within the Fresnel approximation, by

u(\mathbf{r}_d, t) \propto \int_\infty d^2r\, a_{\mathrm{ap}}(\mathbf{r})\, \exp[i\varphi(\mathbf{r}, t)]\, \exp\!\left(-\frac{2\pi i\, \mathbf{r} \cdot \mathbf{r}_d}{\lambda f}\right).
(5.1)

By expanding exp[iφ(r, t)] in a Taylor series, taking the squared modulus of the field, and retaining terms linear or quadratic in the residual phase, we can express the random irradiance in the image plane approximately as13

I(\mathbf{r}_d, t) \approx A_{\mathrm{ap}}(\tilde{\mathbf{r}})^2 - 2 A_{\mathrm{ap}}(\tilde{\mathbf{r}})\, \mathrm{Im}\big\{[A_{\mathrm{ap}} * \Phi](\tilde{\mathbf{r}}, t)\big\} + \big|[A_{\mathrm{ap}} * \Phi](\tilde{\mathbf{r}}, t)\big|^2 - A_{\mathrm{ap}}(\tilde{\mathbf{r}})\, \mathrm{Re}\big\{[A_{\mathrm{ap}} * \Phi * \Phi](\tilde{\mathbf{r}}, t)\big\},
(5.2)

where A_ap(ρ) and Φ(ρ, t) are the 2D spatial Fourier transforms of a_ap(r) and φ(r, t), respectively, * denotes convolution, and r̃ ≡ rd/(λf). Since there is only a single on-axis point source in the field, this expression is the random PSF, p(rd, r, t), that appears in (2.7), but evaluated at r = 0.

The leading term in (5.2) is the ideal (nonrandom) Airy pattern for no aberrations (which is a real function for a symmetric aperture), and the other terms, corresponding to residual speckle, are spatiotemporal random processes. The second and fourth terms are referred to as pinned speckle since they are modulated by the Airy pattern; the third term is not pinned since it is convolved with the Airy pattern but not multiplied by it. For later reference, we label the terms in (5.2), in sequence, as I1, I2, I3 and I4. Thus I = I1 + I2 + I3 + I4.
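The decomposition (5.2) is easy to reproduce numerically. The sketch below builds the four terms I1 through I4 from a circular aperture and a toy smoothed-noise residual phase (illustrative assumptions, not a Kolmogorov or AO-residual model) and compares the expansion with the exact short-exposure irradiance.

```python
import numpy as np

rng = np.random.default_rng(6)
N = 128
x = np.arange(N) - N // 2
X, Y = np.meshgrid(x, x)

# Circular, unobscured telescope aperture (illustrative).
a_ap = ((X**2 + Y**2) <= (N // 4) ** 2).astype(float)

# Toy residual phase screen: low-pass-filtered white noise with small rms (radians).
phi = rng.normal(size=(N, N))
lowpass = np.exp(-400.0 * (np.fft.fftfreq(N)[:, None]**2 + np.fft.fftfreq(N)[None, :]**2))
phi = np.real(np.fft.ifft2(np.fft.fft2(phi) * lowpass))
phi *= 0.2 / phi[a_ap > 0].std()

def ft(field):
    """Centered 2D Fourier transform (pupil field -> focal-plane field)."""
    return np.fft.fftshift(np.fft.fft2(np.fft.ifftshift(field)))

A    = ft(a_ap)          # ideal focal-plane field; real for a symmetric aperture (to numerics)
APhi = ft(a_ap * phi)    # corresponds to [A_ap * Phi] in eq. (5.2)
APh2 = ft(a_ap * phi**2) # corresponds to [A_ap * Phi * Phi]

I1 = np.real(A) ** 2                       # Airy pattern
I2 = -2.0 * np.real(A) * np.imag(APhi)     # pinned speckle, linear in the residual phase
I3 = np.abs(APhi) ** 2                     # speckle halo, not pinned
I4 = -np.real(A) * np.real(APh2)           # pinned speckle, quadratic in the residual phase
I_approx = I1 + I2 + I3 + I4

I_exact = np.abs(ft(a_ap * np.exp(1j * phi))) ** 2
print("peak irradiance (exact, expansion):", I_exact.max(), I_approx.max())
```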

5.2. Potentially useful assumptions

Next we look at three assumptions that will simplify computation of the spatiotemporal covariance and PDF of the data. All three derive from properties of uncorrected Kolmogorov turbulence, but it is argued that the same properties hold to a reasonable approximation after AO correction.

5.2.1. Spatial stationarity in the pupil

According to Kolmogorov theory, the uncorrected pupil phase fluctuations (before correction by the deformable mirror and truncation by the aperture) are well described as a wide-sense stationary spatial random process, but it is not so obvious that the wave emerging from the deformable mirror is spatially stationary. The fixed locations of the actuators on the mirror argue against any claim that the spatial autocorrelation function of the field or the phase in the pupil depends only on the separation of the two observation points. A well-functioning AO system, however, will accurately correct all phase fluctuations within a subspace spanned by the deformable-mirror modes. If the original random process is stationary and the component of the atmospheric phase in the mirror subspace is uncorrelated with the component in its orthogonal complement, then the corrected phase is stationary as well.

5.2.2. Frozen flow

The spatiotemporal statistics can be reduced to a purely spatial problem if we make the familiar Taylor frozen-flow assumption, under which φ(r, t) = φ0(r − vt) and

\Phi(\boldsymbol{\rho}, t) = \Phi_0(\boldsymbol{\rho})\, \exp(2\pi i\, \boldsymbol{\rho} \cdot \mathbf{v} t),
(5.3)

where v is the wind velocity. Normally the frozen-flow hypothesis is applied to uncorrected phase aberrations, but if the AO system responds rapidly and nearly nulls out the components in the subspace spanned by the mirror modes, the hypothesis should apply to the residual phase after correction as well.

5.2.3. Circular Gaussian

Another useful assumption is that Φ(r̃, t) is a circular Gaussian spatial random process for a fixed time t. One justification is similar to the argument used above: the uncorrected phase is Gaussian in Kolmogorov turbulence theory, and a good AO system projects the Kolmogorov phase onto the orthogonal complement of the subspace spanned by the mirror modes. That projection is a linear operator, and any linear transformation of a Gaussian is a Gaussian. Moreover, we can argue that the real and imaginary parts of Φ(r̃, t) are identically distributed since a small shift of the uncompensated part of the pupil phase will interchange real and imaginary parts.

5.3. PSF term in the covariance matrix

Extensive simulations are underway to understand fully the PSF term in the covariance for the special case of pinned speckle and short-exposure images. Numerical results will be reported separately, but in this section we examine the problem analytically with the help of the assumptions discussed above. Particular attention will be paid to the effect of the linear term, I2, in (5.2).

5.3.1. Autocovariance of the irradiance

The spatiotemporal autocovariance function of the irradiance produced by a single point object on the optical axis is defined by

k_I(\mathbf{r}_d, \mathbf{r}_d', t, t') \equiv \big\langle I(\mathbf{r}_d, t)\, I(\mathbf{r}_d', t') \big\rangle - \big\langle I(\mathbf{r}_d, t) \big\rangle \big\langle I(\mathbf{r}_d', t') \big\rangle.
(5.4)

Decomposing I(rd, t) as in (5.2), using the fact that I1 is nonrandom, and assuming that all odd moments of Φ are zero, we obtain

k_I = k_{22} + k_{33} + k_{44} + k_{34} + k_{43},
(5.5)

where

k_{mn} \equiv \big\langle I_m(\mathbf{r}_d, t)\, I_n(\mathbf{r}_d', t') \big\rangle - \big\langle I_m(\mathbf{r}_d, t) \big\rangle \big\langle I_n(\mathbf{r}_d', t') \big\rangle.
(5.6)

All of these terms can be evaluated analytically if we are willing to make some of the assumptions introduced in Sec. 5.2. We illustrate this point by considering specifically k22, given in the scaled coordinates (r̃ = rd/(λf)) by

k_{22}(\tilde{\mathbf{r}}, t, \tilde{\mathbf{r}}', t') = 4\, A_{\mathrm{ap}}(\tilde{\mathbf{r}})\, A_{\mathrm{ap}}(\tilde{\mathbf{r}}')\, \big\langle \mathrm{Im}\{[A_{\mathrm{ap}} * \Phi](\tilde{\mathbf{r}}, t)\}\, \mathrm{Im}\{[A_{\mathrm{ap}} * \Phi](\tilde{\mathbf{r}}', t')\} \big\rangle.
(5.7)

In the frozen-flow approximation, the essence of the calculation is to compute expectations of the form,

\big\langle [A_{\mathrm{ap}} * \Phi](\tilde{\mathbf{r}}, t)\, [A_{\mathrm{ap}} * \Phi]^*(\tilde{\mathbf{r}}', t') \big\rangle = \int d^2\rho \int d^2\rho'\, A_{\mathrm{ap}}(\tilde{\mathbf{r}} - \boldsymbol{\rho})\, A_{\mathrm{ap}}(\tilde{\mathbf{r}}' - \boldsymbol{\rho}')\, \exp\big[2\pi i (\boldsymbol{\rho} \cdot \mathbf{v} t - \boldsymbol{\rho}' \cdot \mathbf{v} t')\big]\, \big\langle \Phi_0(\boldsymbol{\rho})\, \Phi_0^*(\boldsymbol{\rho}') \big\rangle.
(5.8)

If we also assume that φ0(r) is wide-sense stationary, it can be shown that

\big\langle \Phi_0(\boldsymbol{\rho})\, \Phi_0^*(\boldsymbol{\rho}') \big\rangle = S_{\varphi_0}(\boldsymbol{\rho})\, \delta(\boldsymbol{\rho} - \boldsymbol{\rho}'),
(5.9)

where Sφ0(ρ) is the power spectral density of φ0(r). Then (5.8) becomes

\big\langle [A_{\mathrm{ap}} * \Phi](\tilde{\mathbf{r}}, t)\, [A_{\mathrm{ap}} * \Phi]^*(\tilde{\mathbf{r}}', t') \big\rangle = \int d^2\rho\, A_{\mathrm{ap}}(\tilde{\mathbf{r}} - \boldsymbol{\rho})\, A_{\mathrm{ap}}(\tilde{\mathbf{r}}' - \boldsymbol{\rho})\, \exp\big[2\pi i\, \boldsymbol{\rho} \cdot \mathbf{v}(t - t')\big]\, S_{\varphi_0}(\boldsymbol{\rho}).
(5.10)

The complex conjugate of (5.10) also appears in k22, as do two terms involving ⟨Φ0(ρ)Φ0(ρ′)⟩ and its conjugate. We can still use (5.9) for these latter expectations because Φ0(−ρ) = Φ0*(ρ), and the final answer is

k_{22}(\tilde{\mathbf{r}}, t, \tilde{\mathbf{r}}', t') = A_{\mathrm{ap}}(\tilde{\mathbf{r}})\, A_{\mathrm{ap}}(\tilde{\mathbf{r}}') \left\{ \int d^2\rho\, A_{\mathrm{ap}}(\tilde{\mathbf{r}} - \boldsymbol{\rho})\, A_{\mathrm{ap}}(\tilde{\mathbf{r}}' - \boldsymbol{\rho}) \cos\big[2\pi \boldsymbol{\rho} \cdot \mathbf{v}(t - t')\big]\, S_{\varphi_0}(\boldsymbol{\rho}) - \int d^2\rho\, A_{\mathrm{ap}}(\tilde{\mathbf{r}} - \boldsymbol{\rho})\, A_{\mathrm{ap}}(\tilde{\mathbf{r}}' + \boldsymbol{\rho}) \cos\big[2\pi \boldsymbol{\rho} \cdot \mathbf{v}(t - t')\big]\, S_{\varphi_0}(\boldsymbol{\rho}) \right\}.
(5.11)

This result predicts a stationary time dependence for k22, and it also predicts a strong positive correlation for t = t′ and r̃′ near r̃ and a strong negative correlation for r̃′ near −r̃. Simulation studies to check these predictions are in progress.
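For readers who wish to explore (5.11) numerically before the full simulations are available, the sketch below evaluates the two integrals by direct quadrature on a grid. The Gaussian stand-in for the aperture transform and the Kolmogorov-like power spectrum are illustrative assumptions only; they are not the actual telescope or turbulence models.

```python
import numpy as np

def k22(r1, r2, dt, A, S, v, grid):
    """Direct numerical quadrature of eq. (5.11) on a square rho-grid.
    A(r): aperture transform (a real, radially symmetric stand-in here).
    S(rho): power spectral density of the residual pupil phase.
    v: wind velocity; r1, r2: scaled image-plane points; dt = t - t'."""
    rho, cell = grid                                    # grid points (n, 2) and cell area
    cosine = np.cos(2.0 * np.pi * (rho @ v) * dt)
    term1 = np.sum(A(r1 - rho) * A(r2 - rho) * cosine * S(rho)) * cell
    term2 = np.sum(A(r1 - rho) * A(r2 + rho) * cosine * S(rho)) * cell
    return A(r1) * A(r2) * (term1 - term2)

A = lambda r: np.exp(-np.sum(r**2, axis=-1) / 2.0)                      # Gaussian "Airy core"
S = lambda rho: 1.0 / (np.sum(rho**2, axis=-1) + 0.05) ** (11.0 / 6.0)  # Kolmogorov-like PSD

n, L = 101, 8.0
ax = np.linspace(-L / 2, L / 2, n)
RX, RY = np.meshgrid(ax, ax)
rho = np.stack([RX.ravel(), RY.ravel()], axis=1)
grid = (rho, (ax[1] - ax[0]) ** 2)

r = np.array([1.5, 0.0])
v = np.array([0.5, 0.0])
for dt in (0.0, 0.5, 1.0):
    print(dt,
          k22(r,  r, dt, A, S, v, grid),    # strong positive correlation near r' = r
          k22(r, -r, dt, A, S, v, grid))    # strong negative correlation near r' = -r
```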

The quadratic terms in (5.2) are more complicated, but their means and covariances are calculable from properties of circular Gaussians if we are willing to add that assumption.

5.4. Complete covariance matrix

To go from the autocovariance function of the PSF, kI(rd, r′d, t, t′), to the PSF term in the covariance matrix, we must integrate over frames and pixels as in (2.10) and (2.12).

In principle, we should also include an object variability term, but we have assumed in this section that the background and the host star are nonrandom, so there is no object variability under the planet-absent hypothesis, H0. There is variability in planet location and brightness under H1, but these factors come into the mean signal, not the average covariance, if the star is much brighter than its companion. Moreover, with the scanning Hotelling observer, the mean and covariance are conditioned on a particular location.

We do need to include a covariance term accounting for measurement noise. If the noise is statistically independent (conditional on the object and PSF) from pixel to pixel and frame to frame, we can write

\big[\bar{\bar{\mathbf{K}}}_{\mathbf{G}}^{\mathrm{noise}}\big]_{mm'}^{(j,j')} = \sigma_{mj}^2\, \delta_{mm'}\, \delta_{jj'}.
(5.12)

If both readout and Poisson noise are present, then

\sigma_{mj}^2 = \sigma_m^2 + \bar{\bar{g}}_m^{(j)},
(5.13)

where σm² is the electronic noise variance for the mth detector pixel in photoelectron units, and the PSF-averaged mean count appearing in (5.13) is obtained in the present problem by averaging (5.2) over random PSFs and normalizing to photoelectron units.

Thus, with only minimal numerical calculation, we will have access to the full covariance matrix needed to implement the Hotelling or scanning Hotelling observer.
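A minimal sketch of assembling the diagonal noise term of (5.12) and (5.13) follows; the frame count, pixel count and noise levels are placeholders, and the PSF (and, if needed, object) terms would be added to this matrix to form the full covariance.

```python
import numpy as np

def noise_covariance(gbarbar, sigma_read):
    """Diagonal measurement-noise covariance of eqs. (5.12)-(5.13).
    gbarbar: PSF-averaged mean counts, shape (J, M), in photoelectron units.
    sigma_read: read-noise standard deviation per pixel, shape (M,)."""
    var = sigma_read[None, :] ** 2 + gbarbar      # sigma_mj^2 = sigma_m^2 + mean counts
    return np.diag(var.ravel())                   # MJ x MJ, diagonal

# Placeholder numbers: J frames, M pixels, uniform mean counts and read noise.
J, M = 4, 9
gbarbar = np.full((J, M), 100.0)
K_noise = noise_covariance(gbarbar, sigma_read=np.full(M, 5.0))
print(K_noise.shape)
```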

5.5. Probability density functions

If we wish to go beyond the Hotelling observer to the ideal observer, we need PDFs for the data, not just means and covariances. We have already suggested that it might be valid to take Φ as circular Gaussian; in that case the I2 term in (5.2) is fully characterized, as is its contribution to the PDF of the measured data. It is still necessary, however, to validate the circular-Gaussian assumption with detailed simulations.

The statistics of the I3 and I4 terms are not so easy to determine, even under a circular Gaussian model, because they involve products of spatiotemporal random processes. Further theoretical investigation is needed if these terms are significant.

The probability law for the measurement noise (again conditional on an object and PSF) is easy to state in the limit of pure Poisson noise or pure Gaussian readout noise.

6. SUMMARY AND CONCLUSIONS

The main point stressed in this paper is that task performance is the ultimate measure of image quality. Astronomical images are produced either for estimation tasks, such as photometry, or for classification tasks such as planet detection. Both kinds of task were analyzed previously for AO systems in ref. 2, but the analysis was restricted to linear observers, namely the Hotelling observer for binary detection tasks and the generalized Wiener estimator for estimation tasks.

In this paper we applied the linear analysis of ref. 2 specifically to detection of exoplanets, and we extended the discussion of detection tasks to nonlinear ideal observers that maximize the area under the ROC curve. We also briefly discussed the LROC curve for joint detection and estimation tasks.

As in ref. 2, we noted that task performance in AO systems is limited by three main sources of data randomness: measurement noise, object randomness and residual PSF randomness. For short-exposure images, the data from an AO system is a discrete spatiotemporal random process. Calculation of achievable performance by any observer depends on accurate multivariate statistical knowledge. For linear observers, mean vectors and spatiotemporal covariance matrices are needed, and for the ideal observer full multivariate PDFs are required. Univariate statistics say very little about task performance. In particular, pixel signal-to-noise ratio is almost unrelated to detectability.

For the specific problem of exoplanet detection from a sequence of short-exposure images, we found that the relevant covariance matrices could be expressed in a numerically tractable form with the help of two plausible assumptions, namely frozen flow and spatial stationarity of the phase in the pupil. The range of validity of these assumptions remains to be explored.

As a result of the investigations reported here, we now have the ingredients needed to implement the Hotelling and scanning Hotelling observer for exoplanet detection. This future work should lead to improved detectability, but perhaps more importantly it will provide a firm basis for optimization of AO systems for this task. In particular, it will provide a direct comparison of the achievable detectability with multiple short-exposure images as compared to a single long exposure.

Implementation of the ideal observer is further in the future, but its realization should be facilitated by the discussion in this paper.

Acknowledgments

This research was supported by Science Foundation Ireland under grant no. 01/PI.2/B039C and by an SFI Walton Fellowship (03/W3/M420) for H. H. Barrett. Development of the basic methodology for objective assessment of image quality was also supported in part by the National Institutes of Health under grant nos. R37 EB000803 and P41 EB002035.

Contributor Information

Harrison H. Barrett, College of Optical Sciences and Department of Radiology, University of Arizona, Tucson AZ 85724.

Kyle J. Myers, NIBIB/CDRH Laboratory for the Assessment of Medical Imaging Systems, Food and Drug Administration, Rockville, MD 20850.

Nicholas Devaney, Department of Physics, National University of Ireland, Galway, Ireland.

J. C. Dainty, Department of Physics, National University of Ireland, Galway, Ireland.

Luca Caucci, Department of Electrical and Computer Engineering, University of Arizona, Tucson AZ 85721.

References

1. Barrett HH, Myers KJ. Foundations of Image Science. John Wiley and Sons; New York: 2004.
2. Barrett HH, Myers KJ, Devaney N, Dainty JC. Objective assessment of image quality: IV. Application to adaptive optics. Submitted to J Opt Soc Am A; 2006.
3. Hobson MP, McLachlan C. A Bayesian approach to discrete object detection in astronomical data sets. Mon Not R Astron Soc. 2003;338:765–784.
4. Braems I, Kasdin NJ. Bayesian hypothesis testing for planet finding. 203rd Meeting, American Astronomical Society; 2004.
5. Barrett HH. Objective assessment of image quality: effects of quantum noise and object variability. J Opt Soc Am A. 1990;7:1266–1278.
6. Smith WE, Barrett HH. Hotelling trace criterion as a figure of merit for the optimization of imaging systems. J Opt Soc Am A. 1986;3:717–725.
7. Fiete RD, Barrett HH, Smith WE, Myers KJ. The Hotelling trace criterion and its correlation with human observer performance. J Opt Soc Am A. 1987;4:945–953.
8. Hotelling H. The generalization of Student's ratio. Ann Math Stat. 1931;2:360–378.
9. Khurd P, Gindi G. Decision strategies that maximize the area under the LROC curve. IEEE Trans Med Im. 2005;24:1626–1636.
10. Veran JP, Rigaut F, Maitre H, Rouan D. Estimation of the adaptive optics long-exposure point-spread function using control loop data. J Opt Soc Am A. 1997;14:3057–3068.
11. Aime C, Soummer R. The usefulness and limits of coronagraphy in the presence of pinned speckles. Astrophysical J. 2004;612:L85–L88.
12. Bloemhof EE. Anomalous intensity of pinned speckles at high adaptive correction. Opt Lett. 2004;29:159–161.
13. Perrin MD, Sivaramakrishnan A, Makidon RB, Graham JR. The structure of high Strehl ratio point-spread functions. Astrophysical J. 2003;596:702–712.
14. Sivaramakrishnan A, Lloyd JP, Hodge PE, Macintosh BA. Speckle decorrelations and dynamic range in speckle noise-limited imaging. Astrophysical J. 2002;581:L59–L62.