|Home | About | Journals | Submit | Contact Us | Français|
Reported here is a phantom-based comparison of methods for determining the power spectral density of ultrasound backscattered signals. Those power spectral density values are then used to estimate parameters describing α(f), the frequency dependence of the acoustic attenuation coefficient. Phantoms were scanned with a clinical system equipped with a research interface to obtain radiofrequency echo data. Attenuation, modeled as a power law α(f)=α0fβ, was estimated using a reference phantom method. The power spectral density as estimated using the short-time Fourier transform (STFT), Welch's periodogram, and Thomson's multitaper technique, and performance was analyzed when limiting the size of the parameter estimation region. Errors were quantified by the bias and standard deviation of the α0 and β estimates, and by the overall power-law fit error. For parameter estimation regions larger than ~34 pulse lengths (~1cm for this experiment), an overall power-law fit error of 4% was achieved with all spectral estimation methods. With smaller parameter estimation regions as in parametric image formation, the bias and standard deviation of the α0 and β estimates depended on the size of the parameter estimation region. Here the multitaper method reduced the standard deviation of the α0 and β estimates compared to those using the other techniques. Results provide guidance for choosing methods for estimating the power spectral density in quantitative ultrasound.
Quantitative Ultrasound (QUS) methods attempt to characterize tissue microstructure to aid in disease diagnosis and treatment monitoring.1,2 Two QUS parameters with demonstrated potential are the frequency dependent attenuation coefficient, α(f) and the backscatter coefficient, σb(f).3,4 The feasibility of estimating α(f) and σb(f) and their use as diagnostic tools have been reported for pathologies in the liver, kidney, breast, and thyroid.1,5-10
Most methods for estimating α(f) and σb(f) rely on a spectral analysis of radiofrequency (RF) echo signals by computing their power spectral density.11-21 Estimations of α(f) and σb(f) are based on the assumption that the power spectral density quantifies the expected energy resulting from the incoherent superposition of backscattered waves from scatterers randomly distributed within the acoustic pulse volume. The power spectral density is usually computed as the average of a set of periodograms from gated segments of RF echo signals from adjacent acoustic scanlines within an estimation window. Averaging reduces “coherent noise” components in the power spectral density estimate, resulting from the statistical nature of the backscatter process.22-27 For data from a clinical scanner, the size of the power spectral density estimation window is defined axially by the duration of the gate and laterally by the number of scanlines included in the estimation.
Each periodogram is the squared magnitude of the Fourier transform of the gated scanline signal.22,23 Other methods for estimating the power spectral density based on the periodogram include Welch's method and Thomson's multitaper method. In Welch's method, the periodogram resulting from one RF echo-signal segment is the average of the periodograms of shorter and overlapping subsections of the original segment.28,29 This method was recently suggested as optimum for σb(f) estimations.26 The multitaper method has been shown to offer a better tradeoff among bias, variance, and frequency resolution of the power spectral density estimate than Welch's method and has been suggested as useful for attenuation estimations.26,30
In practice, the frequency response of the gating window is convolved with the true RF signal spectrum. The frequency response of the windowing operation is characterized by a central main lobe bracketed by lower-amplitude side lobes. The main-lobe width, which is inversely proportional to the length of the estimation window, defines the frequency resolution, i.e., the amount of correlation between values of the power spectral density at neighboring frequency points. The side-lobe amplitude defines the bias introduced by spectral leakage. Bias can be reduced by multiplying each RF signal segment within the estimation window by a tapering function such as the Hamming or Hann, but this leads to wider main lobes, and affects the accuracy and precision of power spectral density estimates.
The area defined by the estimation window is illustrated in figure 1. The region of interest (ROI) outlined on the image might include a lesion or a section of it and the surrounding tissue. Within the ROI, a “parameter estimation region” defines the total amount of echo data applied to obtain an estimate of the QUS parameter of interest. This parameter estimation region can include one or more power spectral density estimation windows, depending on the estimated QUS parameter. The number of parameter estimation regions within an ROI depends on the task to be performed. In bulk parameter estimation the ROI includes usually one parameter estimation region. This can be thought of as a “large parameter estimation region” mode aimed at comprehensively gathering values of tissue properties to be used in tissue classification.22 In contrast, in parametric imaging small parameter estimation regions are positioned at different locations within the ROI to display spatial variations of the QUS parameter.22
Algorithms for local estimation of α(f) characterize the variation of a spectral property along the propagation depth. At least three methods are quantitatively meaningful because they compensate for diffraction and system settings. These are the spectral difference method (included as part of the reference phantom method),13 the spectral log difference method, and the hybrid method.19 In general, the reference phantom method requires smaller parameter estimation regions, although still fairly long (some cases suggesting up to 50mm), compared to the other methods.31-36 Although this does not limit bulk-parameter estimation, it is a major limitation in terms of the spatial resolution of parametric images. In this sense, it has been suggested that Thomson's multitaper method may outperform the conventional short-time Fourier transform and Welch's power spectral density estimate in terms of the overall diagnostic performance of parametric images in which the size of the parameter estimation region is severely reduced.26
This work presents a phantom-based analysis of constraints in the size of the parameter estimation region when estimating the attenuation coefficient for two different tasks: either bulk-parameter estimation or parametric image formation. The analysis is based on a power-law model for the frequency dependence of attenuation.37-44 We present a comparison of the performance of two spectral estimation methods commonly applied in QUS when estimating the attenuation coefficient (the short-time Fourier transform and Welch's periodogram) with Thomson's multitaper method. Results quantify the accuracy and precision of the estimates as the amount of data incorporated into the estimation is reduced.
The reference phantom method13 was used to estimate the attenuation coefficient of a sample from RF echo signals obtained from a clinical scanner. The reference phantom method accounts for system-dependent parameters (transducer and pulser-receiver transfer functions) using power spectral density estimates from a reference material (with speed of sound equal to that of the sample), generated at the same depths as estimates from the sample. The sample consisted of a homogeneous tissue-mimicking phantom composed of water-based gel containing graphite powder (50mg/cm3 of agar) to control the attenuation and 3000E glass beads (Potters Industries, Inc., Valley Forge, PA; 5 to 40μm diameter, 4mg/cm3 of agar) to provide scattering.45 The reference phantom was a similar homogeneous material, consisting of an emulsion of 70% safflower oil in gelatin and also containing 3000E glass-bead scatterers (4mg/cm3 of emulsion). Both the sample and the reference phantoms were in acrylic boxes that had 25μm thick Saran wrap® (Dow Chemical, Midland, MI) scanning windows.
During phantom fabrication, 2.5cm-thick test samples were prepared to measure the sound speed and attenuation of the phantom materials. Laboratory estimates of α(f) and the speed of sound of these materials were performed using a narrowband substitution technique46 at frequencies from 2.25 to 10MHz. The attenuation coefficient was modeled as a power of frequency f, using:
Table I shows the resultant α0 and β (and the corresponding R2 of the power-law fit), and the estimated speed of sound c ± one standard deviation for both phantoms. The speed of sound of the sample and the reference materials agreed within 0.4%. This difference is not expected to be a significant source of bias in the attenuation estimates.47 The laboratory-estimated parameters of the power-law fit were used as the expected values to which the α0 and β estimates from scanner derived backscatter signals were compared.
The phantoms were scanned with an 18L6 linear array transducer on a Siemens Acuson S2000 clinical imaging system (Siemens Medical Solutions USA, Inc, Malvern, PA). The transducer was driven by single cycle 9MHz pulses. The system was equipped with the Axius Direct Ultrasound Research Interface48 that allows acquisition of beamformed RF echo signals after time-gain compensation. The transmit focus was placed as shallow as possible (2.5mm) to estimate attenuation in regions distal to the focus, in accordance with previous results.47,49 RF echoes for 20 parallel image planes from the sample and 25 parallel planes from the reference phantom were acquired by sweeping the transducer elevationally between successive planes in steps equal to half the height of the transducer's elements. Each plane scanned provided a 2D RF echo-signal frame consisting of 456 acoustic scanlines spaced at 0.128mm. The RF data were digitized at 40MHz for a total of 2076 samples spanning 39mm axially.
The size of the scattering volume determines the correlation among neighboring RF echo samples. This volume is affected by the pulse bandwidth and the diffraction field. Therefore, the number of uncorrelated axial samples and acoustic scanlines within a spectral estimation window or a parameter estimation region depends on the experimental conditions. To generalize our results, we parameterized the size of the spectral estimation window and the area spanned by the parameter estimation region in terms of the number of uncorrelated axial (time) and lateral (acoustic scanlines) samples of RF echo signals. For this purpose we computed the separation between the two closest axial samples and closest lateral samples considered to be uncorrelated.20,26,50-53 For segments including m axial samples of RF echo signals from N adjacent acoustic scanlines, the lateral correlation coefficient is defined as:26
where Xj,i is the ith (i=1,..,m) axial sample of the segment of the jth (j=1,...,N) acoustic scanline. The upper bar indicates averaging of all the axial samples within the segment of a particular acoustic scanline signal. The computation of the axial correlation coefficient ρA(Δ) was performed similarly, substituting the lateral dimension for the axial dimension in (2). ρA(Δ) and ρL(Δ) were estimated over a region spanning 5mm axially (m=267) and 15mm laterally (N=121). Correlation lengths were defined empirically as the range of spatial lags (samples in the axial direction, acoustic scanlines in the lateral direction) over which the one-sided correlation coefficient (linearly interpolated at 1/1000 of a sample to obtain fractional lags) remained above 0.2.20,53 Under this criterion the axial and lateral correlation lengths within the parameter estimation region (defined below) were 0.188mm and four adjacent acoustic scanlines, respectively. In the axial direction, we define the “pulse echo correlation length” as the full width of the two-sided correlation curve at a 0.2 level, i.e., twice the one-sided axial correlation length.
The parameter estimation region required to obtain one α(f) estimate includes a set of axially-aligned spectral estimation windows, as illustrated in figure 1. The size of the parameter estimation region is defined axially by the number of spectral estimation windows, and laterally by their widths.
It is important to notice the trade-offs in the axial size of a spectral estimation window and the parameter estimation region. Each spectral estimation window should be long enough so that the power spectral density estimate is not dominated by windowing and tapering-function effects while also short enough that nonstationarity effects (diffraction, spatial variations in scattering or attenuation) can be considered negligible. Following this criterion, the maximum spectral estimation window length that was explored in this work was smaller than 10mm. The axial span of the parameter estimation region should be long enough to be sensitive to attenuation-dependent variations in the power spectral density amplitude over depth.
The size of the spectral estimation window was varied both axially and laterally, and the effects of these variations on the accuracy and precision of estimates of the power-law parameters of α(f) were studied. The window length was reduced from 20 pulse-echo correlation lengths down to 2.5 pulse-echo correlation lengths leading to window lengths from 7.5mm down to 0.9mm. The number of uncorrelated acoustic lines included in the spectral estimation window, and therefore in the parameter estimation region, was varied from 22 down to 4, resulting in widths from 10.9mm down to 2.0mm. Fourier transforms were computed after zero-padding the echo signals segments to 1024 points.
For each spectral estimation window size, a periodogram was computed by each of three methods: The Short-time Fourier transform (STFT), Welch's method, and Thomson's multitaper method. The STFT estimate of the periodogram is computed as the squared magnitude of the Fourier transform of the RF echo segment multiplied by a tapering function. Two tapering functions were used: the rectangular and Hann functions.28,29 In Welch's method, the RF echo segment is first divided into NSS overlapped subsegments. The power spectral density estimate from Welch's method is the average of the periodograms of each of the subsegments after tapering.54 The subsegment length and the overlap ratio were 50% of the original segment, and 50% of the subsegment length, values commonly used in QUS.26,29,55-59 The Hann tapering function was used for subsegment tapering. The multitaper method requires defining a value for the time-half bandwidth product NW (described below). Then, a set of tapered signal segments is obtained by multiplying an RF echo-signal segment by each one of the first NDPSS=2NW-1 Discrete Prolate Spheroidal Sequences (DPSS). These tapering functions contain the greatest amount of energy within a frequency band 2W.29,60,61 The squared magnitude of the Fourier transform of each tapered signal segment is referred to as an eigenspectrum, and the final power spectral density estimate is given by the weighted average of the set of NDPSS eigenspectra.29,60 The first three DPSS for 256 time samples and NW=3 are displayed in figure 2.
The rectangular and Hann are common tapering functions used in STFT-based α(f) estimations and offer good frequency resolution.28 Other tapering functions with lower side-lobe levels (Blackman-Harris, Chebyshev, and Kaiser) were not found to improve the accuracy and precision of the estimates over the Hann function. Assuming an underlying white Gaussian noise process, the 50%-overlap, Hann-tapered Welch's method is expected to reduce the variance of the power spectral density estimate by a factor kWeclh= 18NSS2/(19NSS-1).29 Thus, kWelch= 2.89 for our case. Under the same assumptions, the multitaper method would reduce the spectral variance by a factor kMultitaper~NDPSS=5,29,60 although this factor varies with the power spectral density value. When applied to RF echo signals from tissues or phantom materials, noise properties of the RF echo signals differing from white noise, such as colored noise from the pulse-tissue spectrum and coherent noise from the interference of scattered waves, affect the variance reduction performance of these methods.29
The multitaper method allows the user to conveniently choose the time-bandwidth product (NW) for the windowing function, where N is the axial length of the spectral estimation window and W is half the main-lobe bandwidth of the frequency response of the multitaper method, WMultitaper, normalized to the sampling frequency fs (W=WMultitaper/fs). It thus expresses the main-lobe width of the multitaper method as a multiple k of the main-lobe width of the STFT method with a rectangular taper (WRect=1/NΔt), where Δt =1/fs is the sampling period:
As previously mentioned, the variance of the power spectral density estimate can be reduced by using a larger number of tapers, and this number is related to the selected time- half bandwidth product (NDPSS=2NW-1). This comes at the expense of losing frequency resolution.29 Aiming at resolving smooth features in the spectrum while smoothing statistical and coherent noise, the following criterion was used to select NW: In order to resolve the main peak of a unimodal power spectral density having bandwidth BW, the width of the spectral response main lobe of the multitaper method (2WMultitaper) must be no greater than 0.5BW.29 If two peaks whose widths are each BWare separated by a frequency gap of BW, they could be properly resolved. Therefore, our criterion is 2WMultitaper=(BW/2)/3 or WMultitaper=BW/12. BWwas defined as the frequency bandwidth corresponding to the section of the power spectral density estimate of the sample material with values 10dB above the noise floor (from 4.8MHz to 8MHz in the present experiment) measured at the farthest power spectral density estimate from the transducer to consider bandwidth reduction caused by attenuation. Under these conditions, BW =3.2MHz and WMultitaper=0.27MHz. In the longest spectral estimation window, there were 401 samples (10 μs), giving NW~3. The resulting NW value is consistent with common values used to estimate smooth spectra.26,29 Further optimization of the NW value for the reduction of coherent noise is the subject of ongoing studies.
Attenuation estimates were made using power spectral density estimates from the sample and from the reference phantom computed using various window sizes. Three non-overlapped parameter estimation regions were defined side by side in each RF data frame. Each one of these regions included 22 uncorrelated scanlines. Estimates of the power spectral density were computed within each parameter estimation region at 34 axial locations, separated by one pulse-echo correlation length. This axial span accommodated at least four uncorrelated power spectral density estimates (see Discussion). The length and width of the spectral estimation windows centered at these locations, as well as the spectral estimation methods, were varied (see above), and attenuation estimates were obtained for each combination of these parameters. Reference spectra were obtained at the same axial locations as in the sample. To reduce spectral coherence noise in the reference spectra, a total of 4225 estimates from 25 parallel frames were averaged.
The sample-to-reference power spectral density ratio at each spectral estimation window depth was obtained. Assuming negligible differences in attenuation losses over the spectral estimation window length, this is given by:13
σb(f,z) is the backscatter coefficient, PSD(f,z) is a power spectral density estimate, and the subscripts “sample” and “ref” refer to the sample and the reference phantoms, respectively. The factor A(f;z) accounts for differences in attenuation between the transducer and the spectral estimation window center in the sample and in the reference phantom. Notice that the depth z refers to the center of the spectral estimation window; thus, we are accounting for possible effects of attenuation differences between sample and reference phantom within the spectral estimation window by performing a point correction.62 Although more accurate methods have been proposed to account for attenuation within the spectral estimation window, these methods are comparable to point compensation over the window lengths considered in this work.62
Assuming that the backscatter and attenuation coefficients do not vary over the parameter estimation region the logarithm L(f ;z) of the power spectral density ratio can be expressed as:13
A least squares linear fit is applied to L(f ;z) vs. z. The slope of this fit is proportional to the difference between the attenuation coefficients of the sample and the reference phantom at frequency f. The latter is known so the attenuation coefficient for the sample at f is derived. Three estimates of αsample(f) were available per frame, one for each parameter estimation region. Their average was taken, and a power-law least-squares fit (vs. frequency) was applied to produce one estimate of α0 and β per frame [see Eq. (1)].
In addition to the effects of reducing the spectral estimation window size, we also analyzed the accuracy and precision of α0 and β estimates when shortening the distance over which the linear fit of L(f ;z) vs. depth was applied. In figure 1 this distance is referred to as the “α-estimation length”. This was achieved by gradually reducing the number of power spectral density estimates within the parameter estimation region from 34 down to 2.
Two characteristics of the power spectral density estimate that affect the accuracy and precision of QUS parameters are its frequency resolution, and variance. These were quantified in terms of Δf, the spectral correlation bandwidth, and SNRPSD, the signal to noise ratio of the power spectral density estimate. Δf is defined by Lizzi et al.22 as the frequency band over which the spectral correlation coefficient varies by no more than 70%. The power spectral density correlation coefficient CPSD is defined as:
The value of Δf was assessed using the power spectral density estimates at the deepest locations (34th row of the 34×3 array) because they present the most severe bandwidth constraints caused by attenuation. The values of CPSD(f, f+Δf) were averaged across all frequency bins within the analysis bandwidth previously defined (4.8-8.0MHz). The average and standard deviation of Δf were computed from each of the three deepest PSDsample(f) estimates in each of the 20 sample frames.
The value of SNRPSD at frequency f and depth z was computed as
where r refers to each estimate of PSD(f,z) from the 20 sample frames. Brackets indicate averaging across the 20 estimates. For simplicity, we report the values of SNRPSD(f,z) at the frequency of the maximum power spectral density value, i.e., 6.2MHz. The average and standard deviation of SNRPSD (6.2MHz, z) were obtained from all estimates within the parameter estimation region.
The accuracy and the precision of α0 and β estimates as independent QUS parameters were quantified when constraining the spectral estimation window size and the α-estimation length. For this purpose, the normalized bias and standard deviation of α0 and β estimates were computed. The normalized bias (BiasQ) was defined as the expected value of the difference between estimated quantity Q (either α0 and β) and its value, Qexp obtained from laboratory estimates (Table I), normalized with respect to the latter. BiasQ was taken as the inter-frame average of the normalized difference between each frame's Q estimate and Qexp:
where the subindex r refers to each of the NR=20 sample frames. The normalized standard deviation (STDQ) was defined as the square root of the estimated value of the variance of Q, normalized with respect to Qexp. STDQ was computed as the square root of the inter-frame average of the normalized difference between each frame's Q estimate and the inter-frame average (brackets) of Q:
We also analyzed an overall “fit error”, FE, defined as the expected value of the absolute difference between the estimated power-law fit αFit,r(f) and the laboratory-estimated power law fit αexp(f), normalized with respect to the latter at each frequency point, and averaged across all frequency points included in the analysis bandwidth. It was estimated by computing for each frame r:
and then obtaining the average and standard deviation across the 20 sample frames.
Error bars in plots of BiasQ, STDQ, and FE correspond to the standard error of the inter-frame average estimated from the inter-frame standard deviation of each error estimate divided by NR1/2.
Figure 3(a) shows the frequency resolution measured by the spectral correlation bandwidth Δf as a function of the power spectral density estimation window length in pulse-echo (axial) correlation lengths. Each curve corresponds to a different spectral estimation method. Shortening the window increases Δf, as expected. Furthermore, Δf for the short-time Fourier transform (STFT) with a Hann taper, Welch's periodogram, and Thomson's multitaper method was larger than for the rectangular-taper STFT (or no taper) by a factor of 1.5, 2.2, and 2.8, respectively, because of the wider frequency-response main lobe of the latter spectral estimation methods with respect to the former. Thus, the STFT methods lead to better frequency resolution than the Welch and the multitaper methods.
The SNRPSD is shown in figure 3(b) as a function of the number of uncorrelated scanline signals within the spectral estimation window. A guide curve corresponding to the square root of the number of uncorrelated scanlines is also provided. SNRPSD=1 when no spectral averaging is performed and increases with N1/2 when averaging N estimates from uncorrelated RF echo scanlines.63,64 The agreement with the N1/2 curve corroborates our estimation of the lateral correlation length. SNRPSD is also expected to increase by factors kWelch and kMultitaper (defined in the Methods section) when using the Welch and multitaper methods. From figure 3(b), kWelch=2.7 and kMultitaper=4.5. These values are slightly smaller than those predicted for a white Gaussian noise process because of the particular noise properties of the RF echo signals and the use of Thomson's version of the multitaper method.29
Figure 4 shows power spectral density estimates of the sample from all spectral estimation methods using either (a, left) a severely length-constrained but wide spectral-analysis window (2.5 pulse-echo correlation lengths, 22 uncorrelated scanlines), or (b, right) a severely width-constrained spectral estimation window with its length in the middle of the analyzed range (10 pulse-echo correlation lengths, 4 uncorrelated scanlines). In the former case, the Welch and multitaper methods have wider power spectral density estimates than the STFT-based methods. In the latter case, the smooth appearance of the multitaper estimate is due to the ability of this method to reduce coherent noise, as quantified in figure 3(b).
Figure 5 shows αsample(f) estimates plotted versus frequency for an α-estimation length of 33 pulse-echo correlation lengths (12.4mm). Results are divided into two rows (i and ii) and three columns (a-c). The rectangular-taper STFT was used for spectral estimation in the top (i) row, while the multitaper was used in the bottom (ii) row. Columns (a), (b), and (c) correspond to different window sizes: 20 pulse-echo correlation lengths and 22 uncorrelated scanlines (largest window), 10 pulse-echo correlation lengths and 22 uncorrelated scanlines, and 10 pulse-echo correlation lengths and 2 uncorrelated scanlines, respectively. In each subfigure, the dark gray continuous curve is the estimated αsample(f) averaged across 20 frames, the light gray continuous curves are the average ± one standard deviation, and the black curve is the laboratory estimate. The αsample(f) estimates are centered about the laboratory estimates, indicating their accuracy. Furthermore, the standard deviation of the estimated αsample(f) does not depend on frequency, which agrees with theoretical derivations.27,63,64
The reduction of the spectral coherence noise by the multitaper method leads to a less fluctuating frequency dependence of the estimated αsample(f) (figure 5, bottom row), compared to the use of the STFT method (figure 5, top row). The reduction of the spectral estimation window length also reduces the spectral coherence noise, although a slight increase in variance expressed as an increase in the overall separation between the lightly shaded curves (average ± one standard deviation) is observed. This can be attributed to the loss of backscatter signal information as the window length is reduced.29
Figures 6 and and77 show the normalized bias as defined in Eq. (8) [α0 in figure 6(a), β in figure 7(a)] and the normalized standard deviation as defined in Eq. (9) [α0 in figure 6(b), β] in figure 7(b)] as functions of the α-estimation length expressed in pulse-echo correlation lengths. Subfigures (a) and (b) have four components (i-iv), one for each spectral estimation method. Different curves in each component correspond to a different spectral estimation window length. In all cases, the number of uncorrelated scanlines used for spectral estimation was 22. To facilitate image interpretation, error bars for window lengths only of 20 and 7.5 pulse-echo correlation lengths at two different α-estimation length values are included. Figure 8 shows the overall fit error defined in Eq. (10), and subfigures (a)-(d) correspond to different spectral estimators.
In figures 6--8,8, combinations of spectral estimation window lengths and α-estimation lengths shorter than 5 pulse-echo correlation lengths led to errors in α0 and β estimates larger than 200% and hence are excluded from further discussion. The plots can be divided in two regions. For α-estimation lengths longer than ~15 pulse-echo correlation lengths, the standard deviation in α0 estimates (37%), the bias and standard deviation in β estimates (2% and 16%, respectively), and the overall fit error (4%) are comparable among spectral estimation methods and for the different spectral estimation window lengths and α-estimation lengths. The only significant difference was in the bias of α0: the bias from the rectangular-taper STFT method (~3%) was one third of the bias from other methods. When the α-estimation length is shorter than about 15 pulse-echo correlation lengths, the errors increase as the α-estimation length is reduced. The rate of increase is more severe for shorter spectral estimation windows. The bias and standard deviation of α0, the bias of β, and the overall fit error from the multitaper method was smaller than those from other methods by a factor of 0.8 (ratio of 32% bias of β from the multitaper method over 40% bias from the rectangular-taper STFT) or less.
In some cases, an apparent decrease in fit error when using α-estimation blocks shorter than ~5 pulse-echo correlation lengths combined with spectral estimation window lengths shorter than 7.5 pulse-echo correlation lenghts was observed. This might have been caused by the inability to measure a signal decrease over such short data lengths. Under these circumstances, the estimate of the slope in Eq. (5) is effectively zero, and therefore the value assigned to αsample(f) is that of αref(f), which happened to be similar to the expected sample attenuation in this experiment.
Figure 9 depicts (a) the normalized bias and (b) the normalized standard deviation in estimates of (i) α0 and (ii) β as the spectral estimation lateral window size is reduced. In each subfigure, each curve corresponds to a different spectral estimation method. Similarly, figure 10 presents the overall fit error of α(f). The spectral estimation window length was 10 pulse-echo correlation lengths and the α-estimation length was 33 pulse-echo correlation lengths (the minimum total error expected is 4%). The multitaper method produced values of the standard deviation of α0 and β estimates smaller than those from other methods by a factor of 0.88 or less and values of the overall fit error smaller by a factor of up to 0.93.
The problem of constraints in the size of the parameter-estimation region on α(f) estimates was addressed by Parker,27 Yao et al.,63 and Huisman et al.64 These authors show that the variance of the slope of a linear fit of α versus frequency is proportional to the ratio of the square root of the frequency resolution divided by the signal-to-noise ratio of the power spectral density estimate (SNRPSD). Considering this and the results presented in figures 3(a) and 3(b), the multitaper method is expected to offer the best precision for attenuation coefficient estimations from spectral data. However, theoretical expressions for the variance of descriptors of the attenuation coefficient27,63,64 cannot be directly applied to our experiment because they were derived assuming the estimates of the power spectral density are normally distributed random variables. As discussed by Lizzi et al.,22 this assumption does not apply to severely constrained parameter estimation regions like some of those considered here.
Averaging periodograms from adjacent scanlines was used for computing the sample and reference power spectral density.22 When a small number of independent realizations of the periodogram are available, the Welch and multitaper methods notably reduced the contribution of spectral coherence noise. The increase in SNRPSD for the Welch and multitaper methods, with respect to the rectangular-taper STFT method (kWelch and kMultitaper), was close to theoretical predictions for white Gaussian noise. Thus the multitaper method outperformed the other spectral estimation methods analyzed in terms of SNRPSD.
Average estimates of αsample(f) shown in figure 5 were in agreement with laboratory estimates done on test samples of the phantom material. However, it is important to note that an estimate of the parameters of the power-law model applied to αsample(f) is obtained from a single estimate of αsample(f) and not from the inter-frame averaged αsample(f) presented in figure 5. Assuming a Gaussian distribution,63 a single estimate of αsample(f) is expected to lie between the two “average ± one standard deviation” curves in each subfigure of figure 5 with a probability of 68%. Although this assumption does not hold when the number of uncorrelated scanlines used in spectral estimation is severely constrained, it gives an idea of the expected variability of α0 and β estimates.
Figures 6--1010 show that the shortest reliable spectral estimation window length was about 8 pulse-echo correlation lengths. The pulse-echo correlation lengths estimated from intensity (squared RF) data can be considered an estimate of the pulse length.65,66 The intensity pulse-echo correlation length was 1.47 RF pulse-echo correlation lengths in our experiment. Therefore, 8 RF pulse-echo correlation lengths ~ 11 pulse lengths. This is within the range of 7-15 pulse lengths that is frequently presented as the window length offering the best tradeoff between spatial resolution, accuracy and precision when estimating attenuation and effective scatterer size.20,24-26,31,36,64-69 Tapering functions such as the Hann or Hamming, commonly used in QUS,14,15,24,25,36,62,67,70 were found to increase the bias and standard deviation of the attenuation estimates. This indicates a less important role of side-lobe reduction in the accuracy and precision of the estimated parameters than the frequency resolution and SNR of the power spectral density estimate.
The bias and standard deviation of both α0 and β did not change significantly when the α-estimation length was longer than 15 pulse-echo correlation lengths (figures 6--8).8). Furthermore, the choice of a particular spectral estimation method or window length had little effect on the attenuation estimate error when the estimation length was longer than about 15 pulse-echo correlation lengths, consistent with findings of Huisman et al.,64 and Labyed et al.33 This is explained by the number of axially uncorrelated power spectral density estimates within the 12.4mm span that the parameter estimation region covered. To determine this number, we computed the axial autocorrelation function among values at the same frequency of the 34 axial estimates of the power spectral density, and applied Eq. (23) in Ref. 63. This resulted in 4.3 independent power spectral density samples within 12.4mm (~33 pulse lengths). Therefore, the span required to have at least 2 uncorrelated axial estimates of the power spectral density would be 15.4 pulse-echo correlation lengths. This indicates that the selection of the spectral estimation method and window length is not of primary importance when estimating α0 and β if the parameter estimation region contains at least 2 axially uncorrelated power spectral density estimates. For our experimental condition, and considering the smallest reliable window length (~8 pulse-echo correlation lengths), the total data length would span about 23 pulse-echo correlation lengths or ~34 pulse lengths or 9mm.
When constraining the lateral dimension of the spectral estimation window (figures 9 and and10),10), our results show that the standard deviation of the α0 and β estimates are significantly larger than the bias. In the results shown, the window length was already reduced to 10 pulse-echo correlation lengths and the α-estimation length was 33 pulse-echo correlation lengths. According to the discussion in the previous paragraph, these parameter estimation region length values are within the ranges where the α0 and β estimates are not expected to vary significantly. The average standard deviations of α0 and β in these ranges, shown in figures 6(b) and 7(b), were about 37% and 16%. To keep these standard deviation levels, figures 9(b.i) and 9(b.ii) show that the window width can be reduced to about 15 uncorrelated scanlines. This is within the range of 10-20 independent scanlines suggested elsewhere.22,25,26,32 For our conditions, this parameter estimation region width spans 7.4mm. This is smaller than the parameter estimation region length of ~9mm stated in the previous paragraph. These results agree with previous findings that the total area spanned by the parameter estimation region would be limited by errors caused by its axial length rather than those caused by its lateral width.26,27,63
If tissue homogeneity allows the selection of a parameter estimation region that can axially accommodate at least two uncorrelated power spectral density estimates and at least 15 uncorrelated scanlines, all spectral estimation methods shown here are expected to perform similarly. This parameter estimation region size (15 pulse-echo correlation lengths × 15 uncorrelated scanlines) fits the task of bulk parameter estimation where large homogeneous parameter estimation regions are available to obtain accurate and precise estimates of α0 and β.
The diagnostic performance of a parametric image is related to the contrast-to-noise ratio (CNR) exhibited by the structure of interest. The CNR is affected by the loss of accuracy and precision of the estimates when the parameter estimation region is gradually reduced below ~1cm × 1cm to create parametric images with sufficient spatial resolution to depict spatial fluctuations of the tissue property. In the case of α0, the minimum α-estimation and spectral estimation window lengths for which the bias is less than 10% are ~10 and ~8 pulse-echo correlation lengths, respectively, spanning a total length of 18 pulse-echo correlation lengths or ~6.8mm (considering a parameter estimation region width of 22 uncorrelated scanlines or 10.9mm). This parameter estimation region length would allow, for example, to distinguish fatty tissue from invasive ductal carcinoma in the breast, whose α0 values differ from healthy parenchyma by about 72% and 53%, respectively.42 For this parameter estimation region length, the multitaper method reduces the standard deviation of α0 over the rectangular-tapered STFT method by a factor of 0.9. If the parameter estimation region width is also reduced to 6.8mm (14 uncorrelated scanlines), the use of the multitaper method reduces the standard deviation of α0 by a factor of 0.79 compared to use of the rectangular-taper STFT. Although a complete analysis of the CNR must consider the shape and size of the object of interest, this analysis indicates the potential advantage of the multitaper in the creation of parametric images. In addition, it is important to consider that the multitaper method reduced the error of modeling the attenuation coefficient as a power-law by a factor of up to 0.93, suggesting its applicability when estimating effective scatterer sizes. These estimates are obtained by comparing the frequency dependence of the backscatter coefficient to a theoretical model, so accurate and precise compensation of the frequency-dependent attenuation is of primary importance.
The advantage of the multitaper method has been attributed to its ability to reduce spectral coherence noise without severely compromising the accuracy of the power spectral density estimates. The influence of spectral coherence noise may be more important when the analysis bandwidth is further reduced, such as when dealing with low SNR values caused by strongly attenuating or deep lesions. In these scenarios, the multitaper method would offer additional advantages compared to other estimation methods. As an example, in figure 11 the overall fit error is plotted versus the spectral estimation window width for the rectangular-taper STFT (gray curves) and multitaper (black curves) methods. As observed, the multitaper method maintains the same error level under the bandwidth reductions tested, while the error from rectangular-taper STFT significantly increased when the bandwidth was reduced. Another method to reduce spectral coherence noise is spectral smoothing, achieved by convolving the power spectral density estimate with a smoothing function, such as that applied by Kim et al.19 for estimating attenuation. However, spectral smoothing introduces bias that would affect the smooth spectral components that the multitaper method was designed to resolve by means of the NW selection criterion.29 Alternatives to spectral estimation based on the Fourier transform include autoregressive algorithms. A comprehensive comparison between the multitaper method and the autoregressive methods is beyond the scope of this manuscript and will be a topic of future research.
This work has investigated the performance of various methods to compute the power spectral density in the context of estimating parameters describing the frequency dependence of the acoustic attenuation coefficient. For the task of obtaining bulk estimates of these parameters, our results support the use of any of the power spectral density estimations tested, with expected error levels of the power-law fit to α(f) on the order of 4%. When creating parametric images, the selection of the size of the parameter estimation region must consider its strong influence on the bias and standard deviation of the α0 and β estimates. However, in general, the multitaper method reduced the standard deviation of the parameter estimates with respect to the other methods (21% reduction for a parameter estimation region offering a 10% bias in α0). These results, which are specific to the materials investigated here, encourage use of Thomson's multitaper method for estimating acoustic parameters under spatially constrained conditions.