


Neuroimage. Author manuscript; available in PMC 2010 August 1.

Published online 2009 April 1. doi: 10.1016/j.neuroimage.2009.03.051

PMCID: PMC2715874

NIHMSID: NIHMS123391

The publisher's final edited version of this article is available at Neuroimage


A functional smoothing approach to the analysis of PET time course data is presented. By borrowing information across space and accounting for this pooling through the use of a non-parametric covariate adjustment, it is possible to smooth the PET time course data thus reducing the noise. A new model for functional data analysis, the Multiplicative Nonparametric Random Effects Model, is introduced to more accurately account for the variation in the data. A locally adaptive bandwidth choice helps to determine the correct amount of smoothing at each time point. This preprocessing step to smooth the data then allows subsequent analysis by methods such as Spectral Analysis to be substantially improved in terms of their mean squared error.

It is increasingly important in many types of neuroimaging studies to try to account for both the signal and noise efficiently while reducing the number of assumptions made about the nature of the underlying data. In addition, study designs are becoming increasingly complex and the signals under investigation closer to the limits of detection. In this note, we consider one specific example where techniques to account for noise can be difficult due to the nonlinear nature of the data, dynamic positron emission tomography (PET) time course analysis, and propose a non-parametric approach to noise reduction.

Dynamic PET data is collected in four dimensions, three spatial and one temporal, and it will be the time dimension that is of particular interest here, although the analysis will incorporate a spatial pooling of information. The time series data collected, while typically having independent errors across time, does have a nonlinear nature due to the chemical reactions that drive the system. Due to experimental constraints such as the need to limit the radiation dosage given to a subject or the count rate properties of the PET tomograph, the time course measurements are often also fairly noisy. In many fields of application, one method to deal with noisy data would be to smooth it.

However, this causes difficulty here due to the nonlinear nature of the time course which is measured on an unequal grid and contains distinct peaks. Some methods of incorporating spatial information have been investigated in the literature (O’Sullivan, 2006; Jiang and Ogden, 2008), but these tend to require techniques such as mixture modelling, which limit their potential to be combined with commonly used techniques in PET time course analysis. Methods such as wavelets in the time dimension (Millet et al., 2000) offer one possible solution using prespecified basis functions, and can be used with additional spatial wavelet bases to incorporate both spatial and temporal information (Alpert et al., 2006). In this paper, an alternative approach is taken, where the basis functions which best describe the data are estimated directly, incorporating both spatial and temporal information non-parametrically so that subsequent model assessment, whether it be parametric or otherwise, can be carried out as before. In practice, no smoothing or spatial pooling is usually performed and a particular model or model class is assumed (Gunn et al., 2001) and estimated, and the noise just seen as a measurement error to the model. However, fits to models can be adversely affected by noise in the data (Slifstein and Laruelle, 2000), and a way of preprocessing the data to account for some of the noise could provide significant advantages in terms of reducing the Mean Squared Error (MSE) of the parameter estimation. However, any smoothing would have to be able to account for the nonlinear nature of the data.

In traditional analysis, either voxel or region data is analyzed separately, one voxel or region at a time. If region data is used, then an average is taken across voxels presumed to be in the region, making the implicit assumption that the only source of difference within the region is measurement error. However, in the proposed method, the data are treated as random functions coming from a population. Thus the data is pooled in a manner that accounts both for the mean of the population and for the way in which the population varies (through analysis of the covariance structure). This makes particular sense for PET data, where the data is very much related, but where each individual curve varies depending on a large number of biological factors (such as receptor density, blood flow, specific and non-specific binding) which could be treated as random across the set of curves.

Here, an algorithm, based on functional principal component analysis (FPCA), is proposed to smooth PET time courses without making particular model assumptions in the preprocessing. The technique is related to Principal Component Analysis, which has previously been used in PET blood flow studies (Friston et al., 1993) and in PET time course analysis (Kimura et al., 2002; Turkheimer et al., 2003; Joshi et al., 2008), but differs in that it includes a smoothing component due to the functional nature (i.e. the data can be considered as continuous mathematical functions) of the time course data. Functional data analysis (Ramsay and Silverman, 2005) has received only limited treatment in the neuroimaging literature, with studies on the special cases of single subject pooled detection (Viviani et al., 2005) and on non-stationary noise estimation (Long et al., 2005). Here it is of interest to look both at the estimation procedure for the components in the presence of noise and at how to take into account the large numbers of curves from different parts of the image. In addition, one particular advantage of the analysis is that some of the noise and covariance characteristics well known to be present in PET can be easily incorporated using a new statistical model we have termed the multiplicative nonparametric random effects model. The algorithm also includes steps to choose the appropriate amount of smoothing at each time point and the number of representative functions needed to model the data. Another intrinsic advantage is that the algorithm takes advantage of the information available from multiple neighboring time courses when adjusting an individual time course. This also has the effect that measurement error is correctly accounted for without the need to choose a weight function, as in most PET analyses, where the choice of weight function can influence the bias present in the estimation (Peng et al., 2008).
It will be demonstrated through simulation and measured data that the method greatly reduces the MSE estimates of parameters subsequently estimated from models or model classes and stabilizes subsequent model fitting.

While PET data will be the focus of this note, it is indeed possible to extend this analysis to other types of data, particularly fMRI. However, the presumed smooth structure of the underlying system in PET makes the method particularly appealing in this case, and thus is a compelling motivation for the methodology.

The rest of the paper is organized as follows. A detailed explanation of the method is given in Section 2, including the new multiplicative nonparametric random effects model. The next section contains a set of simulations to evaluate the method, and a small application to measured PET data. Finally some discussion relating to the work is given.

We take the point of view that the observed PET time-course data at the *i^{th}* voxel are generated from an underlying smooth random function *C_i*(·) defined on a time interval *I*.

Consequently, if for simplicity the measurement times are assumed to be the same for each voxel, the observed data at the *i^{th}* voxel are:

$$Y_{ij} = C_i(t_j) + e_{ij}, \quad \text{where } t_j \in I, \; \text{for } j = 1, \dots, p;$$

(1)

and *e_{ij}* are independent mean zero random variables with var(*e_{ij}*) = *σ*^{2}(*t_j*).

Here we assume that the noise *e_{ij}* is independent but possibly inhomogeneous across time, which is a reasonable assumption in PET (Aston et al., 2000). We also assume the measurement error is independent across voxels.

Our main interest is to recover the latent PET concentration trajectory *C*(·) = {*C*(*t*) : *t I*}, from the noise-contaminated measurements taken at finitely many locations. An ad hoc approach would be to smooth each observed individual voxel time-course data with a nonparametric smoothing method (Fan and Gijbels, 1996), but such an approach is not efficient as the number *p* of observations is typically moderate. Besides, the level of smoothing needs to be tuned for each individual trajectory, requiring different tuning parameters at different voxels. A much more effective way to restore the entire process *C*(·) is to perform the smoothing procedure on a uniform platform, which also allows information sharing between neighbouring voxels. We demonstrate in this paper that FPCA is well suited for such a purpose.
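The ad hoc per-voxel alternative mentioned above can be illustrated with a simple local-linear smoother, a member of the class of local polynomial methods discussed in Fan and Gijbels (1996). This is a minimal sketch only, not the pooled procedure proposed in the paper; the function name and the Gaussian kernel choice are ours:

```python
import numpy as np

def local_linear_smooth(t, y, t_eval, bandwidth):
    """Local-linear smoother: at each evaluation point t0, fit a weighted
    straight line to (t, y) using Gaussian kernel weights centred at t0,
    and report the fitted intercept (the estimate of the curve at t0)."""
    t, y = np.asarray(t, float), np.asarray(y, float)
    fitted = np.empty(len(t_eval))
    for i, t0 in enumerate(t_eval):
        w = np.exp(-0.5 * ((t - t0) / bandwidth) ** 2)
        X = np.column_stack([np.ones_like(t), t - t0])
        # Weighted least squares: solve (X' W X) beta = X' W y
        WX = w[:, None] * X
        beta = np.linalg.solve(X.T @ WX, X.T @ (w * y))
        fitted[i] = beta[0]
    return fitted
```

A known property of local-linear smoothing, useful as a sanity check, is that it reproduces straight-line data exactly, whatever the bandwidth.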

For multivariate data, principal component analysis amounts to conducting a spectral decomposition of the covariance matrix of the multivariate random vector. Such a spectral decomposition has been extended to functional data measured without error at a densely spaced temporal grid by, e.g. Castro et al. (1986), Dauxois et al. (1982), and Rice and Silverman (1991), among others. We adopt a new approach that accommodates unequally spaced temporal grid as well as noise, which is common for PET data. Specifically, let *μ*(*t*) = *E*(*C*(*t*)) and Γ(*s, t*) = Cov (*C*(*s*), *C*(*t*)) denote the mean and covariance function of the process *C*(·). The functional principal component (Karhunen-Loève) representation for the *i ^{th}* concentration temporal curve,

$$C_i(t) = \mu(t) + \sum_{k=1}^{\infty} A_{ik}\, \phi_k(t), \quad t \in I,$$

(2)

is based on the orthogonal eigenfunctions *ϕ_k*(*t*) of the covariance function Γ and the uncorrelated principal component scores *A_{ik}*, which have mean zero and variance *λ_k*, the *k^{th}* eigenvalue of Γ.

Many of the processes which are represented in the PET time course data have chemical rates associated with them. These rates are dependent on a large number of biological factors, too numerous and complex to be exhaustively represented or identified in the discretely and noisily measured data. However, if the alternative viewpoint that the rates are random variables is taken, then a small additive random change in one rate will lead to a multiplicative change in the time course. This is a separate multiplicative random effect from the effect of the Poisson decay process, which is assumed to affect only the measurement error process and not the covariance function once the measurement error has been accounted for. In addition, due to the complexity of the underlying factors which influence the random nature of the curves, it is highly desirable not to make parametric assumptions about the nature of the randomness, other than the likely presence of a multiplicative effect.

Moreover, from the random curve viewpoint, a source of variation may be the scale difference of the curves, suggesting that one of the eigenfunctions, e.g. the leading eigenfunction, is approximately proportional to the mean function *μ*(*t*). Absorbing this component into a multiplicative random effect leads to the model

$$C_i(t) = B_i\, \mu(t) + \sum_{k=2}^{\infty} A_{ik}\, \phi_k(t),$$

(3)

where *B_i* = 1 + *A_{i1}* is a multiplicative random effect with mean one, capturing the overall scale of the *i^{th}* curve.

A form of multiplicative standard PCA model was used by Joshi et al. (2008) for PET time course data, in that the mean was included in the regression used to reduce noise. However, the mean was not explicitly removed from the remaining covariance, and thus that model does not have a true multiplicative structure as defined above.

Since time course data are observed at each voxel, a question arises as to how to utilize the voxel positions when performing the multiplicative random effects model (3). We propose two ad hoc procedures in this subsection, and two additional extensions in the next section. All four procedures involve a functional PCA, which forms the bedrock of all nonparametric random effects models considered in this paper. Below we describe the estimation procedures involved in each of the building blocks. In addition, in Figure 1, a flowchart of the analysis procedure used in the measured data analysis is given.

Flowchart of the proposed fFPCA analysis procedure from start to finish. This describes the procedure used in the ^{11}C-Diprenorphine analysis

Firstly, only in-brain voxels are used for the analysis, with voxels outside the brain not considered. The mean function can be estimated by pooling all the data points and their measurement times {(*Y_{ij}*, *t_j*) : *i* = 1, …, *n*; *j* = 1, …, *p*} and applying a one-dimensional smoother to the pooled scatter plot; this constitutes Step 1 of the algorithm.

For computational efficiency and model flexibility (no parametric assumptions are made on the distribution of *B_i*), we propose a least squares method to estimate the multiplicative random effect *B_i* at each voxel; this constitutes Step 2 of the algorithm.
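Minimising Σ_j (*Y_{ij}* − *B μ̂*(*t_j*))² in *B* gives a closed-form least squares estimate, *B̂_i* = Σ_j *Y_{ij} μ̂*(*t_j*) / Σ_j *μ̂*(*t_j*)². A minimal sketch, vectorised over voxels (the function name is ours):

```python
import numpy as np

def estimate_multiplicative_effects(Y, mu):
    """Least-squares estimate of the multiplicative random effect B_i for
    each voxel curve, minimising sum_j (Y_ij - B_i * mu(t_j))^2.
    Y: (n_voxels, n_times) array of time courses; mu: (n_times,) mean curve.
    Closed form: B_i = <Y_i, mu> / <mu, mu>."""
    Y, mu = np.asarray(Y, float), np.asarray(mu, float)
    return (Y @ mu) / (mu @ mu)
```

For noiseless curves that are exact multiples of the mean, the estimator recovers the scale factors exactly.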

After the mean function *μ*(*t*) and the multiplicative random effect *B_i* have been estimated for all voxels, the remaining covariance structure in the residuals *R_{ij}* = *Y_{ij}* − *B̂_i μ̂*(*t_j*) can be estimated by applying a two-dimensional smoother to the raw covariances; this constitutes Step 3 of the algorithm.

The variance *σ*^{2}(*t*) of the noise is also of interest and can be estimated by first estimating the variance of *R*(*t*) = *Y*(*t*) − *Bμ*(*t*). To do so, apply a smoother to the scatter plot based on the diagonal elements {(*G_{ikk}*, *t_k*) : *i* = 1, …, *n*; *k* = 1, …, *p*} of the raw covariances, and then subtract the diagonal Γ̂_{(−B)}(*t*, *t*) of the smoothed covariance surface to obtain *σ̂*^{2}(*t*).
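The idea behind this subtraction can be sketched as follows: the pointwise variance of the centred residuals contains both the process variance Γ(*t*, *t*) and the noise variance, so removing the smoothed-surface diagonal isolates *σ*²(*t*). This is a simplified sketch (in the paper the raw diagonal is itself smoothed first; here we use the empirical variance directly, and the names are ours):

```python
import numpy as np

def noise_variance(R, gamma_diag):
    """Estimate sigma^2(t_k) at each time point.
    R: (n_voxels, n_times) centred residual curves R_i = Y_i - B_i * mu.
    gamma_diag: diagonal of the smoothed covariance surface, Gamma(t_k, t_k).
    The raw variance of R contains Gamma(t_k, t_k) + sigma^2(t_k); subtracting
    the surface diagonal isolates the noise.  Negative values (possible with
    noisy estimates) are truncated at zero."""
    raw_var = np.var(np.asarray(R, float), axis=0, ddof=1)
    return np.maximum(raw_var - np.asarray(gamma_diag, float), 0.0)
```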

Once we have a covariance estimate for Γ_{(−B)}, the eigenfunctions and eigenvalues {(*ϕ_k*, *λ_k*) : *k* ≥ 2} can be estimated by solving the eigen-equations

$$\int_I \widehat{\Gamma}_{(-B)}(s,t)\, \widehat{\phi}_k(s)\, ds = \widehat{\lambda}_k\, \widehat{\phi}_k(t),$$

(4)

where the *ϕ̂_k* are subject to the orthonormality constraints ∫_*I* *ϕ̂_k*(*t*)^{2} *dt* = 1 and ∫_*I* *ϕ̂_k*(*t*)*ϕ̂_m*(*t*) *dt* = 0 for *m* < *k*.
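Numerically, the eigen-equation (4) is solved on the discrete time grid: replacing the integral by quadrature weights turns it into a symmetric matrix eigenproblem. A sketch under trapezoidal quadrature (the weighting scheme and names are our choices):

```python
import numpy as np

def functional_eigendecomposition(gamma, t):
    """Discretise the eigen-equation  int Gamma(s,t) phi(s) ds = lambda phi(t)
    on the (possibly unequally spaced) grid t with trapezoidal weights w,
    via the symmetric problem  W^{1/2} Gamma W^{1/2} u = lambda u, phi = W^{-1/2} u.
    Returns eigenvalues (descending) and eigenfunctions as columns,
    normalised so that  int phi_k(t)^2 dt ~ 1."""
    t = np.asarray(t, float)
    w = np.empty_like(t)                      # trapezoidal quadrature weights
    w[1:-1] = (t[2:] - t[:-2]) / 2.0
    w[0] = (t[1] - t[0]) / 2.0
    w[-1] = (t[-1] - t[-2]) / 2.0
    sw = np.sqrt(w)
    sym = sw[:, None] * np.asarray(gamma, float) * sw[None, :]
    lam, u = np.linalg.eigh(sym)              # ascending order from eigh
    order = np.argsort(lam)[::-1]
    return lam[order], (u / sw[:, None])[:, order]
```

The symmetrised form is used so that a standard symmetric eigensolver applies; dividing by √w afterwards converts the discrete eigenvectors back into functions with unit L² norm.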

Since PET time-course data are available on each voxel, which belongs to a particular slice, there are several ways to perform FPCA. Here slice can be taken to mean any particular segmentation of the data in space. The natural one to use in the measured PET data will be the z-plane. Below we describe two ad hoc approaches.

**pFPCA**: This approach combines all the PET time-course data across all voxels into a pooled sample of curves and then applies Steps 1-5 of the algorithm. The prefix "p" stands for "pooling".

**smFPCA**: This approach estimates the mean function for each slice separately by implementing the one-dimensional smoother in Step 1 using all the voxel curves on the same slice. Once the mean function on each slice has been estimated, Step 2 of the algorithm is employed for each voxel curve using the estimated mean function for the slice in which that voxel curve lies. Afterwards, the residual time-courses *R_{ij}* from all voxels are pooled together, and a common residual covariance is estimated through a two-dimensional smoothing approach as in Step 3 of the algorithm. Thereafter, Steps 4 and 5 are employed. The prefix "sm" indicates that only the sliced mean function, but not the covariance function, has been adjusted.

To summarize, these ad hoc approaches utilize the information on the slice location of each voxel differently, with pFPCA ignoring this information completely. A similar FPCA approach that does not assume the multiplicative structure of our pFPCA was proposed in Viviani et al. (2005) for fMRI data. Our numerical findings in section 3 suggest efficiency loss when pooling all the voxel curves together to borrow information. An interesting question arises as to how to best utilize the slice information. Below we describe two alternative approaches.

The above question can be framed in a more general way:

How does one take into account available covariate information in PET time-course data under the platform of functional principal component analysis?

As an example, in the PET data in section 3, we have information on the thickness of each slice and its vertical location in the brain which can be used to assist the smoothing. This is more informative than the voxel size alone, as the location gives the relative position of the voxels. Let *Z* reflect this covariate information; in this case *Z* is a scalar. Since we do not know for sure the form of the covariate effects, a nonparametric approach seems suitable as an exploratory tool. The only assumption that we will make is that the mean functions, and possibly the covariance surfaces as well, change smoothly across the slices, a very mild assumption given the spatial resolution of the data. Therefore sharing some information across slices is likely to increase efficiency. There are two ways to model covariate effects, either through the mean function only, or through both mean and covariance functions. Both approaches assume that the mean function *μ* depends on a covariate value *z*, so that *μ* = *μ*(*t*, *z*). The difference is in the handling of the covariance structure. While the fully adjusted approach has the covariance surfaces also depending on the values of the covariate, the mean adjusted approach does not account for the covariate effect on the covariances. Similar approaches to handle covariate effects are investigated in a technical report of Jiang and Wang (2008), but for a different model. Below we briefly describe how to adapt these approaches to the setting (3) of a multiplicative random effects model.

Here one assumes for the *i^{th}* voxel, whose slice information is coded in the covariate value *z_i*, that

$$C_i(t) = B_i(z_i)\, \mu(t, z_i) + \sum_{k=2}^{\infty} A_{ik}\, \phi_k(t),$$

(5)

Thus, only the mean function varies with *z_i*, but not the residual covariance. This model is a counterpart of the smFPCA in section 2.2, where centering of the voxel time-course curves from the same slice is performed using that slice's own mean function; we refer to this mean-adjusted approach as mFPCA.

This model assumes that both the mean function *μ*(*t*, *z*) and the residual covariance function vary with the covariate information *z*. Thus, for the *i^{th}* voxel curve with covariate value *z_i*,

$$C_i(t) = B_i(z_i)\, \mu(t, z_i) + \sum_{k=2}^{\infty} A_{ik}(z_i)\, \phi_k(t, z_i),$$

(6)

where the *ϕ_k*(·, *z_i*) are the eigenfunctions of the covariance surface associated with covariate value *z_i*; we refer to this fully adjusted approach as fFPCA.

The aforementioned algorithms to perform the multiplicative FPCA rely on nonparametric smoothing methods, which involve the choice of a smoothing parameter to regulate the amount of desired smoothness. For the local polynomial smoothers used in the smoothing steps, the corresponding smoothing parameter is the bandwidth, which assigns smaller weights to data further away from the time point where the estimate is located. Usually, a single global bandwidth is used at all locations for simplicity, and cross-validation methods are typically used to select the desired bandwidths. For PET time course data, a global bandwidth is appropriate along the covariate coordinate (*z*) but not desirable in the time coordinate, due to the denser measurement schedule at the beginning of the time period and the sharp peak of each voxel curve near the left boundary (see Figure 2(a)). To retain these peaks, and in order not to compromise the performance at other temporal locations, local bandwidths, which adapt to the temporal location, are recommended and used in our analysis.

Example of analysis for five voxels from ^{11}C-Diprenorphine data. Top Left (a) Original Voxel Data; Top Right (b) Local Bandwidth Function; Bottom Left (c) Scree Plot; Bottom Right (d) Smoothed Voxel Data

In essence, a smaller bandwidth which involves less smoothing is preferred near the peak location, while larger bandwidths are used near the right boundary where the curve is relatively flat. The resulting local bandwidths used in our numerical analysis in section 3 are plotted in Figure 2(b). Specifically, we first selected 13 time locations where measurements of the time-course data were sampled and then determined the respective bandwidths for each of these 13 locations. To choose the bandwidth *b*(*t*) at location *t* we used a rule of thumb that the interval [*t* − *b*(*t*), *t* + *b*(*t*)] included at least four observations. We also employed boundary correction to ensure a positive bandwidth choice. Overall, as the PET data were sampled more frequently in the beginning than near the end of the observation period, this choice led to increasing bandwidths for increasing time points. To ensure a smooth outcome, the bandwidth needs to vary smoothly across time locations; we therefore fitted a polynomial of order 4 to the 13 pairs (*t_j*, *b*(*t_j*)).
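The rule of thumb above — the smallest *b*(*t*) such that [*t* − *b*(*t*), *t* + *b*(*t*)] contains at least four observations, followed by a degree-4 polynomial fit for smoothness — can be sketched as follows (parameter names and defaults are ours):

```python
import numpy as np

def local_bandwidths(sample_times, eval_points, k=4, degree=4):
    """Rule-of-thumb local bandwidths: at each evaluation point t, take
    b(t) just large enough that [t - b, t + b] contains at least k sample
    times (i.e. the k-th smallest distance from t to a sample time), then
    fit a degree-`degree` polynomial so the bandwidth varies smoothly."""
    s = np.sort(np.asarray(sample_times, float))
    raw = np.array([np.sort(np.abs(s - t))[k - 1] for t in eval_points])
    coef = np.polyfit(eval_points, raw, degree)
    return np.polyval(coef, eval_points)
```

With a sampling schedule that is dense early and sparse late, as in PET, the resulting bandwidth function increases with time, matching the behaviour described above.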

The selection of *α* through cross validation follows the standard practice of deleting one voxel curve while fitting the mean or covariance function using the remaining voxel curves. This is possible for two-dimensional smoothers such as in pFPCA, smFPCA, or mFPCA, but can be computationally intensive, especially when a three-dimensional smoother is employed in fFPCA. We therefore employed a leave-one-slice-out cross-validation method for fFPCA, deleting one slice at a time to save computing time. For the remaining FPCA approaches, we also recommend adhering to the leave-one-slice-out cross validation, as the numerical results in section 3 suggest that it performs well in general and its computational cost is manageable on a PC.

We further speed up the bandwidth selection process computationally by using a small approximation to the full multidimensional smoothing undertaken in the data analysis. Specifically, separable smoothing is used in the cross-validation step, in that the time dimensions are smoothed first and then the covariate dimension, in order to select the bandwidths. However, once the smoothing parameters have been selected, a full 2-D or 3-D smoothing incorporating all information is used to smooth the functions of interest (i.e. the final smoothing is not performed sequentially but in a full multidimensional manner at each point).

Several approaches could be applied to choose the number of components, such as AIC, BIC, and fraction of variance explained (FVE). However, AIC and BIC require model assumptions for all the random terms while FVE does not. Thus to avoid making further model assumptions, we employ FVE to select the number of components for the predicted trajectories. The FVE method is defined as the minimum number of components needed to explain at least a specified fraction, say 80%, of the total remaining variation not explained by the multiplicative random effects. That is, the number of components *k* selected by FVE can be represented as
$\sum_{i=2}^{k} \widehat{\lambda}_i \big/ \sum_{i=2}^{\infty} \widehat{\lambda}_i \ge 0.80$. In practice, the FVE method can be guided by a scree plot (see Figure 2(c)), and 80% provides satisfactory results both in our simulation setting and for the measured [^{11}C]-diprenorphine data in section 3. Examining the PET simulation data, two eigenfunctions explain about 80% of the variation when the noise level is close to that of measured data, and this was most often the number of components chosen. In the measured data, three components most often represented at least 80% of the variation. Furthermore, under the multiplicative model assumptions described in section 2.1, the eigenfunctions explain about 80% of the residual variation not explained by the multiplicative term, in essence making this value a strict lower bound on the amount of total variation explained, which is likely to be much larger in practice. Therefore, even though only a very small number of eigenfunctions are used, the reconstruction of the concentration trajectories after smoothing appears to be very accurate (see Figure 2(d)).
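The FVE rule can be computed directly from the estimated eigenvalues; a minimal sketch (function name ours), where the eigenvalues passed in are those remaining after the multiplicative random effect has been removed:

```python
import numpy as np

def select_n_components(eigenvalues, fve=0.80):
    """Smallest K such that the leading K eigenvalues explain at least a
    fraction `fve` of the total remaining variation (the multiplicative
    random effect is assumed already removed, so these are the lambda_k
    for k >= 2)."""
    lam = np.asarray(eigenvalues, float)
    frac = np.cumsum(lam) / lam.sum()
    # first index whose cumulative fraction reaches fve, converted to a count
    return int(np.searchsorted(frac, fve) + 1)
```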

Once all the unknown quantities in the multiplicative model (3) are estimated and the number of components *K* selected, the concentration temporal curve for the *i ^{th}* voxel can be reconstructed as

$$\widehat{C}_i(t) = \widehat{B}_i\, \widehat{\mu}(t) + \sum_{k=2}^{K} \widehat{A}_{ik}\, \widehat{\phi}_k(t).$$

(7)
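Once all components are estimated, equation (7) is a direct linear combination; a minimal sketch (names ours):

```python
import numpy as np

def reconstruct_curve(B_i, mu, scores, eigenfunctions):
    """Reconstruct the i-th voxel concentration curve as in eq. (7):
    C_i(t) = B_i * mu(t) + sum_k A_ik * phi_k(t), truncated at K components.
    mu: (n_times,); scores: (K,); eigenfunctions: (n_times, K)."""
    return (B_i * np.asarray(mu, float)
            + np.asarray(eigenfunctions, float) @ np.asarray(scores, float))
```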

Having found a suitable decomposition of the time course data using the multiplicative models above, the data are reconstructed from its functional components and analyzed using standard techniques such as Spectral Analysis (Cunningham and Jones, 1993). While such methods often are affected by high noise levels, as the preprocessing reduces the noise by only selecting a few components to represent the data, the fitting becomes more robust. In addition, a particular feature of the preprocessing is that the measurement error does not need to be subsequently specified, as it is nonparametrically assessed in the procedure and then accounted for. This means that in the subsequent analysis no weight function needs to be specified. Weighting schemes in PET are usually based on a single global scheme for all voxels, but it has been observed that incorrect weights can lead to significant biases in analysis due to the non-linearities present (Peng et al., 2008). Therefore removing this step of requiring pre-specification of the weights is an interesting byproduct of the algorithm.

In the analysis below, all the data were analysed using Spectral Analysis (Cunningham and Jones, 1993) after smoothing. Spectral Analysis does not assume a known compartmental structure, but rather performs a model selection through a non-negativity constraint on the parameters. In particular, the curve *C*(*t*) is parameterised by

$$C(t) = I(t) \otimes \sum_{j=1}^{K} \alpha_j\, e^{-\beta_j t}$$

(8)

where *I*(*t*) is a known input function, and *α_j* and *β_j* are nonnegative parameters to be estimated. The volume of distribution *V_T* is then given by

$$V_T = \int_0^{\infty} \sum_{j=1}^{K} \alpha_j\, e^{-\beta_j t}\, dt = \sum_{j=1}^{K} \frac{\alpha_j}{\beta_j}$$

(9)

For more details see Cunningham and Jones (1993) or Gunn et al. (2002). This method is well known to be sensitive to noise with the bias being highly dependent on the level of noise present (Gunn et al., 2002; Peng et al., 2008). Thus if smoothing can reduce the noise level in the data, the analysis should be much less biased even at noise levels associated with voxel data.
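To illustrate the flavour of Spectral Analysis, the sketch below fits a nonnegative sum of exponentials over a fixed grid of *β* values and computes *V_T* as in eq. (9). This is a simplified illustration, not the Cunningham and Jones implementation: the input function is taken as an impulse (so the basis functions are plain exponentials rather than convolutions with *I*(*t*)), a simple coordinate-descent solver stands in for a proper NNLS routine, and all names are ours:

```python
import numpy as np

def nnls_cd(A, b, iters=5000):
    """Plain cyclic coordinate descent for  min ||A x - b||^2  s.t. x >= 0;
    a stand-in for a proper NNLS routine (e.g. Lawson-Hanson)."""
    G, h = A.T @ A, A.T @ b
    x = np.zeros(A.shape[1])
    for _ in range(iters):
        for j in range(len(x)):
            # partial residual for coordinate j, then clip at zero
            r = h[j] - G[j] @ x + G[j, j] * x[j]
            x[j] = max(r / G[j, j], 0.0)
    return x

def spectral_vt(t, tac, betas):
    """Fit tac(t) ~ sum_j alpha_j exp(-beta_j t), alpha_j >= 0, over a fixed
    grid of betas, and return V_T = sum_j alpha_j / beta_j as in eq. (9).
    Simplification: the input function is an impulse, so no convolution."""
    A = np.exp(-np.outer(np.asarray(t, float), np.asarray(betas, float)))
    alpha = nnls_cd(A, np.asarray(tac, float))
    return float(np.sum(alpha / betas)), alpha
```

The non-negativity constraint performs the model selection: basis exponentials not supported by the data receive zero coefficients.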

In order to determine the efficacy of the proposed methods in relation to standard analyses, simulations were performed along with the analysis of a measured diprenorphine data set. The simulations were designed to determine whether combining information, through the different types of pooling proposed above, would be beneficial to the analysis in different settings, while the measured data set allowed an evaluation of the methods in practice. The results of the above models were compared to the results from Spectral Analysis performed on the original data, which neither has any intrinsic smoothing associated with it, nor makes use of any spatial pooling of the data, as the analysis is performed independently on each voxel.

A set of simulations were generated for different measurement error noise levels based on the simulations in Gunn et al. (1997) for a blood/plasma input function model. Simulations were generated from a two-tissue compartment model with associated input function such that the simulated data was directly comparable to the measured data below. The rate constants used in the simulations were taken from Gunn et al. (1997) along with a measured plasma input function (the input function associated with the measured data). Thus the rate constants [*K _{1}* = 6.7 × 10

An additional level of randomness was added to the simulations, by allowing the rate constants to vary in such a way that the associated true *V _{T}* in the simulations had a variability of approximately 6%, roughly equivalent to the voxelwise variability observed within regions in the measured data. Finally, the data was blurred using a standard Gaussian blurring kernel, with FWHM of 6mm, with the voxels in the image being presumed to be 2mm × 2mm (as this is a 2 dimensional simulation).
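For reference, a 6 mm FWHM Gaussian kernel on 2 mm voxels corresponds to a standard deviation of about 1.27 voxels, via the standard relation FWHM = 2√(2 ln 2) σ (the helper name is ours):

```python
import math

def fwhm_to_sigma(fwhm_mm, voxel_mm):
    """Convert a Gaussian kernel's full-width-at-half-maximum (in mm) to
    its standard deviation in voxel units: FWHM = 2 * sqrt(2 * ln 2) * sigma."""
    return (fwhm_mm / voxel_mm) / (2.0 * math.sqrt(2.0 * math.log(2.0)))
```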

The results of the single blurred region analysis can be seen in Table 1. Firstly, it does seem counterintuitive that the measurement noise level seems to have little effect on the results. However, this can be explained by considering the fact that the noise simulations are blurred across the planes, inducing a dependence structure. Because each curve is reconstructed from a set of random parameter values, multiple components are present in the data even before the measurement error noise is added. These components can be difficult to detect, causing a large variance in the resulting estimates. The MSE is dominated by this variance in all cases. However, it is also very noticeable that the smoothed results all greatly improve on the MSE, in particular the sliced mean FPCA data analysis. This suggests that in principle it would be advantageous to smooth the data prior to analysis.

The simulations of the previous section do, however, fail to account for the feature of PET data that multiple regions are present, and that the smoothing takes advantage of a spatial pooling of the data. In order to assess the effect of different regions, a similar simulation was performed using a brain phantom (Shepp-Vardi phantom, 128 × 128 pixels, repeated over 50 realisations) with five different regions of varying sizes. Different signals were placed in each of the regions, based again on random parameters, and the phantom was then blurred as above. The values used in each region, along with its size, are given in Table 2. The idea was to see how well different regions were recovered when subjected to different partial volume effects based on their size and differing intrinsic *V_T* values. Again the covariate adjustment was across columns in the 2-D simulation.

The results of the blurred phantom simulation study are somewhat consistent with the results of the previous simulations, although one marked difference with the performance of smFPCA occurs. In this case there is no covariate information in the first simulation set, so pooling over all the curves and only adjusting the means in each column actually helped capture more information. However, in the phantom case, the performance is enhanced by assuming that there is a covariate function across the columns, as the differing regions have different variance structures (due to the random parameter values), and thus smFPCA performance is affected as it does not use this information. pFPCA is somewhat effective at low noise levels, but becomes inefficient at higher noise levels, while mFPCA is also beneficial at all noise levels except the largest. In contrast, in all cases using fFPCA, where both the mean and covariance functions are adjusted, greatly improves the MSE for the data. The highest noise level is approximately of the same magnitude as the noise level observed in voxelwise measured data (in addition to the within region variability added in the simulations) and fFPCA still yields considerable gains over the method without smoothing.

A dynamic scan from a measured [^{11}C]-diprenorphine study of normal subjects, for which an arterial input function was available, was analysed. The subject underwent a 95-min dynamic [^{11}C]-diprenorphine PET baseline scan. The subject was injected with 185 MBq of [^{11}C]-diprenorphine. PET scans were acquired in 3D mode on a Siemens/CTI ECAT EXACT3D PET camera, with a spatial resolution after image reconstruction of approximately 5 mm. Data were reconstructed using the reprojection algorithm (Kinahan and Rogers, 1989) with ramp and Colsher filters cut off at the Nyquist frequency. Reconstructed voxel sizes were 2.096 mm × 2.096 mm × 2.43 mm. Acquisition was performed in listmode (event-by-event) and scans were rebinned into 32 time frames of increasing duration. Frame-by-frame movement correction was performed on the dynamic [^{11}C]-diprenorphine PET images.

From the results of the simulation it was deemed that fFPCA was most applicable to the analysis of data with multiple regions. The analysis was carried out in three dimensions, with the covariate adjustment based on z-plane information (rather than the column information used in the simulations). A scree plot was used to determine the number of eigenfunctions needed for the smoothing, and on average three eigenfunctions (*K* = 3) were needed to explain 80% of the remaining variation after accounting for the multiplicative random effect component. A flowchart of the analysis procedure is given in Figure 1.
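The FVE-based choice of *K* can be illustrated with a short sketch. The eigenvalue spectrum below is hypothetical, invented purely for illustration; the 80% threshold is the one used in the analysis above:

```python
import numpy as np

def select_K(eigenvalues, fve_threshold=0.80):
    """Return the smallest number of eigenfunctions whose cumulative
    fraction of variance explained (FVE) reaches the threshold."""
    ev = np.asarray(eigenvalues, dtype=float)
    fve = np.cumsum(ev) / ev.sum()
    return int(np.searchsorted(fve, fve_threshold) + 1)

# Hypothetical eigenvalue spectrum: the first three components
# together pass 80% of the variance, so K = 3 as in the analysis.
eigvals = [5.0, 2.0, 1.5, 0.6, 0.4, 0.3, 0.2]
print(select_K(eigvals))  # -> 3
```

The same cumulative fractions are what a scree plot displays graphically; the function simply automates reading off the elbow at the chosen threshold.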

Figure 2(a) shows five typical voxels from the study prior to smoothing, while Figure 2(d) shows the same voxels after smoothing. The results show that the smoothing retains the functional form of the data while removing almost all the noise from the signals. Indeed, in terms of mean squared residuals, the noise is reduced by 72% on average (range 55%-90%). As can be seen in Figure 3, the data without smoothing has higher *V _{T}* on average than that after smoothing. The average difference was about 4.5%, and this value was fairly uniform over the range of *V _{T}* values.

In this paper, a methodology has been presented for smoothing PET time course data, using a new statistical model based on multiplicative functional principal components. It has been demonstrated that this method reduces the MSE of subsequent parameter estimation: many techniques such as Spectral Analysis are known to have a noise-dependent bias, so removing noise prior to the analysis improves their accuracy. The model improves estimation in two distinct ways. Firstly, the error distribution is accounted for automatically, yielding reduced noise after smoothing; the remaining noise has homogeneous variance across time, which removes the need to specify a variance weight function for the noise in the data. Secondly, and most importantly, the analysis borrows information across space, and when adjusted using a non-parametric covariate for the spatial information this yields dramatic improvements in MSE. Although we have concentrated on Spectral Analysis for the subsequent analysis, due both to its popularity in clinical applications and to its well-known noise-dependent bias, the multiplicative model developed here is completely independent of the subsequent parameter estimation method. It is likewise independent of whether a plasma input function is available (as was the case for the measured data here), in that the output of the analysis is image data that has been temporally smoothed.

One could also perform a standard FPCA without assuming a multiplicative random effects model. The algorithm is actually simpler, as it skips the estimation of the multiplicative random effects, or equivalently sets *B _{i}* = 1 for all voxels. However, given the nature of the data, and the way the random nature of the curves is likely expressed through the unknown rate constants, a multiplicative model seems more appropriate.
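The distinction can be made concrete with a toy sketch on synthetic data. This is not the paper's estimator, only a minimal projection-based analogue: the mean curve, noise levels, and the least-squares estimate of *B _{i}* are all illustrative assumptions. Replacing the estimated effects by 1 everywhere recovers ordinary (non-multiplicative) FPCA.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic voxel time courses: Y_i(t) = B_i * mu(t) + noise,
# a toy version of a multiplicative random effects model.
t = np.linspace(0, 1, 50)
mu = t * np.exp(-3 * t)                   # illustrative mean curve
B = rng.lognormal(0.0, 0.3, size=200)     # multiplicative random effects
Y = np.outer(B, mu) + rng.normal(0.0, 0.005, size=(200, 50))

# Step 1: estimate the mean curve (here, the cross-voxel average).
mu_hat = Y.mean(axis=0)

# Step 2: estimate each B_i by least-squares projection onto mu_hat.
# Setting B_hat to 1 everywhere would give a standard FPCA instead.
B_hat = Y @ mu_hat / (mu_hat @ mu_hat)

# Step 3: ordinary FPCA on the residuals after removing B_i * mu.
resid = Y - np.outer(B_hat, mu_hat)
cov = resid.T @ resid / len(Y)
eigvals, eigvecs = np.linalg.eigh(cov)
eigvals, eigvecs = eigvals[::-1], eigvecs[:, ::-1]  # descending order

# Smoothed reconstruction using K leading components.
K = 3
scores = resid @ eigvecs[:, :K]
Y_smooth = np.outer(B_hat, mu_hat) + scores @ eigvecs[:, :K].T
```

On this synthetic example the reconstruction retains the multiplicative structure while discarding most of the additive noise, which is the behaviour the model is designed to exploit.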

Several different methods have been presented to carry out the smoothing, all attempting to borrow information spatially to inform the temporal smoothing. The more naive pooling methods such as pFPCA have been shown to be less effective than those which account for the spatial structure through a non-parametric covariate adjustment, particularly fFPCA, which applies a full covariate adjustment to both the mean and the covariance structure. The more complex methods do, however, carry a computational penalty, in that the covariate adjustment can be time consuming. It would also be possible to perform a full FPCA analysis on each slice separately (sfFPCA, a natural extension of smFPCA), but this would be equivalent to a slicewise pFPCA, which lacks the covariate adjustment needed to make full use of the data. Moreover, because a new bandwidth must be chosen for each slice (the most computationally intensive part of the procedure, due to the cross-validation), this method is much more computationally expensive than the others. We did verify that no advantage would be gained from this method in the simulations (data not shown), and it was found to be worse than the other smoothing methods for the blurred phantom multiple region data. We also do not claim to obtain unbiased estimates of the signal after smoothing, since smoothing naturally introduces a bias into the estimation procedure. One source of bias is the possible autocorrelation that the smoothing introduces into the system; given the non-parametric approach undertaken, it is difficult to assess the exact nature of any autocorrelation present with respect to any particular parametric model. We do, however, believe that any such bias is small in comparison to the bias arising from applying a procedure such as Spectral Analysis to highly noisy data, as evidenced by the simulations.

In all non-parametric smoothing methods there is the problem of choosing the bandwidth parameter. It has been shown that a locally adaptive smoothing structure is more applicable here, given the nature of the curve, and cross-validation is used to make the final bandwidth choice. In addition, the scree plot of percentage of variance explained is used to select the number of components. This could be replaced by an AIC or BIC approach if preferred, but the approach used here appeared robust in both the simulations and the real data analysis. A consequence of the multiplicative model is that, when the FVE criterion is used, the amount of total variation explained will be considerably more than the level chosen whenever the mean is proportional to the first eigenfunction. Examining a non-multiplicative FPCA model for the simulation data, the first functional principal component not only closely resembled the mean but also accounted for 70% of the total variation. This implies that an FVE criterion of 80% for the multiplicative model would actually account for closer to 95% of the total variation in the data. More localised measures could be used instead, but these are more difficult to assess objectively than global measures such as FVE, and so FVE was chosen.
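The back-of-the-envelope calculation behind that figure can be made explicit:

```python
# If the mean-like first component absorbs about 70% of the total
# variation, an 80% FVE threshold applied to the remaining 30% gives
first_component = 0.70
fve_on_remainder = 0.80
total_explained = first_component + fve_on_remainder * (1 - first_component)
print(round(total_explained, 2))  # -> 0.94
```

That is, 0.70 + 0.80 × 0.30 = 0.94 of the total variation, consistent with the "closer to 95%" stated above.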

While the current choice of the non-parametric covariate function, simply the z-location in space, could justifiably be considered slightly naive, it appears to capture enough of the spatial structure to make pooling information over this covariate very useful in reducing the MSE of subsequent fitting. However, this is only a first step, and we intend to make the covariate information more complex in future. Several possibilities are available: surrogate information such as structural MRI could be combined to give a more informative covariate function, or a full three-dimensional spatial covariate could be used. The second approach would be particularly computationally intensive, in that the scatter plot smoother would then need to operate over four dimensions in mFPCA and five dimensions in fFPCA. By using a simpler smoothing algorithm it may nevertheless be possible, and this is one of the lines of enquiry in our current research.
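A scatter plot smoother of the kind used for the covariate adjustment can be sketched in one dimension as a local linear fit, in the spirit of Fan and Gijbels (1996). The Gaussian kernel, bandwidth, and sinusoidal test function below are illustrative assumptions, not the paper's actual choices:

```python
import numpy as np

def local_linear(x0, x, y, h):
    """Local linear smoother at point x0 with Gaussian kernel bandwidth h:
    weighted least squares on an intercept and a centred linear term."""
    w = np.exp(-0.5 * ((x - x0) / h) ** 2)
    X = np.column_stack([np.ones_like(x), x - x0])
    W = np.diag(w)
    beta = np.linalg.solve(X.T @ W @ X, X.T @ W @ y)
    return beta[0]  # fitted value at x0

# Illustrative use: smoothing a noisy quantity over a scalar covariate,
# analogous to pooling over the z-plane location.
rng = np.random.default_rng(1)
z = np.linspace(0, 1, 100)
y = np.sin(2 * np.pi * z) + rng.normal(0, 0.2, 100)
fit = np.array([local_linear(z0, z, y, h=0.05) for z0 in z])
```

The three- or four-dimensional covariate versions mentioned above follow the same weighted least-squares idea, but the design matrix and kernel grow with each added dimension, which is the source of the computational cost.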

The methodology presented here is still very much in its infancy, especially as applied to neuroimaging data. The aim of this note was to show that the methods of functional principal component analysis can be applied in practice to measured PET data using spatially pooled information. While there is still much scope to improve how the spatial information is incorporated in the pooling, it is interesting to see that even a simple and somewhat naive approach of pooling in one dimension can dramatically reduce the MSE of the estimates. We believe that further research into the use of functional data analytic methods in neuroimaging will lead to many fruitful advances.


- Alpert NM, Reilhac A, Chio TC, Selesnick I. Optimization of dynamic measurement of receptor kinetics by wavelet denoising. Neuroimage. 2006;30:444–451. [PubMed]
- Aston JAD, Gunn RN, Worsley KJ, Ma Y, Evans AC, Dagher A. A statistical method for the analysis of positron emission tomography neuroreceptor ligand data. Neuroimage. 2000;12:245–256. [PubMed]
- Castro PE, Lawton WH, Sylvestre EA. Principal modes of variation for processes with continuous sample curves. Technometrics. 1986;28:329–337.
- Cunningham VJ, Jones T. Spectral analysis of dynamic PET studies. Journal of Cerebral Blood Flow and Metabolism. 1993;13:15–23. [PubMed]
- Dauxois J, Pousse A, Romain Y. Asymptotic theory for the principal components analysis of a vector random function: Some applications to statistical inference. Journal of Multivariate Analysis. 1982;12:136–154.
- Fan J, Gijbels I. Local Polynomial Modelling and Its Applications. London: Chapman and Hall; 1996.
- Friston K, Frith C, Liddle P, Frackowiak R. Functional connectivity: The principal-component analysis of large (PET) data sets. Journal of Cerebral Blood Flow and Metabolism. 1993;13:5–14. [PubMed]
- Gunn RN, Gunn SR, Cunningham VJ. Positron emission tomography compartmental models. Journal of Cerebral Blood Flow and Metabolism. 2001;21:635–652. [PubMed]
- Gunn RN, Gunn SR, Turkheimer FE, Aston JAD, Cunningham VJ. Positron emission tomography compartmental models: A basis pursuit strategy for kinetic modeling. Journal Of Cerebral Blood Flow And Metabolism. 2002;22:1425–1439. [PubMed]
- Gunn RN, Lammertsma AA, Hume SP, Cunningham VJ. Parametric imaging of ligand-receptor binding in PET using a simplified reference region model. Neuroimage. 1997;6(4):279–287. [PubMed]
- Jiang CR, Wang JL. Covariate adjusted functional principal components analysis for longitudinal data. 2008 submitted.
- Jiang H, Ogden RT. Mixture modelling for dynamic PET data. Statistica Sinica. 2008;18:1341–1356.
- Joshi AD, Fessler JA, Koeppe RA. Improving PET receptor binding estimates from logan plots using principal component analysis. Journal of Cerebral Blood Flow and Metabolism. 2008;28:852–865. [PMC free article] [PubMed]
- Kimura Y, Senda M, Alpert NM. Fast formation of statistically reliable FDG parametric images based on clustering and principal components. Physics in Medicine and Biology. 2002;47:455–468. [PubMed]
- Kinahan P, Rogers J. Analytic 3D image reconstruction using all detected events. IEEE Transactions on Nuclear Science. 1989;36:964–968.
- Long CJ, Brown EN, Triantafyllou C, Aharon I, Wald LL, Solo V. Non-stationary noise estimation in functional MRI. NeuroImage. 2005;28:890–903. [PubMed]
- Millet P, Ibáñez V, Delforge J, Pappata S, Guimón J. Wavelet analysis of dynamic PET data: application to the parametric imaging of benzodiazepine receptor concentration. Neuroimage. 2000;11:458–472. [PubMed]
- O’Sullivan F. Locally constrained mixture representation of dynamic imaging data from PET and MR studies. Biostatistics. 2006;7:318–338. [PubMed]
- Peng JY, Aston JAD, Gunn RN, Liou CY, Ashburner J. Dynamic positron emission tomography data-driven analysis using sparse bayesian learning. IEEE Transactions on Medical Imaging. 2008;27:1356–1369. [PubMed]
- Ramsay JO, Silverman BW. Functional Data Analysis. 2nd ed. Berlin: Springer; 2005.
- Rice JA, Silverman BW. Estimating the mean and covariance structure non-parametrically when the data are curves. Journal of the Royal Statistical Society: Series B. 1991;53:233–243.
- Slifstein M, Laruelle M. Effects of statistical noise on graphic analysis of PET neuroreceptor studies. Journal of Nuclear Medicine. 2000;41:2083–2088. [PubMed]
- Turkheimer FE, Hinz R, Gunn RN, Aston JAD, Gunn SR, Cunningham VJ. Rank-shaping regularization of exponential spectral analysis for application to functional parametric mapping. Physics in Medicine and Biology. 2003;48:3819–3841. [PubMed]
- Viviani R, Grön G, Spitzer M. Functional principal component analysis of fMRI data. Human Brain Mapping. 2005;24:109–129. [PubMed]
- Yao F, Müller HG, Wang JL. Functional data analysis for sparse longitudinal data. Journal of the American Statistical Association. 2005;100:577–590.
