Home | About | Journals | Submit | Contact Us | Français |

**|**HHS Author Manuscripts**|**PMC5611856

Formats

Article sections

- Abstract
- 1. Introduction
- 2. Methodology
- 3. Theoretical results
- 4. Extension to sparsely and noisy observed functional covariate
- 5. Extension to partial functional linear regression models
- 6. Real data application
- 7. Simulation study
- Supplementary Material
- References

Authors

Related links

J Nonparametr Stat. Author manuscript; available in PMC 2017 September 25.

Published in final edited form as:

Published online 2016 August 20. doi: 10.1080/10485252.2016.1231806

PMCID: PMC5611856

NIHMSID: NIHMS832203

The publisher's final edited version of this article is available at J Nonparametr Stat

We extend four tests common in classical regression - Wald, score, likelihood ratio and F tests - to functional linear regression, for testing the null hypothesis, that there is no association between a scalar response and a functional covariate. Using functional principal component analysis, we re-express the functional linear model as a standard linear model, where the effect of the functional covariate can be approximated by a finite linear combination of the functional principal component scores. In this setting, we consider application of the four traditional tests. The proposed testing procedures are investigated theoretically for densely observed functional covariates when the number of principal components diverges. Using the theoretical distribution of the tests under the alternative hypothesis, we develop a procedure for sample size calculation in the context of functional linear regression. The four tests are further compared numerically for both densely and sparsely observed noisy functional data in simulation experiments and using two real data applications.

Functional regression models have become increasingly popular in the field of functional data analysis, with applications in various areas such as biomedical studies, brain imaging, genomics and chemometrics, among many others. We consider the functional linear model (Ramsay and Dalzell 1991) where the response of interest is scalar and the covariate of interest is functional, and the primary goal is to investigate their relationship. In this article, our main focus is to develop hypothesis testing procedures to test for association between the functional covariate and the scalar response when the functional covariate is observed on a dense grid and corrupted with measurement error. We discuss four testing procedures and investigate the theoretical properties for our recommended tests. The approaches are then extended in two directions: 1) first to the case of noisy and sparsely observed covariates, and 2) second to the partial functional linear model (Shin 2009), which accounts for additional covariates using a linear relationship. The finite sample performance for different realistic scenarios is evaluated numerically via a simulation study. The testing procedures are then applied to two data sets: a Diffusion Tensor Imaging tractography data set, portraying a densely and regularly observed functional covariate situation with missingness; and an auction data on eBay of the *Microsoft Xbox* gaming systems, portraying a sparsely observed functional covariate setting.

In functional linear models, the effect of the functional predictor on the scalar response is represented by an inner product of the functional predictor and an unknown, nonpara-metrically modeled, coefficient function. Typically, such coefficient function is assumed to belong to an infinite dimensional Hilbert space. To estimate the coefficient function, one often projects the functional predictor and the coefficient function onto certain basis systems, such as eigenbasis, or pre-determined basis systems such as spline basis or wavelet basis system to achieve dimension reduction. There is a plethora of literature on estimation of the coefficient function; see for example, Cardot, Ferraty, and Sarda (1999), Yao, Müller, and Wang (2005b). For a detailed review of functional linear model, we refer the readers to Ramsay and Silverman (2005) and the references therein.

Our primarily interest in this article is the problem of testing whether the functional covariate is associated with the scalar response, or equivalently, whether the coefficient function is zero. The problem of testing in the context of functional linear models is important for two main reasons. First, in many real life situations, especially in biomedical studies, evidence for association between a predictor and a response is as valuable as, if not more than, estimation of the actual effect size. In the case when the predictors are functional, estimates of the actual coefficient curves are often hard to interpret and it may not be clear whether the covariate is in fact useful to predict the outcome. Secondly, the tactic of constructing a pre-specified level confidence interval around the estimate and then inverting the interval to construct a test, as is usually done in multivariate situation, is not readily applicable in the functional covariate case. Most of the available literature on functional linear models present point-wise confidence bands of the estimated coefficient functions rather than a simultaneous one. Inverting such a point-wise confidence band to construct a test holds very little meaning. Thus testing for association remains a problem of paramount interest. Unfortunately, the literature in the area of testing for association is relatively sparse and often makes assumptions that are quite strong and impractical.

Cardot, Ferraty, Mas, and Sarda (2003) discussed a testing procedure based on the norm of the cross covariance operator of the functional predictor and the scalar response. Later, Cardot, Goia, and Sarda (2004) proposed two computational approaches by using a permutation and F tests. Hilgert, Mas, and Verzelen (2013) introduced two minimax adaptive procedures to test the nullity of the slope function in the functional linear model. These two methods built strong theoretical support for their test statistics and have good performance numerically. González-Manteiga, González-Rodríguez, Martínez-Calvo, and García-Portuguéss (2014) proposed a bootstrap independence test to achieve the same goal. A key assumption of these approaches is that the functional covariates are observed on dense regular grids, without measurement error. This assumption is not realistic in many practical situations; for example, in both applications considered, the covariates are observed on irregular grids or with measurement error. Müller and Stadtmüller (2005) proposed the generalized functional linear model and studied the analytical expression of the asymptotic global confidence bands of the coefficient function estimator. A Wald test statistic can be derived from the asymptotic properties of this estimator. However, a crucial assumption in that work is also that the functional covariate is observed fully and without error. Additionally, as we observe in our simulation studies, the Wald test statistic is not very reliable for small sample sizes and exhibits significantly inflated Type I error rate even when the functional covariate is observed on very fine grids and without error. Swihart, Goldsmith, and Crainiceanu (2014) addressed a similar testing problem when the setting involves multiple functional covariates; they discussed the restricted likelihood ratio test and investigated its performance numerically, via simulation studies, but did not present its theoretical properties.

In this paper, we consider the situation where the functional predictor is observed at irregular sets of points and is possibly corrupted with measurement error. We investigate four traditional test statistics, namely, score, Wald, likelihood ratio and F test statistics. To facilitate these testing procedures, we mainly rely on the use of the eigenbasis functions, derived from the functional principal component analysis of the observed functional covariates, to model the coefficient function. This method, commonly known as functional principal component regression has been well researched in literature; see for example Müller and Stadtmüller (2005), and Hall and Horowitz (2007).

We use functional principal component analysis and model the coefficient function using the eigen functions derived from the Karhunen-Loève expansion of the covariance function of the predictor. As a result, we re-express the functional linear model as a multiple regression model, where the effect of the functional covariate can be approximated as a linear combination of the functional principal component scores. Traditional tests such as Wald, score, likelihood ratio and F tests are then formulated using the unknown coefficients in the re-written model. Using functional principal component analysis to model the coefficient function has various advantages. First, one can accommodate sparsely observed functional covariates at the subject level, where smoothing of individual curves is practically impossible. In addition, theoretical properties of the functional principal component scores have been studied in a variety of settings: see for example Hall and Hosseini-Nasab (2006), Hall, Müller, and Wang (2006) and Yao, Müller, and Wang (2005a). Finally, functional principal component analysis provides automatic choices of data adaptive, empirical, basis functions, and as such one can readily choose the number of basis functions to be used in the model by looking at the percent of variance explained by the corresponding number of principal components.

This article makes two major contributions. First, we derive the theoretical properties for our recommended tests, namely F test and score test. In particular, we derive the null distributions and asymptotic theoretical alternative distributions, for dense and noisy observations of the functional covariate. Furthermore we develop the asymptotical rate of our tests: the testing procedures are shown to be asymptotically near optimal. Second, as a consequence of our theoretical results, we develop a procedure for sample size calculation in the context of functional linear regression. To the best of our knowledge, this is the first such result in the existing literature. Such sample size calculation procedures are immensely useful when one has a fair idea of what the underlying covariance structure of the functional covariates is, from a pilot or preliminary study, and is interested in determining the sample size of a future larger study within the same cohort.

Our theoretical results are asymptotic, in the sense that they are derived assuming that the sample size is diverging to infinity. While such results are of great interest, it is also important to observe the performance of the testing procedures in finite sample sizes. We investigate numerically the performance of the four tests, when the functional covariate is observed either at regular, dense designs as well as sparse, irregularly spaced designs. The results show that, while all the four test statistics behave very similarly in terms of both Type I error rate and power, for very large sample size, they show different behavior for small and moderate sample sizes. In particular, for small and moderate sample sizes, the likelihood ratio and the Wald tests exhibit significantly inflated Type I error rate in all the designs, while the score test shows a conservative Type I error. On the other hand the F test retains close to nominal Type I error rates and provides larger power than the score test; thus F test may be viewed as a robust testing procedure, even for small sample sizes and sparse irregular designs.

The rest of this article is organized as follows. Section 2 describes the proposed methodology including the model setup and testing procedures. Asymptotic properties of our method are studied in Section 3. Sections 4 and 5 discuss the extension to the sparsely and noisy observed functional data and the partially functional linear model. The testing procedures are applied to two real data sets in Section 6, and evaluated numerically in Section 7.

Suppose for *i* = 1*,..., n*, we observe a scalar response *Y _{i}* and covariates {

$${Y}_{i}=\alpha +{\int}_{\mathcal{T}}{X}_{i}(t)\beta (t)dt+{\epsilon}_{i},$$

(1)

where *ε _{i}* are independently and identically distributed normal random variable with mean 0 and variance

Our goal is to test the null hypothesis that there is no relationship between the covariate *X*(·) and the response *Y*. Formally, the null and the alternative hypotheses can be stated as

$${H}_{0}:\beta (t)=0\phantom{\rule{0.16667em}{0ex}}\text{for}\phantom{\rule{0.16667em}{0ex}}\text{any}\phantom{\rule{0.16667em}{0ex}}t\in \mathcal{T}\phantom{\rule{0.16667em}{0ex}}\text{vs}\phantom{\rule{0.16667em}{0ex}}{H}_{a}:\beta (t)\ne 0\phantom{\rule{0.16667em}{0ex}}\text{for}\phantom{\rule{0.16667em}{0ex}}\text{some}\phantom{\rule{0.16667em}{0ex}}t\in \mathcal{T}.$$

(2)

To the best of our knowledge most of the existing methods, for example Müller and Stadtmüller (2005), Cardot et al. (2003) and Cardot et al. (2004), assume that the functional covariates are observed fully and without noise. In this paper, we consider the case where the functional covariate may be observed densely with measurement error. We develop four testing procedures to test *H*_{0}, study their theoretical properties, and compare their performances numerically.

The idea behind developing the testing procedures is to use an orthogonal basis function expansion for both *X*(·) and *β*(·) and then reduce the infinite dimensional hypothesis testing to the testing for the finite number of parameters by using an appropriate finite truncation of this basis. In this paper, we consider the eigenbasis functions obtained from the covariance operator of *X*(·). Specifically, let the spectral decomposition of the covariance function
$K(s,t)={\sum}_{j=1}^{\infty}{\lambda}_{j}{\varphi}_{j}(s){\varphi}_{j}(t)$, where {*λ _{j}, j* ≥ 1} are the eigenvalues in strictly decreasing order with
${\sum}_{j=1}^{\infty}{\lambda}_{j}<\infty $ and {

However, such a model is impractical as it involves an infinite sum. Instead, we approximate the model with a series of models where the number of predictors
${\{{\xi}_{ij}\}}_{j=1}^{\infty}$ is truncated to a finite number *s _{n}*, which increases with the number of subjects

$${Y}_{i}=\alpha +{\sum}_{j=1}^{{s}_{n}}{\xi}_{ij}{\beta}_{j}+{\epsilon}_{i},$$

(3)

and the hypothesis testing problem can be reduced to

$${H}_{0}:{\beta}_{1}={\beta}_{2}=\dots ={\beta}_{{s}_{n}}=0\phantom{\rule{0.16667em}{0ex}}\text{vs}\phantom{\rule{0.16667em}{0ex}}{H}_{a}:{\beta}_{j}\ne 0\phantom{\rule{0.16667em}{0ex}}\text{for}\phantom{\rule{0.16667em}{0ex}}\text{at}\phantom{\rule{0.16667em}{0ex}}\text{least}\phantom{\rule{0.16667em}{0ex}}\text{one}\phantom{\rule{0.16667em}{0ex}}j,1\le j\le {s}_{n}.$$

(4)

Our model specification allows the coefficient function *β*(·) to be identifiable only within the eigenspace of *X*’s; nevertheless for testing purposes, it is only required that *β*(·) does not lie in the orthogonal complement of this space. The truncation value *s _{n}* is selected with the intention of recovering the full space of

We consider four classical testing procedures, namely Wald, Score, likelihood ratio and F test and examine their application in the context of (3). Define *Y* = (*Y*_{1}*,..., Y _{n}*)

$${L}_{n}(\theta )=-(n/2)log(2\pi {\sigma}^{2})-{(Y-\alpha {1}_{n}-M\beta )}^{\top}(Y-\alpha {1}_{n}-M\beta )/(2{\sigma}^{2}),$$

(5)

where 1* _{n}* is a vector of ones of length

Let *B* = [1* _{n}, M*], and define the projection matrices
${P}_{1}={1}_{n}{1}_{n}^{\top}/n$ and

$${S}_{n}(\theta )={\{-n/2{\sigma}^{2}+{(Y-\alpha {1}_{n}-M\beta )}^{\top}(Y-\alpha {1}_{n}-M\beta )/2{\sigma}^{4},{(Y-\alpha {1}_{n}-M\beta )}^{\top}B/2{\sigma}^{2}\}}^{\top};$$

the corresponding information matrix * _{n}*(

$${T}_{S}={S}_{n}{(\stackrel{\sim}{\theta})}^{\top}{\{{\mathcal{I}}_{n}(\stackrel{\sim}{\theta})\}}^{-1}{S}_{n}(\stackrel{\sim}{\theta})={Y}^{\top}({P}_{B}-{P}_{1})Y/{\stackrel{\sim}{\sigma}}^{2}.$$

The advantage of the score test is that this statistic only depends on the estimated parameters under the model specified by the null hypothesis, and thus it requires fitting only the null model. In contrast to the score test, the advantage of the Wald test is that it only requires to fit the full model. In particular, let = (^{2}*, , *^{})^{} denote the maximum likelihood estimate of *θ* under the full model. Define *V*() to be the variance-covariance matrix of evaluated at , that is, the *s _{n} × s_{n}* submatrix of
${I}_{n}^{-1}(\widehat{\theta})$ corresponding to

$${T}_{W}={\widehat{\beta}}^{\top}{\{V(\widehat{\beta})\}}^{-1}\widehat{\beta}.$$

In this work, we consider a slightly modified version of this statistic, where ^{2} is replaced by the restricted maximum likelihood estimate
${\widehat{\sigma}}_{\mathit{REML}}^{2}={Y}^{\top}({I}_{n\times n}-{P}_{B})Y/(n-{s}_{n}-1)$, rather than the usually used maximum likelihood estimate. In our simulation study, we found that Wald test with the restricted maximum likelihood estimate for *σ*^{2} yields considerably improved results in terms of Type I error when the sample size is small; even with this adjustment the Type I error is significantly inflated. For large sample sizes, the performance of the Wald test is similar for the two types of estimates for *σ*^{2}.

Next we consider the likelihood ratio test statistic. Usually, this statistic is defined as −2{*L _{n}*(

$${T}_{L}={s}_{n}+nlog({\stackrel{\sim}{\sigma}}_{\mathit{REML}}^{2}/{\widehat{\sigma}}_{\mathit{REML}}^{2}),$$

where
${\stackrel{\sim}{\sigma}}_{\mathit{REML}}^{2}={Y}^{\top}({I}_{n\times n}-{P}_{1})Y/(n-1)$ is the restricted maximum likelihood estimate for *σ*^{2} under the null model. Notice that one needs to fit both the full and the null model to compute this test statistic.

Finally, we define the F test in terms of the residual sum of squares under the full and the null models. In particular, define *RSS*_{full} = *Y*^{}(*I _{n×n}* −

$${T}_{F}=\frac{({\mathit{RSS}}_{\text{red}}-{\mathit{RSS}}_{\text{full}})/{s}_{n}}{{\mathit{RSS}}_{\text{full}}/(n-{s}_{n}-1)}=\frac{{Y}^{\top}({P}_{1}-{P}_{B})Y/{s}_{n}}{{Y}^{\top}({I}_{n\times n}-{P}_{B})Y/(n-{s}_{n}-1)}.$$

Similar to the modified likelihood ratio test, computation of the F test statistic also requires fitting of both the full and the null models.

The test statistics discussed above are based on the true functional principal component scores. In practice, these scores are unknown and need to be estimated. Estimation of the functional principal component scores has been previously discussed in the literature; for example Yao et al. (2005a), and Zhu, Yao, and Zhang (2014). For completeness, we summarize the common approaches in the Supplementary Material. There are various approaches to estimate the number of functional principal component scores, *s _{n}*. A very popular approach in practice is based on the cumulative percentage of explained variance of the functional covariates; commonly used threshold values are 90%, 95%, and 99%. The choice of

Once the truncation level *s _{n}* and the functional principal component scores are estimated, the testing procedures are obtained by substituting them with their corresponding estimates. Specifically, let be matrix of the estimated functional principal component scores,

In this paper, we focus on the eigenbasis function, and expand both functional predictor and coefficient function on the eigenbasis. Actually, one may also use pre-determined basis systems such as spline or wavelet. These are interesting topics for future research.

As discussed in Section 2, the tests considered - Wald, score, likelihood ratio, and F -resemble their analogue for multivariate covariates, with a few important differences: 1) the number of true functional principal components, *s _{n}*, is not known and thus is approximated, and 2) the functional principal component scores

First, we present the results of the asymptotic distribution of the test statistics under *H*_{0}; all the proofs are included in the Supplementary Material. For the distributions discussed in this section, we refer to the distributions conditional on the original curve *X _{i}*(·) and the observed data points {

*Assume that X _{i}*(·)

Under the null hypothesis and conditioning on the number of functional principal components, the distributions of these test statistics are similar to their counterparts in multiple regression. No matter whether measurement error exists, the null distribution would be exactly the same. In particular, for fixed truncation value *s _{n}*, the null distribution of the F test statistic behaves like

Next, we consider the distribution of the tests under the alternative distribution *H _{a}* :

We begin with describing the assumptions required by our theoretical developments. With a slight abuse of notation, let *C* denote a generic constant term. Recall that {*λ _{j}, j* ≥ 1} are the eigenvalues in strictly decreasing order with
${\sum}_{j=1}^{\infty}{\lambda}_{j}<\infty $, we define
${\delta}_{j}=\underset{1\le k\le j}{min}({\lambda}_{k}-{\lambda}_{k+1})$.

- The number of principal components selected,
*s*, satisfies the condition_{n}*s*→ ∞ and ${\delta}_{{s}_{n}}^{-1}{s}_{n}=o({n}^{1/2})$._{n}

Condition (A) concerns the divergence of the number of functional principal components with *n*. Specifically, it is assumed that this divergence depends on the spacing between adjacent eigenvalues. Our assumption allows *s _{n}* to be diverging, but at a much slower rate than

- (B1)For all
*C*> 0 and some*ε*> 0,$$\begin{array}{rrr}\hfill \underset{t\in \mathcal{T}}{sup}\{E{\mid {X}_{i}(t)\mid}^{C}\}& \hfill <& \hfill \infty \\ \hfill \underset{{t}_{1},{t}_{2}\in \mathcal{T}}{sup}(E[{\{{\mid {t}_{1}-{t}_{2}\mid}^{-\epsilon}\mid {X}_{i}({t}_{1})-{X}_{i}({t}_{2})\mid \}}^{C}])& \hfill <& \hfill \infty .\end{array}$$ - (B2)For all integers
*r*≥ 1, ${\lambda}_{j}^{-r}E{({\int}_{\mathcal{T}}[{X}_{i}(t)-E\{{X}_{i}(t)\}]{\varphi}_{j}(t)dt)}^{2r}$ is bounded uniformly in*j*.Assumptions (B1)-(B2) are common in functional data analysis; see Hall and Hosseini-Nasab (2006). For example, (B1) and (B2) are met when we have a Gaussian process with Hölder continuous sample paths; see Hall and Hosseini-Nasab (2006) for detail.

Denote the bandwidth used for each individual smoothing of the *i*th curve as *h _{i}*. Suppose the support of each trajectory

- (C1)Let
*X*^{(}^{k}^{)}(*t*) be the*k*th derivative of*X*(*t*). Assume that*X*^{(2)}(*t*) is continuous onwith probability 1 and ∫_{d}_{}*E*[{*X*^{(}^{k}^{)}(*t*)}^{4}]*dt*< ∞ with probability 1 for*k*= 0*,*2. Also assume that $E\{{e}_{ij}^{4}\}<\infty $, where*e*’s are independent and identically distributed, and independent of_{ij}*X*(·)._{i} - (C2)Assume there exists
*m**m*(*n*) → ∞ such that $\underset{1\le i\le n}{min}{m}_{i}\ge m$ as*n*→ ∞, and $\underset{1\le i\le n}{max}\underset{2\le k\le {m}_{i}}{max}\{{t}_{ik}-{t}_{i(k-1)}\}=O({m}^{-1})$. - (C3)Assume there exists a sequence
*h*=*h*(*n*), such that $ch\le \underset{1\le i\le n}{min}{h}_{i}\le \underset{1\le i\le n}{max}{h}_{i}\le Ch$ for some constant*C*≥*c*> 0. Furthermore,*h*→ 0 and*m*→ ∞ as*n*→ ∞ in rates that (*mh*)^{−1}+*h*^{4}+*m*^{−2}=*O*(*n*^{−1}). Also assume that the kernel function*K*(·*,*·) is compact supported and Lipschitz continuous.Assumptions (C1)–(C3) are regularity assumptions for the functional predictor process*X*(*t*) for the dense design. They are similar to the Conditions 1–3 in Zhu et al. (2014). Under assumption (C3), we obtain*m*≥*Cn*with^{κ}*κ*≥ 1/2. For example, if*m*achieves order*n*^{1/2}, we require that*h*is between the rate*n*^{−1/4}and*n*^{−1/2}.

For a function *β _{a}*(·), denote ||

*Assume the conditions (A),(B1)(B2),(C1)–(C3) are met. Then under the assumption that H _{a} : β*(·) =

- $\{(1+\int {\beta}_{a}({t}_{1}){\beta}_{a}({t}_{2})K({t}_{1},{t}_{2}){dt}_{1}{dt}_{2}){T}_{S}-{s}_{n}-{\mathrm{\Lambda}}_{n}\}/\sqrt{2{s}_{n}}{\to}^{d}N(0,1)$,
- $\{{s}_{n}{T}_{F}-{s}_{n}-{\mathrm{\Lambda}}_{n}\}/\sqrt{2{s}_{n}}{\to}^{d}N(0,1)$,
*where*Λ=_{n}*n*∫*β*(_{a}*t*_{1})*β*(_{a}*t*_{2})*K*(*t*_{1}*, t*_{2})*dt*_{1}*dt*_{2}(1 +*o*(1))._{P}

The proof is included in the Supplementary Material. We want to emphasize that *β _{a}*(·) is some function that is fixed before observing the data; in particular, we exclude the case

The results presented by Theorems 3.1 and 3.2 are asymptotic results and, while they are interesting, they require large sample sizes to ensure the correct Type I error probability. In practice all the testing procedures discussed above behave like the usual *χ*^{2} and *F*-distributions with appropriate degrees of freedom, which depend on the sample size *n*. If the null hypothesis, that *β*(*t*) = 0 for all *t*, is true, as in Theorem 3.1, we have that: (i) *T _{S}* behaves like
${\chi}_{{s}_{n}}^{2}$, (ii)

Our empirical investigation showed that these approximate null distributions are substantially more accurate, in terms of Type I error probability. Because of this reason, we use these null distributions in our simulation study.

The alternative distributions discussed in *Remark 2* can be used for sample size calculation. We briefly illustrate the ideas using the F test, *T _{F}*. Let

Next we present the rate of our testing procedures. The following corollary shows that these tests can detect a local alternative of order
$\sqrt{{s}_{n}}/n$; thus they achieve the same optimal rate as the goodness-of-fit test for the high dimensional linear model when the number of parameters is *s _{n}* (Verzelen and Villers 2010).

*Let β _{a}*(

In practical applications, we often observe sparse realizations of the functional covariate which in addition are corrupted with measurement error; this setting is commonly known as ‘sparse design’. Our testing procedures can still be applied to this scenario, with the difference that the estimated functional principal component scores account for the sparse design. In particular, the two step procedure of first curve smoothing and then functional principal component analysis of Zhang and Chen (2007) is no longer applicable. Instead, the conditional expectation method (Yao, Müller, and Wang 2005a) is used; to avoid redundancy, the estimation procedure is detailed in the Supplementary Material. The null distributions of the testing procedures are similar to the dense design case: we use chi-square with degree of freedom of *s _{n}* for the null distribution for

*Under the sparse design, assume that X _{i}*(·)

- $({T}_{S}-{s}_{n})/\sqrt{2{s}_{n}}{\to}^{d}N(0,1)$,
- $({s}_{n}{T}_{F}-{s}_{n})/\sqrt{2{s}_{n}}{\to}^{d}N(0,1)$.

We emphasize that the asymptotic null distribution holds true irrespective of the sampling design (sparse or dense) of the functional covariates, or whether the functional covariate is measured with noise. In particular we can still use the approximate null distributions *F*_{sn,n−sn−1} and
${\chi}_{{s}_{n}}^{2}$ for *T _{F}* and

Often, of interest, is to investigate the association between a scalar response and a functional covariate, while accounting for other covariate information that is available. For example, in our tractography study the interest is to test for the association between the cognitive score of multiple sclerosis patients and their fractional anisotropy along the white matter tract by accounting for the patients’ sex and age; see Section 6.1 for details. Thus model (1) cannot be used per se; however it can be modified to account for additional covariates.

More generally, we define the following modeling framework. Let the observed data be [*Y _{i},* {

$${Y}_{i}={Z}_{i}^{\top}\alpha +{\int}_{\mathcal{T}}{X}_{i}(t)\beta (t)dt+{\epsilon}_{i},$$

(6)

where *X _{i}*(·) is the true functional predictor,

The objective is to test the hypothesis *H*_{0} : *β*(*t*) = 0 for all *t*, by accommodating nuisance parameters using the modeling framework (6). The four testing procedures can be easily extended to this setting. As in Section 2.2, the approach is based on using a (3), obtained by approximating the model using a truncated number *s _{n}* of the functional principal component scores. Let

The score function and the information matrix can be derived accordingly; the Wald, likelihood ratio and F test statistics follow easily. In particular, the maximum likelihood estimate of *σ*^{2} is ^{2} = *Y*^{}(*I _{n×n}* −

In practice, the test statistics are calculated based on the estimated functional principal component scores, and thus based on the estimated design matrix , as detailed in Section 2.2. Under the null hypothesis that *β*(·) 0, the null distribution of *T _{W}*,

A more general extension is the partially functional linear regression model with multiple functional predictors (Kong, Xue, Yao, and Zhang 2016).

$${Y}_{i}={Z}_{i}^{\top}\alpha +\sum _{\ell =1}^{d}{\int}_{{\mathcal{T}}_{\ell}}{X}_{i\ell}(t){\beta}_{\ell}(t)dt+{\epsilon}_{i},$$

(7)

where *X _{i}*(·) is the th functional predictor,

The four testing procedures can be easily extended to this setting. Suppose we select *s _{n}* principal components for the th functional predictor

Consider our motivating application, the Diffusion Tensor Imaging tractography study, where we investigate the association between cerebral white matter tracts in multiple sclerosis patients and cognitive impairment. The study has been previously described in Greven, Crainiceanu, Caffo, and Reich (2010); Staicu, Crainiceanu, Ruppert, and Reich (2012); Goldsmith, Feder, Crainiceanu, Caffo, and Reich (2011), and we discuss it briefly here. Multiple sclerosis is a demyelinating autoimmune disease that is associated with lesions in the white matter tracts of affected individual and results in severe disability. Diffusion Tensor Imaging is a magnetic resonance imaging technique that allows the study of white matter tracts by measuring the diffusivity of water in the brain: in white matter tracts, water diffuses anisotropically in the direction of the tract. Using measurements of diffusivity, Diffsion Tensor Imaging can provide relatively detailed images of white matter anatomy in the brain (Basser, Mattiello, and LeBihan 1994; Basser, Pajevic, Pierpaoli, and Duda 2000). Some measures of diffusion are fractional anisotropy, and parallel diffusivity among others. For example, fractional anisotropy is a function of the three eigenvalues of the estimated diffusion process that is equal to zero if water diffuses perfectly isotropically, such as Brownian motion, and to one if water diffuses anisotropically, such as for perfectly organized and synchronized movement of all water molecules in one direction. The measurements of diffusion anisotropy are obtained at every voxel of the white matter tracts; in this analysis, we consider averages of water diffusion anisotropy measurements along two of the dimensions, which results in a functional observation with scalar argument that is sampled densely along the tract.

Here we study the relationship between the fractional anisotropy along the two well identified white matter tracts, corpus callosum and left corticospinal tracts, and the multiple sclerosis patient cognitive function, as measured by the score at a test, called Paced Auditory Serial Addition Test. Specifically, each multiple sclerosis subject is given numbers at three second intervals and asked to add the current number to the previous one. The score is obtained as the total number of correct answers out of 60.

The study, in its generality, comprises 160 multiple sclerosis patients and 42 healthy controls observed at multiple visits spanning up to four years. For each subject, at each visit, are recorded: diffusion anisotropy measurements along several white matter tracts at many hospital visits, as well as additional information such as age, gender and so on. In this analysis, we use the measurements obtained at the baseline visit. Because Paced Auditory Serial Addition Test was only administered to multiple sclerosis subjects, we limit our analysis to the multiple sclerosis group. Few subjects do not have Paced Auditory Serial Addition Test scores recorded and they are removed from the analysis, leaving 150 multiple sclerosis patients in the study. Part of these data is available in the R-package refund (Crainiceanu et al. (2012)). For illustration, Figure 1 shows the fractional anisotropy along the corpus callosum (left panel) and corticospinal tracts (middle) tracts, and the Paced Auditory Serial Addition Test scores (right panel) for all the subjects in the study. Depicted in solid black/solid gray /dashed black are the fractional anisotropy measurements of three different subjects, with each line type representing a subject. Our goal is to test for association between the Paced Auditory Serial Addition Test score in multiple sclerosis patients and the fractional anisotropy along the corpus callosum and the corticospinal tracts, while accounting for age and gender.

Fractional anisotropy profiles along corpus callosum (left) and corticospinal tracts (middle) and the associated Paced Auditory Serial Addition Test scores (right panel) in the group of multiple sclerosis patients. Depicted in different colors and line/symbols **...**

Consider first the corpus callosum tract, which has an important role in the cognition function. Fractional anisotropy is measured at 93 locations along this tract: the measurements include missingness and measurement error. Using our notation, let *W _{ij}* denote the noisy fractional anisotropy observed at location

As discussed in Section 2 the preliminary step of the hypothesis testing is the estimation of the subject specific functional principal component scores corresponding to the fractional anisotropy profiles along the corpus callosum tract. We use functional principal component analysis through conditional expectation Yao et al. (2005a), and select the number of eigenfunctions using the cumulative explained variance. The results yield that five eigenfunctions are required to explain 90% of the variability in the data, while 15 are required to explain 99% of the variability. For stability reasons, we take a more conservative approach and select the number of eigenfunctions using 90% cumulative explained variance. Figure 2, top three panles and bottom two left most panels display the estimated leading eigenfunctions along with the corresponding estimated eigenvalues; the variance of the measurement error in the functional covariate is estimated to ${\stackrel{\sim}{\sigma}}_{e}^{2}=0.002\times {10}^{-2}$.

Panels (a),(b),(c), (d), and (e) show the top five estimated eigenfunctions _{k}(·), along with the associated eigenvalues, _{k} (at a scale of 10^{−2}). Panel (f) shows the estimated coefficient function **...**

Then, we test whether the coefficient function *β*(·) is zero along the corpus callosum, by accounting for age and gender effects using the methods discussed in Section 2.2. Figure 2 the bottom right panel depicts the estimated regression function for the fractional anisotropy along the corpus callosum, (*t*). It indicates that the multiple sclerosis subjects who have a higher than average fractional anisotropy along the middle area of corpus callosum tend to score higher on the Paced Auditory Serial Addition Test. The *p*-value for testing that *β*(·) = 0 reported by the F statistic equals 2.33*×*10^{−4}, indicating very strong evidence of association. This result is consistent across the other testing procedures: the modified likelihood ratio test *p*-value is 1.57*×*10^{−4}, the Wald *p*-value is 1.03*×*10^{−4}, while the score *p*-value is 3.42 *×* 10^{−4}. As one annonymous reviewer suggested, we also provide the other estimated model components, for completeness. The estimated intercept is _{1} = 44.205, the estimated effects associated with the gender and age are _{2} = −0.979 and _{3} = −0.305 respectively, and the estimated model variance is ^{2} = 144. These results confirm our expectation that the cognitive performance of the multiple scelrosis subjects, as assessed via the Paced Auditory Serial Addition Test, is negatively associated with age and furthermore show that it tends to be lower for women than men.

Next, we are interested to assess whether the fractional anisotropy along the left corticospinal tract adds significantly to a model fit for the Paced Auditory Serial Addition Test score with the fractional anisotropy along the corpus callosum. Fractional anisotropy is measured at 55 locations along the corticospinal tracts; the missingness along this tract is notably larger than along the corpus callosum. We assume modeling framework as in (7),
${Y}_{i}={Z}_{i}^{T}\alpha +{\int}_{{\mathcal{T}}_{\mathit{CCA}}}{\beta}_{\mathit{CCA}}(t){X}_{i1}(t)dt+{\int}_{{\mathcal{T}}_{\mathit{lCST}}}{\beta}_{\mathit{lCST}}(t){X}_{i2}(t)dt+{\epsilon}_{i}$; here *Y _{i}* and

As before, we first apply functional principal component analysis to both sets of functional covariates; for the fractional anisotropy along the left corticospinal tract we select the number of eigenfunctions using 90% explained variance (which results to 8 eigenfunctions) and estimate the functional principal component scores. The percentage of explained variance was again selected for stability reasons; in particular 99% variability is explained by 15 eigenfunctions. Using the methods discussed in the paper to assess the testing hypothesis of no relationship between the test score and the fractional anisotropy along the left corticospinal tract while accounting for the other covariates, we obtain a *p*-value of 0.0771 using *F* test (0.0624 with modified likelihood ratio test, 0.0670 using Wald and 0.0641 with score test statistic). Thus there is no significant relationship between the cognitive function as assessed by Paced Auditory Serial Addition Test and the corticospinal tracts tract, as measured by fractional anisotropy at level of significance 5%, when the model accounts for fractional anisotropy along the corpus callosum.

Overall, our findings corroborate the specialists’ prior expectations that the cognitive function is associated with the corpus callosum tract. Further results (not included here) show surprising association of the cognitive function with the corticospinal tract, when the model accounts only for age and gender. However as we show above, the association is not significant if the model accounts for the corpus-callosum fractional anisotropy. Interestingly, both findings are in agreement with Swihart et al. (2014), who used the fractional anisotropy along the two tracts of the multiple sclerosis subjects measured at all the available hospital visits and a restricted likelihood ratio-based testing approach.

Next, we consider an application from electronic commerce (eCommerce) field. The eBay auction data set (Wang, Jank, and Shmueli 2008) consists of time series of bids placed over time for 172 auctions for *Microsoft Xbox* gaming systems, which are very popular items on eBay. For each auction, the associated time series is composed of bids made by users located at various geographical locations, and thus it shows very uneven features. In addition, the time between the start and the end of an auction varies across auctions, and furthermore the actions duration varies across actions. Nevertheless, as Jank and Shmueli (2006) argues “bidding in eBay auctions tends to be concentrated at the end, resulting in very sparse bid-arrivals during most of the auction except for its final moments, when the bidding volume can be extremely high”. The dynamics of the bids has attracted large interest, especially in the literature of functional data (Liu and Müller 2008). Here we investigate whether the dynamics of the bids in the first part of the auction duration is related to the auction’s closing price.

To handle the challenge of different starting times and durations of the auctions, we think of the bids for an action as varying with the percentile of the auction length (see also Jank and Shmueli (2006)). For example if an auction has a length of 7 days, then the bid placed in the 5th day from the starting time corresponds to 71.4 percentile of the auction’s duration. Here we focus on the bids placed in the first 71.4% of the auction’s duration, and study whether their dynamics influences various measures of the closing price of the auction. To be specific define the formation of the price during the first 71.4% of duration of an action as the process of interest observed with noise. Using the notation in Section 2, let *W _{ij}* denote the bid placed for action

Before we tackle these two important problems, we carefully examine the data. A close inspection confirms that most auctions have a duration of at least 7 days and thus the auctions with length less than 7 days are removed. Also we remove all the auctions for which there is only one bid in the first 71.4% of the auction duration. The remaining data set contains bids from 125 Xboxes auctions. Moreover, for very action, the number of bids placed in the first 71.4% of the auction’s duration, varies between 2 to 14. Our analysis regards the observed partial auction curve as a noisy functional predictor observed at sparse and irregular time points in = [0*,* .714].

For the first objective, the response for each action *i*, is taken as the relative change in the final price, as defined as *Y _{i}* = (

Panels (a),(b),(c), and (d) show the top four estimated eigenfunctions _{k}(·), along with the associated eigenvalues, _{k} (at a scale of 10^{3}).

One might also be interested in the relationship between the auction price during the first part of the week and the final price. We performed this analysis and found the following results: *p*-value reported by the F statistic is 0, the likelihood ratio *p*-value is 0, the Wald *p*-value is 0, and the score *p*-value is 9.2 *×* 10^{−15}. These results indicate very strong evidence of association between the price at the beginning of the week and the final price.

Next, we turn to the second objective, and re-define the response for each action *i*, as the rate of change in the final price. Specifically let *Y _{i}* = (

The performance of the Wald, score, modified likelihood ratio test and F tests in terms of Type I error and power is investigated in a simulation experiment. First we consider a functional linear model and study the tests performance under various sample sizes and sampling designs for the functional covariate (Section 7.1). Moreover, we illustrate how to use the asymptotic alternative distribution of the tests to calculate the ideal sample size to detect a specified alternative (Section 7.2). Finally, we consider a partial functional linear model, in an attempt to mimic the Diffusion Tensor Imaging data generation process, and evaluate the tests performance, when the model is misspecified (Section 7.3).

The underlying generating process for the *i*th functional covariate is *X _{i}*(

We consider there types of sampling designs for the functional covariate.

- Design 1: (Dense design). The observed points on each curve are an equally spaced grid of 300 points in [0
*,*10]. - Design 2: (Moderately sparse design with a few points). The number of points per curve,
*m*, is moderate and varies across subjects. Specifically,_{i}*m*is chosen randomly from a discrete uniform distribution on {5_{i}*,*6*,*7*,*8*,*9*,*10}. Each curve is assumed to be observed at*m*points that are randomly selected from the set of 501 equally spaced points in [0_{i}*,*10]. - Design 3: (Very sparse design). The number of points per curve is small and varies across subjects. Similar generating process of the sampling points as Design 2, with exception that the number of measurements
*m*is chosen from a discrete uniform distribution on {2_{i}*,*3*,*4}.

The response *Y _{i}* is generated from model (1), where

$${\beta}_{c}(t)=c{\{1+exp\phantom{\rule{0.16667em}{0ex}}(1-0.1t)\}}^{-1},$$

(8)

where *c* ≥ 0 is a parameter that controls the departure from the null function. The performance of the tests was assessed in testing the hypothesis *H*_{0} : *β*(·) 0, when the sample size increases from 50 to 500. For Type I error rate performance, we consider data generated from the above model when *β*(·) = 0 corresponding to *c* = 0. For power performance, we consider *β*(·) = *β _{c}*(·) corresponding to

The four tests were calculated as described in Section 2, after having estimated the functional principal component scores as a preliminary step. For the latter, the estimation of the functional principal component scores was obtained using the Matlab package, PACE, available at http://anson.ucdavis.edu/~ntyang/PACE. The number of functional principal components is selected such that the cumulative explained variance is 99%; other threshold levels have been also investigated, and the results remained in general unchanged. We used 5000 simulated data sets are used to estimate the Type I error rate and 1000 simulated data sets to estimate the power.

The results are presented in Figure 4, and correspond to fixing the level of significance at 5%. Figure 4 (a) shows the performance of the tests with respect to Type I error rate for various sampling designs and as the sample size increases from 50 to 500. In particular, F test gives reasonable Type I errors for all the designs and various sample sizes. The score test seems to be somewhat conservative for small samples for all the sampling designs, while Wald and the modified likelihood ratio test indicate an inflated Type I error for small and moderate sample sizes (*n* = 50 or *n* = 100). For large sample size (*n* = 500), all of the tests give Type I error rates close to the nominal level.

Panel (a) shows the estimated Type I error (depicted as the height of the bars) for all the four tests in nine settings obtained from combining three sampling designs and sample sizes when the nominal level is 5% (horizontal dashed red line). The bars **...**

Figure 4 (b)–(d) display the power performance of the tests for the dense sampling design and various sample sizes. The tests have comparable power for all sample sizes investigated. The results are similar for the other two designs and are included in the Supplementary Material: as expected, the power of the tests decreases with the sparseness of the design.

One neat property of selecting the number of principal components using the cumulative percentage of explained variance of the functional covariates, and thus not involving the data *Y*, is that the technique does not require multiple testing correction. For example, Hilgert et al. (2013) proposed AIC-based selection of the number of principal components using some grid search {*S* := 1*,* 2*,* 4*,* 8 ... *s _{max}*} combined with a multiple testing procedure. However, the downside of our approach is that selecting an unnecessarily large number of components may result in a loss of power of the testing procedure. To gain more insight, we compared numerically our proposed methods and the minimax adaptive testing method proposed in Hilgert et al. (2013); this comparison is included in the Supplementary Material, Section 5.2, due to space limitation of the paper. Furthermore we compared the F test with our discussed asymptotic null distribution with two other alternatives: 1) the F test with the bootstrap-based approximation of its null distribution discussed in González-Manteiga et al. (2014) and 2) the likelihood ratio test of Swihart et al. (2014) for single functional covariate. The results are described in the Supplementary Material. Some of the competitive methods, namely the minimax and the boostrap methods, are not applicable to the case when the functional covariate is measured at an irregular sparse design, nor corrupted with measurement error; thus we restricted the comparison to the dense design scenario and when the covariate is measured without noise. We found that the Type I error is significantly inflated for the likelihood ratio test; this is in agreement with our own numerical experience, that the likelihood ratio test yields inflated Type I error. The results show similar performance in terms of Type I error rate and power for the minimax and the bootstrap methods. The advantage of our methods, over with the minimax and the bootstrap methods, is that they accommodate irregular or sparse designs and measurement error in the functional covariate. Finally, we investigated the robustness of the testing procedures to normality: the true functional covariate is generated such that the functional principal component scores have the scaled

We have also conducted simulations for the case when *X _{i}*(

In this section, we discuss how to employ the asymptotic distribution of the tests under the alternative hypothesis to calculate appropriate sample sizes for detection of the effect, when both the power and the precision are a priori specified. This research direction is novel and has not been addressed hitherto in the literature of functional data analysis. We begin by assessing the accuracy of the asymptotic distribution of the tests under the alternative hypothesis in finite sample sizes. The intuition is that if the alternative asymptotic distribution of a test has good performance in finite samples, then this distribution can be used for sample size calculation, just as in typical linear regression.

Consider model (1) where the response *Y _{i}* is generated as described in the previous section, and the covariate

Figure 5 (a) displays the power of the F test, as a function *c*, when the level of significance is fixed at 5%. Empirical and theoretical power curves are compared for varying sample sizes, *n* = 50, *n* = 100 and *n* = 500. The empirical power curves (dashed lines) are basically the power curves of the F test that are shown in Figure 4 panels (b)–(d) and restricted to the domain (0*,* 0.1]. Theoretical power curves (solid lines) are calculated using R software to compute various probabilities and quantiles corresponding to *F* distribution of various degrees and different values for the non-centrality parameter.

Panel (a) shows the empirical (dashed line) and theoretical (solid) power curves for Design 1, and different sample sizes. Panel (b) displays theoretical power curves corresponding to several sample sizes: 50, 100, 150, 200, 300, 400, 500 (from bottom **...**

For fixed sample sizes, the theoretical and empirical power curves are very close, indicating that the asymptotic distribution of the F test under alternative is reliable for calculation of sample sizes. For example, consider model (1), assume that there is a linear association between the response and the functional covariate, and that the true regression parameter is *β*(·) = *β*_{0.08}(·). Then, corresponding to a power level of at least 80%, the smallest sample size at which one can detect significant association at tolerance level of 0.05 is *n* = 150. In Figure 5 (b) this is represented by tracing up the vertical line at *c* = 0.08 that corresponds to parameter function *β*_{0.08}(·) to intersect the power curves of different sample size, at different power levels. The smallest sample size at which the power level is at least 80% is the desired sample size.

The sample size calculation is illustrated on the F test, mainly because the alternative asymptotic distribution of this test is very accurate, even for smaller samples. In particular Wald and the modified likelihood ratio tests yield inflated Type I error rate for moderately large sample size. For the score test, close agreement between the asymptotic and empirical power approximation occurs when the sample size is large. Because of these considerations, our recommendation is to use F test for sample size calculations.

The sample size calculation is an important novelty of this paper. Indeed it depends on the true covariance surface *K*(·*,* ·) and the desired magnitude of effect one hopes to detect with the test. The logic follows the typical sample size calculation techniques in classical regression models. Nevertheless, the complexity of the quantities involved in this calculation arises from the complex objects - random functions - that we deal with. To be more specific, one can use pilot studies to estimate this covariance surface and calculate the sample size for future, much larger studies depending on the magnitude of the effect.

Next, we investigate the performance of the tests in a partial functional liner model setting that mimics the Diffusion Tensor Imaging data generation process, and we study the robustness of the results when the distribution of the errors is not Gaussian. In particular consider the case-study, where of interest is the association between the Paced Auditory Serial Addition Test score and the fractional anisotropy profiles along the corpus callosum tract in multiple sclerosis, while accounting for the gender and age of the patients; see Section 6.1. We analyze these data using the partial functional linear model approach discussed in Section 5; in the interest of space, the model components estimates are given in the Supplementary Material. We use these estimates to perform a simulation experiment for partial functional linear model.

The estimated eigenfunctions and eigenvalues, are used to obtain the generating process for the underlying functional covariates {*X _{i}*(

The four tests are applied, as discussed in Section 2, where for consistency with the real data analysis, the number of functional principal components is selected using a threshold level of 90% for the cumulative explained variance. Type I error is estimated based on 5000 simulations when data are generated under the assumption that *β*(·) 0, and the power is estimated based on 1000 simulations when data are generated under the assumptions that *β*(·) = *c*(·) for *c* > 0, for various values of *c*.

Table 1 gives the results separately for the two models for the error distribution, when the significance level is 5%. Overall it appears that all the tests are robust to the model mispecification: both the Type I error rate and various powers of the tests seem to be similar under the two error distributions considered. Furthermore, the Type I error rates are close to the nominal level for the score and F tests, while they seem somewhat inflated for the Wald and the modified likelihood ratio tests. All the tests have comparable powers.

A.-M. Staicu’s research was supported by U.S. National Science Foundation grant numbers DMS 1007466 and DMS 0454942 and the U.S. National Institute of Health grants R01 NS085211 and R01 MH086633. A. Maity’s research was supported by U.S. National Institute of Health grant R00ES017744. We thank Ciprian Crainiceanu, Daniel Reich, the National Multiple Sclerosis Society, and Peter Calabresi for the diffusion tensor imaging dataset.

*AMS Subject Classification*: 62G08, 62G10

Supplementary material available online includes details of the estimation of the functional principal component scores, complete proofs of the two main theorems, the expressions of the testing procedures for partial functional linear model, and additional simulations.

- Basser P, Mattiello J, LeBihan D. MR diffusion tensor spectroscopy and imaging. Biophysical Journal. 1994;66:259–267. [PubMed]
- Basser P, Pajevic S, Pierpaoli C, Duda J. In vivo fiber tractography using DT-MRI data. Magnetic Resonance in Medicine. 2000;44:625–632. [PubMed]
- Cardot H, Ferraty F, Sarda P. Functional linear model. Statistics & Probability Letters. 1999;45:11–22. http://dx.doi.org/10.1016/S0167-7152(99)00036-X.
- Cardot H, Ferraty F, Mas A, Sarda P. Testing hypotheses in the functional linear model. Scandinavian Journal of Statistics. 2003;30:241–255. http://dx.doi.org/10.1111/1467-9469.00329.
- Cardot H, Goia A, Sarda P. Testing for No Effect in Functional Linear Regression Models, Some Computational Approaches. Communications in Statistics - Simulation and Computation. 2004;33:179–199.
- Crainiceanu C, (Coordinating authors), PRS, Goldsmith J, Greven S, Huang L, (Contributors), FS refund: Regression with Functional Data. 2012 http://CRAN.R-project.org/package=refund. R package version 0.1-5.
- Goldsmith AJ, Feder J, Crainiceanu CM, Caffo B, Reich D. Penalized Functional Regression. Journal of Computational and Graphical Statistics. 2011;20:830–851. [PMC free article] [PubMed]
- González-Manteiga W, González-Rodríguez G, Martínez-Calvo A, García-Portuguéss E. Bootstrap independence test for functional linear models. 2014 unpublished manuscript.
- Greven S, Crainiceanu C, Caffo B, Reich D. Longitudinal functional principal component analysis. Electronic Journal of Statistics. 2010;4:1022–1054. [PMC free article] [PubMed]
- Hall P, Horowitz JL. Methodology and convergence rates for functional linear regression. The Annals of Statistics. 2007;35:70–91. http://dx.doi.org/10.1214/009053606000000957.
- Hall P, Hosseini-Nasab M. On properties of functional principal components analysis. Journal of the Royal Statistical Society, Series B. 2006;68:109–126.
- Hall P, Müller HG, Wang JL. Properties of principal component methods for functional and longitudinal data analysis. The Annals of Statistics. 2006;34:1493–1517. http://dx.doi.org/10.1214/009053606000000272.
- Hilgert N, Mas A, Verzelen N. Minimax adaptive tests for the functional linear model. The Annals of Statistics. 2013;41:838–869.
- Jank W, Shmueli G. Functional data analysis in electronic commerce research. Statistical Science. 2006;21:155–166. http://dx.doi.org/10.1214/088342306000000132.
- Kong D, Xue K, Yao F, Zhang HH. Partially functional linear regression in high dimensions. Biometrika. 2016;103:147–159. http://dx.doi.org/10.1093/biomet/asv062.
- Li Y, Wang N, Carroll RJ. Generalized functional linear models with semipara-metric single-index interactions. Journal of the American Statistical Association. 2010;105:621–633. http://dx.doi.org/10.1198/jasa.2010.tm09313. Supplementary materials available online. [PMC free article] [PubMed]
- Liu B, Müller HG.
*Statistical methods in e-commerce research*, Statist. Practice. Hoboken, NJ: Wiley; 2008. Functional data analysis for sparse auction data; pp. 269–289. http://dx.doi.org/10.1002/9780470315262.ch12. - McLean MW, Hooker G, Ruppert D. Restricted Likelihood Ratio Tests for Linearity in Scalar-on-Function Regression. Statistics and Computing. 2014 to appear.
- Mclean MW, Hooker G, Staicu AM, Scheipl F, Ruppert D. Functional Generalized Additive Models. Journal of Computational and Graphical Statistics. 2014;23:249–269. http://dx.doi.org/10.1080/10618600.2012.729985. [PMC free article] [PubMed]
- Müller HG, Stadtmüller U. Generalized functional linear models. The Annals of Statistics. 2005;33:774–805. http://dx.doi.org/10.1214/009053604000001156.
- Müller HG, Wang JL. PACE: Functional Data Analysis and Empirical Dynamics. 2012 http://anson.ucdavis.edu/mueller/data/pace.html. MATLAB package version 2.15.
- Müller HG, Wu Y, Yao F. Continuously additive models for nonlinear functional regression. Biometrika. 2013;100:607–622. http://dx.doi.org/10.1093/biomet/ast004.
- Ramsay JO, Dalzell CJ. Some tools for functional data analysis. Journal of the Royal Statistical Society, Series B. 1991;53:539–572. http://links.jstor.org/sici?sici=0035-9246(1991)53:3¡539:STFFDA¿2.0.CO;2-Worigin=MSN. With discussion and a reply by the authors.
- Ramsay JO, Silverman BW. Functional Data Analysis. 2. Springer; 2005. Springer Series in Statistics.
- Rao CR. Large sample tests of statistical hypotheses concerning several parameters with applications to problems of estimation. Mathematical Proceedings of the Cambridge Philosophical Society. 1948;44:50–57. http://dx.doi.org/10.1017/S0305004100023987.
- Shin H. Partial functional linear regression. Journal of Statistical Planning and Inference. 2009;139:3405–3418. http://dx.doi.org/10.1016/j.jspi.2009.03.001.
- Staicu AM, Crainiceanu CM, Ruppert D, Reich D. Modeling functional data with spatially heterogeneous shape characteristics. Biometrics. 2012;17:331–343. [PMC free article] [PubMed]
- Su DC, Yu Ru, Hsu L. Hypothesis testing in functional linear models. 2016 works.bepress.com/di/22/download/
- Swihart B, Goldsmith J, Crainiceanu C. Restricted Likelihood Ratio Tests for Functional Effects in the Functional Linear Model. Technometrics. 2014 p. to appear.
- Verzelen N, Villers F. Goodness-of-fit tests for high-dimensional Gaussian linear models. The Annals of Statistics. 2010;38:704–752. http://dx.doi.org/10.1214/08-AOS629.
- Wang S, Jank W, Shmueli G. Explaining and forecasting online auction prices and their dynamics using functional data analysis. Journal of Business & Economic Statistics. 2008;26:144–160. http://dx.doi.org/10.1198/073500106000000477.
- Yao F, Müller HG, Wang JL. Functional data analysis for sparse longitudinal data. Journal of the American Statistical Association. 2005a;100:577–590. http://dx.doi.org/10.1198/016214504000001745.
- Yao F, Müller HG, Wang JL. Functional linear regression analysis for longitudinal data. The Annals of Statistics. 2005b;33:2873–2903. http://dx.doi.org/10.1214/009053605000000660.
- Zhang JT, Chen J. Statistical inferences for functional data. The Annals of Statistics. 2007;35:1052–1079. http://dx.doi.org/10.1214/009053606000001505.
- Zhu H, Yao F, Zhang HH. Structured functional additive regression in reproducing kernel Hilbert spaces. Journal of the Royal Statistical Society. Series B. 2014;76:581–603. http://dx.doi.org/10.1111/rssb.12036. [PMC free article] [PubMed]

PubMed Central Canada is a service of the Canadian Institutes of Health Research (CIHR) working in partnership with the National Research Council's national science library in cooperation with the National Center for Biotechnology Information at the U.S. National Library of Medicine(NCBI/NLM). It includes content provided to the PubMed Central International archive by participating publishers. |