HHS Public Access; Author Manuscript; accepted for publication in a peer-reviewed journal.
Aust N Z J Stat. Author manuscript; available in PMC 2010 August 10.
Published in final edited form as:
Aust N Z J Stat. 2008 December; 50(4): 347–359.
doi:  10.1111/j.1467-842X.2008.00521.x
PMCID: PMC2918919

Empirical Likelihood Based Inferences for Partially Linear Models with Missing Covariates


Abstract

This paper considers statistical inference for partially linear models $Y = X^{T}\beta + \nu(Z) + \varepsilon$ when the linear covariate X is missing with missingness probability π depending upon (Y, Z). We propose empirical likelihood based statistics to construct confidence regions for β and ν(z). The resulting statistics are shown to be asymptotically chi-squared distributed. The finite-sample performance of the proposed statistics is assessed by simulation experiments. The proposed methods are applied to a data set from an AIDS clinical trial.

Keywords: Confidence region, local linear regression, missing at random, semiparametric estimation

1 Introduction

The standard additive model, a generalization of the multiple linear regression model that replaces the linear component with one-dimensional nonparametric functions, has been used to explore the complicated relation between the response to treatment and the predictors of interest (Stone 1985). Efforts have also been made to balance the interpretability of linear models and the flexibility of additive models. An important result of those efforts is the class of partially linear models (PLMs), which have been considered for the analysis of cross-sectional and longitudinal data, and there is a substantial literature on PLMs and their generalizations; see, for example, Engle et al. (1986), Robinson (1988), Speckman (1988), Severini & Staniswalis (1994), Zeger & Diggle (1994), Gao & Anh (1999), Opsomer & Ruppert (1999), Härdle, Liang & Gao (2000), and Fan & Li (2004), among many others. Recently, Liang et al. (2004) considered the case in which the linear covariate is missing at random, with the missingness probability not depending on the linear covariate. The authors developed new methods for estimating the parameter and the nonparametric function, and derived asymptotic distributions for the proposed estimators. Using their estimators of the covariance matrix or their bootstrap versions, one can construct confidence regions for the parameters. However, the finite-sample performance may be unsatisfactory because of the complexity of the covariance matrix and the need to plug in several estimated terms (Liang et al. 2004, Theorem 2), and the confidence region derived by this procedure is based on a normal approximation. A valuable alternative is the empirical likelihood principle, which has been systematically studied and developed by Owen (1988, 1990), Qin (1994, 1999), and Qin & Lawless (1994).
The empirical likelihood ratio has a limiting chi-squared distribution, and it can be used to obtain tests and confidence intervals in a variety of settings, including linear models (Owen 1991, Chen 1993, 1994), generalized linear models (Kolaczyk 1994), and general estimating equations (Qin & Lawless 1994). The empirical likelihood method has several advantages over counterparts such as the normal-approximation-based method and the bootstrap method (Hall & La Scala 1990). Its most appealing features include improvement of the confidence region, increased coverage accuracy because of the use of auxiliary information, and ease of implementation.

Consider the following partially linear model:

$$Y = X^{T}\beta + \nu(Z) + \varepsilon, \tag{1}$$

where $(X, Z, Y) \in \mathbb{R}^{p} \times [0, 1] \times \mathbb{R}$, β is an unknown parameter vector, ν(·) is a smooth unknown function, and $E(\varepsilon \mid X, Z) = 0$. We study this model when there are missing data on X. Let δ = 1 if X is observed, and δ = 0 otherwise. Throughout this paper, we assume that X is missing at random (MAR) in the sense that

$$\Pr(\delta = 1 \mid Y, X, Z) = \Pr(\delta = 1 \mid Y, Z) \equiv \pi(Y, Z).$$


Assume that (Yi, Xi, Zi, δi), i = 1, 2, … , n, are independent and identically distributed data generated from (1). We are interested in constructing empirical likelihood confidence intervals for β and for ν(z) at a fixed z. The empirical likelihood principle has already been applied to PLMs. For example, Shi & Lau (1999) studied the empirical likelihood method for PLMs. Qin & Jing (2001) and Wang & Li (2002) considered the case in which the response variables Yi are randomly censored. The authors proposed an empirical likelihood ratio for β and derived its asymptotic distribution, which is that of a sum of independent chi-squared variables with unknown weights. More recently, Liang, Wang & Carroll (2007) studied PLMs with missing response variables.

This paper is organized as follows. In Section 2, the empirical likelihood ratio statistic for β is constructed, and the limiting distribution of the statistic and an associated confidence region for β are derived. Section 3 treats the same questions for the nonparametric component and gives similar results. Section 4 reports the results from a simulation experiment, while Section 5 presents the results from a real data analysis. All technical derivations are given in the Appendix. Throughout this paper, we use the same notation as in Liang et al. (2004).

2 Confidence Interval of β

Let m1(Z) = E(X|Z), m2(Z) = E(Y|Z), m3(Y, Z) = E(X|Y, Z), and $m_4(Y, Z) = E(XX^{T} \mid Y, Z)$. For model (1), Liang et al. (2004) gave the estimating equation for β after appropriately approximating m1(z), m2(z), m3(y, z), and m4(y, z). To develop our methods, we estimate these four functions in the same way as Liang et al. (2004), i.e., using Horvitz-Thompson (HT) weighted local linear estimators (Liang et al. 2004, p. 358). Denote the HT local linear estimators of $m_j(\cdot)$ and $\pi(y, z)$ by $\hat m_j(\cdot)$, j = 1, 2, 3, 4, and $\hat\pi(y, z)$, respectively. Write


The estimating equation for β given by Liang et al. (2004) is of the form:


This motivates the following empirical likelihood ratio statistic for β:


subject to the constraints:


By the Lagrange multiplier method, it can be shown that


where λ1 is determined by


We now examine the statistical properties of $\ell_1(\beta)$. Before stating the main results, we need regularity conditions. Assume that $\hat m_j$, j = 1, 2, 3, 4, and $\hat\pi$ converge uniformly at order $o_p(n^{-1/4})$. We assume the following conditions, as in Liang et al. (2004), throughout the remainder of the article. In what follows, we denote $AA^{T}$ by $A^{\otimes 2}$, and let $\tilde\xi = \xi - E(\xi \mid Z)$ for any random variable (vector) ξ, with the estimated counterpart $\xi - \hat E(\xi \mid Z)$ defined analogously, where $\hat E(\xi \mid Z)$ is an HT local linear estimator of $E(\xi \mid Z)$.

Assumption 2.1

  1. The matrix $E(\tilde X \tilde X^{T})$ is positive-definite, $E(\varepsilon \mid X, Z) = 0$, and $E(|\varepsilon|^{3} \mid X, Z) < \infty$.
  2. The bandwidths for estimating m1(z) and m2(z) are of order $n^{-1/5}$, and the bandwidths for estimating m3(y, z) and m4(y, z) are of order $n^{-1/6}$.
  3. K(·) is a bounded symmetric density function with compact support and satisfies $\int uK(u)\,du = 0$ and $\int K(u)\,du = 1$.
  4. The density function of Z and the density function of (Y, Z) are bounded away from 0 and have bounded continuous second derivatives.
  5. The functions m1(z), m2(z) and ν(·) have bounded and continuous second derivatives, and the functions m3(y, z) and m4(y, z) have bounded first derivatives.
  6. The probability function π(y, z) > 0 on the support of (Y, Z), and has a bounded continuous second derivative.

The asymptotic property of the empirical likelihood ratio statistic $\ell_1(\beta)$ is established in Theorem 2.1, whose proof is given in the Appendix.

Theorem 2.1

Suppose that Assumption 2.1 is satisfied. Then, as n → ∞,

$$\ell_1(\beta) \xrightarrow{d} \chi^2_p,$$

where $\chi^2_p$ is a chi-squared distributed random variable with p degrees of freedom.

As a consequence, an approximate 100(1 – α)% confidence region for β can be formed as $\{\beta : \ell_1(\beta) \le c_{\alpha,p}\}$, where $c_{\alpha,p}$ satisfies $\Pr(\chi^2_p \le c_{\alpha,p}) = 1 - \alpha$. To implement our method, we only need to estimate the $m_j$ and π, and to calculate $\ell_1(\beta)$ under the constraint (2). The algorithm developed by Owen (2001, p. 235) can be used for this purpose.
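A minimal sketch of how such a statistic might be evaluated when p = 1. It relies on the generic dual form of empirical likelihood, 2 Σ log(1 + λ g_i) with λ solving Σ g_i / (1 + λ g_i) = 0, where `g` holds the n scalar constraint terms evaluated at a candidate β. The names `el_log_ratio`, `chi2_1_quantile`, `in_confidence_region`, and `g` are illustrative assumptions, and plain bisection is used here instead of Owen's Newton-type algorithm, which also covers vector-valued terms.

```python
import math
from statistics import NormalDist


def el_log_ratio(g, iterations=200):
    """Empirical log-likelihood ratio 2*sum(log(1 + lam*g_i)) for scalar terms.

    lam solves sum(g_i / (1 + lam*g_i)) = 0. Returns inf when zero is not
    strictly inside the range of g, because the constraint cannot be met.
    """
    g_min, g_max = min(g), max(g)
    if not (g_min < 0.0 < g_max):
        return math.inf

    def score(lam):
        return sum(gi / (1.0 + lam * gi) for gi in g)

    # The implied weights are positive only for lam in (-1/g_max, -1/g_min).
    # The score is strictly decreasing there, so bisection is safe.
    shrink = 1.0 - 1e-10
    lo, hi = (-1.0 / g_max) * shrink, (-1.0 / g_min) * shrink
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if score(mid) > 0.0:
            lo = mid
        else:
            hi = mid
    lam = 0.5 * (lo + hi)
    return 2.0 * sum(math.log1p(lam * gi) for gi in g)


def chi2_1_quantile(alpha=0.05):
    """Critical value c with Pr(chi-squared(1) <= c) = 1 - alpha."""
    return NormalDist().inv_cdf(1.0 - alpha / 2.0) ** 2


def in_confidence_region(g, alpha=0.05):
    """True when the candidate value that produced g lies in the EL region."""
    return el_log_ratio(g) <= chi2_1_quantile(alpha)
```

Scanning candidate values of β and keeping those for which `in_confidence_region` returns true would trace out the interval. For p > 1 the scalar bisection would have to be replaced by a multivariate solver.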

3 Pointwise Confidence Interval of ν(z)

Chen & Qin (2000) constructed confidence intervals for a nonparametric regression function with bounded support, and showed that the coverage error of the resulting confidence intervals has the same order throughout the support of the function. This feature is a marked improvement over normal-approximation-based confidence intervals, which have a larger order of coverage error near the boundary. In this section, we are interested in an empirical likelihood pointwise confidence interval for ν(z). To motivate the empirical likelihood based confidence interval for ν(z) at a given z ∈ (0, 1), we briefly review the results for nonparametric regression in the case of no missing data.

If Ui = ν(Zi) + εi are available, Chen & Qin (2000) proposed the following empirical likelihood statistic for ν(z),


subject to the following constraints:


where γ = ν(z), Wni(z) = Kh(z − Zi)sn,1/h with Kh(·) = K(·/h) and


If β in (1) were known and there were no missing data, one could obtain an empirical likelihood based confidence interval for ν(z) on the basis of the model $Y_i - X_i^{T}\beta = \nu(Z_i) + \varepsilon_i$. Let $\hat\beta$ be a $\sqrt{n}$-consistent estimator of β in the case of missing data discussed in this article (e.g., $\hat\beta_{\hat\pi,\mathrm{all}}$ in Liang et al. 2004). Intuitively, a confidence interval similar to that of Chen & Qin (2000) can then be derived in a straightforward manner. To incorporate the missing-data information, we propose the following empirical likelihood statistic based on the local linear smoother:


subject to the following constraints






By the Lagrange multiplier method, it can be shown that


where λ2 is determined by


The asymptotic property of the empirical likelihood ratio statistic $\ell_2(\gamma)$ is established in Theorem 3.1. The proof of Theorem 3.1 is given in the Appendix.

Theorem 3.1

Suppose that $nh^{5} \to 0$ and that Assumption 2.1 is satisfied. Then, as n → ∞,

$$\ell_2(\gamma) \xrightarrow{d} \chi^2_1.$$

From Theorem 3.1, a 100(1 – α)% confidence interval for ν(z) at a given z can be constructed as $\{\gamma : \ell_2(\gamma) \le c_{\alpha,1}\}$, where $c_{\alpha,1}$ satisfies $\Pr(\chi^2_1 \le c_{\alpha,1}) = 1 - \alpha$.
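The interval construction can be illustrated with a simplified sketch in the no-missing-data setting reviewed above. It assumes the values U_i are available and uses local-constant kernel weights with terms w_i (U_i − γ), which is a deliberate simplification of the local linear, HT-adjusted statistic proposed in this section. The function names, the grid search, and its `half_width` and `step` parameters are hypothetical choices made for illustration.

```python
import math
from statistics import NormalDist


def quartic(u):
    """Quartic kernel (15/16)(1 - u^2)^2 on [-1, 1]."""
    return 15.0 / 16.0 * (1.0 - u * u) ** 2 if abs(u) <= 1.0 else 0.0


def el_log_ratio(g, iterations=100):
    """Scalar empirical log-likelihood ratio, solved for lam by bisection."""
    g_min, g_max = min(g), max(g)
    if not (g_min < 0.0 < g_max):
        return math.inf
    shrink = 1.0 - 1e-10
    lo, hi = (-1.0 / g_max) * shrink, (-1.0 / g_min) * shrink
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if sum(gi / (1.0 + mid * gi) for gi in g) > 0.0:
            lo = mid
        else:
            hi = mid
    lam = 0.5 * (lo + hi)
    return 2.0 * sum(math.log1p(lam * gi) for gi in g)


def pointwise_interval(z, zs, us, h, alpha=0.05, half_width=1.0, step=0.01):
    """Grid-scan the set {gamma : EL ratio <= chi-squared(1) critical value}.

    Returns (lower, center, upper), where center is the kernel-weighted mean.
    """
    pairs = [(quartic((z - zi) / h), ui) for zi, ui in zip(zs, us)]
    pairs = [(w, u) for w, u in pairs if w > 0.0]  # keep points in the window
    center = sum(w * u for w, u in pairs) / sum(w for w, _ in pairs)
    crit = NormalDist().inv_cdf(1.0 - alpha / 2.0) ** 2
    n_steps = int(round(half_width / step))
    accepted = []
    for k in range(-n_steps, n_steps + 1):
        gamma = center + k * step
        if el_log_ratio([w * (u - gamma) for w, u in pairs]) <= crit:
            accepted.append(gamma)
    return min(accepted), center, max(accepted)
```

A finer `step` gives sharper endpoints at higher cost; a root finder on each side of `center` would be a natural refinement.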

4 Simulation Experiment

To evaluate the finite-sample performance of the proposed approach, we conducted a simulation experiment with moderate sample sizes. We generated n = 100, 200 observations from model (1), and assumed that $Y \mid X, Z \sim N\{X^{T}\beta + \nu(Z), \sigma^{2}\}$ and that the probability of X being observed equals $\Pr(\delta = 1 \mid Y, X, Z) = \Phi\{\alpha_1 Y + \nu_1(Z)\}$, where Φ(·) is the standard normal cumulative distribution function. We set β = 0.5, σ = 1, α1 = 2, Z ~ U(0, 1), and $\nu_1(z) = z^{2}$. We considered the following three cases.

  • Case 1 : X|Z ~ U[0, 1], ν(Z) = 2Z;
  • Case 2 : X|Z ~ U[0, Z], ν(Z) = 2Z;
  • Case 3 : X|Z ~ U[0, Z], ν(Z) = sin(Z).
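The data-generating design described above can be sketched as follows. The function name `simulate`, the tuple layout, and the use of `None` for an unobserved X are illustrative assumptions; the parameter values are those stated in the text.

```python
import math
import random
from statistics import NormalDist

BETA, SIGMA, ALPHA1 = 0.5, 1.0, 2.0  # values stated in the text


def simulate(n, case, seed=0):
    """Draw n records (y, x_or_None, z, delta) for Case 1, 2, or 3."""
    rng = random.Random(seed)
    phi = NormalDist().cdf  # standard normal CDF
    rows = []
    for _ in range(n):
        z = rng.random()                                # Z ~ U(0, 1)
        x = rng.uniform(0.0, 1.0 if case == 1 else z)   # U[0, 1] or U[0, Z]
        nu = math.sin(z) if case == 3 else 2.0 * z      # nonparametric part
        y = rng.gauss(BETA * x + nu, SIGMA)             # model (1), normal error
        p_observed = phi(ALPHA1 * y + z ** 2)           # depends on (Y, Z) only
        delta = 1 if rng.random() < p_observed else 0
        rows.append((y, x if delta else None, z, delta))
    return rows
```

Because the observation probability depends on Y and Z but not on X, the generated data satisfy the MAR assumption by construction.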

The estimating procedure is the same as that in Liang et al. (2004, p. 358): we used the quartic kernel $K_0(u) = (15/16)(1 - u^{2})^{2} I(|u| \le 1)$ with bandwidth $n^{-1/5}$ for estimating m1(z) and m2(z), and the product kernel $K(u, v) = K_0(u)K_0(v)$ with bandwidths $n^{-1/6}$ for estimating m3(y, z) and m4(y, z). We generated 1000 datasets for each configuration.
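As a rough illustration of the smoothing step, the sketch below evaluates a Horvitz-Thompson weighted local linear fit of a scalar covariate on Z at one point, using the quartic kernel. It is an assumption-laden outline rather than the authors' implementation: the selection probabilities `pis` are taken as given (in the paper they are themselves estimated), and the function and argument names are hypothetical.

```python
def quartic(u):
    """Quartic kernel K0(u) = (15/16)(1 - u^2)^2 for |u| <= 1, else 0."""
    return 15.0 / 16.0 * (1.0 - u * u) ** 2 if abs(u) <= 1.0 else 0.0


def ht_local_linear(z0, zs, values, deltas, pis, h):
    """Intercept of a weighted least-squares line fitted around z0.

    Each observed record (delta = 1) gets weight K0((z_i - z0)/h) / pi_i;
    records with delta = 0 are skipped, so their values may be None.
    """
    s0 = s1 = s2 = t0 = t1 = 0.0
    for zi, vi, di, pi in zip(zs, values, deltas, pis):
        if not di:
            continue
        d = zi - z0
        w = quartic(d / h) / pi
        s0 += w
        s1 += w * d
        s2 += w * d * d
        t0 += w * vi
        t1 += w * d * vi
    det = s0 * s2 - s1 * s1
    if det <= 0.0:
        raise ValueError("too few observed points within the bandwidth")
    # Solve the 2x2 normal equations for the intercept.
    return (s2 * t0 - s1 * t1) / det
```

With n observations, a call such as `ht_local_linear(z0, zs, xs, deltas, pis, h=n ** (-1 / 5))` would use the bandwidth order mentioned above for m1(z).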

Table 1 gives the results for β, and Table 2 gives the results for the nonparametric function ν(z) at the three points z = 0.3, 0.5, 0.7, respectively. The column “CI” gives the confidence intervals using the empirical likelihood method. The lower and upper values are the averages of 1000 simulated lower and upper values, respectively. The column “len” gives the average length of the confidence intervals, while the column “CP(%)” gives the corresponding coverage probabilities of the 1000 simulated datasets. It can be seen from the tables that the coverage probabilities are close to the nominal level, and the larger the sample size, the closer to the nominal level. The lengths of the estimated confidence intervals of β decrease with the increase of sample size. The confidence intervals of β for case 3 are substantially wider than those for cases 1 and 2, while this feature disappears for the confidence intervals of the nonparametric components.
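The table columns described above could be computed from the simulated intervals roughly as follows. This is a small sketch; the function name `summarize_intervals` and the dictionary keys are illustrative choices that mirror the column labels.

```python
def summarize_intervals(intervals, true_value):
    """Average endpoints, average length, and coverage (%) of (lower, upper) pairs."""
    n = len(intervals)
    covered = sum(1 for lo, hi in intervals if lo <= true_value <= hi)
    avg_lower = sum(lo for lo, _ in intervals) / n
    avg_upper = sum(hi for _, hi in intervals) / n
    avg_length = sum(hi - lo for lo, hi in intervals) / n
    return {
        "CI": (avg_lower, avg_upper),
        "len": avg_length,
        "CP(%)": 100.0 * covered / n,
    }
```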

Table 1
The length (len) and coverage probability (CP) of the 95% confidence intervals (CI) for the parameter β(= 0.5).
Table 2
The length (len) and coverage probability (CP) of the 95% confidence intervals (CI) for the nonparametric function ν(z) at z = 0.3, 0.5 and 0.7.

5 Real Data Analysis

In this section we present an illustrative analysis of the AIDS clinical trial group (ACTG 315) study. Our aim focuses on effectiveness of antiretroviral (anti-HIV) treatments, and how increasing CD4 cell counts decrease the amount of HIV in the blood (HIV viral load). We are interested in understanding the pathogenesis of HIV infection and in evaluation of antiretroviral therapies by characterizing the relation between viral load and CD4 cell counts. We model the relation between viral load and CD4 cell counts by using model (1). See Liang et al. (2004) for a detailed discussion. Let Yij be the viral load, and Xij be the CD4 cell counts for subject i at treatment time Zij.

ACTG 315 was a single-arm clinical trial in which 53 enrolled subjects with moderately advanced HIV-1 infection received an antiretroviral therapy consisting of zidovudine, lamivudine, and ritonavir for 48 weeks. The primary objective of the study was to assess whether the treatment was associated with evidence of immunologic restoration. The management of HIV infected patients mainly includes monitoring their CD4 cell counts, which reflect body immunity, and HIV viral load, a useful virologic marker. CD4 cell counts are used to follow response to HIV medications, as a measure of adherence to treatment and most importantly to guide decisions regarding opportunistic infection prophylaxis. HIV-1 RNA measurements were observed on days 0, 2, 7, 10, and weeks 2, 3, 4, 8, 12, 24, and 48. A total of 514 observations with 13.8% of CD4 cell count missing were obtained. Most of the missing CD4 cell counts occurred because some measurements of CD4 cell counts and viral load were obtained at different time points. Thus the missingness is at random.

In our data analysis, we ignored the correlation structure when computing the estimators and used the so-called working independence assumption, because Lin & Carroll (2001) pointed out that working independence has some model-robustness advantages over estimation methods that account for correlation, with a corresponding loss of efficiency. To reduce the marked skewness of CD4 cell counts and the sparsity of treatment times, we applied a log transformation to both variables. We used the same kernel function as in the simulation study in Section 4, and obtained a bandwidth of h = 0.15 in the same manner as described there. The empirical likelihood based confidence interval for β is [−0.199, −0.025], which is slightly wider than that obtained by using the normal approximation method given in Liang et al. (2004, p. 365). The difference between these two confidence intervals is small but noticeable. We prefer the results based on the empirical likelihood method.

The confidence bands of the nonparametric function of treatment time are shown in Figure 1. Overall, the confidence intervals of the nonparametric function indicate that the viral load RNA levels decrease after initial antiviral treatment and then become flat. This finding is similar to that reflected in Liang et al.’s (2004) analysis for this dataset.

Figure 1
Empirical likelihood based confidence interval of the nonparametric function ν(z) for the data from the ACTG 315 study.

6 Discussion

To make inferences for partially linear models with missing covariates, we proposed an empirical likelihood based approach to constructing confidence regions for β and ν(z). The proposed approach is considerably simpler to implement than its counterpart, the normal approximation method. The finite-sample performance of the proposed statistics is promising. It may be of interest to extend the methods to generalized partially linear models:


$$E(Y \mid X, Z) = \mu\{X^{T}\beta + \nu(Z)\},$$

where μ(·) is a known link function. This requires further investigation and is beyond the scope of this paper.


Acknowledgments

The authors thank the Editor, an associate editor, and two referees for their constructive comments and suggestions. This research was supported by NIH/NIAID grants AI62247 and AI059773.


Appendix

The proofs of Theorems 2.1 and 3.1 follow arguments similar to those used by Liang et al. (2004) to prove their Theorems 1 and 2, and by Owen (2001) to prove his Theorem 3.2. We give only the key steps and the departures from their procedures.

Proof of Theorem 2.1.



To prove Theorem 2.1, we first show that





where $C = E(\varepsilon^{2}\tilde X\tilde X^{T}/\pi) - E\{(1 - \pi)(E(\varepsilon\tilde X \mid Y, Z))^{\otimes 2}/\pi\}$.

Using arguments similar to the proofs of Theorems 1 and 2 in Liang et al. (2004) and Lemma 5.1 (with his α(z) = 0) in Newey (1994), we know that


have the same limiting distribution. The latter is a sum of iid random vectors. As a consequence, (A.1) follows. On the other hand,


It is readily seen that (A.2) and (A.3) hold.

From (A.1), (A.2), and (A.3), using the same arguments as in the proof of Theorem 3.2 in Owen (1990), we have


It follows from (2) and $\sum_{i=1}^{n} p_i = 1$ that


By Taylor expansion we obtain that


A combination of (A.1), (A.2), and (A.4) establishes Theorem 2.1. Let




where $\phi_1(Y, Z, \beta) = Y - m_3^{T}\beta - \gamma$. We first prove the following lemma.

Lemma A.1

Under the conditions of Theorem 3.1,



The proof can be completed using arguments similar to the proofs of Theorems 1 and 2 in Liang et al. (2004), utilizing Lemma 5.1 of Newey (1994) with Wni(z) equal to $K_h(z - Z_i)$ or $K_h(z - Z_i)(z - Z_i)/h$.

Proof of Theorem 3.1

We first derive the following two expressions:



Recall that $\sqrt{n}(\hat\beta - \beta) = O_p(1)$. A direct calculation yields that


(A.5) follows from Lemma A.1. In a similar way, we can prove (A.6). Write






It follows from (A.5) and the central limit theorem that


On the other hand, (3), (A.5), and (A.6) imply that


By Taylor expansion, we obtain


The proof is thus completed by combining (A.5)-(A.7).

Contributor Information

Hua Liang, Department of Biostatistics and Computational Biology, University of Rochester, Rochester, NY 14642, USA; hliang@bst.rochester.edu.

Yongsong Qin, School of Mathematical Sciences, Guangxi Normal University, Guilin, Guangxi 541004, P.R. China; ysqin@mailbox.gxnu.edu.cn.


References

  • Chen SX. On the accuracy of empirical likelihood confidence regions for linear regression model. Annals of the Institute of Statistical Mathematics. 1993;45:621–637.
  • Chen SX. Empirical likelihood confidence intervals for linear regression coefficients. Journal of Multivariate Analysis. 1994;49:24–40.
  • Chen SX, Qin YS. Empirical likelihood confidence intervals for local linear smoothers. Biometrika. 2000;87:946–953.
  • Engle RF, Granger CWJ, Rice J, Weiss A. Semiparametric estimates of the relation between weather and electricity sales. Journal of the American Statistical Association. 1986;81:310–320.
  • Fan J, Li R. New estimation and model selection procedures for semiparametric modeling in longitudinal data analysis. Journal of the American Statistical Association. 2004;99:710–723.
  • Gao JT, Anh V. Semiparametric regression under long-range dependent errors. Journal of Statistical Planning and Inference. 1999;80:37–57.
  • Hall P, La Scala B. Methodology and algorithms of empirical likelihood. International Statistical Review. 1990;58:109–127.
  • Härdle W, Liang H, Gao J. Partially Linear Models. Springer Physica-Verlag; Heidelberg: 2000.
  • Kolaczyk ED. Empirical likelihood for generalized linear models. Statistica Sinica. 1994;4:199–218.
  • Lederman MM, Connick E, Landay A, et al. Immunologic responses associated with 12 weeks of combination antiretroviral therapy consisting of Zidovudine, Lamivudine and Ritonavir: results of AIDS clinical trials group protocol 315. The Journal of Infectious Diseases. 1998;178:70–79.
  • Liang H, Wang SJ, Carroll R. Partially linear models with missing response variables and error-prone covariates. Biometrika. 2007;94:185–198.
  • Liang H, Wang S, Robins JM, Carroll RJ. Estimation in partially linear models with missing covariates. Journal of the American Statistical Association. 2004;99:357–367.
  • Lin XH, Carroll RJ. Semiparametric regression for clustered data using generalized estimating equations. Journal of the American Statistical Association. 2001;96:1045–1056.
  • Newey WK. The asymptotic variance of semiparametric estimators. Econometrica. 1994;62:1349–1382.
  • Opsomer JD, Ruppert D. A root-n consistent backfitting estimator for semiparametric additive modeling. Journal of Computational and Graphical Statistics. 1999;8:715–732.
  • Owen AB. Empirical likelihood ratio confidence intervals for a single functional. Biometrika. 1988;75:237–249.
  • Owen AB. Empirical likelihood ratio confidence regions. Annals of Statistics. 1990;18:90–120.
  • Owen AB. Empirical likelihood for linear models. Annals of Statistics. 1991;19:1725–1747.
  • Owen AB. Empirical likelihood. Chapman & Hall; New York: 2001.
  • Qin GS, Jing BY. Censored partial linear models and empirical likelihood. Journal of Multivariate Analysis. 2001;78:37–61.
  • Qin J. Semi-empirical likelihood ratio confidence intervals for the difference of two sample means. Annals of the Institute of Statistical Mathematics. 1994;46:117–126.
  • Qin J. Empirical likelihood ratio based confidence intervals for mixture proportions. Annals of Statistics. 1999;27:1368–1384.
  • Qin J, Lawless J. Empirical likelihood and general estimating equations. Annals of Statistics. 1994;22:300–325.
  • Robinson PM. Root-n-consistent semiparametric regression. Econometrica. 1988;56:931–954.
  • Severini TA, Staniswalis JG. Quasilikelihood estimation in semiparametric models. Journal of the American Statistical Association. 1994;89:501–511.
  • Shi J, Lau TS. Empirical likelihood for partially linear models. Journal of Multivariate Analysis. 1999;72:132–148.
  • Speckman P. Kernel smoothing in partial linear models. Journal of the Royal Statistical Society, Series B. 1988;50:413–436.
  • Stone CJ. Additive regression and other nonparametric models. Annals of Statistics. 1985;13:689–705.
  • Wang QH, Li G. Empirical likelihood semiparametric regression analysis under random censorship. Journal of Multivariate Analysis. 2002;83:469–486.
  • Zeger SL, Diggle PJ. Semiparametric models for longitudinal data with application to CD4 cell numbers in HIV seroconverters. Biometrics. 1994;50:689–699.