The Cox model (Cox, 1972) for censored survival data specifies the hazard rate λ(t) for the survival time T of an individual with s-dimensional covariate vector ξ to have the form

$$\lambda(t \mid \xi) = \lambda_0(t)\exp(\beta'\xi), \tag{1}$$

where β is an s-vector of regression coefficients and λ0(t) is the underlying hazard function.
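As a minimal sketch of the model form above, the following Python snippet evaluates the hazard for a given covariate vector. The function name `cox_hazard`, the constant baseline hazard, and the coefficient values are hypothetical and chosen only for illustration; they do not come from the paper.

```python
import math

def cox_hazard(t, xi, beta, baseline_hazard):
    """Hazard lambda(t | xi) = lambda_0(t) * exp(beta' xi)."""
    if len(xi) != len(beta):
        raise ValueError("xi and beta must have the same length")
    linear_predictor = sum(b * x for b, x in zip(beta, xi))
    return baseline_hazard(t) * math.exp(linear_predictor)

# Hypothetical constant baseline hazard and coefficients (illustration only).
def baseline(t):
    return 0.01

beta = [0.5, -0.2]

h_ref = cox_hazard(2.0, [0.0, 0.0], beta, baseline)  # reference individual
h_exp = cox_hazard(2.0, [1.0, 0.0], beta, baseline)  # one-unit increase in the first covariate

# Under proportional hazards the ratio does not depend on t or on lambda_0.
hazard_ratio = h_exp / h_ref
```

The ratio of the two hazards equals exp(0.5) here, which is the usual interpretation of a Cox coefficient as a log hazard ratio.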
In the survival data setting with time-invariant covariates, a main/external validation study design consists of data {Ci, Wi, Ti, Di}, i = 1, … , n1, in the main study, and {ci, Ci, Wi, Ti}, i = n1 + 1, … , n, in the validation study. Because data on the outcome Di are not available in the validation study, we call this an external validation study. Here, c is the p1-vector of true exposures, which is subject to measurement error; in the main study, we observe a vector of surrogate variables C instead. W is a p2-vector of error-free covariates. T is the follow-up time, defined as the minimum of the potential failure time T0, the potential censoring time V, and the end of follow-up t*, i.e. T = min(T0, V, t*). D is an indicator for failure from the event of interest, n1 is the sample size of the main study, n2 is the sample size of the validation study, and n = n1 + n2. Typically, c is expensive to measure, and hence n1 >> n2. In what follows, we start by reviewing the ordinary regression calibration method for several different error models.
Prentice (1982) shows that if λ(t | c, W) = λ0(t) exp(β1′c + β2′W), i.e. if the proportional hazards model holds in the perfectly measured covariates; if λ(t | c, C, W) = λ(t | c, W), i.e. measurement error is non-differential; and if λ(t | C, no censorship in [0, t)) = λ(t | C), i.e. if there is random censorship conditional on the observed main study data, then

$$\lambda(t \mid C, W) = \lambda_0(t)\exp(\beta_2' W)\, E\left[\exp(\beta_1' c) \mid C, W, T \ge t\right], \tag{2}$$

where β1 and β2 are, respectively, the p1-vector and p2-vector of regression coefficients corresponding to c and W. Following Prentice (1982), the conditioning on T ≥ t can be dropped when the event is rare.
From (2), we see that the critical quantity is E[exp(β1′c) | C, W]. There are two basic ways of dealing with this quantity: exact evaluation or approximation. Exact evaluation requires assuming a model for the full distribution of (c | C, W). Approximation can be carried out using moment assumptions only. The simplest approximation involves modeling only the conditional mean μc(C, W) = E(c | C, W), and uses the first-order approximation E[exp(β1′c) | C, W] ≈ exp(β1′μc(C, W)). This approach leads naturally to imputing μc(C, W) for c and running a standard Cox analysis. A more sophisticated approximation can be carried out by introducing models for both the conditional mean μc(C, W) = E(c | C, W) and the conditional variance Σc(C, W) = Cov(c | C, W). The approximation is given by

$$E\left[\exp(\beta_1' c) \mid C, W\right] \approx \exp\left(\beta_1' \mu_c(C, W) + \tfrac{1}{2}\, \beta_1' \Sigma_c(C, W)\, \beta_1\right), \tag{3}$$

which is obtained from a second-order Taylor approximation to the cumulant generating function of (c | C, W). In the special case where (c | C, W) is multivariate normal, the second-order approximation is exact (Prentice, 1982); however, the approximation can be used even in the non-normal case. The first-order approximation is the approach most commonly taken.
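The first- and second-order approximations described above can be compared numerically. The sketch below treats the scalar case and assumes that c given (C, W) is normal with a hypothetical conditional mean and variance. It evaluates E[exp(β1 c)] by trapezoidal quadrature and compares it with both approximations. All function names and parameter values are illustrative assumptions, not taken from the paper.

```python
import math

def first_order(beta, mu):
    """First-order approximation: exp(beta * mu)."""
    return math.exp(beta * mu)

def second_order(beta, mu, var):
    """Second-order approximation: exp(beta * mu + 0.5 * beta^2 * var)."""
    return math.exp(beta * mu + 0.5 * beta * beta * var)

def normal_expectation(beta, mu, var, half_width=12.0, steps=20000):
    """Trapezoidal estimate of E[exp(beta * c)] for c ~ N(mu, var)."""
    sd = math.sqrt(var)
    lo = mu - half_width * sd
    hi = mu + half_width * sd
    h = (hi - lo) / steps
    total = 0.0
    for k in range(steps + 1):
        c = lo + k * h
        density = math.exp(-0.5 * ((c - mu) / sd) ** 2) / (sd * math.sqrt(2.0 * math.pi))
        weight = 0.5 if k in (0, steps) else 1.0  # trapezoid end-point weights
        total += weight * math.exp(beta * c) * density
    return total * h

# Hypothetical values for beta_1, mu_c(C, W) and Sigma_c(C, W).
beta1, mu_c, var_c = 0.5, 1.0, 1.0

numeric = normal_expectation(beta1, mu_c, var_c)
approx1 = first_order(beta1, mu_c)
approx2 = second_order(beta1, mu_c, var_c)
```

Under these assumptions `approx2` agrees with the numerical value up to quadrature error, consistent with the exactness of the second-order form in the normal case, whereas `approx1` is smaller because it ignores the conditional variance term.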
Equation (3) allows for a semi-parametric error model for (c | C, W), where only the conditional mean and covariance of (c | C, W), rather than the whole distribution, need to be specified. For the ordinary regression calibration method, the multivariate results are similar to those given for the logistic regression model in Rosner et al. (1990). For one-dimensional β without any error-free covariates, when the disease is rare, or β is small, or the measurement error variance is small and constant, the ordinary regression calibration (ORC) estimator is given by $\hat{\beta}_{\mathrm{ORC}} = \hat{\beta}_{\mathrm{naive}} / \hat{\alpha}_1$ (Spiegelman et al., 1997), where $\hat{\beta}_{\mathrm{naive}}$ is the naive Cox regression estimate using the surrogate measure C directly, and $\hat{\alpha}_1$ is obtained in the validation study by fitting the linear regression model given by E(c | C) = α0 + α1C and Var(c | C) = σ².
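The ORC correction for the scalar case can be sketched as follows. The snippet estimates the calibration slope by ordinary least squares of c on C in a small, made-up validation sample and then divides a naive Cox estimate by that slope. The validation data, the value of the naive estimate, and the function names are all hypothetical; in practice the naive estimate would come from a standard Cox fit of the main study data on C.

```python
def calibration_slope(surrogate, true_exposure):
    """OLS slope alpha_1 from regressing true exposure c on surrogate C."""
    if len(surrogate) != len(true_exposure) or len(surrogate) < 2:
        raise ValueError("need two equal-length samples with at least 2 points")
    n = len(surrogate)
    mean_C = sum(surrogate) / n
    mean_c = sum(true_exposure) / n
    sxy = sum((C - mean_C) * (c - mean_c) for C, c in zip(surrogate, true_exposure))
    sxx = sum((C - mean_C) ** 2 for C in surrogate)
    if sxx == 0.0:
        raise ValueError("surrogate has no variation")
    return sxy / sxx

def orc_estimate(beta_naive, alpha1):
    """ORC estimator: beta_naive / alpha_1 (scalar exposure, no error-free covariates)."""
    if alpha1 == 0.0:
        raise ValueError("calibration slope is zero; correction is undefined")
    return beta_naive / alpha1

# Made-up validation study pairs (C_i, c_i), for illustration only.
C_valid = [1.0, 2.0, 3.0, 4.0, 5.0]
c_valid = [1.2, 1.6, 2.3, 2.4, 3.0]

# Hypothetical naive Cox estimate from the main study, using C in place of c.
beta_naive = 0.22

alpha1_hat = calibration_slope(C_valid, c_valid)
beta_orc = orc_estimate(beta_naive, alpha1_hat)
```

With a calibration slope below one, as in this made-up sample, the corrected estimate is larger in magnitude than the naive one, reflecting the attenuation that the surrogate introduces.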