2.1 Notation

Let

*T* be the time to the first of two events, either tumor onset or death, and let

*Y* (

*t*) be an indicator that is 1 if the tumor is present at time

*t* and 0 otherwise. The tumor incidence rate and the tumor-free death rate correspond to the event-specific hazard functions

and

respectively. Let π be the expected proportion of animals that develop a tumor during the study (i.e. the lifetime tumor rate), which can be expressed in terms of λ(

*t*) and

*β*(

*t*):

where

*t*_{TS} denotes the terminal sacrifice time (i.e. the length of the study).

Suppose there are

*k* treatment groups in the current experiment, with associated dose levels

*d*_{1} <

*d*_{2} <

<

*d*_{k}, where animals in the first group are unexposed controls (

*d*_{1} = 0). More generally,

*d*_{i} is a dose score assigned to the

*i*^{th} group (

*i* = 1, 2, … ,

*k*). Let π

_{i}, λ

_{i}(

*t*), and

*β*_{i}(

*t*) denote the lifetime tumor rate, the tumor incidence rate, and the tumor-free death rate in the

*i*^{th} group. Tests for a positive trend in tumor incidence rates involve the following null and ordered alternative hypotheses:

and

respectively, where at least one inequality in

*H*_{a} is strict for at least one t.

Equation (1) shows that tests oriented toward the {π

_{i}} are not necessarily valid for comparing the {λ

_{i}(

*t*)}, as such tests can be confounded by differences among the {

*β*_{i}(

*t*)}.

Suppose

*n*_{ci} animals in the current study are randomized to the

*i*^{th} group and receive dose

*d*_{i} of the treatment (

*i* = 1, 2, … ,

*k*). We use

*c* to index variables and parameters related to the current study, and later we use

*h* to index similar quantities related to the historical data. Let

*y*_{cij} and

*t*_{cij} denote the tumor status and death time, respectively, for the

*j*^{th} animal in the

*i*^{th} group of the current study (

*j* = 1, 2, … ,

*n*_{ci};

*i* = 1, 2, … ,

*k*). Thus,

*y*_{cij} is 1 if a tumor is present at necropsy and 0 otherwise. Furthermore,

is the total number of animals in the current study, and

and

are the numbers of tumor-bearing animals in the

*i*^{th} group and in the entire study, respectively.

2.2 Background

Many early analyses used the linear trend test of

Cochran (1954) and

Armitage (1955) to compare lifetime tumor rates. Inferences based on lifetime rates can be misleading, though, if survival differs across treatment groups. For example, unless all animals live to the end of the study, the sample proportion

*y*_{ci+}/

*n*_{ci} tends to underestimate π

_{ci} because it does not account for early deaths reducing the time at risk for developing tumors. Bias arises when the amount of reduction in the time at risk varies with dose (i.e. differential mortality).

Bailer and Portier (1988) addressed the survival issue by defining a modified sample size

, where

*δ*_{cij} is a weight for animal

*j* in group

*i*. Clearly an animal that developed a tumor was at risk of tumor onset and thus receives a weight of 1. An animal that dies without a tumor, however, is assigned a weight linked to the fraction of the study it survived.

Bailer and Portier (1988) suggested using (

*t*_{cij}/

*t*_{TS})

^{3} for this fractional weight, which is known as the Poly-3 weight; then they applied the Cochran-Armitage test after substituting

for

*n*_{ci}. The resulting survival-adjusted estimate of π

_{ci} is

.

Bieler and Williams (1993) noted that

is a random variable and proposed a corrected variance estimator, which under the assumption that π

_{c1} = π

_{c2} =

= π

_{ck} is given by

where

, and the plus sign indicates summation over the missing index.

The usual Poly-3 trend test, with the Bieler-Williams variance adjustment, is

The NTP currently uses the following (conservative) variation of

*W*_{BW}:

which shrinks

*W*_{BW} toward 0. The continuity correction factor in the above formula is

where

is a weighted average of the dose scores. In the absence of survival differences,

*CF* reduces to half the largest difference between successive dose scores. Both

*W*_{NTP} and

*W*_{BW} are approximately normal when π

_{c1} = π

_{c2} =

= π

_{ck}. Bailer and Portier showed that while the Poly-3 test is oriented toward survival-adjusted comparisons of the {π

_{ci}}, it also provides an approximately valid comparison of the age-specific tumor incidence rates when each λ

_{ci}(

*t*) follows a Weibull model with a shape parameter of 3. However, simulation studies demonstrate that the Poly-3 test is sensitive to certain departures from the underlying Weibull model (

Mancuso et al., 2002;

Peddada et al., 2005).

Peddada et al. (2005) suggested an alternative trend test which uses the Poly-3 weights, but does not assume the tumor rates are linear in dose, does not depend on the numerical scores assigned to the dose groups, and does not incorporate a continuity correction. Let

denote the survival-adjusted isotonic regression estimate of π

_{ci} (

*i* = 1, 2, … ,

*k*) under the constraint that π

_{c1} ≤ π

_{c2} ≤

≤ π

_{ck}. These estimates can be obtained, for example, by applying the pool-adjacent-violators algorithm (PAVA) (

Barlow et al., 1972) to the {

} with weights {

}. As a simple illustration of PAVA, suppose there are

*k* = 4 groups, all weights are equal, the true (unknown) proportions are constrained by π

_{1} ≤ π

_{2} ≤ π

_{3} ≤ π

_{4}, and the sample proportions are

and

, respectively. Clearly,

and

violate the constraint. The isotonic solution can be obtained by retaining those estimates which satisfy the inequality constraints while averaging the estimates which are in violation. Thus, in this illustration, the “isotonized” estimates are

, and

. Mathematically, in the general case, the “isotonized” values

(

*i* = 1, 2, …,

*k*) under the above inequality constraint can be computed using the following min-max formula:

The isotonic test statistic proposed by

Peddada et al. (2005) can be written

which was motivated by a procedure that

Williams (1977) derived to test the equality of normal means. Define

*k* standard normal random variables:

*Z*_{i} ~

*N*(0, 1) for

*i* = 1, … ,

*k*. Given equal {π

_{ci}}, the distribution of

*W*_{ISO} can be approximated by the distribution of

where

are the isotonized values of the {

*Z*_{i}} using unit weights.

Peddada et al. (2005) conducted extensive simulations which showed that their test generally was more powerful than the Bailer-Portier/Bieler-Williams test when comparing age-specific incidence rates, even though both tests are more naturally oriented toward the{π

_{ci}} than the {λ

_{ci}(

*t*)}.

2.3 Proposed Test

Suppose there are

*p* relevant studies in the historical control database, where this number depends on several factors. Generally, we would prefer to include only those historical studies for which the route of exposure (e.g. feed, gavage, inhalation, skin painting) is the same as in the current study. Also, we typically restrict our comparison to studies which were performed relatively recently, since background tumor rates are known to “drift” over time (

Haseman, 1992). Thus, in practice, we anticipate incorporating control data from approximately 5 to 10 historical studies.

Let

*n*_{hm} be the number of animals in the

*m*^{th} historical control group, and let

*y*_{hmj} and

*t*_{hmj} denote the tumor status and death time, respectively, for animal

*j* in group

*m* (

*j* = 1, 2, … ,

*n*_{hm};

*m* = 1, 2, … ,

*p*). In addition, set

and

. Finally, let π

_{hm} denote the proportion of animals [summationtext] in the

*m*^{th} historical control group that are expected to develop a tumor if surviving the entire study (i.e. the lifetime tumor rate).

The problem of detecting trends in tumor rates can be viewed as a special case of order restricted inference and can be solved by maximizing certain distances in a directed graph. shows a directed graph appropriate for the usual NTP situation, where there is a high-dose (HD) group, a mid-dose (MD) group, a lowdose (LD) group, a current control (CC) group, and a collection of historical control (HC) groups. The arrows indicate the restrictions that the tumor rates in the control groups do not exceed the rate in the low-dose group, which does not exceed the rate in the mid-dose group, which does not exceed the rate in the high-dose group. The control animals, both current and historical, are assumed to come from a common population, and thus there are no arrows connecting CC and HC.

In the spirit of Kolmogorov distance measures, we follow

Peddada et al. (2001) and define a test statistic for the graph in as the maximum distance between connected points. Specifically, our test is

*max*(

*D*_{1},

*D*_{2}), where

*D*_{1} is the distance between the current control group and current dose groups, and

*D*_{2} is the distance between the historical control groups and current dose groups. Each of these two distances,

*D*_{1} and

*D*_{2}, is itself the maximum of two quantities. Distance

*D*_{1} is the maximum of the Bieler-Williams version of the Poly-3 statistic and the order restricted inference statistic of

Peddada et al. (2005) based on the current control group. Our choice for

*D*_{1} was motivated by a test proposed by

Peddada and Kissling (2006), who selected this pair of statistics because the former performs well against linear trends, while the latter performs well when trends are not linear. By taking the maximum of the two, we obtain a test with reasonably good power for detecting linear or nonlinear (but still non-decreasing) trends. Distance

*D*_{2} has exactly the same form as

*D*_{1}, except that it is based on the historical control groups rather than the current control group. Thus, the proposed test statistic is the maximum of four components. Two of the components are

*W*_{BW} and

*W*_{ISO}, as described in the previous subsection, and now we derive the other two analogous components, say

and

.

Our approach views the lifetime tumor rates in the current and historical control groups (π

_{c1}, π

_{h1}, … , π

_{hp}) as random realizations from a common control population with mean π. If no animals die before the end of the study, the sample proportion

*y*_{hm+}/

*n*_{hm} has expected value π (

*m* = 1, 2, … ,

*p*) and we assume the variance can be expressed as

where σ

^{2} is an unknown parameter that allows for extra-binomial variation.

Now suppose that some animals die before

*t*_{TS} and we want to account for this by using Poly-3 adjusted sample sizes. Allowing for the random nature of the sample size in variance formula (2), we propose separate estimators for σ

^{2} and π(1 — π), where a Bieler-Williams type adjustment is made to the latter. Under the assumption that the lifetime tumor rates are equal in all control and treated groups, we estimate σ

^{2} by

, where

and

The Bieler-Williams estimate of π(1 — π), say

, is obtained by applying the formula for

to the combined set of

*p* historical control groups and

*k* current groups. We estimate the variance of the Poly-3 adjusted tumor proportions in the current high-dose group,

, and in the pooled historical control groups,

, by

and

respectively, where

. Note that although the extra binomial variation among the historical control groups should not extend to the current dose groups, we purposely inflate the estimated variance of

by

in the above formula to avoid the Behrens-Fisher problem. This modification leads to a slightly conservative test (described below), but we view it as a small price to pay for the resulting simplification.

Thus, we define a new pair of test statistics, say

and

, which are analogous to

*W*_{BW} and

*W*_{ISO}, except that they are oriented toward detecting a trend in the current dose groups relative to the historical control groups rather than relative to the current control group. Specifically, define

exactly the same as

*W*_{BW} , except for replacing the current control group quantities

*w*_{c}_{1} and

with the historical control group summaries

*w*_{h}_{+} and

, as well as replacing

*S*_{1} with

. Similarly, let

be the following analog of

*W*_{ISO}:

where

are the isotonized values of {

} under the constraint that π ≤ π

_{c2} ≤ π

_{c3} π

_{ck}, with weights

.

Finally, the proposed survival-adjusted test statistic for detecting a dose-related trend in the tumor rates, using both current and historical controls, is given by

Deriving the exact null distribution of

*W* is not trivial. Hence we use critical values based on the approximations described below. In the next section, simulation studies reveal that our approximations control the Type I error rate very well.

Initially, assume that all animals survive to the end of the study and the lifetime tumor rates in all control and treated groups are equal to π. Also, let

*v*_{g} denote the estimated variance of

for

*g* =

*h, c*1,

*c*2, … ,

*ck*. These assumptions, together with the consistency of the {

*v*_{g}}, suggest the following approximations:

Denote the above limiting standard normal random variables by {

*Z*_{0},

*Z*_{1}, … ,

*Z*_{k}}, where the first term (

*Z*_{0}) corresponds to the pooled set of

*p* historical control groups and the remaining terms correspond to the

*k* groups in the current study. Next, define

and

*U*_{i} =

*Z*_{i} for

*i* = 1, … ,

*k*. In our application, the observations in all control and treated groups are independent; thus, the {

*Z*_{i}} are independently distributed, as are the {

*U*_{i}}. Note that all 4 components of

*W* involve

, but in addition to these terms

*W*_{BW} and

*W*_{ISO} are also functions of

, whereas

and

are also functions of

. Thus, we approximate the null distribution of

*W*_{BW} and

*W*_{ISO} by using the random variables

*Z*_{1}*, Z*_{2}, … ,

*Z*_{k}, and we approximate the null distribution of

and

by using the random variables,

*U*_{0},

*U*_{2},

*U*_{3}, … ,

*U*_{k}. As

*W*_{BW} can be obtained by regressing the lifetime tumor rates on dose, it should be distributed approximately the same as the regression of

*Z*_{1},

*Z*_{2}, … ,

*Z*_{k} on dose:

We approximate the distribution of

using the analogous function of

*U*_{0},

*U*_{2}, … ,

*U*_{k}:

Both

*W*_{ISO} and

resemble the statistic proposed by

Williams (1977) to test for a monotone trend in dose. Hence, using the asymptotic approximations given above, the null distributions for these statistics can be approximated by

and

respectively, where

are the isotonized values of {

*Z*_{1},

*Z*_{2}, … ,

*Z*_{k}} with unit weights and

are the isotonized values of {

*U*_{0},

*U*_{2}, … ,

*U*_{k}} with a weight of

*p* for

*U*_{0} and unit weights for the others.

Therefore, in the absence of deaths prior to the end of the experiment, the null distribution of our test statistic

*W* can be approximated by the distribution of

As

*V* is a function of independent N(0, 1) random variables, its distribution can be simulated by repeatedly generating

*k* + 1 independent standard normals. The level

*α* critical value for

*W* is the 100(1 —

*α*)

^{th} percentile of the simulated distribution of

*V*. In practice, we take a million random samples to approximate the distribution of

*W*.

More generally, when some animals die early, the components of *W* are functions of the Poly-3 weights, which provide a simple survival adjustment. However, despite possible survival effects, we approximate the distribution of *W* by the same simulated distribution of *V* described above. We demonstrate that the proposed test performs well in the more general case by simulating a variety of realistic data sets with many deaths before *t*_{TS}. The simulation results show that our test operates at (or below) the nominal level in a wide range of situations typically observed in practice.