J Data Sci. Author manuscript; available in PMC 2010 July 1.

Published in final edited form as:

J Data Sci. 2009 July 1; 7(3): 313–329.

PMCID: PMC2790290

NIHMSID: NIHMS90046

Wan Tang, University of Rochester, Rochester, NY 14642, USA;

Wan Tang: wan_tang@urmc.rochester.edu; Qin Yu: qin_yu@urmc.rochester.edu; Paul Crits-Christoph: crits@mail.med.upenn.edu; Xin M. Tu: xin_tu@urmc.rochester.edu


Conceptually, a moderator is a variable that modifies the effect of a predictor on a response. Analytically, a common approach as used in most moderation analyses is to add analytic interactions involving the predictor and moderator in the form of cross-variable products and test the significance of such terms. The narrow scope of such a procedure is inconsistent with the broader conceptual definition of moderation, leading to confusion in interpretation of study findings. In this paper, we develop a new approach to the analytic procedure that is consistent with the concept of moderation. The proposed framework defines moderation as a process that modifies an existing relationship between the predictor and the outcome, rather than simply a test of a predictor by moderator interaction. The approach is illustrated with data from a real study.

Moderation and mediation analyses are widely used in biomedical and psychosocial research (Baron & Kenny, 1986; Chaplin, 1991; Cole & Maxwell, 2003; Crits-Christoph et al., 2003; Holmbeck, 1997; Kraemer et al., 2001; 2002; Krull & MacKinnon, 2001; Rogosch et al., 1990; Rothman & Greenland, 1998). Although often implemented in correlational studies in social psychology and other fields of inquiry, moderation and mediation analyses have become increasingly popular and an integral part of data analysis in treatment research (Kraemer et al., 2002). In intervention studies, moderation analysis helps determine whether an intervention has a differential effect among subgroups that are defined by baseline characteristics. Thus, moderators provide useful information for making treatment decisions and maximizing treatment effect. In contrast to moderation analysis, mediation analysis helps identify mechanisms by which an intervention achieves its effect. By identifying the correct mediation process through which treatment affects study outcomes, we can not only further our understanding of the pathology of the disease and its treatment, but also obtain information for developing new and alternative treatments that use resources efficiently. Moderation and mediation analyses are also often performed in epidemiologic studies to determine risk factors and elucidate the causes of a disease.

In a seminal paper, Baron and Kenny (1986) proposed a general framework for characterizing moderation and mediation processes. In particular, they laid a theoretical foundation for conceptualizing such processes and for approaching the underlying analytic problems. In addition, their work clarified the fundamental difference between the closely related, yet fundamentally distinct, notions of moderation and mediation. However, as recently pointed out by Kraemer et al. (2001), the limited analytic strategies proposed in that paper have been used and extrapolated to situations to which they often do not apply, leading to confusion in interpreting analysis results and even conflict with the conceptual definition of such processes. For example, Kraemer et al. showed that the presence of analytic interaction between a moderator and a predictor (the product of the two variables) is model-dependent; the same data may show zero or non-zero moderator by predictor interactions depending on which analytic model (e.g., logistic or linear) is used to fit the data. Thus, the popular approach of simply looking for non-zero interactions, as used in most moderation analyses, has limited applications and often leads to dubious and uninterpretable results. Developing a general analytic definition consistent with the conceptual notion of moderation is the focus of this paper.

In this paper we restrict our attention to moderation analysis and discuss a new approach to address the limitations of current methods. More specifically, our approach models the effect of a moderator more broadly, so that its effect is not limited to analytic interactions. We show how analytic interactions can become uninterpretable as moderation effects and how a variable can be a moderator without taking the form of such interactions. After describing the new analytic framework in Section 2, we illustrate the proposed approach with a real data example in Section 3, followed by concluding remarks.

In this section, we first review existing models for moderation analysis and in the process outline the problems with such methods. We then propose our approach to address these issues.

For convenience, assume a relatively simple moderation process involving only one predictor, *x*, a response, *y*, and a moderator, *z*.

Assume that *y* is continuous and consider the following linear model relating *x* to *y*:

$${y}_{i}={\alpha}_{0}+{\alpha}_{1}{x}_{i}+{\epsilon}_{i},\phantom{\rule{0.16667em}{0ex}}\phantom{\rule{0.16667em}{0ex}}\phantom{\rule{0.16667em}{0ex}}{\epsilon}_{i}~\left(0,{\sigma}^{2}\right),\phantom{\rule{0.16667em}{0ex}}\phantom{\rule{0.16667em}{0ex}}\phantom{\rule{0.16667em}{0ex}}1\le i\le n,$$

(1)

where *i* indexes the subjects from a sample of size *n* and (0, *σ*^{2}) denotes a random variable with mean 0 and variance *σ*^{2}. For robust inference, we only assume that *ε _{i}* has a mean of 0 and is uncorrelated with *x _{i}*.

A moderator is a variable that affects or modifies the relationship between *x* and *y*. In a conceptual sense, if *z* is a moderator, it interacts with the predictor *x* to alter the effect of the latter variable on the response *y*. Because of such an “interaction” interpretation, a popular approach as used in many moderation analyses is to include the first-order (*xz*) or even higher-order (e.g. *x*^{2}*z*) moderator by predictor interactions to examine moderation effect. For example, by including the first-order *x* by *z* interaction in (1), we obtain:

$${y}_{i}={\alpha}_{0}+{\alpha}_{1}{x}_{i}+{\alpha}_{2}{z}_{i}{x}_{i}+{\epsilon}_{i},\phantom{\rule{0.16667em}{0ex}}\phantom{\rule{0.16667em}{0ex}}\phantom{\rule{0.16667em}{0ex}}{\epsilon}_{i}~\left(0,{\sigma}^{2}\right),\phantom{\rule{0.16667em}{0ex}}\phantom{\rule{0.16667em}{0ex}}\phantom{\rule{0.16667em}{0ex}}1\le i\le n.$$

(2)

Under this model, the effect of *x* on *y* defined by *α*_{1} in (1) has been altered and replaced by a function of *z* in the form of *α*_{1} + *α*_{2}*z _{i}* (Aiken & West, 1991; Neter et al. 1990). Because of this, a moderator is also known as an effect modifier.
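As a minimal sketch of how model (2) is typically fitted, the Python snippet below estimates the coefficients by ordinary least squares and evaluates the conditional effect of *x* at a given value of *z*. The function names, the use of NumPy, and the synthetic noise-free data are assumptions made for illustration; they are not part of the original analysis.

```python
import numpy as np

def fit_interaction_model(x, z, y):
    """Least-squares fit of y = a0 + a1*x + a2*z*x + error, as in model (2)."""
    design = np.column_stack([np.ones_like(x), x, z * x])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coef  # (a0, a1, a2)

def effect_of_x(coef, z):
    """Effect (slope) of x on y at moderator value z: a1 + a2*z."""
    return coef[1] + coef[2] * z

# Synthetic, noise-free data with hypothetical coefficients (illustration only)
rng = np.random.default_rng(0)
x = rng.normal(size=200)
z = rng.integers(0, 2, size=200).astype(float)  # binary moderator
y = 1.0 + 2.0 * x + 0.5 * z * x

coef = fit_interaction_model(x, z, y)
slope_when_z_is_1 = effect_of_x(coef, 1.0)  # a1 + a2
```

With a binary *z*, the two conditional slopes are simply *α*_{1} and *α*_{1} + *α*_{2}.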

Although moderation does translate into analytic interactions in the case of (2), it is a fundamentally different concept. Indeed, simply interpreting moderation as analytic interactions can have serious ramifications. In particular, not all interactions will have the moderation interpretation. For example, consider the two scatter plots in Figure 1, in which the relationship between *y* and *x* is plotted for the two levels of a binary variable *z* as indicated by circles (*z _{i}* = 0) and squares (

$${y}_{i}=({\alpha}_{0}+{\delta}_{0}{z}_{i})+({\alpha}_{1}+{\delta}_{1}{z}_{i}){x}_{i}+{\epsilon}_{i},\phantom{\rule{0.16667em}{0ex}}\phantom{\rule{0.16667em}{0ex}}\phantom{\rule{0.16667em}{0ex}}{\epsilon}_{i}~\left(0,{\sigma}^{2}\right),\phantom{\rule{0.16667em}{0ex}}\phantom{\rule{0.16667em}{0ex}}\phantom{\rule{0.16667em}{0ex}}1\le i\le n.$$

(3)

Figure 1. Two patterns of treatment response and fitted regression lines as a function of moderator *z* and treatment condition (circles for treatment 1 and squares for treatment 2).

In this model, the interaction *z _{i} x_{i}* changes the slopes of the linear relations between

$${y}_{i}=({\alpha}_{0}+{\delta}_{0}{z}_{i})+({\alpha}_{1}+{\delta}_{1}{z}_{i}){x}_{i}+{\alpha}_{2}{z}_{i}{x}_{i}^{2}+{\epsilon}_{i},\phantom{\rule{0.16667em}{0ex}}\phantom{\rule{0.16667em}{0ex}}\phantom{\rule{0.16667em}{0ex}}{\epsilon}_{i}~\left(0,{\sigma}^{2}\right),\phantom{\rule{0.16667em}{0ex}}\phantom{\rule{0.16667em}{0ex}}\phantom{\rule{0.16667em}{0ex}}1\le i\le n.$$

(4)

Although similar, the above is fundamentally different from (3) as a model for moderation. Unlike (3), this model does not have a moderation interpretation with respect to (1) since it does not merely modify the effect of *x* on *y*, but rather it postulates quite a different relationship between *x* and *y* by adding a quadratic term ${z}_{i}{x}_{i}^{2}$. Thus, *z* cannot be considered as a modifier for the linear relationship between *y* and *x* as initially modeled in (1).

An obvious difference between (3) and (4) is that the latter involves a higher-order interaction ${z}_{i}{x}_{i}^{2}$. However, a moderation model for (1) does not have to involve only the first-order interaction. For example, although the model below contains a higher-order interaction between *x* and *z*:

$${y}_{i}={\alpha}_{0}+{\alpha}_{1}{x}_{i}+{\alpha}_{2}{z}_{i}{x}_{i}+{\alpha}_{3}{{z}^{2}}_{i}{x}_{i}+{\epsilon}_{i},\phantom{\rule{0.16667em}{0ex}}\phantom{\rule{0.16667em}{0ex}}\phantom{\rule{0.16667em}{0ex}}{\epsilon}_{i}~\left(0,{\sigma}^{2}\right),\phantom{\rule{0.16667em}{0ex}}\phantom{\rule{0.16667em}{0ex}}\phantom{\rule{0.16667em}{0ex}}1\le i\le n.$$

(5)

it is still a moderation model for (1) since the inclusion of the interaction $z_i^2 x_i$ does not change the initial linear relationship between *y* and *x*.

Note that although (4) is not a moderation model for (1), it may be viewed as such a model for a different quadratic relationship between *y* and *x*:

$${y}_{i}={\beta}_{0}+{\beta}_{1}{x}_{i}+{\beta}_{2}{x}_{i}^{2}+{\epsilon}_{i},\phantom{\rule{0.16667em}{0ex}}\phantom{\rule{0.16667em}{0ex}}\phantom{\rule{0.16667em}{0ex}}{\epsilon}_{i}~\left(0,{\sigma}^{2}\right),\phantom{\rule{0.16667em}{0ex}}\phantom{\rule{0.16667em}{0ex}}\phantom{\rule{0.16667em}{0ex}}1\le i\le n.$$

(6)

since *z* in (4) modifies the effect of *x* on *y* by altering the coefficients associated with the linear and quadratic terms. In general, any model we create by adding certain analytic interactions to (1) can be viewed as a moderation model for some model that relates *y* to *x*. However, a moderation model for (1) should only alter the effect of *x* on *y* without changing the original linear relationship between *x* and *y*. Thus, the essence of the definition of a moderator *z* is that the model form remains the same, but the coefficients may change and become functions of *z*.

The examples above show that not all interactions have a moderation interpretation. The reverse is also true. Interaction in a conceptual sense has a broader interpretation, not just limited to analytic interactions in the form of cross-variable products. Consider for example a non-linear model relating *x* to *y* as given by:

$${y}_{i}={\alpha}_{0}{\left({e}^{-{x}_{i}}\right)}^{{\alpha}_{1}}+{\alpha}_{2}{\left({e}^{-{x}_{i}}\right)}^{{\alpha}_{3}}+{\epsilon}_{i},\phantom{\rule{0.16667em}{0ex}}\phantom{\rule{0.16667em}{0ex}}\phantom{\rule{0.16667em}{0ex}}{\epsilon}_{i}~\left(0,{\sigma}^{2}\right),\phantom{\rule{0.16667em}{0ex}}\phantom{\rule{0.16667em}{0ex}}\phantom{\rule{0.16667em}{0ex}}1\le i\le n.$$

(7)

This bi-exponential model, which is not a linear model since *y* is not a linear function of the coefficient *α*_{1}, is widely used in modeling plasma concentration *y* as a function of time *x* in biomedical research (Neter et al. 1990; Davidian and Giltinan, 1995). Although non-linear, the effect of *x* on *y* is still defined by *α _{k}* (0 ≤

$${y}_{i}={\alpha}_{0}{\left({e}^{-{x}_{i}}\right)}^{{\alpha}_{1}}+({\alpha}_{20}+{\alpha}_{21}z){\left({e}^{-{x}_{i}}\right)}^{{\alpha}_{3}}+{\epsilon}_{i},\phantom{\rule{0.16667em}{0ex}}\phantom{\rule{0.16667em}{0ex}}\phantom{\rule{0.16667em}{0ex}}{\epsilon}_{i}~\left(0,{\sigma}^{2}\right),\phantom{\rule{0.16667em}{0ex}}\phantom{\rule{0.16667em}{0ex}}\phantom{\rule{0.16667em}{0ex}}1\le i\le n.$$

(8)

The above model has the extra term $\alpha_{21} z\,(e^{-x_i})^{\alpha_3}$

$${y}_{i}={\alpha}_{0}{\left({e}^{-{x}_{i}}\right)}^{{\alpha}_{1}}+{\alpha}_{2}{\left({e}^{-{x}_{i}}\right)}^{{\alpha}_{3}}+{\alpha}_{4}{z}_{i}{x}_{i}+{\epsilon}_{i},\phantom{\rule{0.16667em}{0ex}}\phantom{\rule{0.16667em}{0ex}}\phantom{\rule{0.16667em}{0ex}}{\epsilon}_{i}~\left(0,{\sigma}^{2}\right),\phantom{\rule{0.16667em}{0ex}}\phantom{\rule{0.16667em}{0ex}}\phantom{\rule{0.16667em}{0ex}}1\le i\le n.$$

(9)

As with (4), the above actually represents a new model for the relationship between *y* and *x*, rather than a moderation model to account for the altered effect of *x* on *y* by *z* based on the original model in (7).

Note that the problem with interpreting conceptual interaction as simply analytic interaction involving cross-variable products has also been noted by Kraemer et al. (2001). By considering analytic interactions across different types of models (e.g. linear, logistic etc.), they demonstrated that the presence of such interaction effects depends on the type of models being fitted. Our considerations above complement their findings by further elucidating the mechanism that causes such model dependency when defining moderation through interactions.
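The model dependency described above can be seen in a small numerical sketch. Assuming a hypothetical logistic model that is purely additive on the logit scale (no product term), the same model shows a non-zero interaction contrast on the probability scale. The coefficient values and helper names are illustrative assumptions, not taken from Kraemer et al. (2001).

```python
import math

def expit(u):
    """Inverse logit."""
    return 1.0 / (1.0 + math.exp(-u))

def logit(p):
    return math.log(p / (1.0 - p))

# Hypothetical coefficients: additive on the logit scale, no x-by-z product term
A0, A1, A2 = -1.0, 1.5, 1.0

def prob(x, z):
    """P(y = 1 | x, z) under the additive logistic model."""
    return expit(A0 + A1 * x + A2 * z)

def interaction_contrast(f):
    """Effect of x when z = 1 minus effect of x when z = 0, for binary x and z."""
    return (f(1, 1) - f(0, 1)) - (f(1, 0) - f(0, 0))

on_logit_scale = interaction_contrast(lambda x, z: logit(prob(x, z)))
on_prob_scale = interaction_contrast(prob)
```

Here `on_logit_scale` is zero up to rounding, while `on_prob_scale` is not, so whether an "interaction" is present depends on the scale on which the model is written.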

As illustrated in the preceding section, current methods for moderation analysis developed on the premise of analytic interactions are problematic. If used without caution, they may give rise to models that do not have moderation interpretation in the conceptual sense. In addition, such interaction-based strategies generally do not work for non-linear models, as interactions between a predictor and a moderator do not have to be in the form of cross-variable product. In this section, we systematically address these issues simultaneously by proposing an approach that does not rely on analytic interactions.

Let us start with the linear model (1) again. As we discussed earlier, this model is determined by the coefficients or parameters, *α*_{0} and *α*_{1}. Thus, if *z* interacts with *x* to alter the relationship between *x* and *y*, these parameters become functions of *z*, i.e.,

$${y}_{i}={\alpha}_{0}({z}_{i})+{\alpha}_{1}({z}_{i}){x}_{i}+{\epsilon}_{i},\phantom{\rule{0.16667em}{0ex}}\phantom{\rule{0.16667em}{0ex}}\phantom{\rule{0.16667em}{0ex}}{\epsilon}_{i}~\left(0,{\sigma}^{2}\right),\phantom{\rule{0.16667em}{0ex}}\phantom{\rule{0.16667em}{0ex}}\phantom{\rule{0.16667em}{0ex}}1\le i\le n.$$

(10)

Unlike the models in (2) and (3), no specific form is assumed for *α _{k}* (

$${\alpha}_{0}({z}_{i})={\alpha}_{0},\phantom{\rule{0.16667em}{0ex}}\phantom{\rule{0.16667em}{0ex}}\phantom{\rule{0.16667em}{0ex}}{\alpha}_{1}({z}_{i})={\gamma}_{1}+{\gamma}_{2}{z}_{i}$$

then we immediately obtain the model in (2). The model in (10) automatically excludes models that contain analytic interactions but do not have a moderation interpretation such as the model in (4). The linear model in (10) with the coefficients being a function of a variable is known as the varying-coefficient linear model (Fan and Zhang, 1999; Hastie and Tibshirani, 1993).
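To make the reduction explicit, here is a short worked substitution, using only the symbols already defined in (2), (4) and (10). Taking a constant intercept and a slope that is linear in *z* in (10) gives

$$y_i = \alpha_0 + (\gamma_1 + \gamma_2 z_i)\,x_i + \epsilon_i = \alpha_0 + \gamma_1 x_i + \gamma_2 z_i x_i + \epsilon_i,$$

which is (2) with $\alpha_1 = \gamma_1$ and $\alpha_2 = \gamma_2$. By contrast, for any fixed value of *z*, the right-hand side of (10) is linear in $x_i$, so no choice of $\alpha_0(z)$ and $\alpha_1(z)$ can produce the term $z_i x_i^2$ that appears in (4).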

Thus, our approach to defining the effect of a moderator *z* on the linear model (1) is to change the definition of the coefficients so that they become a function of *z*. This principle is readily applied to general models such as the generalized linear and non-linear models (McCullagh & Nelder, 1989; Davidian and Giltinan, 1995). For example, the generalized linear model for a binary response is expressed as:

$${y}_{i}~\mathit{Bin}(h({\alpha}_{0}+{\alpha}_{1}{x}_{i}),1),\phantom{\rule{0.38889em}{0ex}}1\le i\le n,$$

(11)

where *Bin*(*p*, 1) denotes a Binomial (or Bernoulli) distribution with a single trial and probability of success *p*. Thus, in (11), the mean of *y _{i}* is modeled as a function of

$$h({\alpha}_{0}+{\alpha}_{1}{x}_{i})=\exp({\alpha}_{0}+{\alpha}_{1}{x}_{i})/[1+\exp({\alpha}_{0}+{\alpha}_{1}{x}_{i})],$$

(12)

though other link functions such as the probit link are also often used (McCullagh & Nelder, 1989; Tu et al., 1999). Once a link function is chosen, the relationship between *y* and *x* is determined by the parameters *α*_{0} and *α*_{1}. Thus, as in the linear model case, we define the effect of a moderator *z* by letting *α _{k}* be a function of *z*:

$${y}_{i}~\mathit{Bin}(h({\alpha}_{0}({z}_{i})+{\alpha}_{1}({z}_{i}){x}_{i}),1),\phantom{\rule{0.38889em}{0ex}}1\le i\le n.$$

(13)

In the logistic model case, *α _{k}* in (12) becomes a function of

The definition also carries through in a straightforward fashion for non-linear models. For example, by modeling the coefficients as a function of *z* in (7), we obtain:

$${y}_{i}={\alpha}_{0}({z}_{i}){\left({e}^{-{x}_{i}}\right)}^{{\alpha}_{1}({z}_{i})}+{\alpha}_{2}({z}_{i}){\left({e}^{-{x}_{i}}\right)}^{{\alpha}_{3}({z}_{i})}+{\epsilon}_{i},\phantom{\rule{0.16667em}{0ex}}\phantom{\rule{0.16667em}{0ex}}\phantom{\rule{0.16667em}{0ex}}{\epsilon}_{i}~\left(0,{\sigma}^{2}\right),\phantom{\rule{0.16667em}{0ex}}\phantom{\rule{0.16667em}{0ex}}\phantom{\rule{0.16667em}{0ex}}1\le i\le n.$$

As in the linear model case of (10), the above includes (8) as a special case, but excludes (9) as a model for moderation analysis.

Note that in semiparametric regression analysis, models are specified by the conditional mean of the response given the predictor (Robins et al. 1995):

$$E({y}_{i}\mid {x}_{i})=h({x}_{i},\alpha ),\phantom{\rule{0.38889em}{0ex}}1\le i\le n,$$

(14)

where *α* is the vector of parameters or coefficients and *h*(*x*,*α*) is a function of *x* and *α*. When defined under the semi-parametric regression setup, a moderator *z* can affect only *α*, without altering the functional form *h*(*x*,*α*), i.e.,

$${E}_{z}({y}_{i}\mid {x}_{i})=h({x}_{i},\alpha ({z}_{i})),\phantom{\rule{0.16667em}{0ex}}\phantom{\rule{0.16667em}{0ex}}1\le i\le n.$$

For example, by expressing (1) in the form (14), we obtain:

$$E({y}_{i}\mid {x}_{i})={\alpha}_{0}+{\alpha}_{1}{x}_{i},\phantom{\rule{0.38889em}{0ex}}1\le i\le n.$$

Thus, (2) and (3) are both moderation models for (1) since *z* alters only the parameter vector. However, (4) is not a moderation model for (1) since it also changes the functional form *h*(*x*,*α*).

Procedures for fitting varying-coefficient models are based on the idea of “local averaging.” For example, for the linear varying-coefficient model in (10), we first fix *z* and use the data within a window around *z* to fit the model, treating *z* as a constant. By moving *z* over the range of *z* in the data, we obtain estimates of *α _{k}* (*z*) (*k* = 0, 1).
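The local-averaging idea can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the estimator used in the cited literature: the Gaussian kernel, the bandwidth value, the function name, and the synthetic data are all choices made here for illustration.

```python
import numpy as np

def local_linear_coefficients(x, z, y, z0, bandwidth):
    """Estimate (alpha_0(z0), alpha_1(z0)) in model (10) by kernel-weighted
    least squares, treating z as constant within a window around z0."""
    # Gaussian kernel weights (an assumed choice of window shape)
    weights = np.exp(-0.5 * ((z - z0) / bandwidth) ** 2)
    design = np.column_stack([np.ones_like(x), x])
    root_w = np.sqrt(weights)
    # Weighted least squares via rescaling rows by sqrt(weight)
    coef, *_ = np.linalg.lstsq(design * root_w[:, None], y * root_w, rcond=None)
    return coef

# Synthetic, noise-free data: intercept 1, slope alpha_1(z) = 2 + 3z (hypothetical)
rng = np.random.default_rng(1)
n = 2000
z = rng.uniform(0.0, 1.0, size=n)
x = rng.normal(size=n)
y = 1.0 + (2.0 + 3.0 * z) * x

coef_mid = local_linear_coefficients(x, z, y, z0=0.5, bandwidth=0.05)
```

Repeating the call over a grid of `z0` values traces out the estimated coefficient functions; the bandwidth controls the trade-off between smoothness and local fidelity.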

Since the case with a binary *z* can be subsumed into the discussion of a categorical moderator, we consider only a categorical *z* and assume that *z* has a total of *K* categories.

For such a moderator, (10) becomes:

$${y}_{ki}={\alpha}_{0k}+{\alpha}_{1k}{x}_{ki}+{\epsilon}_{ki},\phantom{\rule{0.16667em}{0ex}}\phantom{\rule{0.16667em}{0ex}}\phantom{\rule{0.16667em}{0ex}}{\epsilon}_{ki}~\left(0,{\sigma}^{2}\right),\phantom{\rule{0.16667em}{0ex}}\phantom{\rule{0.16667em}{0ex}}\phantom{\rule{0.16667em}{0ex}}1\le i\le {n}_{k},\phantom{\rule{0.16667em}{0ex}}\phantom{\rule{0.16667em}{0ex}}\phantom{\rule{0.16667em}{0ex}}1\le k\le K.$$

(15)

In this case, the original sample is partitioned into *K* sub-samples, each of size *n _{k}*, and a different linear relationship is postulated for each sub-sample as characterized by the different coefficients or parameters

$${H}_{0}:{\alpha}_{1k}={\alpha}_{1}\phantom{\rule{0.16667em}{0ex}}\phantom{\rule{0.16667em}{0ex}}\phantom{\rule{0.16667em}{0ex}}\text{for}\phantom{\rule{0.16667em}{0ex}}\text{all}\phantom{\rule{0.16667em}{0ex}}1\le k\le K\phantom{\rule{0.16667em}{0ex}}\phantom{\rule{0.16667em}{0ex}}\phantom{\rule{0.16667em}{0ex}}vs.\phantom{\rule{0.16667em}{0ex}}\phantom{\rule{0.16667em}{0ex}}\phantom{\rule{0.16667em}{0ex}}{H}_{a}:{\alpha}_{1j}\ne {\alpha}_{1k},\phantom{\rule{0.16667em}{0ex}}\phantom{\rule{0.16667em}{0ex}}\phantom{\rule{0.16667em}{0ex}}\text{for}\phantom{\rule{0.16667em}{0ex}}\text{some}\phantom{\rule{0.16667em}{0ex}}1\le j,k\le K.$$

(16)
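One conventional way to examine the hypothesis in (16) is to compare the fit of (15) with group-specific slopes against a reduced model with a common slope. The sketch below computes the resulting F statistic with NumPy. Using an F reference distribution assumes normal, homoscedastic errors, which is stronger than the mean-zero assumption made in the text; the function name and the synthetic data are likewise illustrative assumptions.

```python
import numpy as np

def slope_homogeneity_f(x, y, group):
    """F statistic comparing group-specific slopes (model (15)) with a common
    slope (H0 in (16)); intercepts are group-specific in both models."""
    labels = np.unique(group)
    K, n = len(labels), len(y)
    dummies = (group[:, None] == labels[None, :]).astype(float)  # n x K indicators
    full = np.hstack([dummies, dummies * x[:, None]])   # alpha_0k and alpha_1k
    reduced = np.hstack([dummies, x[:, None]])          # alpha_0k and common alpha_1

    def rss(design):
        coef, *_ = np.linalg.lstsq(design, y, rcond=None)
        resid = y - design @ coef
        return float(resid @ resid)

    rss_full, rss_reduced = rss(full), rss(reduced)
    df_num, df_den = K - 1, n - 2 * K
    return ((rss_reduced - rss_full) / df_num) / (rss_full / df_den)

# Synthetic data with K = 3 groups (hypothetical values, illustration only)
rng = np.random.default_rng(3)
group = np.repeat(np.arange(3), 100)
x = rng.normal(size=300)
noise = rng.normal(size=300)
y_moderated = 0.5 * group + np.array([1.0, 1.0, 3.0])[group] * x + noise
y_null = 0.5 * group + 1.0 * x + noise

f_moderated = slope_homogeneity_f(x, y_moderated, group)
f_null = slope_homogeneity_f(x, y_null, group)
```

A large statistic, as for `y_moderated`, indicates that the slopes differ across the categories of *z*.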

Note that sometimes it may happen that *z* changes only the intercepts without affecting the slope, i.e., *α*_{1}* _{k}* =

As in the preceding section, we consider only a categorical predictor *x* with *K* levels. Let *n _{k}* denote the sub-sample size for the

$${y}_{ki}={\alpha}_{k}({z}_{ki})+{\epsilon}_{ki},\phantom{\rule{0.16667em}{0ex}}\phantom{\rule{0.16667em}{0ex}}\phantom{\rule{0.16667em}{0ex}}{\epsilon}_{ki}~\left(0,{\sigma}^{2}\right),\phantom{\rule{0.16667em}{0ex}}\phantom{\rule{0.16667em}{0ex}}\phantom{\rule{0.16667em}{0ex}}1\le i\le {n}_{k};\phantom{\rule{0.16667em}{0ex}}\phantom{\rule{0.16667em}{0ex}}\phantom{\rule{0.16667em}{0ex}}1\le k\le K.$$

(17)

As a special case, if *α _{k}* (

Now, consider a linear *α _{k}* (

$${y}_{ki}={\gamma}_{0k}+{\gamma}_{1k}{z}_{ki}+{\epsilon}_{ki},\phantom{\rule{0.38889em}{0ex}}{\epsilon}_{ki}~\left(0,{\sigma}^{2}\right),\phantom{\rule{0.16667em}{0ex}}\phantom{\rule{0.16667em}{0ex}}\phantom{\rule{0.16667em}{0ex}}1\le i\le {n}_{k};\phantom{\rule{0.38889em}{0ex}}1\le k\le K.$$

(18)

In this case, the difference between the means of two groups, *k* and *r*, is given by:

$${\mathrm{\Delta}}_{kr}=({\gamma}_{0k}-{\gamma}_{0r})+({\gamma}_{1k}-{\gamma}_{1r})z.$$

(19)

The second term in (19) represents the differential effect of *z* on the means of the two groups and constitutes the moderation effect. If the slopes, $\gamma_{1k}$, in (18) are equal to a constant across all groups, i.e., $\gamma_{1k} = \gamma_1$ for all $1 \le k \le K$, then (18) reduces to:

$${y}_{ki}={\gamma}_{0k}+{\gamma}_{1}{z}_{ki}+{\epsilon}_{ki},\phantom{\rule{0.38889em}{0ex}}{\epsilon}_{ki}~\left(0,{\sigma}^{2}\right),\phantom{\rule{0.16667em}{0ex}}\phantom{\rule{0.16667em}{0ex}}\phantom{\rule{0.16667em}{0ex}}1\le i\le {n}_{k};\phantom{\rule{0.38889em}{0ex}}1\le k\le K.$$

(20)

The above is an analysis of covariance (ANCOVA) model and *γ*_{1}*z _{ki}* is the adjustment factor for the effect of

In real study applications, one or more *α _{k}* (

$${\alpha}_{1}(z)={\gamma}_{0},\phantom{\rule{0.38889em}{0ex}}{\alpha}_{2}(z)={\gamma}_{0}+{\gamma}_{1}{I}_{\{z\le c\}},$$

(20)

can be used to model the scenario where the effect of the dichotomous variable *x* on *y* is through a step function defined by some cut-off *c* of the moderator *z* as depicted in Figure 2 of Baron and Kenny (1986), where $I_{\{z\le c\}}$ denotes the set indicator with $I_{\{z\le c\}}$ = 1 if *z* ≤ *c* and 0 otherwise. Note that the advantage of formulating the model using the varying-coefficient model (17) is that we can use smoothing techniques to estimate the functional form of *α _{k}* (

The linear varying-coefficient model (10) is also closely related to a linear mixed-effects (LMM) or hierarchical linear model (HLM) (Laird & Ware, 1982; Gibbons et al., 1994; Raudenbush, 1994). In particular, the varying-coefficients, *α _{k}* (

As in the usual derivation of the linear mixed-effects model, at the first level, we assume a linear model with random individual effects as follows:

$${y}_{i}={\alpha}_{0i}+{\alpha}_{1i}{x}_{i}+{\epsilon}_{i},\phantom{\rule{0.16667em}{0ex}}\phantom{\rule{0.16667em}{0ex}}\phantom{\rule{0.16667em}{0ex}}{\epsilon}_{i}~\left(0,{\sigma}^{2}\right),\phantom{\rule{0.16667em}{0ex}}\phantom{\rule{0.16667em}{0ex}}\phantom{\rule{0.16667em}{0ex}}1\le i\le n,$$

(21)

In the above model, *α*_{0}* _{i}* and

$${\alpha}_{ki}={\alpha}_{k}({z}_{i})+{e}_{ki},\phantom{\rule{0.16667em}{0ex}}\phantom{\rule{0.16667em}{0ex}}\phantom{\rule{0.16667em}{0ex}}{e}_{ki}~\left(0,{\sigma}_{k}^{2}\right),\phantom{\rule{0.16667em}{0ex}}\phantom{\rule{0.16667em}{0ex}}\phantom{\rule{0.16667em}{0ex}}k=0,1,$$

(22)

where *e _{ki}* is assumed to have a mean of 0 and to be uncorrelated with both

$${y}_{i}={\alpha}_{0}({z}_{i})+{\alpha}_{1}({z}_{i}){x}_{i}+{\stackrel{~}{\epsilon}}_{i},\phantom{\rule{0.38889em}{0ex}}{\stackrel{~}{\epsilon}}_{i}~\left(0,{\sigma}_{\mathit{mix}}^{2}({x}_{i}^{2})\right),$$

(23)

where * _{i}* =

The extension to longitudinal data analysis is straightforward. As before, we only consider a continuous response *y* with a predictor *x* and moderator *z*. For convenience, we consider modeling such a response using the linear mixed-effects model, as this approach is widely used in modeling longitudinal data (e.g. Laird & Ware, 1982; Gibbons et al., 1994; Raudenbush, 1994).

Consider a longitudinal study with *n* subjects and *m* assessment points. For illustration purposes, we only consider linear growth-curve analysis in which the trajectory of each subject is modeled as a linear function of time as follows:

$${y}_{it}={\alpha}_{0}+{\alpha}_{1}{x}_{i}+{\alpha}_{2}t+{\alpha}_{3}t{x}_{i}+{b}_{0}+{b}_{1}{x}_{i}+{b}_{2}t+{b}_{3}t{x}_{i}+{\epsilon}_{i},\phantom{\rule{0.16667em}{0ex}}\phantom{\rule{0.16667em}{0ex}}\phantom{\rule{0.16667em}{0ex}}1\le i\le n,\phantom{\rule{0.16667em}{0ex}}\phantom{\rule{0.16667em}{0ex}}\phantom{\rule{0.16667em}{0ex}}1\le t\le m,$$

(24)

where *t* denotes time, *α _{k}* the fixed-effects for the population mean, and

In most applications, interest lies in whether *z* moderates treatment differences. For example, suppose that *x* is a binary indicator for two treatment conditions. The varying-coefficient linear model in this case is given by:

$$\begin{array}{l}\text{If}\phantom{\rule{0.16667em}{0ex}}{x}_{i}=0:\phantom{\rule{0.16667em}{0ex}}\phantom{\rule{0.16667em}{0ex}}\phantom{\rule{0.16667em}{0ex}}{y}_{it}={\alpha}_{0}({z}_{i})+{\alpha}_{2}({z}_{i})t+{b}_{0}+{b}_{2}t+{\epsilon}_{i},\\ \text{If}\phantom{\rule{0.16667em}{0ex}}{x}_{i}=1:\phantom{\rule{0.16667em}{0ex}}\phantom{\rule{0.16667em}{0ex}}\phantom{\rule{0.16667em}{0ex}}{y}_{it}=\left[{\alpha}_{0}({z}_{i})+{\alpha}_{1}({z}_{i})\right]+\left[{\alpha}_{2}({z}_{i})+{\alpha}_{3}({z}_{i})\right]t+{b}_{0}+({b}_{2}+{b}_{3})t+{\epsilon}_{i}.\end{array}$$

(25)

In this model, *α*_{0} (*z _{i}*) moderates the within-treatment effect,

If all the varying-coefficients are a linear function of *z*, *α _{k}* (

$$\begin{array}{l}\text{If}\phantom{\rule{0.16667em}{0ex}}{x}_{i}=0:\phantom{\rule{0.16667em}{0ex}}\phantom{\rule{0.16667em}{0ex}}\phantom{\rule{0.16667em}{0ex}}{y}_{it}={\gamma}_{00}+{\gamma}_{01}{z}_{i}+{\gamma}_{20}t+{\gamma}_{21}{z}_{i}t+{b}_{0}+{b}_{2}t+{\epsilon}_{i},\\ \text{If}\phantom{\rule{0.16667em}{0ex}}{x}_{i}=1:\phantom{\rule{0.16667em}{0ex}}\phantom{\rule{0.16667em}{0ex}}\phantom{\rule{0.16667em}{0ex}}{y}_{it}=({\gamma}_{00}+{\gamma}_{10})+({\gamma}_{01}+{\gamma}_{11}){z}_{i}+({\gamma}_{20}+{\gamma}_{30})t+({\gamma}_{21}+{\gamma}_{31}){z}_{i}t+{b}_{0}+({b}_{2}+{b}_{3})t+{\epsilon}_{i}.\end{array}$$

(26)

In most randomized trials, the mean response does not differ between treatment conditions at baseline so that *γ*_{10} = 0. In addition, if randomization works effectively, *z* should not have a differential effect at baseline, which implies that *γ*_{11} = 0. So, (26) further simplifies to:

$$\begin{array}{l}\text{If}\phantom{\rule{0.16667em}{0ex}}{x}_{i}=0:\phantom{\rule{0.16667em}{0ex}}\phantom{\rule{0.16667em}{0ex}}\phantom{\rule{0.16667em}{0ex}}{y}_{it}={\gamma}_{00}+{\gamma}_{01}{z}_{i}+{\gamma}_{20}t+{\gamma}_{21}{z}_{i}t+{b}_{0}+{b}_{2}t+{\epsilon}_{i},\\ \text{If}\phantom{\rule{0.16667em}{0ex}}{x}_{i}=1:\phantom{\rule{0.16667em}{0ex}}\phantom{\rule{0.16667em}{0ex}}\phantom{\rule{0.16667em}{0ex}}{y}_{it}={\gamma}_{00}+{\gamma}_{01}{z}_{i}+({\gamma}_{20}+{\gamma}_{30})t+({\gamma}_{21}+{\gamma}_{31}){z}_{i}t+{b}_{0}+({b}_{2}+{b}_{3})t+{\epsilon}_{i}.\end{array}$$

(27)

In the above model, the treatment by time interaction, *γ*_{30}, represents treatment difference over time in the absence of the moderator *z* (when *z* = 0), while the treatment by time by moderator interaction, *γ*_{31}, represents the moderation effect of *z* on the treatment difference. The model in (27) and its generalizations for multiple treatment conditions are widely used in testing moderation effect in longitudinal studies.
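As a rough sketch of how the fixed-effects part of (27) maps onto a design matrix, the Python snippet below builds the six columns and recovers the coefficients by ordinary least squares. Several simplifications are assumptions made here: the random effects in (27) are omitted, time is coded from 0 so that baseline corresponds to *t* = 0, and the data and coefficient values are synthetic.

```python
import numpy as np

def fixed_effects_design(x, z, t):
    """Columns for gamma_00, gamma_01, gamma_20, gamma_21, gamma_30, gamma_31
    in (27); random effects are deliberately left out of this sketch."""
    return np.column_stack([np.ones_like(t), z, t, z * t, x * t, x * z * t])

# Long-format synthetic data: one row per subject and time point
n_subjects, n_times = 40, 4
rng = np.random.default_rng(2)
x_subj = (np.arange(n_subjects) % 2).astype(float)  # binary treatment indicator
z_subj = rng.normal(size=n_subjects)                # baseline moderator
x = np.repeat(x_subj, n_times)
z = np.repeat(z_subj, n_times)
t = np.tile(np.arange(n_times, dtype=float), n_subjects)

gamma_true = np.array([1.0, 0.5, -0.2, 0.1, -0.3, 0.4])  # hypothetical values
design = fixed_effects_design(x, z, t)
y = design @ gamma_true  # noise-free response

gamma_hat, *_ = np.linalg.lstsq(design, y, rcond=None)
moderation_effect = gamma_hat[5]  # treatment by time by moderator coefficient
```

In a real longitudinal analysis the same columns would enter a linear mixed-effects model, so that within-subject correlation is accounted for.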

As in the case of cross-sectional study designs, inference for varying-coefficient models can still be made using the usual estimation procedures when *z* or *x* or both are categorical variables. For example, we can use standard estimation procedures to fit the model in (27). When both *z* and *x* are continuous, inference becomes much more complex and smoothing methods may be used. Again, this issue will not be pursued here. Fortunately, in many randomized studies, treatment differences are modeled by binary indicators, in which case standard procedures can be used to fit linear mixed-effects models with varying-coefficients.

We illustrate the proposed methodology with real study data from the National Institute on Drug Abuse Collaborative Cocaine Treatment Study (Crits-Christoph et al., 1999). This randomized, multi-center project investigated the efficacy of psychosocial treatment for cocaine dependence, with a sample of 487 patients who were randomized to one of four treatment conditions: cognitive therapy (CT) plus group drug counseling (GDC), supportive-expressive (SE) plus GDC, individual drug counseling (IDC) plus GDC, and GDC alone. Primary outcome analyses focused on the intent-to-treat sample and examined several measures of drug use (Crits-Christoph et al., 1999).

For illustration purposes, we applied the proposed methodology to data at six months post-treatment using the Addiction Severity Index (ASI; McLellan et al., 1992) Drug Use composite variable as the response variable. As a significant treatment difference was found among the four treatment groups, it was of interest to examine whether the treatment differences were moderated by baseline alcohol consumption as measured by the ASI alcohol use composite.

Let *y* denote the drug use composite variable at six months post-treatment and *z* the pre-treatment alcohol use composite variable. We applied the ANOVA model with varying coefficients in (17) to examine the effect of moderation by *z*. To determine the appropriate analytic form for modeling the mean response of each group *α _{k}* (

The fitted LOWESS curves indicated a quadratic mean response for the SE group, but a linear response for each of the other three groups. Thus, to formally test for moderation by *z*, we fitted the following quadratic response model:

$${y}_{ki}={\gamma}_{k0}+{\gamma}_{1}{z}_{ki}+{\gamma}_{2}{{z}^{2}}_{ki}+\sum _{l=1}^{3}{\delta}_{l}{z}_{ki}{I}_{\mathit{kil}}+\sum _{l=1}^{3}{\eta}_{l}{z}_{ki}^{2}{I}_{\mathit{kil}}+{\epsilon}_{ki},\phantom{\rule{0.16667em}{0ex}}\phantom{\rule{0.16667em}{0ex}}\phantom{\rule{0.16667em}{0ex}}1\le k\le 4,$$

(28)

where *γ’s*, *δ’s* and *η’s* are model parameters, and *k* = 1,2,3,4 denote the IDC, CT, SE and GDC treatment groups, respectively. For robust inference, we did not assume normality for *ε _{ki}* and estimated the parameters using estimating equations or quasi-likelihood (e.g. McCullagh & Nelder, 1989).

Shown in Table 1 are the estimated parameters and the associated p-values. The estimated coefficients for the first- and second-order interactions are statistically significant only for the SE group (see estimates of *δ*_{3} and *η*_{3} and their associated p-values), indicating that, unlike the other groups, the response for SE had a quadratic relationship with the moderator. Thus, pre-treatment alcohol use was a moderator, as it affected treatment response differentially between this and the other three treatment conditions.

It is interesting to note that when we applied the linear coefficient model (18), which omits the second-order interactions, none of the coefficients were significantly different from 0 (see the estimates and associated p-values in Table 1). Thus, had we examined only the first-order interactions, as is traditionally done, we would not have detected any moderation effect in this case. In this application, the varying-coefficient model (17) helped identify the correct analytic interactions for modeling the effect of moderation by *z*.
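A stylized example shows one way a first-order interaction can miss a quadratic moderation effect. The sketch below uses a noise-free, two-group toy setup with a moderator that is symmetric about zero. These are assumptions chosen to make the algebra exact, not a description of the study data, and all variable names are hypothetical. With a symmetric *z*, the linear and quadratic terms are uncorrelated, so a model with only a *z*-by-group product recovers no interaction at all, while adding the *z*²-by-group product recovers the effect.

```python
import numpy as np

z = np.linspace(-1.0, 1.0, 41)             # moderator values, symmetric about 0 (assumption)
g = np.repeat([0.0, 1.0], z.size)          # 0 = reference group, 1 = comparison group
zz = np.tile(z, 2)                         # the same z values in both groups
y = 1.0 + 0.5 * zz + 2.0 * g * zz**2       # comparison group has an extra quadratic term

# First-order interaction only: intercept, group, z, group * z.
X_lin = np.column_stack([np.ones_like(zz), g, zz, g * zz])
b_lin = np.linalg.lstsq(X_lin, y, rcond=None)[0]

# First- and second-order interactions: adds z^2 and group * z^2.
X_quad = np.column_stack([np.ones_like(zz), g, zz, zz**2, g * zz, g * zz**2])
b_quad = np.linalg.lstsq(X_quad, y, rcond=None)[0]

delta_linear = b_lin[3]                    # first-order interaction, linear-only model
delta_quad, eta_quad = b_quad[4], b_quad[5]
print("linear-only model, z-by-group coefficient:", delta_linear)
print("quadratic model, z-by-group and z^2-by-group coefficients:", delta_quad, eta_quad)
```

In the linear-only fit, the quadratic group difference is absorbed into the group main effect and the residuals, which is why the product term carries no signal here. With real data the moderator is rarely exactly symmetric, so the first-order term would typically pick up part of the effect rather than none of it.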

In this paper, we discussed a general analytic framework for moderation analysis by defining moderation as a process that modifies an existing relationship between the predictor and outcome. As illustrated by both theoretical considerations and real data analyses, moderation can follow quite a complex process, which may not be modeled by simply including analytic interactions involving the moderator and predictor as in most moderation analyses. Since the relationship between the response and predictor is defined by the coefficients or parameters of a given model, it is logical to define the effect of a moderator through such model parameters. Thus, the proposed approach is consistent with the conceptual definition of a moderation process.

Although a moderation effect often takes the form of an analytic interaction, especially in linear regression models, not all such interactions can be interpreted as moderation effects. By defining the moderation effect through the varying-coefficient model, we are able to distinguish the types of analytic interaction that have a moderation interpretation from those that do not. Also, since moderation models are defined based on the original model relating the response and predictor, they are consistent and readily interpretable. Thus, the model-dependence issue pointed out by Kraemer et al. (2001) does not arise. For example, if the original relationship between the response and predictor is a linear model, the effect of a moderator is limited to modifying the coefficients of the linear model, ruling out other types of models, such as the logistic model, as potential candidates for moderation analysis.

Since our goal in this paper was to present an appropriate analytic framework for moderation analysis, we did not get into technical details about inference for general varying-coefficient regression models. When there are multiple continuous predictors and moderators, inference for such models may become quite complex, especially with longitudinal study data. We will address these issues in future research.

We would like to thank an anonymous reviewer and Editor Chao for helpful and constructive comments that led to substantial improvement in the presentation of the research. This research was funded in part by National Institute of Mental Health grants P50-MH-45178, U01-DA07090, P30-MH-45178, U01-DA07663, U01-DA07673, U01-DA07693, U01-DA07085, and R01-DA012249.

Wan Tang, University of Rochester, Rochester, NY 14642, USA.

Qin Yu, University of Rochester, Rochester, NY 14642, USA.

Paul Crits-Christoph, University of Pennsylvania, Philadelphia, PA 19104, USA.

Xin M. Tu, University of Rochester, Rochester, NY 14642, USA.

- Aiken LS, West SG. Multiple regression: Testing and interpreting interactions. Newbury Park, CA: Sage; 1991.
- Baron RM, Kenny DA. The moderator-mediator variable distinction in social psychological research: concept, strategic and statistical considerations. Journal of Personality and Social Psychology. 1986;51:1173–1182. [PubMed]
- Bollen KA. Structural equations with latent variables. Wiley; New York: 1989.
- Carroll RJ, Ruppert D, Welsh AH. Local estimating equations. Journal of the American Statistical Association. 1998;93:214–227.
- Chaplin WF. The next generation of moderator research in personality psychology. Journal of Personality. 1991;59:143–178. [PubMed]
- Cole DA, Maxwell SE. Testing mediational models with longitudinal data: questions and tips in the use of structural equation modeling. Journal of Abnormal Psychology. 2003;112:558–577. [PubMed]
- Crits-Christoph P, Siqueland L, Blaine J, Frank A, Luborsky L, Onken LS, et al. Psychosocial treatments for cocaine dependence: National Institute on Drug Abuse Collaborative Cocaine Treatment Study. Archives of General Psychiatry. 1999;56:493–502. [PubMed]
- Davidian M, Giltinan DM. Nonlinear Models for Repeated Measurement Data. Chapman and Hall/CRC; Boca Raton, FL: 1995.
- Fan J, Zhang W. Statistical Estimation in varying-coefficient models. Annals of Statistics. 1999;27:1491–1518.
- Fox J. Nonparametric simple regression: smoothing scatterplots. Thousand Oaks CA: Sage; 2000.
- Gibbons RD, Hedeker D, Charles SC, Frisch PR. A random-effects probit model for predicting medical malpractice claims. Journal of the American Statistical Association. 1994;89:760–767.
- Hart JD. Nonparametric smoothing and lack-of-fit tests. Springer-Verlag; New York: 1997.
- Hastie TJ, Tibshirani RJ. Varying-coefficient Models. Journal of the Royal Statistical Society Ser B. 1993;55:757–796.
- Holmbeck GN. Toward terminological, conceptual and statistical clarity in the study of mediators and moderators: examples from the child-clinical and pediatric psychology literatures. Journal of Consulting and Clinical Psychology. 1997;65:599–610. [PubMed]
- Kraemer HC, Stice E, Kazdin A, Offord D, Kupfer D. How do risk factors work together? Mediators, moderators, and independent, overlapping, and proxy risk factors. American Journal of Psychiatry. 2001;158:848–856. [PubMed]
- Kraemer HC, Wilson T, Fairburn CG, Agras WS. Mediators and moderators of treatment effects in randomized clinical trials. Archives of General Psychiatry. 2002;59:877–883. [PubMed]
- Krull JL, MacKinnon DP. Multilevel modeling of individual and group level mediated effects. Multivariate Behavioral Research. 2001;36:249–277.
- Laird N, Ware J. Random-effects models for longitudinal data. Biometrics. 1982;38:963–974. [PubMed]
- Loader C. Local regression and likelihood. Springer; New York: 1999.
- McCullagh P, Nelder JA. Generalized linear models. 2. Chapman and Hall; London: 1989.
- McLellan AT, Kushner H, Metzger D, Peters R, Smith I, Grissom G, et al. The Fifth Edition of the Addiction Severity Index. Journal of Substance Abuse Treatment. 1992;9:199–213. [PubMed]
- Neter J, Wasserman W, Kutner MH. Applied statistical models. 3. IL: Irwin; 1990.
- Raudenbush SW. Random effects models. In: Cooper H, Hedges LV, editors. The handbook of research synthesis. Russell Sage Foundation; New York: 1994.
- Robins JM, Rotnitzky A, Zhao LP. Analysis of semiparametric regression models for repeated outcomes in the presence of missing data. Journal of the American Statistical Association. 1995;90:106–121.
- Rogosch F, Chassin L, Sher KJ. Personality variables as mediators and moderators of family history risk for alcoholism: conceptual and methodological issues. Journal of Studies on Alcohol. 1990;51:310–318. [PubMed]
- Rothman KJ, Greenland S. Modern epidemiology. Lippincott Williams and Wilkins; Philadelphia: 1998.
- Tu XM, Kowalski J, Jia G. Bayesian analysis of prevalence with covariates using simulation-based techniques: applications to HIV screening. Statistics in Medicine. 1999;18:3059–3073. [PubMed]
