Herpes simplex virus type 1 (HSV-1) is known to cause diseases of various severities. There is increasing interest in finding drug combinations that treat HSV-1 while reducing drug resistance and cytotoxicity. Drug combinations offer potentially higher efficacy and lower individual drug dosages. In this paper, we report a new application of fractional factorial designs to investigate a biological system with HSV-1 and six antiviral drugs, namely, Interferon-alpha, Interferon-beta, Interferon-gamma, Ribavirin, Acyclovir, and TNF-alpha. We show how the sequential use of two- and three-level fractional factorial designs can screen for important drugs and drug interactions, as well as determine potential optimal drug dosages through the use of contour plots. Our initial experiment using a two-level fractional factorial design suggests that there is model inadequacy and that drug dosages should be reduced. A follow-up experiment using a blocked three-level fractional factorial design indicates that TNF-alpha has little effect and that HSV-1 infection can be suppressed effectively by using the right combination of the other five antiviral drugs. These observations have practical implications for understanding antiviral drug mechanisms, which can result in better design of antiviral drug therapy.
Herpes simplex virus (HSV) is known to cause diseases of various severities, including mucocutaneous diseases, neonatal herpes, and herpes encephalitis. Recent reports also suggest that HSV infection could strongly increase the risk of HIV infection. HSV has become one of the most common sexually transmitted diseases in the U.S.A., the U.K., France, and other Western societies [3, 4, 5]. Furthermore, HSV encephalitis is the most common form of fatal encephalitis in the U.S., occurring in about two per 100,000 persons yearly. Many therapeutic agents, both pharmaceutical and chemical, have been developed to treat HSV infections [7, 8]. While these agents are generally effective, drug resistance and toxicity concerns have been increasingly reported [9, 10]. To reduce the emergence of drug-resistant viral mutants and cytotoxicity, combinations of different antiviral drugs have been widely used. For two drugs, effective drug combinations can be found using nonlinear modeling approaches [12, 13]. Drug combinations have often been reported to have higher efficacy and lower individual drug dosages.
Many challenges and complexities arise when trying to understand a system with multiple drugs (e.g., three or more drugs) because the underlying biological system is intrinsically complex and there are potentially multiple drug interactions. For example, to study six antiviral drugs, each with seven dosage levels, there are 7⁶ = 117,649 different drug combinations to be tested. Testing all possible drug combinations is time-consuming and labor-intensive. Researchers have developed a feedback system control to identify optimal drug combinations with five to 10 drugs at six or more dosage levels [14, 15, 16]. The feedback system control technique combines two parts, 1) a biological experiment and 2) a search algorithm, into a feedback loop. It is a rapid platform and usually identifies optimal drug combinations in fewer than 15 iterations by testing 1% or less of the total search space. Yet, with such an approach it is challenging to quantify drug contributions and drug interactions.
Here, we introduce an alternative approach that uses fractional factorial designs in place of the search algorithm in the feedback system control. Fractional factorial designs are an effective and commonly used tool in scientific investigations and industrial applications. Many successful applications have been reported in the physical and chemical sciences and in engineering. Many textbooks on experimental design, such as [18, 19, 20, 21, 22, 23], provide various real applications. However, fractional factorial designs have yet to make an impact in bioscience, particularly in virology. A main advantage of fractional factorial designs is that they enable us to build statistical models with a small number of runs. Using these models, we can not only identify important drugs and drug interactions but also predict optimal drug combinations.
In this paper, we present one of the first uses of fractional factorial designs in the area of virology by sequentially using two- and three-level fractional factorial designs to investigate a biological system with Herpes simplex virus type 1 (HSV-1) and six antiviral drugs: Interferon-alpha (A), Interferon-beta (B), Interferon-gamma (C), Ribavirin (D), Acyclovir (E), and TNF-alpha (F). The experiments were conducted at the UCLA Micro Systems Laboratories. We show that our approach successfully identifies that Ribavirin (D) has the largest effect on minimizing the virus load and TNF-alpha (F) has the smallest effect on minimizing the virus load.
The paper is organized as follows. In Section 2, we first provide a brief overview of two-level factorial and fractional factorial designs, which are widely used in early stages of experiments to screen important factors from a large number of potential factors. We then describe our initial antiviral drug experiment using a two-level fractional factorial design and perform data analysis. We demonstrate how this two-level experiment helps us understand the HSV-1 system and identify potential effective drug combinations. Section 3 describes the follow-up experiment using a three-level blocked fractional factorial design when there is evidence of model inadequacy in the two-level experiment. In addition, we report our analysis results and show how we determine the optimal drug levels using contour plots. Section 4 contains conclusions.
Factorial designs are very efficient for studying two or more factors. The effect of a factor can be defined as the change in response produced by a change in the level of the factor. This is referred to as the main effect. In some experiments, it may be found that the difference in the response between levels of one factor is not the same at all levels of the other factors. This is referred to as an interaction effect between factors. Collectively, main effects and interaction effects are called the factorial effects. A full factorial design can estimate all main effects and higher-order interactions.
Another way to define the concept of main effects and interaction effects for two-level designs is using a regression model. Suppose we have a full factorial design studying the six antiviral drugs: A, B, C, D, E, and F with two levels for each drug. There are 2⁶ = 64 treatments or level combinations. A common regression model for studying main effects and interactions is

$$ y = \beta_0 + \sum_{i=1}^{6} \beta_i x_i + \sum_{i<j} \beta_{ij} x_i x_j + \cdots + \beta_{12\cdots6}\, x_1 x_2 \cdots x_6 + \varepsilon. \qquad (1) $$
Here y is the response, the β’s are unknown parameters, x1, …, x6 represent drugs A–F, and ε is a random error term. The variables x1, …, x6 are coded as 1 and −1 for the high and low levels of their respective factors. The interaction between x1 and x2 is denoted as x1x2, and the other interaction effects are similarly defined. It is well known that the least squares estimates of the β’s in model (1) are half of the corresponding factorial effects.
Using a full factorial design with 64 runs for all six drugs, we can estimate 6 main effects, 15 two-factor interactions, 20 three-factor interactions, 15 four-factor interactions, 6 five-factor interactions, and 1 six-factor interaction. Note that out of the 63 (2⁶ − 1) degrees of freedom in the 64-run (2⁶) design, 42 are used for estimating three-factor or higher interactions. However, in many experiments, three-factor and higher-order interactions are usually not important. This means that we are using over half of the degrees of freedom to estimate effects that are potentially not significant. Therefore, using a full factorial design to study six drugs in 64 runs is quite wasteful. A more practical and economical approach is to use a fractional factorial design that allows the estimation of lower-order effects.
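The degrees-of-freedom accounting above can be checked with binomial coefficients. The following is a minimal sketch; the variable names are ours and are not part of the original analysis.

```python
from math import comb

n_factors = 6

# Number of k-factor effects among six two-level factors is C(6, k).
counts = {order: comb(n_factors, order) for order in range(1, n_factors + 1)}

total_df = sum(counts.values())  # equals 2**6 - 1
high_order_df = sum(c for order, c in counts.items() if order >= 3)

for order, c in counts.items():
    print(f"{order}-factor effects: {c}")
print("total degrees of freedom:", total_df)
print("used by three-factor and higher interactions:", high_order_df)
```

The six counts add up to the 63 available degrees of freedom, of which 42 belong to three-factor and higher interactions.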
For economic reasons, we study the six drugs in the antiviral drug experiment introduced in Section 1 in 2⁶⁻¹ = 32 runs, a half fraction of the full 2⁶ design. To construct such a design, we write down all 2⁵ possible level combinations for drugs A, B, C, D, and E, and then set the level of drug F as the product of the levels of drugs A, B, C, D, and E, that is, F = ABCDE. Note that the low and high levels are coded as −1 and 1. The price we pay for using a half-fraction design is that the main effect of F is aliased with the ABCDE interaction because they are identical in the model. Additionally, there is aliasing among other effects. Indeed, each main effect is aliased with a five-factor interaction: A = BCDEF, B = ACDEF, C = ABDEF, D = ABCEF, and E = ABCDF. Each two-factor interaction is aliased with a four-factor interaction, i.e., AB = CDEF, AC = BDEF, …, EF = ABCD, and each three-factor interaction is aliased with another three-factor interaction, i.e., ABC = DEF, ABD = CEF, …, AEF = BCD. To disentangle these effects, a common and reasonable assumption is that higher-order interactions are negligible because they are less likely to be important than lower-order interactions. This means that for this fractional factorial design we can estimate all main effects and all two-factor interactions assuming that four-factor and higher interactions are negligible, which is quite reasonable in practice. Furthermore, because every three-factor interaction is aliased with another three-factor interaction, we can only estimate their sum.
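The construction and aliasing pattern described above can be illustrated in a few lines. This is a sketch rather than the authors' code: the helper names `column` and `dot` are ours, and the run order is whatever `itertools.product` yields, not the order used in Table 1.

```python
from itertools import combinations, product

factors = "ABCDEF"

# All 2^5 level combinations of A-E, with F set by the generator F = ABCDE.
runs = [(a, b, c, d, e, a * b * c * d * e)
        for a, b, c, d, e in product((-1, 1), repeat=5)]

def column(word):
    """Contrast column: elementwise product of the factor columns in `word`."""
    col = []
    for run in runs:
        value = 1
        for ch in word:
            value *= run[factors.index(ch)]
        col.append(value)
    return col

def dot(u, v):
    return sum(x * y for x, y in zip(u, v))

# Aliased effects have identical contrast columns.
aliased = [("A", "BCDEF"), ("AB", "CDEF"), ("ABC", "DEF"), ("F", "ABCDE")]
alias_ok = all(column(w1) == column(w2) for w1, w2 in aliased)

# The 6 main effects and 15 two-factor interactions are mutually orthogonal,
# so all 21 can be estimated independently of one another.
words = list(factors) + ["".join(p) for p in combinations(factors, 2)]
orthogonal = all(dot(column(w1), column(w2)) == 0
                 for w1, w2 in combinations(words, 2))

print(len(runs), alias_ok, orthogonal)
```

Identical columns are what "aliased" means operationally: the data cannot distinguish the two effects, so only their sum is estimable.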
Effect aliasing is a consequence of using a fractional factorial design. A related concept is resolution, which captures the amount of aliasing. This half-fraction design has resolution VI, which allows the estimation of all main effects and two-factor interactions under the assumption that four-factor and higher interactions are negligible. In general, the higher the resolution of a fractional factorial design, the less restrictive the assumption is regarding which interactions are negligible to obtain a unique interpretation of the data. See Wu and Hamada for more details on aliasing and resolution for fractional factorial designs.
Table 1 gives the design and data for the initial experiment, where the two levels are coded as −1 and 1. Each run represents a combinatorial drug treatment, and the outcome, called the readout, is the percentage of virus-infected cells after the treatment. The first 32 runs correspond to the half-fraction design obtained by setting F = ABCDE. Following the typical practice for two-level designs, we add three replicated runs (the last three runs) at the center (0). The addition of replicated center points allows an independent estimate of error to be obtained without affecting the estimates of the factorial effects. Generally, three to five center runs are recommended. Using these center points, we can obtain an estimate of the variability and conduct a lack-of-fit test.
We now provide technical details of the antiviral drug experiment, where NIH 3T3 cells were chosen as host cells. Cells were initially cultured on 15mm plates covered with 25mL culture medium. The culture medium was made from DMEM in the presence of 10% Fetal Bovine Serum (FBS) and 1% Penicillin-Streptomycin (Pen-Strep). The 15mm plates were maintained in a 37°C incubator filled with 5% CO2. Cultures were propagated twice, at 10⁷ cells/plate every 24 hours, before use in the experiment.
Cell infection was carried out in 24-well plates. Each well contained 2 × 10⁵ cells in 1mL culture medium. Cells were allowed to grow for four hours before viral infection and drug treatments occurred. Drug combinations were added simultaneously with HSV-1 to the host cells in 24-well plates. The plates were incubated in a 37°C incubator with 5% CO2 for 16 hours. The virus was engineered to carry the green fluorescent protein (GFP) gene. Thus, cells infected with the virus would be GFP positive. GFP served as a biomarker to assess the percentage of infected cells. The readout was defined as the percentage of GFP-positive cells after combinatorial drug treatments. The readout was measured with a flow cytometer (BD FACSCanto II, BD Biosciences).
Table 2 shows the actual dosage levels for the six drugs. Before this study, we performed single-drug pilot studies and tested a wide range of dosages for each drug in order to find the “minimum response dosage”, at which the drug started to show some efficacy, and the “plateau dosage”, at which the drug’s efficacy did not increase when a higher dosage was used. The pilot study suggested that the minimum response dosage was about 16 times diluted from the plateau dosage. In this study, we chose the plateau dosage as the high level (coded as 1) and the minimum response dosage as the low level (coded as −1). A center level (coded as 0) was added for the additional runs at the center of the factorial design. The center level is four times diluted from the high level, and the low level is four times diluted from the center level.
As explained in Section 2.2, our design can estimate all six main effects, all 15 two-factor interactions, and 10 pairs of aliased three-factor interactions, assuming that four-factor and higher interactions are negligible. In the analysis, we use y = log(readout), i.e., log base 10 of the viral infection load, as the response because the distribution of the viral infection readouts is positively skewed. The choice of the log transformation is also supported by the Box-Cox method. Table 3 presents the least squares estimates, the sum of squares, and the percentage of total sum of squares. The sum of squares of an effect here is simply 32 times the square of its estimate.
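Because the contrast columns of the 32-run design are mutually orthogonal with entries ±1, each least squares estimate is the contrast-weighted mean of the response, and its sum of squares is 32 times the squared estimate. The sketch below illustrates this on a synthetic, noise-free response; the coefficients in `true_beta` are invented for illustration and are not the values in Table 3.

```python
from itertools import combinations, product

factors = "ABCDEF"
# 2^(6-1) half fraction with generator F = ABCDE, levels coded -1/+1.
runs = [(a, b, c, d, e, a * b * c * d * e)
        for a, b, c, d, e in product((-1, 1), repeat=5)]
n = len(runs)  # 32

def contrast(word, run):
    """Product of the coded levels of the factors named in `word`."""
    value = 1
    for ch in word:
        value *= run[factors.index(ch)]
    return value

# Synthetic response (illustrative only, NOT the experimental data).
true_beta = {"": 1.0, "D": -0.30, "E": 0.10, "AD": 0.05}
y = [sum(b * contrast(w, run) for w, b in true_beta.items()) for run in runs]

words = list(factors) + ["".join(p) for p in combinations(factors, 2)]
beta_hat = {w: sum(contrast(w, r) * yi for r, yi in zip(runs, y)) / n
            for w in words}
effect = {w: 2 * b for w, b in beta_hat.items()}       # factorial effect = 2 * estimate
sum_sq = {w: n * b ** 2 for w, b in beta_hat.items()}  # SS = 32 * estimate^2

y_bar = sum(y) / n
total_ss = sum((yi - y_bar) ** 2 for yi in y)
share = {w: 100 * s / total_ss for w, s in sum_sq.items()}

for w in ("D", "E", "AD"):
    print(w, round(beta_hat[w], 3), round(sum_sq[w], 3), round(share[w], 1))
```

With real data the residual and three-factor terms would absorb the remainder of the total sum of squares; here the three active terms account for all of it by construction.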
Table 3 suggests that the effects of drugs D and E are the largest. The linear effect of drug D is the most significant, with an estimate three times that of the next most significant drug, E, showing that drug D is very important relative to the other drugs. Together, drugs D and E account for 75.3% of the total sum of squares in the data. Overall, the six main effects contribute 81.5% of the sum of squares, the fifteen two-factor interactions contribute 6.8%, the ten pairs of three-factor interactions contribute 3.2%, and the residuals account for 8.3%. In this antiviral experiment the main effects dominate the system, and drug D alone accounts for 68.0% of the total sum of squares. Such a finding is similar to many engineering experiments where fractional factorial designs are successfully used. This observation gives us confidence that fractional factorial designs can be useful for studying cellular systems under multiple drug stimulations.
We observe that all of the estimates for drugs A–F have positive coefficients except for drug D. The practical implication is that in our experiment the minimum viral infection can be achieved when we set drugs A, B, C, E, and F at the low level and D at the high level. Accordingly, we would decrease the dosage for drugs A, B, C, E, and F and increase the dosage for D. However, while drug D is an effective antiviral drug, it often induces unacceptable levels of toxic side effects in subjects. To screen for less toxic drug combinations, we reduce all of the drug dosages in a follow-up experiment.
We use the data from the three independent center runs to test for lack-of-fit. The lack-of-fit test is also known as a check for curvature [21, 22] for a two-level factorial design. The residuals sum of squares in Table 3 can be decomposed into two parts: lack-of-fit and pure error, with one and two degrees of freedom, respectively. The lack-of-fit test presented in Table 4 shows that lack-of-fit is very significant with an F value of 272.46 and a p-value of 0.0037. This implies that the relationship between the response and the drugs is nonlinear; therefore, we need additional levels and runs to model the nonlinear relationship.
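The reported p-value can be reproduced from the F value without statistical tables. With 1 and 2 degrees of freedom, the F statistic is the square of a t statistic with 2 degrees of freedom, whose distribution function has the closed form 1/2 + t / (2√(t² + 2)). The sketch below relies on that identity; the function name is ours, and the formula applies only to this (1, 2) case.

```python
from math import sqrt

def f_pvalue_df_1_2(f_value):
    """Upper-tail p-value of an F(1, 2) statistic.

    Uses P(|T| > t) = 1 - t / sqrt(t^2 + 2) for T ~ t with 2 df, and F = t^2.
    """
    return 1.0 - sqrt(f_value / (f_value + 2.0))

p = f_pvalue_df_1_2(272.46)
print(round(p, 4))  # about 0.0037, in line with the value quoted from Table 4
```

For other degrees of freedom a statistics library would be needed in place of this closed form.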
Summarizing the results, lower drug dosages for all drugs are desirable in order to reduce potential drug toxicity and minimize viral infection. In addition, we need to add a third level for each drug to model the nonlinear relationship. This naturally leads to a three-level design.
Two-level designs are commonly used to screen factors in the initial stage given a small number of runs. Three-level designs are widely used in practice to study nonlinear relationships for quantitative factors. In the follow-up experiment, we use three levels for each drug. Table 5 shows the drug dosage levels for the follow-up three-level experiment, where the three levels are denoted as low (0), intermediate (1), and high (2). The highest concentration level for the follow-up experiment is the middle concentration level for the initial two-level experiment. Similar to the two-level experiment, the intermediate level is 16 times diluted from the high level, and the low level is no drug.
One possible design to consider for the follow-up experiment is a resolution VI, 3⁶⁻¹ design, which has 243 runs and enables the estimation of all main effects and two-factor interactions. However, this design is not feasible in practice because of the large number of runs required. Instead, we employ an 81-run design, a one-ninth fraction of the 3⁶ design, denoted as a 3⁶⁻² design. First, the design is constructed by choosing the column for factor E to be equal to column A + column B + column C + column D (mod 3); that is, every entry in column E is the sum of the levels of the first four factors modulo 3. Here modulo 3 means that any multiple of 3 equals zero. Second, we choose F to be equal to column A + 2(column B) + column C (mod 3). The conventional notation for such a design is E = ABCD and F = AB²C, which are called the generators of this particular three-level design. If x1, …, x6 represent the six factors, then E = ABCD is equivalent to x5 = x1 + x2 + x3 + x4 (mod 3), and F = AB²C is equivalent to x6 = x1 + 2x2 + x3 (mod 3). This design has resolution IV; therefore, no main effect is aliased with any two-factor interaction, and some two-factor interactions are aliased with other two-factor interactions. Moreover, this 81-run design can estimate each of the main effects as well as some of the two-factor interactions.
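The two generators translate directly into modular arithmetic. The following sketch builds the 81 runs and checks the balance properties implied by resolution IV; the helper `balanced` is ours, and the run order is not that of Table 6.

```python
from collections import Counter
from itertools import combinations, product

# Full 3^4 factorial in x1..x4, levels coded 0, 1, 2.
runs = []
for x1, x2, x3, x4 in product(range(3), repeat=4):
    x5 = (x1 + x2 + x3 + x4) % 3   # generator E = ABCD
    x6 = (x1 + 2 * x2 + x3) % 3    # generator F = AB^2C
    runs.append((x1, x2, x3, x4, x5, x6))

def balanced(cols):
    """True if every level combination of the given columns occurs equally often."""
    counts = Counter(tuple(run[c] for c in cols) for run in runs)
    return len(counts) == 3 ** len(cols) and len(set(counts.values())) == 1

pairs_ok = all(balanced(p) for p in combinations(range(6), 2))
triples_ok = all(balanced(t) for t in combinations(range(6), 3))

print(len(runs), pairs_ok, triples_ok)
```

Balance over every set of three columns is what keeps main effects clear of two-factor interactions in this design.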
However, there are practical issues with this design. The antiviral drug experiments require cell culture preparation and the manual addition of virus and drug combinations. It is also not practical to perform the 81-run design using a single batch of cell culture. Experience shows that there is substantial batch-to-batch variation arising from the nature of cells. These systematic sources of variation are intrinsic and are independent of the researcher or the equipment. If such an issue is not addressed carefully, the precision of the experiment can be reduced greatly by these systematic sources of variation. Blocking is a useful way to reduce the influence of these systematic sources of variation by arranging homogeneous experimental runs into groups. There are many recent studies on the optimal choices of blocking schemes for fractional factorial designs; see, among others, [24, 25, 26, 27, 28, 29, 30]. However, real applications of blocked fractional factorial designs are limited in the biomedical sciences. Considering the experimental capacity and time, we divide the 81 runs into three blocks, each of size 27. Each block uses one batch of cell culture, and the runs within a block are randomized. In particular, we arrange the 3⁶⁻² design into 3 blocks with the block generator block = AC²D, or equivalently block = x1 + 2x3 + x4 (mod 3). Because the block effect and the three-factor interaction have the same estimate, the block effect is said to be confounded with the three-factor interaction effect. With this arrangement the main effects and two-factor interactions are not confounded with the block effects, and therefore they can be estimated efficiently. Table 6 gives the design and data of the experiment.
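The blocking scheme can be sketched in the same modular-arithmetic style. The code below rebuilds the 81-run design, assigns blocks with block = x1 + 2x3 + x4 (mod 3), and checks that blocks are balanced against every factor and every pair of factors; variable names are ours, and randomization within blocks is not shown.

```python
from collections import Counter
from itertools import combinations, product

design = []
for x1, x2, x3, x4 in product(range(3), repeat=4):
    x5 = (x1 + x2 + x3 + x4) % 3       # E = ABCD
    x6 = (x1 + 2 * x2 + x3) % 3        # F = AB^2C
    block = (x1 + 2 * x3 + x4) % 3     # block generator AC^2D
    design.append(((x1, x2, x3, x4, x5, x6), block))

block_sizes = Counter(block for _, block in design)

def balanced_with_block(cols):
    """True if each (levels of cols, block) combination occurs equally often."""
    counts = Counter((tuple(run[c] for c in cols), block) for run, block in design)
    return len(counts) == 3 ** (len(cols) + 1) and len(set(counts.values())) == 1

mains_clear = all(balanced_with_block((i,)) for i in range(6))
pairs_clear = all(balanced_with_block(p) for p in combinations(range(6), 2))

print(dict(block_sizes), mains_clear, pairs_clear)
```

Equal frequencies of every factor-pair combination within each block are consistent with the statement that main effects and two-factor interactions are not confounded with blocks.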
To analyze the data, we fit a second-order model with the addition of the block effects:
where βi represents the linear effect of xi, βii represents the quadratic effect of xi, and βij represents the bilinear (i.e., linear-by-linear) interaction between xi and xj. For convenience, in the analysis, the three dosage levels (0, 1, and 2) are encoded as −1, 0, 1, respectively. The variables, block1 and block2, are indicators of blocks 1 and 2, respectively, with block 0 as a reference. As explained earlier, the blocking variables are not correlated with other variables in model (2).
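Written out from the definitions above, a second-order model with block effects takes the following general form. This is a sketch consistent with those definitions; the symbols γ1 and γ2 for the block coefficients are notation assumed here, and the summation ranges are our reading of "second-order" for six factors.

```latex
y = \beta_0
  + \sum_{i=1}^{6} \beta_i x_i
  + \sum_{i=1}^{6} \beta_{ii} x_i^2
  + \sum_{1 \le i < j \le 6} \beta_{ij} x_i x_j
  + \gamma_1\,\mathrm{block}_1
  + \gamma_2\,\mathrm{block}_2
  + \varepsilon
```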
Table 7 column (a) gives the estimates of the model. The model fits the data well with an R² value of 91.4%. Table 7(a) shows that the linear effects D and E, the quadratic effect D², and the interaction AD are significant at the 0.1% level; the linear effect B is significant at the 5% level. As expected, both blocking variables are significant at the 1% level, indicating that the batch-to-batch variation is large. Residual analysis indicates that the usual assumptions on the error are reasonable. However, run 80 turns out to be an outlier. This is obvious from inspecting Table 6: when factor D is at the high level the readout is usually small, except in run 80. We remove this outlier and refit the model. Column (b) of Table 7 gives the results after the removal of the outlier. The new model has a slightly higher R² value of 94.5%, and the significant effects identified earlier remain significant. We note that the linear effect B becomes more significant and the linear effect C becomes significant at the 5% significance level. We further perform variable selection via stepwise regression and confirm that there are no other significant effects. The final model at the 1% significance level is
with an R² of 92%. The linear effect A is not significant, but we keep the term in the model because the interaction AD is significant.
The data analysis identifies four drugs, Interferon-beta (B), Interferon-gamma (C), Ribavirin (D), and Acyclovir (E), as having a significant linear effect on HSV-1. The nonlinear (quadratic) effect of Ribavirin is also very significant. We also observe a significant interaction between Interferon-alpha (A) and Ribavirin (D). We do not see any significance for the drug TNF-alpha (F), and it is considered inert in the minimization of the viral infection. The negative coefficients associated with drugs B, C, and E imply that these particular drugs have the potential to lower the viral infection. Therefore, to achieve the minimum viral infection we set drugs B, C, and E at the high level.
Since the interaction effect between A and D is significant we use a contour plot to identify potential optimal drug dosage levels. Contour plots are used in full and fractional factorial experimental designs and analyses to determine settings that will maximize, or minimize, the response of interest. The x and y axes of the plot represent the values of the first and second factors. We examine the contour plot of A and D for the predicted percentage of viral infection from the final fitted model (3). Figure 1 shows the contour plot of the predicted readout in terms of A and D, while drugs B, C, and E are held at the high level and block1 = block2 = 0, see Table 5. This contour plot suggests the minimum viral infection is achieved when A is set at the low level, no drug, and D is set at the high level. If multiple two-factor interactions are considered important, then one can generate a series of contour plots, each of which is drawn for two of these factors. Therefore, the optimal drug combinations to minimize the viral infection for the final model (3) are: B, C, D, and E at the high level and A at the low level. The predicted response for these recommendations comes out to be 1.72%.
Ribavirin (D) is an effective antiviral drug; however, it can also induce toxic side effects. Hence, it is desirable to reduce the dosage level of Ribavirin to the lowest possible setting in order to lower the toxicity. Figure 2 shows the predicted readout for Ribavirin using model (3), with the settings recommended for the other drugs above. Notice that the shape of the predicted response for Ribavirin is convex because the coefficient of D² is positive in model (3). The convexity has an important implication: reducing the Ribavirin dosage will not affect its efficacy substantially. For example, if we lower the drug dosage of Ribavirin from the high dosage level of 6,250 ng/mL to the middle dosage level of 390 ng/mL, the predicted response only changes from 1.72% to 3.84%, an absolute difference of approximately 2 percentage points. The relative change in the predicted response from the high dosage level to the middle dosage level, based on the percentage of viral infection with no drug treatment (49.1%, run 1 in Table 6), is approximately 4.3%. Therefore, we can potentially decrease the toxicity by reducing the dosage of Ribavirin by 16 times with only a relative change of about 4% in the viral infection. This enables us to achieve the goal of finding drug combinations with higher efficacy and lower toxicity.
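The arithmetic behind these figures is short enough to spell out. The sketch below uses only numbers quoted in the paragraph above; the variable names are ours.

```python
high_dose = 6250.0   # ng/mL, high level of Ribavirin in the follow-up experiment
dilution = 16        # intermediate level is 16 times diluted from the high level
mid_dose = high_dose / dilution  # 390.625 ng/mL, quoted as 390 ng/mL

pred_high = 1.72     # predicted readout (%) with Ribavirin at the high level
pred_mid = 3.84      # predicted readout (%) with Ribavirin at the middle level
untreated = 49.1     # readout (%) with no drug treatment (run 1 in Table 6)

abs_change = pred_mid - pred_high             # in percentage points
rel_change = 100 * abs_change / untreated     # relative to the untreated readout

print(mid_dose, round(abs_change, 2), round(rel_change, 1))
```

The relative change is the absolute change in predicted readout expressed as a share of the untreated infection level, which is why it is roughly twice the absolute figure.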
We provide a new sequential application of fractional factorial designs to investigate the complicated underlying biological system of HSV-1 and six antiviral drugs in virology. In the present study, we apply an initial two-level design to study six drugs at two dosage levels. The need for a quadratic model arises from testing the linearity assumption in the two-level experiment. To investigate whether some drugs are insignificant and to accommodate the need to lower the overall drug dosage levels, we use a blocked three-level fractional factorial design in the follow-up experiment. We find that TNF-alpha has little effect and that HSV-1 infection can be suppressed effectively by using the right combination of the other five antiviral drugs.
There is a growing demand for identifying drug combinations among a large number of drugs, say 10–50 drugs. We are currently working on a colon cancer project with 11 FDA-approved drugs at up to 10 dosage levels each. Our aim is to identify effective and low-toxicity combinations of these FDA-approved drugs to treat colon cancer. To accommodate the large number of drugs and the large number of dosage levels, fractional factorial designs with large run sizes (128 runs or more) could be a viable alternative to high-throughput methods. Efficient fractional factorial designs with 128–4,096 runs have been recently constructed [31, 32, 33, 34] and could be potentially useful for our project. It is also possible that we need to construct new fractional factorial designs or develop new design strategies to meet our goal. A full exploration of the application of fractional factorial designs to study drug combinations for a large number of drugs is left as future research.
The work was supported by National Institutes of Health Nanomedicine Development Center grant PN2EY018228 for Ding and Ho, and by National Science Foundation DMS grant 1106854 for Jaynes and Xu.