2.1. Data

The data published by Emond et al. [

11] consisted of rats receiving oral doses of a mixture of 6 PCB congeners: 118 (2,3′,4,4′,5-pentachlorobiphenyl), 138 (2,2′,3′,4,4′,5-hexachlorobiphenyl), 153 (2,2′,4,4′,5,5′-hexachlorobiphenyl), 170 (2,2′,3,3′,4,4′,5-heptachlorobiphenyl), 180 (2,2′,3,4,4′,5,5′-heptachlorobiphenyl), and 187 (2,2′,3,4,5,5′,6-heptachlorobiphenyl). The dosing regimen consisted of 3 dose levels (5, 50, and 500

*μ*g/kg body weight of each PCB), and 4 dose protocols (one dose per day, one dose per week, consecutive daily doses for 13 days followed by no exposure, and 13 irregularly timed doses). Rats were either sacrificed at 41 days or 90 days for data collection. PCB concentrations in total lipids of plasma, liver, and adipose tissue were measured (adipose tissue concentrations were measured for only those rats sacrificed at 90 days). Body weight and liver weight at time of sacrifice were measured. The final data consisted of approximately six rats for each dose level/dose protocol

*/*sacrifice day combination (142 rats in total) [

11].

2.2. Toxicokinetic Model

The new PBTK model developed here is an extension of the lipid-based PCB mixture model by Emond et al. [

11]. The original model formulation required alternative clearance parameters at different dose levels and dose protocols, thus increasing the number of parameters and creating model discontinuity. The updated model provides an alternative formulation that incorporates CYP450 induction, thus facilitating the use of a single set of parameters for wider applicability of the model.

The model consists of five compartments (blood, adipose tissue, liver, slowly and rapidly perfused), with mean physiology defined in . The overall clearance of PCBs is empirically described in the liver and represented as a function of CYP450 metabolism (metabolites are subsequently excreted in urine or feces [

18]), which is modeled as a first-order process [

11]. It is assumed that PCBs accumulate only in the neutral lipid spaces of blood and tissues. Any accumulation outside of the lipid fraction is assumed to be negligible and is not incorporated into the mass balance. Compartment volumes correspond to the lipid volumes in each tissue, and the total cardiac output is corrected for the fractional lipid content of blood. The PBTK model is based on chemical concentration in neutral lipid equivalent (NLE) components of blood and tissues, which can be converted to concentration in total lipids (the measurable quantity [

14]). The lipid-based mass balances for tissues in the PBTK model are defined in the same manner as the original model [

11]:

where

*A*_{nlt} is the mass of chemical in the tissue NLEs (

*μ*g),

*Q*_{nlt} is the flow rate of blood NLEs through the tissue (mL NLE/h),

*C*_{nla} is the chemical concentration in the NLE fraction of arterial blood (

*μ*g/mL NLE),

*C*_{nlt} is the chemical concentration in the tissue NLEs (

*μ*g/mL NLE),

*C*_{tlt} is the chemical concentration expressed in terms of total lipids in tissue (

*μ*g/mL of total lipid),

*V*_{nlt} is the volume of neutral lipid equivalents in tissue (mL NLE), and

*V*_{tlt} is the volume of total lipids in tissue (mL total lipid). Volumes of total lipids in tissues are measurable quantities, while neutral lipid equivalents are quantities that are derived by assuming NLEs are composed of all the neutral lipids and 30% of the phospholipids in tissue [

11,

14].

NLE-based volumes in are obtained by multiplying conventional values with NLE ratios in . Flows are obtained by multiplying conventional values with the blood NLE ratio. The ratio of NLE/total lipid in

(

*V*_{nlt}/

*V*_{tlt}) is used to convert concentrations from NLE basis to total lipid basis. To convert liver, fat, and plasma NLE concentrations to a total lipid basis, the corresponding values in (column 3) are used.

| **Table 2**Physiological values for a standard 225g rat (adapted from [11]). |

2.3. Induction Model

High chronic doses of the PCB mixture caused an increased elimination rate for all PCBs, which was attributed to CYP450 induction [

11]. PBTK models predicting changes in metabolic rate due to CYP450 induction have been previously implemented for other chemicals [

17,

20–

23]. A CYP450 balance in the liver can be defined as

where

*A*^{CYP} is the mass (per gram protein) of CYP450 in the liver,

*k*_{0} is the basal CYP450 production rate (mass/time),

*k*_{e} is the CYP450 degradation rate (time

^{−1}), and

*S*(

*t*) is the stimulation function for induction exposure response (mass/time). The initial condition for

*A*^{CYP} is the baseline level

*A*_{0}^{CYP}. In the presence of zero inducer,

*S*(

*t*) is zero and (

2) is at steady state, and, therefore,

*k*_{0} is equivalent to

*k*_{e}*A*_{0}^{CYP}.

For simplicity, a linear function was adopted for *S*(*t*). While a Hill equation could have been implemented, it was determined that optimizing Hill parameters with weak prior information was impractical. The internal exposure-response parameters of this particular mixture are highly uncertain. Furthermore, lipid-based model formulations assume that contaminant concentration outside of the lipid space is negligible. If only unbound chemical outside of the lipid space can initiate a toxicological response (i.e., by binding to a receptor), a lipid model will assume that this external/unbound concentration remains low and does not approach saturation. The initiation of CYP450 induction was modeled as proportional to the inducing PCB concentration:

where

*C*_{IND} is a relative metric for inducer concentration, and

*F* (≥0) is the induction slope factor defining the increase in CYP450 enzyme production caused by

*C*_{IND}. The induction slope is defined as a factor of the basal production rate.

In previous PBTK models for lipophilic contaminants, the inducer concentration

*C*_{IND} was defined to be the chemical concentration bound to the Ah-receptor [

17]. The Ah-receptor has consistently been demonstrated to be crucial for CYP450 induction by PCBs [

24]. Since only unbound chemical outside of the lipid space can bind to the Ah-receptor [

23], assumptions from the prior PBPK models do not apply. To maintain the parsimony of the model and maintain lipid-based concentrations throughout,

*C*_{IND} was defined as the concentration of the inducing PCB congener(s) in the neutral-lipid space of the liver. Any additional steps in the induction process (i.e., concentration gradients between free and lipid-space PCB, and Ah-receptor binding) were essentially lumped into the dose response parameter

*F*. During the model development phase, it was observed that introducing a time-lag into the stimulation function to account for unspecified processes had a negligible impact on predicted lipid concentrations. This was likely due to the larger timescales of dose protocols and simulated data collection times.

The following relationship between CYP450 levels and metabolic clearance was determined to be flexible enough to model the data over a wide range of doses:

where

*v*_{cl } is metabolic clearance as a function of CYP450 concentration (mL/h) and

*v*_{0} is the basal metabolic clearance under low exposure and negligible induction conditions (mL/h).

The rate of metabolism of each PCB is the product of the PCB concentration in the neutral lipid space of the liver *C*_{nlL}(*A*_{nlL}/*V*_{nlL}) and the *v*_{cl } for the particular PCB:

PCB induction and metabolism are congener specific and are functions of structure and classification [

2]. Non-ortho PCBs (“dioxin-like” PCBs with no ortho-substituted chlorines) assume a coplanar position and are strong inducers of CYP1A. Multi-ortho-substituted PCBs cannot become coplanar and interact primarily with CYP2B. Mono-ortho PCBs can assume both planar and coplanar positions and are considered “mixed-type” inducers [

2,

25,

26]. PCB congeners 138, 153, 170, and 180 are di-ortho; 187 is tri-ortho; and 118 is mono-ortho. In the current dataset, PCB 118 shows significantly higher clearance than other congeners at the high-dose level [

11].

To model the difference in PCB 118 metabolism at the high doses, the PBTK model assumes the multi-ortho PCBs are metabolized through the CYP2B pathway, while PCB 118 is metabolized via CYP1A. It was also assumed that both types of PCBs induce CYP2B, but induction of CYP1A by multi-ortho PCBs was negligible. These assumptions were based on

*in vitro* studies of different classes of PCBs. A study in rat hepatocytes found that mono-ortho PCBs are primarily metabolized by CYP1A and primarily induce CYP1A (with CYP2B being induced to a lesser extent) [

27]. CYP1A induction by PCB 118 has also been shown to be orders of magnitude greater than induction by multi-ortho PCBs [

28]. Meanwhile, CYP2B induction from both PCB 118 and multi-ortho PCBs was the same order of magnitude [

25].

Equations (

2) through (

6) were applied with parameters to describe both CYP1A and CYP2B kinetics. For each PCB, (

5) was applied using a basal metabolic clearance (

*v*_{0}) specific to that PCB, with the induction scaling factor dependent on the PCB classification (mono-ortho or multi-ortho). Induced clearance of PCB 118 was dependent on the CYP1A ratio, while induction of the others was dependent on CYP2B ratio. For the induction of CYP1A,

*C*_{IND} in (

6) was assumed to be the concentration of PCB 118 in the neutral lipid space of the liver. For CYP2B induction,

*C*_{IND} was assumed to be the total PCB concentration (sum of all 6 PCBs) in the neutral lipid space of the liver. This assumes an additive effect on CYP2B induction, with each PCB having equal weight. Separate values for the induction factor

*F* were used to describe CYP1A and CYP2B induction.

Parameters for baseline CYP450 dynamics were obtained from literature and are summarized in . For simplicity, the same parameters were used for both CYP1A and CYP2B enzymes and were obtained from a model for CYP1A [

17]. Alternatively, CYP2B-specific parameters from a similar PBTK-induction model can be used [

22]. A recent study in female rats estimated CYP1A content to be a factor of two greater than CYP2B in the liver [

29], which could also be used to inform the model. It was found that optimizing the model using separate 1A/2B parameters produced nearly identical final results, due to the fact that the synthesis and degradation parameters were proportionally the same. Additionally, the initial amounts and the synthesis/degradation rates are not entirely identifiable and are related via the steady-state mass balance. Since the induction in this model will only be affected by the increase of CYP450 relative to the baseline, the actual baseline values are less important, and some adjustments can be made to those assumptions without reoptimization of induction factor

*F*.

2.4. Hierarchical Bayesian Model

Interanimal variation existed in the data which could not be attributed to differences in physiology or tissue lipid content alone. It was assumed that this inter-rat variation can be attributed to basal metabolic clearance *v*_{0}. A hierarchical model for basal clearance was constructed to optimize the population distribution of *v*_{0} to the observed data ().

A generalized population model assumes random variable Ψ_{ik} (where *i* denotes an individual within the population, and *k* denotes the particular variable) is derived from a distribution of mean *μ*_{k} and standard deviation Σ_{k}. Both random and nonrandom variables are used as inputs to the PBTK model to predict *Y*_{i}. The likelihood function *L* calculates the probability that *Y*_{i} is an adequate prediction of data *y*_{i} given the set of random variables. The prior function *P* calculates the probability of all random variable values conditional on their population assumptions. The posterior probability is proportional to the product of the likelihood and prior.

A lognormal error function was implemented as the likelihood, which assumes the log of data measurements *y*_{i} are scattered in a normal distribution from the log of their corresponding model predictions *Y*_{i}:

The population and error parameters to be estimated are summarized in . It was assumed that metabolic clearances at the individual level (

*v*_{0}) were derived from a lognormal population distribution defined by

*μv*_{0} and Σ

*v*_{0}. Priors on

*μv*_{0} for each PCB were set as wide and noninformative. The priors for population variances Σ

*v*_{0} for each PCB were assumed to be inverse-gamma with a shape parameter of 1 (indicating large uncertainty), and a scale parameter of 0.8 (the initial assumption on the lognormal Σ). The prior probabilities of individual-level parameters were calculated using the values of population

*μ* and Σ for each parameter at the current iteration. Upper and lower limits for the uniform priors on mean population basal clearances (

*μv*_{0}) were set by observing model behavior at extreme values. It was determined that scaling the basal rate by body-weight

^{0.75} slightly improved convergence and model fit, due to the increasing body weight over the 90-day period, and variation in body weight of the studied population. Non-informative distributions were used for the priors on

*σ*. Since three tissues were measured (plasma, liver, and fat), three separate values for

*σ* were optimized.

| **Table 4**Population-level parameters to be estimated by Bayesian analysis. |

During the testing phase of the optimization, interindividual variation in cardiac output, fractional organ volumes, and blood flow rates were incorporated by using informative population priors based on standard literature values for Sprague-Dawley rats. This involved optimizing individual-level physiology while holding population-level distributions constant. It was found that nearly identical final results were obtained regardless of whether variation in physiology was incorporated into the model. In order to improve convergence, mean values for organ physiology were used. Individual-level data of measured liver weight ratios and the change in body weights over the course of the study were still incorporated into the PBTK model during the optimization.

2.5. Computational Implementation

The PBTK model was developed in the MATLAB software environment [

30], while implementing an open-source Metropolis-Hastings toolbox [

31]. The PBTK model is available from the institute website (

http://www.ccl.rutgers.edu/onlineCodes/PCBmixturePBPK) and is also provided as a supplemental file. Markov-Chain Monte Carlo (MCMC) with Metropolis sampling was used to iteratively converge to the posterior distribution. The number of PBTK model parameters for each of the 142 rats, added with population and error parameters, leads to over 1000 parameters in total. Convergence issues arise with such high dimensions and noninformative priors. Additionally, each Metropolis step requires the solution of over 800 systems of differential equations, and multiple independent Markov chains are required to assess convergence. Since the system might not converge for 100,000 iterations, it was necessary to apply simplifying assumptions to reduce model evaluations and improve convergence.

Because a minimal induction effect was observed at the lowest dose [

11], the parameter optimization was decomposed into two steps. For the first step, induction was neglected and the model was optimized using only the low-dose data in order to obtain the basal metabolic rate (

*v*_{0}) for each PCB. The resulting population distributions were then used as informative priors in the second step. For step 2, induction was incorporated in the model, and parameters were optimized using only data for the two high doses. While MCMC was still performed on the individual-level values for

*v*_{0} in step 2, they were defined by stronger population priors than in step 1. The population mean and variance for

*v*_{0} were not updated in step 2, since the final distribution of all individual-level clearances from step 2 remained consistent with the population priors. Had any anomalies been identified (i.e., many individual-level parameters being optimized at the upper or lower limits), the population parameters would have been reoptimized in the second step. The model/data error parameters (

*σ*) were optimized in both steps, since it was observed that allowing these parameters to freely explore the space improved convergence.

Splitting the problem into two steps helped to reduce convergence problems, since basal and induced metabolic clearances are inherently nonidentifiable. Interindividual variation of the induction factor *F* was neglected in step 2 (i.e., the prior probability on Σ*F* was assumed to be extremely small). Assuming negligible variation on *F* eliminates the need to estimate individual-level values for each rat and can prevent poor mixing of the Markov chains. Interindividual variation in metabolism would be accounted for by variations in basal clearance. Two induction factors (one each for 1B and 2A induction) would be optimized to fit the entire population.

For both step 1 and step 2 of the parameter optimization, three sets of independent Markov chains were initiated using over dispersed initial guesses. After adjustments in the random-walk parameters to optimize the acceptance rate (it was determined that the optimal acceptance rate for this system was approximately 10% [

32]), the chains were run for 50,000 iterations. The chains were considered converged if the Gelman-Rubin convergence statistic was close to 1 for the parameters from all three independent sets of chains [

33]. The PBTK models and Metropolis sampler were implemented in MATLAB on a cluster of multi core processors. Convergence of the Markov chains typically occurred after 80,000 iterations and three days computational time.