For the accurate prediction of in vivo hepatic clearance or drug–drug interaction potential through in vitro microsomal metabolic data, it is essential to evaluate the fraction unbound in hepatic microsomal incubation media. Here, a structure-based in silico predictive model of the nonspecific binding (fumic, fraction unbound in hepatic microsomes) for 86 drugs was successfully developed based on seven selected molecular descriptors. The R2 of the predicted and observed log((1−fumic)/fumic) for the training set (n=64) and test set (n=22) were 0.82 and 0.85, respectively. The average fold error (AFE, calculated by fumic rather than log((1−fumic)/fumic)) of the in silico model was 1.33 (n=86). The predictive capability of fumic for neutral drugs compared well to that for basic compounds (R2=0.82, AFE=1.18 and fold error values were all below 2, except for felodipine and progesterone) in our model. This model appears to perform better for neutral compounds when compared to models previously published in the literature. Therefore, this in silico model may be used as an additional tool to estimate fumic and for predicting in vivo hepatic clearance and inhibition potential from in vitro hepatic microsomal studies.
Significant progress has been made recently in the prediction of in vivo hepatic clearance (1–3) and drug–drug interaction potential (4–6). The determination of intrinsic clearance (CLint) and inhibition constant (Ki) through in vitro microsomal incubation can provide a basis for these predictions. However, lipophilic drugs tend to bind nonspecifically to microsomal phospholipids, resulting in an underestimation of CLint (7–9) or an overestimation of Ki (10–12). Consequently, in vivo hepatic clearance and the extent of inhibitory drug interactions are often underpredicted.
Investigators have tried to use relatively low microsomal protein concentrations to avoid nonspecific binding (13). However, relatively high concentrations (1 to 2 mg/mL) are still needed when studying phase II metabolic reactions (14) and for in vitro assessment of time-dependent inhibition potential (15). As such, it is essential to correct the metabolic kinetic parameters (CLint and Ki) by the unbound fraction in microsomes (fumic) in order to ensure accurate pharmacokinetic estimation of potential drug candidates. Unfortunately, currently available experimental methods for measuring fumic are relatively labor-intensive and time-consuming.
In order to avoid these experimental studies, in silico prediction of fumic has gained great interest recently. Austin et al. (16) reported a linear relationship between log((1−fumic)/fumic) and logP/D (logP for bases, while logD7.4 for acids and neutrals) of 56 drugs (R2=0.82). Hallifax and Houston (17) proposed that the relationship between logP/D and log((1−fumic)/fumic) was nonlinear. They concluded that the nonlinear empirical equation gave more unbiased predictions of fumic for drugs with low binding affinity (fumic>0.9) when compared with the model by Austin et al. Later, Gertz et al. addressed the limitations of these empirical predictive tools and their applicability for fumic predictions over a range of lipophilicity and microsomal protein concentrations (18). They concluded that the accuracy of fumic predictions for highly lipophilic drugs was poor by both equations, while the Hallifax equation provided more accurate fumic predictions on average.
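The two empirical lipophilicity relationships discussed above can be sketched in a few lines. The coefficients below are not stated in this text; they are the values commonly quoted for the Austin et al. and Hallifax–Houston equations and should be treated as assumptions to be checked against refs. 16 and 17. The function names and the `conc_mg_ml` parameter are hypothetical.

```python
def fumic_austin(logpd, conc_mg_ml=1.0):
    """Linear relationship in logP/D (coefficients assumed, see ref. 16)."""
    return 1.0 / (conc_mg_ml * 10 ** (0.56 * logpd - 1.41) + 1.0)


def fumic_hallifax(logpd, conc_mg_ml=1.0):
    """Quadratic relationship in logP/D (coefficients assumed, see ref. 17)."""
    exponent = 0.072 * logpd ** 2 + 0.067 * logpd - 1.126
    return 1.0 / (conc_mg_ml * 10 ** exponent + 1.0)


if __name__ == "__main__":
    # logP for bases, logD7.4 for acids and neutrals, as described in the text.
    for logpd in (0.0, 2.0, 4.0):
        print(logpd, fumic_austin(logpd), fumic_hallifax(logpd))
```

Both functions take a single lipophilicity value as input, which is the limitation the multi-descriptor model in this study is meant to address.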
Interestingly, Sykes et al. (19) reanalyzed the data reported by Austin et al. and found that logP correlated well with the transformed fumic for bases (R2=0.90) but was less predictive for neutrals (R2=0.34) and acids (R2=0.10). Using molecular modeling approaches, they obtained good discrimination between drugs classified as strong binders (experimental fumic<0.50) and those with a lower degree of binding (experimental fumic>0.50).
Recently, Gao et al. (20) developed a quantitative in silico model correlating fumic of 1,223 drug-like molecules with two-dimensional molecular descriptors. These investigators demonstrated that lipophilicity was the most important molecular property contributing to fumic in this high-performance model. However, the original dataset was not made publicly available. Therefore, a model with both high prediction accuracy and an openly available dataset would be useful to researchers assessing quantitative structure–fumic relationships.
In this study, a quantitative structure–fumic relationship was constructed based solely on molecular descriptors for a dataset of 86 drugs covering a large range of molecular properties. Molecular descriptors were calculated using TSAR™ software version 3.3 (Accelrys Inc.) (21), preADMET (22), and SciFinder Scholar 2007 (23). Feature selection was then performed by stepwise regression, and an in silico model was established with the multiple linear regression (MLR) method. The principal objectives of the study were, therefore, (1) to develop a quantitative relationship between the molecular structure descriptors and log((1−fumic)/fumic); (2) to estimate the predictive accuracy of the in silico model; and (3) to understand which structural factors determine fumic.
The observed fumic values of 86 drugs were obtained from the literature as described in Table I. These fumic values were either measured at a microsomal protein concentration of 1 mg/mL or converted to fumic values at 1 mg/mL based on the equation proposed by Austin et al. (16). The fumic value of each drug was transformed to log((1−fumic)/fumic). As shown in Fig. 1, the fumic values of the 86 drugs did not follow a normal distribution. The transformation of fumic to log((1−fumic)/fumic) yielded a more desirable distribution and also reduced unequal error variances. Therefore, the observed log((1−fumic)/fumic) was used as the dependent variable in the model construction.
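The data preparation described above can be illustrated with a short sketch. It assumes that the concentration conversion treats the bound-to-free ratio (1−fumic)/fumic as proportional to microsomal protein concentration, which is the usual form of the Austin-type correction; the exact equation used is given in ref. 16, not here, and the function names are hypothetical.

```python
import math


def to_logit(fu):
    """Transform fumic to log10((1 - fumic) / fumic), the dependent variable."""
    return math.log10((1.0 - fu) / fu)


def from_logit(y):
    """Back-transform a predicted log10((1 - fumic) / fumic) to fumic."""
    return 1.0 / (1.0 + 10.0 ** y)


def rescale_fumic(fu_measured, conc_measured, conc_target=1.0):
    """Convert fumic between protein concentrations (mg/mL).

    Assumes the bound/free ratio scales linearly with protein concentration.
    """
    ratio = (1.0 - fu_measured) / fu_measured
    return 1.0 / (1.0 + (conc_target / conc_measured) * ratio)


if __name__ == "__main__":
    fu_at_1 = rescale_fumic(0.9, conc_measured=0.5, conc_target=1.0)
    print(fu_at_1, to_logit(fu_at_1))
```

Working on the transformed scale also means that back-transformed predictions always fall between 0 and 1.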
The 2D structures of the 86 drugs were retrieved from SciFinder Scholar 2007, and the mol files were saved for further calculation. Molecular descriptors known to influence a wide range of pharmacokinetic properties were then selected as the initial independent variables. A set of 32 descriptors was obtained from TSAR 3.3, preADMET online, and SciFinder Scholar 2007, including: molecular refractivity, cosmic torsional/electrostatic/total energy, number of atoms/halogen atoms/heteroatoms, heat of formation, energy of the lowest unoccupied molecular orbital (LUMO), energy of the highest occupied molecular orbital (HOMO), ΔE (LUMO–HOMO), number of primary/secondary/tertiary amine groups, number of carboxylic acid groups, number of single/double/aromatic bonds, total absolute atomic charge, total/aromatic rings, number of negatively/positively charged groups, rigid/rotatable bonds, number of hydrogen bond acceptors/donors, logD7.4, logP, molecular weight, mean net charge per molecule of the compounds (fi) (24), and polar surface area.
As expected, only some of the 32 descriptors were significantly correlated with log((1−fumic)/fumic). Furthermore, many of the descriptors were intercorrelated, which would reduce the accuracy and interpretability of the final quantitative model. Therefore, a stepwise regression method was employed to perform feature selection in this study.
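The idea behind stepwise feature selection can be sketched as follows. This is a simplified forward-only variant that adds, at each step, the descriptor giving the largest gain in adjusted R2 and stops when no descriptor improves it. The entry criterion actually used in the study is not stated, and full stepwise regression also re-tests variables for removal, so both the criterion and the function names here are assumptions.

```python
import numpy as np


def adjusted_r2(X, y):
    """Adjusted R2 of an ordinary least-squares fit with intercept."""
    n, p = X.shape
    A = np.column_stack([np.ones(n), X])
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ beta
    rss = float(resid @ resid)
    tss = float(((y - y.mean()) ** 2).sum())
    return 1.0 - (rss / (n - p - 1)) / (tss / (n - 1))


def forward_select(X, y, max_terms=None):
    """Greedy forward selection of descriptor columns by adjusted R2."""
    n_descriptors = X.shape[1]
    limit = n_descriptors if max_terms is None else min(max_terms, n_descriptors)
    chosen, best = [], 0.0  # the intercept-only model has adjusted R2 = 0
    while len(chosen) < limit:
        scores = {
            j: adjusted_r2(X[:, chosen + [j]], y)
            for j in range(n_descriptors)
            if j not in chosen
        }
        j_best = max(scores, key=scores.get)
        if scores[j_best] <= best + 1e-12:
            break  # no remaining descriptor improves the fit
        chosen.append(j_best)
        best = scores[j_best]
    return chosen, best


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.normal(size=(64, 32))  # synthetic stand-in for 32 descriptors
    y = 1.5 * X[:, 3] - 0.8 * X[:, 10] + 0.2 * rng.normal(size=64)
    print(forward_select(X, y, max_terms=7))
```

Capping the number of terms (here with the hypothetical `max_terms`) is one simple way to keep the descriptor count small relative to the size of the training set.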
MLR analysis was applied to develop the in silico model. In order to examine the predictive power and robustness of the model, the entire dataset was subdivided into a training set and a test set. In general, there are three methods for selecting training and test sets: (1) random selection; (2) selection based on clusters of the dependent variable; (3) selection based on clusters of factor scores of the descriptor space, with or without the biological activity values. Because of the skewed distribution of fumic, the entire dataset was divided into a training set (n=64) and a test set (n=22) by cluster analysis of log((1−fumic)/fumic). The whole range of log((1−fumic)/fumic) was divided into bins, and the compounds in each bin were randomly assigned to the training or test set. In addition, leave-one-out (LOO) cross-validation was performed. R2 and the Q2 obtained from LOO (Q2LOO) were then calculated to evaluate the model predictability.
Two other commonly employed accuracy criteria, the fold error and the average fold error (AFE), were used to evaluate the predictive accuracy, as represented by Eqs. 1 and 2, respectively. The percentages of drugs with a fold error greater than two (E2-fold) and greater than three (E3-fold) were also calculated to assess the accuracy of the model. A prediction is usually considered successful if the AFE is less than two (25).
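The binned train/test assignment and the LOO statistic described above can be sketched as below. The number of bins, the test fraction, and the use of equal-width bins are assumptions, since the text does not specify them; Q2LOO is computed here as 1 − PRESS/TSS, one common definition. All function names are hypothetical.

```python
import numpy as np


def binned_split(y, n_bins=5, test_fraction=0.25, seed=0):
    """Randomly assign compounds to train/test within equal-width bins of y."""
    rng = np.random.default_rng(seed)
    y = np.asarray(y, dtype=float)
    edges = np.linspace(y.min(), y.max(), n_bins + 1)
    bin_ids = np.digitize(y, edges[1:-1])  # bin labels 0 .. n_bins - 1
    test_idx = []
    for b in range(n_bins):
        members = np.flatnonzero(bin_ids == b)
        n_test = int(round(test_fraction * len(members)))
        test_idx.extend(rng.choice(members, size=n_test, replace=False).tolist())
    test_idx = np.array(sorted(test_idx), dtype=int)
    train_idx = np.setdiff1d(np.arange(len(y)), test_idx)
    return train_idx, test_idx


def q2_loo(X, y):
    """Leave-one-out cross-validated Q2 for a linear model with intercept."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    n = len(y)
    A = np.column_stack([np.ones(n), X])
    press = 0.0
    for i in range(n):
        keep = np.arange(n) != i
        beta, *_ = np.linalg.lstsq(A[keep], y[keep], rcond=None)
        press += float((y[i] - A[i] @ beta) ** 2)
    return 1.0 - press / float(((y - y.mean()) ** 2).sum())
```

Splitting within bins keeps both sets spread over the full range of the dependent variable, which a purely random split does not guarantee for a skewed dataset.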
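Eqs. 1 and 2 are not reproduced in this text, so the sketch below uses one common convention as an assumption: the fold error is the larger of predicted/observed and observed/predicted, and the AFE is the geometric mean of the fold errors. Some authors instead take the absolute value after averaging the log ratios, which gives a different number when over- and underpredictions cancel. The function names are hypothetical.

```python
import math


def fold_error(predicted, observed):
    """Fold error, always >= 1 (assumed form of Eq. 1)."""
    return max(predicted / observed, observed / predicted)


def average_fold_error(predicted, observed):
    """Geometric mean of the fold errors (assumed form of Eq. 2)."""
    logs = [abs(math.log10(p / o)) for p, o in zip(predicted, observed)]
    return 10 ** (sum(logs) / len(logs))


def percent_outside(predicted, observed, threshold):
    """Percentage of compounds with fold error above a threshold (e.g. 2 or 3)."""
    count = sum(fold_error(p, o) > threshold for p, o in zip(predicted, observed))
    return 100.0 * count / len(predicted)
```

As stated in the abstract, these metrics are computed on fumic itself rather than on log((1−fumic)/fumic), so predictions should be back-transformed first.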
Seven descriptors were chosen via the feature selection to construct the in silico model. Then, the model for the training set was built with 64 drugs, as represented by Eq. 3.
where x1 is the cosmic electrostatic energy; x2 is the number of aromatic bonds; x3 is the number of negatively charged groups; x4 is the number of positively charged groups; x5 is logP; x6 is the mean net charge per molecule of the compounds; and x7 is the polar surface area (PSA). All of the selected descriptors, the values of which can be obtained directly from the authors, were standardized so that the magnitudes of their coefficients reflect their relative contributions to log((1−fumic)/fumic). The standardized values, with a mean of zero and a variance of unity, are represented as “x*” in Eq. 3.
The correlation between predicted and observed log((1−fumic)/fumic) from the in silico model is shown in Fig. 2. The model exhibits high predictive performance. For the training set (n=64), R2=0.82, Q2LOO=0.75, RMSE=0.45, F=36.31, and p<0.0001, with a slope equal to unity and an intercept of zero. For the test set, R2=0.85, the slope is 0.94, and the intercept is −0.08.
The observed and predicted values of fumic and the fold error values of the 86 drugs are shown in Table I. Overall, 75% of drugs have a fold error<2 and only 2% have a fold error>3, with AFE=1.33 (Table II). For the training set, 82% of drugs have a fold error<2 and only 5% (one drug) have a fold error>3, with AFE=1.34. For the test set, 86% of drugs have a fold error<2 and none have a fold error>3, with AFE=1.33. Therefore, fumic can be predicted accurately by our model.
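The standardization and MLR fit can be sketched as follows. The coefficients of Eq. 3 are not reproduced in this text, so the example only shows the mechanics on placeholder data; the function names are hypothetical, and using the sample standard deviation (ddof=1) is an assumption.

```python
import numpy as np


def standardize(X, mean=None, std=None):
    """Scale descriptor columns to zero mean and unit variance.

    Pass the training-set mean and std to scale test compounds consistently.
    """
    X = np.asarray(X, dtype=float)
    if mean is None:
        mean = X.mean(axis=0)
    if std is None:
        std = X.std(axis=0, ddof=1)
    return (X - mean) / std, mean, std


def fit_mlr(X_std, y):
    """Least-squares fit; returns [intercept, b1, ..., bk]."""
    A = np.column_stack([np.ones(len(y)), X_std])
    beta, *_ = np.linalg.lstsq(A, np.asarray(y, dtype=float), rcond=None)
    return beta


def predict_mlr(beta, X_std):
    """Predicted log((1 - fumic) / fumic) for standardized descriptors."""
    return beta[0] + np.asarray(X_std, dtype=float) @ beta[1:]
```

Because every standardized descriptor has the same variance, the fitted coefficients can be compared directly to rank descriptor importance, which is the purpose of standardization stated above.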
Figure 2 also shows the respective correlations between predicted and observed values of log((1−fumic)/fumic) for acids, bases, and neutral compounds. As stated earlier, good predictions are more easily achieved for bases than for neutral compounds and acids. Our model predicted fumic for bases well (R2=0.82, AFE=1.68). Furthermore, the prediction of fumic for neutral drugs was comparable (R2=0.82, AFE=1.18), which might be a positive feature of this model. Unfortunately, for acids, the correlation between log((1−fumic)/fumic) and the seven descriptors was still poor (Fig. 2; R2=0.43). However, the fold error values of acids were all below 2 (Table II), indicating that the predictions for acids might still be useful in some circumstances. The slope of the fitted line for acids (Fig. 2; 0.78) was similar to those for bases (0.73) and neutral compounds (0.76). These findings indicate that the prediction of fumic for acids in our model is still reasonable. In fact, the poor correlation for acids was probably due to the relatively narrow distribution of observed fumic values (or log((1−fumic)/fumic) values). As can be seen in Table I, except for emodin, log((1−fumic)/fumic) for acids ranges from −1.5097 to −0.1581, a span of 1.4 log units (most of the fumic values are within the range of 0.6–0.9). In contrast, for bases and neutral compounds, the spans of log((1−fumic)/fumic) are 4.0 and 3.2 log units, respectively. The relatively low nonspecific binding of acids to hepatic microsomes likely results in the skewed distribution of fumic and the poor correlation between log((1−fumic)/fumic) and the seven selected descriptors.
Our results shown in Eq. 3 suggest that the chosen descriptors correlate strongly with fumic, which allows some mechanistic interpretation of the model. In general, these molecular descriptors relate to molecular lipophilicity, charge state, flexibility, polarity, and extent of ionization at pH 7.4, as shown in Fig. 3.
The molecular mechanism of nonspecific binding is presently unclear, but it is believed to depend on the lipophilicity and the electronic charge. The main binding contributors can be divided into non-electrostatic and electrostatic terms, wherein the non-electrostatic contributions include lipophilic interactions, van der Waals interactions, and translational, rotational, and configurational entropies (26).
As expected, the extent of microsomal binding generally increases with increasing lipophilicity of the drug. In particular, logP, the main structural contributor, is positively correlated with log((1−fumic)/fumic) (Eq. 3), consistent with the above analysis. The cosmic electrostatic energy, the parameter x6 (fi), the number of positively charged groups, and the number of negatively charged groups are descriptors representing the electrostatic contributions to nonspecific binding. As shown in Eq. 3 and Fig. 3, the cosmic electrostatic energy is the second most important descriptor in our model. It is an energy descriptor accounting for the noncovalent interaction potential energy, which determines the binding affinity of a molecule to the pertinent receptor(s). The parameter x6 (fi) is calculated from the pKa and pH 7.4, and its value is equal to the ionization fraction of the compound at pH 7.4 (24). Thus, it denotes the contribution of electrostatic interaction to nonspecific binding based on the ionization of the compounds. Ionization alone does not capture the direction of the effect, however: basic compounds clearly exhibit enhanced binding over neutral or acidic compounds of similar lipophilicity. This enhanced phospholipid binding of bases is thought to be due to a favorable electrostatic interaction between the protonated base and the phosphate groups of the phospholipids (27). The negative charges of acidic drugs at pH 7.4 would likely limit their nonspecific binding. This can explain the positive effect of x4 (the number of positively charged groups) and the negative effect of x3 (the number of negatively charged groups) on nonspecific binding (Eq. 3). The PSA and the number of aromatic bonds are the two other contributors in our model.
The performance of the present model was compared with that of the models published by Austin et al. and Hallifax and Houston (Table II).
In general, our model compared favorably to these models for basic and neutral compounds but fared equally inadequately for acidic compounds. The present model differs in its approach, however, in that it uses more structure-specific parameters, such as the number of positively or negatively charged groups, the cosmic electrostatic energy, and PSA, in addition to logP and logD. The involvement of these parameters provided additional insights into the molecular mechanisms of nonspecific binding of drugs to hepatic microsomes, especially with respect to electrostatic interactions.
A structure-based in silico model was developed successfully for the prediction of the nonspecific binding of drugs to hepatic microsomes. In particular, the prediction of fumic for neutral drugs was comparable to that for basic drugs (R2=0.82, AFE=1.18, and fold error values all below 2, except for felodipine and progesterone). Lipophilicity, charge state, and the extent of ionization at pH 7.4 were identified as important properties affecting fumic. One obvious weakness of the present model is the skewed distribution of fumic in the entire dataset (most of the compounds had fumic>0.7, especially the acids). A larger dataset, composed of drugs with a uniform distribution of fumic values, is necessary for accurate fumic prediction and for more reliable evaluation of free clearance and drug–drug interactions.
We are thankful to Accelrys Inc. for providing 1-month free evaluation of TSAR software in 2007.
Jin Sun, Phone: +86-24-23986321, Fax: +86-24-23986321, Email: sunjin66@21cn.com.