Int J Mol Sci. 2010; 11(4): 1228–1235.
Published online 2010 March 24. doi:  10.3390/ijms11041228
PMCID: PMC2871113

Study on QSTR of Benzoic Acid Compounds with MCI


Quantitative structure-toxicity relationship (QSTR) plays an important role in toxicity prediction. In this work, the quantum chemistry parameters of 57 benzoic acid compounds were calculated with a modified molecular connectivity index (MCI) method implemented in Visual Basic, and the QSTR of benzoic acid compounds for oral LD50 (acute toxicity) in mice was studied. To build a model that predicts this toxicity more accurately, 39 benzoic acid compounds were used as a training dataset for the regression model and the other 18 as a forecasting dataset to test its prediction ability, using SAS 9.0. The model is LogLD50 = 1.2399 × 0JA + 2.6911 × 1JA − 0.4445 × JB (R2 = 0.9860), where 0JA is the zero order connectivity index, 1JA is the first order connectivity index and JB = 0JA × 1JA is the cross factor. The model was shown to have a good forecasting ability.

Keywords: benzoic acid, acute toxicity, MCI, toxicity prediction, QSTR

1. Introduction

Benzoic acid compounds are important organic chemical raw materials that are widely used in food, medicines, cosmetics, antiseptics, insecticides, dyestuffs, etc. For example, benzoic acid is a common antiseptic, aspirin is a well-known non-steroidal anti-inflammatory drug, triflusal is an antithrombotic, and chloramben and dicamba are common pesticides (see Figure 1). Most benzoic acid compounds are toxic and are poorly degraded by microorganisms in the natural environment, which may cause serious public health and environmental problems.

Figure 1.
Molecular structures of benzoic acid (1), aspirin (2), triflusal (3), chloramben (4) and dicamba (5).

With the development of synthetic chemistry, combinatorial chemistry and pharmaceutical chemistry, millions of new compounds are being synthesized. Classical evaluation of chemical substances is slow and expensive, and the toxicity of compounds cannot be analyzed as fast as new compounds are discovered. Scientists are therefore paying more and more attention to predicting toxicity at an early stage. Quantitative structure-toxicity relationships (QSTR) have been used efficiently to study the toxicity mechanisms of various compounds [1].

QSTR plays an important role in toxicity forecasting and is widely used in modern studies of compounds. Since more and more compounds are being found, it is necessary to predict their toxicity accurately and quickly [2–4]. The QSTR of benzoic acid compounds with the molecular connectivity index (MCI) for oral LD50 (acute toxicity, median lethal dose) in mice has not been reported. The quantitative structure characteristic parameters of 57 benzoic acid compounds were obtained with MCI, and the LD50 values of these compounds in mice were collected from various literature sources. In this work, the QSTR of benzoic acid compounds for oral LD50 in mice was studied and a model was developed to predict this toxicity more accurately. 39 benzoic acid compounds were used as a training dataset for building the regression model, and the other 18 benzoic acid compounds as a forecasting dataset to test the prediction ability of the model. Analysis of the results showed that 0JA, 1JA and the cross factor JB are important factors affecting the toxicity of benzoic acid compounds (although the toxicity mechanism of the compounds is not yet clear), where 0JA is the zero order connectivity index, 1JA is the first order connectivity index and JB = 0JA × 1JA is the cross factor.

2. Research Methods

In 1975, Milan Randic described a skeletal branching index that correlated with three physical properties of alkanes [5]. The concept was further developed and applied extensively by Kier and Hall [6–8], which led to the molecular connectivity index (MCI). Eventually, Kier and Hall modified the connectivity indices to discriminate carbon atoms from heteroatoms, introducing the valence molecular connectivity index [9]. The MCI is calculated with the following formula:

$$ {}^{m}\chi_{t} = \sum_{j=1}^{N_m} \prod_{i=1}^{m+1} \left(\delta_i\right)_j^{-1/2} $$

where mχt is the mth-order MCI, t is the type of sub-graph (path (p), cluster (c) or path-cluster (pc)), and Nm is the number of sub-graphs of the same type and order. The connectivity value is δ = σ − h, where σ is the count of electrons in σ orbitals and h is the count of bonded hydrogen atoms.
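As a minimal sketch of the path-type index for m = 0 and m = 1, the following Python function works on a hydrogen-suppressed molecular graph: each atom contributes δ^(−1/2) to the zero order index and each bond contributes (δi δj)^(−1/2) to the first order index. It assumes the simple δ (the number of non-hydrogen neighbours) for every atom, not the modified heteroatom values discussed later in this section. The function name and the graph encoding are illustrative choices, not the authors' Visual Basic program.

```python
def connectivity_indices(n_atoms, bonds):
    """Return (chi0, chi1) for a hydrogen-suppressed molecular graph.

    n_atoms: number of non-hydrogen atoms, indexed 0..n_atoms-1
    bonds:   list of (a, b) index pairs, one per bond between heavy atoms
    Assumes every atom has at least one bond (delta > 0).
    """
    # Simple delta: number of non-hydrogen neighbours of each atom.
    delta = [0] * n_atoms
    for a, b in bonds:
        delta[a] += 1
        delta[b] += 1

    # Zero order: one term per atom (sub-graphs with a single vertex).
    chi0 = sum(d ** -0.5 for d in delta)
    # First order: one term per bond (paths of length one).
    chi1 = sum((delta[a] * delta[b]) ** -0.5 for a, b in bonds)
    return chi0, chi1


# Carbon skeleton of n-butane: C0-C1-C2-C3
chi0, chi1 = connectivity_indices(4, [(0, 1), (1, 2), (2, 3)])
print(round(chi0, 4), round(chi1, 4))  # 3.4142 1.9142
```

For heteroatoms, the entries of `delta` would be replaced by the valence connectivity values before the sums are taken.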

The MCI has proved to be one of the most successful and widely used descriptors, and it has been applied in many studies [10–13].

From the skeletal branching index of Randic to the connectivity index modified by Kier and Hall, the core concept is the connectivity of atoms, which was extended from the simple connectivity δi to the valence connectivity δiv. Kier and Hall computed the valence connectivity of heteroatom i with the following formula:

$$ \delta_i^{v} = \frac{Z_i - h_i}{Z - Z_i - 1} $$

where Z and Zi are the counts of extranuclear electrons and valence electrons, respectively, and hi is the count of hydrogen atoms bonded to heteroatom i. Although this method handles heteroatoms, it cannot discriminate the same heteroatom in different oxidation states. More recently, Yu et al. improved the method and redefined the valence connectivity value δhi using the following formula [14]:

δhi = 2×Z (Zi−hi)[(8−Ni)1/Ni][(2ni−1)hi/Ni−1]/[(mi+Lp)(2ni−1)]

Here, mi is the count of bonding electrons, Z is the count of extranuclear electrons, ni is the maximum principal quantum number, Zi is the valence electron number, Ni is the count, and Lp depends on the hybridization of heteroatom i, taking the following values: sp3, Lp = 1; sp2, Lp = −1.8; sp, Lp = 2; if that is the atom itself, Lp = 2, mi = 0.
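To illustrate the limitation that motivated the modified value, the short sketch below evaluates the original Kier-Hall valence connectivity, taken here in its standard form (Zi − hi)/(Z − Zi − 1). The function name and the example atoms are illustrative assumptions. It does not implement the modified formula of Yu et al.

```python
def kier_hall_valence_delta(z, z_valence, h):
    """Original Kier-Hall valence connectivity of a heteroatom.

    z:         total number of electrons of the atom
    z_valence: number of valence electrons
    h:         number of hydrogen atoms bonded to the atom
    """
    return (z_valence - h) / (z - z_valence - 1)


hydroxyl_o = kier_hall_valence_delta(8, 6, 1)   # oxygen in -OH: 5.0
carbonyl_o = kier_hall_valence_delta(8, 6, 0)   # oxygen in C=O: 6.0
chlorine = kier_hall_valence_delta(17, 7, 0)    # chlorine in C-Cl: 7/9

# Sulfur in a thioether and in a sulfone has the same Z, Zi and hi,
# so the original formula assigns both the same value (6/9).
sulfide_s = kier_hall_valence_delta(16, 6, 0)
sulfone_s = kier_hall_valence_delta(16, 6, 0)
print(hydroxyl_o, carbonyl_o, round(chlorine, 4), sulfide_s == sulfone_s)
```

Because the inputs carry no information about the bonding environment beyond the hydrogen count, atoms of one element in different oxidation states cannot be told apart, which is the gap the extra terms in mi, Ni, ni and Lp are meant to close.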

The program package for calculating the MCI of compounds was written in Visual Basic according to the modified formula. To predict the toxicity of benzoic acid compounds and obtain the prediction model, the molecular structures of 57 benzoic acid compounds were entered into the program package and their MCI values were calculated. Of these, 39 formed the training dataset for building the multiple linear regression model (with the logarithm of LD50 as the dependent variable and the MCI values as factors), and the other 18 formed the testing dataset for evaluating the prediction ability of the model, using SAS 9.0. The cross factor was included when building the regression model.
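The workflow described above (a 39/18 split, a regression through the origin, and a cross factor) can be sketched in plain Python as follows. This is an illustration only: the index values and responses are synthetic numbers generated from arbitrary coefficients, not the paper's data, and the helper names are hypothetical. The study itself used SAS 9.0.

```python
import random


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [list(row) + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def features(j0, j1):
    """Regressors of the model: both indices plus their product (cross factor)."""
    return [j0, j1, j0 * j1]


def fit_no_intercept(rows, y):
    """Least-squares coefficients of a model without intercept (normal equations)."""
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * t for r, t in zip(rows, y)) for i in range(k)]
    return solve(xtx, xty)


def predict(coef, row):
    return sum(c * v for c, v in zip(coef, row))


# Synthetic stand-in for the 57 compounds (all values are made up).
random.seed(0)
TRUE_COEF = [1.0, 2.0, -0.3]  # arbitrary, chosen only for this demonstration
rows = [features(random.uniform(6, 12), random.uniform(3, 7)) for _ in range(57)]
y = [predict(TRUE_COEF, r) + random.gauss(0, 0.05) for r in rows]

# 39 compounds for training, 18 for testing, as in the study design.
train_rows, test_rows = rows[:39], rows[39:]
train_y, test_y = y[:39], y[39:]

coef = fit_no_intercept(train_rows, train_y)
rmse = (sum((predict(coef, r) - t) ** 2 for r, t in zip(test_rows, test_y)) / len(test_y)) ** 0.5
print("coefficients:", [round(c, 3) for c in coef])
print("test RMSE:", round(rmse, 3))
```

Holding out the 18 compounds before fitting is what makes the test error an estimate of forecasting ability and not just a measure of fit.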

3. Results and Discussion

In what follows, we present the process of computing the MCI, choosing the factors of the regression model, building the model and testing it. First, the zero order connectivity index 0JA and the first order connectivity index 1JA were calculated using the program package. The LD50 values were converted to logarithms so that all data were of the same order of magnitude and easier to analyze and compare statistically. Then, the toxicity data in the training dataset were subjected to regression analysis, with non-intercept stepwise regression as the statistical method. The candidate factors were the zero order connectivity index 0JA, the first order connectivity index 1JA and the cross factor JB = 0JA × 1JA. These factors were inspected individually and in combination, with the following results:

  1. 0JA: R-Square = 0.9542 and C(p) = 1.0000
  2. 1JA: R-Square = 0.9560 and C(p) = 1.0000
  3. JB: R-Square = 0.8656 and C(p) = 1.0000
  4. 0JA, 1JA: R-Square = 0.9560 and C(p) = 0.2565
  5. 0JA, JB: R-Square = 0.9829 and C(p) = 2.0000
  6. 1JA, JB: R-Square = 0.9816 and C(p) = 2.0000
  7. 0JA, 1JA, JB: R-Square = 0.9860 and C(p) = 3.0000

The results show that all groups fit well except (3) and (4), and the R2 values show that (7) is the best, with better regression linearity than the other groups. Therefore, 0JA, 1JA and JB were chosen as the independent variables of the model (see Table 1).
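A comparison of this kind can be reproduced in outline with the sketch below, which fits every subset of the three factors without an intercept and reports R-Square and Mallows' C(p). Two assumptions are made here and are not taken from the paper: R-Square is computed in its uncentered form, 1 − SSE/Σy², as is customary for models without an intercept, and C(p) = SSE_p/s² − n + 2p with s² taken from the full three-factor model. The data are synthetic and the helper names are hypothetical.

```python
from itertools import combinations
import random


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [list(row) + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def sse_no_intercept(rows, y, cols):
    """Residual sum of squares of a least-squares fit through the origin on the given columns."""
    sub = [[row[c] for c in cols] for row in rows]
    k = len(cols)
    xtx = [[sum(r[i] * r[j] for r in sub) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * t for r, t in zip(sub, y)) for i in range(k)]
    coef = solve(xtx, xty)
    return sum((t - sum(c * v for c, v in zip(coef, r))) ** 2 for r, t in zip(sub, y))


def subset_table(rows, y, names):
    """Map each factor subset to (uncentered R-Square, Mallows' C(p))."""
    n, k = len(y), len(names)
    s2 = sse_no_intercept(rows, y, tuple(range(k))) / (n - k)  # full-model error variance
    syy = sum(t * t for t in y)
    table = {}
    for p in range(1, k + 1):
        for cols in combinations(range(k), p):
            sse = sse_no_intercept(rows, y, cols)
            table[tuple(names[c] for c in cols)] = (1 - sse / syy, sse / s2 - n + 2 * p)
    return table


# Synthetic training set of 39 compounds (values are made up for illustration).
random.seed(1)
rows, y = [], []
for _ in range(39):
    j0, j1 = random.uniform(6, 12), random.uniform(3, 7)
    rows.append([j0, j1, j0 * j1])
    y.append(1.0 * j0 + 2.0 * j1 - 0.3 * j0 * j1 + random.gauss(0, 0.05))

table = subset_table(rows, y, ["0JA", "1JA", "JB"])
for subset, (r2, cp) in table.items():
    print(", ".join(subset), f"R-Square = {r2:.4f}", f"C(p) = {cp:.4f}")
```

With this definition, C(p) of the full model equals its number of parameters by construction, so the statistic is informative mainly for the smaller subsets.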

Table 1.
Variable parameter estimation analysis.

Comparing the p values in the table showed that 0JA, 1JA and JB each had a significant influence, and the following regression model was estimated:

LogLD50 = 1.2399 × 0JA + 2.6911 × 1JA − 0.4445 × JB

The model satisfies the usual criteria: R2 is close to 1, the p values are less than 0.01, and the number of parameters equals the test coefficient C(p), so the linearity of the model is appropriate. Residual analysis shows that the model fits well (see Table 2). The residuals are approximately normally distributed, since the points of the normal P-P plot lie almost on a straight line (see Figure 2).
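For readers who want to reproduce such a check, the sketch below computes the coordinates of a normal P-P plot: the empirical cumulative probability of each ordered residual against the cumulative probability expected under a normal distribution with the sample mean and standard deviation. The plotting position (i − 0.5)/n is one common convention and an assumption here, since statistical packages differ in the position they use. The residuals are hypothetical values, not those of Table 2.

```python
from statistics import NormalDist, mean, stdev


def normal_pp_points(residuals):
    """Return (observed, expected) cumulative probability pairs for a normal P-P plot."""
    fitted = NormalDist(mean(residuals), stdev(residuals))
    ordered = sorted(residuals)
    n = len(ordered)
    points = []
    for i, r in enumerate(ordered, start=1):
        observed = (i - 0.5) / n   # empirical cumulative probability
        expected = fitted.cdf(r)   # probability under the fitted normal
        points.append((observed, expected))
    return points


# Hypothetical residuals for 39 training compounds: evenly spaced normal quantiles.
residuals = [NormalDist(0, 0.1).inv_cdf((i - 0.5) / 39) for i in range(1, 40)]

points = normal_pp_points(residuals)
largest_gap = max(abs(obs - exp) for obs, exp in points)
print(f"largest distance from the diagonal: {largest_gap:.4f}")
```

Points that stay close to the diagonal, as they do for these constructed residuals, are what "almost on one line" refers to. Systematic bowing away from the diagonal would indicate skewed or heavy-tailed residuals.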

Figure 2.
Normal P-P plot of residuals.

Table 2.
Building the toxicity prediction regression model of benzoic acid compounds with training dataset (39 benzoic acid compounds).

Analysis of the model shows that 0JA, 1JA and the cross factor JB have a great influence on oral toxicity in mice. The coefficients of 0JA and 1JA are positive, so LogLD50 increases with these terms, whereas the coefficient of JB is negative, so LD50 decreases as JB increases. Since a higher LD50 means lower toxicity, the model shows that 0JA and 1JA are negatively correlated with the toxicity of benzoic acid compounds, and JB is positively correlated with it. The prediction ability of the regression model was also tested with the 18 benzoic acid compounds of the testing dataset, and the results indicate that it is good (Table 3). These factors indeed have a significant effect on toxicity, and the forecasting accuracy of the model improves when the cross factor (JB) is introduced.
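Applying the fitted model to a compound is a one-line calculation, sketched below with the published coefficients. The index values in the example are hypothetical and do not correspond to any compound in Tables 2 or 3. The result is left in logarithmic form because the base of the logarithm and the unit of LD50 are not stated in this section.

```python
def predict_log_ld50(j0a, j1a):
    """LogLD50 from the zero and first order connectivity indices.

    The cross factor JB is derived from the two indices, so it is not
    an independent input.
    """
    jb = j0a * j1a
    return 1.2399 * j0a + 2.6911 * j1a - 0.4445 * jb


# Hypothetical index values, chosen only to show the arithmetic:
# 1.2399 * 6 + 2.6911 * 4 - 0.4445 * 24 = 7.4394 + 10.7644 - 10.6680
example = predict_log_ld50(6.0, 4.0)
print(round(example, 4))  # 7.5358
```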

Table 3.
Toxicity prediction of the regression model with the testing dataset (18 benzoic acid compounds).

4. Conclusions

LD50 is a common measure for evaluating compound toxicity that reflects the susceptibility of test animals, and LD50 values have high reproducibility and stability. In QSTR studies, linear regression analysis is a widely used quantitative method [15]. In this work, the quantitative parameters were calculated with MCI and the following toxicity prediction model of benzoic acid compounds was obtained: LogLD50 = 1.2399 × 0JA + 2.6911 × 1JA − 0.4445 × JB (R-Square = 0.9860). The model has a good forecasting ability.


This work was supported by the Chinese National Science Foundation (No. 30571591).

References and Notes

1. Sun YZ, Yan XL, Li ZJ, Meng FH. Application of Chemical Models in Toxicological Study. Chinese J. Environ. Health. 2007;24:734–736.
2. Mihai VP, Ana-Maria L. Introducing Spectral Structure Activity Relationship (S-SAR) Analysis. Application to Ecotoxicology. Int. J. Mol. Sci. 2007;8:363–391.
3. Mihai VP, Ana-Maria P, Marius L, Luciana I, Adrian C. Quantum-SAR Extension of the Spectral-SAR Algorithm. Application to Polyphenolic Anticancer Bioactivity. Int. J. Mol. Sci. 2009;10:1193–1214.
4. Meng FH, Sun YZ, Li ZJ, Yan XL. The Application of QSAR in the Study of Chemicals Toxicity. Chem. Bioeng. 2007;24:5–7.
5. Randic M. On characterization of molecular branching. J. Am. Chem. Soc. 1975;97:6609–6615.
6. Kier LB, Murray WJ, Randic M, Hall LH. Molecular connectivity I: Relationship to non-specific local anesthetic activity. J. Pharm. Sci. 1975;64:1971–1974.
7. Hall LH, Kier LB, Murray WJ. Molecular connectivity II: Relationship to water solubility and boiling point. J. Pharm. Sci. 1975;64:1974–1977.
8. Kier LB, Murray WJ, Randic M. Molecular connectivity V: Connectivity series applied to density. J. Pharm. Sci. 1976;65:1226–1230.
9. Kier LB, Murray WJ, Randic M, Hall LH. Molecular connectivity VII: Specific treatment of heteroatoms. J. Pharm. Sci. 1976;65:1806–1809.
10. Gayathri P, Pande V, Sivakumar R, Gupta SP. A quantitative structure-activity relationship study on some HIV-1 protease inhibitors using molecular connectivity index. Bioorgan. Med. Chem. 2001;11:3059–3036.
11. Roy K, Leonard JT. QSAR modeling of HIV-1 reverse transcriptase inhibitor 2-amino-6-arylsulfonylbenzonitriles and congeners using molecular connectivity and E-state parameters. Bioorgan. Med. Chem. 2004;12:745–754.
12. Agrawal VK, Khadikar PV. QSAR study on inhibitor of brain 3-hydroxy-anthranilic acid dioxygenase (3-HAO): A molecular connectivity approach. Bioorgan. Med. Chem. 2001;9:3295.
13. Gupta S, Singh M, Madan AK. Applications of graph theory: Relationship of molecular connectivity index and atomic molecular connectivity index with anti-HSV activity. J. Mol. Struct. 2001;571:147–152.
14. Yu XM, Yu XS. A New Method for Calculation Valence Delta of Heteroatoms in Molecular Valence Connectivity Topological Index and Its Application. Chin. J. Organ. Chem. 2001;21:658–667.
15. Xu SJ. Computer-Assisted Drug Molecular Design. Chemical industry press; Beijing, China: 2004.

Articles from International Journal of Molecular Sciences are provided here courtesy of Multidisciplinary Digital Publishing Institute (MDPI)