The aryl hydrocarbon receptor (AhR) is a ligand activated member of the basic-helix-loop-helix (bHLH) family of transcription factors.
1–3 The AhR is activated by a variety of compounds, both synthetic and natural, including halogenated aromatic hydrocarbons such as 2,3,7,8-tetrachlorodibenzo-
p-dioxin (TCDD), and mediates their biological activity.
1–3 The AhR is a cytosolic transcription factor bound to several co-chaperones. Upon ligand binding, the AhR translocates from the cytoplasm to the nucleus and regulates genes, including several drug metabolizing enzymes that can influence the therapeutic activity of a number of compounds.
1–3 The AhR regulates proliferation and differentiation of cells.
4–6 The AhR also induces immunosuppressive regulatory T cells with therapeutic implications in hyperimmune disorders.
7–9 The PAS (Per-AhR/Arnt-Sim) domain of the AhR is the ligand binding domain (LBD).
1–3 In the present study, homology models of the AhR-LBD were built to study the inter and intra-species differences in ligand binding and to identify AhR ligands by virtual ligand screening.
To construct the homology models, a multiple sequence alignment of the AhR-LBD domain from mouse, human, rat, guinea pig, rabbit and zebrafish () was performed. The sequence alignment produced sequence identities of greater than 85% between the different species, with the exception of zebrafish isoform 2, which shared a consensus sequence identity of approximately 70% (). The AhR-LBD of mouse and rat had the highest sequence identity at nearly 98% (), suggesting a common evolutionary path and a recent species splitting as depicted by the evolutionary tree in . The mouse AhR-LBD residues His285, Cys327, Ile332, Met334, Ala375 and Gln377, which have been characterized by mutagenesis studies and are known to influence TCDD binding,
10 are highlighted in red in . These residues are well conserved with the exception of Met337 in guinea pig and Val381 in human AhR. These amino acid residues are replaced with Ile and Ala in all other species ().
| Table 1Interspecies percentage of sequence identity in the AhR-LBD domain |
Several PAS domain structures are currently available in the Protein Data Bank (Pdb). The sequence similarity between the structures of the PAS domain containing-protein family is low; however, within the family there is a high structural conservation of the α and β folds. We selected the PAS domain structure of hypoxia-inducible factor 2α (Hif-2α) (pdb 1p97)
11 for generating a homology model of the mouse AhR-LBD based on its closest similarity in this region. The model was built using the primary amino acid sequence [241–384] of the AhR-PASB domain, producing a sequence identity of 30% from the alignment with Hif-2α. The backbone geometry was preserved while loops were rebuilt using a database search for matching loop geometries.
12 The side chain conformations of amino acids different in the alignment were globally optimized using the biased probability monte carlo procedure.
13 The human, rat, zebrafish 1 and 2 AhR-LBD homology models were built starting from the mouse model and using the respective sequences of the other species homologous to that of the mouse primary sequence [241–384].
Each of the AhR ligand binding domains were built using the homology modeling protocol described in detail by Cardozo et al.
14 The initial model was built by using the template backbone and side-chains in their most likely conformation. Next, the side chains were globally optimized using the modified ECEPP/3 energy function described by Abagyan et al.
15 This energy function includes the following energy terms: van der Waals, hydrogen bonding, torsion energy, distance dependent electrostatic term, disufide bond constraints, side-chain entropy, and the implicit solvent accessible surface dependent solvation term.
14 After the side chain placement the whole model was energy optimized using all the above energy terms but with strong positional restraints. The refined homology models were then geometrically corrected such that all of the residues were in sterically allowed regions of the Ramachandran plot. Following development and refinement of the model, we first analyzed the mouse AhR-LBD homology model to characterize the ligand binding pocket. Using ICM Pocket Finder (Molsoft ICM v3.5-1p),
16 a binding pocket was identified (volume of 201.5 Å
3, area of 178.3 Å
2, and a radius of 3.637 Å) in the region of the protein surrounded by β-sheets Hβ and Iβ (Ala375 and Gln377) at the bottom, by the helix-connector Fα (Cys327and Met334) on the left, by Aβ on the right (His285), and at the top by the loop connecting Aβ and Bβ (). In the helix connector Fα, the backbone NH of Ile332 established an intra H-bond with the backbone carbonyl oxygen of Ala328, keeping the helical secondary structure stable. Replacing Ile332 with a Proline in our mouse AhR-LBD homology model resulted in a loss of intra H-bond interactions. These results suggested that the decrease of TCDD binding in the I332P mutant
10 is due to an indirect effect caused by the disruption of the Fα helix connector in the vicinity of the binding pocket region. Thus, our AhR-LBD homology model provides a structural explanation for the decreased TCDD binding observed with the I332P substitution in the AhR.
10We should note that while other groups have attempted modeling of the AhR-LBD,
10,17 there are noteworthy differences between these models and our characterized binding pocket. The ligand binding pocket in our model is more compact, surrounded mainly by the residues Cys327, Gln377, His285, Met334 and Ala375. The side chains of these amino acids are all pointing towards the binding pocket, indicating structurally their direct involvement in TCDD binding (). The model by Pandini et al
10 is more spread, involving the region between residues Gln377 and His285 surrounded by Dα and Eα helices in addition to the area characterized in our model. Similar to Pandini et al., we observed that the side chain of residues Thr311 and Lys350 point outside the modeled LBD and do not affect AhR ligand binding.
10 The side chain of residues Ile319 and His320 also pointed towards the pocket cavity, confirming their key role in ligand binding specificity.
17To study TCDD binding capacity across species and to provide a structural explanation for the biological data reported in the literature,
1–3 TCDD was docked into mouse, human and zebrafish isoform 2 AhR-LBD models. TCDD docked similarly in all three species binding pockets () and the binding energy calculations (see methodology) were in agreement with previous inter-species biological characterizations.
1–3 Lower binding energy values indicate a higher binding affinity of a particular ligand in our model. The values obtained for TCDD binding in mouse and zebrafish isoform 2 were −3.796 kcal/mol and −3.972 kcal/mol, respectively, confirming that TCDD has similarly high binding affinity in mice and zebrafish.
18 In contrast, the value obtained with the human model was −2.33 kcal/mol, which was significantly higher than for the mouse and zebrafish models, indicating a decreased binding affinity compared with the other two species investigated. TCDD docking orientation was similar in all three species. Further, the residues that were important for ligand binding and facing the ligand pocket cavity were conserved in mouse and zebrafish isoform 2. In the human AhR, one of the residues, corresponding to mouse Ala375 was replaced by a Valine; the presence of this bulkier residue would likely sterically impair TCDD binding ability.
The mouse AhR-LBD homology model was used to structurally characterize some of the intraspecies experimental biological data available for the AhR. The C57BL/6J mouse strain exhibits a 10-fold higher TCDD susceptibility compared to the DBA/2 mouse strain.
1–3 Both the mouse DBA/2 AhRLBD and human AhR-LBD share the A375V substitution (). Hence, the effect of this amino acid substitution on TCDD docking was analyzed in our mouse homology model by mutating A375 into a Valine or a Leucine. In the WT and A375V model TCDD was able to dock into the binding pocket, whereas in the A375L model TCDD failed to dock completely. Thus, the residue at position 375 located in the middle of the ligand binding area appears to directly impact TCDD binding. The replacement of the amino acid Alanine with bulkier residues like Valine (as present in mouse DBA/2 and human sequences) or Leucine decreased the volume of the binding pocket (195 Å
3 (WT), 186.1 Å
3 (A375V) and 148.6 Å
3 (A375L)) () with the consequence of partially (Valine) or totally (Leucine) sterically blocking TCDD binding. By changing the tautomerization of one of the key residue for binding His285 in both WT (194.4 Å
3) and the A375V mutant (185.9 Å
3) models and by re-docking TCDD, the binding energy calculations generated values of −3.617 kcal/mol and −3.413 kcal/mol, respectively. Decreased volume of the binding pocket might result in weaker binding energy that contributes to decreased affinity of TCDD to AhR from human or mouse DBA/2 strain. Indeed, substitution of Alanine at 375 with Leucine has been reported to affect TCDD binding.
10We next used homology modeling to study AhR in zebrafish. Interestingly, zebrafish have three AhR isoforms, namely, AhR1a, AhR1b and AhR2, with AhR2 being the dominant isoform that binds TCDD, whereas AhR1a (zfAhR1a) exhibits impaired TCDD binding.
18,19 The sequence identity between zfAhR1a-and zfAhR2-LBD is 65%. The amino acid residues His285, Ala375 and Gln377 (), which are highly conserved in most species () and contribute to ligand binding are present in zfAhR2; however, in zfAhR1a, these amino acids are substituted with Threonine, Histidine, and Tyrosine, respectively (). Thus, the difference between zfAhR1a and zfAhR2 in TCDD binding may be attributed to these dissimilarities in the ligand binding pocket region. To understand the differences in the ligand binding pockets of zfAhR1a and zfAhR2, homology models for the zfAhR1aand zfAhR2-LBD were built and energetically minimized using the ICM method. TCDD docking was attempted with both zebrafish AhR isoforms models; however, TCDD was able to dock only into zfAhR2-LBD. By analyzing the binding pockets of zfAhR1a and zfAhR2 identified with ICM Pocket Finder,
16 () zfAhR1a was found to have a decreased binding area that may contribute to its lower affinity for TCDD. Substitution of zfAhR2 residues Gln388 and His296 with a Histidine and Tyrosine respectively (as present in zfAhR1a) modified the type of electrostatic interactions established between the protein and TCDD, resulting in decreased binding affinity. Thus, these specific amino acid replacements (Ala386Thr, Gln388His and His296Tyr) in the zfAhR1a-LBD appear to play a direct role in the considerable decrease of TCDD binding by modifying the electrostatic environment of the binding pocket region and the volume of the binding pocket itself. Indeed, electrostatic and van der Waals forces are the driving force in the total binding energy of many halogenated aromatic hydrocarbons towards the AhR.
20In addition to characterization of the prototypical AhR ligand TCDD, we also used our homology models to study the binding of structurally diverse molecules to the AhR. Recently, formylindolo [3,2-b]carbazole (FICZ) and 2-(1’H-indole-3’-carbonyl)-thiazole-4-carboxylic acid methyl ester (ITE) have been described as potent AhR agonists both
in vitro and
in vivo.
8,9 These two ligands were docked into our human and mouse AhR-LBD models to investigate any inter-species differences in their respective binding modes. The two molecules docked into both the human and mouse binding pockets in a similar orientation (), establishing H-bond interactions with protein residues facing the binding cavity. FICZ made two H-bonds between the formyl carbonyl oxygen and the side chain of Ser359 (mouse) and Ser 365 (human) and between the nitrogen of the carbazole group and the side chain of Gln377 (mouse) and Gln383 (human) (). On the other hand, ITE made a single H-bond between its 3’-carbonyl group and the side chain of Ser359 (mouse) and Ser365 (human) (). The binding affinity of FICZ and ITE is higher for mAhR than hAhR as estimated by the binding energy calculations of the ligand and protein complexes. The binding energies for the ligand-mAhRLBD complex formation for FICZ and ITE were −4.714 kcal/mol and −4.214 kcal/mol respectively. The estimated values for hAhR-LBD-ligand binding were −4.12 kcal/mol for FICZ and −3.46 kcal/mol for ITE. The H-bond pattern established between FICZ and ITE with mouse Ser359 and human Ser365, which are not present for TCDD, may play an important role in increasing AhR binding of these two ligands.
To further validate and apply our AhR-LBD homology model, a database of 498 natural compounds was docked into mouse AhR-LBD model binding pocket for the purpose of identifying new ligands of the AhR by virtual ligand screening (VLS).
21 After the docking results were obtained we arbitrarily set a binding energy cutoff of δ −2.5 kcal/mol, which resulted in the selection of 25 compounds that docked into the binding pocket in an allowable orientation; these 25 compounds were then tested in cell based experiments. The list of the 25 docked structures and their respective AhR transcriptional activation results are provided in
Supplemental Table 1. Six of the 25 compounds were able to activate AhR transcription and at least three of the non-activating compounds strongly antagonized TCDD-induced AhR transcription ( and data not shown). The top two AhR activators that we identified by VLS were 5-hydroxy-7-methoxyflavone and pinocembrin, both of which are flavonoids. Both compounds strongly activated the AhR transcription in a dose-dependent manner (). The docking orientation of the two potent AhR activators identified was similar, with the 2-phenyl ring allocated in the small hydrophobic cleft surrounded by residues Phe281, Leu309 and Leu347 (). The 1,4-benzopyrone structure of these two flavonoids bound instead in the middle of the binding pocket surrounded by residues Cys327, His285, Met334 and Ala375 (). Both ligands shared H-bond interaction between the carbonyl group of the benzopyrone template and the side chain of residue Ser359 (). Furthermore, 5-hydroxy-7-methoxyflavone established a second Hbond between its hydroxyl group adjacent to the carbonyl group and the side chain of Ser359 (). We further tested the ability of 5-hydroxy-7-methoxyflavone and pinocembrin to activate endogenous target genes of AhR.
22 Both the compounds activated
CYP1A1 and
NADPH quinone reductase in an AhR dependent manner (). Furthermore, both the compounds promoted nuclear translocation of endogenous AhR from cytosol after one hour of exposure, confirming their status as AhR agonists (). Out of the different flavones predicted to bind to AhR ligand binding pocket by the virtual ligand screen, 5-OH-7-methoxyflavone strongly induced AhR transcription, whereas 3-OH-7-methoxyflavone modestly activated AhR transcription. 6-OH-7-methoxyflavone did not activate AhR, suggesting that 6-OH-7-methoxyflavone binds to AhR, but may not bring AhR in to a transcriptionally active form. We tested whether this compound can act as an AhR transcriptional antagonist. Indeed, co-treatment with 6-OH-7-methoxyflavone strongly inhibited both TCDD- and 5-OH-7-methoxyflavone-induced AhR transcription (
Supplemental Figure 1). There is precedence for differential effects of different flavones on AhR activity.
23,24,25 β-naphthoflavene is a strong AhR agonist, whereas α-naphtoflavone is a partial agonist/antagonist of AhR transcription.
26 By comparing the binding pattern of 7-methoxyflavone derivatives to AhR ligand binding pocket, we were unable to distinguish agonists from potential antagonists.
| Table 2Activation of AhR transcription by compounds identified by virtual docking into AhR LBD |
In summary, homology models of AhR-LBD in the agonist bound conformation were built using the ICM method.
14,15 Using these models, differences in ligand binding across various species was structurally characterized. The mouse AhR-LBD homology model was also utilized successfully to identify new AhR agonists by
insilico structure based VLS.
21