Interferon regulatory factors IRF-3 and IRF-7 are transcription factors essential for activation of the interferon-β (IFN-β) gene in response to viral infections. Although both proteins recognize the same consensus IRF binding site, AANNGAAA, they have distinct DNA binding preferences for sites in vivo. The X-ray structures of the IRF-3 and IRF-7 DNA binding domains (DBDs) bound to IFN-β promoter elements revealed flexibility in the loops (L1–L3) and in the residues that contact the target sequence. To characterize the conformational changes that occur on DNA binding and how they differ between IRF family members, we have solved the X-ray structures of the IRF-3 and IRF-7 DBDs in the absence of DNA. We found that loop L1, which carries the conserved histidine that interacts with the DNA minor groove, is disordered in apo IRF-3 but ordered in apo IRF-7. This is reflected in differences in DNA binding affinity when the conserved histidine in loop L1 is mutated to alanine in the two proteins. The stability of loop L1 in IRF-7 derives from a unique combination of hydrophobic residues that pack against the protein core. Together, our data show that differences in the flexibility of loop L1 are an important determinant of differential IRF-DNA binding.
The interferon regulatory factor (IRF) family of proteins plays an essential role in the activation and regulation of immune response genes implicated in both innate and adaptive immunity (1–5). In addition, some members of this family have critical roles in the differentiation and development of hematopoietic cells and in the regulation of apoptosis (6,7). In mammals, nine IRF proteins have been identified so far and all share the same general architecture: a highly conserved N-terminal domain of approximately 120 residues, which is involved in sequence-specific DNA binding, and a variable C-terminal domain or IRF association domain (IAD), which mediates homo- or hetero-oligomerization among IRF factors as well as association with other transcription factors and co-activators such as CBP/p300 (8,9). Despite the high degree of structural similarity among the different IRF DNA binding domains (DBDs), there are important differences; for instance, IRF-4 contains an N-terminal extension of 20 amino acids that inhibits DNA binding (10) and IRF-7 has a recognition helix α3 that is longer than in other members of the IRF family (11). Nevertheless, all IRF proteins recognize a DNA element with the consensus sequence AANNGAAA. This sequence is found in a multitude of promoters of interferon response genes, which usually contain two or more of these elements (12–14). The crystal structure of the IRF-1 DBD bound to DNA (15) revealed that the IRF DBD consists of a modified version of the helix-turn-helix (HTH) motif that includes a four-stranded antiparallel β-sheet and three large loops (L1–L3) connecting the different secondary structural elements.
Subsequent structures of IRF-2 (16), IRF-4 (10), IRF-3 (17) and IRF-7 (11) bound to DNA have confirmed that DNA recognition is achieved by (i) conserved residues on helix α3 (Arg, Cys, Asn) interacting with the GAAA bases (AANNGAAA) in the major groove and (ii) a conserved His on loop L1 protruding into the minor groove and making water-mediated contacts with two consecutive A:T steps upstream of the GAAA core sequence (AANNGAAA). These sets of interactions are further stabilized through a network of hydrogen bonds between the IRF proteins and the DNA phosphate backbone. The recent structures of the IRF-3 DBD bound to the PRDI–PRDIII DNA element of the IFN-β enhancer and of the IRF-3, IRF-7 and NF-κB DBDs bound to the PRDII–PRDIII element have provided insights into the basis of cooperativity when two or more IRF DBDs bind to natural enhancers and promoters (11,18). These structures show that, because the individual sites overlap, the binding of one IRF molecule affects the binding of the second primarily through DNA bending rather than through direct contacts between the proteins. Also, IRF-3 appears to be capable of recognizing both consensus and non-consensus sites, and IRF-7 can better accommodate sites with G:C base pairs upstream of the AA element. To fully understand the structural and thermodynamic determinants of IRF DNA recognition, it is important to characterize the structural changes that occur upon DNA binding. To address this issue, we have solved the structures of the IRF-3 and IRF-7 DBDs in the absence of DNA. Our data show that differences in the flexibility of loop L1 in IRF proteins play an important role in sequence-specific DNA binding.
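As an illustration of what the AANNGAAA consensus means in practice, the following minimal sketch scans a DNA sequence on both strands for matches, treating N as any base. The function names are hypothetical and the scan is purely a string match; it says nothing about binding affinity or about the non-consensus sites that IRF-3 can also recognize.

```python
import re

CONSENSUS = "AANNGAAA"  # IRF consensus site; N = any base
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an upper- or lower-case DNA string."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def consensus_to_regex(consensus):
    """Compile a consensus into a regex; the lookahead allows overlapping hits."""
    body = "".join("[ACGT]" if base == "N" else base for base in consensus)
    return re.compile("(?=(" + body + "))")


def find_sites(seq, consensus=CONSENSUS):
    """List (start, strand, matched_site) for every consensus match.

    start is the 0-based index of the leftmost base of the site on the
    given (forward) strand, whichever strand carries the match.
    """
    seq = seq.upper()
    pattern = consensus_to_regex(consensus)
    hits = [(m.start(), "+", m.group(1)) for m in pattern.finditer(seq)]
    for m in pattern.finditer(reverse_complement(seq)):
        start = len(seq) - m.start() - len(consensus)
        hits.append((start, "-", m.group(1)))
    return sorted(hits)


if __name__ == "__main__":
    # Labeled strand of the PRDI oligonucleotide used in the binding assay
    for hit in find_sites("GAGAAGTGAAAGTGG"):
        print(hit)
```

Applied to the PRDI oligonucleotide, the scan reports a single forward-strand site, AAGTGAAA, starting at the fourth base.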
The cDNA fragments encoding human IRF-3 DBD (residues 1–113) and mouse IRF-7 DBD (residues 1–134) were generated by PCR and cloned into pET-15b vectors (Novagen). All point mutations were introduced with the QuikChange site-directed mutagenesis kit (Stratagene). Proteins were purified as previously described (18). Briefly, the proteins were expressed in the Escherichia coli strain BL21(DE3) pLysS (Stratagene). Cells were grown in LB medium supplemented with ampicillin (100µg/ml) at 37°C to OD600 0.5 and induced with 0.4mM IPTG for 3h. Clarified lysates were applied to nickel affinity columns (Qiagen), yielding protein up to 90% pure as judged by SDS–PAGE. The His-6 affinity tags were removed by thrombin digestion (Sigma). The proteins were then loaded onto an SP-Sepharose column (GE Healthcare) equilibrated with 25mM HEPES pH 7.0, 50mM NaCl, 1mM DTT and eluted using a linear gradient of 0–1.0M NaCl in the same buffer. Further purification was carried out using an SD75 gel filtration column (GE Healthcare) equilibrated with a buffer containing 25mM Tris–HCl pH 7.5, 300mM NaCl and 1mM TCEP. The purified proteins were concentrated and used for crystallization.
Selenomethionine (SeMet) crystals of IRF-3 DBD were obtained by mixing equal volumes of protein solution (30mg/ml) and reservoir solution containing 6% PEG 1000, 100mM sodium acetate (pH 6.0), 300mM zinc chloride and 8% cadaverine. Crystals grew at 20°C within 2 days and belonged to space group P3221 with unit cell dimensions of a=b=64.90 Å, c=157.62 Å. Multiwavelength anomalous diffraction (MAD) data were measured from a single frozen crystal at the Advanced Photon Source (APS, beamline 17ID) of Argonne National Laboratory. Data were measured at three wavelengths, corresponding to the inflection (0.9796 Å) and peak (0.9791 Å) of the selenium absorption edge plus a remote point (0.9537 Å). IRF-3 crystals diffracted to 2.1 Å resolution at APS. The data were truncated to 2.3 Å because of an ice ring in the 2.1–2.25 Å resolution range.
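As a quick sanity check on the MAD wavelengths, each can be converted to a photon energy with E = hc/λ. The sketch below is illustrative only; the function name is hypothetical, and hc ≈ 12.3984 keV·Å is the standard conversion constant. The inflection and peak wavelengths come out near the selenium K edge (about 12.66 keV), with the remote point a few hundred eV above it.

```python
HC_KEV_ANGSTROM = 12.398419843  # Planck constant x speed of light, in keV * angstrom


def wavelength_to_kev(wavelength_angstrom):
    """Convert an X-ray wavelength in angstrom to photon energy in keV."""
    if wavelength_angstrom <= 0:
        raise ValueError("wavelength must be positive")
    return HC_KEV_ANGSTROM / wavelength_angstrom


if __name__ == "__main__":
    # Wavelengths quoted in the text for the IRF-3 SeMet data set
    for label, wavelength in [("inflection", 0.9796),
                              ("peak", 0.9791),
                              ("remote", 0.9537)]:
        print(f"{label:10s} {wavelength:.4f} A -> {wavelength_to_kev(wavelength):.4f} keV")
```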
Apo IRF-7 DBD crystals were obtained by mixing two volumes of protein solution (46mg/ml) with one volume of reservoir solution containing 18% PEG 1500. The crystals grew overnight and belonged to the trigonal space group P31 with unit cell dimensions of a=b=68.59 Å, c=68.61 Å. Crystals were flash frozen in liquid nitrogen using mother liquor plus 20% glycerol as a cryoprotectant. Diffraction data were collected at the National Synchrotron Light Source (NSLS, beamlines X6A and X29) at λ=1.0 Å. The crystals diffracted to 1.3 Å resolution. The data were integrated and reduced using the HKL2000 package (19).
The structure of IRF-3 DBD was solved by molecular replacement (MR) using one of the IRF monomers from the IRF-3/PRDIII–PRDI complex (PDB id 2PI0) (18) as a search model. The initial MR solution was obtained using the program PHASER (20) and included only two molecules in the asymmetric unit. Inspection of the density map calculated with this solution suggested the presence of a third IRF-3 molecule in the asymmetric unit, which was located with the program MolRep (21). An anomalous difference Fourier map was used to locate the positions of the selenium (and zinc) atoms. Structure refinement was carried out by rounds of energy minimization and B-factor refinement using the CNS package (22). Manual model rebuilding was performed with the program COOT (23) using composite omit maps generated by CNS. Final cycles of restrained refinement were performed using the programs REFMAC (24) and PHENIX (25). The final model contains three monomers in the asymmetric unit, named IRF-3A (amino acids 5–40 and 49–110), IRF-3B (amino acids 5–39 and 49–110) and IRF-3C (amino acids 5–39 and 49–110), together with 9 zinc ions, 3 sodium ions, 7 chloride ions and 151 water molecules (Figure 1A). The model has excellent geometry, with the Ramachandran analysis indicating that 94.6% of the residues are in the preferred regions and 5.4% are in the allowed regions (Table 1).
The structure of the apo form of mouse IRF-7 DBD was solved by MR using a homology search model derived from the DNA-bound human IRF-7 DBD structure (PDB id 2O61) (11) and generated by the program CHAINSAW (26). The initial MR solution was obtained with the program PHASER, which suggested three molecules in the asymmetric unit. After a round of rigid-body refinement, the Rfact and Rfree values were 47 and 48%, respectively. A significant twin fraction (α=0.36) was detected by the merohedral crystal twinning server (http://nihserver.mbi.ucla.edu/twinning/) and the Xtriage program in the Phenix suite (25). Reflections for the free set were therefore selected using the highest lattice symmetry [P3(bar)m] in order to avoid bias in the calculation of Rfree due to the pseudo-symmetry generated by the twinning. In the final step of refinement, twinning was included, which significantly improved the refinement statistics (final Rfree=20.0%). Iterative rounds of restrained refinement and building were performed in the same manner as for the apo IRF-3 DBD. The final model contains three monomers in the asymmetric unit, named IRF-7A (amino acids 9–130), IRF-7B (amino acids 9–125) and IRF-7C (amino acids 8–125), together with 287 water molecules and 3 sodium ions. The model has excellent geometry, with 98.0% of the residues in the most favored regions of the Ramachandran plot and no residues in the disallowed regions (Table 1). All figures were made with the PyMOL software (27).
6-Carboxyfluorescein (6-FAM)-labeled (5′-6FAM-GAGAAGTGAAAGTGG-3′) and unlabeled complementary DNA oligonucleotides containing the PRDI consensus site were purchased from Integrated DNA Technologies (IDT; Coralville, IA, USA). Oligonucleotides were resuspended in 1×TE buffer supplemented with 100mM NaCl. DNA duplexes were formed by heating a mixture of one equivalent of the 6-FAM-labeled strand and one equivalent of the complementary strand to 95°C and allowing the sample to cool to room temperature. Each reaction sample (volume 150µl) consisted of 5nM 6-FAM-labeled DNA and increasing concentrations of IRF-3 DBD or IRF-7 DBD (0.5nM–300µM) in a buffer containing 25mM HEPES pH 7.0, 50mM NaCl. Reactions were incubated at 20°C for 15min prior to measurement. Fluorescence anisotropy data were collected on a Beacon 2000 fluorescence polarization system (Invitrogen) with the excitation filter set at 490nm and the emission filter at 520nm. The fraction of DNA bound (B) was calculated using the following equation:
$$B = \frac{[A]_x - [A]_{DNA}}{[A]_{final} - [A]_{DNA}}$$
where [A]x equals the anisotropy measured at protein concentration X, [A]DNA is the anisotropy in the absence of protein and [A]final is the anisotropy value at saturation. Each data point was calculated from an average of five anisotropy measurements. The fraction of DNA bound was then plotted versus DBD concentration and the data were fitted by hyperbolic regression, using Origin 6.1 (OriginLab), to the following equation:
$$B = \frac{D_0}{K_d + D_0}$$
where B is the fraction of DNA duplex bound, D0 is the total concentration of the DBD, and Kd is the dissociation constant of the complex.
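The analysis described above can be sketched in a few lines of Python. This is not the Origin procedure used in the study: it is a minimal illustration that assumes B is the anisotropy change normalised to its value at saturation, assumes the single-site hyperbola B = D0/(Kd + D0) (reasonable when the 5nM DNA is well below Kd, so protein depletion is negligible), and estimates Kd by a golden-section search on log10(Kd). All function names and the synthetic data are hypothetical.

```python
import math


def fraction_bound(a_x, a_dna, a_final):
    """Normalise an anisotropy reading to the fraction of DNA bound."""
    return (a_x - a_dna) / (a_final - a_dna)


def hyperbola(d0, kd):
    """Single-site binding isotherm: fraction bound at protein concentration d0."""
    return d0 / (kd + d0)


def fit_kd(concs, fractions, lo=1e-3, hi=1e9, iters=200):
    """Least-squares estimate of Kd (same units as concs).

    Minimises the sum of squared residuals over log10(Kd) with a
    golden-section search between lo and hi.
    """
    def sse(log_kd):
        kd = 10.0 ** log_kd
        return sum((hyperbola(d, kd) - b) ** 2 for d, b in zip(concs, fractions))

    a, b = math.log10(lo), math.log10(hi)
    ratio = (math.sqrt(5.0) - 1.0) / 2.0
    c = b - ratio * (b - a)
    d = a + ratio * (b - a)
    for _ in range(iters):
        if sse(c) < sse(d):
            b = d          # minimum lies in [a, d]
        else:
            a = c          # minimum lies in [c, b]
        c = b - ratio * (b - a)
        d = a + ratio * (b - a)
    return 10.0 ** ((a + b) / 2.0)


if __name__ == "__main__":
    # Synthetic, noise-free titration (nM) generated with Kd = 486 nM
    concs = [5.0, 50.0, 500.0, 5000.0, 50000.0]
    fractions = [hyperbola(c, 486.0) for c in concs]
    print(f"fitted Kd = {fit_kd(concs, fractions):.1f} nM")
```

Searching in log space keeps the fit well behaved across the wide concentration range of the titration (0.5nM–300µM); with real, noisy data a dedicated nonlinear least-squares routine that also reports parameter uncertainties would be preferable.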
As in previous structures of IRF DBD/DNA complexes, the apo forms of the IRF-3 and IRF-7 DBDs retain an α/β architecture consisting of three α helices (α1–α3) flanked by a four-stranded antiparallel β-sheet (β1–β4), a variant of the helix-turn-helix (HTH) motif commonly found in transcription factors (Figure 1B and E) (28). As is characteristic for this family, three large loops (L1–L3) connect different parts of the secondary structure. Loop L1 connects β2 and α2, loop L2 connects α2 and α3, and loop L3 connects β3 and β4. The electron density for the apo IRF-3 DBD is well defined with the exception of 4 residues at the N-terminus and approximately 10 residues from loop L1. In the case of the IRF-7 DBD, only the eight residues at the N-terminus are not defined. The apo IRF-3 and IRF-7 DBD structures superimpose with an RMSD of 1.39Å; if the superposition is restricted to the three α-helices and the β-sheet, the RMSD decreases to 1.19Å. The largest deviations between the two structures occur in loops L1 and L2.
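For reference, the RMSD values quoted here are root-mean-square distances between equivalent Cα atoms after superposition. The sketch below shows only the final distance calculation and assumes the two coordinate sets have already been superimposed and paired atom-for-atom; the optimal superposition itself (for example by the Kabsch algorithm) is not shown. The function name and the toy coordinates are hypothetical.

```python
import math


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation (in the input units, e.g. angstrom)
    between two equally long lists of (x, y, z) tuples.

    The coordinates are assumed to be already superimposed and listed
    in matching order.
    """
    if len(coords_a) != len(coords_b):
        raise ValueError("coordinate lists must have the same length")
    if not coords_a:
        raise ValueError("coordinate lists must not be empty")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))


if __name__ == "__main__":
    # Toy example: every atom displaced by 1 angstrom along z
    set_a = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0)]
    set_b = [(0.0, 0.0, 1.0), (3.8, 0.0, 1.0), (7.6, 0.0, 1.0)]
    print(rmsd(set_a, set_b))
```

Because the result is a distance, it carries units of Å rather than Å2, and restricting the atom selection to the well-ordered secondary structure elements lowers the value, as seen in the drop from 1.39Å to 1.19Å.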
The three IRF-3 DBD molecules in the crystallographic asymmetric unit are arranged with their loops (L1–L3) and helix α3 exposed to the solvent, while the rest of each molecule packs against its neighbors. Interestingly, the apo IRF-3 DBD crystals grow only in the presence of ZnCl2. We identified nine Zn2+ ions via their anomalous signal; together these ions mediate intermolecular interactions between the three molecules by coordinating a set of Asp, Glu and His residues (Figure 1A), as well as chloride ions. The coordination is typically tetrahedral with an average Zn2+-ligand distance of 2.1Å (29). A superposition of the three molecules yields a low RMSD value of 0.5 Å for 98 Cα atoms. The largest deviation occurs in the linker between helix α3 and strand β3 (RMSD of 1.47Å) between molecule C and molecules A/B. A notable feature of the structure is the absence of density for most of loop L1 (amino acids Gly41–Asn48) in all three molecules of the asymmetric unit (Figure 1A).
In the previous IRF-3-DNA complex structures, loop L1 interacts with the DNA minor groove, with residues Trp38, Gly41, Leu42 and Gln44 contacting the DNA backbone and His40 making water-mediated contacts with the two contiguous A:T base pairs upstream of the GAAA core sequence (AANNGAAA) (11,17,18). The loop is somewhat flexible in the DNA complexes (average B-factor of ~70Å2) but, as we show here, it is largely disordered in the absence of DNA (Figures 1B and 2A). In addition, residues that flank the loop deviate substantially in conformation from that observed in complex with DNA. For example, the Cα of Phe51 is shifted by ~10Å and the Cα of His40 is displaced by ~3.7Å. Side chain density for His40 is defined in only one of the three molecules in the asymmetric unit (Figure 2A). With the exception of loop L1 (and the flanking residues), the rest of the apo structure (including loops L2 and L3) superimposes well with the DNA complex, with an RMSD of ~1.06Å for 98 Cα atoms (Figure 3A). Most of the conformational changes are limited to local rearrangements of side chains. Most notably, Lys77, Arg78, Arg81 and Arg86 (on helix α3), which make contacts with the DNA in the complex, adopt different (or multiple) conformations in the absence of DNA (Figure 3A) (18).
Although the IRF-7 DBD crystallizes in a different space group from the IRF-3 DBD, there are again three molecules in the asymmetric unit (Figure 1B). The three molecules superimpose with a maximum RMSD of ~0.5Å for 117 Cα atoms. In contrast to the apo IRF-3 DBD, loop L1 is well defined (Figure 1E and 2B). This ordering of loop L1 in IRF-7 does not appear to be a crystallization artifact, as the conformation of the loop is almost identical in all three molecules in the asymmetric unit despite their different crystal packing environments. Loop L1 in IRF-7 differs from that in other IRF family members in containing a phenylalanine (Phe45) instead of an alanine after the conserved histidine (His44) (Figure 1C). Phe45 packs against Leu50 (also unique to IRF-7) in loop L1 and Phe58 in helix α2, as part of the hydrophobic core of the IRF-7 DBD (Figure 2B). Together, these hydrophobic interactions appear to restrict the conformation of loop L1, which is reflected in lower average B-factors for this region (~25 Å2) compared with the same region in IRF-3.
Loop L2 is much longer in the IRF-7 DBD than in other IRFs and has been proposed to interact with the p50 subunit of NF-κB when the two transcription factors bind to adjacent sites on the interferon-β enhancer (10). Because of its length and the abundance of proline and glycine residues, it is the most flexible segment of the IRF-7 DBD structure. Indeed, in the structure of IRF-7 bound to DNA, loop L2 is partially unstructured (11). In the apo structure, loop L2 is well defined but has the highest average B-factors (~29 Å2) (Figure 3B). Interestingly, residues 71–78 in L2 are unique to IRF-7 and adopt an extended conformation characteristic of poly glycine and poly proline chains (Figure 3B). This extended conformation appears to be important in allowing Arg67 on loop L2 to contact Asn75 on NF-κB.
Intriguingly, the recognition helix α3 in the apo structure is longer at the N-terminus by three amino acids than in the IRF-7 DBD bound to DNA (Figure 3B). Among these amino acids is Arg89, which in the DNA complex makes contacts with the DNA backbone but in the absence of DNA is directed toward the interior of the protein and makes contacts with structural water molecules in the protein cavity. Interestingly, Arg89 in apo IRF-7 adopts a conformation similar to that of Lys92 in the DNA complex, pointing to a switch in position between the two residues upon DNA binding. One feature of helix α3 that is conserved in the apo and DNA-bound forms of IRF-7 is a kink in the helix at Gly90, indicating that this bending is not a consequence of DNA binding.
We also identify a putative metal ion between helix α3 and strand β3 in the apo structure that is coordinated by the main-chain carbonyl groups of Leu99, Gly103 and Phe105. The metal (possibly Na+ from the crystallization mix) appears to compensate for the large dipole moment of the long helix α3. A similar metal binding site is found in the DNA-bound structure of IRF-2 (16).
The structures of all IRF/DNA complexes have shown that a conserved histidine on loop L1 (His40 in IRF-3 and His44 in IRF-7) partially enters the DNA minor groove and makes water-mediated contacts with A:T base pairs upstream of the GAAA core sequence (AANNGAAA). The observed difference in flexibility of loop L1 between the IRF-3 and IRF-7 DBDs raises the question of how mutating residues in loop L1 affects DNA binding. To address this, we constructed two mutants: (i) IRF3-H40A, in which His40 in the IRF-3 DBD is mutated to alanine, and (ii) IRF7-H44A, in which His44 in the IRF-7 DBD is mutated to alanine. The wild-type (WT) and mutant proteins were tested for DNA binding by fluorescence anisotropy. Figure 5 shows the results obtained using the PRDI DNA element with sequence GAGAAGTGAAAGT. Our experiments show that the WT IRF-3 DBD binds with a dissociation constant of 486nM, while the WT IRF-7 DBD binds with a slightly lower affinity (Kd=630nM). These values are consistent with previously reported affinities of IRF proteins for the PRDI element (30). However, whereas the affinity of IRF3-H40A decreases by ~3.6-fold (Kd=1706nM), IRF7-H44A binds with almost the same affinity (Kd=532nM) as the WT protein.
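To put these dissociation constants on an energy scale, the change in binding free energy on mutation can be estimated as ΔΔG = RT ln(Kd,mutant/Kd,WT). The sketch below is a back-of-the-envelope illustration, not an analysis from the study; it assumes the 20°C incubation temperature of the binding assay and uses the four Kd values reported above. Function names are hypothetical.

```python
import math

R_KCAL_PER_MOL_K = 1.987204e-3  # gas constant in kcal/(mol*K)


def fold_change(kd_mutant, kd_wt):
    """Ratio of mutant to wild-type Kd (values > 1 mean weaker binding)."""
    return kd_mutant / kd_wt


def delta_delta_g(kd_mutant, kd_wt, temp_celsius=20.0):
    """Change in binding free energy (kcal/mol) caused by the mutation.

    Positive values mean the mutant binds less tightly than wild type.
    """
    temp_kelvin = temp_celsius + 273.15
    return R_KCAL_PER_MOL_K * temp_kelvin * math.log(kd_mutant / kd_wt)


if __name__ == "__main__":
    # Kd values in nM as reported for the PRDI element
    for name, kd_wt, kd_mut in [("IRF-3 H40A", 486.0, 1706.0),
                                ("IRF-7 H44A", 630.0, 532.0)]:
        print(f"{name}: fold change = {fold_change(kd_mut, kd_wt):.2f}, "
              f"ddG = {delta_delta_g(kd_mut, kd_wt):+.2f} kcal/mol")
```

With the reported values, the Kd ratio for IRF-3 H40A is about 3.5, corresponding to roughly 0.7 kcal/mol of lost binding free energy, whereas the ratio for IRF-7 H44A is about 0.84, a difference of only about 0.1 kcal/mol that is likely within experimental error.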
We present here the first crystal structures of the IRF-3 and IRF-7 DBDs in the absence of DNA. Superposition of the apo and DNA-bound IRF-3 DBDs reveals two primary conformational changes on DNA binding: (i) the N-terminus of IRF-3 undergoes a disorder-to-order transition, with Lys5 making contacts with DNA along the minor groove, and (ii) loop L1 becomes ordered, so that approximately 10 residues that are not visible in the apo IRF-3 structure are well defined in the DNA-bound structure. In contrast to IRF-3, the L1 loop is ordered in the apo IRF-7 DBD structure and the major change is a rearrangement of the loop as a rigid body around the minor groove. The inherent flexibility of loop L1 is consistent with the apo NMR structure of the IRF-2 DBD, in which L1 displays a large number of conformations (31). A conserved PWKH motif in loop L1 appears to be important in positioning L1 for its interaction with DNA. Specifically, the proline (Pro37 in IRF-3) fixes the adjoining tryptophan (Trp38) in an outward conformation for interaction with the DNA backbone. The indole ring of this tryptophan moves by about 1.6 Å in order to interact with a DNA phosphate group. At the same time, the main chain carbonyl group of Lys39 is positioned to make a hydrogen bond with Lys77 (on the recognition helix α3), providing stability to the overall loop conformation. Notably, the imidazole ring of His40 swings by almost 6 Å into the DNA minor groove in order to make water-mediated contacts with A:T base pairs, while the main chain carbonyl group of His40 interacts with the main chain amide group of Arg43 to further stabilize the loop (Figure 4A).
In contrast to IRF-3, the L1 loop is ordered in the apo IRF-7 DBD structure and stabilized in part by two hydrophobic residues (Phe45 and Leu50) that fold back into the core of the protein. Upon DNA binding, the loop undergoes a rigid body transition of ~2 Å that pivots around two points at the ends of the loop. At the N-terminal end, the Cα of the conserved residue Trp42 moves by about 1 Å in order to interact with the DNA backbone. Moreover, the displacement of Trp42 pushes against the other pivot site at residue Phe60. As in the IRF-3 DBD, the imidazole ring of the conserved histidine residue (His44) rotates around the chi1 bond by ~25° to position itself to interact with the minor groove. The difference in the inherent flexibility of loop L1 between IRF-3 and IRF-7 seems to be reflected in the effect of the His mutations on DNA binding. Thus, while mutation of His40 to alanine decreases DNA binding of IRF-3 by ~3.6-fold, the equivalent mutation in IRF-7 produces no significant change in binding (Figure 5). The decrease in binding with the IRF-3 H40A mutation likely represents the loss of interactions that stabilize loop L1 upon DNA binding, including the interactions between His40 and the DNA minor groove. However, because loop L1 is already ordered in apo IRF-7 and His44 is involved in interactions with Ser112 and several water molecules, it is conceivable that there is less of a gain in enthalpy when His44 (compared to His40 in IRF-3) interacts with DNA. Our data suggest that differences in the inherent flexibility of loop L1 between IRF-3 and IRF-7 have a direct effect on DNA binding and may play a role in the distinct DNA binding specificities observed for the two proteins. Indeed, there is some indication from biochemical studies that IRF-3 and IRF-7 interact differently with DNA.
IRF-7, for example, is more tolerant than IRF-3 of changes in the AA sequence upstream of the GAAA core (AANNGAAA), suggesting a somewhat looser interaction with DNA (9,31). A fuller exploration of the differences between IRF-3 and IRF-7 in DNA selection will require additional mutations and a more detailed thermodynamic analysis on several DNA sites.
Taken together, we show here that the structures of the apo IRF-3 and IRF-7 DBDs are generally similar to the DNA-bound forms, including a kink in the recognition helix α3 and ordered loops L2 and L3. The primary difference is in loop L1, which undergoes a disorder-to-order transition in the IRF-3 DBD and a conformational change in the IRF-7 DBD. The varying intrinsic flexibility of loops and tails in the IRF family may serve as a mechanism to modulate the binding specificity of its members and to respond to a larger population of diverse promoter sites.
Coordinates have been submitted to the RCSB Protein Data Bank with accession codes 3QU6 for IRF-3 and 3QU3 for IRF-7.
US National Institutes of Health (R01 AI41706 to A.K.A. and R01 GM092854 to C.R.E.). Funding for open access charge: NIH.
Conflict of interest statement. None declared.
We thank the staff at APS beamline 17ID and NSLS beamline X6A for help with data collection.