The crystal structure of an uncharacterized conserved protein S4005, encoded by the yidB gene of Shigella flexneri (gi:30043267),1 has been determined by the single-wavelength anomalous diffraction (SAD) method and refined to 1.45 Å. The YidB structure is the first representative of the COG3753 and Pfam06078 medium-size families of bacterial proteins of unknown function (Fig. 1). The yidB gene of S. flexneri, like the yidB gene of Escherichia coli, is located between the yidA and gyrB genes, which are involved in DNA processing. The biochemical function of the yidA product is unknown, but it is predicted to have hydrolase/phosphatase activity.2,3 The other neighbor, gyrB, encodes subunit B of DNA gyrase, a type II topoisomerase that controls DNA supercoiling and relaxation.4 Genes in bacteria are often clustered according to the functions of their products.5 Thus, it is possible that the YidB protein has functions associated with DNA. YidB is found in a number of pathogenic species, including Escherichia, Bordetella, Burkholderia, and Shigella species (Fig. 1).
Here we report the crystal structure of YidB protein at 1.45 Å resolution. The structure represents a new protein fold and shows distant structural similarity to eukaryotic homeodomain proteins.
The yidB gene was cloned into the pMCSG7 vector6 and overexpressed in E. coli BL21 (DE3)-Gold (Stratagene) harboring an extra plasmid encoding three rare tRNAs (AGG and AGA for Arg, ATA for Ile). The pMCSG7 vector, which bears a TEV protease cleavage site, creates a construct with a cleavable His6-tag fused to the N-terminus of the target protein and adds three artificial residues (Ser–Asn–Ala) at that end. The cells were grown in selenomethionine (SeMet)-containing enriched M9 medium under conditions known to inhibit methionine biosynthesis.7,8 The cells were grown at 37°C to an OD600 of ~0.6, and protein expression was induced with 1 mM IPTG. After induction, the cells were grown overnight with shaking at 20°C. The harvested cells were resuspended in five volumes of lysis buffer (50 mM HEPES, pH 8.0, 500 mM NaCl, 10 mM imidazole, 10 mM β-mercaptoethanol, and 5% v/v glycerol) and stored at −20°C.
The thawed cells were lysed by sonication after the addition of protease inhibitors (Sigma, P8849) and 1 mg/mL lysozyme. The lysate was clarified by centrifugation at 30,000 × g (RC5C-Plus centrifuge, Sorvall) for 20 min, followed by filtration through 0.45 μm and 0.22 μm inline filters (Gelman).
The standard purification protocol has been described in detail previously.9 Immobilized metal affinity chromatography (IMAC-I) using a 5-mL HiTrap Chelating HP column charged with Ni2+ ions and buffer-exchange chromatography on a HiPrep 26/10 desalting column (both Amersham Biosciences) were performed using an AKTA EXPLORER 3D system (Amersham Biosciences). The His6-tag was cleaved using recombinant TEV protease expressed from the vector pRK508.10 The protease was added to the target protein at a ratio of 1:30, and the mixture was incubated at 4°C for 48 h. The YidB protein was then purified using a 1-mL HiTrap Chelating column (Amersham Biosciences) charged with Ni2+ ions. The protein was dialyzed against 20 mM Tris-HCl (pH 7.1), 50 mM NaCl, 2 mM DTT, and concentrated using a Centricon Plus-20 Centrifugal Concentrator (Millipore).
Initially, protein crystals were obtained in sitting drops by using the Index (Hampton Research) and Wizard I and II (Emerald Biostructures) crystallization screens with the help of a HoneyBee crystallization workstation (Cartesian Technologies). The first crystals appeared after several days in Index conditions #18 and #48 and Wizard I conditions #19 and #23. In approximately 2 wk, a large crystal conglomerate formed in Index condition #4. This condition was optimized further by hand using the hanging drop technique. The protein solution (1 μL, 44 mg/mL) was mixed with 1 μL of 0.1 M Bis-Tris (pH 7.2) and 2.0 M ammonium sulfate, and equilibrated over 1 mL of crystallization solution at 23°C. Quality crystals, which appeared in 2 wk, were flash-frozen in liquid nitrogen prior to data collection, with crystallization solution supplemented with 15% (v/v) glycerol as cryoprotectant.
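The starting composition of the optimized hanging drop follows from simple dilution of the two 1 μL aliquots. The sketch below is illustrative only: the function name mix_drop is hypothetical, and the calculation ignores the buffer components carried in with the protein solution (Tris-HCl, NaCl, DTT) as well as any later concentration by vapor diffusion.

```python
def mix_drop(protein_ul, reservoir_ul, protein_mg_ml, reservoir_components):
    """Return initial drop concentrations right after mixing (hypothetical helper).

    reservoir_components maps a component name to its molar concentration
    in the crystallization solution. Simple volume-weighted dilution is assumed.
    """
    total_ul = protein_ul + reservoir_ul
    protein_conc = protein_mg_ml * protein_ul / total_ul
    diluted = {
        name: conc * reservoir_ul / total_ul
        for name, conc in reservoir_components.items()
    }
    return protein_conc, diluted


# Values taken from the text: 1 uL of protein at 44 mg/mL plus 1 uL of
# 0.1 M Bis-Tris (pH 7.2), 2.0 M ammonium sulfate.
protein_conc, components = mix_drop(
    protein_ul=1.0,
    reservoir_ul=1.0,
    protein_mg_ml=44.0,
    reservoir_components={"Bis-Tris": 0.1, "ammonium sulfate": 2.0},
)

print(f"protein: {protein_conc:.1f} mg/mL")
for name, conc in components.items():
    print(f"{name}: {conc:.3f} M")
```

With equal volumes every concentration is halved at the moment of mixing, so the drop starts well below the reservoir precipitant concentration and equilibrates toward it.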
Diffraction data were collected at 100 K at the 19BM beamline of the Structural Biology Center at the Advanced Photon Source, Argonne National Laboratory. The single wavelength anomalous dispersion (SAD) data at 0.9793 Å (peak: 12.6603 keV) up to 1.45 Å were collected from a single (0.1 × 0.02 × 0.05 mm) SeMet-labeled protein crystal. The space group was C2 with cell dimensions of a = 57.48 Å, b = 40.48 Å, c = 48.33 Å, α = 90.00°, β = 93.78°, and γ = 90.00°. There is one protein molecule in the asymmetric unit. All data were processed and scaled with the HKL2000 suite11 (Table I).
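As a quick consistency check on the reported data-collection parameters, the wavelength can be converted to photon energy with E = hc/λ, and the monoclinic cell volume follows from V = abc·sin β. The snippet below is a minimal sketch, not part of the original processing pipeline; the function names are hypothetical and the constant hc ≈ 12.398419843 keV·Å is taken from the standard values of h and c.

```python
import math

# h*c expressed in keV * Angstrom (from the exact SI values of h, c, and e)
HC_KEV_ANGSTROM = 12.398419843


def wavelength_to_energy_kev(wavelength_angstrom):
    """Photon energy in keV for an X-ray wavelength given in Angstrom."""
    return HC_KEV_ANGSTROM / wavelength_angstrom


def monoclinic_cell_volume(a, b, c, beta_deg):
    """Unit cell volume in cubic Angstrom for a monoclinic cell (alpha = gamma = 90 deg)."""
    return a * b * c * math.sin(math.radians(beta_deg))


# Values reported in the text
energy_kev = wavelength_to_energy_kev(0.9793)
volume_a3 = monoclinic_cell_volume(57.48, 40.48, 48.33, 93.78)

print(f"photon energy: {energy_kev:.4f} keV")
print(f"unit cell volume: {volume_a3:.0f} cubic Angstrom")
```

The computed energy agrees with the quoted peak energy to within the rounding of the four-digit wavelength, and the cell volume comes out at roughly 1.12 × 10^5 Å^3.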
The YidB structure was determined by SAD phasing using HKL2000_PH (W. Minor, University of Virginia, personal communication) and RESOLVE12 and refined to 1.45 Å using REFMAC 5.213 in the CCP4 suite.14 The initial model was completed by using ARP/wARP15 and manual fitting with the COOT16 and O17 programs. The Structure Analysis server (STAN) was used to run the WASP program18 for identification of a sodium ion in the structure. The stereochemistry of the structure was checked with PROCHECK.19 Atomic coordinates and experimental structure factors of YidB have been deposited with the PDB and are accessible under the code 1Z67.
The YidB structure is composed of eight α-helices connected by short β-turns (H1–H8, Fig. 2), where helices H2–H7 form a compact six-helix bundle with a well-defined hydrophobic core. A number of hydrophobic residues that contribute to the core are conserved (Leu35, Trp51, Leu70, Leu93, Leu97). The N- and C-terminal helices, H1 and H8, project out from the protein body and make few contacts with it (particularly H1). Their orientation is maintained by crystal packing contacts; therefore, these helices could assume different orientations and may serve as interaction surfaces. Both helices have well-defined hydrophobic and hydrophilic surfaces, but most residues are not conserved. The exceptions are the N-terminal sequence motif MGL(L/F)D (MG is not visible in our structure) and Gly9 and Gly14 in H1, which are very strongly conserved. The loop connecting helices H1 and H2 is very short (residues 15–16), but it is flanked at both ends by glycine residues (Gly14 and Gly17), which may allow helix H1 to move. The loop connecting helices H7 and H8 is formed by 12 amino-acid residues. Two internal residues of the loop, Ser111 and Ala112, are not visible in the structure. No putative dimer interface was identified by visual inspection of protein contacts inside the crystal, and a PQS20 search predicted a monomeric form for this protein.
A structural homology search using the DALI server21 found only very distant structural homologs. The closest match was the NMR structure of the mouse homeodomain-only protein HOP (1UHS.pdb), with a Z-score of 3.9 and an RMSD of 3.7 Å. The next two matches were other homeodomain proteins in complexes with DNA, the yeast MATa1/MATa2 homeodomain heterodimer (1AKH.pdb)22 and the Drosophila Engrailed homeodomain (2HDD.pdb),23 with Z-scores of 3.7 and 3.5 and RMSDs of 2.3 and 2.5 Å, respectively. The superposition of the structural homologs onto the YidB structure showed that the similarity is limited to only three of the eight helices of our structure, H2, H6, and H7 (Fig. 3). These helices contribute to the hydrophobic core and contain several highly conserved residues. The structural motif of three successive helices is characteristic of eukaryotic homeodomains, which are among the key DNA-binding domains used in gene regulation.24 Helix H7 of YidB corresponds to the DNA-binding helix 3 of homeodomains22 (data not shown). Helix H7 has several hydrophilic and charged residues that are solvent exposed and may interact with nucleic acid. However, the exact superposition with 1AKH results in a collision of the H4–H5 region of our structure with the DNA of the MATa1/MATa2 homeodomain heterodimer complex. Therefore, the YidB protein could interact with nucleic acid only after undergoing a significant conformational change or if its mode of interaction is very different from that of homeodomains. Nevertheless, the presence of such a configuration in bacterial proteins suggests that this motif arose very early in protein evolution. However, taking into account the only partial homology of the YidB structure to homeodomains (25% of sequence), we believe that it represents a unique structure and a new protein fold.
We have searched a number of databases and servers, including PQS,20 BLAST,25 ProFunc,26 DALI,21 and ISREC-TMpred,27 to assign a more detailed protein function. Sequence comparisons with PSI-BLAST found 201 matching sequences, nearly all of them conserved proteins of unknown function. The ISREC-TMpred server27 did not find any transmembrane helices. An enzyme template search with ProFunc26 identified Glu28 and Asp77 in S4005 as part of a horse lysozyme active site template, but these residues are not well conserved across the YidB family, suggesting that YidB is unlikely to be an enzyme. Interestingly, a sodium ion was found at this site in the structure, coordinated by Ser76, Asp77, Gln80, and several nearby carbonyls. Taken together, there are some indications that YidB may be a nucleic acid binding protein, but this hypothesis requires further investigation and experimental verification.
Grant sponsor: The National Institutes of Health; Grant numbers: GM62414, GM074942; Grant sponsor: U.S. Department of Energy, Office of Biological and Environmental Research, under contract W-31-109-Eng-38.
Atomic coordinates have been deposited in the Protein Data Bank (PDB) with PDB-ID 1Z67 and accession number RCSB032348. We wish to thank all members of the Structural Biology Center at Argonne National Laboratory for their help in conducting these experiments.
The submitted manuscript has been created by the University of Chicago as Operator of Argonne National Laboratory (“Argonne”) under Contract No. W-31-109-ENG-38 with the U.S. Department of Energy. The U.S. Government retains for itself, and others acting on its behalf, a paid-up, nonexclusive, irrevocable worldwide license in said article to reproduce, prepare derivative works, distribute copies to the public, and perform publicly and display publicly, by or on behalf of the Government.