Acta Crystallogr Sect F Struct Biol Cryst Commun. 2010 October 1; 66(Pt 10): 1265–1273.
Published online 2010 March 5. doi:  10.1107/S1744309109046788
PMCID: PMC2954215

The structure of BVU2987 from Bacteroides vulgatus reveals a superfamily of bacterial periplasmic proteins with possible inhibitory function


Proteins that contain the DUF2874 domain constitute a new Pfam family PF11396. Members of this family have predominantly been identified in microbes found in the human gut and oral cavity. The crystal structure of one member of this family, BVU2987 from Bacteroides vulgatus, has been determined, revealing a β-lactamase inhibitor protein-like structure with a tandem repeat of domains. Sequence analysis and structural comparisons reveal that BVU2987 and other DUF2874 proteins are related to β-lactamase inhibitor protein, PepSY and SmpA_OmlA proteins and hence are likely to function as inhibitory proteins.

Keywords: BVU2987, DUF2874, PF11396, human gut microbiome, β-lactamase inhibitor protein-like fold, putative inhibitor proteins

1. Introduction

Recent interest in metagenomics (Sleator et al., 2008), together with advances in genomic and proteomic techniques, has led to a rapid evolution in the study of the human gut microbiome (Frank & Pace, 2008; Ley et al., 2008; Verberkmoes et al., 2009) and its association with human health and disease (Mai & Draganov, 2009; Kinross et al., 2008; Turnbaugh et al., 2009; Ordovas & Mooser, 2006; Othman et al., 2008; O’Keefe, 2008). The sequencing of complete genomes of bacteria from the human gut, such as Bacteroides thetaiotaomicron (Xu et al., 2003) and B. vulgatus (Xu et al., 2007), as well as from the oral cavity, such as Porphyromonas gingivalis (Nelson et al., 2003), has identified many novel proteins of unknown function. Large-scale structure determination of these proteins can provide functional insights and may lead to the identification of new drug targets for therapeutic exploitation (Zaneveld et al., 2008).

Towards this goal, the BVU2987 protein from B. vulgatus ATCC 8482, one of the predominant members of the human gut microbiome, was selected for crystallographic structure determination. BVU2987 is a 145-residue protein with a calculated pI of 5.36 and is annotated as a putative periplasmic protein based on the predicted N-terminal signal peptide. The protein sequence has been assigned to a novel protein family that is predominantly found in species that populate the human oral cavity and gut microbiomes, including Bacteroides, Campylobacter and P. gingivalis (the predominant agent of periodontal disease). Proteins in this family are annotated either as putative periplasmic proteins or as conserved hypothetical proteins, but none have been biochemically characterized. Analysis of our structure and of the available sequences shows that collectively this family forms part of a larger superfamily of bacterial periplasmic proteins that all adopt a fold similar to β-lactamase inhibitor protein (BLIP-like fold) and appear to share some form of broad-spectrum inhibitory function.

2. Materials and methods

2.1. Protein production and crystallization

Clones were generated using the Polymerase Incomplete Primer Extension (PIPE) cloning method (Klock et al., 2008). The gene encoding BVU2987 (GenBank YP_001300247.1) was amplified by polymerase chain reaction (PCR) from B. vulgatus ATCC 8482 genomic DNA using PfuTurbo DNA polymerase (Stratagene) and I-PIPE (Insert) primers (forward primer, 5′-ctgtacttccagggcGCGGATGATGACAAACCTATTCAAG-3′; reverse primer, 5′-aattaagtcgcgttaATTGTCAATATCAATCACATTGAACTGC-3′; the target sequence is shown in upper case) that included sequences for the predicted 5′ and 3′ ends. The expression vector pSpeedET, which encodes an amino-terminal tobacco etch virus (TEV) protease-cleavable expression and purification tag (MGSDKIHHHHHHENLYFQ/G), was PCR-amplified with V-PIPE (Vector) primers (forward primer, 5′-taacgcgacttaattaactcgtttaaacggtctccagc-3′; reverse primer, 5′-gccctggaagtacaggttttcgtgatgatgatgatgatg-3′). V-PIPE and I-PIPE PCR products were mixed to anneal the amplified DNA fragments together. Escherichia coli GeneHogs (Invitrogen) competent cells were transformed with the V-PIPE/I-PIPE mixture and dispensed onto selective LB–agar plates. The cloning junctions were confirmed by DNA sequencing. Using the PIPE method, the part of the gene encoding residues Met1–Trp19 (predicted signal sequence) was deleted. Expression was performed in selenomethionine-containing medium at 310 K. Selenomethionine was incorporated via inhibition of methionine biosynthesis (Van Duyne et al., 1993), which does not require a methionine-auxotrophic strain. At the end of fermentation, lysozyme was added to the culture to a final concentration of 250 µg ml⁻¹ and the cells were harvested and frozen. After one freeze–thaw cycle, the cells were homogenized in lysis buffer [50 mM HEPES pH 8.0, 50 mM NaCl, 10 mM imidazole, 1 mM tris(2-carboxyethyl)phosphine–HCl (TCEP)] and the lysate was clarified by centrifugation at 32 500g for 30 min.
The soluble fraction was passed over nickel-chelating resin (GE Healthcare) pre-equilibrated with lysis buffer, the resin was washed with wash buffer [50 mM HEPES pH 8.0, 300 mM NaCl, 40 mM imidazole, 10%(v/v) glycerol, 1 mM TCEP] and the protein was eluted with elution buffer [20 mM HEPES pH 8.0, 300 mM imidazole, 10%(v/v) glycerol, 1 mM TCEP]. The eluate was buffer-exchanged with TEV buffer (20 mM HEPES pH 8.0, 200 mM NaCl, 40 mM imidazole, 1 mM TCEP) using a PD-10 column (GE Healthcare) and incubated with 1 mg TEV protease per 15 mg of eluted protein. The protease-treated eluate was run over nickel-chelating resin (GE Healthcare) pre-equilibrated with HEPES crystallization buffer (20 mM HEPES pH 8.0, 200 mM NaCl, 40 mM imidazole, 1 mM TCEP) and the resin was washed with the same buffer. The flowthrough and wash fractions were combined and concentrated by centrifugal ultrafiltration (Millipore) to 9.7 mg ml⁻¹ for crystallization trials. BVU2987 was crystallized using the nanodroplet vapor-diffusion method (Santarsiero et al., 2002) with standard Joint Center for Structural Genomics (JCSG) crystallization protocols (Lesley et al., 2002). Sitting drops composed of 200 nl protein solution mixed with 200 nl crystallization solution were equilibrated against a 50 µl reservoir at 277 K for 37 d prior to harvesting. The crystallization reagent consisted of 35.0%(v/v) 2-ethoxyethanol and 0.1 M cacodylate pH 6.5. No further cryoprotectant was added to the crystals. Initial screening for diffraction was carried out using the Stanford Automated Mounting system (SAM; Cohen et al., 2002) at the Stanford Synchrotron Radiation Lightsource (SSRL). A rod-shaped crystal of approximate size 20 × 20 × 100 µm was harvested for data collection. The diffraction data were indexed in the orthorhombic space group P2₁2₁2₁.
To determine its oligomeric state in solution, BVU2987 was analyzed using a 1 × 30 cm Superdex 200 size-exclusion column (GE Healthcare) coupled with miniDAWN static light-scattering (SEC/SLS) and Optilab differential refractive-index detectors (Wyatt Technology). The mobile phase consisted of 20 mM Tris pH 8.0, 150 mM NaCl and 0.02%(w/v) sodium azide. The molecular weight was calculated using ASTRA v.5.1.5 software (Wyatt Technology).
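
As an illustrative check of the PIPE primer design described in this section, the sketch below tests whether the lower-case 5′ tail of each I-PIPE primer is the reverse complement of the 5′ end of the opposing V-PIPE primer, which is what would allow the insert and vector PCR products to anneal. The primer sequences are those quoted above; the function names and the pairing of each insert primer with a vector primer are assumptions made for this example and are not taken from the original protocol.

```python
# Sketch: confirm that the vector-derived tails of the I-PIPE primers are
# complementary to the 5' ends of the V-PIPE primers quoted in the text.
# Function names and primer pairing are illustrative assumptions.

COMPLEMENT = str.maketrans("acgtACGT", "tgcaTGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence, preserving case."""
    return seq.translate(COMPLEMENT)[::-1]


def vector_tail(insert_primer: str) -> str:
    """Return the leading lower-case (vector-derived) part of an I-PIPE primer."""
    tail = []
    for base in insert_primer:
        if not base.islower():
            break
        tail.append(base)
    return "".join(tail)


def tails_anneal(insert_primer: str, vector_primer: str) -> bool:
    """True if the insert-primer tail is the reverse complement of the
    5' end of the given vector primer."""
    tail = vector_tail(insert_primer)
    return reverse_complement(tail) == vector_primer[: len(tail)]


# Primer sequences as quoted in the text (5' to 3').
I_PIPE_FWD = "ctgtacttccagggcGCGGATGATGACAAACCTATTCAAG"
I_PIPE_REV = "aattaagtcgcgttaATTGTCAATATCAATCACATTGAACTGC"
V_PIPE_FWD = "taacgcgacttaattaactcgtttaaacggtctccagc"
V_PIPE_REV = "gccctggaagtacaggttttcgtgatgatgatgatgatg"

if __name__ == "__main__":
    print("fwd insert tail vs reverse vector primer:",
          tails_anneal(I_PIPE_FWD, V_PIPE_REV))
    print("rev insert tail vs forward vector primer:",
          tails_anneal(I_PIPE_REV, V_PIPE_FWD))
```

Both tails are 15 nucleotides long, and each comparison covers only that tail; the upper-case, gene-specific portion of the insert primers is not examined here.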

2.2. Data collection, structure solution and refinement

Multi-wavelength anomalous diffraction (MAD) data were collected to 1.85 Å resolution on beamline 11-1 at SSRL at wavelengths corresponding to the high-energy remote (λ1), inflection point (λ2) and peak (λ3) of a selenium MAD experiment using the Blu-Ice data-collection environment (McPhillips et al., 2002). A beam size of 0.15 × 0.15 mm was used during data collection. The λ1 and λ2 data sets were collected simultaneously interleaved in 30° wedges and were followed by λ3 (González, 2003a,b). The data set was collected at 100 K using a MarMosaic 325 CCD detector (Rayonix). The MAD data were integrated and reduced using MOSFLM (Leslie, 1992) and scaled with the program SCALA (Collaborative Computational Project, Number 4, 1994).

The heavy-atom sites were located with SHELXD (Sheldrick, 2008) and phasing was performed with autoSHARP (Vonrhein & Blanc, 2007). The heavy-atom substructure contained four anomalous scatterers per asymmetric unit, with an overall figure of merit (acentric/centric) of 0.39/0.33 and an anomalous phasing power for the three wavelengths of ~0.5–0.8. ARP/wARP (Langer et al., 2008) was used for automatic model building. Model completion and crystallographic refinement were performed with the λ1 data set using Coot (Emsley & Cowtan, 2004) and REFMAC5 (Collaborative Computational Project, Number 4, 1994), respectively, with one TLS group per molecule (Winn et al., 2003). Crystallographic data and refinement statistics are summarized in Table 1.

Table 1
Crystallographic data and refinement statistics for BVU2987 (PDB code 3due)

2.3. Validation and deposition

The quality of the crystal structure was analyzed using the JCSG Quality Control server. This server automatically processes the coordinates and data through a variety of validation tools including AutoDepInputTool (Yang et al., 2004), MolProbity (Lovell et al., 2003), WHAT IF v.5.0 (Vriend, 1990), RESOLVE (Terwilliger, 2003) and MOLEMAN2 (Kleywegt, 2000), as well as several in-house scripts, and summarizes the results. Protein quaternary-structure analysis was performed using the PISA server (Krissinel & Henrick, 2005). Fig. 1(b) was adapted from an analysis using PDBsum (Laskowski et al., 2005) and all other figures were prepared with PyMOL (DeLano, 2008). Atomic coordinates and experimental structure factors for BVU2987 were deposited in the PDB under the accession code 3due. Fig. 1(c) was prepared using the PDB2PQR server (Dolinsky et al., 2007) and the APBS module (Dolinsky et al., 2007; Baker et al., 2001) in PyMOL.

Figure 1
Crystal structure of BVU2987 from B. vulgatus. (a) Stereo ribbon diagram of the BVU2987 monomer with the N-terminal domain in cyan and the C-terminal tandem-repeat domain in pink. Helices H1–H4 (helices H1 and H3 are 3₁₀-helices and helices H2 ...

3. Results and discussion

3.1. Overall structure

The structure of BVU2987 was determined by MAD phasing to 1.85 Å resolution. The crystallized protein contained residues 20–145 of the full-length protein and an N-terminal glycine (that remained after the cleavage of the expression and purification tag). A predicted signal sequence (residues 1–19) was identified at the N-terminus of the full-length sequence and was omitted from the construct used for protein production. The final model contained one monomer, one cacodylate anion (from the crystallization condition) and 133 water molecules in the asymmetric unit. The Matthews coefficient (Matthews, 1968) is ~2.3 Å³ Da⁻¹, with an estimated solvent content of ~47%. The Ramachandran plot produced by MolProbity (Davis et al., 2004) showed that 96.8% and 100% of amino acids were in the favored and allowed regions, respectively. Crystal-packing analysis using PISA (Krissinel & Henrick, 2005), in addition to analytical size-exclusion chromatography coupled with static light scattering, indicated that the monomer was the favored oligomeric form in solution.
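
The relationship between the Matthews coefficient and the solvent content quoted above can be reproduced with a short calculation. The sketch below uses the conventional approximation that the solvent fraction equals 1 − 1.23/V_M, where 1.23 Å³ Da⁻¹ corresponds to a typical protein partial specific volume. That the authors used exactly this constant is an assumption; the function name is hypothetical.

```python
# Sketch: estimate solvent content from a Matthews coefficient.
# The 1.23 A^3/Da constant is the conventional value for a typical protein;
# its use for this particular structure is an assumption.

def solvent_fraction(v_m: float, protein_volume_per_dalton: float = 1.23) -> float:
    """Return the estimated solvent fraction (0-1) for a Matthews
    coefficient v_m given in cubic angstroms per dalton."""
    if v_m <= 0:
        raise ValueError("Matthews coefficient must be positive")
    return 1.0 - protein_volume_per_dalton / v_m


if __name__ == "__main__":
    v_m = 2.3  # value quoted in the text, in A^3/Da
    print(f"Estimated solvent content: {solvent_fraction(v_m) * 100:.1f}%")
```

With V_M = 2.3 Å³ Da⁻¹ this gives a solvent fraction of about 0.465, consistent with the ~47% reported in the text.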

BVU2987 forms a crescent-shaped molecule comprised of an eight-stranded antiparallel β-sheet with four helices (two α-helices and two 3₁₀-helices; Fig. 1a). The β-sheet forms the inner concave side of the crescent and the helices form the outer edge of this ~40 Å long and ~30 Å wide molecule. The monomer is formed by a tandem repeat of a structural motif, possibly arising from a gene-duplication event, comprised of four antiparallel β-strands and a short helix–loop–long helix. Thus, residues 28–85 and 92–145 can be superimposed with an r.m.s.d. of 1.7 Å and a sequence identity of 22% over 54 aligned Cα atoms. The β-strands β1 and β2 in the first structural motif are slightly longer than the corresponding structural elements β5 and β6 in the tandem repeat, whereas β3 is slightly shorter than β7 (Fig. 1b). Inspection of the electrostatic surface potential (Fig. 1c) reveals that the concave surface has a prominent overall negative charge, mainly owing to the presence of numerous aspartic and glutamic acid residues.

Multiple orthologs of this protein family were targeted in parallel for structure determination; the crystal structures of two other proteins from this family were also determined and will be briefly described here. The structure of BT0923 (UniProt Q8A994; PDB code 3db7) from B. thetaiotaomicron VPI-5482 was determined at 1.40 Å resolution and that of BVU2443 (UniProt A6L337; PDB code 3elg) from B. vulgatus ATCC 8482 was determined at 1.64 Å resolution; these proteins have 73 and 42% sequence identity to BVU2987, respectively. These proteins are both very similar and superimpose on BVU2987 with r.m.s.d.s of 1.1 Å (over 122 aligned Cα residues) and 1.7 Å (over 119 aligned Cα residues), respectively.

3.2. Sequence and structural comparisons

Detailed sequence and structural analyses of the crystal structure of BVU2987 uncovered new relationships that unify the DUF2874 proteins into a superfamily of bacterial periplasmic proteins that includes PepSY, BLIP, SmpA_OmlA and DUF3192 proteins. Remote sequence similarities were first identified between DUF2874 and PepSY-domain proteins and were followed by sequence relationships between SmpA_OmlA, BLIP and DUF3192 that led to the identification of structural similarities between DUF2874, BLIP, SmpA_OmlA and PepSY proteins.

3.2.1. Sequence relationship between DUF2874 and PepSY-domain proteins

After structure determination, sequence searches against protein-domain databases, such as Pfam (Finn et al., 2008) and the Conserved Domain Database (CDD; Marchler-Bauer et al., 2007), with BVU2987 did not find any significant hits. However, a BLAST (Altschul et al., 1997) search revealed several related proteins (E value < 0.001). Regions that shared significant sequence similarity to either tandem repeat, as defined by the structure, were aligned using MAFFT (Katoh et al., 2005) and the resulting multiple sequence alignment (representing a single domain) was used to construct a profile hidden Markov model (HMM) using the HMMER package (v.3.0, alpha release v.1.0). After multiple rounds of searching the UniProt sequence database (v.12.5) using the HMM, coupled with careful manual inspection of the resulting matches, we identified 271 sequences (E value < 0.01) which form a new protein family that has now been added to Pfam and appears in the new release (Pfam 24.0, October 2009) as DUF2874 (Pfam accession PF11396). These 271 DUF2874 domains are distributed in 153 distinct proteins from 40 species. In general, two copies of this domain are usually found in each protein, although single copies, and even up to four copies, also occur in some members of the family.

Interestingly, the most significant marginal matches (E-value range 0.01–0.1, i.e. less significant than the set threshold of 0.01) matched the HMM of the Pfam domain PepSY (Pfam accession PF03413). Inspection of these marginal hits suggested that PepSY-domain proteins were likely to be distant homologs of DUF2874. Profile–profile comparisons of all of the latest Pfam HMMs against each other (Madera, 2008) indicated significant similarity between the DUF2874 and PepSY families (E value of 5.7 × 10⁻³). The sequence relationship is demonstrated in the family pairwise sequence alignment in Fig. 2(a). In addition, the presence of a signal peptide motif (predicted using PHOBIUS; Kall et al., 2004) at the N-terminus and the repetitive nature of the domain in some sequences are highly reminiscent of the domain architecture in the PepSY family (Yeats et al., 2004). Unlike some members of the PepSY family where the PepSY domain co-occurs with other domains in the same protein (such as Peptidase_M4 and Peptidase_M36), no additional domains were found to co-occur in proteins containing DUF2874 domains. Further analysis was carried out to determine whether a single HMM could represent both DUF2874 and PepSY. However, a single model could not be built that was sufficiently sensitive to detect all of the domains that could be found using the two individual HMMs. This analysis demonstrates that PepSY and DUF2874 domains represent either a single divergent family or two related families of proteins that have arisen from a common evolutionary ancestor. Interestingly, the profile–profile comparisons also indicated that DsbC_N (Pfam accession PF10411), an N-terminal domain found in disulfide-bond isomerase (DsbC) proteins, may be related to DUF2874 (E value of 0.072). 
DsbC proteins not only function as disulfide-bond isomerases during oxidative protein folding in the bacterial periplasm, but have also been implicated as chaperones (Hiniker et al., 2005). The structural representative of the DsbC_N family (PDB code 1t3b; Zhang et al., 2004; aligns with BVU2987 with an r.m.s.d. of 2.5 Å over 45 Cα atoms) is also found in the same SCOP fold as YpmB, a PepSY-family protein (PDB code 2gu3; J. Osipiuk, N. Maltseva, I. Dementieva, S. Moy & A. Joachimiak, unpublished work).
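
The E-value thresholds used in this section can be summarized in a small sketch. It classifies a hit as significant (E value < 0.01, the family-membership cutoff quoted above), marginal (0.01–0.1) or rejected (above 0.1). Applying these same cutoffs to the profile–profile E values reported in the text is an assumption made purely for illustration, and the function and label names are hypothetical.

```python
# Sketch: classify hits by E value using the cutoffs quoted in the text.
# Reusing the HMM-search cutoffs for profile-profile E values is an
# illustrative assumption, not the authors' stated procedure.

SIGNIFICANT_E = 0.01  # family-membership threshold quoted in the text
MARGINAL_E = 0.1      # upper end of the "marginal" range quoted in the text


def classify_hit(e_value: float) -> str:
    """Return 'significant', 'marginal' or 'rejected' for an E value."""
    if e_value < 0:
        raise ValueError("E values cannot be negative")
    if e_value < SIGNIFICANT_E:
        return "significant"
    if e_value <= MARGINAL_E:
        return "marginal"
    return "rejected"


# Profile-profile E values reported in the text.
reported = {
    "DUF2874 vs PepSY": 5.7e-3,
    "DUF2874 vs DsbC_N": 0.072,
}

if __name__ == "__main__":
    for pair, e_value in reported.items():
        print(f"{pair}: E = {e_value:g} -> {classify_hit(e_value)}")
```

Under these cutoffs the DUF2874/PepSY comparison falls in the significant class and the DUF2874/DsbC_N comparison in the marginal class, matching the wording used for each in the text. As a separate piece of arithmetic, 271 domains in 153 proteins corresponds to about 1.8 domains per protein on average, in line with two copies being typical.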

Figure 2
Alignments of representative multiple sequence alignments of the DUF2874 (Pfam accession PF11396), PepSY (PF03413), BLIP (PF07467), SmpA_OmlA (PF04355) and DUF3192 (PF11399) families. The alignments are colored according to the sequence conservation using ...

3.2.2. Sequence relationship between SmpA_OmlA, BLIP and DUF3192 proteins

The recently determined first structural representative (PDB code 2pxg; Vanini et al., 2008) of the SmpA_OmlA family of lipoproteins (Pfam accession PF04355) revealed structural similarity to BLIP (Pfam accession PF07467). As in BVU2987, each BLIP sequence contains a tandem repeat of a structural domain (four antiparallel β-strands and a short helix–loop–long helix), with the structure of OmlA being superimposable on both the N-terminal and C-terminal copies of this domain (Vanini et al., 2008). Given that the structurally equivalent positions between OmlA and BLIP corresponded to conserved residues among the BLIP sequences themselves, we took the Pfam BLIP HMM model from release 23.0 (which represented BLIP as a continuous sequence rather than a domain representing the tandem duplication) and modified it to represent the repeated domain. A single search using this modified BLIP HMM detected sequences from the SmpA_OmlA family, which highlighted the presence of a common evolutionary ancestor. This updated version of the BLIP family also appears in the new release of the Pfam database (Pfam 24.0, October 2009).

Profile–profile comparisons were again used to identify additional related families. These comparisons demonstrated that BLIP and SmpA_OmlA are significantly similar (E value of 2.8 × 10⁻⁸) and that both of these domains are also related to DUF3192 (Pfam accession PF11399), with E values of 5.4 × 10⁻⁵ for BLIP and 8.6 × 10⁻⁵ for SmpA_OmlA. Representative sequence alignments of each family over a similar region of the proteins (Figs. 2a and 2b) demonstrate the sequence conservation between PepSY and DUF2874 and between SmpA_OmlA, BLIP and DUF3192.

3.2.3. Structural relationship between DUF2874, BLIP, SmpA_OmlA and PepSY proteins

A systematic search for other proteins of similar structure to BVU2987 was conducted using several different methods including the DALI server (Holm et al., 2008), the protein structure-comparison service SSM at the European Bioinformatics Institute (Krissinel & Henrick, 2005) and the flexible structure-alignment method implemented in FATCAT (Ye & Godzik, 2004). The most prominent hit was to BLIP (SCOP superfamily 55648 and SCOP fold 55647) from Streptomyces clavuligerus (UniProt BLIP_STRCL), for which structures are available in complex with Klebsiella pneumoniae SHV-1 β-lactamase (PDB code 2g2u and related entries; Reynolds et al., 2006), E. coli β-lactamase TEM-1 (PDB code 1jtg and the related entries 1s0w and 1xxm; Strynadka et al., 1996) and a putative BLIP from Streptococcus mutans (PDB code 3d4e; Joint Center for Structural Genomics, unpublished work). In the current Pfam PF07467/BLIP family, only three protein sequences are present from two species: BLIP_STRCL and P97062_STRCL from Streptomyces clavuligerus (with ~31% sequence identity to each other) and Q9KJ90_STREX from Streptomyces exfoliatus (with ~37% sequence identity to BLIP_STRCL). BLIP inhibits a wide variety of β-lactamases (such as TEM-1, which is the most widespread resistance enzyme to penicillin antibiotics). BLIP_STRCL is larger than BVU2987 by about 50 residues, although it also has an N-terminal signal sequence and is a secreted protein. BVU2987 matches the different BLIP structures with DALI Z scores of 5.5–6.5 and with r.m.s.d.s of 2.7–3.4 Å over ~75% of the residues (Fig. 3a). The antiparallel β-sheet is conserved, although differences are found in the size of the connecting loops and in the positioning of the N-terminal helices. 
The loop between the two tandem structural repeats is ~10 residues long in BLIP and may contribute to its binding flexibility and its ability to inhibit a variety of class A β-lactamases (Strynadka et al., 1996). This loop is of similar length in BVU2987 and may confer similar flexibility.

Figure 3
Structural comparisons of BVU2987 with related proteins. (a) Comparison of BVU2987 (blue) with BLIP (gray). The sequence conservation between the two proteins is <10% and among the functionally important residues (orange sticks) in BLIP only Lys74 ...

Some of the important residues that have been implicated in the interactions of BLIP with SHV-1 β-lactamase (Reynolds et al., 2006) are Glu31, Asp49, Lys74, Tyr115, Phe142, Tyr143, Trp150, Arg160 and Trp162 (numbering after removal of the N-terminal signal sequence). From the structural superposition (Fig. 3a), only Lys74 in BLIP is conserved in BVU2987 as Lys86. Tyr115, Phe142 and Tyr143 in BLIP are located in long loops. Although these aromatic residues are not conserved in BVU2987, other aromatic residues (Trp57, Phe58, Tyr122, Trp130 and Phe138) are present in the corresponding shorter loops in BVU2987, which may be functionally important.

The concave surface of BVU2987 is negatively charged (Fig. 1c) owing to the presence of numerous aspartate and glutamate residues. In contrast, the concave surface of BLIP has numerous uncharged polar residues (Ser35, Ser39, Tyr50, Tyr51, Tyr53, Thr55, Ser69, Ser71, Thr110, Ser113, Ser128, Ser130 and Ser146). Of these, Ser39 and Ser69 are conserved in BVU2987 as Ser61 and Thr81, respectively. This surface of BLIP also includes Phe36, His41, Trp112, His148, Trp150 and Trp162. It is interesting that the aromatic residues Tyr53, Trp112 and Trp150 in BLIP are structurally equivalent to the basic residues Lys71, Lys114 and Lys133 in BVU2987, respectively. It is possible that the long aliphatic tail of the lysine residues may mimic certain aspects of the hydrophobic tyrosine and tryptophan residues. Loop L23 between strands β2 and β3 (residues 46–51) in the first domain of BLIP is functionally important as Asp49 interacts with four conserved active-site residues in TEM-1 β-lactamase (Strynadka et al., 1996), mimicking the interaction with the carboxylate group of its substrate penicillin G. The corresponding loop in BVU2987 is significantly shorter and is comprised of only two residues, 67–68. Interestingly, BLIP is similar to the TATA-box-binding protein in that it uses a tandem repeat of a structural motif of antiparallel β-strands to create a concave saddle-shaped surface that can bind to a convex interacting partner (β-lactamase and DNA, respectively; Strynadka et al., 1996). For BVU2987, the negatively charged concave surface is most likely to reflect binding to a positively charged partner.

It has recently been shown that members of the OmlA (outer membrane lipoprotein A) family are involved in the assembly of outer membrane proteins and in maintaining the structure of the cell envelope (Sklar et al., 2007), although the actual mechanism is unknown. The structures of the BLIP-like domains of BVU2987 (residues 28–85) and the OmlA protein (PDB code 2pxg) superimpose with a Z score of 0.9 and an r.m.s.d. of 2.6 Å over 35 Cα atoms with 9% sequence identity (Holm & Park, 2000; Fig. 3b). Although the Z score is below the standard significance cutoff of 2.0, OmlA nevertheless has a BLIP-like fold (Vanini et al., 2008). Of the conserved N-terminal QGN motif and the four aromatic residues in the protein core that are seen in all OmlA proteins, only a single residue, Phe74 (equivalent to Phe76 in OmlA), is found in BVU2987.

The BLIP-like domains of BVU2987 and YpmB (a member of the PepSY family; PDB code 2gu3) superimpose with a Z score of 5.2 and an r.m.s.d. of 2.9 Å over 58 Cα atoms with 9% sequence identity, but the relative orientation of the tandem structural repeats in the two proteins is different (Fig. 3c). Interestingly, Lys86 of BVU2987, which is the counterpart of the functionally important Lys74 in BLIP (mutation of this residue causes disruption of the BLIP–β-lactamase interface), is present as Lys97 in YpmB. Although the PepSY and DUF2874 domains appear to be more closely related by sequence, structural analysis indicates a greater similarity of DUF2874 to BLIP. This may account for the discrepancy in the SCOP classification where YpmB has been classified under a different SCOP fold, 54402.

3.3. Potential function based on similarity to related families

We have identified five bacterial periplasmic protein domain families (DUF2874, PepSY, BLIP, SmpA_OmlA and DUF3192) that are related by sequence and/or structural similarity (Fig. 4). BLIP binds to numerous class A β-lactamases and prevents them from hydrolyzing β-lactam antibiotics. Gene-knockout studies of BLIP in Streptomyces exfoliatus SMF19 have indicated that BLIP may have a broader role, particularly in regulating cell morphology (Kang et al., 2000), which is thought to be mediated by its binding to penicillin-binding proteins involved in cell-wall synthesis. Apart from BLIP, the precise functions of these other families remain to be elucidated. Nevertheless, a number of recurring themes appear to be emerging.

Figure 4
Schematic of the structural and sequence relationships between families belonging to the BLIP-like superfamily. PepSY, DUF2874, BLIP and SmpA_OmlA structures were rendered using OpenAstexViewer (Hartshorn, 2002). The structure is colored from ...

The PepSY domain, when found in combination with other Pfam domains, is typically associated with M4 or M36 peptidases. These peptidases all function in the periplasmic space and it has been postulated that the PepSY domain functions as an inhibitor of the peptidase. The same PepSY domain is also found in YpmB, which is co-expressed with SleB (Boland et al., 2000). In this case, SleB, a lytic enzyme, is inhibited by YpmB; given the lack of any sequence similarity between the peptidase and this lytic enzyme, it has been suggested that PepSY may also function as a broad-spectrum inhibitor (Yeats et al., 2004).

In most protein sequences that contain PepSY or DUF2874 domains, no other associated domains are present. The precise function of OmlA remains unclear, but it is thought to be involved in maintaining the integrity of the cell envelope (Ochsner et al., 1999). A knockout study in Xanthomonas campestris pv. phaseoli indicated that even though OmlA is divergently transcribed from the gene encoding the ferric uptake regulator Fur, the absence of Fur does not alter OmlA expression. In the same study, an OmlA mutant showed increased susceptibility to novobiocin and coumermycin, which are antibiotics with gyrase inhibitory activity. How OmlA protects the cell against these antibiotics or maintains the cell envelope is not known, but given the similarity to BLIP it is interesting to speculate that a similar inhibitory/regulatory binding mechanism may be employed in these two cases.

The BVU2987 structure is the first structural representative of a novel protein family, which has now been added to the Pfam database as DUF2874. The sequence and structural analyses presented show that this family is a member of a superfamily containing four other related bacterial periplasmic protein families: PepSY, BLIP, SmpA_OmlA and DUF3192. The protein structures from these families all adopt a BLIP-like fold. Although the precise functions of PepSY, DUF2874, SmpA_OmlA and DUF3192 remain to be elucidated, it seems that they function as inhibitors by binding a partner domain located either on the same protein or on a separate protein. The structure of BVU2987 reveals an internal duplication of a domain that occurs between one and four times in different sequences. BLIPs are important for the design of peptide-based β-lactamase inhibitors and for studying protein–protein interactions. Thus, the similarity between these families opens up possibilities for biochemical studies and for potential therapeutic applications. Members of the DUF2874 family define a new type of BLIP-like protein produced by the human gut microbiome. The structures of DUF2874 presented here can be used to investigate whether these proteins do indeed inhibit β-lactamases of the human gut (Chanal et al., 1996). If so, these different BLIP-like proteins could be utilized in the design of novel peptide-like β-lactamase inhibitors.

Additional information about BVU2987 is available from TOPSAN (Krishna, 2010).

Supplementary Material

PDB reference: BVU2987, 3due


This work was supported by National Institutes of Health Protein Structure Initiative grant No. U54 GM074898 from the National Institute of General Medical Sciences. Portions of this research were performed at the Stanford Synchrotron Radiation Lightsource (SSRL) at the SLAC National Accelerator Laboratory, Menlo Park, California, USA. The SSRL is a national user facility operated by Stanford University on behalf of the United States Department of Energy, Office of Basic Energy Sciences. The SSRL Structural Molecular Biology Program is supported by the Department of Energy, Office of Biological and Environmental Research and by the National Institutes of Health (National Center for Research Resources, Biomedical Technology Program and the National Institute of General Medical Sciences). RDF was supported by Wellcome Trust grant No. WT077044/Z/05/Z. The content is solely the responsibility of the authors and does not necessarily represent the official views of the National Institute of General Medical Sciences or the National Institutes of Health. Genomic DNA from B. vulgatus ATCC 8482 (ATCC No. ATCC8482D-5) was obtained from the American Type Culture Collection (ATCC).


  • Altschul, S. F., Madden, T. L., Schäffer, A. A., Zhang, J., Zhang, Z., Miller, W. & Lipman, D. J. (1997). Nucleic Acids Res. 25, 3389–3402.
  • Baker, N. A., Sept, D., Joseph, S., Holst, M. J. & McCammon, J. A. (2001). Proc. Natl Acad. Sci. USA, 98, 10037–10041.
  • Boland, F. M., Atrih, A., Chirakkal, H., Foster, S. J. & Moir, A. (2000). Microbiology, 146, 57–64.
  • Chanal, C., Sirot, D., Romaszko, J. P., Bret, L. & Sirot, J. (1996). J. Antimicrob. Chemother. 38, 127–132.
  • Cohen, A. E., Ellis, P. J., Miller, M. D., Deacon, A. M. & Phizackerley, R. P. (2002). J. Appl. Cryst. 35, 720–726.
  • Collaborative Computational Project, Number 4 (1994). Acta Cryst. D50, 760–763.
  • Cruickshank, D. W. J. (1999). Acta Cryst. D55, 583–601.
  • Davis, I. W., Murray, L. W., Richardson, J. S. & Richardson, D. C. (2004). Nucleic Acids Res. 32, W615–W619.
  • DeLano, W. L. (2008). PyMOL Molecular Viewer. DeLano Scientific LLC, Palo Alto, California, USA.
  • Diederichs, K. & Karplus, P. A. (1997). Nature Struct. Biol. 4, 269–275.
  • Dolinsky, T. J., Czodrowski, P., Li, H., Nielsen, J. E., Jensen, J. H., Klebe, G. & Baker, N. A. (2007). Nucleic Acids Res. 35, W522–W525.
  • Emsley, P. & Cowtan, K. (2004). Acta Cryst. D60, 2126–2132.
  • Finn, R. D., Tate, J., Mistry, J., Coggill, P. C., Sammut, S. J., Hotz, H. R., Ceric, G., Forslund, K., Eddy, S. R., Sonnhammer, E. L. & Bateman, A. (2008). Nucleic Acids Res. 36, D281–D288.
  • Frank, D. N. & Pace, N. R. (2008). Curr. Opin. Gastroenterol. 24, 4–10.
  • González, A. (2003a). Acta Cryst. D59, 315–322.
  • González, A. (2003b). Acta Cryst. D59, 1935–1942.
  • Goodstadt, L. & Ponting, C. P. (2001). Bioinformatics, 17, 845–846.
  • Hartshorn, M. J. (2002). J. Comput. Aided Mol. Des. 16, 871–881.
  • Hiniker, A., Collet, J. F. & Bardwell, J. C. (2005). J. Biol. Chem. 280, 33785–33791.
  • Holm, L., Kaariainen, S., Rosenstrom, P. & Schenkel, A. (2008). Bioinformatics, 24, 2780–2781.
  • Holm, L. & Park, J. (2000). Bioinformatics, 16, 566–567.
  • Kall, L., Krogh, A. & Sonnhammer, E. L. (2004). J. Mol. Biol. 338, 1027–1036.
  • Kang, S. G., Park, H. U., Lee, H. S., Kim, H. T. & Lee, K. J. (2000). J. Biol. Chem. 275, 16851–16856.
  • Katoh, K., Kuma, K., Toh, H. & Miyata, T. (2005). Nucleic Acids Res. 33, 511–518.
  • Kinross, J. M., von Roon, A. C., Holmes, E., Darzi, A. & Nicholson, J. K. (2008). Curr. Gastroenterol. Rep. 10, 396–403.
  • Kleywegt, G. J. (2000). Acta Cryst. D56, 249–265.
  • Klock, H. E., Koesema, E. J., Knuth, M. W. & Lesley, S. A. (2008). Proteins, 71, 982–994.
  • Krishna, S. S., Weekes, D., Bakolitsa, C., Elsliger, M.-A., Wilson, I. A., Godzik, A. & Wooley, J. (2010). Acta Cryst. F66, 1143–1147.
  • Krissinel, E. & Henrick, K. (2005). Computational Life Sciences, edited by M. R. Berthold, R. Glen, K. Diederichs, O. Kohlbacher & I. Fischer, pp. 163–174. Berlin: Springer-Verlag.
  • Langer, G., Cohen, S. X., Lamzin, V. S. & Perrakis, A. (2008). Nature Protoc. 3, 1171–1179.
  • Laskowski, R. A., Chistyakov, V. V. & Thornton, J. M. (2005). Nucleic Acids Res. 33, D266–D268.
  • Lesley, S. A. et al. (2002). Proc. Natl Acad. Sci. USA, 99, 11664–11669.
  • Leslie, A. G. W. (1992). Jnt CCP4/ESF–EACBM Newsl. Protein Crystallogr. 26.
  • Ley, R. E., Lozupone, C. A., Hamady, M., Knight, R. & Gordon, J. I. (2008). Nature Rev. Microbiol. 6, 776–788.
  • Lovell, S. C., Davis, I. W., Arendall, W. B. III, de Bakker, P. I., Word, J. M., Prisant, M. G., Richardson, J. S. & Richardson, D. C. (2003). Proteins, 50, 437–450.
  • Madera, M. (2008). Bioinformatics, 24, 2630–2631.
  • Mai, V. & Draganov, P. V. (2009). World J. Gastroenterol. 15, 81–85.
  • Marchler-Bauer, A. et al. (2007). Nucleic Acids Res. 35, D237–D240.
  • Matthews, B. W. (1968). J. Mol. Biol. 33, 491–497.
  • McPhillips, T. M., McPhillips, S. E., Chiu, H.-J., Cohen, A. E., Deacon, A. M., Ellis, P. J., Garman, E., Gonzalez, A., Sauter, N. K., Phizackerley, R. P., Soltis, S. M. & Kuhn, P. (2002). J. Synchrotron Rad. 9, 401–406.
  • Nelson, K. E. et al. (2003). J. Bacteriol. 185, 5591–5601.
  • O’Keefe, S. J. (2008). Curr. Opin. Gastroenterol. 24, 51–58.
  • Ochsner, U. A., Vasil, A. I., Johnson, Z. & Vasil, M. L. (1999). J. Bacteriol. 181, 1099–1109.
  • Ordovas, J. M. & Mooser, V. (2006). Curr. Opin. Lipidol. 17, 157–161.
  • Othman, M., Aguero, R. & Lin, H. C. (2008). Curr. Opin. Gastroenterol. 24, 11–16.
  • Reynolds, K. A., Thomson, J. M., Corbett, K. D., Bethel, C. R., Berger, J. M., Kirsch, J. F., Bonomo, R. A. & Handel, T. M. (2006). J. Biol. Chem. 281, 26745–26753.
  • Santarsiero, B. D., Yegian, D. T., Lee, C. C., Spraggon, G., Gu, J., Scheibe, D., Uber, D. C., Cornell, E. W., Nordmeyer, R. A., Kolbe, W. F., Jin, J., Jones, A. L., Jaklevic, J. M., Schultz, P. G. & Stevens, R. C. (2002). J. Appl. Cryst. 35, 278–281.
  • Sheldrick, G. M. (2008). Acta Cryst. A64, 112–122.
  • Sklar, J. G., Wu, T., Gronenberg, L. S., Malinverni, J. C., Kahne, D. & Silhavy, T. J. (2007). Proc. Natl Acad. Sci. USA, 104, 6400–6405.
  • Sleator, R. D., Shortall, C. & Hill, C. (2008). Lett. Appl. Microbiol. 47, 361–366.
  • Strynadka, N. C., Jensen, S. E., Alzari, P. M. & James, M. N. (1996). Nature Struct. Biol. 3, 290–297.
  • Terwilliger, T. C. (2003). Acta Cryst. D59, 38–44.
  • Turnbaugh, P. J., Hamady, M., Yatsunenko, T., Cantarel, B. L., Duncan, A., Ley, R. E., Sogin, M. L., Jones, W. J., Roe, B. A., Affourtit, J. P., Egholm, M., Henrissat, B., Heath, A. C., Knight, R. & Gordon, J. I. (2009). Nature (London), 457, 480–484.
  • Van Duyne, G. D., Standaert, R. F., Karplus, P. A., Schreiber, S. L. & Clardy, J. (1993). J. Mol. Biol. 229, 105–124.
  • Vanini, M. M., Spisni, A., Sforca, M. L., Pertinhez, T. A. & Benedetti, C. E. (2008). Proteins, 71, 2051–2064.
  • Verberkmoes, N. C., Russell, A. L., Shah, M., Godzik, A., Rosenquist, M., Halfvarson, J., Lefsrud, M. G., Apajalahti, J., Tysk, C., Hettich, R. L. & Jansson, J. K. (2009). ISME J. 3, 179–189.
  • Vonrhein, C. & Blanc, E. (2007). Methods Mol. Biol. 364, 215–230.
  • Vriend, G. (1990). J. Mol. Graph. 8, 52–56.
  • Winn, M. D., Murshudov, G. N. & Papiz, M. Z. (2003). Methods Enzymol. 374, 300–321.
  • Xu, J., Bjursell, M. K., Himrod, J., Deng, S., Carmichael, L. K., Chiang, H. C., Hooper, L. V. & Gordon, J. I. (2003). Science, 299, 2074–2076.
  • Xu, J. et al. (2007). PLoS Biol. 5, e156.
  • Yang, H., Guranovic, V., Dutta, S., Feng, Z., Berman, H. M. & Westbrook, J. D. (2004). Acta Cryst. D60, 1833–1839.
  • Ye, Y. & Godzik, A. (2004). Nucleic Acids Res. 32, W582–W585.
  • Yeats, C., Rawlings, N. D. & Bateman, A. (2004). Trends Biochem. Sci. 29, 169–172.
  • Zaneveld, J., Turnbaugh, P. J., Lozupone, C., Ley, R. E., Hamady, M., Gordon, J. I. & Knight, R. (2008). Curr. Opin. Chem. Biol. 12, 109–114.
  • Zhang, M., Monzingo, A. F., Segatori, L., Georgiou, G. & Robertus, J. D. (2004). Acta Cryst. D60, 1512–1518.

Articles from Acta Crystallographica Section F: Structural Biology and Crystallization Communications are provided here courtesy of International Union of Crystallography