cancer has a dismal 5 year survival rate of 5.5% that
has not been improved over the past 25 years despite an enormous amount
of effort. Thus, there is an urgent need to identify truly novel yet
druggable protein targets for drug discovery. The human protein DnaJ
homologue subfamily A member 1 (DNAJA1) was previously shown to be
downregulated 5-fold in pancreatic cancer cells and has been targeted
as a biomarker for pancreatic cancer, but little is known about the
specific biological function of DNAJA1 or the other members of the
DnaJ family encoded in the human genome. Our results suggest that overexpression
of DNAJA1 suppresses the stress response capabilities of the oncogenic
transcription factor, c-Jun, and results in the diminution of cell
survival. DNAJA1 likely activates a DnaK protein by forming a complex
that suppresses the JNK pathway, the hyperphosphorylation of c-Jun,
and the anti-apoptosis state found in pancreatic cancer cells. A high-quality
nuclear magnetic resonance solution structure of the J-domain of DNAJA1
combined with a bioinformatics analysis and a ligand affinity screen
identifies a potential DnaK binding site, which is also predicted
to overlap with an inhibitory binding site, suggesting DNAJA1 activity
is highly regulated.
Cyanobacterial phycobiliproteins have evolved to capture light energy over most of the visible spectrum due to their bilin chromophores, which are linear tetrapyrroles that have been covalently attached by enzymes called bilin lyases. We report here the crystal structure of a bilin lyase of the CpcS family from Thermosynechococcus elongatus (TeCpcS-III). TeCpcS-III is a 10-stranded beta barrel with two alpha helices and belongs to the lipocalin structural family. TeCpcS-III catalyzes both cognate as well as non-cognate bilin attachment to a variety of phycobiliprotein subunits. TeCpcS-III ligates phycocyanobilin, phycoerythrobilin and phytochromobilin to the alpha and beta subunits of allophycocyanin and to the beta subunit of phycocyanin at the Cys82-equivalent position in all cases. The active form of TeCpcS-III is a dimer, which is consistent with the structure observed in the crystal. Using the UnaG protein and its association with bilirubin as a guide, a model for the association between the native substrate, phycocyanobilin, and TeCpcS was produced.
bilin lyase; cyanobacteria; fluorescent probes; phycobiliproteins; lipocalins
A high-quality structure of the 68-residue protein CD1104B from Clostridium difficile strain 630 exhibits a distinct all α-helical fold. The structure presented here is the first representative of bacterial protein domain family PF14203 (currently 180 members) of unknown function (DUF4319) and reveals that the side-chains of the only two strictly conserved residues (Glu 8 and Lys 48) form a salt bridge. Moreover, these two residues are located in the vicinity of the largest surface cleft, which is predicted to contribute to a surface area involved in protein-protein interactions. This, along with its coding in transposon CTn4, suggests that CD1104B (and very likely all members of Pfam PF14203) functions by interacting with other proteins required for the transfer of transposons between different bacterial species.
CD1104B; PF14203; DUF4319; Transposon; Structural Genomics
We have determined the solution NMR structure of the intermembrane space domain (IMSD) of the human mitochondrial ATPase associated with various activities (AAA) protease known as AFG3-like protein 2 (AFG3L2). Our structural analysis and molecular dynamics results indicate that the IMSD is peripherally bound to the membrane surface. This is a modification to the location of the six IMSDs in a model of the full length yeast hexaoligomeric homolog of AFG3L2 determined at low resolution by electron cryomicroscopy 1. The predicted protein-protein interaction surface, located on the side furthest from the membrane, may mediate binding to substrates as well as prohibitins.
m-AAA protease; Molecular dynamics; NMR structure
Small molecule control of intracellular protein levels allows temporal and dose-dependent regulation of protein function. Recently, we developed a method to degrade proteins fused to a mutant dehalogenase (HaloTag2) using small molecule hydrophobic tags (HyTs). Here, we introduce a complementary method to stabilize the same HaloTag2 fusion proteins, resulting in a unified system allowing bidirectional control of cellular protein levels in a temporal and dose-dependent manner. From a small molecule screen, we identified N-(3,5-dichloro-2-ethoxybenzyl)-2H-tetrazol-5-amine as a nanomolar HALoTag2 Stabilizer (HALTS1) that reduces the Hsp70:HaloTag2 interaction, thereby preventing HaloTag2 ubiquitination. Finally, we demonstrate the utility of the HyT/HALTS system in probing the physiological role of therapeutic targets by modulating HaloTag2-fused oncogenic H-Ras, which resulted in either the cessation (HyT) or acceleration (HALTS) of cellular transformation. In sum, we present a general platform to study protein function, whereby any protein of interest fused to HaloTag2 can be either degraded 10-fold or stabilized 5-fold using two corresponding compounds.
Drug Target Validation; Hydrophobic Tag; Degron; Hsp70; Ubiquitin Proteasome System
At present, only 0.9% of PDB-deposited structures are of membrane proteins in spite of the fact that membrane proteins constitute approximately 30% of total proteins in most genomes from bacteria to humans. Here we address some of the major bottlenecks in the structural studies of membrane proteins and discuss the ability of the new technology, the Single-Protein Production (SPP) system, to help solve these bottlenecks.
SPP; membrane protein; NMR
Gram-negative bacteria have two independent membranes, the inner cytoplasmic membrane and the outer membrane. The outer membrane contains a number of β-barrel proteins such as OmpF, OmpC, OmpA and OmpX. In this paper, we explored the use of the condensed Single Protein Production (cSPP) system for isotope labelling of OmpA and OmpX, both of which are known to consist of eight β-strands forming a barrel in the outer membrane, for NMR structural studies. Using a deletion strain lacking all major outer membrane proteins, both OmpA and OmpX were expressed well in a 20-fold condensed SPP (cSPP) system. We demonstrated that outer membrane fractions prepared from the cSPP system in M9 medium containing 15N-NH4Cl can be used directly for NMR structural studies of the outer membrane proteins, without any further purification, to obtain excellent [1H-15N]-TROSY spectra.
Biomolecular NMR structures are now routinely used in biology, chemistry, and bioinformatics. Methods and metrics for assessing the accuracy and precision of protein NMR structures are beginning to be standardized across the biological NMR community. These include both knowledge-based assessment metrics, parameterized from the database of protein structures, and model vs. data assessment metrics. Online servers are available that provide comprehensive protein structure quality assessment reports, and efforts are in progress by the Worldwide Protein Data Bank (wwPDB) to develop a biomolecular NMR structure quality assessment pipeline as part of the structure deposition process. These quality assessment metrics and standards will aid NMR spectroscopists in determining more accurate structures, and increase the value and utility of these structures for the broad scientific community.
As methods for analysis of biomolecular structure and dynamics using nuclear magnetic resonance spectroscopy (NMR) continue to advance, the resulting 3D structures, chemical shifts, and other NMR data are broadly impacting biology, chemistry, and medicine. Structure model assessment is a critical area of NMR methods development, and is an essential component of the process of making these structures accessible and useful to the wider scientific community. For these reasons, the Worldwide Protein Data Bank (wwPDB) has convened an NMR Validation Task Force (NMR-VTF) to work with the wwPDB partners in developing metrics and policies for biomolecular NMR data harvesting, structure representation, and structure quality assessment. This paper summarizes the recommendations of the NMR-VTF, and lays the groundwork for future work in developing standards and metrics for biomolecular NMR structure quality assessment.
High-quality NMR structures of the C-terminal domain comprising residues 484-537 of the 537-residue protein Bacterial chlorophyll subunit B (BchB) from Chlorobium tepidum and residues 9-61 of the 61-residue Asr4154 from Nostoc sp. (strain PCC 7120) exhibit a mixed α/β fold comprised of three α-helices and a small β-sheet packed against the second α-helix. These two proteins share 29% sequence similarity and their structures are globally quite similar. The structures of BchB(484-537) and Asr4154(9-61) are the first representative structures for the large protein family (Pfam) PF08369, a family of unknown function currently containing 610 members in bacteria and eukaryotes. Furthermore, BchB(484-537) complements the structural coverage of the dark-operating protochlorophyllide oxidoreductase (DPOR).
BchB; DPOR; Asr4154; PF08369; PCP-red; structural genomics
SecA is an intensively studied mechanoenzyme that uses ATP hydrolysis to drive processive extrusion of secreted proteins through a protein-conducting channel in the cytoplasmic membrane of eubacteria. The ATPase motor of SecA is strongly homologous to that in DEAD-box RNA helicases. It remains unclear how local chemical events in its ATPase active site control the overall conformation of an ~100 kDa multidomain enzyme and drive protein transport. In this paper, we use biophysical methods to establish that a single electrostatic charge in the ATPase active site controls the global conformation of SecA. The enzyme undergoes an ATP-modulated endothermic conformational transition (ECT) believed to involve similar structural mechanics to the protein transport reaction. We have characterized the effects of an isosteric glutamate-to-glutamine mutation in the catalytic base, which mimics the immediate electrostatic consequences of ATP hydrolysis in the active site. Calorimetric studies demonstrate that this mutation facilitates the ECT in E. coli SecA and triggers it completely in B. subtilis SecA. Consistent with the substantial increase in entropy observed in the course of the ECT, hydrogen-deuterium exchange mass spectrometry demonstrates that it increases protein backbone dynamics in domain-domain interfaces at remote locations from the ATPase active site. The catalytic glutamate is one of ~250 charged amino acids in SecA, and yet neutralization of its sidechain charge is sufficient to trigger a global order-disorder transition in this 100 kDa enzyme. The intricate network of structural interactions mediating this effect couples local electrostatic changes during ATP hydrolysis to global conformational and dynamic changes in SecA. This network forms the foundation of the allosteric mechanochemistry that efficiently harnesses the chemical energy stored in ATP to drive complex mechanical processes.
SecA; ATPase; thermodynamics; entropy; protein dynamics; allostery; hydrogen-deuterium exchange
For the 10th experiment on Critical Assessment of the techniques of protein Structure Prediction (CASP) the prediction target proteins were broken into independent evaluation units (EUs), which were then classified into template-based modeling (TBM) or free modeling (FM) categories. We describe here how the EUs were defined and classified, what issues arose in the process, and how we resolved them. Evaluation units are frequently not the whole target proteins but the constituting structural domains. However, the assessors from CASP7 on combined more than one domain into one evaluation unit for some targets, which implied that the assessment also included evaluation of the prediction of the relative position and orientation of these domains. In CASP10, we followed and expanded this notion by defining multi-domain evaluation units for a number of targets. These included three EUs, each made of two domains of familiar fold but arranged in a novel manner and for which the focus of evaluation was the inter-domain arrangement. An EU was classified to the TBM category if a template could be found by sequence similarity searches and to FM if a structural template could not be found by structural similarity searches. The EUs that did not fall cleanly in either of these cases were classified case-by-case, often including consideration of the overall quality and characteristics of the predictions.
CASP; CASP10; protein structure; structure prediction; domain definition; evaluation unit; assessment unit; classification
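The classification rule stated in the abstract above can be sketched in a few lines of Python. This is only an illustration of the rule as worded there, not the assessors' actual procedure: the class `EvaluationUnit`, its two boolean fields, and the function `classify_eu` are hypothetical names, and the "case-by-case" label stands in for the manual review the abstract mentions.

```python
from dataclasses import dataclass


@dataclass
class EvaluationUnit:
    """Hypothetical record for one evaluation unit (EU)."""
    name: str
    template_by_sequence: bool   # a template was found by sequence similarity search
    template_by_structure: bool  # a template was found by structural similarity search


def classify_eu(eu: EvaluationUnit) -> str:
    """Apply the rule as worded in the abstract.

    TBM: a template could be found by sequence similarity searches.
    FM:  no structural template could be found by structural similarity searches.
    Anything else is left for case-by-case review.
    """
    if eu.template_by_sequence:
        return "TBM"
    if not eu.template_by_structure:
        return "FM"
    return "case-by-case"


if __name__ == "__main__":
    # Illustrative inputs only; these are not real CASP10 targets.
    examples = [
        EvaluationUnit("example-1", True, True),
        EvaluationUnit("example-2", False, False),
        EvaluationUnit("example-3", False, True),
    ]
    for eu in examples:
        print(eu.name, classify_eu(eu))
```

The third example shows the intermediate case, where a template is detectable only by structural comparison, which is where the abstract says the quality and characteristics of the predictions were also considered.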
We have found that refinement of protein NMR structures using Rosetta with experimental NMR restraints yields more accurate protein NMR structures than those that have been deposited in the PDB using standard refinement protocols. Using 40 pairs of NMR and X-ray crystal structures determined by the Northeast Structural Genomics Consortium, for proteins ranging in size from 5 – 22 kDa, restrained-Rosetta refined structures fit better to the raw experimental data, are in better agreement with their X-ray counterparts, and have better phasing power compared to conventionally determined NMR structures. For 38 proteins for which NMR ensembles were available and which had similar structures in solution and in the crystal, all of the restrained-Rosetta refined NMR structures were sufficiently accurate to be used for solving the corresponding X-ray crystal structures by molecular replacement. The protocol for restrained refinement of protein NMR structures was also compared with restrained CS-Rosetta calculations. For proteins smaller than 10 kDa, restrained CS-Rosetta, starting from extended conformations, provides slightly more accurate structures, while for proteins in the size range of 10 – 25 kDa the less CPU-intensive restrained-Rosetta refinement protocols provided more accurate structures. The restrained-Rosetta protocols described here can improve the accuracy of protein NMR structures, and should find broad and general use for studies of protein structure and function.
Protein perdeuteration approaches have tremendous value in protein NMR studies, but are limited by the high cost of perdeuterated media. Here, we demonstrate that E. coli cultures expressing proteins using either the condensed single protein production method (cSPP), or conventional pET expression plasmids, can be condensed prior to protein expression, thereby providing high-quality 2H,13C,15N-enriched protein samples at 2.5 - 10% the cost of traditional methods. As an example of the value of such inexpensively-produced perdeuterated proteins, we produced 2H,13C,15N-enriched E. coli cold shock protein A (CspA) and EnvZb in 40X condensed phase media, and obtained NMR spectra suitable for 3D structure determination. The cSPP system was also used to produce 2H,13C,15N-enriched E. coli plasma membrane protein YaiZ and outer membrane protein X (OmpX) in condensed phase. NMR spectra can be obtained for these membrane proteins produced in the cSPP system following simple detergent extraction, without extensive purification or reconstitution. This allows a membrane protein’s structural and functional properties to be characterized prior to reconstitution, or as a probe of the effects of subsequent purification steps on the structural integrity of membrane proteins. We also provide a standardized protocol for production of perdeuterated proteins using the cSPP system. The 10 - 40 fold reduction in costs of fermentation media provided by using a condensed culture system opens the door to many new applications for perdeuterated proteins in spectroscopic and crystallographic studies.
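The cost figures quoted above are consistent with each other if one assumes that media cost scales with the volume of labelled medium used. Under that assumption (which is ours, for illustration), a culture condensed N-fold costs 1/N of the traditional amount. The function name `cost_fraction` is hypothetical.

```python
def cost_fraction(condensation_factor: float) -> float:
    """Fraction of the traditional media cost for an N-fold condensed culture.

    Assumes cost is proportional to the volume of isotope-enriched medium,
    so condensing the culture N-fold cuts the cost to 1/N.
    """
    if condensation_factor < 1:
        raise ValueError("condensation factor must be at least 1")
    return 1.0 / condensation_factor


if __name__ == "__main__":
    # 10-fold and 40-fold are the two ends of the range quoted in the abstract.
    for factor in (10, 40):
        print(f"{factor}X condensed: {cost_fraction(factor):.1%} of traditional cost")
```

With this simple model, the 10- to 40-fold range of condensation corresponds to the 10% to 2.5% cost range given in the abstract.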
The heterogeneous array of software tools used in the process of protein NMR structure determination presents organizational challenges in the structure determination and validation processes, and creates a learning curve that limits the broader use of protein NMR in biology. These challenges, including accurate use of data in different data formats required by software carrying out similar tasks, continue to confound the efforts of novices and experts alike. These important issues need to be addressed robustly in order to standardize protein NMR structure determination and validation. PDBStat is a C/C++ computer program originally developed as a universal coordinate and protein NMR restraint converter. Its primary function is to provide a user-friendly tool for interconverting between protein coordinate and protein NMR restraint data formats. It also provides an integrated set of computational methods for protein NMR restraint analysis and structure quality assessment, relabeling of prochiral atoms with correct IUPAC names, as well as multiple methods for analysis of the consistency of atomic positions indicated by their convergence across a protein NMR ensemble. In this paper we provide a detailed description of the PDBStat software, and highlight some of its valuable computational capabilities. As an example, we demonstrate the use of the PDBStat restraint converter for restrained CS-Rosetta structure generation calculations, and compare the resulting protein NMR structure models with those generated from the same NMR restraint data using more traditional structure determination methods. These results demonstrate the value of a universal restraint converter in allowing the use of multiple structure generation methods with the same restraint data for consensus analysis of protein NMR structures and the underlying restraint data.
Protein NMR Structure Validation; BioMagResDatabase; XPLOR; CNS; CYANA; CS-Rosetta
How living organisms create carbon-sulfur bonds during biosynthesis of critical sulfur-containing compounds is still poorly understood. The methylthiotransferases MiaB and RimO catalyze sulfur insertion into tRNAs and ribosomal protein S12, respectively. Both belong to a sub-group of Radical-SAM enzymes that bear two [4Fe-4S] clusters. One cluster binds S-Adenosylmethionine and generates an Ado• radical via a well-established mechanism. However, the precise role of the second cluster is unclear. For some sulfur-inserting Radical-SAM enzymes, this cluster has been proposed to act as a sacrificial source of sulfur for the reaction. In this paper, we report parallel enzymological, spectroscopic and crystallographic investigations of RimO and MiaB, which provide the first evidence that these enzymes are true catalysts and support a new sulfation mechanism involving activation of an exogenous sulfur co-substrate at an exchangeable coordination site on the second cluster, which remains intact during the reaction.
Enzymology; Radical-SAM; methylthiotransferase; Fe-S clusters; HYSCORE spectroscopy; X-ray crystallography
Intrinsically disordered or unstructured regions in proteins are both common and biologically important, particularly in regulation, signaling and modulating intermolecular recognition processes. From a practical point of view, however, such disordered regions can often pose significant challenges for crystallization. Disordered regions are also detrimental to NMR spectral quality, complicating the analysis of resonance assignments and three-dimensional protein structures by NMR methods. The DisMeta Server has been used by the Northeast Structural Genomics Consortium (NESG) as a primary tool for construct design and optimization in preparing samples for both NMR and crystallization studies. It is a meta-server that generates a consensus analysis of eight different sequence-based disorder predictors to identify regions that are likely to be disordered. DisMeta also identifies predicted secretion signal peptides, trans-membrane segments, and low complexity regions. Identification of disordered regions, by either experimental or computational methods, is an important step in the NESG structure production pipeline, allowing the rational design of protein constructs that have improved expression and solubility, improved crystallization, and better quality NMR spectra.
intrinsically disordered protein prediction; construct design; construct optimization; hydrogen-deuterium exchange with mass spectrometry (HDX-MS)
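As a rough illustration of what a consensus over several per-residue disorder predictors can look like, the sketch below uses a simple vote threshold. The abstract does not say how DisMeta combines its eight predictors, so the voting rule, the `min_votes` parameter, and the function names `consensus_disorder` and `disordered_segments` are assumptions made for this example only.

```python
from typing import Dict, List, Tuple


def consensus_disorder(predictions: Dict[str, List[bool]], min_votes: int) -> List[bool]:
    """Flag a residue as disordered when at least `min_votes` predictors agree.

    `predictions` maps a predictor name to one True/False call per residue.
    This vote threshold is an assumed rule, not DisMeta's documented method.
    """
    lengths = {len(calls) for calls in predictions.values()}
    if len(lengths) != 1:
        raise ValueError("all predictors must cover the same, non-empty set of residues")
    n_residues = lengths.pop()
    votes = [sum(calls[i] for calls in predictions.values()) for i in range(n_residues)]
    return [v >= min_votes for v in votes]


def disordered_segments(flags: List[bool]) -> List[Tuple[int, int]]:
    """Collapse per-residue flags into 1-based, inclusive (start, end) ranges."""
    segments: List[Tuple[int, int]] = []
    start = None
    for position, flag in enumerate(flags, start=1):
        if flag and start is None:
            start = position
        elif not flag and start is not None:
            segments.append((start, position - 1))
            start = None
    if start is not None:
        segments.append((start, len(flags)))
    return segments


if __name__ == "__main__":
    # Three made-up predictors over a five-residue toy sequence.
    toy = {
        "predictor_a": [True, True, False, False, True],
        "predictor_b": [True, False, False, False, True],
        "predictor_c": [True, True, False, True, True],
    }
    flags = consensus_disorder(toy, min_votes=2)
    print(disordered_segments(flags))
```

Segments reported this way map directly onto construct design: a construct boundary can be placed to trim a predicted disordered terminus while keeping the ordered core.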
NanoRNase (Nrn) specifically degrades nucleoside 3′,5′-bisphosphate and the very short RNA, nanoRNA, during the final step of mRNA degradation. The crystal structure of Nrn in complex with a reaction product GMP was determined. The overall structure consists of two domains that are interconnected by a flexible loop and form a cleft. Two Mn2+ ions are coordinated by conserved residues in the DHH motif of the N-terminal domain. GMP binds near the DHHA1 motif region in the C-terminal domain. Our structure enables us to predict the substrate-bound form of Nrn as well as other DHH/DHHA1 phosphoesterase family proteins.
nanoRNA; nanoRNase; DHH/DHHA1 family; crystal structure; mRNA degradation
We describe the core Protein Production Platform of the Northeast Structural Genomics Consortium (NESG) and outline the strategies used for producing high-quality protein samples. The platform is centered on the cloning, expression and purification of 6X-His-tagged proteins using T7-based Escherichia coli systems. The 6X-His tag allows for similar purification procedures for most targets and implementation of high-throughput (HTP) parallel methods. In most cases, the 6X-His-tagged proteins are sufficiently purified (> 97% homogeneity) using a HTP two-step purification protocol for most structural studies. Using this platform, the open reading frames of over 16,000 different targeted proteins (or domains) have been cloned as > 26,000 constructs. Over the past nine years, more than 16,000 of these expressed protein, and more than 4,400 proteins (or domains) have been purified to homogeneity in tens of milligram quantities (see Summary Statistics, http://nesg.org/statistics.html). Using these samples, the NESG has deposited more than 900 new protein structures to the Protein Data Bank (PDB). The methods described here are effective in producing eukaryotic and prokaryotic protein samples in E. coli. This paper summarizes some of the updates made to the protein production pipeline in the last five years, corresponding to phase 2 of the NIGMS Protein Structure Initiative (PSI-2) project. The NESG Protein Production Platform is suitable for implementation in a large individual laboratory or by a small group of collaborating investigators. These advanced automated and/or parallel cloning, expression, purification, and biophysical screening technologies are of broad value to the structural biology, functional proteomics, and structural genomics communities.
Structural Genomics; High throughput protein production; Construct optimization; Disorder prediction; Ligation independent cloning; Multiple Displacement Amplification; Laboratory Information Management System; Protein Structure Initiative; NMR; X-ray crystallography; T7 Escherichia coli expression system; Wheat Germ Cell Free; NMR microprobe screening; Parallel protein purification; 6X-His tag; HDX-MS
In this chapter, we concentrate on the production of high quality protein samples for NMR studies. In particular, we provide an in-depth description of recent advances in the production of NMR samples and their synergistic use with recent advancements in NMR hardware. We describe the protein production platform of the Northeast Structural Genomics Consortium, and outline our high-throughput strategies for producing high quality protein samples for nuclear magnetic resonance (NMR) studies. Our strategy is based on the cloning, expression and purification of 6X-His-tagged proteins using T7-based Escherichia coli systems and isotope enrichment in minimal media. We describe 96-well ligation-independent cloning and analytical expression systems, parallel preparative scale fermentation, and high-throughput purification protocols. The 6X-His affinity tag allows for a similar two-step purification procedure implemented in a parallel high-throughput fashion that routinely results in purity levels sufficient for NMR studies (> 97% homogeneity). Using this platform, the protein open reading frames of over 17,500 different targeted proteins (or domains) have been cloned as over 28,000 constructs. Nearly 5,000 of these proteins have been purified to homogeneity in tens of milligram quantities (see Summary Statistics, http://nesg.org/statistics.html), resulting in more than 950 new protein structures, including more than 400 NMR structures, deposited in the Protein Data Bank. The Northeast Structural Genomics Consortium pipeline has been effective in producing protein samples of both prokaryotic and eukaryotic origin. Although this paper describes our entire pipeline for producing isotope-enriched protein samples, it focuses on the major updates introduced during the last 5 years (Phase 2 of the National Institute of General Medical Sciences Protein Structure Initiative). 
Our advanced automated and/or parallel cloning, expression, purification, and biophysical screening technologies are suitable for implementation in a large individual laboratory or by a small group of collaborating investigators for structural biology, functional proteomics, ligand screening and structural genomics research.
Structural Genomics; High throughput protein production; Construct optimization; Disorder prediction; Ligation independent cloning; Multiple Displacement Amplification; Laboratory Information Management System; Protein Structure Initiative; NMR; T7 Escherichia coli expression system; Wheat Germ Cell Free; NMR microprobe screening; Parallel protein purification; 6X-His tag; HDX-MS; Total gene synthesis; condensed single protein production
Nucleophilic catalysis is a general strategy for accelerating ester and amide hydrolysis. In natural active sites, nucleophilic elements such as catalytic dyads and triads are usually paired with oxyanion-holes for substrate activation, but it is difficult to parse out the independent contributions of these elements or to understand how they emerged in the course of evolution. Here we explore the minimal requirements for esterase activity by computationally designing artificial catalysts using catalytic dyads and oxyanion holes. We found much higher success rates using designed oxyanion holes formed by backbone NH groups rather than by sidechains or bridging water molecules and obtained four active designs in different scaffolds by combining this motif with a Cys-His dyad. Following active site optimization, the most active of the variants exhibited a catalytic efficiency (kcat/KM) of 400 M−1s−1 for the cleavage of a p-nitrophenyl ester. Kinetic experiments indicate that the active site cysteines are rapidly acylated as programmed by design, but the subsequent slow hydrolysis of the acyl-enzyme intermediate limits overall catalytic efficiency. Moreover, the Cys-His dyads are not properly formed in crystal structures of the designed enzymes. These results highlight the challenges that computational design must overcome to achieve high levels of activity.
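To put the reported catalytic efficiency in concrete terms: in Michaelis-Menten kinetics, when the substrate concentration is well below KM, the initial rate is approximately (kcat/KM)[E][S]. The sketch below applies that standard approximation. The enzyme and substrate concentrations are made-up values chosen for illustration, and since the abstract does not report KM, whether a given substrate concentration is actually in the low-substrate regime is an assumption.

```python
def initial_rate_low_substrate(kcat_over_km: float,
                               enzyme_molar: float,
                               substrate_molar: float) -> float:
    """Approximate initial rate (M/s) in the regime [S] << KM.

    Uses v ~= (kcat/KM) * [E] * [S], the low-substrate limit of the
    Michaelis-Menten equation v = kcat*[E]*[S] / (KM + [S]).
    """
    return kcat_over_km * enzyme_molar * substrate_molar


if __name__ == "__main__":
    kcat_over_km = 400.0   # M^-1 s^-1, the value reported in the abstract
    enzyme = 1e-6          # 1 uM enzyme (hypothetical)
    substrate = 1e-5       # 10 uM substrate (hypothetical, assumed << KM)
    rate = initial_rate_low_substrate(kcat_over_km, enzyme, substrate)
    print(f"approximate initial rate: {rate:.2e} M/s")
```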
The solution NMR structures and backbone 15N dynamics of the specialized acyl carrier protein (ACP), RpAcpXL, from Rhodopseudomonas palustris, in both the apo-form and holo-form modified by covalent attachment of 4′-phosphopantetheine at S37, are virtually identical, monomeric, and correspond to the closed conformation. The structures have an extra α-helix compared to the archetypical ACP from Escherichia coli, which has four helices, resulting in a larger opening to the hydrophobic cavity. Chemical shift differences between apo- and holo-RpAcpXL indicated some differences in the hinge region between helices α2 and α3 and in the hydrophobic cavity environment, but corresponding changes in NOE crosspeak patterns were not detected. In contrast to the NMR structures, apo-RpAcpXL was observed in an open conformation in crystals that diffracted to 2.0 Å resolution, which resulted from movement of helix α3. Based on the crystal structure, the predicted biological assembly is a homodimer. Although the possible biological significance of dimerization is unknown, there is potential that the resulting large shared hydrophobic cavity could accommodate the very long-chain fatty acid (28 to 30 carbon chain length) that this specialized ACP is known to synthesize and transfer to lipid A. These structures are the first representatives of the AcpXL family and the first to indicate that dimerization may be important for the function of these specialized ACPs.
Structural genomics; Northeast Structural Genomics Consortium (NESG); Protein Structure Initiative; Solution NMR; X-ray crystal structure; ACP-XL
Application of biocatalysis in the synthesis of chiral molecules is one of the greenest technologies for the replacement of chemical routes due to its environmentally benign reaction conditions and unparalleled chemo-, regio- and stereoselectivities. We have been interested in searching for carbonyl reductase enzymes and assessing their substrate specificity and stereoselectivity. We now report a gene cluster identified in Candida parapsilosis that consists of four open reading frames including three putative stereospecific carbonyl reductases (scr1, scr2, and scr3) and an alcohol dehydrogenase (cpadh). These three newly identified stereospecific carbonyl reductases (SCRs) showed high catalytic activities for producing (S)-1-phenyl-1,2-ethanediol from 2-hydroxyacetophenone with NADPH as the coenzyme. Together with CPADH, all four enzymes from this cluster are carbonyl reductases with novel anti-Prelog stereoselectivity. SCR1 and SCR3 exhibited distinct specificities to acetophenone derivatives and chloro-substituted 2-hydroxyacetophenones, and especially high activities to ethyl 4-chloro-3-oxobutyrate, a β-ketoester with important pharmaceutical potential. Our study also showed that genomic mining is a powerful tool for the discovery of new enzymes.
Bacterial species in the Enterobacteriaceae typically contain multiple paralogues of a small domain of unknown function (DUF1471) from a family of conserved proteins also known as YhcN or BhsA/McbA. Proteins containing DUF1471 may have a single or three copies of this domain. Representatives of this family have been demonstrated to play roles in several cellular processes including stress response, biofilm formation, and pathogenesis. We have conducted NMR and X-ray crystallographic studies of four DUF1471 domains from Salmonella representing three different paralogous DUF1471 subfamilies: SrfN, YahO, and SssB/YdgH (two of its three DUF1471 domains: the N-terminal domain I (residues 21–91), and the C-terminal domain III (residues 244–314)). Notably, SrfN has been shown to have a role in intracellular infection by Salmonella Typhimurium. These domains share less than 35% pairwise sequence identity. Structures of all four domains show a mixed α+β fold that is most similar to that of bacterial lipoprotein RcsF. However, all four DUF1471 sequences lack the redox sensitive cysteine residues essential for RcsF activity in a phospho-relay pathway, suggesting that DUF1471 domains perform a different function(s). SrfN forms a dimer in contrast to YahO and SssB domains I and III, which are monomers in solution. A putative binding site for oxyanions such as phosphate and sulfate was identified in SrfN, and an interaction between the SrfN dimer and sulfated polysaccharides was demonstrated, suggesting a direct role for this DUF1471 domain at the host-pathogen interface.