Details are emerging on the structure and function of a remarkable class of capsid-like protein assemblies that serve as simple metabolic organelles in many bacteria. These bacterial microcompartments consist of a few thousand shell proteins, which encapsulate two or more sequentially acting enzymes in order to enhance or sequester certain metabolic pathways, particularly those involving toxic or volatile intermediates. Genomic data indicate that bacterial microcompartment shell proteins are present in a wide range of bacterial species, where they encapsulate varied reactions. Crystal structures of numerous shell proteins from distinct types of microcompartments have provided keys for understanding how the shells are assembled and how they conduct molecular transport into and out of microcompartments. The structural data emphasize a high level of mechanistic sophistication in the protein shell, and point the way for further studies on this fascinating but poorly appreciated class of subcellular structures.
Protein assembly; capsid; molecular transport; pore; BMC; carboxysome
Disulfide bonds are generally not used to stabilize proteins in the cytosolic compartments of bacteria or eukaryotic cells, owing to the chemically reducing nature of those environments. In contrast, certain thermophilic archaea use disulfide bonding as a major mechanism for protein stabilization. Here, we provide a current survey of completely sequenced genomes, applying computational methods to estimate the use of disulfide bonding across the Archaea. Microbes belonging to the Crenarchaeal branch, which are essentially all hyperthermophilic, are universally rich in disulfide bonding while lesser degrees of disulfide bonding are found among the thermophilic Euryarchaea, excluding those that are methanogenic. The results help clarify which parts of the archaeal lineage are likely to yield more examples and additional specific data on protein disulfide bonding, as increasing genomic sequencing efforts are brought to bear.
Cystathionine β-synthase (CBS) domains are found in myriad proteins from organisms across the tree of life, and have been hypothesized to function as regulatory modules that sense the energy charge of the cell. Here we characterize the structure and stability of PAE2072, a dimeric, tandem CBS domain protein from the hyperthermophilic crenarchaeon Pyrobaculum aerophilum. Crystal structures of the protein in unliganded and adenosine monophosphate (AMP)-bound forms, determined at resolutions of 2.10 Å and 2.35 Å respectively, reveal a remarkable conservation of key functional features seen in the γ subunit of the eukaryotic AMP-activated protein kinase (AMPK). The structures also confirm the presence of a suspected intermolecular disulfide bond between the two subunits that is shown to stabilize the protein. Our AMP-bound structure represents a first step in investigating the function of a large class of uncharacterized prokaryotic proteins. In addition, this work extends previous studies that have suggested that, in certain thermophilic microbes, disulfide bonds play a key role in stabilizing intracellular proteins and protein-protein complexes.
Cystathionine β-synthase; AMP-activated protein kinase; disulfide bond; protein stabilization; hyperthermophile
Among proteins of known three-dimensional structure, only a few possess complex topological features such as knotted or interlinked (catenated) protein backbones. Such unusual proteins offer potentially unique insights into folding pathways and stabilization mechanisms. They also present special challenges for both theorists and computational scientists interested in understanding and predicting protein folding behavior. Here we review complex topological features in proteins with a focus on recent progress on the identification and characterization of knotted and interlinked protein systems. Also, an approach is described for designing an expanded set of knotted proteins.
Protein knots; protein links; protein folding; protein stability; protein topology
The polypeptide backbones of a few proteins are tied in a knot. The biophysical effects and potential biological roles of knots are not well understood. Here, we test the consequences of protein knotting by taking a monomeric protein, carbonic anhydrase II, whose native structure contains a shallow knot, and polymerizing it end-to-end to form a deeply and multiply knotted polymeric filament. Thermal stability experiments show that the polymer is stabilized against loss of structure and aggregation by the presence of deep knots.
biomaterials; disulfide; protein design; protein knot; topology
Many bacteria conditionally express proteinaceous organelles referred to here as microcompartments (Fig. 1). These microcompartments are thought to be involved in at least seven different metabolic processes and the number is growing. Microcompartments are very large and structurally sophisticated. They are usually about 100–150 nm in cross section and consist of 10,000–20,000 polypeptides of 10–20 types. Their unifying feature is a solid shell constructed from proteins having bacterial microcompartment (BMC) domains. In the examples that have been studied, the microcompartment shell encases sequentially acting metabolic enzymes that catalyze a reaction sequence having a toxic or volatile intermediate product. It is thought that the shell of the microcompartment confines such intermediates, thereby enhancing metabolic efficiency and/or protecting cytoplasmic components. Mechanistically, however, this creates a paradox. How do microcompartments allow enzyme substrates, products and cofactors to pass while confining metabolic intermediates in the absence of a selectively permeable membrane? We suggest that the answer to this paradox may have broad implications with respect to our understanding of the fundamental properties of biological protein sheets including microcompartment shells, S-layers and viral capsids.
Some bacteria contain organelles or microcompartments consisting of a large virion-like protein shell encapsulating sequentially acting enzymes. These organized microcompartments serve to enhance or protect key metabolic pathways inside the cell. The various bacterial microcompartments provide diverse metabolic functions, ranging from CO2 fixation to the degradation of small organic molecules. Yet they share an evolutionarily related shell, which is defined by a conserved protein domain that is widely distributed across the bacterial kingdom. Structural studies on a number of these bacterial microcompartment shell proteins are illuminating the architecture of the shell and highlighting its critical role in controlling molecular transport into and out of microcompartments. Current structural, evolutionary, and mechanistic ideas are discussed, along with genomic studies for exploring the function and diversity of this family of bacterial organelles.
carboxysome; molecular transport; protein assembly; nanocompartments; protein shell
Enzymes from natural product biosynthetic pathways are attractive candidates for creating tailored biocatalysts to produce semisynthetic pharmaceutical compounds. LovD is an acyltransferase that converts the inactive monacolin J acid (MJA) into the cholesterol-lowering lovastatin. LovD can also synthesize the blockbuster drug simvastatin using MJA and a synthetic α-dimethylbutyryl thioester, albeit with suboptimal properties as a biocatalyst. Here we used directed evolution to improve the properties of LovD towards semisynthesis of simvastatin. Mutants with improved catalytic efficiency, solubility and thermal stability were obtained, with the best mutant displaying an ~11-fold increase in an Escherichia coli-based biocatalytic platform. To understand the structural basis of LovD enzymology, seven X-ray crystal structures were determined, including the parent LovD, an improved mutant G5, and G5 co-crystallized with ligands. Comparisons between the structures reveal that beneficial mutations stabilize the structure of G5 in a more compact conformation that is favorable for catalysis.
The crystal structure of a putative NTP pyrophosphohydrolase, YP_001813558.1 from E. sibiricum, reveals a novel segment-swapped linked-dimer assembly.
The crystal structure of a putative NTPase, YP_001813558.1 from Exiguobacterium sibiricum 255-15 (PF09934, DUF2166) was determined to 1.78 Å resolution. YP_001813558.1 and its homologs (dimeric dUTPases, MazG proteins and HisE-encoded phosphoribosyl ATP pyrophosphohydrolases) form a superfamily of all-α-helical NTP pyrophosphatases. In dimeric dUTPase-like proteins, a central four-helix bundle forms the active site. However, in YP_001813558.1, an unexpected intertwined swapping of two of the helices that compose the conserved helix bundle results in a ‘linked dimer’ that has not previously been observed for this family. Interestingly, despite this novel mode of dimerization, the metal-binding site for divalent cations, such as magnesium, that are essential for NTPase activity is still conserved. Furthermore, the active-site residues that are involved in sugar binding of the NTPs are also conserved when compared with other α-helical NTPases, but those that recognize the nucleotide bases are not conserved, suggesting a different substrate specificity.
structural genomics; putative NTP pyrophosphohydrolase; MazG nucleotide pyrophosphohydrolase; dUTPases
Many of the functional units in cells are multi-protein complexes such as RNA polymerase, the ribosome, and the proteasome. For such units to work together, one might expect a high level of regulation to enable co-appearance or repression of sets of complexes at the required time. However, this type of coordinated regulation between whole complexes is difficult to detect by existing methods for analyzing mRNA co-expression. We propose a new methodology that is able to detect such higher order relationships.
We detect coordinated regulation of multiple protein complexes using logic analysis of gene expression data. Specifically, we identify gene triplets composed of genes whose expression profiles are found to be related by various types of logic functions. In order to focus on complexes, we associate the members of a gene triplet with the distinct protein complexes to which they belong. In this way, we identify complexes related by specific kinds of regulatory relationships. For example, we may find that the transcription of complex C is increased only if the transcription of both complex A AND complex B is repressed. We identify hundreds of examples of coordinated regulation among complexes under various stress conditions. Many of these examples involve the ribosome. Some of our examples have been previously identified in the literature, while others are novel. One notable example is the relationship between the transcription of the ribosome, RNA polymerase and mannosyltransferase II, which is involved in N-linked glycan processing in the Golgi.
The analysis proposed here focuses on relationships among triplets of genes that are not evident when genes are examined in a pairwise fashion as in typical clustering methods. By grouping gene triplets, we are able to decipher coordinated regulation among sets of three complexes. Moreover, using all triplets that involve coordinated regulation with the ribosome, we derive a large network involving this essential cellular complex. In this network we find that all multi-protein complexes that belong to the same functional class are regulated in the same direction as a group (either induced or repressed).
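As a rough illustration of the triplet logic described above, the following Python sketch binarizes three expression profiles and scores how well the third is explained by a logic function of the first two. The profile values, the 0.5 threshold, the function names and the simple agreement score are all assumptions made for illustration; the published analysis may rely on a different statistic.

```python
def binarize(profile, threshold):
    """Return 1 where expression exceeds the threshold, else 0."""
    return [1 if x > threshold else 0 for x in profile]


def fits_logic(a, b, c, func):
    """Fraction of conditions in which c equals func(a, b)."""
    matches = sum(1 for x, y, z in zip(a, b, c) if func(x, y) == z)
    return matches / len(c)


# Two of the possible two-input logic functions (labels are illustrative).
LOGIC = {
    "C iff A AND B": lambda x, y: x & y,
    "C iff NOT A AND NOT B": lambda x, y: (1 - x) & (1 - y),
}

# Hypothetical expression levels for complexes A, B and C over six conditions.
a = binarize([0.1, 0.9, 0.2, 0.8, 0.1, 0.7], 0.5)
b = binarize([0.2, 0.1, 0.9, 0.9, 0.3, 0.1], 0.5)
c = binarize([0.9, 0.1, 0.2, 0.1, 0.8, 0.2], 0.5)

scores = {name: fits_logic(a, b, c, func) for name, func in LOGIC.items()}
for name, score in scores.items():
    print(f"{name}: {score:.2f}")
```

In this toy data set, C is on only when both A and B are off, so the second function fits every condition while the plain AND function fits only half of them.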
Multiple crystal structures are reported of cross-linked actin dimers. Interactions that are conserved across crystal structures suggest detailed interactions that are likely to be present in F-actin filaments.
The structure of actin in its monomeric form is known at high resolution, while the structure of filamentous F-actin is only understood at considerably lower resolution. Knowing precisely how the monomers of actin fit together would lead to a deeper understanding of the dynamic behavior of the actin filament. Here, a series of crystal structures of actin dimers are reported which were prepared by cross-linking in either the longitudinal or the lateral direction in the filament state. Laterally cross-linked dimers, composed of monomers belonging to different protofilaments, are found to adopt configurations in crystals that are not related to the native structure of filamentous actin. In contrast, multiple structures of longitudinal dimers consistently reveal the same interface between monomers within a single protofilament. The reappearance of the same longitudinal interface in multiple crystal structures adds weight to arguments that the interface visualized is similar to that in actin filaments. Highly conserved atomic interactions involving residues 199–205 and 287–291 are highlighted.
F-actin; cross-linking; actin filaments
A growing number of organisms have been discovered inhabiting extreme environments, including temperatures in excess of 100 °C. How cellular proteins from such organisms retain their native folds under extreme conditions is still not fully understood. Recent computational and structural studies have identified disulfide bonding as an important mechanism for stabilizing intracellular proteins in certain thermophilic microbes. Here, we present the first proteomic analysis of intracellular disulfide bonding in the hyperthermophilic archaeon Pyrobaculum aerophilum. Our study reveals that the utilization of disulfide bonds extends beyond individual proteins to include many protein-protein complexes. We report the 1.6 Å crystal structure of one such complex, a citrate synthase homodimer. The structure contains two intramolecular disulfide bonds, one per subunit, which result in the cyclization of each protein chain in such a way that the two chains are topologically interlinked, rendering them inseparable. This unusual feature emphasizes the variety and sophistication of the molecular mechanisms that can be achieved by evolution.
disulfide bond; protein stability; catenane; citrate synthase; thermophile
The carboxysome is a bacterial organelle that functions to enhance the efficiency of CO2 fixation by encapsulating the enzymes ribulose bisphosphate carboxylase/oxygenase (RuBisCO) and carbonic anhydrase. The outer shell of the carboxysome is reminiscent of a viral capsid, being constructed from many copies of a few small proteins. Here we describe the structure of the shell protein CsoS1A from the chemoautotrophic bacterium Halothiobacillus neapolitanus. The CsoS1A protein forms hexameric units that pack tightly together to form a molecular layer, which is perforated by narrow pores. Sulfate ions, soaked into crystals of CsoS1A, are observed in the pores of the molecular layer, supporting the idea that the pores could be the conduit for negatively charged metabolites such as bicarbonate, which must cross the shell. The problem of diffusion across a semiporous protein shell is discussed, with the conclusion that the shell is sufficiently porous to allow adequate transport of small molecules. The molecular layer formed by CsoS1A is similar to the recently observed layers formed by cyanobacterial carboxysome shell proteins. This similarity supports the argument that the layers observed represent the natural structure of the facets of the carboxysome shell. Insights into carboxysome function are provided by comparisons of the carboxysome shell to viral capsids, and a comparison of its pores to the pores of transmembrane protein channels.
Bacterial cells are generally viewed as being relatively simple because they lack the membrane-bound organelles that help organize the interiors of eukaryotic cells. However, many bacterial cells produce large, protein-based microcompartments that serve effectively as simple organelles. These microcompartments enclose specific cellular enzymes, thereby successfully sequestering particular reactions or pathways from the rest of the cytosol. The prototypical bacterial microcompartment is the carboxysome, which is found in many bacteria that fix CO2 into organic carbon. In these bacteria, the efficiency of CO2 fixation is enhanced by having the key enzymes in that pathway encapsulated together. Carboxysomes were discovered more than 40 years ago, but an understanding of their assembly and function is just beginning to emerge. Here we report new structures of the proteins that form the outer shell of the carboxysome. These structures provide further evidence that the carboxysome shell is constructed according to principles similar to those seen in icosahedral viral capsids. The structure of the carboxysome serves as a model for understanding a variety of primitive bacterial organelles that are coming to light.
The structure of the bacterial carboxysome shell protein consists of tightly packed hexameric units perforated by narrow pores that bind sulfate ions, enabling metabolites to be conducted across the shell.
A new method for predicting recoding by rare amino acids such as selenocysteine and pyrrolysine was used to survey a set of microbial genomes.
In several natural settings, the standard genetic code is expanded to incorporate two additional amino acids with distinct functionality, selenocysteine and pyrrolysine. These rare amino acids can be overlooked inadvertently, however, as they arise by recoding at certain stop codons. We report a method for such recoding prediction from genomic data, using read-through similarity evaluation. A survey across a set of microbial genomes identifies almost all the known cases as well as a number of novel candidate proteins.
The Genomic Disulfide Analysis Program (GDAP) provides web access to computationally predicted protein disulfide bonds for over one hundred microbial genomes, including both bacterial and archaeal species. In the GDAP process, sequences of unknown structure are mapped, when possible, to known homologous Protein Data Bank (PDB) structures, after which specific distance criteria are applied to predict disulfide bonds. GDAP also accepts user-supplied protein sequences and subsequently queries the PDB sequence database for the best matches, scans for possible disulfide bonds and returns the results to the client. These predictions are useful for a variety of applications and have previously been used to show a dramatic preference in certain thermophilic archaea and bacteria for disulfide bonds within intracellular proteins. Given the central role these stabilizing, covalent bonds play in such organisms, the predictions available from GDAP provide a rich data source for designing site-directed mutants with more stable thermal profiles. The GDAP web application is a gateway to this information and can be used to understand the role disulfide bonds play in protein stability both in these unusual organisms and in sequences of interest to the individual researcher. The prediction server can be accessed at http://www.doe-mbi.ucla.edu/Services/GDAP.
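The distance-based step can be pictured with a short Python sketch. It assumes that cysteine positions in the query sequence have already been mapped onto Cβ coordinates of a homologous structure, and it flags any pair lying within a cutoff distance. The function name, the input format and the 4.5 Å cutoff are hypothetical choices for illustration and are not taken from GDAP itself.

```python
import math


def predict_disulfides(cys_cb_coords, max_cb_distance=4.5):
    """Return (residue_i, residue_j, distance) for cysteine pairs whose
    mapped C-beta atoms lie within max_cb_distance angstroms.

    cys_cb_coords maps residue number -> (x, y, z) in angstroms.
    The 4.5 A default is an illustrative assumption, not GDAP's criterion.
    """
    residues = sorted(cys_cb_coords)
    pairs = []
    for i, r1 in enumerate(residues):
        for r2 in residues[i + 1:]:
            d = math.dist(cys_cb_coords[r1], cys_cb_coords[r2])
            if d <= max_cb_distance:
                pairs.append((r1, r2, round(d, 2)))
    return pairs


# Hypothetical coordinates for three cysteines mapped onto a template.
example = {12: (0.0, 0.0, 0.0), 47: (3.8, 0.0, 0.0), 90: (20.0, 5.0, 1.0)}
predicted = predict_disulfides(example)
print(predicted)
```

Only the pair of residues 12 and 47 falls inside the cutoff here; residue 90 is too far from either partner to be flagged.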
Four methods that infer protein function and linkages have been combined in a single database, Prolinks, which spans 83 organisms and includes 10 million high-confidence links.
The advent of whole-genome sequencing has led to methods that infer protein function and linkages. We have combined four such algorithms (phylogenetic profile, Rosetta Stone, gene neighbor and gene cluster) in a single database - Prolinks - that spans 83 organisms and includes 10 million high-confidence links. The Proteome Navigator tool allows users to browse predicted linkage networks interactively, providing accompanying annotation from public databases. The Prolinks database and the Proteome Navigator tool are available for use online at .
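Of the four algorithms, the phylogenetic profile method is the simplest to sketch: each protein is represented by its presence or absence across a set of genomes, and proteins with similar patterns are inferred to be functionally linked. The Python example below uses Jaccard similarity with a 0.7 cutoff. The protein names, the profiles, the scoring function and the cutoff are all assumptions for illustration; Prolinks may score profiles differently.

```python
def profile_similarity(p, q):
    """Jaccard similarity between two presence/absence profiles."""
    both = sum(1 for x, y in zip(p, q) if x and y)
    either = sum(1 for x, y in zip(p, q) if x or y)
    return both / either if either else 0.0


def linked_pairs(profiles, min_similarity=0.7):
    """Return protein pairs whose profiles meet the similarity cutoff."""
    names = sorted(profiles)
    return [
        (a, b)
        for i, a in enumerate(names)
        for b in names[i + 1:]
        if profile_similarity(profiles[a], profiles[b]) >= min_similarity
    ]


# Hypothetical profiles: 1 = homolog present in that genome, 0 = absent.
profiles = {
    "protA": [1, 0, 1, 1, 0, 1],
    "protB": [1, 0, 1, 1, 0, 1],
    "protC": [0, 1, 0, 0, 1, 0],
    "protD": [1, 0, 1, 0, 0, 1],
}

links = linked_pairs(profiles)
print(links)
```

In this toy set, protA, protB and protD co-occur across genomes and are linked to one another, while protC, which appears only where the others are absent, is linked to none.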
Genome-wide functional linkages among proteins in cellular complexes and metabolic pathways can be inferred from high throughput experimentation, such as DNA microarrays, or from bioinformatic analyses. Here we describe a method for the visualization and interpretation of genome-wide functional linkages inferred by the Rosetta Stone, Phylogenetic Profile, Operon and Conserved Gene Neighbor computational methods. This method involves the construction of a genome-wide functional linkage map, where each significant functional linkage between a pair of proteins is displayed on a two-dimensional scatter-plot, organized according to the order of genes along the chromosome. Subsequent hierarchical clustering of the map reveals clusters of genes with similar functional linkage profiles and facilitates the inference of protein function and the discovery of functionally linked gene clusters throughout the genome. We illustrate this method by applying it to the genome of the pathogenic bacterium Mycobacterium tuberculosis, assigning cellular functions to previously uncharacterized proteins involved in cell wall biosynthesis, signal transduction, chaperone activity, energy metabolism and polysaccharide biosynthesis.
Thermophilic organisms flourish in varied high-temperature environmental niches that are deadly to other organisms. Recently, genomic evidence has implicated a critical role for disulfide bonds in the structural stabilization of intracellular proteins from certain of these organisms, contrary to the conventional view that structural disulfide bonds are exclusively extracellular. Here both computational and structural data are presented to explore the occurrence of disulfide bonds as a protein-stabilization method across many thermophilic prokaryotes. Based on computational studies, disulfide-bond richness is found to be widespread, with thermophiles containing the highest levels. Interestingly, only a distinct subset of thermophiles exhibit this property. A computational search for proteins matching this target phylogenetic profile singles out a specific protein, known as protein disulfide oxidoreductase, as a potential key player in thermophilic intracellular disulfide-bond formation. Finally, biochemical support in the form of a new crystal structure of a thermophilic protein with three disulfide bonds is presented together with a survey of known structures from the literature. Together, the results provide insight into biochemical specialization and the diversity of methods employed by organisms to stabilize their proteins in exotic environments. The findings also motivate continued efforts to sequence genomes from divergent organisms.
Certain thermophiles are found to stabilize their proteins in extreme environments with additional disulfide bonds. A phylogenetic profile identifies a protein disulfide oxidoreductase critical to the stabilization process.