Hermeuptychia intricata Grishin, sp. n. is described from the Brazos Bend State Park in Texas, United States, where it flies synchronously with Hermeuptychia sosybius (Fabricius, 1793). The two species differ strongly in both male and female genitalia and exhibit 3.5% difference in the COI barcode sequence of mitochondrial DNA. Setting such significant genitalic and genotypic differences aside, we were not able to find reliable wing pattern characters to distinguish the two species. This superficial similarity may explain why H. intricata, only distantly related to H. sosybius, has remained unnoticed until now, despite being widely distributed in the coastal plains from South Carolina to Texas, USA (and possibly to Costa Rica). Obscuring the presence of a cryptic species even further, wing patterns are variable in both butterflies and ventral eyespots vary from large to almost absent. To avoid confusion with the new species, a neotype for Papilio sosybius Fabricius, 1793, a common butterfly that occurs across southeast US, is designated from Savannah, Georgia, USA. It secures the universally accepted traditional usage of this name. Furthermore, we find that DNA barcodes of Hermeuptychia specimens from the US, even those from extreme south Texas, are at least 4% different from those of H. hermes (Fabricius, 1775; type locality Brazil: Rio de Janeiro) and suggest that the name H. hermes should not be used for USA populations, but rather reserved for the South American species. This conclusion is further supported by comparison of male genitalia. However, facies, genitalia and 2.1% different DNA barcodes set Hermeuptychia populations in the lower Rio Grande Valley of Texas apart from H. sosybius. These southern populations, also found in northeastern Mexico, are described here as Hermeuptychia hermybius Grishin, sp. n. (type locality Texas: Cameron County). While being phylogenetically closer to H. sosybius than to any other Hermeuptychia species, H. hermybius can usually be recognized by wing patterns, such as the size of eyespots and the shape of brown lines on hindwing. “Intricate Satyr” and “South Texas Satyr” are proposed as the English names for H. intricata and H. hermybius, respectively.
Biodiversity; cryptic species; DNA barcodes; neotropical; satyr; Hermeuptychia gisella; Hermeuptychia cucullina; Hermeuptychia sosybius kappeli; female genitalia
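The barcode percentages quoted in this abstract (3.5%, 4%, 2.1%) are pairwise sequence differences. The abstract does not say which distance measure was used, so the following is only a minimal sketch that assumes an uncorrected p-distance over aligned sites. The function name and the toy sequences are hypothetical and are not taken from the study.

```python
def p_distance(seq_a: str, seq_b: str) -> float:
    """Uncorrected p-distance: fraction of compared sites that differ.

    Assumption: both sequences are already aligned to the same length.
    Sites where either sequence has a gap or an ambiguous base are skipped.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    unambiguous = set("ACGT")
    compared = 0
    differing = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a in unambiguous and b in unambiguous:
            compared += 1
            if a != b:
                differing += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    return differing / compared


if __name__ == "__main__":
    # Toy 10-base fragments (hypothetical), differing at one site.
    first = "ACGTACGTAC"
    second = "ACGTACGTAT"
    print(f"{100 * p_distance(first, second):.1f}% difference")
```

Multiplying the returned fraction by 100 gives a percentage comparable in form to the values reported above; corrected distances (for example Kimura two-parameter) would give slightly different numbers.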
Stu2p/XMAP215 proteins are essential microtubule polymerases that use multiple αβ-tubulin-interacting TOG domains to bind microtubule plus ends and catalyze fast microtubule growth. We report here the structure of the TOG2 domain from Stu2p bound to yeast αβ-tubulin. Like TOG1, TOG2 binds selectively to a fully ‘curved’ conformation of αβ-tubulin, incompatible with a microtubule lattice. We also show that TOG1-TOG2 binds non-cooperatively to two αβ-tubulins. Preferential interactions between TOGs and fully curved αβ-tubulin that cannot exist elsewhere in the microtubule explain how these polymerases localize to the extreme microtubule end. We propose that these polymerases promote elongation because their linked TOG domains concentrate unpolymerized αβ-tubulin near curved subunits already bound at the microtubule end. This tethering model can explain catalyst-like behavior and also predicts that the polymerase action changes the configuration of the microtubule end.
Dynamic filaments of proteins, called microtubules, have several important roles inside cells. Microtubules provide structural support for the cell; they help to pull chromosomes apart during cell division; and they guide the trafficking of proteins and molecules across the cell.
The building blocks of microtubules are proteins called αβ-tubulin, which are continually added to and removed from the ends of a microtubule, causing it to grow and shrink. Other proteins that interact with the microtubules can help to speed up these construction and deconstruction processes. Ayaz et al. took a closer look at the structure of one particular family of proteins that make it easier for the microtubules to grow, using a technique called X-ray crystallography. The resulting images show two sites—called TOG1 and TOG2—on the enzymes that attach to the αβ-tubulin proteins. Ayaz et al. found that this binding can only occur when αβ-tubulin has a curved shape, which only happens when the tubulins are not included in, or are only bound weakly to the end of, a microtubule.
Previous research suggested that the two binding sites might work together to provide ‘scaffolding’ that stabilizes the microtubule. However, genetic experiments by Ayaz et al. show that microtubules will grow even if one of the binding sites is missing. Both TOG1 and TOG2 bind to αβ-tubulin in the same way, and by using computer simulations Ayaz et al. found that this helps to speed up the growth of microtubules. This is because the enzyme's two sites concentrate the individual tubulin building blocks at the ends of the filament. For example, TOG2 could bind to the end of the microtubule, while TOG1 holds an αβ-tubulin protein nearby and ready to bind to the filament's end. This tethering allows the microtubules to be assembled more efficiently.
microtubule; TOG domain; kinetic simulation; X-ray structure; conformation-selective; S. cerevisiae
Candidatus Liberibacter asiaticus (Ca. L. asiaticus) is a Gram-negative bacterium and the pathogen of Citrus Greening disease (Huanglongbing, HLB). As a parasitic bacterium, Ca. L. asiaticus harbors ABC transporters that play important roles in exchanging chemical compounds between Ca. L. asiaticus and its host. Here we analyzed all the ABC transporter-related proteins in Ca. L. asiaticus. We identified 14 ABC transporter systems and predicted their structures and substrate specificities. In-depth sequence and structure analysis including multiple sequence alignment, phylogenetic tree reconstruction and structure comparison further supports their function predictions. Our study shows that this bacterium could utilize these ABC transporters to import metabolites (amino acids and phosphates) and enzyme cofactors (choline, thiamine, iron, manganese and zinc), resist organic solvents, heavy metals and lipid-like drugs, construct and maintain the composition of the outer membrane, and secrete virulence factors. While the features of most ABC systems could be deduced from the abundant experimental data on their orthologs, we report several novel observations within ABC system proteins. Moreover, we identified seven non-transport ABC systems that are likely involved in virulence gene expression regulation, transposon excision regulation and DNA repair. Our analysis reveals several candidates for further studies to understand and control the disease, including the type I virulence factor secretion system and its substrate that are likely related to Ca. L. asiaticus pathogenicity, and the ABC transporter systems responsible for bacterial outer membrane biosynthesis that are good drug targets.
Genomic annotation; function prediction; ATPase; transmembrane protein; multiple sequence alignment; phylogenetic tree; protein homology; structure comparison
Understanding the evolution of a protein, including both close and distant relationships, often reveals insight into its structure and function. Fast and easy access to such up-to-date information facilitates research. We have developed a hierarchical evolutionary classification of all proteins with experimentally determined spatial structures, and presented it as an interactive and updatable online database. ECOD (Evolutionary Classification of protein Domains) is distinct from other structural classifications in that it groups domains primarily by evolutionary relationships (homology), rather than topology (or “fold”). This distinction highlights cases of homology between domains of differing topology to aid in understanding of protein structure evolution. ECOD uniquely emphasizes distantly related homologs that are difficult to detect, and thus catalogs the largest number of evolutionary links among structural domain classifications. Placing distant homologs together underscores the ancestral similarities of these proteins and draws attention to the most important regions of sequence and structure, as well as conserved functional sites. ECOD also recognizes closer sequence-based relationships between protein domains. Currently, approximately 100,000 protein structures are classified in ECOD into 9,000 sequence families clustered into close to 2,000 evolutionary groups. The classification is assisted by an automated pipeline that quickly and consistently classifies weekly releases of PDB structures and allows for continual updates. This synchronization with PDB uniquely distinguishes ECOD among all protein classifications. Finally, we present several case studies of homologous proteins not recorded in other classifications, illustrating the potential of how ECOD can be used to further biological and evolutionary studies.
Protein structural domain databases offer a vital resource for structural bioinformatics. These databases provide functional inference for homologous structures, supply templates for structural prediction experiments, and differentiate between homologs and analogs. The rate of structure determination and deposition has increased dramatically over recent years, overwhelming the ability of current classifications to incorporate all new structures. We have developed a fast and reliable methodology for updating domain databases automatically, and created a revised hierarchy for domain classification that emphasizes evolutionary relationships. By classifying all known structures in our database with continuing automatic updates, we provide an up-to-date alternative to current resources. We illustrate several concepts that guided our classification scheme with examples of homology between domains in ECOD that are not observed in other resources.
Summary: Online Mendelian Inheritance in Man (OMIM) is a manually curated compendium of human genetic variants and the corresponding phenotypes, mostly human diseases. Instead of directly documenting the native sequences for gene entries, OMIM links its entries to protein and DNA sequences in other databases. However, because of the existence of gene isoforms and errors in OMIM records, mapping a specific OMIM mutation to its corresponding protein sequence is not trivial. Combining computer programs and extensive manual curation of OMIM full-text descriptions and original literature, we mapped 98% of OMIM amino acid substitutions (AASs) and all SwissProt Variant (SwissVar) disease-related AASs to reference sequences and confidently mapped 99.96% of all AASs to the genomic loci. Based on the results, we developed an online database and interactive web server (M2SG) to (i) retrieve the mapped OMIM and SwissVar variants for a given protein sequence; and (ii) obtain related proteins and mutations for an input disease phenotype. This database will be useful for analyzing sequences, understanding the effect of mutations, identifying important genetic variations and designing experiments on a protein of interest.
Availability and implementation: The database and web server are freely available at http://prodata.swmed.edu/M2S/mut2seq.cgi.
Supplementary data are available at Bioinformatics online.
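The isoform problem mentioned in the summary can be illustrated with a small consistency check: a substitution can only be mapped to a sequence whose residue at the stated position matches the reference residue. This is a minimal sketch, not the M2SG pipeline. It assumes one-letter notation such as "R3H" (OMIM records may use other notations), and the function names and toy isoform sequences are hypothetical.

```python
import re

# Assumption: substitutions are written as <reference residue><1-based
# position><variant residue>, e.g. "R3H". Other notations are not handled.
AAS_PATTERN = re.compile(r"([A-Z])(\d+)([A-Z])")


def parse_aas(aas):
    """Split a substitution string into (reference, position, variant)."""
    match = AAS_PATTERN.fullmatch(aas)
    if match is None:
        raise ValueError(f"unrecognized substitution: {aas!r}")
    ref, pos, alt = match.groups()
    return ref, int(pos), alt


def matching_isoforms(aas, isoforms):
    """Return the names of isoforms whose residue at the position equals the reference."""
    ref, pos, _alt = parse_aas(aas)
    return sorted(
        name
        for name, seq in isoforms.items()
        if 0 < pos <= len(seq) and seq[pos - 1] == ref
    )


if __name__ == "__main__":
    # Hypothetical isoforms of one gene; iso2 lacks the arginine at position 3.
    isoforms = {"iso1": "MKRLA", "iso2": "MKLA"}
    print(matching_isoforms("R3H", isoforms))
```

An empty result would flag a record whose position or reference residue does not fit any candidate sequence, which is the kind of case the summary says required manual curation.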
The molecular mechanism of autophagy and its relationship to other lysosomal degradation pathways remain incompletely understood. Here, we identified a previously uncharacterized mammalian-specific protein, Beclin 2, which, like Beclin 1, functions in autophagy and interacts with class III PI3K complex components and Bcl-2. However, Beclin 2, but not Beclin 1, functions in an additional lysosomal degradation pathway. Beclin 2 is required for ligand-induced endolysosomal degradation of several G protein-coupled receptors (GPCRs) through its interaction with GASP1. Beclin 2 homozygous knockout mice have decreased embryonic viability, and heterozygous knockout mice have defective autophagy, increased levels of brain cannabinoid 1 receptor, elevated food intake, and obesity and insulin resistance. Our findings identify Beclin 2 as a novel converging regulator of autophagy and GPCR turnover, and highlight the functional and mechanistic diversity of Beclin family members in autophagy, endolysosomal trafficking and metabolism.
Summary: One approach to infer functions of new proteins from their homologs utilizes visualization of an all-against-all pairwise similarity network (A2ApsN) that exploits the speed of BLAST and avoids the complexity of multiple sequence alignment. However, identifying functions of the protein clusters in A2ApsN is never trivial, due to a lack of linking characterized proteins to their relevant information in current software packages. Given the database errors introduced by automatic annotation transfer, functional deduction should be made from proteins with experimental studies, i.e. ‘reference proteins’. Here, we present a web server, termed Pclust, which provides a user-friendly interface to visualize the A2ApsN, placing emphasis on such ‘reference proteins’ and providing access to their full information in source databases, e.g. articles in PubMed. The identification of ‘reference proteins’ and the ease of cross-database linkage will facilitate understanding the functions of protein clusters in the network, thus promoting interpretation of proteins of interest.
Availability: The Pclust server is freely available at http://prodata.swmed.edu/pclust
Supplementary data are available at Bioinformatics online.
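The idea of an all-against-all pairwise similarity network with highlighted reference proteins can be sketched in a few lines. This is an illustration under stated assumptions, not the Pclust implementation: it assumes hits are given as (query, subject, E-value) tuples, that an edge is drawn when the E-value is at or below a cutoff, and that clusters are simply connected components. The function names, the cutoff, and the protein identifiers are hypothetical.

```python
from collections import defaultdict


def build_clusters(hits, evalue_cutoff=1e-5):
    """Group proteins into connected components of the similarity network."""
    neighbors = defaultdict(set)
    for query, subject, evalue in hits:
        # Register both proteins so singletons still appear as clusters.
        neighbors[query]
        neighbors[subject]
        if query != subject and evalue <= evalue_cutoff:
            neighbors[query].add(subject)
            neighbors[subject].add(query)

    clusters = []
    seen = set()
    for start in sorted(neighbors):
        if start in seen:
            continue
        seen.add(start)
        stack = [start]
        members = []
        while stack:  # depth-first traversal of one component
            node = stack.pop()
            members.append(node)
            for nxt in neighbors[node]:
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append(nxt)
        clusters.append(sorted(members))
    return clusters


def reference_members(clusters, reference_ids):
    """For each cluster, list the members that have experimental studies."""
    return [[p for p in cluster if p in reference_ids] for cluster in clusters]


if __name__ == "__main__":
    hits = [
        ("P1", "P2", 1e-30),
        ("P2", "P3", 1e-10),
        ("P4", "P5", 1e-8),
        ("P3", "P4", 0.5),  # too weak to link the two groups
    ]
    clusters = build_clusters(hits)
    print(clusters)
    print(reference_members(clusters, {"P2", "P5"}))
```

Listing the reference members per cluster mirrors the emphasis described above: function is inferred from experimentally studied proteins rather than from automatically transferred annotations.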
Vibrio parahaemolyticus is a Gram-negative halophilic bacterium and one of the leading causes of food-borne gastroenteritis. Its genome harbors two Type III Secretion Systems (T3SS1 and T3SS2), but only T3SS2 is required for enterotoxicity seen in animal models. Effector proteins secreted from T3SS2 have been previously shown to promote colonization of the intestinal epithelium, invasion of host cells, and destruction of the epithelial monolayer. In this study, we identify VPA1380, a T3SS2 effector protein that is toxic when expressed in yeast. Bioinformatic analyses revealed that VPA1380 is highly similar to the inositol hexakisphosphate (IP6)-inducible cysteine protease domains of several large bacterial toxins. Mutations in conserved catalytic residues and residues in the putative IP6-binding pocket abolished toxicity in yeast. Furthermore, VPA1380 was not toxic in IP6 deficient yeast cells. Therefore, our findings suggest that VPA1380 is a cysteine protease that requires IP6 as an activator.
The WAVE regulatory complex (WRC) controls actin cytoskeletal dynamics throughout the cell by stimulating the actin nucleating activity of the Arp2/3 complex at distinct membrane sites. However, the factors that recruit the WRC to specific locations remain poorly understood. Here we have identified a large family of potential WRC ligands, consisting of ~120 diverse membrane proteins including protocadherins, ROBOs, netrin receptors, Neuroligins, GPCRs and channels. Structural, biochemical and cellular studies reveal that a novel sequence motif that defines these ligands binds to a highly conserved interaction surface of the WRC formed by the Sra and Abi subunits. Mutating this binding surface in flies resulted in defects in actin cytoskeletal organization and egg morphology during oogenesis, leading to female sterility. Our findings directly link diverse membrane proteins to the WRC and actin cytoskeleton, and have broad physiological and pathological ramifications in metazoans.
The fungal genus Stachybotrys produces several diverse toxins that affect human health. Its strains comprise two mutually-exclusive toxin chemotypes, one producing satratoxins, which are a subclass of trichothecenes, and the other producing the less-toxic atranones. To determine the genetic basis for chemotype-specific differences in toxin production, the genomes of four Stachybotrys strains were sequenced and assembled de novo. Two of these strains produce atranones and two produce satratoxins.
Comparative analysis of these four 35-Mbp genomes revealed several chemotype-specific gene clusters that are predicted to make secondary metabolites. The largest, which was named the core atranone cluster, encodes 14 proteins that may suffice to produce all observed atranone compounds via reactions that include an unusual Baeyer-Villiger oxidation. Satratoxins are suggested to be made by products of multiple gene clusters that encode 21 proteins in all, including polyketide synthases, acetyltransferases, and other enzymes expected to modify the trichothecene skeleton. One such satratoxin chemotype-specific cluster is adjacent to the core trichothecene cluster, which has diverged from those of other trichothecene producers to contain a unique polyketide synthase.
The results suggest that chemotype-specific gene clusters are likely the genetic basis for the mutually-exclusive toxin chemotypes of Stachybotrys. A unified biochemical model for Stachybotrys toxin production is presented. Overall, the four genomes described here will be useful for ongoing studies of this mold’s diverse toxicity mechanisms.
Electronic supplementary material
The online version of this article (doi:10.1186/1471-2164-15-590) contains supplementary material, which is available to authorized users.
Stachybotrys; Comparative genomics; Secondary metabolism; Trichothecene biosynthesis; Toxins; Satratoxins; Atranones; Whole-genome sequencing
FMN adenylyltransferase (FMNAT) is an essential enzyme catalyzing the last step of a two-step pathway converting riboflavin (vitamin B2) to FAD, the ubiquitous flavocoenzyme. A structure-based mutagenesis and steady-state kinetic analysis of yeast FMNAT unexpectedly revealed that mutant D181A had a much faster turnover rate than the wild type enzyme. Product inhibition analysis showed that wild type FMNAT is strongly inhibited by FAD, whereas D181A mutant enzyme has an attenuated product inhibition. These results provide a structural basis for the product inhibition of the enzyme and suggest that product release may be the rate-limiting step of the reaction.
CRISPR-Cas adaptive immunity systems of bacteria and archaea insert fragments of virus or plasmid DNA as spacer sequences into CRISPR repeat loci. Processed transcripts encompassing these spacers guide the cleavage of the cognate foreign DNA or RNA. Most CRISPR-Cas loci, in addition to recognized cas genes, also include genes that are not directly implicated in spacer acquisition, CRISPR transcript processing or interference. Here we comprehensively analyze sequences, structures and genomic neighborhoods of one of the most widespread groups of such genes that encode proteins containing a predicted nucleotide-binding domain with a Rossmann-like fold, which we denote CARF (CRISPR-associated Rossmann fold). Several CARF protein structures have been determined but functional characterization of these proteins is lacking. The CARF domain is most frequently combined with a C-terminal winged helix-turn-helix DNA-binding domain and “effector” domains most of which are predicted to possess DNase or RNase activity. Divergent CARF domains are also found in RtcR proteins, sigma-54 dependent regulators of the rtc RNA repair operon. CARF genes frequently co-occur with those coding for proteins containing the WYL domain with the Sm-like SH3 β-barrel fold, which is also predicted to bind ligands. CRISPR-Cas and possibly other defense systems are predicted to be transcriptionally regulated by multiple ligand-binding proteins containing WYL and CARF domains which sense modified nucleotides and nucleotide derivatives generated during virus infection. We hypothesize that CARF domains also transmit the signal from the bound ligand to the fused effector domains which attack either alien or self nucleic acids, resulting, respectively, in immunity complementing the CRISPR-Cas action or in dormancy/programmed cell death.
CRISPR; Rossmann fold; beta barrel; DNA-binding proteins; phage defense
Cell surface growth factor receptors couple environmental cues to the regulation of cytoplasmic homeostatic processes including autophagy, and aberrant activation of such receptors is a common feature of human malignancies. Here, we defined the molecular basis by which the epidermal growth factor receptor (EGFR) tyrosine kinase regulates autophagy. Active EGFR binds to the autophagy protein Beclin 1, leading to its multisite tyrosine phosphorylation, enhanced binding to inhibitors, and decreased Beclin 1-associated Class III phosphatidylinositol-3 kinase activity. EGFR tyrosine kinase inhibitor (TKI) therapy disrupts Beclin 1 tyrosine phosphorylation and binding to its inhibitors, and restores autophagy in non-small cell lung carcinoma (NSCLC) cells with a TKI-sensitive EGFR mutation. In NSCLC tumor xenografts, the expression of a tyrosine phosphomimetic Beclin 1 mutant leads to reduced autophagy, enhanced tumor growth, tumor dedifferentiation, and resistance to TKI therapy. Thus, oncogenic receptor tyrosine kinases directly regulate the core autophagy machinery, which may contribute to tumor progression and chemoresistance.
Zinc fingers are small protein domains in which zinc plays a structural role contributing to the stability of the domain. Zinc fingers are structurally diverse and are present among proteins that perform a broad range of functions in various cellular processes, such as replication and repair, transcription and translation, metabolism and signaling, cell proliferation and apoptosis. Zinc fingers typically function as interaction modules and bind to a wide variety of compounds, such as nucleic acids, proteins and small molecules. Here we present a comprehensive classification of zinc finger spatial structures. We find that each available zinc finger structure can be placed into one of eight fold groups that we define based on the structural properties in the vicinity of the zinc-binding site. Three of these fold groups comprise the majority of zinc fingers, namely, C2H2-like finger, treble clef finger and the zinc ribbon. Evolutionary relatedness of proteins within fold groups is not implied, but each group is divided into families of potential homologs. We compare our classification to existing groupings of zinc fingers and find that we define more encompassing fold groups, which bring together proteins whose similarities have previously remained unappreciated. We analyze functional properties of different zinc fingers and overlay them onto our classification. The classification helps in understanding the relationship between the structure, function and evolutionary history of these domains. The results are available as an online database of zinc finger structures.
Motivation: The structures of homologous proteins are generally better conserved than their sequences. This phenomenon is demonstrated by the prevalence of structurally conserved regions (SCRs) even in highly divergent protein families. Defining SCRs requires the comparison of two or more homologous structures and is affected by their availability and divergence, and our ability to deduce structurally equivalent positions among them. In the absence of multiple homologous structures, it is necessary to predict SCRs of a protein using information from only a set of homologous sequences and (if available) a single structure. Accurate SCR predictions can benefit homology modelling and sequence alignment.
Results: Using pairwise DaliLite alignments among a set of homologous structures, we devised a simple measure of structural conservation, termed structural conservation index (SCI). SCI was used to distinguish SCRs from non-SCRs. A database of SCRs was compiled from 386 SCOP superfamilies containing 6489 protein domains. Artificial neural networks were then trained to predict SCRs with various features deduced from a single structure and homologous sequences. Assessment of the predictions via a 5-fold cross-validation method revealed that predictions based on features derived from a single structure perform similarly to ones based on homologous sequences, while combining sequence and structural features was optimal in terms of accuracy (0.755) and Matthews correlation coefficient (0.476). These results suggest that even without information from multiple structures, it is still possible to effectively predict SCRs for a protein. Finally, inspection of the structures with the worst predictions pinpoints difficulties in SCR definitions.
Availability: The SCR database and the prediction server can be found at http://prodata.swmed.edu/SCR.
Supplementary data are available at Bioinformatics online.
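The two performance measures reported in the Results, accuracy and the Matthews correlation coefficient (MCC), are both functions of the binary confusion matrix (SCR versus non-SCR positions). The sketch below uses the standard definitions of both measures; the function names and the example counts are hypothetical and are not the study's data.

```python
import math


def accuracy(tp, tn, fp, fn):
    """Fraction of positions classified correctly."""
    total = tp + tn + fp + fn
    if total == 0:
        raise ValueError("empty confusion matrix")
    return (tp + tn) / total


def matthews_corrcoef(tp, tn, fp, fn):
    """MCC = (TP*TN - FP*FN) / sqrt((TP+FP)(TP+FN)(TN+FP)(TN+FN)).

    Returns 0.0 when the denominator is zero, a common convention.
    """
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denominator == 0:
        return 0.0
    return (tp * tn - fp * fn) / denominator


if __name__ == "__main__":
    # Hypothetical counts: 40 true SCR, 30 true non-SCR, 20 false SCR, 10 missed SCR.
    counts = dict(tp=40, tn=30, fp=20, fn=10)
    print(f"accuracy = {accuracy(**counts):.3f}")
    print(f"MCC      = {matthews_corrcoef(**counts):.3f}")
```

MCC is informative alongside accuracy because it stays near zero for a predictor that labels everything as the majority class, whereas accuracy can still look high when the classes are unbalanced.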
Tuberculosis, caused by Mycobacterium tuberculosis, remains a devastating human infectious disease, causing two million deaths annually. We previously demonstrated that M. tuberculosis induces an enzyme, heme oxygenase (HO1), that produces carbon monoxide (CO) gas and that M. tuberculosis adapts its transcriptome during CO exposure. We now demonstrate that M. tuberculosis carries a novel resistance gene to combat CO toxicity. We screened an M. tuberculosis transposon library for CO-susceptible mutants and found that disruption of Rv1829 (carbon monoxide resistance, Cor) leads to marked CO sensitivity. Heterologous expression of Cor in Escherichia coli rescued it from CO toxicity. Importantly, the virulence of the cor mutant is attenuated in a mouse model of tuberculosis. Thus, Cor is necessary and sufficient to protect bacteria from host-derived CO. Taken together, this represents the first report of a role for HO1-derived CO in controlling infection of an intracellular pathogen and the first identification of a CO resistance gene in a pathogenic organism.
Macrophages produce a variety of antimicrobial molecules, including nitric oxide (NO), hydrogen peroxide (H2O2), and acid (H+), that serve to kill engulfed bacteria. In addition to these molecules, human and mouse macrophages also produce carbon monoxide (CO) gas by the heme oxygenase (HO1) enzyme. We observed that, in contrast to other bacteria, mycobacteria are resistant to CO, suggesting that this might be an evolutionary adaptation of mycobacteria for survival within macrophages. We screened a panel of ~2,500 M. tuberculosis mutants to determine which genes are required for survival of M. tuberculosis in the presence of CO. Within this panel, we identified one such gene, cor, that specifically confers CO resistance. Importantly, we found that the ability of M. tuberculosis cells carrying a mutated copy of this gene to cause tuberculosis in a mouse disease model is significantly attenuated. This indicates that CO resistance is essential for mycobacterial survival in vivo.
Krüppel-like factors (KLF) and specificity proteins (SP) constitute a family of zinc-finger-containing transcription factors that play important roles in a wide range of processes including differentiation and development of various tissues. The human genome possesses 17 KLF genes (KLF1–KLF17) and nine SP genes (SP1–SP9) with diverse functions. We used sequence similarity searches and gene synteny analysis to identify a new putative KLF gene/pseudogene named KLF18 that is present in most of the placental mammals with sequenced genomes. KLF18 is a chromosomal neighbor of the KLF17 gene and is likely a product of its duplication. Phylogenetic analyses revealed that mammalian predicted KLF18 proteins and KLF17 proteins experienced elevated rates of evolution and are grouped with KLF1/KLF2/KLF4 and non-mammalian KLF17. Predicted KLF18 proteins maintain conserved features in the zinc fingers of the SP/KLF family, while possessing repeats of a unique sequence motif in their N-terminal regions. No expression data have been reported for KLF18, suggesting that it either has highly restricted expression patterns and specialized functions, or could have become a pseudogene in extant placental mammals. Besides KLF18 genes/pseudogenes, we identified several KLF18-like genes such as Zfp352, Zfp352-like, and Zfp353 in the genomes of mouse and rat. These KLF18-like genes do not possess introns inside their coding regions, and gene expression data indicate that some of them may function in early embryonic development. They represent further expansions of KLF members in the murine lineage, most likely resulting from several events of retrotransposition and local gene duplication starting from an ancient spliced mRNA of KLF18.
The lysosomal degradation pathway of autophagy has a crucial role in defence against infection, neurodegenerative disorders, cancer and ageing. Accordingly, agents that induce autophagy may have broad therapeutic applications. One approach to developing such agents is to exploit autophagy manipulation strategies used by microbial virulence factors. Here we show that a peptide, Tat–beclin 1—derived from a region of the autophagy protein, beclin 1, which binds human immunodeficiency virus (HIV)-1 Nef—is a potent inducer of autophagy, and interacts with a newly identified negative regulator of autophagy, GAPR-1 (also called GLIPR2). Tat–beclin 1 decreases the accumulation of polyglutamine expansion protein aggregates and the replication of several pathogens (including HIV-1) in vitro, and reduces mortality in mice infected with chikungunya or West Nile virus. Thus, through the characterization of a domain of beclin 1 that interacts with HIV-1 Nef, we have developed an autophagy-inducing peptide that has potential efficacy in the treatment of human diseases.
Candidatus Liberibacter asiaticus (Ca. L. asiaticus) is a Gram-negative bacterium and the pathogen of Citrus Greening disease (Huanglongbing, HLB). As a parasitic bacterium, Ca. L. asiaticus harbors ABC transporters that play important roles in exchanging chemical compounds between Ca. L. asiaticus and its host. Here, we analyzed all the ABC transporter-related proteins in Ca. L. asiaticus. We identified 14 ABC transporter systems and predicted their structures and substrate specificities. In-depth sequence and structure analysis including multiple sequence alignment, phylogenetic tree reconstruction, and structure comparison further supports their function predictions. Our study shows that this bacterium could use these ABC transporters to import metabolites (amino acids and phosphates) and enzyme cofactors (choline, thiamine, iron, manganese, and zinc), resist organic solvents, heavy metals, and lipid-like drugs, maintain the composition of the outer membrane (OM), and secrete virulence factors. Although the features of most ABC systems could be deduced from the abundant experimental data on their orthologs, we report several novel observations within ABC system proteins. Moreover, we identified seven nontransport ABC systems that are likely involved in virulence gene expression regulation, transposon excision regulation, and DNA repair. Our analysis reveals several candidates for further studies to understand and control the disease, including the type I virulence factor secretion system and its substrate that are likely related to Ca. L. asiaticus pathogenicity and the ABC transporter systems responsible for bacterial OM biosynthesis that are good drug targets.
genomic annotation; function prediction; ATPase; transmembrane protein; multiple sequence alignment; phylogenetic tree; protein homology; structure comparison
Protein phosphorylation is a fundamental mechanism regulating nearly every aspect of cellular life. Several secreted proteins are phosphorylated, but the kinases responsible are unknown. We identified a family of atypical protein kinases that localize within the Golgi apparatus and are secreted. Fam20C appears to be the Golgi casein kinase that phosphorylates secretory pathway proteins within S-x-E motifs. Fam20C phosphorylates the caseins and several secreted proteins implicated in biomineralization, including the small integrin-binding ligand, N-linked glycoproteins (SIBLINGs). Consequently, mutations in Fam20C cause an osteosclerotic bone dysplasia in humans known as Raine syndrome. Fam20C is thus a protein kinase dedicated to the phosphorylation of extracellular proteins.
A daring experiment has been performed. Using sequence alignments to predict contacts between residues in protein spatial structures, Hopf et al. (2012) have published untested de novo structure models for 11 transmembrane protein families. Will their models stand the test of time and hold up to experimentation? The prospects are excellent.
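The idea of reading residue contacts out of an alignment can be illustrated with a deliberately simple statistic. The sketch below ranks pairs of alignment columns by mutual information; this is a swapped-in, simpler technique chosen for illustration and is not the method of Hopf et al. (2012). The function names and the toy alignment are hypothetical.

```python
import math
from collections import Counter
from itertools import combinations


def column_mutual_information(alignment, i, j):
    """Mutual information (in nats) between alignment columns i and j."""
    n = len(alignment)
    pair_counts = Counter((seq[i], seq[j]) for seq in alignment)
    i_counts = Counter(seq[i] for seq in alignment)
    j_counts = Counter(seq[j] for seq in alignment)
    mi = 0.0
    for (a, b), count in pair_counts.items():
        p_ab = count / n
        p_a = i_counts[a] / n
        p_b = j_counts[b] / n
        mi += p_ab * math.log(p_ab / (p_a * p_b))
    return mi


def rank_column_pairs(alignment):
    """Return ((i, j), score) tuples sorted from most to least covarying."""
    length = len(alignment[0])
    scored = [
        ((i, j), column_mutual_information(alignment, i, j))
        for i, j in combinations(range(length), 2)
    ]
    return sorted(scored, key=lambda item: item[1], reverse=True)


# Hypothetical toy alignment: columns 0 and 1 change together.
toy_alignment = ["AKLE", "AKLD", "GRLE", "GRLD"]
for pair, score in rank_column_pairs(toy_alignment)[:3]:
    print(pair, round(score, 3))
```

High-scoring column pairs are candidates for spatial proximity, but a pairwise statistic like this one cannot separate direct from indirect couplings, so it should be read only as an illustration of the covariation signal.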
Recently, the nature of protein structure space has been widely discussed in the literature. The traditional discrete view of the protein universe as a set of separate folds has been criticized in the light of growing evidence that almost any arrangement of secondary structures is possible and that the whole protein space can be traversed through a path of similar structures. Here we argue that the discrete and continuous descriptions are not mutually exclusive, but complementary: the space is largely discrete in an evolutionary sense, but continuous geometrically when purely structural similarities are quantified. Evolutionary connections are mainly confined to separate structural prototypes corresponding to folds as islands of structural stability, with few remaining traceable links between the islands. However, for a geometric similarity measure, it is usually possible to find a reasonable cutoff that yields paths connecting any two structures through intermediates.
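The dependence of connectivity on the similarity cutoff can be sketched as a graph search: structures are nodes, and two structures are linked when their similarity score meets the cutoff. The example below is a minimal illustration under stated assumptions; the function name `find_path`, the structure labels, and the scores are hypothetical and do not come from any real structure comparison.

```python
from collections import deque


def find_path(scores, start, goal, cutoff):
    """Breadth-first search linking pairs whose similarity >= cutoff.

    Returns a shortest list of structures from start to goal, or None.
    """
    neighbors = {}
    for (a, b), score in scores.items():
        neighbors.setdefault(a, set())
        neighbors.setdefault(b, set())
        if score >= cutoff:
            neighbors[a].add(b)
            neighbors[b].add(a)
    if start not in neighbors or goal not in neighbors:
        return None

    previous = {start: None}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if node == goal:
            path = []
            while node is not None:
                path.append(node)
                node = previous[node]
            return path[::-1]
        for nxt in sorted(neighbors[node]):
            if nxt not in previous:
                previous[nxt] = node
                queue.append(nxt)
    return None


# Hypothetical pairwise similarity scores between four toy structures.
toy_scores = {
    ("A", "B"): 0.80, ("B", "C"): 0.70, ("C", "D"): 0.75,
    ("A", "C"): 0.30, ("A", "D"): 0.20, ("B", "D"): 0.25,
}
print(find_path(toy_scores, "A", "D", cutoff=0.60))
print(find_path(toy_scores, "A", "D", cutoff=0.78))
```

With the looser cutoff, A reaches D only through the intermediates B and C; with the stricter cutoff the structures fall apart into separate islands, which mirrors the discrete and continuous views contrasted above.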
We describe predictions made using the Rosetta structure prediction methodology for the Eighth Critical Assessment of Techniques for Protein Structure Prediction. Aggressive sampling and all-atom refinement were carried out for nearly all targets. A combination of alignment methodologies was used to generate starting models from a range of templates, and the models were then subjected to Rosetta all-atom refinement. For 50 targets with readily identified templates, the best submitted model was better than the best alignment to the best template in the Protein Data Bank for 24 domains, and improved over the best starting model for 43 domains. For 13 targets where only very distant sequence relationships to proteins of known structure were detected, models were generated using the Rosetta de novo structure prediction methodology followed by all-atom refinement; in several cases the submitted models were better than those based on the available templates. Of the 12 refinement challenges, the best submitted model improved on the starting model in 7 cases. These improvements over the starting template-based models and refinement tests demonstrate the power of Rosetta structure refinement in improving model accuracy.
A Rho GTPase inactivation domain (RID) has been discovered in the multifunctional, autoprocessing RTX toxin RtxA from Vibrio cholerae. The RID domain causes actin depolymerization and rounding of host cells through inactivation of the small Rho GTPases Rho, Rac and Cdc42. With only a few toxin proteins containing RID domains in the current sequence database, the structure and molecular mechanisms of this domain are unknown. Using comparative sequence and structural analyses, we report homology inference, fold recognition, and active site prediction for RID domains. Remote homologs of RID domains were identified in two other experimentally characterized bacterial virulence factors, IcsB of Shigella flexneri and BopA of Burkholderia pseudomallei, as well as in a group of uncharacterized bacterial membrane proteins. IcsB plays an important role in helping Shigella to evade the host autophagy defense system. RID domain homologs share a conserved dyad of cysteine and histidine residues, and are predicted to adopt a circularly permuted papain-like thiol protease fold. RID domains from MARTX toxins and virulence factors IcsB and BopA thus could function as proteases or acyltransferases acting on host molecules. Our results provide structural and mechanistic insights into several important proteins functioning in bacterial pathogenesis.
Rho GTPase inactivation; cysteine protease domain; papain-like fold; multifunctional; autoprocessing RTX toxins; Shigella virulence factor IcsB; structure prediction; homology inference
Cellular iron homeostasis is maintained by the coordinate posttranscriptional regulation of genes responsible for iron uptake, release, use, and storage through the actions of the iron regulatory proteins IRP1 and IRP2. However, the manner in which iron levels are sensed to affect IRP2 activity is poorly understood. We found that an E3 ubiquitin ligase complex containing the FBXL5 protein targets IRP2 for proteasomal degradation. The stability of FBXL5 itself was regulated: the protein accumulated under iron- and oxygen-replete conditions and was degraded upon iron depletion. FBXL5 contains an iron- and oxygen-binding hemerythrin domain that acted as a ligand-dependent regulatory switch mediating FBXL5's differential stability. These observations suggest a mechanistic link between iron sensing via the FBXL5 hemerythrin domain, IRP2 regulation, and cellular responses to maintain mammalian iron homeostasis.