Group B Streptococcus invades human amniotic epithelial cells using a hemolytic pigment.
Microbial infection of the amniotic fluid is a significant cause of fetal injury, preterm birth, and newborn infections. Group B Streptococcus (GBS) is an important human bacterial pathogen associated with preterm birth, fetal injury, and neonatal mortality. Although GBS has been isolated from amniotic fluid of women in preterm labor, mechanisms of in utero infection remain unknown. Previous studies indicated that GBS are unable to invade human amniotic epithelial cells (hAECs), which represent the last barrier to the amniotic cavity and fetus. We show that GBS invades hAECs and strains lacking the hemolysin repressor CovR/S accelerate amniotic barrier failure and penetrate chorioamniotic membranes in a hemolysin-dependent manner. Clinical GBS isolates obtained from women in preterm labor are hyperhemolytic and some are associated with covR/S mutations. We demonstrate for the first time that hemolytic and cytolytic activity of GBS is due to the ornithine rhamnolipid pigment and not due to a pore-forming protein toxin. Our studies emphasize the importance of the hemolytic GBS pigment in ascending infection and fetal injury.
Summary: Gibberellic acids (GAs) are key plant hormones, regulating various aspects of growth and development, which have been at the center of the ‘green revolution’. GRAS family proteins, the primary players in GA signaling pathways, remain poorly understood. Using sequence-profile searches, structural comparisons and phylogenetic analysis, we establish that the GRAS family first emerged in bacteria and belongs to the Rossmann fold methyltransferase superfamily. All bacterial and a subset of plant GRAS proteins are likely to function as small-molecule methylases. The remaining plant versions have lost one or more AdoMet (SAM)-binding residues while preserving their substrate-binding residues. We predict that GRAS proteins might either modify or bind small molecules such as GAs or their derivatives.
Supplementary Material for this article is available at Bioinformatics online.
We provide a portrait of the bacterial transcription apparatus in light of the data emerging from structural studies, sequence analysis and comparative genomics to bring out important but underappreciated features. We first describe the key structural highlights and evolutionary implications emerging from comparison of the cellular RNA polymerase subunits with the RNA-dependent RNA polymerase involved in RNAi in eukaryotes and their homologs from newly identified bacterial selfish elements. We describe some previously unnoticed domains and the possible evolutionary stages leading to the RNA polymerases of extant life forms. We then present the case for the ancient orthology of the basal transcription factors, the sigma factor and TFIIB, in the bacterial and the archaeo-eukaryotic lineages. We also present a synopsis of the structural and architectural taxonomy of specific transcription factors and their genome-scale demography. In this context, we present certain notable deviations from the otherwise invariant proteome-wide trends in transcription factor distribution and use them to predict the presence of an unusual lineage-specifically expanded signaling system in certain firmicutes like Paenibacillus. We then discuss the intersection between functional properties of transcription factors and the organization of transcriptional networks. Finally, we present some of the interesting evolutionary conundrums posed by our newly gained understanding of the bacterial transcription apparatus and potential areas for future explorations.
RNA polymerase; Beta barrel; Two component system; Activators; Transcription factors; Mobile elements; ATPases
Every genome contains a large number of uncharacterized proteins that may encode entirely novel biological systems. Many of these uncharacterized proteins fall into related sequence families. By applying sequence and structural analysis we hope to provide insight into novel biology.
We analyze a previously uncharacterized Pfam protein family called DUF4424 [Pfam:PF14415]. The recently solved three-dimensional structure of the protein lpg2210 from Legionella pneumophila provides the first structural information pertaining to this family. This protein additionally includes the first representative structure of another Pfam family called the YARHG domain [Pfam:PF13308]. The Pfam family DUF4424 adopts a 19-stranded beta-sandwich fold that shows similarity to the N-terminal domain of leukotriene A-4 hydrolase. The YARHG domain forms an all-helical domain at the C-terminus. Structure analysis allows us to recognize distant similarities between the DUF4424 domain and individual domains of M1 aminopeptidases and tricorn proteases, which form massive proteasome-like capsids in both archaea and bacteria.
Based on our analyses we hypothesize that the DUF4424 domain may have a role in forming large, multi-component enzyme complexes. We suggest that the YARHG domain may play a role in binding a moiety in proximity with peptidoglycan, such as a hydrophobic outer membrane lipid or lipopolysaccharide.
Domain of unknown function; Protein family; Protein structure; DUF4424; YARHG domain; Sequence analysis
The bacterial SOS response is an elaborate program for DNA repair, cell cycle regulation and adaptive mutagenesis under stress conditions. Using sensitive sequence and structure analysis, combined with contextual information derived from comparative genomics and domain architectures, we identify two novel domain superfamilies in the SOS response system. We present evidence that one of these, the SOS response associated peptidase (SRAP; Pfam: DUF159) is a novel thiol autopeptidase. Given the involvement of other autopeptidases, such as LexA and UmuD, in the SOS response, this finding suggests that multiple structurally unrelated peptidases have been recruited to this process. The second of these, the ImuB-C superfamily, is linked to the Y-family DNA polymerase-related domain in ImuB, and also occurs as a standalone protein. We present evidence using gene neighborhood analysis that both these domains function with different mutagenic polymerases in bacteria, such as Pol IV (DinB), Pol V (UmuCD) and ImuA-ImuB-DnaE2 and also other repair systems, which either deploy Ku and an ATP-dependent ligase or a SplB-like radical SAM photolyase. We suggest that the SRAP superfamily domain functions as a DNA-associated autoproteolytic switch that recruits diverse repair enzymes upon DNA damage, whereas the ImuB-C domain performs a similar function albeit in a non-catalytic fashion. We propose that C3Orf37, the eukaryotic member of the SRAP superfamily, which has been recently shown to specifically bind DNA with 5-hydroxymethylcytosine, 5-formylcytosine and 5-carboxycytosine, is a sensor for these oxidized bases generated by the TET enzymes from methylcytosine. Hence, its autoproteolytic activity might help it act as a switch that recruits DNA repair enzymes to remove these oxidized methylcytosine species as part of the DNA demethylation pathway downstream of the TET enzymes.
This article was reviewed by RDS, RF and GJ.
Discovery of the TET/JBP family of dioxygenases that modify bases in DNA has sparked considerable interest in novel DNA base modifications and their biological roles. Using sensitive sequence and structure analyses combined with contextual information from comparative genomics, we computationally characterize over 12 novel biochemical systems for DNA modifications. We predict previously unidentified enzymes, such as the kinetoplastid J-base generating glycosyltransferase (and its homolog GREB1), the catalytic specificity of bacteriophage TET/JBP proteins and their role in complex DNA base modifications. We also predict the enzymes involved in synthesis of hypermodified bases such as alpha-glutamylthymine and alpha-putrescinylthymine that have remained enigmatic for several decades. Moreover, the current analysis suggests that bacteriophages and certain nucleo-cytoplasmic large DNA viruses contain an unexpectedly diverse range of DNA modification systems, in addition to those using previously characterized enzymes such as Dam, Dcm, TET/JBP, pyrimidine hydroxymethylases, Mom and glycosyltransferases. These include enzymes generating modified bases such as deazaguanines related to queuine and archaeosine, pyrimidines comparable with lysidine, those derived using modified S-adenosyl methionine derivatives and those using TET/JBP-generated hydroxymethyl pyrimidines as biosynthetic starting points. We present evidence that some of these modification systems are also widely dispersed across prokaryotes and certain eukaryotes such as basidiomycetes and chlorophyte and stramenopile algae, where they could serve as novel epigenetic marks for regulation or discrimination of self from non-self DNA. Our study extends the role of the PUA-like fold domains in recognition of modified nucleic acids and predicts versions of the ASCH and EVE domains to be novel ‘readers’ of modified bases in DNA.
These results open opportunities for the investigation of the biology of these systems and their use in biotechnology.
The major role of enzymatic toxins that target nucleic acids in biological conflicts at all levels has become increasingly apparent, thanks in large part to the advances of comparative genomics. Typically, toxins evolve rapidly, which hampers the identification of these proteins by sequence analysis. Here we analyze an unexpectedly widespread superfamily of toxin domains, most of which possess RNase activity.
The HEPN superfamily is comprised of all α-helical domains that were first identified as being associated with DNA polymerase β-type nucleotidyltransferases in prokaryotes and animal Sacsin proteins. Using sensitive sequence and structure comparison methods, we vastly extend the HEPN superfamily by identifying numerous novel families and by detecting diverged HEPN domains in several known protein families. The new HEPN families include the RNase LS and LsoA catalytic domains, KEN domains (e.g. RNaseL and Ire1) and the RNase domains of RloC and PrrC. The majority of HEPN domains contain conserved motifs that constitute a metal-independent endoRNase active site. Some HEPN domains lacking this motif probably function as non-catalytic RNA-binding domains, such as in the case of the mannitol repressor MtlR. Our analysis shows that HEPN domains function as toxins that are shared by numerous systems implicated in intra-genomic, inter-genomic and intra-organismal conflicts across the three domains of cellular life. In prokaryotes HEPN domains are essential components of numerous toxin-antitoxin (TA) and abortive infection (Abi) systems and in addition are tightly associated with many restriction-modification (R-M) and CRISPR-Cas systems, and occasionally with other defense systems such as Pgl and Ter. We present evidence of multiple modes of action of HEPN domains in these systems, which include direct attack on viral RNAs (e.g. LsoA and RNase LS) in conjunction with other RNase domains (e.g. a novel RNase H fold domain, NamA), suicidal or dormancy-inducing attack on self RNAs (RM systems and possibly CRISPR-Cas systems), and suicidal attack coupled with direct interaction with phage components (Abi systems). These findings are compatible with the hypothesis on coupling of pathogen-targeting (immunity) and self-directed (programmed cell death and dormancy induction) responses in the evolution of robust antiviral strategies. 
We propose that altruistic cell suicide mediated by HEPN domains and other functionally similar RNases was essential for the evolution of kin and group selection and cell cooperation. HEPN domains were repeatedly acquired by eukaryotes and incorporated into several core functions such as endonucleolytic processing of the 5.8S-25S/28S rRNA precursor (Las1), a novel ER membrane-associated RNA degradation system (C6orf70), and sensing of unprocessed transcripts at the nuclear periphery (Swt1). Multiple lines of evidence suggest that, similar to prokaryotes, HEPN proteins were recruited to antiviral, antitransposon, apoptotic systems or RNA-level response to unfolded proteins (Sacsin and KEN domains) in several groups of eukaryotes.
Extensive sequence and structure comparisons reveal unexpectedly broad presence of the HEPN domain in an enormous variety of defense and stress response systems across the tree of life. In addition, HEPN domains have been recruited to perform essential functions, in particular in eukaryotic rRNA processing. These findings are expected to stimulate experiments that could shed light on diverse cellular processes across the three domains of life.
This article was reviewed by Martijn Huynen, Igor Zhulin and Nick Grishin.
The PIWI module, found in the PIWI/AGO superfamily of proteins, is a critical component of several cellular pathways including germline maintenance, chromatin organization, regulation of splicing, RNA interference, and virus suppression. It binds a guide strand which helps it target complementary nucleic strands.
Here we report the discovery of two divergent, novel families of PIWI modules, the first such to be described since the initial discovery of the PIWI/AGO superfamily over a decade ago. Both families display conservation patterns consistent with the binding of oligonucleotide guide strands. The first family is bacterial in distribution and is typically encoded by a distinctive three-gene operon alongside genes for a restriction endonuclease fold enzyme and a helicase of the DinG family. The second family is found only in eukaryotes. It is the core conserved module of the Med13 protein, a subunit of the CDK8 subcomplex of the transcription regulatory Mediator complex.
Based on the presence of the DinG family helicase, which specifically acts on R-loops, we infer that the first family of PIWI modules is part of a novel RNA-dependent restriction system which could target invasive DNA from phages, plasmids or conjugative transposons. It is predicted to facilitate restriction of actively transcribed invading DNA by utilizing RNA guides. The PIWI family found in the eukaryotic Med13 proteins throws new light on the regulatory switch through which the CDK8 subcomplex modulates transcription at Mediator-bound promoters of highly transcribed genes. We propose that this involves recognition of small RNAs by the PIWI module in Med13 resulting in a conformational switch that propagates through the Mediator complex.
This article was reviewed by Sandor Pongor, Frank Eisenhaber and Balaji Santhanam.
Complex regulatory networks orchestrate most cellular processes in biological systems. Genes in such networks are subject to expression noise, resulting in isogenic cell populations exhibiting cell-to-cell variation in protein levels. Increasing evidence suggests that cells have evolved regulatory strategies to limit, tolerate, or amplify expression noise. In this context, fundamental questions arise: how can the architecture of gene regulatory networks generate, make use of, or be constrained by expression noise? Here, we discuss the interplay between expression noise and gene regulatory network at different levels of organization, ranging from a single regulatory interaction to entire regulatory networks. We then consider how this interplay impacts a variety of phenomena such as pathogenicity, disease, adaptation to changing environments, differential cell-fate outcome and incomplete or partial penetrance effects. Finally, we highlight recent technological developments that permit measurements at the single-cell level, and discuss directions for future research.
expression noise; gene regulatory network; persistence; phenotypic variation; single-cell analysis; differentiation and development
Memory B cells are generated during an individual's first encounter with a foreign antigen and respond to re-encounter with the same antigen through cell surface immunoglobulin G (IgG) B cell receptors (BCRs) resulting in rapid, high-titered IgG antibody responses. Despite a central role for IgG BCRs in B cell memory, our understanding of the molecular mechanism by which IgG BCRs enhance antibody responses is incomplete. Here, we showed that the conserved cytoplasmic tail of the IgG BCR, which contains a putative PDZ-binding motif, associated with synapse-associated protein 97 (SAP97), a member of the PDZ domain–containing, membrane-associated guanylate-kinase family of scaffolding molecules that play key roles in controlling receptor density and signal strength at neuronal synapses. We showed that SAP97 accumulated and bound to IgG BCRs in the immune synapses that formed in response to engagement of the B cell with antigen. Knocking down SAP97 in IgG-expressing B cells or mutating the putative PDZ-binding motif in the tail impaired immune synapse formation, the initiation of IgG BCR signaling, and downstream activation of p38 mitogen-activated protein kinase. Thus, heightened B cell memory responses are encoded, in part, by a mechanism that involves SAP97 serving as a scaffolding protein in the IgG BCR immune synapse.
Chromatin dynamics play a central role in maintaining genome integrity, but how this is achieved remains largely unknown. Here, we report that microrchidia CW-type zinc finger 2 (MORC2), an uncharacterized protein with a derived PHD finger domain and a conserved GHKL-type ATPase module, is a physiological substrate of p21-activated kinase 1 (PAK1), an important integrator of extracellular signals and nuclear processes. Following DNA damage, MORC2 is phosphorylated on serine 739 in a PAK1-dependent manner, and phosphorylated MORC2 regulates its DNA-dependent ATPase activity to facilitate chromatin remodeling. Moreover, MORC2 associates with chromatin and promotes gamma-H2AX induction in a PAK1 phosphorylation-dependent manner. Consequently, cells expressing the MORC2-S739A mutant displayed a reduction in DNA repair efficiency and were hypersensitive to DNA-damaging agents. These findings suggest that the PAK1-MORC2 axis is critical for orchestrating the interplay between chromatin dynamics and the maintenance of genomic integrity through sequentially integrating multiple essential enzymatic processes.
Chromatin remodeling; DNA damage response; Genomic stability; Modifier of radiosensitivity; MORC2
The tripartite DENN module, comprising an N-terminal longin domain followed by DENN and d-DENN domains, is a GDP-GTP exchange factor (GEF) for Rab GTPases, which are regulators of practically all membrane trafficking events in eukaryotes. Using sequence and structure analysis we identify multiple novel homologs of the DENN module, many of which can be traced back to the ancestral eukaryote. These findings provide unexpected leads regarding key cellular processes such as autophagy, vesicle-vacuole interactions, chromosome segregation, and human disease. Of these, SMCR8, the folliculin interacting protein-1 and 2 (FNIP1 and FNIP2), nitrogen permease regulator 2 (NPR2), and NPR3 are proposed to function in recruiting Rab GTPases during different steps of autophagy, fusion of autophagosomes with the vacuole and regulation of cellular metabolism. Another novel DENN protein identified in this study is C9ORF72; expansions of the hexanucleotide GGGGCC in its first intron have been recently implicated in amyotrophic lateral sclerosis (ALS) and fronto-temporal dementia (FTD). While this mutation is proposed to cause an RNA-level defect, the identification of C9ORF72 as a potential DENN-type GEF raises the possibility that at least part of the pathology might relate to a specific Rab-dependent vesicular trafficking process, as has been observed in the case of some other neurological conditions with similar phenotypes. We present evidence that longin domains, such as those found in the DENN module, are likely to have been ultimately derived from the related domains found in prokaryotic GTPase-activating proteins of MglA-like GTPases. Thus, the origin of the longin domains from this ancient GTPase-interacting domain, concomitant with the radiation of GTPases, especially of the Rab clade, played an important role in the dynamics of eukaryotic intracellular membrane systems.
membrane trafficking; evolution; homology detection; DENN domain; longin domain; C9ORF72; ALS; FTD
The virus-host arms race is a major theater for evolutionary innovation. Archaea and bacteria have evolved diverse, elaborate antivirus defense systems that function on two general principles: i) immune systems that discriminate self DNA from nonself DNA and specifically destroy the foreign, in particular viral, genomes, whereas the host genome is protected, or ii) programmed cell suicide or dormancy induced by infection.
Presentation of the hypothesis
Almost all genomic loci encoding immunity systems such as CRISPR-Cas, restriction-modification and DNA phosphorothioation also encompass suicide genes, in particular those encoding known and predicted toxin nucleases, which do not appear to be directly involved in immunity. In contrast, the immunity systems do not appear to encode antitoxins found in typical toxin-antitoxin systems. This raises the possibility that components of the immunity system themselves act as reversible inhibitors of the associated toxin proteins or domains as has been demonstrated for the Escherichia coli anticodon nuclease PrrC that interacts with the PrrI restriction-modification system. We hypothesize that coupling of diverse immunity and suicide/dormancy systems in prokaryotes evolved under selective pressure to provide robustness to the antivirus response. We further propose that the involvement of suicide/dormancy systems in the coupled antivirus response could take two distinct forms:
1) induction of a dormancy-like state in the infected cell to ‘buy time’ for activation of adaptive immunity; 2) suicide or dormancy as the final recourse to prevent viral spread triggered by the failure of immunity.
Testing the hypothesis
This hypothesis entails many experimentally testable predictions. Specifically, we predict that the Cas2 protein present in all cas operons is an mRNA-cleaving nuclease (interferase) that might be activated at an early stage of virus infection to enable incorporation of virus-specific spacers into the CRISPR locus or to trigger cell suicide when the immune function of CRISPR-Cas systems fails. Similarly, toxin-like activity is predicted for components of numerous other defense loci.
Implications of the hypothesis
The hypothesis implies that antivirus response in prokaryotes involves key decision-making steps at which the cell chooses the path to follow by sensing the course of virus infection.
This article was reviewed by Arcady Mushegian, Etienne Joly and Nick Grishin. For complete reviews, go to the Reviewers’ reports section.
Members of the Arabidopsis LSH1 and Oryza G1 (ALOG) family of proteins have been shown to function as key developmental regulators in land plants. However, their precise mode of action remains unclear. Using sensitive sequence and structure analysis, we show that the ALOG domains are a distinct version of the N-terminal DNA-binding domain shared by the XerC/D-like, protelomerase, topoisomerase-IA, and Flp tyrosine recombinases. ALOG domains are distinguished by the insertion of an additional zinc ribbon into this DNA-binding domain. In particular, we show that the ALOG domain is derived from the XerC/D-like recombinases of a novel class of DIRS-1-like retroposons. Copies of this element, which have been recently inactivated, are present in several marine metazoan lineages, whereas the stramenopile Ectocarpus retains an active copy. Thus, we predict that ALOG domains help establish organ identity and differentiation by binding specific DNA sequences and acting as transcription factors or recruiters of repressive chromatin. They are also found in certain plant defense proteins, where they are predicted to function as DNA sensors. The evolutionary history of the ALOG domain represents a unique instance of a domain, otherwise exclusively found in retroelements, being recruited as a specific transcription factor in the streptophyte lineage of plants. Hence, they add to the growing evidence for derivation of DNA-binding domains of eukaryotic specific TFs from mobile and selfish elements.
DIRS1; Tyrosine recombinase; Plant development; DNA-binding; Retroposon; Transcription factor; Chromatin protein; Plant defense
In addition to their role in motility, eukaryotic cilia serve as a distinct compartment for signal transduction and regulatory sequestration of biomolecules. Recent genetic and biochemical studies have revealed an extraordinary diversity of protein complexes involved in the biogenesis of cilia during each cell cycle. Mutations in components of these complexes are at the heart of human ciliopathies such as Nephronophthisis (NPHP), Meckel-Gruber syndrome (MKS), Bardet-Biedl syndrome (BBS) and Joubert syndrome (JBTS). Despite intense studies, proteins in some of these complexes, such as the NPHP1-4-8 and the MKS complexes, remain poorly understood. Using a combination of computational analyses we studied these complexes to identify novel domains in them which might throw new light on their functions and evolutionary origins. First, we identified both catalytically active and inactive versions of transglutaminase-like (TGL) peptidase domains in key ciliary/centrosomal proteins CC2D2A/MKS6, CC2D2B, CEP76 and CCDC135. These ciliary TGL domains appear to have originated from prokaryotic TGL domains that act as peptidases, either in a prokaryotic protein degradation system with the MoxR AAA+ ATPase, the precursor of eukaryotic dyneins and midasins, or in a peptide-ligase system with an ATP-grasp enzyme comparable to tubulin-modifying TTL proteins. We suggest that active ciliary TGL proteins are part of a cilia-specific peptidase system that might remove tubulin modifications or cleave cilia-localized proteins, while the inactive versions are likely to bind peptides and mediate key interactions during ciliogenesis. Second, we observe a vast radiation of C2 domains, which are key membrane-localization modules, in multiple ciliary proteins, including those from the NPHP1-4-8 and the MKS complexes, such as CC2D2A/MKS6, RPGRIP1, RPGRIP1L, NPHP1, NPHP4, C2CD3, AHI1/Jouberin and CEP76, most of which can be traced back to the last eukaryotic ancestor.
Identification of these TGL and C2 domains aids in the proper reconstruction of the Y-shaped linkers, which are key structures in the transitional zone of cilia, by allowing precise prediction of the multiple membrane-contacting and protein-protein interaction sites in these structures. These findings help decipher key events in the evolutionary separation of the ciliary and nuclear compartments in the course of the emergence of the eukaryotic cell.
ciliogenesis; transglutaminase-like; membrane; tubulin-tyrosine ligase; C2; transition zone; Y-shaped linkers; evolution; origin of eukaryotes; ciliopathy
We briefly review the history of microRNA (miRNA) research and some of the lessons learnt. To provide some insights as to how and why miRNAs came into existence, we consider the evolution of the RNA interference machinery, miRNA genes, and their targets. We highlight the importance of systems biology approaches to integrate miRNAs as an essential subnetwork for modulating gene expression programs. Building accurate computational models that can simulate highly complex cell-specific gene expression patterns in mammals will lead to a better understanding of miRNAs and their targets in physiological and pathological situations. The impact of miRNAs on medicine, either as potential disease predisposing factors, biomarkers or therapeutics, is highly anticipated and has started to reveal itself.
Gene expression; regulatory networks; comparative genomics; untranslated regulatory RNA; RNA interference
Though the heterotrimeric G-protein signaling system is one of the best studied in eukaryotes, its provenance and its prevalence outside of model eukaryotes remain poorly understood. We utilized the wealth of sequence data from recently sequenced eukaryotic genomes to uncover robust G-protein signaling systems in several poorly studied eukaryotic lineages such as the parabasalids, heteroloboseans and stramenopiles. This indicated that the Gα subunit is likely to have separated from the ARF-like GTPases prior to the last eukaryotic common ancestor. We systematically identified the structure and sequence features associated with this divergence and found that most of the neomorphic positions in Gα form a ring of residues centered on the nucleotide binding site, several of which are likely to be critical for interactions with the RGS domain for its GAP function. We also present evidence that in some of the potentially early branching eukaryotic lineages, like Trichomonas, Gα is likely to function independently of the Gβγ subunits. We were able to identify previously unknown Gγ subunits in Naegleria, suggesting that the trimeric version was already present by the time of the divergence of the heteroloboseans from the remaining eukaryotes. Evolution of Gα subunits is dominated by several independent lineage-specific expansions (LSEs). In most of these cases there are concomitant, independent LSEs of RGS proteins along with an extraordinary diversification of their domain architectures. The diversity of RGS domains from Naegleria in particular, which has the largest complement of Gα and RGS proteins for any eukaryote, provides new insights into RGS function and evolution.
We uncovered a new class of soluble ligand receptors of bacterial origin with RGS domains and an extraordinary diversity of membrane-linked, redox-associated, adhesion-dependent and small molecule-induced G-protein signaling networks that evolved in early-branching eukaryotes, independently of parallel systems in animals. Furthermore, this newly characterized diversity of RGS domains helps in defining their ancestral conserved interfaces with Gα and also those interfaces that are prone to extensive lineage-specific diversification and are thereby responsible for selectivity in Gα-RGS interactions. Several mushrooms show LSEs of Gαs but not of RGS proteins, pointing to the probable differentiation of Gαs in conjunction with mating-type diversity. When combined with the characterization of the 7TM receptors (GPCRs), it becomes apparent that, through much of eukaryotic evolution, cells contained both 7TM receptors that acted as GEFs and those that acted as GAPs (with C-terminal RGS domains) for Gαs. Only in some lineages, like animals and stramenopiles, were the 7TM receptors restricted to GEF-only roles, probably due to selection imposed by the rate-constants of the Gαs that underwent lineage-specific expansion in them. In the alveolate lineage the 7TM receptors occur independently of heterotrimeric G-proteins, suggesting the prevalence of G-protein-independent signaling in these organisms.
Proteinaceous toxins are observed across all levels of inter-organismal and intra-genomic conflicts. These include recently discovered prokaryotic polymorphic toxin systems implicated in intra-specific conflicts. They are characterized by a remarkable diversity of C-terminal toxin domains generated by recombination with standalone toxin-coding cassettes. Prior analysis revealed a striking diversity of nuclease and deaminase domains among the toxin modules. We systematically investigated polymorphic toxin systems using comparative genomics, sequence and structure analysis.
Polymorphic toxin systems are distributed across all major bacterial lineages and are delivered by at least eight distinct secretory systems. In addition to type-II, these include type-V, VI, VII (ESX), and the poorly characterized “Photorhabdus virulence cassettes (PVC)”, PrsW-dependent and MuF phage-capsid-like systems. We present evidence that trafficking of these toxins is often accompanied by autoproteolytic processing catalyzed by HINT, ZU5, PrsW, caspase-like, papain-like, and a novel metallopeptidase associated with the PVC system. We identified over 150 distinct toxin domains in these systems. These span an extraordinary catalytic spectrum to include 23 distinct clades of peptidases, numerous previously unrecognized versions of nucleases and deaminases, ADP-ribosyltransferases, ADP ribosyl cyclases, RelA/SpoT-like nucleotidyltransferases, glycosyltransferases and other enzymes predicted to modify lipids and carbohydrates, and a pore-forming toxin domain. Several of these toxin domains are shared with host-directed effectors of pathogenic bacteria. Over 90 families of immunity proteins might neutralize anywhere from one to at least 27 distinct types of toxin domains. In some organisms multiple tandem immunity genes or immunity protein domains are organized into polyimmunity loci or polyimmunity proteins. Gene-neighborhood analysis of polymorphic toxin systems predicts the presence of novel trafficking-related components, and also the organizational logic that allows toxin diversification through recombination. Domain architecture and protein-length analysis revealed that these toxins might be deployed as secreted factors, through directed injection, or via inter-cellular contact facilitated by filamentous structures formed by RHS/YD, filamentous hemagglutinin and other repeats.
Phyletic pattern and life-style analysis indicate that polymorphic toxins and polyimmunity loci participate in cooperative behavior and facultative ‘cheating’ in several ecosystems such as the human oral cavity and soil. Multiple domains from these systems have also been repeatedly transferred to eukaryotes and their viruses, such as the nucleo-cytoplasmic large DNA viruses.
Along with a comprehensive inventory of toxins and immunity proteins, we present several testable predictions regarding active sites and catalytic mechanisms of toxins, their processing and trafficking and their role in intra-specific and inter-specific interactions between bacteria. These systems provide insights regarding the emergence of key systems at different points in eukaryotic evolution, such as ADP ribosylation, interaction of myosin VI with cargo proteins, mediation of apoptosis, hyphal heteroincompatibility, hedgehog signaling, arthropod toxins, cell-cell interaction molecules like teneurins and different signaling messengers.
This article was reviewed by AM, FE and IZ.
Development of malaria parasites within vertebrate erythrocytes requires nutrient uptake at the host cell membrane. The plasmodial surface anion channel (PSAC) mediates this transport and is an antimalarial target, but despite its importance, its molecular basis has been unknown. We now report a parasite gene family responsible for PSAC activity. We performed high-throughput screening to find transport inhibitors specific for distinct lines of the human pathogen P. falciparum. One inhibitor, 800-fold more active against PSAC from the Dd2 line than from HB3 parasites, was used with a genetic cross to map a single parasite locus on chromosome 3. DNA transfection and in vitro selections indicate that PSAC-inhibitor interactions are determined by two clag genes previously assumed to function in cytoadherence. These genes are conserved in plasmodia, exhibit expression switching, and encode an integral protein on the host membrane, as predicted by functional studies. This protein establishes novel ion channel activity on the erythrocyte surface.
Large-scale chemical genetics screens (chemogenomics) in yeast have been widely used to find drug targets, understand the mechanism-of-action of compounds, and unravel the biochemistry of drug resistance. Chemogenomics is based on the comparison of growth of gene deletants in the presence and absence of a chemical substance. Such studies showed that more than 90% of the yeast genes are required for growth in the presence of at least one chemical. Analysis of these data, using computational approaches, has revealed non-trivial features of the natural chemical tolerance systems. As a result, two non-overlapping sets of genes are seen to respectively impart robustness and evolvability in the context of natural chemical resistance. The former is composed of multidrug-resistance genes, whereas the latter comprises genes sharing chemical genetic profiles with many others. Recent publications showing the potential applications of chemogenomics in studying the pharmacological basis of various drugs are discussed, as well as the expansion of chemogenomics to other organisms. Finally, integration of chemogenomics with sensitive sequence analysis and ubiquitination/phosphorylation data led to the discovery of a new conserved domain and important post-translational modification pathways involved in stress resistance.
chemogenomics; yeast; chemical genetics; evolution; multi drug resistance; biochemistry; ubiquitin; phosphorylation
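The growth comparison that underlies chemogenomics, as described in the abstract above, can be illustrated with a minimal sketch. It assumes that a deletant's sensitivity is summarized as the log2 ratio of its growth with and without the chemical, and that a fixed cutoff flags sensitive strains. The function names, the strain labels, the cutoff of -1.0 and the growth values are all hypothetical and are not taken from any published screen.

```python
import math


def sensitivity_scores(growth_treated, growth_control):
    """Return log2(treated / control) growth ratio for each deletion strain.

    Negative scores mean the deletant grows worse in the presence of the
    chemical than in its absence. Both inputs map strain name -> growth
    measurement (arbitrary but positive units).
    """
    scores = {}
    for strain, control in growth_control.items():
        treated = growth_treated[strain]
        scores[strain] = math.log2(treated / control)
    return scores


def sensitive_strains(scores, threshold=-1.0):
    """Return strains whose score is at or below the (assumed) cutoff."""
    return sorted(s for s, v in scores.items() if v <= threshold)


# Hypothetical growth measurements, invented purely for illustration.
control = {"geneA_del": 1.0, "geneB_del": 0.8, "geneC_del": 1.2}
treated = {"geneA_del": 0.25, "geneB_del": 0.8, "geneC_del": 0.9}

scores = sensitivity_scores(treated, control)
hits = sensitive_strains(scores)
print(hits)
```

In this toy input only the first deletant falls below the cutoff. Real screens typically normalize across replicates and pools before scoring, which this sketch leaves out.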
Recent studies have shown that the ubiquitin system had its origins in ancient cofactor/amino acid biosynthesis pathways. Preliminary studies also indicated that conjugation systems for other peptide tags on proteins, such as pupylation, have evolutionary links to cofactor/amino acid biosynthesis pathways. Following up on these observations, we systematically investigated the non-ribosomal amidoligases of the ATP-grasp, glutamine synthetase-like and acetyltransferase folds by classifying the known members and identifying novel versions. We then established their contextual connections using information from domain architectures and conserved gene neighborhoods. This showed remarkable, previously uncharacterized functional links between diverse peptide ligases, several peptidases of unrelated folds and enzymes involved in synthesis of modified amino acids. Using the network of contextual connections we were able to predict numerous novel pathways for peptide synthesis and modification, amine-utilization, secondary metabolite synthesis and potential peptide-tagging systems. One potential peptide-tagging system, which is widely distributed in bacteria, involves an ATP-grasp domain and a glutamine synthetase-like ligase, both of which are circularly permuted, an NTN hydrolase fold peptidase and a novel alpha helical domain. Our analysis also elucidates key steps in the biosynthesis of antibiotics such as friulimicin, butirosin and bacilysin and cell surface structures such as capsular polymers and teichuronopeptides. We also report the discovery of several novel ribosomally synthesized bacterial peptide metabolites that are cyclized via amide and lactone linkages formed by ATP-grasp enzymes. We present an evolutionary scenario for the multiple convergent origins of peptide ligases in various folds and clarify the bacterial origin of eukaryotic peptide-tagging enzymes of the TTL family.
Human ASXL proteins, orthologs of Drosophila Additional sex combs, have been implicated in conjunction with TET2 as a major target for mutations and translocations, leading to a wide range of myeloid leukemias, related myelodysplastic conditions (ASXL1 and ASXL2) and Bohring-Opitz syndrome, a developmental disorder (ASXL1). Using sensitive sequence and structure comparison methods, we show that most animal ASXL proteins contain a novel N-terminal domain that is also found in several other eukaryotic chromatin proteins, diverse restriction endonucleases and DNA glycosylases, the RNA polymerase delta subunit of Gram-positive bacteria and certain bacterial proteins that combine features of the RNA polymerase α-subunit and sigma factors. This domain adopts the winged helix-turn-helix fold (wHTH) and is predicted to bind DNA. Based on its domain architectural contexts, we present evidence that this domain might play an important role both in eukaryotes and bacteria in the recruitment of diverse effector activities, including the deubiquitinase subunit of Polycomb repressive complexes, to DNA, depending on the state of epigenetic modifications such as 5-methylcytosine and its oxidized derivatives. In other eukaryotic chromatin proteins, the wHTH domain is fused to a region with three conserved motifs, which are also found in diverse eukaryotic chromatin proteins, such as the animal BAZ/WAL proteins, plant HB1 and MBD9, yeast Itc1p and Ioc3, RSF1, CECR2 and NURF1. Based on the crystal structure of Ioc3, we establish that these motifs in conjunction with the DDT motif constitute a structural determinant that is central to nucleosomal repositioning by the ISWI clade of SWI2/SNF2 ATPases.
We also show that the central domain of the ASXL proteins (ASXH domain) is conserved outside of animals in fungi and plants, where it is combined with other domains, suggesting that it might be an ancient module mediating interactions between chromatin-linked protein complexes and transcription factors and UCH37-like deubiquitinases via its conserved LXXLL motif. We present evidence that the C-terminal PHD finger of ASXL proteins has peculiar structural modifications that might allow it to recognize internal modified lysines other than those from the N terminus of histone H3, making it the mediator of previously unexpected interactions in chromatin.
DNA modifications; methylation; Tet2; nucleosomal spacing; ISWI; restriction enzymes; polycomb repressive complex
The endosymbiotic origin of eukaryotes brought together two disparate genomes in the cell. Additionally, eukaryotic natural history has included other endosymbiotic events, phagotrophic consumption of organisms, and intimate interactions with viruses and endoparasites. These phenomena facilitated large-scale lateral gene transfer and biological conflicts. We synthesize information from nearly two decades of genomics to illustrate how the interplay between lateral gene transfer and biological conflicts has impacted the emergence of new adaptations in eukaryotes. Using apicomplexans as an example, we illustrate how lateral transfer from animals has contributed to unique parasite-host interfaces composed of adhesion- and O-linked glycosylation-related domains. Adaptations that emerged under intense selection for diversity in the molecular participants in organismal and genomic conflicts were dispersed by lateral transfer and subsequently exapted for eukaryote-specific innovations. We illustrate this using examples relating to eukaryotic chromatin, RNAi and RNA-processing systems, signaling pathways, apoptosis and immunity. We highlight the major contributions from catalytic domains of bacterial toxin systems to the origin of signaling enzymes (e.g., ADP-ribosylation and small molecule messenger synthesis), mutagenic enzymes for immune receptor diversification and RNA-processing. Similarly, we discuss contributions of bacterial antibiotic/siderophore synthesis systems and intra-genomic and intra-cellular selfish elements (e.g., restriction-modification, mobile elements and lysogenic phages) in the emergence of chromatin remodeling/modifying enzymes and RNA-based regulation. We develop the concept that biological conflict systems served as evolutionary “nurseries” for innovations in the protein world, which were delivered to eukaryotes via lateral gene flow to spur key evolutionary innovations all the way from nucleogenesis to lineage-specific adaptations.
antibiotics; biological conflict; endosymbiosis; immunity proteins; restriction-modification; RNAi; selfish elements; toxins
Chemical genetics in yeast has shown great potential in clarifying the pharmacology of various drugs. Investigating these results from a systems perspective has uncovered many facets of natural chemical tolerance, but many cellular interactions of chemicals still remain poorly understood. To uncover previously overlooked players in resistance to chemical stress we integrated several independent chemical genetics datasets with protein-protein interactions and a comprehensive collection of yeast protein complexes. As a consequence, we were able to identify the potential targets and mode of action of certain poorly understood compounds. However, most complexes recovered in our analysis appear to perform indirect roles in countering deleterious effects of chemicals by constituting an underlying intricate buffering system that has so far been underappreciated. This buffering role appears to be largely contributed by complexes pertaining to chromatin and vesicular dynamics. The former set of complexes seems to act by setting up or maintaining gene expression states necessary to protect the cell against chemical effects. Among the latter complexes we found an important role for specific vesicle tethering complexes in tolerating particular sets of compounds, indicating that different chemicals might be routed via different points in the intracellular trafficking system. We also suggest a general operational similarity between these complexes and molecular capacitors (e.g. the chaperone Hsp90). Both have a key role in increasing the system’s robustness, although at different levels, through buffering stress and mutation, respectively. Therefore, it is conceivable that some of the complexes identified here might have roles in molding the evolution of chemical resistance and response.
Double-stranded DNA viruses display a great variety of proteins that interact with host chromatin. Using the wealth of available genomic and functional information, we have systematically surveyed chromatin-related proteins encoded by dsDNA viruses. The distribution of viral chromatin-related proteins is primarily influenced by viral genome size and the superkingdom to which the host of the virus belongs. Smaller viruses usually encode multifunctional proteins that mediate several distinct interactions with host chromatin proteins and viral or host DNA. Larger viruses additionally encode several enzymes, which catalyze manipulations of chromosome structure, chromatin remodeling and covalent modifications of proteins and DNA. Among these viruses, it is also common to encounter transcription factors and DNA-packaging proteins such as histones and IHF/HU derived from cellular genomes, which might play a role in constituting virus-specific chromatin states. Through all size ranges a subset of domains in viral chromatin proteins appear to have been derived from those found in host proteins. Examples include the Zn-finger domains of the E6 and E7 proteins of papillomaviruses, SET-domain methyltransferases and Jumonji-related demethylases in certain nucleocytoplasmic large DNA viruses and BEN domains in poxviruses and polydnaviruses. In other cases, chromatin-interacting modules, such as the LxCxE motif, appear to have been widely disseminated across distinct viral lineages, resulting in similar retinoblastoma targeting strategies. Viruses, especially those with large linear genomes, have evolved a number of mechanisms to manipulate viral chromosomes in the process of replication-associated recombination. These include topoisomerases, Rad50/SbcC-like ABC ATPases and a novel recombinase system in bacteriophages utilizing RecA and Rad52 homologs. 
Larger DNA viruses also encode SWI2/SNF2 and A18-like ATPases which appear to play specialized roles in transcription and recombination. Finally, it also appears that certain domains of viral provenance have given rise to key functions in eukaryotic chromatin such as a HEH domain of chromosome tethering proteins and the TET/JBP-like cytosine and thymine hydroxylases.