The Caenorhabditis elegans SH3 domain interactome was mapped and compared with the yeast SH3 interactome. Orthologous SH3 domain-mediated interactions are highly rewired, but the general function of the SH3 domain network is conserved between the two species.
The C. elegans Src homology 3 (SH3) domain interactome was mapped using stringent yeast two-hybrid, resulting in a total of 1070 interactions among 79 out of 84 worm SH3 domains and 475 proteins. SH3 domain binding specificities were profiled for 36 worm SH3 domains using peptide phage display. The yeast and worm SH3 domain interactomes are significantly enriched in endocytosis proteins, but the specific interactions mediated by orthologous SH3 domains are highly rewired. Using the worm SH3 interactome, we identified new endocytosis proteins in worm and human.
Src homology 3 (SH3) domains bind peptides to mediate protein–protein interactions that assemble and regulate dynamic biological processes. We surveyed the repertoire of SH3 binding specificity using peptide phage display in a metazoan, the worm Caenorhabditis elegans, and discovered that it structurally mirrors that of the budding yeast Saccharomyces cerevisiae. We then mapped the worm SH3 interactome using stringent yeast two-hybrid and compared it with the equivalent map for yeast. We found that the worm SH3 interactome resembles the analogous yeast network because it is significantly enriched for proteins with roles in endocytosis. Nevertheless, orthologous SH3 domain-mediated interactions are highly rewired. Our results suggest a model of network evolution where general function of the SH3 domain network is conserved over its specific form.
network evolution; phage display; protein interaction conservation; SH3 domains; yeast two-hybrid
A comparative analysis of the genomes of Drosophila melanogaster, Caenorhabditis elegans, and Saccharomyces cerevisiae—and the proteins they are predicted to encode—was undertaken in the context of cellular, developmental, and evolutionary processes. The nonredundant protein sets of flies and worms are similar in size and are only twice that of yeast, but different gene families are expanded in each genome, and the multidomain proteins and signaling pathways of the fly and worm are far more complex than those of yeast. The fly has orthologs to 177 of the 289 human disease genes examined and provides the foundation for rapid analysis of some of the basic processes involved in human disease.
The basic helix-loop-helix (bHLH) proteins are a large and complex multigene family of transcription factors with important roles in animal development, including that of fruitflies, nematodes and vertebrates. The identification of orthologous relationships among the bHLH genes from these widely divergent taxa allows reconstruction of the putative complement of bHLH genes present in the genome of their last common ancestor.
We identified 39 different bHLH genes in the worm Caenorhabditis elegans, 58 in the fly Drosophila melanogaster and 125 in human (Homo sapiens). We defined 44 orthologous families that include most of these bHLH genes. Of these, 43 include both human and fly and/or worm genes, indicating that genes from these families were already present in the last common ancestor of worm, fly and human. Only two families contain both yeast and animal genes, and no family contains both plant and animal bHLH genes. We suggest that the diversification of bHLH genes is directly linked to the acquisition of multicellularity, and that important diversification of the bHLH repertoire occurred independently in animals and plants.
As the last common ancestor of worm, fly and human is also that of all bilaterian animals, our analysis indicates that this ancient ancestor must have possessed at least 43 different types of bHLH, highlighting its genomic complexity.
The meiotically expressed Zip3 protein is found conserved from Saccharomyces cerevisiae to humans. In baker's yeast, Zip3p has been implicated in synaptonemal complex (SC) formation, while little is known about the protein's function in multicellular organisms. We report here the successful targeted gene disruption of zhp-3 (K02B12.8), the ZIP3 homolog in the nematode Caenorhabditis elegans. Homozygous zhp-3 knockout worms show normal homologue pairing and SC formation. Also, the timing of appearance and the nuclear localization of the recombination protein Rad-51 seem normal in these animals, suggesting proper initiation of meiotic recombination by DNA double-strand breaks. However, the occurrence of univalents during diplotene indicates that C. elegans ZHP-3 protein is essential for reciprocal recombination between homologous chromosomes and thus chiasma formation. In the absence of ZHP-3, reciprocal recombination is abolished and double-strand breaks seem to be repaired via alternative pathways, leading to achiasmatic chromosomes and the occurrence of univalents during meiosis I. Green fluorescent protein-tagged C. elegans ZHP-3 forms lines between synapsed chromosomes and requires the SC for its proper localization.
Differences between species have been suggested to largely reside in the network of connections among the genes. Nevertheless, the rate at which these connections evolve has not been properly quantified. Here, we measure the extent to which co-regulation between pairs of genes is conserved over large phylogenetic distances: between two eukaryotes, Caenorhabditis elegans and Saccharomyces cerevisiae, and between two prokaryotes, Escherichia coli and Bacillus subtilis. We first construct a reliable set of co-regulated genes by combining various functional genomics data from yeast, and subsequently determine conservation of co-regulation in worm from the distribution of co-expression values. For B. subtilis and E. coli, we use known operons and regulons. We find that between 76 and 80% of the co-regulatory connections are conserved between orthologous pairs of genes, which is very high compared with previous estimates and expectations regarding network evolution. We show that in the case of gene duplication after speciation, one of the two inparalogous genes tends to retain its original co-regulatory relationship, while the other loses this link and is presumably free for differentiation or sub-functionalization. The high level of co-regulation conservation implies that reliably predicted functional relationships from functional genomics data in one species can be transferred with high accuracy to another species when that species also harbours the associated genes.
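The conservation measurement described above can be illustrated with a minimal sketch. It assumes a simple fixed co-expression cutoff, whereas the abstract derives conservation from the distribution of co-expression values; the function name, gene identifiers, scores and threshold are all hypothetical toy values, not data from the study.

```python
def conserved_fraction(coregulated_pairs, orthologs, coexpression, threshold):
    """Fraction of co-regulated pairs in species A whose orthologs in
    species B are co-expressed at or above `threshold`.

    Pairs lacking an ortholog for either gene, or lacking a co-expression
    value in species B, are not counted as testable.
    """
    testable = 0
    conserved = 0
    for gene_a, gene_b in coregulated_pairs:
        if gene_a not in orthologs or gene_b not in orthologs:
            continue  # cannot be assessed in species B
        key = frozenset((orthologs[gene_a], orthologs[gene_b]))
        if key not in coexpression:
            continue  # no expression data for this ortholog pair
        testable += 1
        if coexpression[key] >= threshold:
            conserved += 1
    return conserved / testable if testable else float("nan")


# Hypothetical toy data: yeast co-regulated pairs and worm co-expression scores.
yeast_pairs = [("Y1", "Y2"), ("Y1", "Y3"), ("Y2", "Y4"), ("Y5", "Y6")]
yeast_to_worm = {"Y1": "W1", "Y2": "W2", "Y3": "W3", "Y4": "W4"}
worm_coexpression = {
    frozenset(("W1", "W2")): 0.8,
    frozenset(("W1", "W3")): 0.1,
    frozenset(("W2", "W4")): 0.6,
}

fraction = conserved_fraction(yeast_pairs, yeast_to_worm, worm_coexpression, 0.5)
print(f"conserved fraction: {fraction:.2f}")
```

Restricting the denominator to pairs with orthologs in both species mirrors the abstract's caveat that transfer applies only when the second species also harbours the associated genes.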
The complete genome sequences for human and the nematode Caenorhabditis elegans offer an opportunity to learn more about human gene function through functional characterization of orthologs in the worm. Based on a previous genome-wide analysis of worm-human orthologous transmembrane proteins, we selected seventeen genes to explore experimentally in C. elegans. These genes were selected on the basis that they all have high confidence candidate human orthologs and that their function is unknown. We first analyzed their phylogeny, membrane topology and domain organization. Then gene functions were studied experimentally in the worm by using RNA interference and transcriptional gfp reporter gene fusions.
The experiments gave functional insights for twelve of the genes studied. For example, C36B1.12, the worm ortholog of three presenilin-like genes, was almost exclusively expressed in head neurons, suggesting an ancient conserved role important to neuronal function. We propose a new transmembrane topology for the presenilin-like protein family. sft-4, the worm ortholog of surfeit locus gene Surf-4, proved to be an essential gene required for development during the larval stages of the worm. R155.1, whose human ortholog is entirely uncharacterized, was implicated in body size control and other developmental processes.
By combining bioinformatics and C. elegans experiments on orthologs, we provide functional insights on twelve previously uncharacterized human genes.
The C. elegans genome has been extensively annotated by the WormBase consortium, which uses state-of-the-art bioinformatics pipelines, functional genomics and manual curation approaches. As a result, the identification of novel genes in silico in this model organism is becoming more challenging and requires new approaches. The Oligonucleotide-oligosaccharide binding (OB) fold is a highly divergent protein family, in which protein sequences, in spite of having the same fold, share very little sequence identity (5–25%). Therefore, evidence from sequence-based annotation may not be sufficient to identify all the members of this family. In C. elegans, the number of OB-fold proteins reported is remarkably low (n = 46) compared to other evolutionary-related eukaryotes, such as yeast S. cerevisiae (n = 344) or fruit fly D. melanogaster (n = 84). Gene loss during evolution or differences in the level of annotation for this protein family may explain these discrepancies.
This study examines the possibility that novel OB-fold coding genes exist in the worm. We developed a bioinformatics approach that uses the most sensitive sequence-sequence, sequence-profile and profile-profile similarity search methods followed by 3D-structure prediction as a filtering step to eliminate false positive candidate sequences. We have predicted 18 coding genes containing the OB-fold that, remarkably, have been partially characterized in C. elegans.
This study raises the possibility that the annotation of highly divergent protein fold families can be improved in C. elegans. Similar strategies could be implemented for large scale analysis by the WormBase consortium when novel versions of the genome sequence of C. elegans, or other evolutionary related species are being released. This approach is of general interest to the scientific community since it can be used to annotate any genome.
Aging is a degenerative process characterized by a progressive deterioration of cellular components and organelles resulting in mortality. The budding yeast Saccharomyces cerevisiae has been used extensively to study the biology of aging, and several determinants of yeast longevity have been shown to be conserved in multicellular eukaryotes, including worms, flies, and mice (1). Due to the lack of easily quantified age-associated phenotypes, aging in yeast has been assayed almost exclusively by measuring the life span of cells in different contexts, with two different life span paradigms in common usage (2). Chronological life span refers to the length of time that a mother cell can survive in a non-dividing, quiescence-like state, and is proposed to serve as a model for aging of post-mitotic cells in multicellular eukaryotes. Replicative life span, in contrast, refers to the number of daughter cells produced by a mother cell prior to senescence, and is thought to provide a model of aging in mitotically active cells. Here we present a generalized protocol for measuring the replicative life span of budding yeast mother cells. The goal of the replicative life span assay is to determine how many times each mother cell buds. The mother and daughter cells can be easily differentiated by an experienced researcher using a standard light microscope (total magnification 160X), such as the Zeiss Axioscope 40 or another comparable model. Physical separation of daughter cells from mother cells is achieved using a manual micromanipulator equipped with a fiber-optic needle. Typical laboratory yeast strains produce 20-30 daughter cells per mother and one life span experiment requires 2-3 weeks.
The nematode Caenorhabditis elegans is a powerful model system to study contemporary biological problems. This system would be even more useful if we had mutations in all the genes of this multicellular metazoan. The combined efforts of the C. elegans Deletion Mutant Consortium and individuals within the worm community are moving us ever closer to this goal. At present, of the 20,377 protein-coding genes in this organism, 6764 genes with associated molecular lesions are either deletions or null mutations (WormBase WS220). Our three laboratories have contributed the majority of mutated genes, 6841 mutations in 6013 genes. The principal method we used to detect deletion mutations in the nematode utilizes polymerase chain reaction (PCR). More recently, we have used array comparative genome hybridization (aCGH) to detect deletions across the entire coding part of the genome and massively parallel short-read sequencing to identify nonsense, splicing, and missense defects in open reading frames. As deletion strains can be frozen and then thawed when needed, these strains will be an enduring community resource. Our combined molecular screening strategies have improved the overall throughput of our gene-knockout facilities and have broadened the types of mutations that we and others can identify. These multiple strategies should enable us to eventually identify a mutation in every gene in this multicellular organism. This knowledge will usher in a new age of metazoan genetics in which the contribution to any biological process can be assessed for all genes.
genomics; knockouts; deletion mutations; multi-gene families
Various age-related neurodegenerative diseases, including Parkinson's disease, polyglutamine expansion diseases and Alzheimer's disease, are associated with the accumulation of misfolded proteins in aggregates in the brain. How and why these proteins form aggregates and cause disease is still poorly understood. Small model organisms—the baker's yeast Saccharomyces cerevisiae, the nematode worm Caenorhabditis elegans and the fruit fly Drosophila melanogaster—have been used to model these diseases and high-throughput genetic screens using these models have led to the identification of a large number of genes that modify aggregation and toxicity of the disease proteins. In this review, we revisit these models and provide a comprehensive comparison of the genetic screens performed so far. Our integrative analysis highlights alterations of a wide variety of basic cellular processes. Not all disease proteins are influenced by alterations in the same cellular processes and despite the unifying theme of protein misfolding and aggregation, the pathology of each of the age-related misfolding disorders can be induced or influenced by a disease-protein-specific subset of molecular processes.
neurodegeneration; protein aggregation; genetic modifiers; small model organisms; meta-analysis
The identification of over-represented transcription factor binding sites from sets of co-expressed genes provides insights into the mechanisms of regulation for diverse biological contexts. oPOSSUM, an internet-based system for such studies of regulation, has been improved and expanded in this new release. New features include a worm-specific version for investigating binding sites conserved between Caenorhabditis elegans and C. briggsae, as well as a yeast-specific version for the analysis of co-expressed sets of Saccharomyces cerevisiae genes. The human and mouse applications feature improvements in ortholog mapping, sequence alignments and the delineation of multiple alternative promoters. oPOSSUM2, introduced for the analysis of over-represented combinations of motifs in human and mouse genes, has been integrated with the original oPOSSUM system. Analysis using user-defined background gene sets is now supported. The transcription factor binding site models have been updated to include new profiles from the JASPAR database. oPOSSUM is available at http://www.cisreg.ca/oPOSSUM/
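As an illustration of what "over-represented" can mean for a binding site in a co-expressed gene set, the sketch below computes a one-sided hypergeometric tail probability against a background gene set. The abstract does not state which statistic oPOSSUM uses, so this is an assumed, commonly used test rather than the tool's actual method; the function name and all counts are hypothetical.

```python
from math import comb


def hypergeom_enrichment_p(hits_in_set, set_size, hits_in_background, background_size):
    """P(X >= hits_in_set) when `set_size` genes are drawn without
    replacement from a background of `background_size` genes, of which
    `hits_in_background` contain the binding site."""
    max_hits = min(set_size, hits_in_background)
    total = comb(background_size, set_size)
    tail = sum(
        comb(hits_in_background, k)
        * comb(background_size - hits_in_background, set_size - k)
        for k in range(hits_in_set, max_hits + 1)
    )
    return tail / total


# Hypothetical counts: 8 of 20 co-expressed genes carry the site,
# versus 100 of 5000 genes in a user-defined background set.
p_value = hypergeom_enrichment_p(8, 20, 100, 5000)
print(f"enrichment p-value: {p_value:.3g}")
```

The choice of background matters: the same count of hits can look enriched against the whole genome and unremarkable against a background restricted to, say, genes expressed in the same tissue.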
Saccharomyces cerevisiae Scc2 binds Scc4 to form an essential complex that loads cohesin onto chromosomes. The prevalence of Scc2 orthologs in eukaryotes emphasizes a conserved role in regulating sister chromatid cohesion, but homologs of Scc4 have not hitherto been identified outside certain fungi. Some metazoan orthologs of Scc2 were initially identified as developmental gene regulators, such as Drosophila Nipped-B, a regulator of Ultrabithorax, and delangin, a protein mutant in Cornelia de Lange syndrome. We show that delangin and Nipped-B bind previously unstudied human and fly orthologs of Caenorhabditis elegans MAU-2, a non-axis-specific guidance factor for migrating cells and axons. PSI-BLAST shows that Scc4 is evolutionarily related to metazoan MAU-2 sequences, with the greatest homology evident in a short N-terminal domain, and protein–protein interaction studies map the site of interaction between delangin and human MAU-2 to the N-terminal regions of both proteins. Short interfering RNA knockdown of human MAU-2 in HeLa cells resulted in precocious sister chromatid separation and in impaired loading of cohesin onto chromatin, indicating that it is functionally related to Scc4, and RNAi analyses show that MAU-2 regulates chromosome segregation in C. elegans embryos. Using antisense morpholino oligonucleotides to knock down Xenopus tropicalis delangin or MAU-2 in early embryos produced similar patterns of retarded growth and developmental defects. Our data show that sister chromatid cohesion in metazoans involves the formation of a complex similar to the Scc2-Scc4 interaction in the budding yeast. The very high degree of sequence conservation between Scc4 homologs in complex metazoans is consistent with increased selection pressure to conserve additional essential functions, such as regulation of cell and axon migration during development.
A complex previously found only in yeast is described in metazoa, where it functions both in chromatid cohesion and in migration during development.
Protein-protein interactions play a fundamental role in elucidating the molecular mechanisms of biomolecular function, signal transduction and metabolic pathways of living organisms. Although high-throughput technologies such as the yeast two-hybrid system and affinity purification followed by mass spectrometry are widely used in model organisms, progress in detecting protein-protein interactions in plants has been rather slow. With this motivation, our work presents a computational approach to predict protein-protein interactions in Oryza sativa.
To better understand the interactions of proteins in Oryza sativa, we have developed PRIN, a Predicted Rice Interactome Network. Protein-protein interaction data of PRIN are based on the interologs of six model organisms where large-scale protein-protein interaction experiments have been applied: yeast (Saccharomyces cerevisiae), worm (Caenorhabditis elegans), fruit fly (Drosophila melanogaster), human (Homo sapiens), Escherichia coli K12 and Arabidopsis thaliana. With certain quality controls, altogether we obtained 76,585 non-redundant rice protein interaction pairs among 5,049 rice proteins. Further analysis showed that the topology properties of predicted rice protein interaction network are more similar to yeast than to the other 5 organisms. This may not be surprising as the interologs based on yeast contribute nearly 74% of total interactions. In addition, GO annotation, subcellular localization information and gene expression data are also mapped to our network for validation. Finally, a user-friendly web interface was developed to offer convenient database search and network visualization.
PRIN is the first well-annotated protein interaction database for the important model plant Oryza sativa. It has greatly extended the currently available protein-protein interaction data of rice with a computational approach, which will certainly provide further insights into rice functional genomics and systems biology.
PRIN is available online at http://bis.zju.edu.cn/prin/.
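The interolog idea behind PRIN (an interaction observed between two proteins in a model organism is transferred to their orthologs in rice) can be sketched as follows. This is a minimal illustration under stated assumptions, not PRIN's pipeline: the function name and all protein identifiers are hypothetical, and the quality controls mentioned in the abstract are omitted.

```python
def predict_interologs(source_interactions, ortholog_map):
    """Transfer interactions from a model organism to a target species.

    `source_interactions` is an iterable of (protein_a, protein_b) pairs;
    `ortholog_map` maps each source protein to a list of target orthologs.
    Returns a set of unordered target pairs, so duplicates collapse.
    """
    predicted = set()
    for prot_a, prot_b in source_interactions:
        for target_a in ortholog_map.get(prot_a, ()):
            for target_b in ortholog_map.get(prot_b, ()):
                if target_a != target_b:  # this sketch skips self-interactions
                    predicted.add(frozenset((target_a, target_b)))
    return predicted


# Hypothetical toy data: two yeast interactions and a yeast-to-rice ortholog map.
yeast_interactions = [("YA", "YB"), ("YB", "YC")]
yeast_to_rice = {"YA": ["Os1"], "YB": ["Os2", "Os3"]}  # YC has no rice ortholog

rice_pairs = predict_interologs(yeast_interactions, yeast_to_rice)
for pair in sorted(tuple(sorted(p)) for p in rice_pairs):
    print(pair)
```

Predictions from several source organisms can be combined by taking the union of the returned sets, which keeps the result non-redundant because each pair is stored as an unordered `frozenset`.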
Tyrosine phosphorylation is an essential element of signal transduction in multicellular animals. Although tyrosine kinases were originally regarded as specific to the metazoan lineage, it is now clear that they evolved prior to the split between unicellular and multicellular eukaryotes (≈ 600 million years ago). Genome analyses of choanoflagellates and other protists show an abundance of tyrosine kinases that rivals the most complex animals. Some of these kinases are orthologs of metazoan enzymes (e.g., Src), but others display unique domain compositions not seen in any metazoan. Biochemical experiments have highlighted similarities and differences between the unicellular and multicellular tyrosine kinases. In particular, it appears that the complex systems of kinase autoregulation may have evolved later in the metazoan lineage.
tyrosine kinase; phosphorylation; evolution; choanoflagellate; SH2 domain; SH3 domain; autoinhibition
The database of Clusters of Orthologous Groups of proteins (COGs), which represents an attempt at a phylogenetic classification of the proteins encoded in complete genomes, currently consists of 2791 COGs including 45 350 proteins from 30 genomes of bacteria, archaea and the yeast Saccharomyces cerevisiae (http://www.ncbi.nlm.nih.gov/COG). In addition, a supplement to the COGs is available, in which proteins encoded in the genomes of two multicellular eukaryotes, the nematode Caenorhabditis elegans and the fruit fly Drosophila melanogaster, and shared with bacteria and/or archaea were included. The new features added to the COG database include information pages with structural and functional details on each COG and literature references, improvements of the COGNITOR program that is used to fit new proteins into the COGs, and classification of genomes and COGs constructed by using principal component analysis.
The Yeast Proteome Database (YPD™) has been for several years a resource for organized and accessible information about the proteins of Saccharomyces cerevisiae. We have now extended the YPD format to create a database containing complete proteome information about the model organism Caenorhabditis elegans (WormPD™). YPD and WormPD are designed for use not only by their respective research communities but also by the broader scientific community. In both databases, information gleaned from the literature is presented in a consistent, user-friendly Protein Report format: a single Web page presenting all available knowledge about a particular protein. Each Protein Report begins with a Title Line, a concise description of the function of that protein that is continually updated as curators review new literature. Properties and functions of the protein are presented in tabular form in the upper part of the Report, and free-text annotations organized by topic are presented in the lower part. Each Protein Report ends with a comprehensive reference list whose entries are linked to their MEDLINE abstracts. YPD and WormPD are seamlessly integrated, with extensive links between the species. They are freely accessible to academic users on the WWW at http://www.proteome.com/databases/index.html , and are available by subscription to corporate users.
To initiate studies on how protein-protein interaction (or “interactome”) networks relate to multicellular functions, we have mapped a large fraction of the Caenorhabditis elegans interactome network. Starting with a subset of metazoan-specific proteins, more than 4000 interactions were identified from high-throughput, yeast two-hybrid (HT-Y2H) screens. Independent coaffinity purification assays experimentally validated the overall quality of this Y2H data set. Together with already described Y2H interactions and interologs predicted in silico, the current version of the Worm Interactome (WI5) map contains ∼5500 interactions. Topological and biological features of this interactome network, as well as its integration with phenome and transcriptome data sets, lead to numerous biological hypotheses.
WormBase (http://www.wormbase.org/) is the central data repository for information about Caenorhabditis elegans and related nematodes. As a model organism database, WormBase extends beyond the genomic sequence, integrating experimental results with extensively annotated views of the genome. The WormBase Consortium continues to expand the biological scope and utility of WormBase with the inclusion of large-scale genomic analyses, through active data and literature curation, through new analysis and visualization tools, and through refinement of the user interface. Over the past year, the nearly complete genomic sequence and comparative analyses of the closely related species Caenorhabditis briggsae have been integrated into WormBase, including gene predictions, ortholog assignments and a new synteny viewer to display the relationships between the two species. Extensive site-wide refinement of the user interface now provides quick access to the most frequently accessed resources and a consistent browsing experience across the site. Unified single-page views now provide complete summaries of commonly accessed entries like genes. These advances continue to increase the utility of WormBase for C. elegans researchers, as well as for those researchers exploring problems in functional and comparative genomics in the context of a powerful genetic system.
Treatment of systemic fungal infections is difficult because of the limited number of antimycotic drugs available. Thus, there is an immediate need for simple and innovative systems to assay the contribution of individual genes to fungal pathogenesis. We have developed a pathogenesis assay using Caenorhabditis elegans, an established model host, with Saccharomyces cerevisiae as the invading fungus. We have found that yeast infects nematodes, causing disease and death. Our data indicate that the host produces reactive oxygen species (ROS) in response to fungal infection. Yeast mutants sod1Δ and yap1Δ, which cannot withstand ROS, fail to cause disease, except in bli-3 worms, which carry a mutation in a dual oxidase gene. Chemical inhibition of the NADPH oxidase activity abolishes ROS production in worms exposed to yeast. This pathogenesis assay is useful for conducting systematic, whole-genome screens to identify fungal virulence factors as alternative targets for drug development and exploration of host responses to fungal infections.
The soil nematodes Caenorhabditis briggsae and Caenorhabditis elegans diverged from a common ancestor roughly 100 million years ago and yet are almost indistinguishable by eye. They have the same chromosome number and genome sizes, and they occupy the same ecological niche. To explore the basis for this striking conservation of structure and function, we have sequenced the C. briggsae genome to a high-quality draft stage and compared it to the finished C. elegans sequence. We predict approximately 19,500 protein-coding genes in the C. briggsae genome, roughly the same as in C. elegans. Of these, 12,200 have clear C. elegans orthologs, a further 6,500 have one or more clearly detectable C. elegans homologs, and approximately 800 C. briggsae genes have no detectable matches in C. elegans. Almost all of the noncoding RNAs (ncRNAs) known are shared between the two species. The two genomes exhibit extensive colinearity, and the rate of divergence appears to be higher in the chromosomal arms than in the centers. Operons, a distinctive feature of C. elegans, are highly conserved in C. briggsae, with the arrangement of genes being preserved in 96% of cases. The difference in size between the C. briggsae (estimated at approximately 104 Mbp) and C. elegans (100.3 Mbp) genomes is almost entirely due to repetitive sequence, which accounts for 22.4% of the C. briggsae genome in contrast to 16.5% of the C. elegans genome. Few, if any, repeat families are shared, suggesting that most were acquired after the two species diverged or are undergoing rapid evolution. Coclustering the C. elegans and C. briggsae proteins reveals 2,169 protein families of two or more members. Most of these are shared between the two species, but some appear to be expanding or contracting, and there seem to be as many as several hundred novel C. briggsae gene families. The C. briggsae draft sequence will greatly improve the annotation of the C. elegans genome. Based on similarity to C. briggsae, we found strong evidence for 1,300 new C. elegans genes. In addition, comparisons of the two genomes will help to understand the evolutionary forces that mold nematode genomes.
With the Caenorhabditis briggsae genome now in hand, C. elegans biologists have a powerful new research tool to refine their knowledge of gene function in C. elegans and to study the path of genome evolution.
In this study we systematically examined the differences between the proteomes of Metazoa and other eukaryotes. Metazoans (Homo sapiens, Caenorhabditis elegans and Drosophila melanogaster) were compared with a plant (Arabidopsis thaliana), fungi (Saccharomyces cerevisiae and Schizosaccharomyces pombe) and Encephalitozoon cuniculi. We identified 159 gene families that were probably lost in the Metazoan branch and 1263 orthologous families that were specific to Metazoa and were likely to have originated in their last common ancestor (LCA). We analyzed the evolutionary rates of pan-eukaryotic protein families and identified those with higher rates in animals. The acceleration was shown to occur in: (i) the LCA of Metazoa or (ii) independently in the Metazoan phyla. A high proportion of the accelerated Metazoan protein families was found to participate in translation and ribosome biogenesis, particularly mitochondrial. By functional analysis we show that no metabolic pathway in animals evolved faster than in other organisms. We conclude that evolution in the LCA of Metazoa was extensive and proceeded largely by gene duplication and/or invention rather than by modification of extant proteins. Finally, we show that the rate of evolution of a gene family in animals has a clear, but not absolute, tendency to be conserved.
PDZ domains are protein–protein interaction modules that recognize specific C-terminal sequences to assemble protein complexes in multicellular organisms. By scanning billions of random peptides, we accurately map binding specificity for approximately half of the over 330 PDZ domains in the human and Caenorhabditis elegans proteomes. The domains recognize features of the last seven ligand positions, and we find 16 distinct specificity classes conserved from worm to human, significantly extending the canonical two-class system based on position −2. Thus, most PDZ domains are not promiscuous, but rather are fine-tuned for specific interactions. Specificity profiling of 91 point mutants of a model PDZ domain reveals that the binding site is highly robust, as all mutants were able to recognize C-terminal peptides. However, many mutations altered specificity for ligand positions both close and far from the mutated position, suggesting that binding specificity can evolve rapidly under mutational pressure. Our specificity map enables the prediction and prioritization of natural protein interactions, which can be used to guide PDZ domain cell biology experiments. Using this approach, we predicted and validated several viral ligands for the PDZ domains of the SCRIB polarity protein. These findings indicate that many viruses produce PDZ ligands that disrupt host protein complexes for their own benefit, and that highly pathogenic strains target PDZ domains involved in cell polarity and growth.
The PDZ domain is a structural domain that functions as a protein–protein interaction module that recognizes specific C-terminal peptide sequences to assemble intracellular complexes important in signaling pathways of multicellular organisms. These modules are associated with human disease and are targets of viruses and other pathogens. By examining peptide specificity and substrate diversity of roughly one half of the PDZ domains known to exist in human and the nematode Caenorhabditis elegans, we were able to show that PDZ domains are more specific than previously appreciated. PDZ domains also remain functional under high mutational pressure, and only a few of the vast number of possible PDZ domain specificities are utilized in nature. These PDZ domain specificities are conserved from human to worm, implying that the specificities evolved early and were reused over evolution instead of being reshaped. The specificity map generated here was used to predict and experimentally confirm new viral PDZ-binding motifs. We present evidence that pathogenic viruses, including avian influenza, bind host PDZ domains via these motifs, thereby competing with signaling by host complexes, which leads to disruption of growth and polarity of the host cells.
A genome-scale specificity map for PDZ domains reveals how family members recognize ligands to assemble signaling complexes and also reveals how viruses target these domains to subvert host cell function.
The UniPROBE (Universal PBM Resource for Oligonucleotide Binding Evaluation) database hosts data generated by universal protein binding microarray (PBM) technology on the in vitro DNA-binding specificities of proteins. This initial release of the UniPROBE database provides a centralized resource for accessing comprehensive PBM data on the preferences of proteins for all possible sequence variants (‘words’) of length k (‘k-mers’), as well as position weight matrix (PWM) and graphical sequence logo representations of the k-mer data. In total, the database hosts DNA-binding data for over 175 nonredundant proteins from a diverse collection of organisms, including the prokaryote Vibrio harveyi, the eukaryotic malarial parasite Plasmodium falciparum, the parasitic Apicomplexan Cryptosporidium parvum, the yeast Saccharomyces cerevisiae, the worm Caenorhabditis elegans, mouse and human. Current web tools include a text-based search, a function for assessing motif similarity between user-entered data and database PWMs, and a function for locating putative binding sites along user-entered nucleotide sequences. The UniPROBE database is available at http://thebrain.bwh.harvard.edu/uniprobe/.
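As an illustration of how a PWM can be used to locate putative binding sites along a nucleotide sequence, the following minimal sketch builds a log-odds matrix from a handful of aligned sites and scores every window of a query sequence. It is not the UniPROBE implementation: the function names, the pseudocount, the uniform background frequency, and the example sites are all assumptions made for illustration.

```python
import math

BASES = "ACGT"


def build_pwm(sites, pseudocount=1.0, background=0.25):
    """Build a log-odds PWM from equal-length aligned DNA sites.

    The pseudocount and the uniform background are illustrative assumptions.
    """
    length = len(sites[0])
    if any(len(site) != length for site in sites):
        raise ValueError("all sites must have the same length")
    pwm = []
    for pos in range(length):
        # Start every base at the pseudocount so no probability is zero.
        counts = {base: pseudocount for base in BASES}
        for site in sites:
            counts[site[pos]] += 1
        total = sum(counts.values())
        pwm.append(
            {base: math.log2((counts[base] / total) / background) for base in BASES}
        )
    return pwm


def scan(sequence, pwm):
    """Score every window of the sequence; return a list of (start, score)."""
    k = len(pwm)
    return [
        (start, sum(pwm[offset][sequence[start + offset]] for offset in range(k)))
        for start in range(len(sequence) - k + 1)
    ]


def best_site(sequence, pwm):
    """Return the (start, score) pair of the highest-scoring window."""
    return max(scan(sequence, pwm), key=lambda hit: hit[1])


# Hypothetical aligned sites, invented for this example only.
example_sites = ["TGACGT", "TGACGT", "TGACGA", "TGATGT"]
example_pwm = build_pwm(example_sites)
example_hit = best_site("AAATGACGTCCC", example_pwm)
```

The sketch scores one strand only and assumes the sequence contains only A, C, G, and T; a practical scan would also score the reverse complement and apply a score threshold rather than reporting a single best window.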
We present YOGY, a web-based resource for orthologous proteins from nine eukaryotic organisms: Homo sapiens, Mus musculus, Rattus norvegicus, Arabidopsis thaliana, Drosophila melanogaster, Caenorhabditis elegans, Plasmodium falciparum, Schizosaccharomyces pombe and Saccharomyces cerevisiae. Using a gene name from any of these organisms as a query, this database provides comprehensive, combined information on orthologs in other species using data from five independent resources: KOGs, Inparanoid, HomoloGene, OrthoMCL and a table of curated fission and budding yeast orthologs. Associated Gene Ontology (GO) terms of orthologs can also be retrieved for functional inference. Integrating these different and complementary datasets provides a straightforward tool to identify known and predicted orthologs of proteins from a variety of species. This resource should be useful for bench scientists looking for functional clues for their genes of interest, as well as for curators looking for information that can be transferred based on orthology and for a rapid way to identify the relevant GO terms as an aid to literature curation. YOGY is accessible online at .
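The idea of combining ortholog predictions from independent resources can be sketched as follows. This is a minimal illustration, not YOGY's actual data model or code: the function names are hypothetical, and the gene identifiers (`geneA`, `orthX`, and so on) are placeholders rather than real predictions from any of the named resources.

```python
from collections import defaultdict


def combine_orthologs(predictions):
    """Merge ortholog calls from several sources.

    predictions maps a source name to an iterable of (query_gene, ortholog)
    pairs. The result maps each query gene to its orthologs, and each
    ortholog to the sorted list of sources that support it.
    """
    combined = defaultdict(lambda: defaultdict(set))
    for source, pairs in predictions.items():
        for query, ortholog in pairs:
            combined[query][ortholog].add(source)
    return {
        query: {orth: sorted(sources) for orth, sources in orthologs.items()}
        for query, orthologs in combined.items()
    }


def ranked(combined, query):
    """List orthologs of a query, best-supported first, ties broken by name."""
    hits = combined.get(query, {})
    return sorted(hits.items(), key=lambda item: (-len(item[1]), item[0]))


# Placeholder identifiers, invented for this example only.
toy_predictions = {
    "KOGs": [("geneA", "orthX"), ("geneA", "orthY")],
    "Inparanoid": [("geneA", "orthX")],
    "OrthoMCL": [("geneA", "orthX"), ("geneB", "orthZ")],
}
toy_combined = combine_orthologs(toy_predictions)
toy_ranking = ranked(toy_combined, "geneA")
```

Ranking by the number of supporting sources is one simple way to prioritize candidates; it is an assumption of this sketch and not a scoring scheme described by the resource.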
Cse4p is an evolutionarily conserved histone H3-like protein that is thought to replace H3 in a specialized nucleosome at the yeast (Saccharomyces cerevisiae) centromere. All known yeast, worm, fly, and human centromere H3-like proteins have highly conserved C-terminal histone fold domains (HFD) but very different N termini. We have carried out a comprehensive and systematic mutagenesis of the Cse4p N terminus to analyze its function. Surprisingly, only a 33-amino-acid domain within the 130-amino-acid-long N terminus is required for Cse4p N-terminal function. The spacing of the essential N-terminal domain (END) relative to the HFD can be changed significantly without an apparent effect on Cse4p function. The END appears to be important for interactions between Cse4p and known kinetochore components, including the Ctf19p/Mcm21p/Okp1p complex. Genetic and biochemical evidence shows that Cse4p proteins interact with each other in vivo and that nonfunctional cse4 END and HFD mutant proteins can form functional mixed complexes. These results support different roles for the Cse4p N terminus and the HFD in centromere function and are consistent with the proposed Cse4p nucleosome model. The structure-function characteristics of the Cse4p N terminus are relevant to understanding how other H3-like proteins, such as the human homolog CENP-A, function in kinetochore assembly and chromosome segregation.