Systematic analysis of gene overexpression phenotypes provides insight into gene function, enzyme targets, and biological pathways. Here, we describe a novel functional genomics platform that enables a highly parallel and systematic assessment of overexpression phenotypes in pooled cultures. First, we constructed a genome-level collection of ~5100 yeast barcoder strains, each of which carries a unique barcode, enabling pooled fitness assays with a barcode microarray or sequencing readout. Second, we constructed a yeast open reading frame (ORF) galactose-induced overexpression array by generating a genome-wide set of yeast transformants, each of which carries an individual plasmid-borne and sequence-verified ORF derived from the Saccharomyces cerevisiae full-length EXpression-ready (FLEX) collection. We combined these collections genetically using synthetic genetic array methodology, generating ~5100 strains, each of which is barcoded and overexpresses a specific ORF, a set we termed “barFLEX.” Additional synthetic genetic array analysis allows the barFLEX collection to be moved into different genetic backgrounds. As a proof-of-principle, we describe the properties of the barFLEX overexpression collection and its application in synthetic dosage lethality studies under different environmental conditions.
barFLEX array; gene overexpression; barcoders; synthetic dosage lethality
In parallel with evolutionary developments, the Hsp90 molecular chaperone system shifted from a simple prokaryotic factor into an expansive network that includes a variety of cochaperones. We have taken high-throughput genomic and proteomic approaches to better understand the abundant yeast p23 cochaperone Sba1. Our work revealed an unexpected p23 network that displayed considerable independence from known Hsp90 clients. Additionally, our data uncovered a broad nuclear role for p23, contrasting with the historical dogma of restricted cytosolic activities for molecular chaperones. Validation studies demonstrated that yeast p23 was required for proper Golgi function and ribosome biogenesis and was necessary for efficient repair of DNA damage from a wide range of mutagens. Notably, mammalian p23 had conserved roles in these pathways as well as being necessary for proper cell mobility. Taken together, our work demonstrates that the p23 chaperone serves a broad physiological network and functions both in conjunction with and sovereign to Hsp90.
Intense experimental and theoretical efforts have been made to globally map genetic interactions, yet we still do not understand how gene-gene interactions arise from the operation of biomolecular networks. To bridge the gap between empirical and computational studies, we: i) quantitatively measure genetic interactions between ~185,000 metabolic gene pairs in Saccharomyces cerevisiae, ii) superpose the data on a detailed systems biology model of metabolism, and iii) introduce a machine-learning method to reconcile empirical interaction data with model predictions. We systematically investigate the relative impacts of functional modularity and metabolic flux coupling on the distribution of negative and positive genetic interactions. We also provide a mechanistic explanation for the link between the degree of genetic interaction, pleiotropy, and gene dispensability. Last, we demonstrate the feasibility of automated metabolic model refinement by correcting misannotations in NAD biosynthesis and confirming them by in vivo experiments.
The coordination of cell-cycle events with developmental processes is essential for the reproductive success of organisms. In Drosophila melanogaster, meiosis is tightly coupled to oocyte development, and early embryos undergo specialized S-M mitoses that are supported by maternal products. We previously showed that the small phosphoprotein α-endosulfine (Endos) is required for normal oocyte meiotic maturation and early embryonic mitoses in Drosophila. In this study, we performed a genetic screen for dominant enhancers of endos00003 and identified several genomic regions that, when deleted, lead to impaired fertility of endos00003/+ heterozygous females. We uncovered matrimony (mtrm), which encodes a Polo kinase inhibitor, as a strong dominant enhancer of endos. mtrm126 +/+ endos00003 females are sterile because of defects in early embryonic mitoses, and this phenotype is reverted by removal of one copy of polo. These results provide compelling genetic evidence that excessive Polo activity underlies the strong functional interaction between endos00003 and mtrm126. Moreover, we show that endos is required for the increased expression of Mtrm in mature oocytes, which is presumably loaded into early embryos. These data are consistent with the model that maternal endos antagonizes Polo function in the early embryo to ensure normal mitoses through its effects on Mtrm expression during late oogenesis. Finally, we also identified genomic deletions that lead to loss of viability of endos00003/+ heterozygotes, consistent with recently published studies showing that endos is required zygotically to regulate the cell cycle during development.
α-endosulfine; matrimony; polo; early embryonic cell cycle; Drosophila
Two types of drug synergy, genetic and promiscuous, are explored in S. cerevisiae. The results suggest that promiscuous synergy predominates, and that propensity to synergize is an intrinsic drug property with the potential to accelerate the search for synergistic drug combinations.
Discovered 37 synergistic interactions among antifungal chemicals.
Promiscuous synergy is the predominant form of drug synergy.
Rate of synergy is an intrinsic property of drugs that can guide searches for drug synergy.
Drug synergy allows a therapeutic effect to be achieved with lower doses of component drugs. Drug synergy can result when drugs target the products of genes that act in parallel pathways (‘specific synergy’). Such cases of drug synergy should tend to correspond to synergistic genetic interaction between the corresponding target genes. Alternatively, ‘promiscuous synergy’ can arise when one drug non-specifically increases the effects of many other drugs, for example, by increased bioavailability. To assess the relative abundance of these drug synergy types, we examined 200 pairs of antifungal drugs in S. cerevisiae. We found 38 antifungal synergies, 37 of which were novel. While 14 cases of drug synergy corresponded to genetic interaction, 92% of the synergies we discovered involved only six frequently synergistic drugs. Although promiscuity of four drugs can be explained under the bioavailability model, the promiscuity of Tacrolimus and Pentamidine was completely unexpected. While many drug synergies correspond to genetic interactions, the majority of drug synergies appear to result from non-specific promiscuous synergy.
chemical genetics; drug combinations; drug discovery; genetic interactions
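One common way to make "synergy" quantitative is the Bliss independence model. This is an assumption made for illustration: the abstract above does not say which reference model was used. Under Bliss independence, two drugs that act independently and inhibit growth by fractions a and b are expected to give a combined inhibition of a + b - ab, and an observed inhibition well above that expectation suggests synergy. The function names and the inhibition values below are hypothetical.

```python
def bliss_expected(inhib_a: float, inhib_b: float) -> float:
    """Expected fractional inhibition if the two drugs act independently."""
    for value in (inhib_a, inhib_b):
        if not 0.0 <= value <= 1.0:
            raise ValueError("fractional inhibition must lie in [0, 1]")
    return inhib_a + inhib_b - inhib_a * inhib_b


def bliss_excess(inhib_a: float, inhib_b: float, inhib_ab: float) -> float:
    """Observed minus expected inhibition; positive values suggest synergy."""
    return inhib_ab - bliss_expected(inhib_a, inhib_b)


# Hypothetical measurements: each drug alone is weak, the pair is strong.
expected = bliss_expected(0.20, 0.30)
excess = bliss_excess(0.20, 0.30, 0.70)
print(f"expected={expected:.2f} excess={excess:.2f}")
```

A drug that shows a positive excess with many unrelated partners would be a candidate for the promiscuous class described above, whereas a positive excess restricted to partners whose targets interact genetically would fit the specific class.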
Classical forward genetics has been foundational to modern biology, and has been the paradigm for characterizing the role of genes in shaping phenotypes for decades. In recent years, reverse genetics has been used to identify the functions of genes, via the intentional introduction of variation and subsequent evaluation in physiological, molecular, and even population contexts. These approaches are complementary, and whole genome analysis serves as a bridge between the two. We report in this article the whole genome sequencing of eighteen classical mutant strains of Neurospora crassa and the putative identification of the mutations associated with corresponding mutant phenotypes. Although some strains carry multiple unique nonsynonymous, nonsense, or frameshift mutations, the combined power of limiting the scope of the search based on genetic markers and of using a comparative analysis among the eighteen genomes provides strong support for the association between mutation and phenotype. For ten of the mutants, the mutant phenotype is recapitulated in classical or gene deletion mutants in Neurospora or other filamentous fungi. Between 13 and 137 nonsense mutations are present in each strain, and indel sizes are shown to be highly skewed in gene coding sequence. Significant additional genetic variation was found in the eighteen mutant strains, and this variability defines multiple alleles of many genes. These alleles may be useful in further genetic and molecular analysis of known and yet-to-be-discovered functions and they invite new interpretations of molecular and genetic interactions in classical mutant strains.
single nucleotide polymorphism; SNP; indel; comparative genomics; classical mutant
Despite the ecological and economic importance of lignin and other wood chemical components, there are few studies of the natural genetic variation that exists within plant species and its adaptive significance. We used models developed from near infra-red spectroscopy to study natural genetic variation in lignin content and monomer composition (syringyl-to-guaiacyl ratio [S/G]) as well as cellulose and extractives content, using a 16-year-old field trial of an Australian tree species, Eucalyptus globulus. We sampled 2163 progenies of 467 native trees from throughout the native geographic range of the species. The narrow-sense heritability of wood chemical traits (0.25–0.44) was higher than that of growth (0.15), but less than wood density (0.51). All wood chemical traits exhibited significant broad-scale genetic differentiation (QST = 0.34–0.43) across the species range. This differentiation exceeded that detected with putatively neutral microsatellite markers (FST = 0.09), arguing that diversifying selection has shaped population differentiation in wood chemistry. There were significant genetic correlations among these wood chemical traits at the population and additive genetic levels. However, population differentiation in the S/G ratio of lignin in particular was positively correlated with latitude (R² = 76%), which may be driven by either adaptation to climate or associated biotic factors.
tree improvement; wood chemicals; adaptation; lignin; cellulose; extractives; syringyl; guaiacyl
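The QST versus FST comparison in the abstract above can be illustrated with the usual estimator for outcrossing diploids, QST = V_between / (V_between + 2 V_additive_within). This is a sketch that assumes that standard estimator; the abstract does not give the formula the authors used, and the variance components below are hypothetical values chosen only so that the result falls in the reported QST range. The FST of 0.09 is taken from the abstract.

```python
def qst(var_between: float, var_additive_within: float) -> float:
    """Quantitative-trait differentiation among populations.

    var_between: among-population genetic variance of the trait.
    var_additive_within: additive genetic variance within populations.
    """
    if var_between < 0 or var_additive_within < 0:
        raise ValueError("variance components must be non-negative")
    denominator = var_between + 2.0 * var_additive_within
    if denominator == 0:
        raise ValueError("at least one variance component must be positive")
    return var_between / denominator


FST_NEUTRAL = 0.09  # microsatellite FST reported in the abstract

# Hypothetical variance components for one wood chemical trait.
trait_qst = qst(var_between=1.0, var_additive_within=0.75)
print(f"QST={trait_qst:.2f}, exceeds neutral FST: {trait_qst > FST_NEUTRAL}")
```

A QST well above the neutral FST is the pattern the abstract interprets as evidence of diversifying selection; a formal comparison would also need confidence intervals on both quantities, which this sketch omits.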
Plant root systems must grow in a manner that is dictated by endogenous genetic pathways, yet sensitive to environmental input. This allows them to provide the plant with water and nutrients while navigating a heterogeneous soil environment filled with obstacles, toxins, and pests. Gravity and touch, which constitute important cues for roots growing in soil, have been shown to modulate root architecture by altering growth patterns. This is illustrated by Arabidopsis thaliana roots growing on tilted hard agar surfaces. Under these conditions, the roots are exposed to both gravity and touch stimulation. Consequently, they tend to skew their growth away from the vertical and wave along the surface. This complex growth behavior is believed to help roots avoid obstacles in nature. Interestingly, A. thaliana accessions display distinct growth patterns under these conditions, suggesting the possibility of using this variation as a tool to identify the molecular mechanisms that modulate root behavior in response to their mechanical environment. We have used the Cvi/Ler recombinant inbred line population to identify quantitative trait loci that contribute to root skewing on tilted hard agar surfaces. A combination of fine mapping for one of these QTL and microarray analysis of expression differences between Cvi and Ler root tips identifies a region on chromosome 2 as contributing to root skewing on tilted surfaces, potentially by modulating cell wall composition.
Arabidopsis; root; skewing; waving; cis-prenyltransferase
Spore germination in Saccharomyces cerevisiae is a process in which a quiescent cell begins to divide. During germination, the cell undergoes dramatic changes in cell wall and membrane composition, as well as in gene expression. To understand germination in greater detail, we screened the S. cerevisiae deletion set for germination mutants. Our results identified two genes, TRF4 and ERG6, that are required for normal germination on solid media. TRF4 is a member of the TRAMP complex that, together with the exosome, degrades RNA polymerase II transcripts. ERG6 encodes a key step in ergosterol biosynthesis. Taken together, these results demonstrate the complex nature of germination and identify two genes important in the process.
Saccharomyces cerevisiae; germination; sporulation; ERG6; TRF4
To identify genes involved in phenotypic traits, translational genomics from highly characterized model plants to poorly characterized crop plants provides a valuable source of markers to saturate a zone of interest as well as functionally characterized candidate genes. In this paper, an integrated view of the pea genetic map was developed. A series of gene markers were mapped and their best reciprocal homologs were identified on M. truncatula, L. japonicus, soybean, and poplar pseudomolecules. Based on the syntenic relationships uncovered between pea and M. truncatula, 5460 pea Unigenes were tentatively placed on the consensus map. A new bioinformatics tool, http://www.thelegumeportal.net/pea_mtr_translational_toolkit, was developed that allows the putative position of any gene sequence on the pea consensus map to be searched, and hence candidate genes to be sought among neighboring Unigenes. As an example, a promising candidate gene for the hypernodulation mutation nod3 in pea was proposed based on the map position of the likely homolog of Pub1, a M. truncatula gene involved in nodulation regulation. A broader view of pea genome evolution was obtained by revealing syntenic relationships between pea and sequenced genomes. Blocks of synteny were identified which gave new insights into the evolution of chromosome structure in Papilionoids and Eudicots. The power of the translational genomics approach was underlined.
Pisum sativum; functional consensus map; synteny; model legume species; translational genomics
Estimating the line origin of chromosomal sections from marker genotypes is a vital step in quantitative trait loci analyses of outbred line crosses. The original, and most commonly used, algorithm can only handle moderate numbers of partially informative markers. The advent of high-density genotyping with SNP chips motivates a new method because the generic sets of markers on SNP chips typically result in long stretches of partially informative markers. We validated a new method for inferring line origin, triM (tracing inheritance with Markov models), with simulated data. A realistic pattern of marker information was achieved by replicating the linkage disequilibrium from an existing chicken intercross. There were approximately 1500 SNP markers and 800 F2 individuals. The performance of triM was compared to GridQTL, which uses a variant of the original algorithm but modified for larger datasets. triM estimated the line origin with an average error of 2%, was 10% more accurate than GridQTL, considerably faster, and better at inferring positions of recombination. GridQTL could not analyze all simulated replicates and did not estimate line origin for around a third of individuals at many positions. The study shows that triM has computational benefits and improved estimation over available algorithms and is valuable for analyzing the large datasets that will be standard in future.
interval mapping; outbred line cross; line origin probabilities; hidden Markov model; SNP chip
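The idea of inferring line origin with a Markov model can be sketched with a two-state hidden Markov model and the forward-backward algorithm. This is a deliberately simplified, hypothetical illustration and not the triM implementation: it treats a single haplotype, the hidden state at each marker is the founder line (0 or 1), transitions are governed by the recombination fraction between adjacent markers, and a marker is "partially informative" when its allele frequencies differ between the lines without being fixed. All names and numbers are assumptions.

```python
def _normalise(pair):
    total = pair[0] + pair[1]
    if total == 0.0:
        raise ValueError("observed alleles are impossible under the model")
    return [pair[0] / total, pair[1] / total]


def line_origin_posteriors(alleles, freq_line0, freq_line1, recomb):
    """Posterior P(origin = line 0) at each marker along one haplotype.

    alleles: observed allele per marker (0, 1, or None if missing).
    freq_line0, freq_line1: frequency of allele 1 in each founder line.
    recomb: recombination fraction between consecutive markers.
    """
    n = len(alleles)
    if n == 0:
        return []
    if len(freq_line0) != n or len(freq_line1) != n or len(recomb) != n - 1:
        raise ValueError("inconsistent input lengths")
    freqs = (freq_line0, freq_line1)

    def emit(i, line):
        # Probability of the observed allele given the line of origin.
        if alleles[i] is None:
            return 1.0
        p1 = freqs[line][i]
        return p1 if alleles[i] == 1 else 1.0 - p1

    # Forward pass: evidence from markers at or before each position.
    fwd = [_normalise([0.5 * emit(0, 0), 0.5 * emit(0, 1)])]
    for i in range(1, n):
        r = recomb[i - 1]
        a0, a1 = fwd[-1]
        fwd.append(_normalise([
            (a0 * (1 - r) + a1 * r) * emit(i, 0),
            (a0 * r + a1 * (1 - r)) * emit(i, 1),
        ]))

    # Backward pass: evidence from markers after each position.
    bwd = [[1.0, 1.0] for _ in range(n)]
    for i in range(n - 2, -1, -1):
        r = recomb[i]
        b0 = emit(i + 1, 0) * bwd[i + 1][0]
        b1 = emit(i + 1, 1) * bwd[i + 1][1]
        bwd[i] = _normalise([(1 - r) * b0 + r * b1, r * b0 + (1 - r) * b1])

    return [_normalise([f[0] * b[0], f[1] * b[1]])[0] for f, b in zip(fwd, bwd)]


# Hypothetical example: fully informative flanking markers, missing middle marker.
post = line_origin_posteriors(
    alleles=[0, None, 0],
    freq_line0=[0.0, 0.0, 0.0],
    freq_line1=[1.0, 1.0, 1.0],
    recomb=[0.1, 0.1],
)
print([round(p, 4) for p in post])
```

In the example the uninformative middle marker still receives a posterior close to 1 for line 0, because a double recombinant between the two flanking markers is unlikely. Borrowing information across long stretches of weakly informative markers in this way is the general motivation for a Markov-model approach.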
Accurate information on haplotypes and diplotypes (haplotype pairs) is required for population-genetic analyses; however, microarrays do not provide data on a haplotype or diplotype at a copy number variation (CNV) locus; they only provide data on the total number of copies over a diplotype or an unphased sequence genotype (e.g., AAB, unlike AB of single nucleotide polymorphism). Moreover, such copy numbers or genotypes are often incorrectly determined when microarray signal intensities derived from different copy numbers or genotypes are not clearly separated due to noise. Here we report an algorithm to infer CNV haplotypes and individuals’ diplotypes at multiple loci from noisy microarray data, utilizing the probability that a signal intensity may be derived from different underlying copy numbers or genotypes. Performing simulation studies based on known diplotypes and an error model obtained from real microarray data, we demonstrate that this probabilistic approach succeeds in accurate inference (error rate: 1–2%) from noisy data, whereas previous deterministic approaches failed (error rate: 12–18%). Applying this algorithm to real microarray data, we estimated haplotype frequencies and diplotypes in 1486 CNV regions for 100 individuals. Our algorithm will facilitate accurate population-genetic analyses and powerful disease association studies of CNVs.
copy number variation; EM algorithm; haplotype inference; phasing
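The contrast between probabilistic and deterministic calling can be illustrated with a small sketch that converts one signal intensity into a posterior distribution over copy numbers. It assumes, purely for illustration, Gaussian intensity noise with a shared standard deviation and known per-copy-number means; the authors' actual error model was obtained from real microarray data and is not reproduced here. All names and values are hypothetical.

```python
import math


def normal_pdf(x: float, mean: float, sd: float) -> float:
    """Density of a normal distribution at x."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def copy_number_posterior(intensity, means, sd, priors):
    """Posterior P(copy number | intensity) via Bayes' rule.

    means: dict mapping copy number -> expected intensity.
    priors: dict mapping copy number -> prior probability.
    """
    weights = {c: priors[c] * normal_pdf(intensity, means[c], sd) for c in means}
    total = sum(weights.values())
    if total == 0.0:
        raise ValueError("intensity is too far from every copy-number cluster")
    return {c: w / total for c, w in weights.items()}


# Hypothetical clusters and an intensity halfway between copy numbers 2 and 3.
means = {1: 1.0, 2: 2.0, 3: 3.0}
priors = {1: 1 / 3, 2: 1 / 3, 3: 1 / 3}
posterior = copy_number_posterior(2.5, means, sd=0.5, priors=priors)
print({c: round(p, 3) for c, p in posterior.items()})
```

A deterministic caller would have to pick either 2 or 3 for this ambiguous intensity and carry the possible error forward, whereas the posterior keeps both candidates with their weights so that a downstream haplotype-inference step can resolve them using information from other loci and individuals.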
Using the homozygous diploid Saccharomyces deletion collection, we searched for strains with defects in K+ homeostasis. We identified 156 (of 4653 total) strains unable to grow in the presence of hygromycin B, a phenotype previously shown to be indicative of ion defects. The most abundant group was that with deletions of genes known to encode membrane traffic regulators. Nearly 80% of these membrane traffic defective strains showed defects in uptake of the K+ homolog, 86Rb+. Since Trk1, a plasma membrane protein localized to lipid microdomains, is the major K+ influx transporter, we examined the subcellular localization and Triton-X 100 insolubility of Trk1 in 29 of the traffic mutants. However, few of these showed defects in the steady state levels of Trk1, the localization of Trk1 to the plasma membrane, or the localization of Trk1 to lipid microdomains, and most defects were mild compared to wild-type. Three inositol kinase mutants were also identified, and in contrast, loss of these genes negatively affected Trk1 protein levels. In summary, this work reveals a nexus between K+ homeostasis and membrane traffic, which does not involve traffic of the major influx transporter, Trk1.
VPS genes; TRK1
A quantitative trait locus (QTL) affecting female fertility, scored as the inverse of the number of inseminations to conception, on Bos taurus chromosome 7 was detected by a daughter design analysis of the Israeli Holstein population (P < 0.0003). Sires of five of the 10 families analyzed were heterozygous for the QTL. The 95% confidence interval of the QTL spans 27 cM from the centromere. Seven hundred and four SNP markers on the Illumina BovineSNP50 BeadChip within the QTL confidence interval were tested for concordance. A single SNP, NGS-58779, was heterozygous in all five QTL-heterozygous patriarchs and homozygous in the remaining five QTL-homozygous sires. A significant effect on fertility was associated with this marker in the sample of 900 sires genotyped (P < 10⁻⁶). Haplotype phase was the same for four of the five segregating sires. Thus concordance was obtained in nine of the ten families. We identified a common haplotype region associated with the rare and economically favorable allele of the SNP, spanning 270 kbp on BTA7 upstream to 4.72 Mbp. Eleven genes found in the common haplotype region should be considered as positional candidates for the identification of the causative quantitative trait nucleotide. Copy number variation was found in one of these genes, KIAA1683. Four gene variants were identified, but only the number of copies of a specific variant (V1) was significantly associated with breeding values of sires for fertility.
quantitative trait locus (QTL); copy number variation (CNV); KIAA1683; gene-variants; female-fertility
If perturbing two genes together has a stronger or weaker effect than expected, they are said to genetically interact. Genetic interactions are important because they help map gene function, and functionally related genes have similar genetic interaction patterns. Mapping quantitative (positive and negative) genetic interactions on a global scale has recently become possible. This data clearly shows groups of genes connected by predominantly positive or negative interactions, termed monochromatic groups. These groups often correspond to functional modules, like biological processes or complexes, or connections between modules. However, it is not yet known how these patterns globally relate to known functional modules. Here we systematically study the monochromatic nature of known biological processes using the largest quantitative genetic interaction data set available, which includes fitness measurements for ∼5.4 million gene pairs in the yeast Saccharomyces cerevisiae. We find that only 10% of biological processes, as defined by Gene Ontology annotations, and less than 1% of inter-process connections are monochromatic. Further, we show that protein complexes are responsible for a surprisingly large fraction of these patterns. This suggests that complexes play a central role in shaping the monochromatic landscape of biological processes. Altogether, this work shows that both positive and negative monochromatic patterns are found in known biological processes and in their connections and that protein complexes play an important role in these patterns. The monochromatic processes, complexes and connections we find chart a hierarchical and modular map of sensitive and redundant biological systems in the yeast cell that will be useful for gene function prediction and comparison across phenotypes and organisms. Furthermore, the analysis methods we develop are applicable to other species for which genetic interactions will progressively become more available.
Genetic interactions indicate functional dependencies between genes and are a powerful tool to predict gene function. Functionally related genes tend to have similar profiles of genetic interactions. Recently, global scale mapping of quantitative (positive and negative) genetic interactions has been performed. This data clearly shows groups of genes connected by predominantly positive or negative interactions, termed monochromatic groups. These groups often correspond to functional modules, such as biological processes or protein complexes, or connections between modules, but it is not yet known how these patterns globally relate to known functional modules. Here we systematically evaluate the monochromatic nature of known biological processes and their connections in the yeast Saccharomyces cerevisiae. We find that 10% of biological processes and less than 1% of inter-process connections are monochromatic. Further, we show that protein complexes are responsible for a surprisingly large fraction of these monochromatic groups. The monochromatic processes, complexes and connections we find chart a hierarchical and modular map of sensitive and redundant biological systems in the yeast cell that will be useful for gene function prediction and comparison across phenotypes and organisms.
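The notions of "stronger or weaker than expected" and "monochromatic" can be made concrete with a short sketch. It assumes the widely used multiplicative model, in which the expected double-mutant fitness is the product of the single-mutant fitnesses and the interaction score is the deviation from that product. The sign-purity measure and its cutoff are hypothetical simplifications for illustration and are not the statistical test used in the study.

```python
def interaction_score(fit_a: float, fit_b: float, fit_ab: float) -> float:
    """Deviation of double-mutant fitness from the multiplicative expectation.

    Negative: the double mutant is sicker than expected.
    Positive: the double mutant is fitter than expected.
    """
    return fit_ab - fit_a * fit_b


def sign_purity(scores, cutoff: float = 0.08):
    """Balance of positive versus negative interactions within a gene group.

    Returns a value in [-1, 1]: -1 if all confident interactions are negative,
    +1 if all are positive, None if no score passes the (arbitrary) cutoff.
    """
    positive = sum(1 for s in scores if s > cutoff)
    negative = sum(1 for s in scores if s < -cutoff)
    if positive + negative == 0:
        return None
    return (positive - negative) / (positive + negative)


# Hypothetical fitness values (wild type = 1.0) for gene pairs within one process.
pairs = [(0.9, 0.8, 0.50), (0.95, 0.9, 0.60), (0.8, 0.7, 0.55), (0.9, 0.9, 0.40)]
scores = [interaction_score(a, b, ab) for a, b, ab in pairs]
print([round(s, 3) for s in scores], sign_purity(scores))
```

In this hypothetical group three pairs interact negatively and the fourth falls below the cutoff, so the purity is -1 and the group would count as monochromatically negative under this simplified measure.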
Intrinsically disordered regions are widespread, especially in proteomes of higher eukaryotes. Recently, protein disorder has been associated with a wide variety of cellular processes and has been implicated in several human diseases. Despite its apparent functional importance, the sheer range of different roles played by protein disorder often makes its exact contribution difficult to interpret.
We attempt to better understand the different roles of disorder using a novel analysis that leverages both comparative genomics and genetic interactions. Strikingly, we find that disorder can be partitioned into three biologically distinct phenomena: regions where disorder is conserved but with quickly evolving amino acid sequences (flexible disorder); regions of conserved disorder with also highly conserved amino acid sequences (constrained disorder); and, lastly, non-conserved disorder. Flexible disorder bears many of the characteristics commonly attributed to disorder and is associated with signaling pathways and multi-functionality. Conversely, constrained disorder has markedly different functional attributes and is involved in RNA binding and protein chaperones. Finally, non-conserved disorder lacks clear functional hallmarks based on our analysis.
Our new perspective on protein disorder clarifies a variety of previous results by putting them into a systematic framework. Moreover, the clear and distinct functional association of flexible and constrained disorder will allow for new approaches and more specific algorithms for disorder detection in a functional context. Finally, in flexible disordered regions, we demonstrate clear evolutionary selection of protein disorder with little selection on primary structure, which has important implications for sequence-based studies of protein structure and evolution.
Using a structure–function analysis, we find that Rvs proteins are initially recruited to sites of endocytosis through their curvature-sensing and membrane-binding ability in a manner dependent on complex sphingolipids.
BAR domains are protein modules that bind to membranes and promote membrane curvature. One type of BAR domain, the N-BAR domain, contains an additional N-terminal amphipathic helix, which contributes to membrane-binding and bending activities. The only known N-BAR-domain proteins in the budding yeast Saccharomyces cerevisiae, Rvs161 and Rvs167, are required for endocytosis. We have explored the mechanism of N-BAR-domain function in the endocytosis process using a combined biochemical and genetic approach. We show that the purified Rvs161–Rvs167 complex binds to liposomes in a curvature-independent manner and promotes tubule formation in vitro. Consistent with the known role of BAR domain polymerization in membrane bending, we found that Rvs167 BAR domains interact with each other at cortical actin patches in vivo. To characterize N-BAR-domain function in endocytosis, we constructed yeast strains harboring changes in conserved residues in the Rvs161 and Rvs167 N-BAR domains. In vivo analysis of the rvs endocytosis mutants suggests that Rvs proteins are initially recruited to sites of endocytosis through their membrane-binding ability. We show that inappropriate regulation of complex sphingolipid and phosphoinositide levels in the membrane can impinge on Rvs function, highlighting the relationship between membrane components and N-BAR-domain proteins in vivo.
Genetic interactions are highly informative for deciphering the underlying functional principles that govern how genes control cell processes. Recent developments in Synthetic Genetic Array (SGA) analysis enable the mapping of quantitative genetic interactions on a genome-wide scale. To facilitate access to this resource, which will ultimately represent a complete genetic interaction network for a eukaryotic cell, we developed DRYGIN (Data Repository of Yeast Genetic Interactions)—a web database system that aims at providing a central platform for yeast genetic network analysis and visualization. In addition to providing an interface for searching the SGA genetic interactions, DRYGIN also integrates other data sources, in order to associate the genetic interactions with pathway information, protein complexes, other binary genetic and physical interactions, and Gene Ontology functional annotation. DRYGIN version 1.0 currently holds more than 5.4 million measurements of genetic interacting pairs involving ∼4500 genes, and is available at http://drygin.ccbr.utoronto.ca
High-throughput studies have enabled the large-scale mapping of synthetic lethal genetic interaction networks in the budding yeast Saccharomyces cerevisiae (S. cerevisiae). Recently, complementary high-throughput methods have been developed to map genetic interactions in the fission yeast Schizosaccharomyces pombe (S. pombe), enabling comparative analyses of genetic interaction networks between S. pombe and S. cerevisiae, two species separated by hundreds of millions of years of evolution. The resultant data have provided our first view of a possible core genetic interaction network shared between two distantly related eukaryotes, and identified numerous species-specific interactions that may contribute to the unique biology of these two different organisms. These and other results suggest that comparative interactomic studies will provide novel insights into the structure of genetic interaction networks.
pombe; cerevisiae; comparative genomics; synthetic lethal; SGA
A report on the Cold Spring Harbor Laboratory meeting 'Yeast Cell Biology', Cold Spring Harbor, USA, 12-17 August 2003.
Mechanisms for activating the actin-related protein 2/3 (Arp2/3) complex have been the focus of many recent studies. Here, we identify a novel mode of Arp2/3 complex regulation mediated by the highly conserved actin binding protein coronin. Yeast coronin (Crn1) physically associates with the Arp2/3 complex and inhibits WA- and Abp1-activated actin nucleation in vitro. The inhibition occurs specifically in the absence of preformed actin filaments, suggesting that Crn1 may restrict Arp2/3 complex activity to the sides of filaments. The inhibitory activity of Crn1 resides in its coiled coil domain. Localization of Crn1 to actin patches in vivo and association of Crn1 with the Arp2/3 complex also require its coiled coil domain. Genetic studies provide in vivo evidence for these interactions and activities. Overexpression of CRN1 causes growth arrest and redistribution of Arp2 and Crn1p into aberrant actin loops. These defects are suppressed by deletion of the Crn1 coiled coil domain and by arc35-26, an allele of the p35 subunit of the Arp2/3 complex. Further in vivo evidence that coronin regulates the Arp2/3 complex comes from the observation that crn1 and arp2 mutants display an allele-specific synthetic interaction. This work identifies a new form of regulation of the Arp2/3 complex and an important cellular function for coronin.
actin; yeast; coronin; Arp2/3 complex; coiled coil