The capacity of wine yeast to utilize the nitrogen available in grape must directly correlates with the fermentation and growth rates of all wine yeast fermentation stages and is, thus, of critical importance for wine production. Here we precisely quantified the ability of low complexity nitrogen compounds to support fast, efficient and rapidly initiated growth of four commercially important wine strains. Nitrogen substrate abundance in grape must failed to correlate with the rate or the efficiency of nitrogen source utilization, but predicted lag phase length well. Thus, human domestication of yeast for grape must growth has had, at most, a marginal impact on wine yeast growth rates and efficiencies, but may have left a surprising imprint on the time required to adjust metabolism from non-growth to growth. Wine yeast nitrogen source utilization deviated from that observed in lab strain experimentation, and also varied between wine strains. Each wine yeast lineage harbored nitrogen source utilization defects that were private to that strain. By a massive hemizygote analysis, we traced the genetic basis of the most glaring of these defects, the near inability of the PDM wine strain to utilize methionine, to mutations in its ARO8, ADE5,7 and VBA3 alleles. We also identified candidate causative mutations in these genes. The methionine defect of PDM is potentially very interesting as the strain can, in some circumstances, overproduce foul tasting H2S, a trait which likely stems from insufficient methionine catabolization. The poor adaptation of wine yeast to the grape must nitrogen environment, and the presence of defects in each lineage, open up wine strain optimization through biotechnological endeavors.
The number of chromosome sets contained within the nucleus of eukaryotic organisms is a fundamental yet evolutionarily poorly characterized genetic variable of life. Here, we mapped the impact of ploidy on the mitotic fitness of baker's yeast and its never domesticated relative Saccharomyces paradoxus across wide swaths of their natural genotypic and phenotypic space. Surprisingly, environment-specific influences of ploidy on reproduction were found to be the rule rather than the exception. These ploidy–environment interactions were well conserved across the 2 billion generations separating the two species, suggesting that they are the products of strong selection. Previous hypotheses of generalizable advantages of haploidy or diploidy in ecological contexts imposing nutrient restriction, toxin exposure, and elevated mutational loads were rejected in favor of more fine-grained models of the interplay between ecology and ploidy. On a molecular level, cell size and mating type locus composition had equal, but limited, explanatory power, each explaining 12.5%–17% of ploidy–environment interactions. The mechanism of the cell size–based superior reproductive efficiency of haploids during Li+ exposure was traced to the Li+ exporter ENA. Removal of the Ena transporters, forcing dependence on the Nha1 extrusion system, completely altered the effects of ploidy on Li+ tolerance and evoked a strong diploid superiority, demonstrating how genetic variation at a single locus can completely reverse the relative merits of haploidy and diploidy. Taken together, our findings unmasked a dynamic interplay between ploidy and ecology that was of unpredicted evolutionary importance and had multiple molecular roots.
Organisms vary in the number of chromosome sets contained within the nucleus of each cell, but neither the reasons nor the consequences of this variation are well understood. We designed yeasts that differed in the number of chromosome sets but were otherwise identical and mapped the consequences of such ploidy variations during exposure to a large palette of environments. Contrary to commonly held assumptions, we found ploidy effects on the mitotic reproductive capacity of yeast to be the rule rather than the exception and to be highly evolutionarily conserved. Furthermore, our data rejected previously contemplated hypotheses of generalizable advantages of haploidy or diploidy when cells face nutrient starvation or are exposed to toxins or increased mutation rates. We also mapped the molecular processes mediating ploidy–environment interactions, showing that cell size and mating type locus composition had equal explanatory power. Finally, we show that ploidy effects can be mechanistically very subtle, as a designed shift from one plasma membrane Li+ transporter to another completely altered the relative merits of having one or two chromosome sets when exposed to high Li+ concentrations. This complex and dynamic interplay between the number of chromosome sets and the fluctuating environment must be taken into account when considering organismal form and behavior.
Comparative genomics is a formidable tool to identify functional elements throughout a genome. In the past ten years, studies in the budding yeast Saccharomyces cerevisiae and a set of closely related species have been instrumental in showing the benefit of analyzing patterns of sequence conservation. Increasing the number of closely related genome sequences makes the comparative genomics approach more powerful and accurate.
Here, we report the genome sequence and analysis of Saccharomyces arboricolus, a yeast species recently isolated in China, that is closely related to S. cerevisiae. We obtained high quality de novo sequence and assemblies using a combination of next generation sequencing technologies, established the phylogenetic position of this species and considered its phenotypic profile under multiple environmental conditions in the light of its gene content and phylogeny.
We suggest that the genome of S. arboricolus will be useful in future comparative genomics analysis of the Saccharomyces sensu stricto yeasts.
Multivariate approaches have been successfully applied to genome wide association studies. Recently, a Partial Least Squares (PLS) based approach was introduced for mapping yeast genotype-phenotype relations, where background information such as gene function classification, gene dispensability, recent or ancient gene copy number variations and the presence of premature stop codons or frameshift mutations in reading frames was used post hoc to explain selected genes. One of the latest advancements in PLS, named L-Partial Least Squares (L-PLS), where ‘L’ represents the data structure used, enables the use of background information at the modeling level. Here, a modification of L-PLS with variable importance on projection (VIP) was implemented using a stepwise regularized procedure for gene and background information selection. Results were compared to PLS-based procedures, where no background information was used.
Applying the proposed methodology to data from the yeast Saccharomyces cerevisiae, we found the relationship between genotype and phenotype to have improved understandability. Phenotypic variations were explained by the variations of relatively stable genes and stable background variations. The suggested procedure provides an automatic way for genotype-phenotype mapping. The selected phenotype influencing genes were evolving 29% faster than non-influential genes, and the current results are supported by a recently conducted study. Further power analysis on simulated data verified that the proposed methodology selects relevant variables.
A modification of L-PLS with VIP in a stepwise regularized elimination procedure can improve the understandability and stability of selected genes and background information. The approach is recommended for genome wide association studies where background information is available.
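The variable importance on projection (VIP) score mentioned above can be illustrated with its standard PLS definition. The following is a minimal sketch under stated assumptions: the function name `vip_scores`, the input layout and the toy numbers are hypothetical, and the abstract does not say that the authors computed VIP in exactly this way or which cutoff they used in the stepwise elimination.

```python
import math

def vip_scores(weights, explained_ss):
    """Standard VIP score per variable.

    weights[a][j]   : loading weight of variable j on PLS component a
    explained_ss[a] : response sum of squares explained by component a
    """
    if not weights or len(weights) != len(explained_ss):
        raise ValueError("need one explained sum of squares per component")
    n_vars = len(weights[0])
    total_ss = sum(explained_ss)
    if total_ss <= 0:
        raise ValueError("components must explain some variation")
    scores = []
    for j in range(n_vars):
        acc = 0.0
        for w, ss in zip(weights, explained_ss):
            norm_sq = sum(v * v for v in w)      # squared length of the weight vector
            acc += ss * (w[j] ** 2) / norm_sq    # contribution of component a to variable j
        scores.append(math.sqrt(n_vars * acc / total_ss))
    return scores

# Toy example with made-up numbers: two components, two variables.
toy_vip = vip_scores([[1.0, 0.0], [0.0, 1.0]], [3.0, 1.0])
```

With this definition the squared scores average to one across variables, which is why a cutoff near 1 is a common, though not universal, convention for discarding variables.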
Saccharomyces cerevisiae is the main microorganism responsible for wine alcoholic fermentation. The oenological phenotypes resulting from fermentation, such as the production of acetic acid, glycerol, and residual sugar concentration, are regulated by multiple genes and vary quantitatively between different strain backgrounds. With the aim of identifying the quantitative trait loci (QTLs) that regulate oenological phenotypes, we performed linkage analysis using three crosses between highly diverged S. cerevisiae strains. Segregants from each cross were used as starter cultures for 20-day fermentations, in synthetic wine must, to simulate actual winemaking conditions. Linkage analysis on phenotypes of primary industrial importance resulted in the mapping of 18 QTLs. We tested 18 candidate genes, by reciprocal hemizygosity, for their contribution to the observed phenotypic variation, and validated five genes and the chromosome II right subtelomeric region. We observed that genes involved in mitochondrial metabolism, sugar transport, nitrogen metabolism, and the uncharacterized ORF YJR030W explained most of the phenotypic variation in oenological traits. Furthermore, we experimentally validated an exceptionally strong epistatic interaction, resulting in a high level of succinic acid, between the Sake FLX1 allele and the Wine/European MDH2 allele. Overall, our work demonstrates the complex genetic basis underlying wine traits, including natural allelic variation, antagonistic linked QTLs and complex epistatic interactions between alleles from strains with different evolutionary histories.
Genetic variation for plastic phenotypes potentially contributes phenotypic variation to populations that can be selected during adaptation to novel ecological contexts. However, the basis and extent of plastic variation that manifests in diverse environments remains elusive. Here, we characterize copper reaction norms for mRNA abundance among five Saccharomyces cerevisiae strains to 1) describe population variation across the full range of ecologically relevant copper concentrations, from starvation to toxicity, and 2) to test the hypothesis that plastic networks exhibit increased population variation for gene expression. We find that although the vast majority of the variation is small in magnitude (considerably <2-fold), not just some, but most genes demonstrate variable expression across environments, across genetic backgrounds, or both. Plastically expressed genes included both genes regulated directly by copper-binding transcription factors Mac1 and Ace1 and genes indirectly responding to the downstream metabolic consequences of the copper gradient, particularly genes involved in copper, iron, and sulfur homeostasis. Copper-regulated gene networks exhibited more similar behavior within the population in environments where those networks have a large impact on fitness. Nevertheless, expression variation in genes like Cup1, important to surviving copper stress, was linked with variation in mitotic fitness and in the breadth of differential expression across the genome. By revealing a broader and deeper range of population variation, our results provide further evidence for the interconnectedness of genome-wide mRNA levels, their dependence on environmental context and genetic background, and the abundance of variation in gene expression that can contribute to future evolution.
gene expression variation; environmental plasticity; copper; mRNA population genetic variation; phenotypic evolution
Gene finding is a complicated procedure that encapsulates algorithms for coding sequence modeling, identification of promoter regions, issues concerning overlapping genes and more. In the present study we focus on coding sequence modeling algorithms; that is, algorithms for identification and prediction of the actual coding sequences from genomic DNA. In this respect, we promote a novel multivariate method known as Canonical Powered Partial Least Squares (CPPLS) as an alternative to the commonly used Interpolated Markov model (IMM). Comparisons between the methods were performed on DNA, codon and protein sequences with highly conserved genes taken from several species with different genomic properties.
The multivariate CPPLS approach classified coding sequence substantially better than the commonly used IMM on the same set of sequences. We also found that the use of CPPLS with codon representation gave significantly better classification results than both IMM with protein (p < 0.001) and with DNA (p < 0.001). Further, although the mean performance was similar, the variation of CPPLS performance on codon representation was significantly smaller than for IMM (p < 0.001).
The performance of coding sequence modeling can be substantially improved by using an algorithm based on the multivariate CPPLS method applied to codon or DNA frequencies.
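The codon representation referred to above can be sketched as a simple feature extraction step. This is an illustrative assumption only: the abstract does not describe how the codon frequencies were built, and the function name `codon_frequencies` and the toy sequence are hypothetical. The sketch reads codons in frame from the first base and ignores any trailing bases that do not complete a codon.

```python
from collections import Counter

def codon_frequencies(seq):
    """Relative frequency of each in-frame codon in a DNA sequence."""
    seq = seq.upper()
    usable = len(seq) - len(seq) % 3                 # drop an incomplete trailing codon
    codons = [seq[i:i + 3] for i in range(0, usable, 3)]
    if not codons:
        raise ValueError("sequence is shorter than one codon")
    counts = Counter(codons)
    total = len(codons)
    return {codon: n / total for codon, n in counts.items()}

# Toy sequence (made up): ATG AAA AAA TAG
toy_freqs = codon_frequencies("ATGAAAAAATAG")
```

A vector of such frequencies, one entry per codon, is the kind of input a multivariate classifier could be trained on; codons absent from a sequence would simply be given a frequency of zero.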
Conditional temperature-sensitive (ts) mutations are valuable reagents for studying essential genes in the yeast Saccharomyces cerevisiae. We constructed 787 ts strains, covering 497 (~45%) of the 1,101 essential yeast genes, with ~30% of the genes represented by multiple alleles. All of the alleles are integrated into their native genomic locus in the S288C common reference strain and are linked to a kanMX selectable marker, allowing further genetic manipulation by synthetic genetic array (SGA)–based, high-throughput methods. We show two such manipulations: barcoding of 440 strains, which enables chemical-genetic suppression analysis, and the construction of arrays of strains carrying different fluorescent markers of subcellular structure, which enables quantitative analysis of phenotypes using high-content screening. Quantitative analysis of a GFP-tubulin marker identified roles for cohesin and condensin genes in spindle disassembly. This mutant collection should facilitate a wide range of systematic studies aimed at understanding the functions of essential genes.
In genomics, a commonly encountered problem is to extract a subset of variables out of a large set of explanatory variables associated with one or several quantitative or qualitative response variables. An example is to identify associations between codon-usage and phylogeny based definitions of taxonomic groups at different taxonomic levels. Maximum understandability with the smallest number of selected variables, consistency of the selected variables, as well as variation of model performance on test data, are issues to be addressed for such problems.
We present an algorithm balancing the parsimony and the predictive performance of a model. The algorithm is based on variable selection using reduced-rank Partial Least Squares with a regularized elimination. Allowing a marginal decrease in model performance results in a substantial decrease in the number of selected variables. This significantly improves the understandability of the model. Within the approach we have tested and compared three different criteria commonly used in the Partial Least Squares modeling paradigm for variable selection: loading weights, regression coefficients and variable importance on projections. The algorithm is applied to a problem of identifying codon variations discriminating different bacterial taxa, which is of particular interest in classifying metagenomics samples. The results are compared with a classical forward selection algorithm, the much used Lasso algorithm, and Soft-threshold Partial Least Squares variable selection.
A regularized elimination algorithm based on Partial Least Squares increases understandability and consistency and reduces the classification error on test data compared to standard approaches.
The fission yeast Schizosaccharomyces pombe has been widely used to study eukaryotic cell biology, but almost all of this work has used derivatives of a single strain. We have studied 81 independent natural isolates and 3 designated laboratory strains of Schizosaccharomyces pombe. Schizosaccharomyces pombe varies significantly in size but shows only limited variation in proliferation in different environments compared with Saccharomyces cerevisiae. Nucleotide diversity, π, at a near neutral site, the central core of the centromere of chromosome II, is approximately 0.7%. Approximately 20% of the isolates showed karyotypic rearrangements as detected by pulsed field gel electrophoresis and filter hybridization analysis. One translocation, found in 6 different isolates, including the type strain, has a geographically widespread distribution and a unique haplotype and may be a marker of an incipient speciation event. All of the other translocations are unique. Exploitation of this karyotypic diversity may cast new light on both the biology of telomeres and centromeres and on isolating mechanisms in single-celled eukaryotes.
pombe; karyotype; diversity; fission yeast
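Nucleotide diversity, π, as quoted in the abstract above, is conventionally the average number of pairwise differences per site among aligned sequences. The sketch below shows that textbook calculation; the function name `nucleotide_diversity` and the toy alignment are hypothetical, and it assumes gap-free sequences of equal length, which may differ from how the authors handled their data.

```python
from itertools import combinations

def nucleotide_diversity(seqs):
    """Average pairwise differences per site among aligned sequences."""
    if len(seqs) < 2:
        raise ValueError("need at least two sequences")
    length = len(seqs[0])
    if length == 0 or any(len(s) != length for s in seqs):
        raise ValueError("sequences must be non-empty and of equal length")
    # Count mismatching sites over every unordered pair of sequences.
    total_diffs = sum(
        sum(a != b for a, b in zip(s1, s2))
        for s1, s2 in combinations(seqs, 2)
    )
    n_pairs = len(seqs) * (len(seqs) - 1) // 2
    return total_diffs / (n_pairs * length)

# Toy alignment (made up): three sequences of ten sites.
toy_pi = nucleotide_diversity(["ACGTACGTAC", "ACGTACGTAT", "ACGAACGTAC"])
```

On this scale, a value of 0.007 would correspond to the roughly 0.7% reported for the centromere core.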
Multivariate approaches are important due to their versatility and applications in many fields, as they provide decisive advantages over univariate analysis in many ways. Genome wide association studies are rapidly emerging, but the approaches at hand pay less attention to the multivariate relation between genotype and phenotype. We introduce a methodology based on a BLAST approach for extracting information from genomic sequences and Soft-Thresholding Partial Least Squares (ST-PLS) for mapping genotype-phenotype relations.
Applying this methodology to an extensive data set for the model yeast Saccharomyces cerevisiae, we found that the relationship between genotype and phenotype involves surprisingly few genes, in the sense that an overwhelmingly large fraction of the phenotypic variation can be explained by variation in less than 1% of the full gene reference set containing 5791 genes. These phenotype influencing genes were evolving 20% faster than non-influential genes and were unevenly distributed over cellular functions, with strong enrichments in functions such as cellular respiration and transposition. These genes were also enriched with known paralogs, stop codon variations and copy number variations, suggesting that such molecular adjustments have had a disproportionate influence on the recent adaptation of Saccharomyces yeasts to environmental changes in their ecological niche.
The BLAST and PLS based multivariate approach produced results that adhere to the known yeast phylogeny and gene ontology and thus verify that the methodology extracts a set of fast evolving genes that capture the phylogeny of the yeast strains. The approach is worth pursuing, and future investigations should be made to improve the computation of genotype signals as well as the variable selection procedure within the PLS framework.
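The soft-thresholding idea behind ST-PLS can be illustrated on a single vector of loading weights. The sketch below follows the commonly described recipe (scale by the largest absolute weight, shrink every weight towards zero by a fraction `delta`, then renormalize); the function name, the parameter name `delta` and the toy numbers are hypothetical, and the abstract does not confirm that the authors used exactly this form.

```python
import math

def soft_threshold_weights(weights, delta):
    """Shrink loading weights towards zero and renormalize to unit length.

    delta in [0, 1) is the shrinkage level; weights whose scaled magnitude
    does not exceed delta are set exactly to zero (variable deselected).
    """
    if not 0.0 <= delta < 1.0:
        raise ValueError("delta must lie in [0, 1)")
    largest = max(abs(w) for w in weights)
    if largest == 0:
        raise ValueError("all weights are zero")
    shrunk = []
    for w in weights:
        magnitude = abs(w) / largest - delta
        # Keep the original sign; clip small weights to exactly zero.
        shrunk.append(math.copysign(magnitude, w) if magnitude > 0 else 0.0)
    norm = math.sqrt(sum(v * v for v in shrunk))
    return [v / norm for v in shrunk]

# Toy weights (made up): only the dominant variable survives delta = 0.5.
toy_weights = soft_threshold_weights([0.8, -0.4, 0.1], 0.5)
```

Raising `delta` removes more variables from each component, which is one way a small subset of genes could come to carry most of the modeled phenotypic variation.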
A fundamental goal in biology is to achieve a mechanistic understanding of how and to what extent ecological variation imposes selection for distinct traits and favors the fixation of specific genetic variants. Key to such an understanding is the detailed mapping of the natural genomic and phenomic space and a bridging of the gap that separates these worlds. Here we chart a high-resolution map of natural trait variation in one of the most important genetic model organisms, the budding yeast Saccharomyces cerevisiae, and its closest wild relatives and trace the genetic basis and timing of major phenotype changing events in its recent history. We show that natural trait variation in S. cerevisiae exceeds that of its relatives, despite limited genetic variation, and follows the population history rather than the source environment. In particular, the West African population is phenotypically unique, with an extreme abundance of low-performance alleles, notably a premature translational termination signal in GAL3 that causes inability to utilize galactose. Our observations suggest that many S. cerevisiae traits may be the consequence of genetic drift rather than selection, in line with the assumption that natural yeast lineages are remnants of recent population bottlenecks. Disconcertingly, the universal type strain S288C was found to be highly atypical, highlighting the danger of extrapolating gene-trait connections obtained in mosaic, lab-domesticated lineages to the species as a whole. Overall, this study represents a step towards an in-depth understanding of the causal relationship between co-variation in ecology, selection pressure, natural traits, molecular mechanism, and alleles in a key model organism.
An overall aim in modern biology is to achieve an in-depth understanding of an organism's physiology in the context of its ecology and historic selective pressures that have been acting on its genome. The baker's yeast, Saccharomyces cerevisiae, has a peculiar life history completely dominated by clonal reproduction and self-fertilization, prompting the suggestion that natural yeasts are remnants of repeated population bottlenecks in essentially clonal lineages. Such a life history dominated by mitotic proliferation purports a strong evolutionary influence of genetic drift and predicts trait variation to be high and largely defined by the genetic history of each population. Here we chart a highly resolved map of natural trait variation in S. cerevisiae and its closest non-domesticated relative, Saccharomyces paradoxus, and confirm this prediction. We found that trait variation in budding yeast is indeed high and largely defined by population rather than source environment. In particular, the West African population was found to be phenotypically unique with an extreme abundance of low-performance alleles. Our findings support the idea of population bottlenecks in the recent yeast evolutionary history and a large influence of genetic drift.
Despite a century of research and increasing environmental and human health concerns, the mechanistic basis of the toxicity of derivatives of the metalloid tellurium, Te, in particular the oxyanion tellurite, Te(IV), remains unsolved. Here, we provide an unbiased view of the mechanisms of tellurium metabolism in the yeast Saccharomyces cerevisiae by measuring deviations in Te-related traits of a complete collection of gene knockout mutants. Reduction of Te(IV) and intracellular accumulation as metallic tellurium strongly correlated with loss of cellular fitness, suggesting that Te(IV) reduction and toxicity are causally linked. The sulfate assimilation pathway upstream of Met17, in particular, the sulfite reductase and its cofactor siroheme, was shown to be central to tellurite toxicity and its reduction to elemental tellurium. Gene knockout mutants with altered Te(IV) tolerance also showed a similar deviation in tolerance to both selenite and, interestingly, selenomethionine, suggesting that the toxicity of these agents stems from a common mechanism. We also show that Te(IV) reduction and toxicity in yeast is partially mediated via a mitochondrial respiratory mechanism that does not encompass the generation of substantial oxidative stress. The results reported here represent a robust base from which to attack the mechanistic details of Te(IV) toxicity and reduction in a eukaryotic organism.
Eukaryotic translation initiation factor 4G (eIF4G) is thought to influence the translational efficiencies of cellular mRNAs by its roles in forming an eIF4F-mRNA-PABP mRNP that is competent for attachment of the 43S preinitiation complex, and in scanning through structured 5' UTR sequences. We have tested this hypothesis by determining the effects of genetically depleting eIF4G from yeast cells on global translational efficiencies (TEs), using gene expression microarrays to measure the abundance of mRNA in polysomes relative to total mRNA for ~5900 genes.
Although depletion of eIF4G is lethal and reduces protein synthesis by ~75%, it had small effects (less than a factor of 1.5) on the relative TE of most genes. Within these limits, however, depleting eIF4G narrowed the range of translational efficiencies genome-wide, with mRNAs of better than average TE being translated relatively worse, and mRNAs with lower than average TE being translated relatively better. Surprisingly, the fraction of mRNAs most dependent on eIF4G displays an average 5' UTR length at or below the mean for all yeast genes.
This finding suggests that eIF4G is more critical for ribosome attachment to mRNAs than for scanning long, structured 5' UTRs. Our results also indicate that eIF4G, and the closed-loop mRNP it assembles with the m7 G cap- and poly(A)-binding factors (eIF4E and PABP), is not essential for translation of most (if not all) mRNAs but enhances the differentiation of translational efficiencies genome-wide.
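The translational efficiency (TE) measure described above, polysomal mRNA abundance relative to total mRNA, and the idea of a narrowed TE range can be sketched as follows. The function names and all abundance values are hypothetical, and the toy numbers are deliberately exaggerated for readability; they are not the study's data, and the authors' normalization may differ.

```python
import math

def translational_efficiency(polysomal, total):
    """TE per gene = polysomal mRNA abundance / total mRNA abundance."""
    return {g: polysomal[g] / total[g] for g in polysomal if total.get(g, 0) > 0}

def log2_relative_te(te):
    """log2 TE of each gene relative to the mean log2 TE (assumes TE > 0)."""
    logs = {g: math.log2(v) for g, v in te.items()}
    mean_log = sum(logs.values()) / len(logs)
    return {g: x - mean_log for g, x in logs.items()}

def te_range(relative_te):
    """Spread between the best and worst translated genes, in log2 units."""
    return max(relative_te.values()) - min(relative_te.values())

# Made-up abundances for three genes before and after depletion.
total = {"geneA": 100.0, "geneB": 100.0, "geneC": 100.0}
poly_wt = {"geneA": 400.0, "geneB": 100.0, "geneC": 25.0}
poly_depleted = {"geneA": 200.0, "geneB": 100.0, "geneC": 50.0}

range_wt = te_range(log2_relative_te(translational_efficiency(poly_wt, total)))
range_depleted = te_range(log2_relative_te(translational_efficiency(poly_depleted, total)))
```

In this toy case the well translated gene loses relative TE and the poorly translated gene gains it, so the range shrinks, which is the qualitative pattern the abstract reports.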
In the global osmoshock translational response in yeast, some gene products were translationally mobilized without transcriptional up-regulation. Conversely, other transcriptionally up-regulated mRNAs were translationally inhibited. Analogous changes occurred on the protein level. These translational responses were strongly dependent on Hog1 and Rck2.
Cellular responses to environmental changes occur on different levels. We investigated the translational response of yeast cells after mild hyperosmotic shock by isolating mRNA associated with multiple ribosomes (polysomes) followed by array analysis. Globally, recruitment of preexisting mRNAs to ribosomes (translational response) is faster than the transcriptional response. Specific functional groups of mRNAs are recruited to ribosomes without any corresponding increase in total mRNA. Among mRNAs under strong translational up-regulation upon shock, transcripts encoding membrane-bound proteins including hexose transporters were enriched. Similarly, numerous mRNAs encoding cytoplasmic ribosomal proteins run counter to the overall trend of down-regulation and are instead translationally mobilized late in the response. Surprisingly, certain transcriptionally induced mRNAs were excluded from ribosomal association after shock. Importantly, we verify, using constructs with intact 5′ and 3′ untranslated regions, that the observed changes in polysomal mRNA are reflected in protein levels, including cases with only translational up-regulation. Interestingly, the translational regulation of the most highly osmostress-regulated mRNAs was more strongly dependent on the stress-activated protein kinases Hog1 and Rck2 than the transcriptional regulation. Our results show the importance of translational control for fine tuning of the adaptive responses.
Since the completion of the genome sequence of Saccharomyces cerevisiae in 1996 [1,2], there has been an exponential increase in complete genome sequences accompanied by great advances in our understanding of genome evolution. Although little is known about the natural and life histories of yeasts in the wild, there are an increasing number of studies looking at ecological and geographic distributions [3,4], population structure [5-8], and sexual versus asexual reproduction [9,10]. Less well understood at the whole genome level are the evolutionary processes acting within populations and species leading to adaptation to different environments, phenotypic differences and reproductive isolation. Here we present one- to four-fold or more coverage of the genome sequences of over seventy isolates of the baker's yeast, S. cerevisiae, and its closest relative, S. paradoxus. We examine variation in gene content, SNPs, indels, copy numbers and transposable elements. We find that phenotypic variation broadly correlates with global genome-wide phylogenetic relationships. Interestingly, S. paradoxus populations are well delineated along geographic boundaries while the variation among worldwide S. cerevisiae isolates shows less differentiation and is comparable to a single S. paradoxus population. Rather than one or two domestication events leading to the extant baker's yeasts, the population structure of S. cerevisiae consists of a few well-defined geographically isolated lineages and many different mosaics of these lineages, supporting the idea that human influence provided the opportunity for cross-breeding and production of new combinations of pre-existing variation.
Cellular signalling networks integrate environmental stimuli with the information on cellular status. These networks must be robust against stochastic fluctuations in stimuli as well as in the amounts of signalling components. Here, we challenge the yeast HOG signal-transduction pathway with systematic perturbations in components' expression levels under various external conditions in search of nodes of fragility. We observe a substantially higher frequency of fragile nodes in this signal-transduction pathway than has been observed for other cellular processes. These fragilities disperse without any clear pattern over biochemical functions or location in pathway topology and they are largely independent of pathway activation by external stimuli. However, the strongest toxicities are caused by pathway hyperactivation. In silico analysis highlights the impact of model structure on in silico robustness, and suggests complex formation and scaffolding as important contributors to the observed fragility patterns. Thus, in vivo robustness data can be used to discriminate and improve mathematical models.
gTow; HOG; robustness; signal transduction; systems biology
A fundamental goal in chemical biology is the elucidation of on- and off-target effects of drugs and biocides. To this end, chemogenetic screens that quantify drug induced changes in cellular fitness, typically taken as changes in composite growth, are commonly applied.
Using the model organism Saccharomyces cerevisiae, we here report that resolving cellular growth dynamics into its individual components, growth lag, growth rate and growth efficiency, increases the predictive power of chemogenetic screens. For both drug-drug and gene-drug interactions, the individual growth variables captured distinct and only partially overlapping aspects of cell physiology. In fact, the impact on cellular growth dynamics represented functionally distinct chemical fingerprints.
Our findings suggest that the resolution and quantification of all facets of growth increases the informational and interpretational output of chemogenetic screening. Hence, by facilitating a physiologically more complete analysis of gene-drug and drug-drug interactions, the results reported here may simplify the assignment of mode-of-action to orphan bioactive compounds.
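The decomposition of a growth curve into lag, rate and efficiency can be sketched with one simple set of working definitions. These definitions are assumptions for illustration, not taken from the abstract: rate is the steepest slope of log density over a sliding window, lag is the time at which that tangent crosses the initial log density, and efficiency is the total gain in density. The function names and the synthetic curve are hypothetical.

```python
import math

def _fit_line(ts, ys):
    """Least-squares slope plus the mean point the fitted line passes through."""
    t_mean = sum(ts) / len(ts)
    y_mean = sum(ys) / len(ys)
    num = sum((t - t_mean) * (y - y_mean) for t, y in zip(ts, ys))
    den = sum((t - t_mean) ** 2 for t in ts)
    return num / den, t_mean, y_mean

def growth_variables(times, densities, window=3):
    """Return (lag, rate, efficiency) for one growth curve."""
    if window < 2 or len(times) != len(densities) or len(times) < window:
        raise ValueError("need matching series at least one window long")
    logs = [math.log(d) for d in densities]
    best = None
    for i in range(len(times) - window + 1):
        fit = _fit_line(times[i:i + window], logs[i:i + window])
        if best is None or fit[0] > best[0]:     # keep the steepest window
            best = fit
    rate, t_mid, y_mid = best
    if rate <= 0:
        raise ValueError("no growth detected")
    lag = t_mid - (y_mid - logs[0]) / rate       # tangent meets the starting level
    efficiency = max(densities) - densities[0]   # total gain in density
    return lag, rate, efficiency

# Synthetic curve: flat until t = 2, exponential (rate 0.5) until t = 6, then flat.
times = list(range(11))
densities = [0.1 * math.exp(0.5 * (min(max(t, 2), 6) - 2)) for t in times]
lag, rate, efficiency = growth_variables(times, densities)
```

Because the three values are read from different parts of the curve, a compound can change one of them while leaving the others untouched, which is the kind of partially overlapping information the abstract describes.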
Connecting genotype to phenotype is fundamental in biomedical research and in our understanding of disease. Phenomics—the large-scale quantitative phenotypic analysis of genotypes on a genome-wide scale—connects automated data generation with the development of novel tools for phenotype data integration, mining and visualization. Our yeast phenomics database PROPHECY is available at . Via phenotyping of 984 heterozygous diploids for all essential genes the genotypes analysed and presented in PROPHECY have been extended and now include all genes in the yeast genome. Further, phenotypic data from gene overexpression of 574 membrane spanning proteins has recently been included. To facilitate the interpretation of quantitative phenotypic data we have developed a new phenotype display option, the Comparative Growth Curve Display, where growth curve differences for a large number of mutants compared with the wild type are easily revealed. In addition, PROPHECY now offers a more informative and intuitive first-sight display of its phenotypic data via its new summary page. We have also extended the arsenal of data analysis tools to include dynamic visualization of phenotypes along individual chromosomes. PROPHECY is an initiative to enhance the growing field of phenome bioinformatics.
Despite a strong evolutionary pressure to reduce genome size, proteins vary in length over a surprisingly wide range even in very compact genomes. Here we investigated the evolutionary forces that act on protein size in the yeast Saccharomyces cerevisiae utilizing a system-wide bioinformatics approach. Data on yeast protein size was compared to global experimental data on protein expression, phenotypic pleiotropy, protein-protein interactions, protein evolutionary rate and biochemical classification.
Comparing the experimentally determined abundance of individual proteins, highly expressed proteins were found to be consistently smaller than lowly expressed proteins, in accordance with the biosynthetic cost minimization hypothesis. Yeast proteins able to maintain a high expression level despite a large size tended to belong to a very distinct set of protein families, notably nuclear transport and translation initiation/elongation. Large proteins have significantly more protein-protein interactions than small proteins, suggesting that a requirement for multiple interaction domains may constitute a positive selective pressure for large protein size in yeast. The higher frequency of protein-protein interactions in large proteins was not accompanied by a higher phenotypic pleiotropy. Hence, the increase in interactions may not reflect an increase in function differentiation. Proteins of different sizes also evolved at similar rates. Finally, whereas the biological process involved was found to have little influence on protein size the biochemical activity exerted by the protein represented a dominant factor. More than one third of all biochemical activity classes were enriched in one or more size intervals.
In yeast, there is an inverse relationship between protein size and protein expression such that highly expressed proteins tend to be of smaller size. Also, protein size is moderately affected by protein connectivity and strongly affected by biochemical activity. Phenotypic pleiotropy does not seem to affect protein size.
The N-terminal acetyltransferase NatB in Saccharomyces cerevisiae consists of the catalytic subunit Nat3p and the associated subunit Mdm20p. We here extend our present knowledge about the physiological role of NatB by a combined proteomics and phenomics approach. We found that strains deleted for either NAT3 or MDM20 displayed different growth rates and morphologies in specific stress conditions, demonstrating that the two NatB subunits have partly individual functions. Earlier reported phenotypes of the nat3Δ strain have been associated with altered functionality of actin cables. However, we found that point mutants of tropomyosin that suppress the actin cable defect observed in nat3Δ only partially restore wild-type growth and morphology, indicating the existence of functionally important acetylations unrelated to actin cable function. Predicted NatB substrates were dramatically overrepresented in a distinct set of biological processes, mainly related to DNA processing and cell cycle progression. Three of these proteins, Cac2p, Pac10p, and Swc7p, were identified as true NatB substrates. To identify N-terminal acetylations potentially important for protein function, we performed a large-scale comparative phenotypic analysis including nat3Δ and strains deleted for the putative NatB substrates involved in cell cycle regulation and DNA processing. By this procedure we predicted functional importance of the N-terminal acetylation for 31 proteins.