PLoS Comput Biol. 2010 July; 6(7): e1000865.
Published online 2010 July 29. doi:  10.1371/journal.pcbi.1000865
PMCID: PMC2912337

A Comprehensive, Quantitative, and Genome-Wide Model of Translation

Yitzhak Pilpel, Editor

Abstract

Translation is still poorly characterised at the level of individual proteins, and its role in the regulation of gene expression has been consistently underestimated. To better understand the process of protein synthesis, we developed a comprehensive and quantitative model of translation that characterises protein synthesis separately for individual genes. The main advantage of the model is that, although it is based on only a few datasets and general assumptions, it allows the calculation of many important translational parameters that are extremely difficult to measure experimentally. In the model, each gene is assigned a set of translational parameters, namely the absolute number of transcripts, ribosome density, mean codon translation time, total transcript translation time, total time required for translation initiation and elongation, translation initiation rate, mean mRNA lifetime, and absolute number of proteins produced by gene transcripts. Most parameters were calculated from a single experimental dataset of genome-wide ribosome profiling. The model was implemented in Saccharomyces cerevisiae, and its results were compared with available data, yielding reasonably good correlations. The calculated coefficients were used to perform a global analysis of translation in yeast, revealing some interesting aspects of the process. We show that two commonly used measures of translation efficiency, ribosome density and number of protein molecules produced, are affected by two distinct factors. High values of both measures are caused, among other factors, by very short translation initiation times; however, the origins of the reduced initiation time are completely different in the two cases. The model is universal and can be applied to any organism if the necessary input data are available. It allows better integration of transcriptomic and proteomic data. A few other possible uses of the model are discussed using the yeast system as an example.

Author Summary

Translation is the production of proteins by decoding the mRNA produced in transcription, and is part of the overall process of gene expression. Although the general theoretical background of translation is known, the process is still poorly characterised at the level of individual proteins. In particular, the quantitative parameters of translation, such as the time required to complete it or the number of protein molecules produced from a transcript during its lifetime, are extremely difficult to measure experimentally. To overcome this problem, we developed a computational model that, on the basis of only a few datasets and general assumptions, quantifies translational activity at the level of individual genes. We discuss it using the yeast system as an example; however, it can be applied to any organism with a known genome. We used the results to study the general characteristics of the yeast translational system, revealing the diversity of strategies of gene expression regulation. We also illustrate and discuss other possible uses of the model, as it may help in examining protein-protein interactions, metabolic pathways, gene annotation, ribosome queuing, protein folding, and translation initiation. It may also be crucial for better integration of cell-wide, high-throughput experiments.

Introduction

The rate of translation differs for individual proteins, reflecting both the intrinsic capability of an mRNA molecule to be translated and the environmental factors affecting the efficiency of the translation process. The first is well characterised in other studies [1]–[3] that discuss mRNA features responsible for the regulation of translation (e.g., length of the 5′ UTR, presence and location of uORFs, type and number of initiation codons, sequence context around the initiation codon, presence and location of mRNA secondary structure elements, codon usage, mRNA stability, and posttranscriptional modifications). The second describes the features of the environment in which translation occurs, namely the amounts of particular mRNA transcripts in a cell, the accessibility of the translation machinery elements required to initiate and accomplish protein synthesis (such as free ribosomes, tRNAs, and elongation factors), as well as growth conditions, which have been proven to evoke gene-specific translational control [4].

Although the general theoretical background of translation is known, the process of protein synthesis is still poorly characterised at the level of individual proteins. Experimental determination of absolute translation rates (i.e., in time units) is a tremendous task and we are not aware of any such research. Even though the factors specified above have been studied separately for some proteins, little is known about the extent to which they affect the process and how they cooperate to keep the synthesis rate at the required level. Another strategy to examine translation activity is to integrate genome-wide expression datasets from different sources [5]–[8]. However, it was shown [9] that these datasets cannot be used to predict translation rates at the level of individual proteins, as they suffer from large random errors and systematic shifts in reported values.

In practice, with the development of techniques to examine the transcriptome experimentally (microarrays, Northern blotting, RNA-seq, etc.), mRNA concentration has become a widely used proxy for protein abundance. Nevertheless, recent research indicates that there is only a partial correlation between mRNA and protein abundances [10]–[16]. It was shown that the mRNA transcription level can explain only 20–40% of the observed amounts of proteins [17], [18], which leads to the conclusion that the role of translation in the regulation of gene expression has been consistently underestimated. Thus, a deeper insight into the process of translation is required to better integrate transcriptomic and proteomic data [19]–[21].

In this study, we developed a model to measure absolute translational activity at the level of individual genes. The model was implemented in Saccharomyces cerevisiae; however, it can be used to study translation in any other organism with a known genome, provided that the following data are available: (i) a dataset of mRNA relative abundance and ribosome footprints; (ii) tRNA decoding specificities; (iii) average cell volume; (iv) average number of active ribosomes in a cell; (v) average number of mRNA transcripts in a cell; and (vi) optionally, a dataset of mRNA half-lives.

In our calculations for the yeast system the first dataset came from one genome-wide experiment provided by Ingolia et al. [22], quantifying simultaneously mRNA abundance and ribosome footprints by means of deep sequencing. This method is thought to provide a far more precise measurement of transcript levels than other hybridisation or sequence-based approaches [23]. Based on this dataset, we determined the absolute time of translation, in SI units, for individual genes. The time is the sum of the time required to accomplish two main steps of protein translation: initiation and elongation. Analysing the initiation or elongation time alone provides quantitative information on the extent of translation regulation at these two steps separately. Moreover, by introducing mRNA concentrations into the model, one can calculate the relative rate of translation initiation, which does not depend on the transcriptional level of a corresponding gene. Assuming identical conditions for all mRNAs in the cell (i.e., equal amounts of available ribosomes, elongation factors, tRNAs, etc.), the measure will reflect the mRNA's intrinsic ability (in relation to other analysed mRNAs) to regulate the efficiency of translation initiation. Such a deep insight into the process of initiation is particularly important, as this step of protein synthesis is thought to be the main and rate-limiting target for translational control [24]. Furthermore, by combining our results with a dataset on mRNA stability [25], we calculated the absolute amounts of protein produced from each transcript during its lifespan.

We compared our results with direct experimental studies measuring the mRNA and protein levels of chosen genes. Good correlation with most of the experimental data was observed, and calculated mRNA and protein abundances did not differ significantly from those reported in vivo. In addition, other calculated parameters of translation, such as the overall rate of protein synthesis, were in agreement with earlier reports.

The calculated translational parameters were also used to study the general characteristics of the yeast translational system, revealing the diversity of strategies of gene expression regulation. For instance, we showed that two commonly used measures of translation efficiency, ribosome density and number of protein molecules produced, are affected by two distinct factors. We observed strong negative correlations between the values of both measures and translation initiation time; however, the origins of the reduced initiation time for the most efficient transcripts are completely different. In the case of elevated ribosome density, short initiation is caused mostly by the intrinsic capability of the mRNA to be translated, discussed at the beginning of this section. In contrast, in the case of a high number of protein molecules produced, short initiation is caused primarily by elevated mRNA concentrations.

Finally, we exemplified and discussed other possible ways of model utilisation, as the model may be of considerable help in examining gene expression regulation, protein-protein interactions, metabolic pathways, gene annotation, ribosome queuing, protein folding, and translation initiation. Additionally, the model provides an overall and quantitative picture of the translation process, crucial for better integration of transcriptomic and proteomic data from high-throughput experiments.

Results

The following translational parameters were attributed to the yeast genes (for derivation, see the Materials and Methods): the length of the transcript coding sequence (CDS) in codons; the absolute number of transcripts in a yeast cell; the total amount of protein molecules produced from transcripts of a particular type; the ribosome density, in number of ribosomes attached to a transcript per 100 codons; the absolute number of ribosomes on a transcript; the total time of translation of one protein molecule from a given transcript; the total time required for translation initiation; the total time required for translation elongation; the mean time required for elongation of one codon of a transcript; the translation initiation frequency; the relative rate of binding of free ribosomes to the 5′ end of a transcript, proportional to the concentration of the transcript; the relative rate of successful accomplishment of initiation once the ribosome-mRNA complex is formed; the estimated half-life of a transcript; and the estimated mean lifetime of a transcript. The obtained values of the relative rate of successful initiation ranged from 3.4e-4 to 65.9. For clarity, we decided to normalise them by the maximal value, obtained for the gene YLL040C; the normalised values range from 0 to 1 and allow a more intuitive comparison. Six of these parameters are given in SI units.

We managed to attribute quantitative measures of translation to the majority of the 4648 transcripts from the initial dataset. Four transcripts were rejected at the beginning of processing, as no ribosome footprints were observed on them. Further, 23 transcripts had unrealistically elevated ribosome density values: assuming that a ribosome covers ten codons, a transcript CDS built of 100 codons cannot contain more than ten ribosomes. Finally, we eliminated transcripts on which queuing of ribosomes may occur. Our simulation program yielded 130 transcripts suspected of queuing, plus 21 for which translation at the 5′ end is so slow that the first attached ribosome prevents the attachment of subsequent ribosomes. Further calculations were performed for the most relevant transcripts, i.e., the remaining 4470 yeast genes without ribosome queuing.
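The filtering steps described above can be checked with a few lines of arithmetic. The sketch below is only an illustration: the counts are those quoted in the text, while the function names and the integer-division bound are assumptions introduced here, not the authors' simulation program.

```python
# Counts quoted in the text for each filtering step.
INITIAL_TRANSCRIPTS = 4648
REMOVED = {
    "no ribosome footprints": 4,
    "unrealistically high ribosome density": 23,
    "suspected ribosome queuing": 130,
    "first ribosome blocks the 5' end": 21,
}

# Assumption stated in the text: one ribosome covers ten codons.
CODONS_PER_RIBOSOME = 10


def max_ribosomes(cds_length_codons):
    """Upper bound on the number of ribosomes that fit on a CDS."""
    return cds_length_codons // CODONS_PER_RIBOSOME


def density_per_100_codons(ribosomes, cds_length_codons):
    """Ribosome density expressed as ribosomes per 100 codons."""
    return 100.0 * ribosomes / cds_length_codons


def is_physically_possible(ribosomes, cds_length_codons):
    """True if the ribosome count does not exceed the packing bound."""
    return ribosomes <= max_ribosomes(cds_length_codons)


remaining = INITIAL_TRANSCRIPTS - sum(REMOVED.values())
print(remaining)
```

With the quoted counts, the four exclusions account exactly for the difference between the initial and the final number of transcripts.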

The eleven parameters that do not depend on mRNA stability (CDS length, transcript copy number, ribosome density, number of ribosomes on a transcript, total translation time, initiation time, elongation time, mean codon elongation time, initiation frequency, relative ribosome binding rate, and the normalised relative rate of successful initiation) were determined for all 4470 transcripts in the dataset, of which 4192 could also be attributed with three additional parameters: the number of protein molecules produced, the transcript half-life, and the mean transcript lifetime. The calculation of all parameters except these three was entirely based on the results from one high-throughput experiment. The three additional parameters engaged one further dataset of mRNA relative half-lives. The general characteristics of the parameters are specified in Table 1. The values of parameters for individual genes are provided in Supplementary Table S1.

Table 1
The translational parameters calculated in the model.

The calculated parameters allow the process of translation to be studied at the level of individual genes. Figure 1 depicts the translation process over time using the example of YJL173C, a highly conserved subunit of Replication Protein A (RPA). Similar schematics may be constructed for the majority of yeast genes.

Figure 1
Translation model of YJL173C.

Correlations with existing data

Next, we checked whether our calculations were in agreement with published data on protein and mRNA abundances. We compared our results with two previously published studies that provide information on transcript and protein copy number for numerous S. cerevisiae genes [13], [26], by performing linear regression through the origin on the log-transformed values. The adjusted R² values, as well as the corresponding regression coefficients, were calculated for six pairs of datasets and the results are presented in Table 2. Scatter plots are presented in Figure 2. The results show that our model explains 84% of the variability in mRNA abundance and 97% of the variability in protein abundance reported by experimental studies. Such R² values are reasonably good, taking into account the differences in the particular yeast strains and laboratory protocols used, as well as the fact that our calculations are based on a few simplifications that can distort the final outcome. Moreover, the R² values reported for our model do not stand out from those calculated for comparisons of the two experimental datasets with each other, suggesting that the observed differences reflect the internal variability of the system, not a methodological error. To check whether our results suffer from a systematic shift, we calculated the fold difference values for transcript and protein abundance comparisons with the two experimental datasets (see Supplementary Figure S1). In general, our calculations slightly overestimate the transcript copy number and underestimate the protein copy number, in relation to published data. This is mainly caused by our assumption that one yeast cell contains, on average, 36,000 transcripts. The transcript copy number used in both reference studies is originally taken from older research [27], which quantified the relative mRNA concentrations and transformed them into absolute copy number, assuming 15,000 as the total number of transcripts per cell. This estimate seems inadequate to us in the light of current discoveries, which are explained in the Materials and Methods.
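The comparison described above, linear regression through the origin on log-transformed abundances followed by an adjusted R², can be sketched as follows. This is a minimal illustration under stated assumptions: the base-10 logarithm, the uncentred total sum of squares, and the n/(n − 1) adjustment (a common convention for no-intercept models) are choices made here, since the text does not specify them, and all function names are hypothetical.

```python
import math


def log_transform(values):
    """Base-10 logarithm of strictly positive abundances (base is an assumption)."""
    return [math.log10(v) for v in values]


def fit_through_origin(x, y):
    """Least-squares slope of the model y = slope * x (no intercept)."""
    sxy = sum(a * b for a, b in zip(x, y))
    sxx = sum(a * a for a in x)
    return sxy / sxx


def adjusted_r2_through_origin(x, y):
    """Adjusted R^2 for a no-intercept fit with one parameter.

    Uses the uncentred total sum of squares, as is usual when the
    regression line is forced through the origin.
    """
    n = len(x)
    slope = fit_through_origin(x, y)
    ss_res = sum((b - slope * a) ** 2 for a, b in zip(x, y))
    ss_tot = sum(b * b for b in y)
    r2 = 1.0 - ss_res / ss_tot
    return 1.0 - (1.0 - r2) * n / (n - 1)


# Toy abundances (hypothetical copy numbers, not data from the study).
model_counts = [10.0, 100.0, 1000.0]
reported_counts = [10.0, 100.0, 1000.0]
x = log_transform(model_counts)
y = log_transform(reported_counts)
print(fit_through_origin(x, y), adjusted_r2_through_origin(x, y))
```

Because the total sum of squares is uncentred, R² values from such a fit are not directly comparable with those from an ordinary regression with an intercept.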

Figure 2
Model results vs experimental studies.
Table 2
Model determined mRNA and protein abundances versus experimental studies.

Transcript copy number is also problematic due to the wide discrepancies in mRNA levels reported by different studies [28]. The above-mentioned mRNA concentration dataset [27] was obtained in a serial analysis of gene expression (SAGE) experiment, and it is likely that such concentration estimates have low precision for low-abundance mRNAs [13], [26]. On the other hand, it is hypothesised that SAGE is more accurate for abundant mRNAs than another widely used technique, high-density oligonucleotide arrays (HDA) [26], [28]. Thus, we decided to compare the mRNA concentrations calculated in our model with results obtained in a genome-wide HDA experiment [29]. We performed linear regression through the origin on log-transformed data on mRNA abundance for 3769 genes. The scatter plot and the distribution of fold difference values are presented in Figure 3. The obtained adjusted R² value was 0.30 (see Table 2), meaning that the calculated transcript copy number is able to explain only about one third of the variability in mRNA abundance reported by this experiment [29]. This discrepancy is probably again caused by experimental error. The calculated transcript copy number reflects mRNA concentration obtained by means of deep sequencing, a technique considered to be far more precise in measuring mRNA levels than other hybridisation or sequence-based approaches [23]. However, it is likely that it is less precise for low-abundance mRNAs, which may be seen in Figure S1 provided by Ingolia et al. [22]. This would explain why the calculated transcript copy number better describes the variability in mRNA concentrations obtained from SAGE than from HDA experiments.

Figure 3
Calculated transcript abundance vs experimental studies.

In addition, we estimated that the cell-wide rate of translation for S. cerevisiae at 30°C is 5.5 amino acids (aa) per second, which corresponds to an average time of translation for one codon of 183 ms. This is in agreement with experimental studies, reporting rates of 8.8 aa/sec and 5.2 aa/sec for fast-growing and slow-growing yeast cells, respectively [30]. It is worth noting that the obtained value is also within the range reported for proteins from other organisms, namely 6 aa/sec for human apolipoprotein [31], 0.74 aa/sec for rabbit hemoglobin [32], 5 aa/sec for chick ovalbumin [33], and an average translation rate of 7.3 aa/sec in cockerel liver [34].
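The relation between the elongation rate and the mean codon time quoted above is a simple reciprocal. The short sketch below (function names are hypothetical) shows the conversion in both directions; the small difference between the two quoted figures is consistent with rounding of the reported rate.

```python
def codon_time_ms(rate_aa_per_s):
    """Mean time to translate one codon, in milliseconds."""
    return 1000.0 / rate_aa_per_s


def rate_from_codon_time(time_ms):
    """Elongation rate in amino acids per second for a given codon time."""
    return 1000.0 / time_ms


# 5.5 aa/s corresponds to roughly 182 ms per codon ...
print(round(codon_time_ms(5.5), 1))
# ... and 183 ms per codon corresponds to a rate that rounds to 5.5 aa/s.
print(round(rate_from_codon_time(183.0), 1))
```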

Furthermore, it is reported in independent studies that the total amount of protein in a yeast cell varies from […] g [16] to […] g [35]. Based on known protein sequences and the molecular mass of particular amino acids, we can calculate the mass of each yeast protein. By multiplying this by the protein copy number and summing over all expressed yeast proteins, we estimated that the total mass of proteins in a yeast cell is around […] g. Although this number is smaller than the values reported previously, it is still consistent with them, taking into account the fact that we excluded from the calculations all transcripts with a ribosome density above ten ribosomes per 100 codons, as our model cannot operate on such elevated values of this parameter. Most likely, such a high ribosome density results in a very high level of translation, meaning that the excluded transcripts would have large protein copy numbers if they could be handled by our model. Thus, excluding these transcripts noticeably diminishes the final mass of proteins in a yeast cell. Moreover, we must not forget that the calculated protein copy number reflects only the total amount of protein produced from a given transcript, whereas the cell contains many other proteins produced in the past that are still present in the cell.
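The mass estimate described above (protein mass from sequence, multiplied by copy number and summed over proteins) can be sketched as follows. This is an illustration only: the average residue masses are standard textbook values, while the toy sequences, copy numbers, and function names are hypothetical and do not come from the study.

```python
# Average masses of amino acid residues in daltons (standard values).
RESIDUE_MASS_DA = {
    "G": 57.0519, "A": 71.0788, "S": 87.0782, "P": 97.1167, "V": 99.1326,
    "T": 101.1051, "C": 103.1388, "L": 113.1594, "I": 113.1594, "N": 114.1038,
    "D": 115.0886, "Q": 128.1307, "K": 128.1741, "E": 129.1155, "M": 131.1926,
    "H": 137.1411, "F": 147.1766, "R": 156.1875, "Y": 163.1760, "W": 186.2132,
}
WATER_MASS_DA = 18.01528          # one water per chain (free N- and C-termini)
GRAMS_PER_DALTON = 1.66053906660e-24


def protein_mass_da(sequence):
    """Average molecular mass of an unmodified polypeptide, in daltons."""
    return sum(RESIDUE_MASS_DA[aa] for aa in sequence) + WATER_MASS_DA


def total_protein_mass_g(proteins):
    """Sum of mass times copy number over all proteins, in grams.

    `proteins` maps a gene name to a (sequence, copy_number) pair.
    """
    total_da = sum(protein_mass_da(seq) * copies
                   for seq, copies in proteins.values())
    return total_da * GRAMS_PER_DALTON


# Hypothetical toy input: two short peptides with made-up copy numbers.
toy_cell = {"geneA": ("MGG", 1000), "geneB": ("MKV", 250)}
print(total_protein_mass_g(toy_cell))
```

The sketch ignores post-translational modifications and removal of the initiator methionine, both of which would change individual masses slightly.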

General features of the yeast gene expression system

Based on our results, we can draw the following conclusions concerning gene expression in S. cerevisiae:

First, half of the genes produce fewer than 2.73 transcripts per cell. The distribution of the transcript copy number is skewed with a long right tail: only 55 genes have more than 100 mRNA copies. Unsurprisingly, the top 20 genes with the highest transcript copy number turned out to encode either ribosomal proteins (18 genes) or enzymes engaged in glycolysis (genes YKL060C and YKL152C). One mRNA molecule is translated from 0.14 to 40,110 times, with a median of 257.9. Typically, one gene produces 677 protein copies; however, the most active genes may generate more than 2 million protein copies. Only six genes are common to the sets of the top 20 genes with the highest transcript levels and protein abundance. Among the 20 most highly produced proteins, there are 14 ribosomal proteins, two proteins engaged in glycolysis (YCR012W, YKL060C), a highly expressed mitochondrial aminotransferase (YHR208W), alcohol dehydrogenase (YOL086C), and two cell wall proteins (YLR110C, YKL096W-A). There is only a partial correlation between transcript and protein copy number, and protein production does not necessarily follow the concentration of mRNA molecules (see Figure 4). We compared the mRNA and protein abundances calculated in our model by performing linear regression through the origin on log-transformed data. The adjusted R² value calculated over the entire dataset (4192 genes with known protein copy number) was 0.59. This means that over 40% (in log space) of the variation in protein abundance cannot be explained by variation in mRNA abundance, suggesting some additional, posttranscriptional mechanisms of gene expression regulation.

Figure 4
Correlation of mRNA and protein expression levels.

Next, we analysed yeast genes for expression strategies applied to produce the highest number of protein molecules. We prepared two datasets: the 200 genes with the highest number of protein molecules produced and the 200 genes with the lowest. We compared the rest of the translation parameters between these two sets using a two-sided Mann-Whitney test. The values of most parameters differ between the two datasets in an intuitive manner: genes coding for highly abundant proteins usually produce more transcripts, which have a shorter time of translation (both […] and […]), are more resistant to degradation, and are occupied by more ribosomes per 100 codons. All differences are statistically significant with p-value […] (data not shown). Only one parameter appeared not to affect the number of proteins produced: the relative rate of initiating translation once the ribosome attaches to the free 5′ end of an mRNA molecule (p-value […]). Moreover, the Spearman's correlation between this rate and the number of proteins produced is very weak over the entire dataset (coefficient […], p-value […]).
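The comparison described above, taking the genes with the highest and lowest values of one parameter and testing another parameter between the two groups, can be sketched in plain Python. The code below computes only the Mann-Whitney U statistic with average ranks for ties; obtaining a p-value would normally be left to a statistics package. The data layout, parameter names, and function names are assumptions made for illustration.

```python
def split_extremes(genes, key_param, n):
    """Return (lowest_n, highest_n) gene names ranked by `key_param`.

    `genes` maps a gene name to a dict of parameter values.
    """
    ranked = sorted(genes, key=lambda g: genes[g][key_param])
    return ranked[:n], ranked[-n:]


def average_ranks(values):
    """1-based ranks of `values`, with tied values sharing their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def mann_whitney_u(sample_a, sample_b):
    """U statistic for sample_a: rank sum of a minus n_a * (n_a + 1) / 2."""
    n_a = len(sample_a)
    ranks = average_ranks(list(sample_a) + list(sample_b))
    return sum(ranks[:n_a]) - n_a * (n_a + 1) / 2


# Hypothetical toy data: protein counts and initiation times per gene.
toy_genes = {
    "g1": {"proteins": 5, "init_time": 40.0},
    "g2": {"proteins": 50, "init_time": 20.0},
    "g3": {"proteins": 500, "init_time": 5.0},
    "g4": {"proteins": 1, "init_time": 60.0},
}
low, high = split_extremes(toy_genes, "proteins", 2)
u = mann_whitney_u([toy_genes[g]["init_time"] for g in high],
                   [toy_genes[g]["init_time"] for g in low])
print(low, high, u)
```

A U value of zero for the high-abundance group means that every initiation time in that group is shorter than every one in the low-abundance group, which is the pattern the toy data were built to show.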

Analogously, we analysed two datasets of 200 genes with the highest and lowest ribosome density values. According to the Mann-Whitney test, transcripts of higher ribosome density typically produce more protein molecules and have shorter times of translation (both […] and […]). All differences are statistically significant with p-value […] (data not shown). In contrast to the result mentioned above, the shorter initiation time for genes of the highest ribosome density is here caused mainly by an elevated relative rate of successful initiation, while the relative rate of ribosome binding has little influence, although the difference in the latter is still statistically significant (p-value […]). Nevertheless, no significant correlation was observed between the relative rate of ribosome binding and ribosome density measured over the entire dataset (p-value […]). The roles of the two relative rates in modifying the values of ribosome density and the number of proteins produced are detailed in Supplementary Figure S2.

Furthermore, we studied, in detail, 20 genes from the set of 200 genes producing the highest number of proteins but with low transcriptional activity (An external file that holds a picture, illustration, etc.
Object name is pcbi.1000865.e124.jpg for all of them). Interestingly, these genes are involved in many distinct biological processes, with the notable exception of ribosome formation. The mechanism of their regulation, deduced from the values of the translational parameters, is almost the same for all genes. In particular, two parameters seem to play the main role in sustaining the high protein synthesis rate: a relatively long mean lifetime of the mRNA molecule, reaching up to several dozen hours (the maximal observed mean lifetime of a yeast transcript is 61 hours), and an about four times shorter time of translation initiation, caused mainly by relatively high An external file that holds a picture, illustration, etc.
Object name is pcbi.1000865.e125.jpg values. On average, the observed An external file that holds a picture, illustration, etc.
Object name is pcbi.1000865.e126.jpg is one order of magnitude higher than the median An external file that holds a picture, illustration, etc.
Object name is pcbi.1000865.e127.jpg for all yeast genes. The shorter pause between subsequent initiations results in elevated ribosome density An external file that holds a picture, illustration, etc.
Object name is pcbi.1000865.e128.jpg and increased protein production rate. On the other hand, the total time of translation and the mean elongation time are unexpectedly long, i.e., slightly above the median value for all yeast genes (see Table 3). This indicates that, for long-lived mRNAs, high transcription rates and the use of frequent codons are not required to achieve a high rate of protein synthesis. This expression strategy is an interesting but still poorly understood example of translational regulation, and it merits further research.

Table 3
Translational parameters of 20 genes of low transcriptional activity and high protein production rate.

Translation times and codon bias

In Supplementary Table S2 we present the translation times of individual yeast codons at 30°C. We compared these values with codon optimality An external file that holds a picture, illustration, etc.
Object name is pcbi.1000865.e132.jpg calculated by [36]. The value of An external file that holds a picture, illustration, etc.
Object name is pcbi.1000865.e133.jpg measures whether the codon is preferred in highly expressed genes compared with all other codons encoding the same amino acid. An external file that holds a picture, illustration, etc.
Object name is pcbi.1000865.e134.jpg is calculated as the odds ratio of codon usage between highly and lowly expressed genes. Figure 5 shows that there is a negative correlation between the An external file that holds a picture, illustration, etc.
Object name is pcbi.1000865.e135.jpg value and the translation time of a codon. However, while optimal codons have only short translation times, non-optimal codons may be translated at both high and low rates. A linear regression model through the origin, fitted on log-transformed values, confirmed this conclusion: the obtained adjusted $R^2$ is only 0.15. This indicates that translation speed may be one, but not the only, criterion for selection on codon bias. This is in agreement with other reports, discussed widely in a recent review [37]. Also, it has been shown that codon usage bias in yeast is associated with translation accuracy [38] and protein structure [36].
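The regression described above can be sketched as follows. This is a minimal illustration, not the authors' code: the function name `fit_through_origin`, the example values, and the choice of the uncentred (no-intercept) convention for the adjusted coefficient of determination are all assumptions.

```python
import math

def fit_through_origin(x, y):
    """Fit y = slope * x by least squares (no intercept).

    Returns (slope, r2, adjusted_r2). R^2 uses the uncentred total sum of
    squares, a common convention for no-intercept models (an assumption here).
    """
    n = len(x)
    slope = sum(a * b for a, b in zip(x, y)) / sum(a * a for a in x)
    ss_res = sum((b - slope * a) ** 2 for a, b in zip(x, y))
    ss_tot = sum(b * b for b in y)  # uncentred: no mean is subtracted
    r2 = 1.0 - ss_res / ss_tot
    adjusted_r2 = 1.0 - (1.0 - r2) * n / (n - 1)  # one fitted parameter
    return slope, r2, adjusted_r2

# Made-up values for illustration only (not taken from Supplementary Table S2).
optimality = [0.4, 0.8, 1.5, 2.5, 4.0]            # hypothetical codon optimality
time_ms = [400.0, 150.0, 300.0, 120.0, 90.0]      # hypothetical translation times

log_x = [math.log(v) for v in optimality]
log_y = [math.log(v) for v in time_ms]
slope, r2, adj = fit_through_origin(log_x, log_y)
print(f"slope={slope:.3f}  R2={r2:.3f}  adjusted R2={adj:.3f}")
```

A low adjusted value from such a fit would be read, as in the text, as codon optimality explaining only a small part of the variation in translation time.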

Figure 5
Codon optimality vs translation time.

Translational parameters and protein interactions

Interacting proteins are often precisely co-expressed, presumably to maintain proper stoichiometry among interacting components [39]. For instance, it was shown that functionally associated proteins exhibit correlated mRNA expression profiles over a set of environmental conditions [40], [41]. Other studies report the co-evolution of codon usage of functionally linked genes [39], [42] and show that codon usage is a strong predictor of protein-protein interactions [43]. Our model provides far more information on translation regulation than mRNA expression profiles or codon usage alone, thus we decided to examine calculated parameters in a set of well-known interacting proteins.

As a model, we chose the 20S proteasome complex, built of 28 proteins. There are 14 genes in the yeast genome coding for proteasome subunits α1–α7 and β1–β7, and each subunit is present in the complex in two copies [44]. Only subunit α3 is nonessential for the functionality of the complex and may be replaced by the α4 subunit under stress conditions to create a more active proteasomal isoform [45].

The analysis of the translational parameters (see Supplementary, Table S3) shows that the mean translation time (An external file that holds a picture, illustration, etc.
Object name is pcbi.1000865.e149.jpg) of all proteins is similar and ranges from 194 to 259 ms. As all interacting proteins are of similar length, the total time of elongation does not vary much; the biggest observed difference between two proteins was An external file that holds a picture, illustration, etc.
Object name is pcbi.1000865.e150.jpg24 s. However, the level of transcription is more variable and ranges from 3.99 to 22.61 transcripts per cell. There is a considerable divergence of ribosome density An external file that holds a picture, illustration, etc.
Object name is pcbi.1000865.e151.jpg (from 0.61 to 4.32), but regulation at the level of translation initiation (similar values of An external file that holds a picture, illustration, etc.
Object name is pcbi.1000865.e152.jpg and variability of An external file that holds a picture, illustration, etc.
Object name is pcbi.1000865.e153.jpg reaching two orders of magnitude) keeps the initiation time An external file that holds a picture, illustration, etc.
Object name is pcbi.1000865.e154.jpg at the same level for all 14 proteins. The biggest observed difference of An external file that holds a picture, illustration, etc.
Object name is pcbi.1000865.e155.jpg between two proteins equals An external file that holds a picture, illustration, etc.
Object name is pcbi.1000865.e156.jpg31 s. This results in congruent total times of translation An external file that holds a picture, illustration, etc.
Object name is pcbi.1000865.e157.jpg, the difference between maximal and minimal values is only two-fold with a mean value of An external file that holds a picture, illustration, etc.
Object name is pcbi.1000865.e158.jpg73 s. Nevertheless, the observed differences in the mean lifetimes of mRNA molecules are huge, reaching up to 278 min. In consequence, the number of protein molecules produced is strongly variable, ranging from 318 to 11,185 molecules per cell, and this is surprising as the stoichiometry of the 20S proteasome would rather suggest equal amounts of all subunits. Indeed, for four proteasome proteins, the value of the An external file that holds a picture, illustration, etc.
Object name is pcbi.1000865.e159.jpg parameter is almost the same, about 2,600 subunits of An external file that holds a picture, illustration, etc.
Object name is pcbi.1000865.e160.jpg, An external file that holds a picture, illustration, etc.
Object name is pcbi.1000865.e161.jpg, An external file that holds a picture, illustration, etc.
Object name is pcbi.1000865.e162.jpg, and An external file that holds a picture, illustration, etc.
Object name is pcbi.1000865.e163.jpg per cell. Similar values, which do not exceed the range 2,600±1,000, were reported for subunits An external file that holds a picture, illustration, etc.
Object name is pcbi.1000865.e165.jpg, An external file that holds a picture, illustration, etc.
Object name is pcbi.1000865.e166.jpg, An external file that holds a picture, illustration, etc.
Object name is pcbi.1000865.e167.jpg, and An external file that holds a picture, illustration, etc.
Object name is pcbi.1000865.e168.jpg. Subunits An external file that holds a picture, illustration, etc.
Object name is pcbi.1000865.e169.jpg, An external file that holds a picture, illustration, etc.
Object name is pcbi.1000865.e170.jpg, An external file that holds a picture, illustration, etc.
Object name is pcbi.1000865.e171.jpg, and An external file that holds a picture, illustration, etc.
Object name is pcbi.1000865.e172.jpg are produced to less than 1,100 copies, while the rate of protein synthesis of subunits An external file that holds a picture, illustration, etc.
Object name is pcbi.1000865.e173.jpg and An external file that holds a picture, illustration, etc.
Object name is pcbi.1000865.e174.jpg is 5,481 and 11,185 molecules per cell, respectively. To maintain the number of different subunits at the same level, the high translation rates of An external file that holds a picture, illustration, etc.
Object name is pcbi.1000865.e175.jpg and An external file that holds a picture, illustration, etc.
Object name is pcbi.1000865.e176.jpg may be balanced by post-translational regulation, presumably by elevated protein degradation. Conversely, the reduced translation rate of An external file that holds a picture, illustration, etc.
Object name is pcbi.1000865.e177.jpg, An external file that holds a picture, illustration, etc.
Object name is pcbi.1000865.e178.jpg, An external file that holds a picture, illustration, etc.
Object name is pcbi.1000865.e179.jpg, and An external file that holds a picture, illustration, etc.
Object name is pcbi.1000865.e180.jpg may be compensated at the level of transcription, for instance by more frequent transcription initiations. In addition, the limited number of these subunits, as well as the relatively short life-time of their mRNAs, makes them ideal candidates for regulators of the abundance of proteasome complexes.

Discussion

The main advantage of the proposed model is that basing it on only a few datasets and general assumptions allows the calculation of many important translational parameters, which are extremely difficult to measure experimentally. As a result, the majority of yeast genes may be attributed with quantitative rates of expression and protein synthesis. These data may be used to study both the general characteristics of the process of translation in yeast and the rates of protein production of individual genes. The model itself is general and can be applied to other organisms if all of the necessary input datasets are available.

However, as with any theoretical model, this one also has some drawbacks. The quality of our calculations strongly depends on the quality of the input data. To study the example of S. cerevisiae, we carefully chose the dataset of ribosome profiles and made sure that data on mRNA abundance and ribosome footprints were obtained under the same experimental conditions. Similarly, all global parameters, such as the overall number of transcripts and ribosomes in a cell, were determined with care and attention, after insightful analysis of the literature. To extend our model to the number of proteins produced, we decided to use an additional dataset of mRNA half-lives. The assumption that lies at the basis of mRNA half-life calculation is that in the steady state of mRNA turnover, the time required to synthesise an mRNA molecule equals the time to degrade it. Obviously, this is not true for many transcripts, as the cell cycle and environmental stimuli force changes in mRNA turnover. Additionally, we must not forget that the parameter An external file that holds a picture, illustration, etc.
Object name is pcbi.1000865.e181.jpg, calculated based on mRNA half-life, reflects only the total amount of protein molecules produced by the transcripts of a given gene. The protein degradation rate is not taken into account, and therefore, especially in case of short-lived proteins, the observed protein concentration will be smaller than estimated in this paper. This may be the cause of some of the discrepancies between the estimated protein abundances and those previously reported.

The true meaning of the An external file that holds a picture, illustration, etc.
Object name is pcbi.1000865.e182.jpg parameter is also important when analysing the set of 20 genes characterised by low levels of transcription and high protein production rates. As their transcripts may persist in a cell for up to a few dozen hours, they may produce a large amount of protein in their lifetime, even if translation is not very efficient. However, this does not necessarily mean that all synthesised proteins accumulate in the cell and that their number constantly increases. It is more likely that these proteins are systematically degraded and replaced by new ones produced from the same mRNA. Interestingly, genes regulated in this way would not be classified as highly expressed by any standard method, as their transcripts are not present in the cell in many copies, and their mean time of elongation is about average, so no codon bias is suspected.

In addition, our model revealed some interesting aspects of global translation characteristics. In many studies ribosome density is used as the only measure of translational activity [6]. We have shown that high ribosome density is caused mainly by the elevated relative rate of translation initiation after forming of the ribosome-mRNA complex – An external file that holds a picture, illustration, etc.
Object name is pcbi.1000865.e183.jpg. In contrast, another measure of translation efficiency, protein production rate An external file that holds a picture, illustration, etc.
Object name is pcbi.1000865.e184.jpg, is affected mostly by the relative rate of finding an mRNA molecule by a free ribosome An external file that holds a picture, illustration, etc.
Object name is pcbi.1000865.e185.jpg, while the influence of An external file that holds a picture, illustration, etc.
Object name is pcbi.1000865.e186.jpg in this case is negligible. These results reflect the complexity of translation regulation, suggesting that any translational parameter, when considered separately, is not sufficient to fully characterise the process.

It has been stated before [46] that the regulation of gene expression is controlled at multiple stages, and no general rule exists describing how it works. In fact, the regulation of expression is different for each gene, and its main role is to produce the required amount of a given protein at the proper time. In contrast to the typically used methods of quantifying translation (i.e., codon bias and transcript abundance measurements), the proposed model does not concentrate only on one parameter of translation. In fact, it allows one to study, in depth, many strategies of gene expression, showing which parameters play the main role in which type of control.

Furthermore, the model opens the prospect for new analysis of mRNA molecules. As mentioned before, the translation initiation rate depends on mRNA abundance and intrinsic features of the transcript. The calculated parameter An external file that holds a picture, illustration, etc.
Object name is pcbi.1000865.e187.jpg measures the relative efficiency of translation initiation, excluding the influence of mRNA concentration. Thus, for the first time, it provides a quantitative way to compare mRNA sequences from the same organism with respect to initial codon context, 5′UTR secondary structure, An external file that holds a picture, illustration, etc.
Object name is pcbi.1000865.e188.jpgORF presence, and other mRNA features responsible for the efficiency of translation initiation.

Another possible application of the model is the analysis of the calculated translational parameters in the context of protein complexes, where proper stoichiometry among interacting components is maintained. As exemplified by the study of the yeast 20S proteasome, such analysis enables one to draw some interesting conclusions about the regulation of the individual proteins, as well as the entire complex. Moreover, we have shown that some parameters, in particular translation times An external file that holds a picture, illustration, etc.
Object name is pcbi.1000865.e189.jpg, An external file that holds a picture, illustration, etc.
Object name is pcbi.1000865.e190.jpg and An external file that holds a picture, illustration, etc.
Object name is pcbi.1000865.e191.jpg, are similar for all proteins of the complex. Possibly, the calculated model parameters, if properly integrated, could become a strong predictor of protein-protein interactions. It would be interesting to carry out a similar search for proteins participating in the same metabolic pathway, as functionally related proteins are usually co-expressed. In such a case, the analysis of the translational parameters pattern could become useful for the functional annotation of genes.

The model can also be used to study the elongation process in the context of ribosome queuing. It provides all the necessary tools to deeply analyse the strategies developed by living cells to avoid ribosome stacking on a translated mRNA molecule.

Additionally, clustered codons that pair to low-abundance tRNA isoacceptors cause a local slow-down of the elongation rate. It has been hypothesised that such a slow-down might facilitate the co-translational folding of defined protein segments by temporally separating their synthesis [47]. Recently, it has been proposed that discontinuous elongation of the peptide chain can control the efficiency and accuracy of the translation process [48]. Our model provides a measure of yeast codon elongation rates that may be used to better examine co-translational folding. In contrast to the measure used in the aforementioned study, it is quantitative and more precise, as it takes into account the delay caused by near- and non-cognate aa-tRNAs.

Finally, the crucial coefficients of the model, i.e., the time of insertion of cognate aa-tRNAs and the time delays caused by the binding of near- and non-cognate aa-tRNAs, can be calculated for different temperatures. This makes it possible to study the extent to which temperature affects the efficiency of translation, provided that the ribosome footprints and mRNA concentrations are also measured at a few different temperatures.

In conclusion, although experimental confirmation is still required, this model constitutes an important tool for understanding the process of protein synthesis.

Materials and Methods

Theoretical model of translation

The molecular mechanism of translation has been well characterised previously [49]. However, for the purpose of this research, we must consider the process both at the single-transcript and genome-wide levels. Quantifying the process of protein biosynthesis involves a vast array of data, some of which are incomplete or missing. Thus, the following assumptions and simplifications must be made: (i) the pools of all molecules participating in translation (mRNA, tRNA, ribosomes, translation factors, and so on) are constant, and molecules diffuse without restraint; (ii) all transcripts derived from the same gene have identical sequence, i.e., there is no alternative splicing and/or posttranscriptional modification; and (iii) the elongation process is never interrupted, and it always ends by producing a full-length protein molecule (note that the experimentally estimated processivity of translation in yeast is 99.8–99.9% [50]). When these assumptions are satisfied, the model is as follows:

Let $X$ be the set of all transcripts present in the yeast cell at the moment of observation. We can partition the set $X$ into $n$ subsets, each containing transcripts of identical sequence. Thus, $n$ denotes the number of transcriptionally active genes in the cell. To each gene (subset), we attribute the index $i$ and define $x_i$ as the number of transcripts in the $i$-th subset. The variable $x_i$ thus reflects the transcriptional activity of a gene.

Let $T_i$ be the total observed time of synthesis of one protein molecule from a transcript belonging to the $i$-th subset. We define it as:

$$T_i = I_i + E_i \qquad (1)$$

where $I_i$ denotes the time required for translation initiation, and $E_i$ is the total time of the elongation process.

We define $I_i$ as the time interval from the point when the free 5′ end of a transcript becomes available for ribosomes to the moment when a ribosome finds the initiator AUG codon and the entire complex enters into the elongation phase. The inverse of the initiation time $I_i$ is the initiation frequency $P_i$:

$$P_i = \frac{1}{I_i} \qquad (2)$$

If these frequencies are multiplied by a brief time interval $\Delta t$, one obtains the probabilities that the initiation process will occur during the time interval $\Delta t$. We assume that the initiation of translation follows the scanning model [51], which postulates that the small ribosomal subunit enters at the 5′ end of the mRNA and moves linearly, searching for the initiator AUG codon; once it finds it, the elongation process begins. We define $p_i$ as the relative binding rate of free ribosomes to the 5′ end of the $i$-th transcript, and assume it is proportional to the concentration of the transcript (see Eq. 10). This means that in our model the binding constants of ribosomes are the same for all mRNAs. In contrast, the process of 5′UTR scanning by the ribosome is not straightforward, as there are many intrinsic features of mRNA molecules that can considerably delay or hasten the start of elongation (for review, see [1]). Sometimes, the ribosome detaches from the mRNA molecule before reaching the initiator AUG, and the process must return to the point when a ribosome binds at the 5′ end. To describe the efficiency of the scanning process by one numerical parameter, we normalised the initiation frequency $P_i$ by the rate of binding of free ribosomes $p_i$:

$$q_i = \frac{P_i}{p_i} \qquad (3)$$

The calculated parameter $q_i$ describes the rate of successful accomplishment of initiation on the $i$-th transcript once the ribosome-mRNA complex is formed. Its value reflects the relative capability of an mRNA molecule to be translated, regardless of its expression level. The rates $p_i$ and $q_i$ are calculated in relation to all studied transcripts; thus, they can only be compared within one particular analysis.
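The normalisation described above can be sketched in a few lines. This is only an illustration under stated assumptions: the relative binding rate is taken to be each gene's share of all transcript copies (proportionality constant set to 1, since Eq. 10 is not reproduced here), and the function names, gene names, and numbers are hypothetical.

```python
def relative_binding_rates(transcript_counts):
    """Assumed form: binding rate proportional to each gene's share of all transcripts."""
    total = sum(transcript_counts.values())
    return {gene: count / total for gene, count in transcript_counts.items()}

def normalised_initiation_rates(initiation_times, transcript_counts):
    """Initiation frequency (1 / initiation time) divided by the relative binding rate."""
    binding = relative_binding_rates(transcript_counts)
    return {
        gene: (1.0 / initiation_times[gene]) / binding[gene]
        for gene in initiation_times
    }

# Hypothetical example: equal initiation times, different transcript abundance.
counts = {"geneA": 10.0, "geneB": 30.0}        # transcripts per cell (made up)
init_times_s = {"geneA": 20.0, "geneB": 20.0}  # initiation times in seconds (made up)

q = normalised_initiation_rates(init_times_s, counts)
for gene, value in q.items():
    print(gene, round(value, 4))
```

In this toy case the rarer transcript gets the higher normalised rate: it initiates as often as the abundant one despite being found less often by free ribosomes.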

The time $E_i$ (see Eq. 1) is defined as the time interval from the recognition of the initiator AUG codon by the ribosome to the moment when the last peptide bond of a protein molecule is formed. Each elongation event consists of two main steps: (i) finding the correct tRNA molecule, and (ii) formation of the peptide bond and translocation. The time required for the first step is much larger than for the second. In fact, the second step is almost instantaneous [52]; thus, the times needed for the transpeptidation and translocation reactions can be neglected, and the time $E_i$ may be simplified to:

$$E_i = \sum_{j=1}^{l_i} t_j \qquad (4)$$

where $t_j$ is the time of translation of the $j$-th codon, and $l_i$ is the number of codons in the coding sequence of the $i$-th transcript.
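The relations stated so far (total time as the sum of initiation and elongation times, initiation frequency as the inverse of initiation time, and elongation time as the sum of per-codon translation times) can be sketched as below. The function names, the codon times, and the initiation time are hypothetical values chosen for illustration, not data from the study.

```python
def elongation_time(codons, codon_time):
    """Sum of per-codon translation times over the coding sequence (Eq. 4)."""
    return sum(codon_time[codon] for codon in codons)

def total_translation_time(initiation_time, elong_time):
    """Total time to synthesise one protein: initiation plus elongation (Eq. 1)."""
    return initiation_time + elong_time

def initiation_frequency(initiation_time):
    """Inverse of the initiation time (Eq. 2)."""
    return 1.0 / initiation_time

# Hypothetical per-codon translation times in seconds (made up).
codon_time_s = {"AUG": 0.20, "GCU": 0.15, "GAA": 0.25}
cds = ["AUG", "GCU", "GAA", "GCU"]  # a toy four-codon coding sequence

E = elongation_time(cds, codon_time_s)
I = 30.0  # hypothetical initiation time in seconds
T = total_translation_time(I, E)
P = initiation_frequency(I)
print(f"E={E:.2f} s  T={T:.2f} s  P={P:.4f} 1/s")
```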

Translation times for all yeast codons, as well as the values of An external file that holds a picture, illustration, etc.
Object name is pcbi.1000865.e227.jpg and An external file that holds a picture, illustration, etc.
Object name is pcbi.1000865.e228.jpg, can be calculated on the basis of existing data (see below). These values can also be used to calculate times An external file that holds a picture, illustration, etc.
Object name is pcbi.1000865.e229.jpg and the rest of the model parameters An external file that holds a picture, illustration, etc.
Object name is pcbi.1000865.e230.jpg, An external file that holds a picture, illustration, etc.
Object name is pcbi.1000865.e231.jpg, and An external file that holds a picture, illustration, etc.
Object name is pcbi.1000865.e232.jpg, if the numbers of ribosomes attached to the mRNA molecules are known. Here, the reasoning is as follows:

Let $w_i$ be the number of ribosomes attached to the $i$-th transcript. We introduce the measure of ribosome density $g_i$, defined as the number of ribosomes attached to the transcript per 100 codons:

$$g_i = \frac{100\, w_i}{l_i} \qquad (5)$$

with $l_i$ the length of the coding sequence in codons.

One ribosome occupies ten codons of an mRNA molecule [53], and the E site of one ribosome can be immediately adjacent to the A site of another ribosome [54]. This means that the maximum possible value is An external file that holds a picture, illustration, etc.
Object name is pcbi.1000865.e237.jpg. Next, the attachment of a ribosome to the 5′ end is possible only if it is not occupied by other ribosomes. Thus, the most efficient mRNA sequences should have An external file that holds a picture, illustration, etc.
Object name is pcbi.1000865.e238.jpg. Nevertheless, the majority of observed An external file that holds a picture, illustration, etc.
Object name is pcbi.1000865.e239.jpg values are much smaller, meaning that there are usually gaps of varying length between attached ribosomes. As the exact positions of ribosomes on a particular transcript cannot be deduced from the data, we must operate on the averaged gap lengths, defined as the quotient of the transcript length An external file that holds a picture, illustration, etc.
Object name is pcbi.1000865.e240.jpg and number of attached ribosomes An external file that holds a picture, illustration, etc.
Object name is pcbi.1000865.e241.jpg. The length of a gap measured in codons is meaningless, as each type of codon has a different translation time. However, the gap can be calculated as the sum of translation times of these codons, becoming an adequate measure of the time interval between individual translation initiation events on a given mRNA molecule. This time is actually a delay from the best possible initiation frequency and reflects the efficiency of the initiation process. In principle, this is the time An external file that holds a picture, illustration, etc.
Object name is pcbi.1000865.e242.jpg (see Eq. 1) expressed in the same time units as the translation times of particular codons:

equation image
(6)

Note that due to unknown ribosome positions on a transcript, both the gap length and the time of its translation are averaged. The rest of the parameters (total translation time $T_i$, initiation frequency $P_i$, and normalised initiation rate $q_i$) can be calculated based on the initiation time $I_i$, as shown in Eq. 1, 2, and 3.
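As a rough sketch of the reasoning above, the snippet below computes ribosome density per 100 codons and an averaged initiation time from the mean gap between ribosomes. It assumes the gap time is the average number of codons per ribosome multiplied by the transcript's mean codon translation time; the exact form of Eq. 6 is not reproduced here and may differ (for example, in how the ribosome footprint is handled). All names and numbers are hypothetical.

```python
def ribosome_density(ribosomes, n_codons):
    """Ribosomes attached per 100 codons of coding sequence."""
    return 100.0 * ribosomes / n_codons

def initiation_time_from_gap(ribosomes, n_codons, elongation_time_s):
    """Averaged time between initiations, read off the mean ribosome spacing.

    Assumption: gap time = (codons per ribosome) * (mean codon translation time).
    """
    if ribosomes <= 0:
        raise ValueError("at least one attached ribosome is required")
    mean_codon_time = elongation_time_s / n_codons
    gap_codons = n_codons / ribosomes
    return gap_codons * mean_codon_time

# Hypothetical transcript: 300 codons, 6 ribosomes, 60 s total elongation time.
n_codons, ribosomes, elong_s = 300, 6, 60.0

g = ribosome_density(ribosomes, n_codons)
I = initiation_time_from_gap(ribosomes, n_codons, elong_s)
print(f"density={g:.2f} per 100 codons  initiation time={I:.2f} s")
```

Under this reading the averaged initiation time reduces to the elongation time divided by the number of attached ribosomes, which is why both quantities are averages rather than per-ribosome measurements.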

Calculating model parameters

The S. cerevisiae coding sequences used in our calculations were downloaded from the Saccharomyces Genome Database [55] (accessed 25th June 2009). For each gene, we determined the values of An external file that holds a picture, illustration, etc.
Object name is pcbi.1000865.e249.jpg, An external file that holds a picture, illustration, etc.
Object name is pcbi.1000865.e250.jpg, An external file that holds a picture, illustration, etc.
Object name is pcbi.1000865.e251.jpg, An external file that holds a picture, illustration, etc.
Object name is pcbi.1000865.e252.jpg, An external file that holds a picture, illustration, etc.
Object name is pcbi.1000865.e253.jpg, and An external file that holds a picture, illustration, etc.
Object name is pcbi.1000865.e254.jpg on the basis of the recent research of Ingolia et al. [22], who simultaneously quantified mRNA abundance and ribosome footprints by means of deep sequencing. The study was done for the yeast strain BY4741 grown in YEPD at 30°C. In the first step, Ingolia et al. performed deep sequencing on a DNA library that was generated from fragmented total mRNA in order to measure the abundance of different yeast transcripts. Next, they applied a new ribosome-profiling strategy based on the deep sequencing of ribosome-protected fragments. This resulted in a dataset of 4,648 reliable transcripts (for the definition of “reliability”, see Supplementary Materials of [22]) that was used as an input in our research. For each transcript in the dataset, the following values were attributed: An external file that holds a picture, illustration, etc.
Object name is pcbi.1000865.e256.jpg, raw count of mRNA-seq reads aligned to transcript coding sequence (CDS); An external file that holds a picture, illustration, etc.
Object name is pcbi.1000865.e257.jpg, density of mRNA-seq reads in reads per kilobase per million CDS-aligned reads (RPKM); An external file that holds a picture, illustration, etc.
Object name is pcbi.1000865.e258.jpg, raw count of ribosome CDS-aligned footprints; and An external file that holds a picture, illustration, etc.
Object name is pcbi.1000865.e259.jpg, density of ribosome footprints in reads per kilobase per million CDS-aligned reads. Next, the relative numbers of reads counted in RPKM were transformed into the transcript copy numbers. Normally, for each transcript An external file that holds a picture, illustration, etc.
Object name is pcbi.1000865.e260.jpg, An external file that holds a picture, illustration, etc.
Object name is pcbi.1000865.e261.jpg is defined as:

equation image
(7)

where $L$ is the length of the transcript CDS in codons, and [symbol] is the sum of all mappable reads [56]. Assuming a uniform distribution of the mappable reads across the transcriptome coding sequences, the probability of observing [symbol] reads on the i-th transcript CDS of length $L$ in [symbol] attempts corresponds to the fraction of the transcriptome composed of the i-th transcript:

equation image
(8)

where [symbol] is the total length of all CDS of the transcriptome in base pairs. The meaning of [symbol] was explained in the previous section. Substituting the final RPKM values, we get:

equation image
(9)
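For reference, the standard RPKM normalisation referred to around Eq. 7 can be sketched in Python. This is an illustration of the commonly used definition rather than the authors' code; the function and argument names are assumptions, and the CDS length is converted from codons to nucleotides because the text expresses it in codons.

```python
def rpkm(read_count, cds_length_codons, total_mapped_reads):
    """Reads per kilobase of CDS per million mapped reads.

    read_count         -- raw reads aligned to this transcript's CDS
    cds_length_codons  -- CDS length in codons (3 nucleotides each)
    total_mapped_reads -- sum of mappable reads over all transcripts
    """
    cds_length_kb = 3 * cds_length_codons / 1_000
    millions_of_reads = total_mapped_reads / 1_000_000
    return read_count / (cds_length_kb * millions_of_reads)
```

As an example, 600 reads on a 400-codon CDS (1.2 kb) in a library of 2,000,000 mapped reads correspond to about 250 RPKM.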

Although the length of the entire transcriptome was estimated as [value] nucleotides [57], deriving the total CDS length of the transcriptome is more problematic, as little is known about the exact boundaries of non-coding elements in transcript sequences [9]. There have been attempts to determine the length of UTRs on a global scale in yeast [58], [59], but the results show that even the lengths of transcripts derived from the same gene of the same yeast strain cultured under the same growth conditions may vary considerably. This causes discrepancies between the transcript lengths reported by these two studies, making analysis at the level of individual genes difficult and inaccurate.

To overcome this problem, we use $P_z$, the relative rate of binding of free ribosomes to the 5′ end of a given transcript (see Eq. 3). This rate corresponds to the fraction of transcript $i$ in the set of all transcripts. By substituting [symbol] as shown in Eq. 9, we obtained the following relation:

equation image
(10)

where [symbol] is the sum of the mRNA-seq read densities of all transcripts. Thus:

equation image
(11)
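The step from relative read densities to absolute copy numbers can be sketched as follows. The sketch assumes, as described above, that the share of a transcript in the mRNA pool equals its share of the summed mRNA-seq densities; the function name, the dictionary layout, and the gene labels used in any example are hypothetical, and the default of 36,000 transcripts per cell is the value adopted later in the text.

```python
def transcript_copy_numbers(mrna_density_rpkm, total_transcripts=36_000):
    """Scale per-gene mRNA-seq densities to absolute copy numbers per cell.

    mrna_density_rpkm -- dict mapping gene name to mRNA-seq density (RPKM)
    total_transcripts -- assumed total number of mRNA molecules per cell
    """
    total_density = sum(mrna_density_rpkm.values())
    if total_density <= 0:
        raise ValueError("summed density must be positive")
    return {
        gene: total_transcripts * density / total_density
        for gene, density in mrna_density_rpkm.items()
    }
```

By construction, the returned copy numbers sum to the assumed total, so a change in that single global parameter rescales all genes proportionally.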

The next step was to calculate $w$ (the absolute number of translationally active ribosomes attached to the i-th transcript) and $g$ (the measure of ribosome density, as defined in Eq. 5). The dataset used provides information only on ribosome footprints aligned to the coding sequences. However, in practice, there were some exceptions to this rule, caused mostly by the presence of uORFs in the 5′ UTR sequences [22]. Because of the lack of data and the aforementioned difficulties in determining exact transcript length, this fact is not taken into account in our analysis. Furthermore, we defined two global quantities: the number of all ribosomes in a yeast cell, and the fraction of ribosomes participating in the process of translation at the moment of observation. In contrast to raw mRNA-seq reads, the distribution of ribosome footprints is not uniform across the transcriptome, because of differences in the translational activity of genes. Thus, the probability of observing a ribosome attached to the i-th transcript corresponds to the fraction of all ribosome footprints made up of the raw footprints of the i-th transcript. This probability is equal to the ratio of the number of ribosomes engaged in translation of transcripts of type $i$ to the number of all occupied ribosomes in the cell:

equation image
(12)

Thus, ribosome density for the i-th transcript can be calculated as:

equation image
(13)
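The reasoning above can be illustrated with a small Python sketch that distributes the pool of translating ribosomes over genes in proportion to raw footprint counts and then expresses the load per transcript and per 100 codons. It is a sketch under stated assumptions, not the authors' implementation: the function name and data layout are hypothetical, and the defaults (200,000 ribosomes, 85% active) are the global values chosen in the next section.

```python
def ribosome_load(footprint_counts, copy_numbers, cds_lengths_codons,
                  total_ribosomes=200_000, active_fraction=0.85):
    """Return {gene: (w, g)} where w is ribosomes per transcript molecule
    and g is ribosomes per 100 codons.

    footprint_counts   -- raw CDS-aligned footprint counts per gene
    copy_numbers       -- absolute transcript copy numbers per gene
    cds_lengths_codons -- CDS lengths in codons per gene
    """
    total_footprints = sum(footprint_counts.values())
    active_ribosomes = total_ribosomes * active_fraction
    result = {}
    for gene, count in footprint_counts.items():
        # Ribosomes translating all copies of this transcript type.
        ribosomes_on_gene = active_ribosomes * count / total_footprints
        # Ribosomes attached to a single transcript molecule.
        w = ribosomes_on_gene / copy_numbers[gene]
        # Density expressed per 100 codons of the CDS.
        g = 100.0 * w / cds_lengths_codons[gene]
        result[gene] = (w, g)
    return result
```

Summing w multiplied by the copy number over all genes recovers the number of active ribosomes, which is a convenient consistency check on the inputs.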

Global parameters estimation

Three parameters must be estimated to transform relative numbers of transcripts and of ribosomes attached to them into absolute measures: the total number of mRNA transcripts in a yeast cell; the total number of ribosomes in a yeast cell; and the fraction of ribosomes participating in the translation process at the moment of observation. There are many studies concerning the quantitative measurement of yeast cells, and we used the Bionumbers database [60] to extract these data.

Two reports provide independent, yet coherent, estimates of the total number of ribosomes: 187,000 ± 56,000 [9] and 200,000 [57] molecules per cell. In this study, we set the total number of ribosomes to 200,000. The fraction of ribosomes participating in translation was set to 85%, as stated in experimental studies [61], [62]. The number of all transcripts in a cell is more problematic. Many contemporary studies assume that a yeast cell contains 15,000 mRNAs on average [27], [63], which is based on estimations done over 30 years ago [64]. Current research, based on more up-to-date techniques (e.g., in situ hybridisation or GATC-PCR), argues that the number should be at least doubled [65] or even quadrupled [62]. We decided to use a total number of transcripts situated between these estimates and equal to 36,000. This number was also confirmed by other studies [65].

Assuming these three values (36,000 transcripts, 200,000 ribosomes, and 85% of ribosomes engaged in translation), we obtained a mean ribosome density of 1.66 ribosomes per 100 codons. This is in agreement with experimental analysis, which reports that, on average, there is one ribosome per 156 nucleotides, corresponding to a density of 1.92 ribosomes per 100 codons [61]. Moreover, it was estimated that mRNA constitutes 5% of the total amount of RNA present in a cell, and the RNA:DNA ratio is 50:1 [57]. Assuming a yeast genome size of $2.8 \times 10^7$ nucleotides, the expected length of the entire transcriptome would be $7 \times 10^7$ nucleotides. Thus, the length of all transcribed coding sequences can be defined as:

equation image
(14)

The meanings of the symbols used in Eq. 14 are explained above. Thus, the calculated length of all coding sequences equals [value] nucleotides. This would suggest that non-coding elements constitute, on average, more than 50% of a transcript. In conclusion, the chosen parameter values appear to generate reasonable measures of the global characteristics of the yeast cell.
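The two unit conversions used in this plausibility check are simple enough to verify directly. The Python sketch below reproduces them; the function names are hypothetical, and the genome size is left as an explicit argument rather than fixed.

```python
NT_PER_CODON = 3

def ribosomes_per_100_codons(nt_per_ribosome):
    """Convert 'one ribosome per N nucleotides' into ribosomes per 100 codons."""
    codons_per_ribosome = nt_per_ribosome / NT_PER_CODON
    return 100.0 / codons_per_ribosome

def expected_transcriptome_length(genome_size_nt, rna_to_dna_ratio=50.0,
                                  mrna_fraction_of_rna=0.05):
    """Total mRNA length (nt) implied by the RNA:DNA ratio and mRNA share."""
    return genome_size_nt * rna_to_dna_ratio * mrna_fraction_of_rna

# One ribosome per 156 nt is one per 52 codons.
print(round(ribosomes_per_100_codons(156), 2))  # 1.92
```

With the quoted ratios, the expected transcriptome length is 2.5 times the assumed genome size, so the comparison with the coding-sequence total depends directly on that assumption.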

Determining absolute times of translation

In the previous section, we calculated the values of five parameters for each gene. To determine the absolute times of translation $T$, $I$, and $E$, we need to know the times of translation for individual codons. To achieve this goal, we adapted a model proposed for Escherichia coli [66] to the yeast system. Here, we briefly present the model and all of the necessary changes we made. For a description of the derivation, see the original paper.

The transport mechanism in the cytoplasm is diffusion; thus, the aa-tRNAs act as random walkers, and the ribosomes on mRNAs with vacant A sites are the targets. We assume a yeast cytoplasm volume of [value] [67] and divide it into [symbol] walker occupation sites, where:

equation image
(15)

and [symbol] is a measure of the walker size. The walker sizes used previously [66] were determined separately for individual E. coli aa-tRNA molecules [68]. As we are not aware of any similar reports for S. cerevisiae, we decided to use a single value, [value], for all yeast codons, which is the mean of the E. coli values. The average time that elapses before the arrival of a given walker is defined as:

equation image
(16)

where [symbol] is the characteristic time of the given walker, associated with its transition from one cellular occupation site to another. It depends on the size of the walker and on its diffusion coefficient:

equation image
(17)

The values of [symbol] were taken directly from [66]. As this value depends only on the accepted amino acid, we assumed that the difference in size between yeast and E. coli tRNA molecules is negligible. In Eq. 16, [symbol] stands for the probability that an aa-tRNA molecule of a given type arrives at an open A site in the time interval [symbol]; it is proportional to the number of walker occupation sites containing that walker:

equation image
(18)

We assume that the number of molecules of a given walker is proportional to the number of corresponding tRNA genes, which is reasonable, as it was shown that in yeast the concentration of the various tRNA species is largely determined by tRNA gene copy number [69]. In particular, the calculated correlation coefficient between tRNA gene copy number and experimentally determined tRNA abundance for a subset of 21 tRNA species equaled 0.91. According to [57], the RNA:DNA ratio is 50:1 and tRNA constitutes 15% of the total amount of RNA in a yeast cell. Assuming a genome size of $2.8 \times 10^7$ nucleotides, the total cellular tRNA size is $2.1 \times 10^8$ nucleotides. Dividing this by the average tRNA molecule size (74.5 nt) gives 2,818,792 tRNA molecules per cell. Next, this number was multiplied by the fraction of all tRNA genes made up of the tRNA genes of a given type, yielding the absolute number of molecules of each tRNA in a cell. Gene copy numbers and predicted decoding specificities of yeast tRNAs were taken from Table 1 of [69]. The values of all presented parameters for individual tRNAs are gathered in Supplementary Table S4.
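The tRNA bookkeeping described in this paragraph can be reproduced with a few lines of Python. The sketch below is illustrative: the function names and the example tRNA labels are hypothetical, and the genome size of 2.8e7 nucleotides is used because it is the value that reproduces the quoted total of 2,818,792 molecules from the stated ratios.

```python
def total_trna_molecules(genome_size_nt, rna_to_dna_ratio=50.0,
                         trna_fraction_of_rna=0.15,
                         mean_trna_length_nt=74.5):
    """Number of tRNA molecules per cell implied by the stated ratios."""
    total_trna_nt = genome_size_nt * rna_to_dna_ratio * trna_fraction_of_rna
    return total_trna_nt / mean_trna_length_nt

def trna_abundances(gene_copy_numbers, total_molecules):
    """Split the tRNA pool in proportion to tRNA gene copy number."""
    total_genes = sum(gene_copy_numbers.values())
    return {
        trna: total_molecules * copies / total_genes
        for trna, copies in gene_copy_numbers.items()
    }

print(round(total_trna_molecules(2.8e7)))  # 2818792
```

The second function encodes the proportionality assumption only; real gene copy numbers would come from Table 1 of [69].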

All 61 codons that code for the 20 amino acids have one or more aa-tRNAs and varying numbers of near-cognates. Near-cognates are defined as having a single mismatch in the codon-anticodon loop in either the 2nd or 3rd position. Since some cognate tRNAs have a mismatch in the 3rd position, these tRNAs are excluded from the set of near-cognates [70]. The theoretical background of the model is based on the observation that the translation rate of a codon reflects the competition between its non-cognate, near-cognate, and cognate aa-tRNAs [71], and that such nonspecific binding of the tRNAs to the ribosomal A site is rate-limiting to the elongation cycle for every codon [72]. The model of Fluitt et al. [66] introduces two competition measures: the quotient of the summed arrival frequencies of near-cognates to that of cognates, and the quotient of the summed arrival frequencies of non-cognates to that of cognates. For each codon, we determined its cognates, near-cognates, and non-cognates (based on [69]) and calculated both competition measures (see Supplementary Table S2).
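As an illustration of the two quotients, the following Python sketch computes them for a single codon from per-tRNA arrival frequencies. The function name, the input layout, and the tRNA labels in any example are hypothetical; the assignment of tRNAs to the cognate, near-cognate, and non-cognate sets is taken as given.

```python
def competition_measures(arrival_frequency, cognates, near_cognates,
                         non_cognates):
    """Return (near/cognate, non/cognate) ratios of summed arrival frequencies.

    arrival_frequency -- dict mapping tRNA name to its arrival frequency
    cognates, near_cognates, non_cognates -- iterables of tRNA names
    """
    cognate_sum = sum(arrival_frequency[t] for t in cognates)
    if cognate_sum <= 0:
        raise ValueError("a codon needs at least one cognate tRNA")
    near_sum = sum(arrival_frequency[t] for t in near_cognates)
    non_sum = sum(arrival_frequency[t] for t in non_cognates)
    return near_sum / cognate_sum, non_sum / cognate_sum
```

A codon served by a rare cognate tRNA and surrounded by abundant competitors gets large ratios, which is the situation expected to slow its translation.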

According to [66], the average time to add an amino acid coded by a given codon to the nascent peptide chain can be calculated as:

equation image
(19)

where the equation combines three time constants: the average time to insert an amino acid from a cognate aa-tRNA, and the average time delays caused by the binding attempts of near- and non-cognate tRNAs. Based on existing data and the assumption that the activation energies for the various reactions do not vary much, Fluitt et al. [66] showed how to calculate these three values at any given temperature. Table 4 contains these values for S. cerevisiae at 20, 24, 30, and 37°C. Next, we calculated translation rates for all yeast codons at the four temperatures (see Supplementary Table S2). However, as the main part of our analysis is based on the ribosome footprints measured at 30°C, in further calculations we use only the values estimated at this temperature. The last step was to calculate the times $E$ for individual genes, as described in Eq. 4.

Table 4
Times of tRNA insertion at four different temperatures.

Ribosome queuing

It has been found that subsequent ribosomes are loaded onto a transcript fast enough to interfere with each other, leading to ribosome queuing [73]. This phenomenon is usually caused by the presence of rare-codon clusters in the CDS, although other sequence features may also be very important [74]. Such elongation pauses may have distinct consequences, for instance ORF shifting or ribosome dissociation, often followed by decay of the mRNA and of partly completed protein products [75]. Moreover, stalled ribosomes give a false picture of a transcript's translational activity, elevating the observed ribosome density relative to the actual frequency of translation initiation events. For these reasons, we decided to reduce the dataset to the transcripts on which ribosome queuing does not occur. We wrote a simple program that simulates the translocation of ribosomes along a transcript sequence. A ribosome moves from one codon to the next only if it has spent the time required for translation of the current codon (taken from Supplementary Table S2) and the subsequent codon is vacant. Each successive ribosome attempts to attach to the initial AUG codon after the time interval $I$ has elapsed, calculated as shown in Eq. 6. The cumulative time of the movement is calculated for each ribosome separately. If this time is identical for each ribosome, translation is considered to proceed without ribosome queuing. If the time differs, namely, the first ribosome moves faster than the rest, some sequence features allow ribosome stacking under the assumed conditions (i.e., temperature and translational parameters). If the attachment of subsequent ribosomes is prevented by very slow translation of the first few codons, we consider it a particular case of ribosome queuing and reject all such transcripts.
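The simulation described above can be sketched in Python as follows. This is a minimal reconstruction from the verbal description, not the program used in the study. Several choices are assumptions: the function name, the number of simulated ribosomes, a ribosome occupying a single codon (the text only requires the subsequent codon to be vacant), and reporting index 2 only when the second ribosome is delayed at the start codon, so that queues backing up from downstream are still reported as index 1.

```python
import math

def queuing_index(codon_times_ms, initiation_interval_ms, n_ribosomes=5):
    """Return 0 (no queuing), 1 (queuing) or 2 (delayed loading at the 5' end)."""
    if not codon_times_ms:
        raise ValueError("the transcript must contain at least one codon")
    n_codons = len(codon_times_ms)
    prev_leave = None       # times at which the preceding ribosome left each codon
    first_transit = None    # cumulative movement time of the first ribosome
    delayed_loading = False
    queued = False
    for r in range(n_ribosomes):
        scheduled = r * initiation_interval_ms
        # The start codon must have been vacated by the preceding ribosome.
        start = scheduled if prev_leave is None else max(scheduled, prev_leave[0])
        if r == 1 and start > scheduled:
            delayed_loading = True
        leave = []
        arrival = start
        for pos, dwell in enumerate(codon_times_ms):
            done = arrival + dwell
            if prev_leave is not None and pos + 1 < n_codons:
                # Wait until the preceding ribosome has left the next codon.
                done = max(done, prev_leave[pos + 1])
            leave.append(done)
            arrival = done
        transit = leave[-1] - start
        if first_transit is None:
            first_transit = transit
        elif not math.isclose(transit, first_transit):
            queued = True
        prev_leave = leave
    if delayed_loading:
        return 2
    return 1 if queued else 0
```

With codon times of 100, 100, 500, and 100 ms and an interval of 200 ms, the second ribosome is held up behind the slow third codon and the sketch reports index 1; with uniform 100 ms codons it reports 0, and with a 500 ms first codon it reports 2.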

Calculation of protein abundances

To enrich our dataset, we estimated the total number of proteins produced from a given transcript. Considering the mRNA molecules as a decaying quantity, we defined $m$ as the mean lifetime of the i-th transcript expressed in time units:

$$m = \frac{h}{\ln 2}$$
(20)

where $h$ is the half-life of the i-th transcript. Assuming that each translation event happens independently, we calculated the abundance of the i-th protein, $B$, as the number of translation initiation events that happen during the lifetime of the i-th transcript multiplied by its copy number:

$$B = x \cdot \frac{m}{I}$$
(21)

The dataset of mRNA relative half-lives is provided in the Supplementary Materials of [25]. In our calculations, we used the times t0 measured at exponential growth in YPD medium for 5,718 ORFs. Independent studies determined experimentally that the absolute mRNA half-life of the yeast gene YOR202W (HIS3) ranges from 7 min (at 24°C) [76] to 11 min (at 30°C) [77]. Assuming the mean value of 9 min for this gene, we can quantify the half-lives for the rest of the genes in the dataset, as well as their mean lifetimes and protein abundances.
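The chain from relative half-lives to protein abundance can be sketched in Python. The sketch rests on assumptions that should be read as such: relative half-lives are rescaled proportionally so that the reference gene YOR202W has a half-life of 9 min, the mean lifetime follows from the half-life as for any exponentially decaying quantity, and the number of proteins per transcript is the mean lifetime divided by the initiation interval. Function names and example gene labels are hypothetical.

```python
import math

def absolute_half_lives(relative_half_lives, reference_gene="YOR202W",
                        reference_half_life_min=9.0):
    """Rescale relative half-lives so the reference gene has the given value."""
    scale = reference_half_life_min / relative_half_lives[reference_gene]
    return {gene: value * scale for gene, value in relative_half_lives.items()}

def mean_lifetime(half_life):
    """Mean lifetime of an exponentially decaying quantity (same time unit)."""
    return half_life / math.log(2)

def protein_abundance(copy_number, half_life_ms, initiation_interval_ms):
    """Proteins produced from all copies of a transcript over its lifetime."""
    # b: initiation events during the lifetime of one transcript molecule.
    proteins_per_transcript = mean_lifetime(half_life_ms) / initiation_interval_ms
    # B = b * x, as in the column description of Table S1.
    return proteins_per_transcript * copy_number
```

Half-lives must be converted to the same unit as the initiation interval (milliseconds in Table S1) before calling `protein_abundance`.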

Calculations summary

Based on the presented reasoning, we calculated translational parameters for the majority of yeast genes. In particular, parameter $L$ was calculated on the basis of yeast coding sequences downloaded from [55]. Parameter $x$ was obtained from Eq. 11, where the mRNA-seq read density values were taken from the experimental study [22], and the total number of transcripts was set to 36,000, as estimated by [65]. Parameter $g$ was obtained from Eq. 13, where the ribosome footprint values were taken from [22], the total number of ribosomes was set to 200,000, based on [57], and the fraction of ribosomes participating in translation was set to 0.85, as stated in [61], [62]. Parameter $w$ was calculated from Eq. 5. Parameter $E$ was calculated from Eq. 4, based on yeast coding sequences downloaded from [55] and the translation times of codons, calculated as shown in Eq. 19. The three time constants (at 30°C) used in Eq. 19 were calculated as shown in [66], and the two competition measures were calculated separately for each codon as shown in [66], by substituting the numbers of its cognate, near-cognate, and non-cognate tRNAs determined on the basis of [69]. Parameter $I$ was obtained from Eq. 6, and $P$ from Eq. 2. Parameter $P_z$ was obtained from Eq. 10, where the mRNA-seq read density values were taken from the experimental study [22]. Parameter $P_s$ was obtained from Eq. 3 and then normalised by its maximal value, reported for the gene YLL040C. The total time of translation $T$ was calculated as stated in Eq. 1. The mean time required for elongation of one codon of the i-th transcript (mean_E) was calculated by dividing the elongation time $E$ by the length of this transcript in codons, $L$. Parameter $h$ was obtained on the basis of relative half-lives for yeast transcripts reported by [25] and the mRNA half-life of the yeast gene YOR202W, assumed to be on average 9 min [76], [77]. Parameter $m$ was obtained from Eq. 20, and $B$ from Eq. 21. The meaning of all variables was presented at the beginning of this section.

Supporting Information

Figure S1

The comparison of model parameters x and B with experimentally determined mRNA and protein abundances.

(0.26 MB PDF)

Figure S2

The comparison of translational parameters between genes of high and low protein abundance, as well as between genes of high and low ribosome density.

(0.23 MB PDF)

Table S1

A separate csv file containing calculated quantitative measures of translation for 4,621 yeast genes. There are 151 genes for which ribosome queuing was reported (parameter queue != 0, see below); the values of translational parameters of these genes may be irrelevant. Column descriptions: (gene) the systematic name of the yeast gene taken from Saccharomyces Genome Database; (L) length of the transcript CDS in codons; (x) absolute number of gene transcripts in a yeast cell; (b) absolute number of proteins produced from one molecule of a transcript during its lifespan; (B) total amount of protein molecules produced from transcripts of a particular type (B = b * x); (g) ribosome density in number of ribosomes attached to a transcript per 100 codons (g <= 10); (w) absolute number of ribosomes attached to one transcript; (P) translation initiation frequency (the inverse of I); (Pz) relative rate of binding of free ribosomes to the 5′ end of a transcript; (Ps) relative rate of successful accomplishment of initiation once the ribosome-mRNA complex is formed; for clarity, normalised by the maximal observed value of Ps (65.88365), reported for gene YLL040C; (T) total time of translation of one protein molecule from a given transcript in ms (T = I + E); (I) total time (in ms) required for translation initiation, defined as a temporal interval from the point when the free 5′ end of a transcript becomes available for ribosomes to the moment when a ribosome finds the initiation AUG codon and the entire complex starts the phase of elongation; (E) total time required for translation elongation of a transcript in ms; (mean_E) mean time required for elongation of one codon of a transcript in ms; (h) estimated half-life of a transcript in ms; (m) estimated mean lifetime of a transcript in ms; and (queue) ribosome queuing index estimated at 30°C: value “0” - no ribosome queuing was observed for a transcript, value “1” - ribosome queuing was observed for a transcript, and value “2” - the translation at the 5′ end of a transcript is slow enough to delay the attachment of the successive ribosomes to the mRNA molecule.

(0.58 MB CSV)
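The column relations stated in the description above (B = b * x, T = I + E, P as the inverse of I) can be checked mechanically once the file is loaded. The following is a minimal sketch, not the authors' code: it assumes the CSV header uses the column names exactly as listed, the two sample rows and the gene names GENE_A and GENE_B are invented for illustration, and the relation w = g * L / 100 is inferred here from the definitions of g and w rather than stated in the text. The unit of P is also assumed to be the plain reciprocal of I in ms.

```python
import csv
import io
import math

# Hypothetical sample in the assumed column layout; all values are invented
# for illustration and are NOT taken from the supplementary file.
SAMPLE_CSV = """gene,L,x,b,B,g,w,P,T,I,E,queue
GENE_A,200,10,50,500,2.0,4.0,0.0005,22000,2000,20000,0
GENE_B,100,4,25,100,5.0,5.0,0.001,11000,1000,10000,1
"""


def load_rows(text):
    """Parse CSV text into dicts, converting every column except 'gene' to float."""
    rows = []
    for record in csv.DictReader(io.StringIO(text)):
        row = {"gene": record["gene"]}
        for key, value in record.items():
            if key != "gene":
                row[key] = float(value)
        rows.append(row)
    return rows


def without_queuing(rows):
    """Keep only genes with queue == 0, i.e. no reported ribosome queuing."""
    return [row for row in rows if row["queue"] == 0]


def check_relations(row, rel_tol=1e-6):
    """Return, per relation, whether the row satisfies it within rel_tol."""
    return {
        "B = b * x": math.isclose(row["B"], row["b"] * row["x"], rel_tol=rel_tol),
        "T = I + E": math.isclose(row["T"], row["I"] + row["E"], rel_tol=rel_tol),
        # Assumes P is the plain reciprocal of I (both based on ms).
        "P = 1 / I": math.isclose(row["P"], 1.0 / row["I"], rel_tol=rel_tol),
        # Inferred from the definitions: g is ribosomes per 100 codons.
        "w = g * L / 100": math.isclose(
            row["w"], row["g"] * row["L"] / 100.0, rel_tol=rel_tol
        ),
    }


if __name__ == "__main__":
    rows = load_rows(SAMPLE_CSV)
    for row in without_queuing(rows):
        print(row["gene"], check_relations(row))
```

For the real file, the in-memory string would be replaced by the contents of the downloaded CSV, and a looser `rel_tol` may be needed if the published values are rounded.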

Table S2

The list of codons and their properties.

(0.02 MB PDF)

Table S3

The translational parameters calculated for 14 genes coding proteins of the 20S yeast proteasome.

(0.02 MB PDF)

Table S4

Decoding specificities of yeast tRNAs and calculated values of the model parameters for particular codons.

(0.02 MB PDF)

Acknowledgments

We express our gratitude to Nicholas T. Ingolia for providing additional supplementary material on ribosome profiling [22]; without this dataset, this work could not have been done.

Footnotes

The authors have declared that no competing interests exist.

We received no funding for this work.

References

1. Kochetov AV, Kolchanov NA, Sarai A. Interrelations between the efficiency of translation start sites and other sequence features of yeast mRNAs. Mol Genet Genomics. 2003;270:442–7. [PubMed]
2. Kozak M. Structural features in eukaryotic mRNAs that modulate the initiation of translation. J Biol Chem. 1991;266:19867–70. [PubMed]
3. Mignone F, Gissi C, Liuni S, Pesole G. Untranslated regions of mRNAs. Genome Biol. 2002;3:REVIEWS0004. [PMC free article] [PubMed]
4. Dever TE. Gene-specific regulation by general translation factors. Cell. 2002;108:545–56. [PubMed]
5. Belle A, Tanay A, Bitincka L, Shamir R, O'Shea EK. Quantification of protein half-lives in the budding yeast proteome. Proc Natl Acad Sci U S A. 2006;103:13004–9. [PubMed]
6. Beyer A, Hollunder J, Nasheuer HP, Wilhelm T. Post-transcriptional expression regulation in the yeast Saccharomyces cerevisiae on a genomic scale. Mol Cell Proteomics. 2004;3:1083–92. [PubMed]
7. García-Martínez J, González-Candelas F, Pérez-Ortín JE. Common gene expression strategies revealed by genome-wide analysis in yeast. Genome Biol. 2007;8:R222. [PMC free article] [PubMed]
8. Lu P, Vogel C, Wang R, Yao X, Marcotte EM. Absolute protein expression profiling estimates the relative contributions of transcriptional and translational regulation. Nat Biotechnol. 2007;25:117–24. [PubMed]
9. von der Haar T. A quantitative estimation of the global translational activity in logarithmically growing yeast cells. BMC Syst Biol. 2008;2:87. [PMC free article] [PubMed]
10. Anderson L, Seilhamer J. A comparison of selected mRNA and protein abundances in human liver. Electrophoresis. 1997;18:533–537. [PubMed]
11. Greenbaum D, Colangelo C, Williams K, Gerstein M. Comparing protein abundance and mRNA expression levels on a genomic scale. Genome Biol. 2003;4:117. [PMC free article] [PubMed]
12. Griffin T, Gygi S, Ideker T, Rist B, Eng J, et al. Complementary profiling of gene expression at the transcriptome and proteome levels in Saccharomyces cerevisiae. Mol Cell Proteomics. 2002;1:323–333. [PubMed]
13. Gygi S, Rochon Y, Franza B, Aebersold R. Correlation between protein and mRNA abundance in yeast. Mol Cell Biol. 1999;19:1720–1730. [PMC free article] [PubMed]
14. Ideker T, Thorsson V, Ranish J, Christmas R, Buhler J, et al. Integrated genomic and proteomic analyses of a systematically perturbed metabolic network. Science. 2001;292:929–934. [PubMed]
15. MacKay V, Li X, Flory M, Turcott E, Law G, et al. Gene expression analyzed by high-resolution state array analysis and quantitative proteomics: response of yeast to mating pheromone. Mol Cell Proteomics. 2004;3:478–489. [PubMed]
16. von der Haar T, McCarthy J. Intracellular translation initiation factor levels in Saccharomyces cerevisiae and their role in Cap-complex function. Mol Microbiol. 2002;46:531–544. [PubMed]
17. Nie L, Wu G, Zhang W. Correlation between mRNA and protein abundance in Desulfovibrio vulgaris: a multiple regression to identify sources of variations. Biochem Biophys Res Commun. 2006;339:603–610. [PubMed]
18. Tian Q, Stepaniants S, Mao M, Weng L, Feetham M, et al. Integrated genomic and proteomic analyses of gene expression in mammalian cells. Mol Cell Proteomics. 2004;3:960–969. [PubMed]
19. Kolkman A, Daran-Lapujade P, Fullaondo A, Olsthoorn MMA, Pronk JT, et al. Proteome analysis of yeast response to various nutrient limitations. Mol Syst Biol. 2006;2:2006.0026. [PMC free article] [PubMed]
20. Mata J, Marguerat S, Bähler J. Post-transcriptional control of gene expression: a genome-wide perspective. Trends Biochem Sci. 2005;30:506–14. [PubMed]
21. Newman JRS, Ghaemmaghami S, Ihmels J, Breslow DK, Noble M, et al. Single-cell proteomic analysis of S. cerevisiae reveals the architecture of biological noise. Nature. 2006;441:840–6. [PubMed]
22. Ingolia N, Ghaemmaghami S, Newman J, Weissman J. Genome-wide analysis in vivo of translation with nucleotide resolution using ribosome profiling. Science. 2009;324:218–23. [PMC free article] [PubMed]
23. Wang Z, Gerstein M, Snyder M. RNA-seq: a revolutionary tool for transcriptomics. Nat Rev Genet. 2009;10:57–63. [PMC free article] [PubMed]
24. Preiss T, Hentze MW. Starting the protein synthesis machine: eukaryotic translation initiation. Bioessays. 2003;25:1201–11. [PubMed]
25. García-Martínez J, Aranda A, Pérez-Ortín JE. Genomic run-on evaluates transcription rates for all yeast genes and identifies gene regulatory mechanisms. Mol Cell. 2004;15:303–13. [PubMed]
26. Futcher B, Latter GI, Monardo P, McLaughlin CS, Garrels JI. A sampling of the yeast proteome. Mol Cell Biol. 1999;19:7357–68. [PMC free article] [PubMed]
27. Velculescu VE, Zhang L, Zhou W, Vogelstein J, Basrai MA, et al. Characterization of the yeast transcriptome. Cell. 1997;88:243–51. [PubMed]
28. Coghlan A, Wolfe KH. Relationship of codon bias to mRNA concentration and protein length in Saccharomyces cerevisiae. Yeast. 2000;16:1131–45. [PubMed]
29. Holstege FC, Jennings EG, Wyrick JJ, Lee TI, Hengartner CJ, et al. Dissecting the regulatory circuitry of a eukaryotic genome. Cell. 1998;95:717–28. [PubMed]
30. Waldron C, Jund R, Lacroute F. Evidence for a high proportion of inactive ribosomes in slow-growing yeast cells. Biochem J. 1977;168:409–415. [PubMed]
31. Boström K, Wettesten M, Borén J, Bondjers G, Wiklund O, et al. Pulse-chase studies of the synthesis and intracellular transport of apolipoprotein B-100 in Hep G2 cells. J Biol Chem. 1986;261:13800–6. [PubMed]
32. Lodish HF, Jacobsen M. Regulation of hemoglobin synthesis. Equal rates of translation and termination of α- and β-globin chains. J Biol Chem. 1972;247:3622–9. [PubMed]
33. Palmiter RD. Regulation of protein synthesis in chick oviduct. II. Modulation of polypeptide elongation and initiation rates by estrogen and progesterone. J Biol Chem. 1972;247:6770–80. [PubMed]
34. Gehrke L, Bast RE, Ilan J. An analysis of rates of polypeptide chain elongation in avian liver explants following in vivo estrogen treatment. I. Determination of average rates of polypeptide chain elongation. J Biol Chem. 1981;256:2514–21. [PubMed]
35. Baroni MD, Martegani E, Monti P, Alberghina L. Cell size modulation by CDC25 and RAS2 genes in Saccharomyces cerevisiae. Mol Cell Biol. 1989;9:2715–23. [PMC free article] [PubMed]
36. Zhou T, Weems M, Wilke CO. Translationally optimal codons associate with structurally sensitive sites in proteins. Mol Biol Evol. 2009;26:1571–80. [PMC free article] [PubMed]
37. Hershberg R, Petrov DA. Selection on codon bias. Annu Rev Genet. 2008;42:287–99. [PubMed]
38. Drummond DA, Wilke CO. Mistranslation-induced protein misfolding as a dominant constraint on coding-sequence evolution. Cell. 2008;134:341–52. [PMC free article] [PubMed]
39. Fraser HB, Hirsh AE, Wall DP, Eisen MB. Coevolution of gene expression among interacting proteins. Proc Natl Acad Sci U S A. 2004;101:9033–8. [PubMed]
40. Grigoriev A. A relationship between gene expression and protein interactions on the proteome scale: analysis of the bacteriophage T7 and the yeast Saccharomyces cerevisiae. Nucleic Acids Res. 2001;29:3513–9. [PMC free article] [PubMed]
41. Jansen R, Greenbaum D, Gerstein M. Relating whole-genome expression data with protein-protein interactions. Genome Res. 2002;12:37–46. [PubMed]
42. Lithwick G, Margalit H. Relative predicted protein levels of functionally associated proteins are conserved across organisms. Nucleic Acids Res. 2005;33:1051–7. [PMC free article] [PubMed]
43. Najafabadi HS, Salavati R. Sequence-based prediction of protein-protein interactions by means of codon usage. Genome Biol. 2008;9:R87. [PMC free article] [PubMed]
44. Groll M, Ditzel L, Löwe J, Stock D, Bochtler M, et al. Structure of 20S proteasome from yeast at 2.4 Å resolution. Nature. 1997;386:463–71. [PubMed]
45. Kusmierczyk AR, Kunjappu MJ, Funakoshi M, Hochstrasser M. A multimeric assembly factor controls the formation of alternative 20S proteasomes. Nat Struct Mol Biol. 2008;15:237–44. [PubMed]
46. Orphanides G, Reinberg D. A unified theory of gene expression. Cell. 2002;108:439–51. [PubMed]
47. Purvis IJ, Bettany AJ, Santiago TC, Coggins JR, Duncan K, et al. The efficiency of folding of some proteins is increased by controlled rates of translation in vivo. A hypothesis. J Mol Biol. 1987;193:413–7. [PubMed]
48. Zhang G, Hubalewska M, Ignatova Z. Transient ribosomal attenuation coordinates protein synthesis and co-translational folding. Nat Struct Mol Biol. 2009;16:274–80. [PubMed]
49. Kapp LD, Lorsch JR. The molecular mechanics of eukaryotic translation. Annu Rev Biochem. 2004;73:657–704. [PubMed]
50. Arava Y, Boas FE, Brown PO, Herschlag D. Dissecting eukaryotic translation and its control by ribosome density mapping. Nucleic Acids Res. 2005;33:2421–32. [PMC free article] [PubMed]
51. Kozak M. Pushing the limits of the scanning mechanism for initiation of translation. Gene. 2002;299:1–34. [PubMed]
52. Pape T, Wintermeyer W, Rodnina MV. Complete kinetic mechanism of elongation factor Tu-dependent binding of aminoacyl-tRNA to the A site of the E. coli ribosome. EMBO J. 1998;17:7490–7. [PubMed]
53. Yusupova GZ, Yusupov MM, Cate JH, Noller HF. The path of messenger RNA through the ribosome. Cell. 2001;106:233–41. [PubMed]
54. Culver GM. Meanderings of the mRNA through the ribosome. Structure. 2001;9:751–8. [PubMed]
55. Saccharomyces genome database. URL http://www.yeastgenome.org/. [Online; accessed 25-June-2009].
56. Mortazavi A, Williams B, McCue K, Schaeffer L, Wold B. Mapping and quantifying mammalian transcriptomes by RNA-seq. Nat Methods. 2008;5:621–8. [PubMed]
57. Warner JR. The economics of ribosome biosynthesis in yeast. Trends Biochem Sci. 1999;24:437–40. [PubMed]
58. Miura F, Kawaguchi N, Sese J, Toyoda A, Hattori M, et al. A large-scale full-length cDNA analysis to explore the budding yeast transcriptome. Proc Natl Acad Sci U S A. 2006;103:17846–51. [PubMed]
59. Nagalakshmi U, Wang Z, Waern K, Shou C, Raha D, et al. The transcriptional landscape of the yeast genome defined by RNA sequencing. Science. 2008;320:1344–9. [PMC free article] [PubMed]
60. Milo R, Jorgensen P, Moran U, Weber G, Springer M. Bionumbers–the database of key numbers in molecular and cell biology. Nucleic Acids Res. 2010;38:D750–3. [PMC free article] [PubMed]
61. Arava Y, Wang Y, Storey JD, Liu CL, Brown PO, et al. Genome-wide analysis of mRNA translation profiles in Saccharomyces cerevisiae. Proc Natl Acad Sci U S A. 2003;100:3889–94. [PubMed]
62. Zenklusen D, Larson DR, Singer RH. Single-RNA counting reveals alternative modes of gene expression in yeast. Nat Struct Mol Biol. 2008;15:1263–71. [PubMed]
63. Wodicka L, Dong H, Mittmann M, Ho MH, Lockhart DJ. Genome-wide expression monitoring in Saccharomyces cerevisiae. Nat Biotechnol. 1997;15:1359–67. [PubMed]
64. Hereford LM, Rosbash M. Number and distribution of polyadenylated RNA sequences in yeast. Cell. 1977;10:453–62. [PubMed]
65. Miura F, Kawaguchi N, Yoshida M, Uematsu C, Kito K, et al. Absolute quantification of the budding yeast transcriptome by means of competitive PCR between genomic and complementary DNAs. BMC Genomics. 2008;9:574. [PMC free article] [PubMed]
66. Fluitt A, Pienaar E, Viljoen H. Ribosome kinetics and aa-tRNA competition determine rate and fidelity of peptide synthesis. Comput Biol Chem. 2007;31:335–46. [PMC free article] [PubMed]
67. Jorgensen P, Nishikawa JL, Breitkreutz BJ, Tyers M. Systematic identification of pathways that couple cell growth and division in yeast. Science. 2002;297:395–400. [PubMed]
68. Nissen P, Thirup S, Kjeldgaard M, Nyborg J. The crystal structure of Cys-tRNA-Cys-EF-Tu-GDPNP reveals general and specific features in the ternary complex and in tRNA. Structure. 1999;7:143–56. [PubMed]
69. Percudani R, Pavesi A, Ottonello S. Transfer RNA gene redundancy and translational selection in Saccharomyces cerevisiae. J Mol Biol. 1997;268:322–30. [PubMed]
70. Pienaar E, Viljoen HJ. The tri-frame model. J Theor Biol. 2008;251:616–27. [PMC free article] [PubMed]
71. Rodnina MV, Wintermeyer W. Fidelity of aminoacyl-tRNA selection on the ribosome: kinetic and structural mechanisms. Annu Rev Biochem. 2001;70:415–35. [PubMed]
72. Zouridis H, Hatzimanikatis V. Effects of codon distributions and tRNA competition on protein translation. Biophys J. 2008;95:1018–33. [PubMed]
73. Sørensen MA, Pedersen S. Absolute in vivo translation rates of individual codons in Escherichia coli. The two glutamic acid codons GAA and GAG are translated with a threefold difference in rate. J Mol Biol. 1991;222:265–80. [PubMed]
74. Romano MC, Thiel M, Stansfield I, Grebogi C. Queueing phase transition: theory of translation. Phys Rev Lett. 2009;102:198104. [PubMed]
75. Buchan JR, Stansfield I. Halting a cellular production line: responses to ribosomal pausing during translation. Biol Cell. 2007;99:475–87. [PubMed]
76. Herrick D, Parker R, Jacobson A. Identification and comparison of stable and unstable mRNAs in Saccharomyces cerevisiae. Mol Cell Biol. 1990;10:2269–84. [PMC free article] [PubMed]
77. Iyer V, Struhl K. Absolute mRNA levels and transcriptional initiation rates in Saccharomyces cerevisiae. Proc Natl Acad Sci U S A. 1996;93:5208–12. [PubMed]

Articles from PLoS Computational Biology are provided here courtesy of Public Library of Science