|Home | About | Journals | Submit | Contact Us | Français|
Conceived and designed the experiments: GL CJ. Performed the experiments: YSG. Analyzed the data: GL CG CJ. Contributed reagents/materials/analysis tools: LAS GD. Wrote the paper: GL CJ.
Examples of metabolic rhythms have recently emerged from studies of budding yeast. High density microarray analyses have produced a remarkably detailed picture of cycling gene expression that could be clustered according to metabolic functions. We developed a model-based approach for the decomposition of expression to analyze these data and to identify functional modules which, expressed sequentially and periodically, contribute to the complex and intricate mitochondrial architecture. This approach revealed that mitochondrial spatio-temporal modules are expressed during periodic spikes and specific cellular localizations, which cover the entire oscillatory period. For instance, assembly factors (32 genes) and translation regulators (47 genes) are expressed earlier than the components of the amino-acid synthesis pathways (31 genes). In addition, we could correlate the expression modules identified with particular post-transcriptional properties. Thus, mRNAs of modules expressed “early” are mostly translated in the vicinity of mitochondria under the control of the Puf3p mRNA-binding protein. This last spatio-temporal module concerns mostly mRNAs coding for basic elements of mitochondrial construction: assembly and regulatory factors. Prediction that unknown genes from this module code for important elements of mitochondrial biogenesis is supported by experimental evidence. More generally, these observations underscore the importance of post-transcriptional processes in mitochondrial biogenesis, highlighting close connections between nuclear transcription and cytoplasmic site-specific translation.
In bacterial and eukaryotic cells, gene expression is regulated at both the transcriptional and translational levels. In eukaryotes these two processes cannot be directly coupled because the nuclear membrane separates the chromosomes from the ribosomes. Although the transcription levels in different cellular conditions have been widely examined, genome-wide post-transcriptional mechanisms are poorly documented and therefore, the connections between the two processes are difficult to explain. In this work, the time-regulated expression of the genes involved in the construction of the mitochondrion, an important organelle present in nearly all the eukaryotic cells, was scrutinized both at transcriptional and post-transcriptional levels. We observed that temporal transcriptional profiles coincide with groups of genes which are translated at specific cellular loci. The description of these relationships is functionally relevant since the genes which are transcribed early in mitochondria cycles are those which are translated to the vicinity of mitochondria. In addition, these early genes code for essential assembling factors or core elements of the protein complexes whereas the peripheral proteins are translated later in the cytoplasm. Also, these observations support the concerted action of important regulatory factors which control either the gene transcription level (transcription factors) or the mRNA localization (mRNA-binding proteins).
Cell construction requires the tight linking of various molecular processes, from nuclear transcription to the site-specific production of proteins. The control of the orchestration of these processes remains poorly understood. In classical experimental conditions, coordinated waves of transcription are difficult to observe because of the metabolic asynchrony of the cells in growing cultures. A yeast system with properties avoiding these difficulties was recently described. In well-defined continuous cultures of Saccharomyces cerevisiae, the oxygen consumption rate oscillates with a constant period , implying that cell-to-cell signaling synchronizes oxidative and reductive functions in the culture. The gene-expression dynamics of the yeast metabolic cycle is therefore a useful model system for studies of the lifecycle of groups of transcripts in eukaryotic cells . Indeed, microarray studies have demonstrated periodicity in the expression of the yeast genome, and consequently the existence of similar temporal expression patterns in functionally connected groups of genes . Genes specifying functions associated with energy appeared to be expressed with exceptionally robust periodicity, consistent with the variations in the amount of dissolved oxygen in the medium of synchronized culture. In pioneering studies , it was shown that yeast mitochondrial morphology oscillates in response to energetic demands driven by the ultradian clock output.
In this work, our purpose was to distinguish temporal gene clusters, which may allow describing a biologically relevant scenario of mitochondria biogenesis. Depending on the addressed points and on the quality of the microarray data, several methods such as SVD (Singular Value decomposition), PCA (Principal Components Analysis), self-organizing maps, wavelet multiresolution decomposition and FFT (Fast Fourier Transform) have been used to analyze relevant transcript data . We decided to use a model-based approach  to decomposition of published expression data for the 626 oscillating nuclear genes encoding mitochondrial proteins. We established a classification of these genes into temporal groups, which cover the 5-hour long metabolic cycle, and present a dynamic and global picture of mitochondrial biogenesis. These temporal groups correlate both with particular functional properties of the corresponding proteins and with specific translational sites in the cell. This global description of mitochondrial transcriptome clusters in temporal phases is consistent with the concept of RNA regulons, according to which post-transcriptional RNA operons may constitute an important element of eukaryotic genome expression ,.
Microarray data from the study by Tu et al.  were collected from the Gene Expression Omnibus database , under accession number GSE3431. This dataset comprised the normalized gene expression values, i.e. the median of each array (all data points) equal 1, used by Tu et al.  in their pioneering analysis. Tu et al.  performed microarray experiments at 25-minute intervals, over three consecutive metabolic cycles (the length of one cycle is ~300 minutes). For each gene, expression measurements were thus available for 36 successive time points. We considered only those genes for which expression measurements were available and which (i) displayed significant periodic patterns, as defined by Tu et al.  (~3552 genes with a confidence level greater than 95%) and (ii) were identified as involved in mitochondrial biogenesis, as defined by Saint-Georges et al.  (~794 genes). The resulting expression matrix comprised data for 626 genes (the complete list is available in Dataset S1).
Our aim was to investigate in more detail the published gene expression profiles obtained from yeast cell cultures displaying highly periodic cycles in the form of respiratory bursts. Starting from previous work , we developed the EDPM algorithm (Expression Decomposition based on Periodic Models) to analyze precisely gene expression patterns during yeast metabolic cycles. Overview of the EDPM procedure is presented in Figure S1. The main idea is to decompose each gene expression profile obtained with microarray technology (vector D, Figure 1A), into a mixture of pre-defined model patterns (matrix P, Figure 1A). For that, the algorithm calculates a vector W of ω-values, such that the standard multiplication of vector W and matrix P, forms a vector M that reproduces the initial vector of expression measurements D (eq. 1 below). The ω-values are determine using an optimization procedure (see below and Figure 2 for an illustration) and therefore indicate the contribution of each model pattern to the expression pattern observed for a particular gene. The data vector M matches the vector D exactly if P is a perfect model of the biological:
The W vector is calculated to minimize the square of the distance between the M and D vectors. For a given gene i, the criterion to be optimized — i.e. numerically minimized — to find the optimum solution of ω-values is as follows:
Where, is the microarray expression measurement of gene i at time t, is the value of model pattern k at time t, and K is the number of model patterns (K=15 in this study, see below). Pi is a penalty function introduced to ensure that the sum of ω-values is equal to 1:
Hence, the greater the value of the greater the contribution of model pattern k to the observed expression pattern of gene i.
Finally, two others parameters controlling the amplitude and the level of the M vector oscillatory signals are also included in the optimization procedure (Text S1). They are not specified here for the sake of clarity.
All the genes whose expression oscillates in Tu et al.  dataset exhibit specific properties, as periodic signals among 3 successive cycles and a unique period for all genes. Using these two characteristics, we defined 15 model patterns (see Figure 1B) according to simple cosine functions, i.e.
Where w=1 (Tu et al.  periodic patterns have a constant period) and t varies from (three periods to model the three successive metabolic cycles covered by the microarray measurements of Tu et al. ). represents a time interval between the different model patterns, it varies from such that each model pattern reaches its maximal value a different time t. As all the model patterns differ only in terms of the time interval between patterns, the ω-values calculated by EDPM can be seen as a kind of gene “footprint”, indicating the time phase during the metabolic cycle, at which the gene is strongly expressed (and/or its mRNA is present in larger amounts).
Note that in EDPM, model patterns are pre-defined in order to match specific properties shared by the analyzed gene expression patterns. The more the model patterns are adaptable to the observed gene expression measurements, the more the EDPM optimization is efficient, i.e. the final value (eq. 2) is close to 0. In the case study of the YMC biological process presented here, cosine functions appeared to be a relevant choice (Text S1). In principle, any microarray dataset can be analyzed using the EDPM approach, it may only required the definition of new model patterns adapted to the gene expression characteristics.
As an illustration, we used the EDPM algorithm to analyze the gene expression profiles of the three sentinel genes (MRPL10, POX1 and RPL17B) used by Tu et al.  to define three successive superclusters of gene expression during the metabolic cycle: R/B (reductive, building), R/C (reductive, charging) and Ox (oxidative). The results are shown Figure 1C. D and M vectors are superimposed and represented graphically in the upper part of panel C. The initial vector of expression measurements, D, obtained with microarray technology is plotted in orange; the M vector, obtained by the multiplication of vector W (calculated with EDPM) and model pattern P is plotted in red. EDPM appears to give a smoothed representation of gene expression profiles and hence facilitates identification of the time interval in which the gene is mostly expressed. The ω-derived “footprint” representations of each gene expression profile are represented in the bottom part of panel C. As expected, they are clearly different and reveal three distinct periods for maximal gene expression: successive green patterns nos. 1 and 15 for the R/B sentinel gene (MRPL10); blue pattern no. 5 for the R/C sentinel gene (POX1); and red pattern no. 12 for the Ox sentinel gene (RPL17B).
To cluster genes whose RNA level peaks at the same time points in the yeast metabolic cycle (YMC), we used the ω-values obtained for each gene using EDPM algorithm (see previous paragraph). Pearson correlation coefficients (r) were calculated between all W vector pairs, and hierarchical cluster analysis was applied. This classical clustering method can be summarized as follows: (1) Distances (d) between all W vector pairs is calculated using Pearson's correlation analysis (d=1−r); (2) The resulting distance matrix is thoroughly inspected to find the smallest distance; (3) The corresponding genes are joined together in the tree and form a new cluster; (4) The distances between the newly formed cluster and the other genes are recalculated; (5) Steps 2, 3 and 4 are repeated until all genes and clusters are linked in a final tree.
We searched for cis-acting signals in 3′ and 5′UTR sequences, using motifs predicted by the MatrixREDUCE algorithm . For 3′UTR signals, we tested several motifs identified in previous studies , as possible binding sites for mRNA stability regulators in Saccharomyces cerevisiae. For 5′ UTR signals, we examined upstream regions between nucleotide positions −600 and −1 and searched for motifs between 1 and 7 nt long. We assessed whether any of the signals were observed at a frequency greater than that expected by chance, by calculating p-values as described in  (hypergeometric distribution). We then search the YEASTRACT database for transcription factors with DNA-binding sites matching the motifs identified with MatrixREDUCE .
The EDPM algorithm was implemented in R programming language (http://cran.r-project.org/) and functions were numerically minimized using the quasi-Newton method (R function available in the BASE package). Hierarchical clustering was carried out with the “hclust” function (also available in R programming language), with the “ward” method for gene agglomeration. MatrixREDUCE source code is freely available online from http://bussemaker.bio.columbia.edu/software/MatrixREDUCE/ and was used for analyses of upstream sequences with default parameters (see the documentation available online for more information).
All the strains used in this study are isogenic to BY4742 (MATα; his3Δ1; leu2 Δ0; lys2 Δ0; ura3Δ0) from the Euroscarf gene deletion library.
To test the maintenance of the mitochondrial genome, the mutant cells were crossed with the rho° control strain KL14-4A/60 (MATa his1, trp2, rho°), devoid of any mitochondrial genome, and the diploid growth was tested on respiratory medium containing 2% glycerol.
To test mutant respiratory growth, the cells were streaked on non-fermentable media containing 2% glycerol, 2% ethanol or 0.5% lactate and incubated for several days at 28 or 36°C.
Cytochrome absorption spectra of whole cells grown on 2% galactose were recorded at liquid nitrogen temperature after reduction by dithionite using a Cary 400 spectrophotometer (Varian, San Fernando, CA) .
Tu et al.  showed that yeast cells grown under continuous and nutrient-limited conditions display highly periodic cycles (called Yeast Metabolic Cycles or YMC), in the form of respiratory bursts in which more than half the entire genome (~3552 genes in S. cerevisiae) is expressed in a periodic manner. Among these numerous genes whose mRNA level is modified during YMC, we identified 626 genes as being involved in mitochondrial biogenesis. These genes account for 86% of the nuclear genes known to encode proteins found in mitochondria . This observation suggests that genes associated with the mitochondria tend to display more periodic patterns of expression than the other yeast genes (p-value is lower than 1×10−40). This set of 626 genes was used for the following analyses.
To investigate in detail the published gene expression profiles for the 626 genes involved in mitochondrial biogenesis, we developed the EDPM algorithm (Expression Decomposition based on Periodic Models, see Methods and Figure 1) for two reasons. First, we wanted to identify precisely the time interval in the YMC during which each mitochondrial gene is mostly expressed. Second, we wanted to associate mitochondrial genes into temporal classes representing distinct expression phases during YMC. The principle of the EDPM algorithm consists in breaking each gene expression pattern down into a mixture of model patterns (Figure 1A and Figure S1). These model patterns are time-delayed mathematical functions mimicking ideal expression oscillations (1 to 15, Figure 1B). EDPM allows the calculation of ω-derived “footprint” representations of each gene expression profile; these representations describe the contribution of each model pattern to the gene expression profile analyzed and hence, are good indicators of the time at which the RNA of the gene peaks in abundance during the metabolic cycle (see Figure 1C for an illustration). Note that two genes, expressed at the same moment but with different magnitudes, may exhibit identical ω-footprints with EDPM.
The 626 genes involved in mitochondria biogenesis were analyzed with EDPM (Dataset S2) and classified in several clusters according to their ω-footprints. Hierarchical cluster analysis (see Methods) objectively supported a six-cluster distribution. These clusters were named A to F. Each comprises a distinct subclass of genes that are periodically expressed and the mRNAs in each cluster peaks in different time windows of the metabolic cycle (see Dataset S3 for a complete list of genes in each cluster).
In Text S1, we present detailed justifications for the use of EDPM algorithm together with several methodological controls. Important conclusions could be raised from these analyses. In particular, we could demonstrate that a minimal number of 9 model patterns is required to stabilize the gene repartition into phases A to F. Under 9, the number of model patterns was not enough to precisely indentified the phase transitions EDPM. The choice of 15 model patterns appeared to be a good compromise between a precise determination of transition phases and the computation time required to perform EDPM decomposition. Moreover, we could observe that the EDPM criterion was significantly smaller using the real expression data rather than the sample data. This justified the use of EDPM for the 626 genes analyzed in this study. All these genes exhibit periodic gene expression profiles during the YMC (this was demonstrated by Tu et al. ) and hence are compatible with the 15 model patterns used here. Moreover, it should be noted that the final w-values distribution is also a good indicator of the EDPM relevance. In case of shuffle data, the w-values are homogeneously distributed, indicating that no particular model pattern can explain the random expression profiles (see Text S1 for an illustration). Finally, a comparative analysis of the clustering results obtained with EDPM w-values and other methodologies allowed us to demonstrate that EDPM improves the dissection of the expression temporal waves.
The gene clusters presented in the previous section can be distinguished by the model patterns that contribute the most to the EDPM decompositions (see vertical arrows, middle panel Figure 3). The 15 model patterns differ by the time at which they reach their maximal values, so there is a direct relationship between clusters A to F and the temporal phases during metabolic cycle (Figure 4A). For instance, cluster A (262 genes) and B (77 genes) comprised genes whose EDPM decomposition preferentially followed model patterns n°1 and n°2, respectively. The mRNAs of these genes peak at the very beginning of the metabolic cycle (between 0 and 75 minutes). Genes in clusters C (123 genes) and D (27 genes) mainly conformed to model patterns n°5 and 8 and are expressed in the middle of the metabolic cycle (between 75 and 200 minutes). Finally, clusters E (30 genes) and F (107 genes) followed model patterns n°11 and 13 and comprised genes that are expressed at the end of the metabolic cycle (between 200 and 300 minutes).
Previous cluster analysis of the entire microarray data set identified three superclusters of gene expression termed R/B (reductive, building), R/C (reductive, charging) and Ox (oxidative) . Our analysis is consistent with this three cluster organization but it offers a more refined view of transcriptome dynamics. R/B genes are found in phases A and B, R/C genes in phases C and D, and Ox genes in phases E and F. To assess the biological relevance of the chronological order of the transcriptional classes A to F, we grouped the genes involved in mitochondria biogenesis into eleven model functional groups (see Dataset S4 for a detailed list of genes attributed to each functional group). Seven of these groups are shown in Figure 4B. They are labeled “Translation machinery” (83 genes), “Translation regulation” (47 genes), “Assembly factors” (32 genes), “Protein import” (40 genes), “Respiratory chain complex” (34 genes), “TCA cycle” (21 genes) and “Amino acid synthesis” (31 genes). We determined the percentage of genes in each biological function that belong to each temporal class A to F and there was a clear chronological distribution, highly biologically relevant (Figure 4B). For instance, functional discrimination between phases A and B is critical for respiratory chain complex assembly: almost all the genes (~78%) encoding assembly factors for these complexes have a high mRNA level in phase A (see the function named “Assembly factors”), whereas most genes (~56%) encoding structural units are present in phase B (function named “Respiratory chain complex”). This is important because temporal discrimination between the two gene classes probably facilitates complex construction: assembly factors being required at the start of subassembly intermediate formation (see, for instance, the case of Shy1p in COX assembly ). More generally, mRNAs coding for mitochondrial proteins peak at different times during the yeast metabolic cycle. The first mRNAs to appear are those for genes whose function is associated with the translation machinery (or regulation) and assembly factors (Figure 4B, phase A), followed by those involved in the synthesis of the respiratory chain structural proteins (Figure 4B, phase B) and finally mRNAs coding for enzymes involved in amino-acid biosynthesis are more abundant in phase F. This implies that mRNAs and probably the corresponding mitochondrial proteins are produced sequentially along the yeast metabolic cycle described by Tu et al. .
We compared the cellular localization of translation of the mRNAs for genes in clusters A to F. We previously described three classes of nuclear mRNAs encoding mitochondrial proteins, differing in their sites of translation . Class I and II mRNAs are found near mitochondria, whereas class III mRNAs are translated on free cytoplasmic polysomes. The subcellular localization of class I mRNAs is dependent on Puf3p, whereas class II mRNAs are Puf3p independent. The distribution of mRNAs of these three translation classes among the successive temporal phases (A to F) is represented in Figure 4C. The most salient feature is the substantial overlap between phase A and class I genes. Class I mRNAs dominate in phase A (183 of 262 genes), with only a few class I genes in other phases. Class II and III mRNAs are more evenly distributed, with the frequency of class II mRNAs being highest in phase C, and class III mRNAs being more frequently present in phases D, E and F. These observations imply coordination between mRNA oscillations and site of translation in the cytoplasm. This phenomenon is the consequence of transcriptional and post-transcriptional regulations and is presumably controlled by complex coordination of trans-acting factors acting on cis elements that remain to be identified. We therefore systematically searched for cis-acting signals in 5′ and 3′ UTR sequences, using several approaches (see below).
We investigated the regulatory processes governing the tight coordination of gene expression in phases A to F, by applying the MatrixREDUCE algorithm . The motif with the highest score was CCAATCA (see Dataset S5 for complete results). This motif is compatible with the binding site of the transcription factor Hap4p , a transcriptional activator and global regulator of respiratory gene expression. The proportion of genes belonging to each A to F phases for which Hap4p binding sites were found in upstream sequences is presented in Figure 5A. Nineteen % of the genes in cluster B and 15% of the genes in cluster C contain Hap4p binding sites. The corresponding enrichment p-values are significant, at 3×10−5 (cluster B) and 2×10−4 (cluster C) (see Methods). We analyzed other HAP genes encoding transcription factors involved in the regulation of gene expression in response to oxygen levels. The HAP1 gene was identified as particularly interesting because (i) its RNA level, like that of HAP4, varies substantially during the metabolic cycle (see Figure 5C) and (ii) the upstream sequences of more than 15% of the genes in cluster B (p-value=7×10−4) have a Hap1p binding site . Thus, the transcription factors Hap4p and Hap1p are excellent candidates to play an important role in the regulation of the yeast metabolic cycle.
Our search of regulatory elements in 3′ UTR sequences used work by Foat et al. , in which the authors used their MatrixREDUCE algorithm to identify binding sites for six mRNA stability regulators in Saccharomyces cerevisiae. The consensus sequences for these motifs are CCUGUAAAUACCC (=P3E), UUAUGUAUCAUA (=P4E), UAUAUAUUCUUA (=PRSE), CUGAUUACACGG (=RUPE), UUUGAUAGACUC (=NRSE1), UUGUGUAAUCCAUCGAUCAU (=NRSE2) and determine binding specificity for several RNA-binding proteins, including Puf3p (P3E motif) and Puf4p (P4E motif). We assessed the potential link between the occurrence of these motifs in 3′ UTR sequences and the temporal phases A to F by calculating the proportion of genes in each cluster with at least one motif. The P3E, PRSE and NRSE1 motifs appeared to be significantly overrepresented in phases A (p-value 6×10−120), B/C (6×10−8) and F (1×10−9), respectively (Figure 5B). The P3E motif is the site recognized by the RNA-binding protein Puf3p , which contributes to localizing mRNAs close to mitochondria . We extended this observation, by considering all Puf3p targets experimentally determined by  (Text S2), and observed that more than 80% of the Puf3p mRNA targets are found in phase A. The presence of a Puf3p motif in class A genes is fully consistent with the translational properties of class I mRNAs: the localization of these mRNAs near mitochondria is altered when PUF3 is deleted .
This analysis leads to the prediction that unknown cluster A genes translated in the vicinity of mitochondria in a Puf3p-dependent way (class I) are likely to be involved in early steps of mitochondria biogenesis. To test this experimentally, we examined the properties of nine strains carrying deletions of uncharacterized cluster A/class I genes (Figure 6B and Dataset S3). For each mutant strain, we checked the ability to grow on non fermentable carbon sources and tested the assembly of respiratory complexes III and IV by recording cytochrome spectra (see Methods). Disturbance of early steps of mitochondrial biogenesis —for example replication of mitochondrial DNA, mitochondrial transcription and translation— can affect maintenance of mitochondrial DNA , we also tested whether these mutant strains retained the mitochondrial chromosome by measuring the production of petite cells (rho−). The phenotypes of these deleted strains are presented in Figure 6C. Strikingly, seven out of the nine gene-deleted strains displayed severe respiratory dysfunctions (poor growth on non-fermentable media) and/or alterations in their cytochrome spectra. These phenotypes strongly suggest that most of the unknown phase A/class I genes have functions in mitochondrial transcription/translation or assembly of respiratory complexes. This is strongly in favour of the idea that during this short period (phase A lasts only 25 minutes, Figure 4A), there is a surge in the abundance of mRNAs important for mitochondrial biogenesis and that they are translated at particular subcellular localization.
It was recently observed , that yeast cells can be synchronized and exhibit synchronous waves of storing and then burning carbohydrates. Using microarrays, it was shown that many nuclear genes coding for mitochondrial proteins, have their mRNAs which oscillate and peak at a time when highest rate of respiration has passed. It was suggested  that cells are either rebuilding or duplicating their mitochondria at this time. We took advantage of these data to better analyze the mitochondria rebuilding program and identified new gene clusters reflecting spatio-temporal groups of gene expression. Our findings are entirely consistent with the notion of RNA regulons ,, according to which mRNA-binding proteins (RBP) play an important role, coordinating the various post-transcriptional events. We show here that 262 mRNAs coding for important mitochondrial proteins (assembly factors, ribosomal proteins, translation regulators) are coordinately and periodically present in increasing amounts early in the mitochondrial cycle (phase A=25 minutes). In addition, most of these mRNAs are specifically localized in the vicinity of mitochondria under the control of the protein Puf3p. This suggests that during this particular time-window, Puf3p acts in the control of mRNA localization/translation. During the rest of the mitochondrial cycle, Puf3p may function (possibly in association with other RBPs) either in mRNA degradation  or in the control of bud-directed mitochondrial movement . Following this early phase A, phases B (50 minutes) and C (50 minutes) concern elements of the fundamental mitochondrial machineries (respiratory chain complexes, TCA cycle, etc.). Undoubtedly, this chronology of events should reflect the logic of mitochondria construction.
This point can be illustrated with the well-documented assembly process of cytochrome c oxidase (COX) ,, a fascinating process involving the sequential and ordinate addition of 11 subunits to an initial seed consisting of Cox1p (Table 1, “core” and “shield proteins”). In addition to the structural subunits, a large number of accessory factors are required to build the holoenzyme. Unexpectedly, we found that all the mRNAs for these accessory factors are relatively abundant early in mitochondrial biogenesis, that is during phase A. Cluster A includes genes whose expression is essential for a preliminary step, consisting of the synthesis of all the elements (RNA polymerase, ribosomes, translation factors) required for mitochondrial production of Cox1p; this step is followed by the construction of the core enzyme (Cox1p+Cox2p+Cox3p). We also observed that the mRNAs coding for the 18 assembly factor transcripts involved in COX assembly , are mostly found during phase A (Table 1, “assembly factors”) and, in addition, all but one are translated in the vicinity of mitochondria under the control of Puf3p (MLR class I, ). The situation is very different for structural COX proteins (shield proteins of the complex). Except for Cox5A, all the corresponding mRNAs are found in phase B, indicating that the corresponding genes are expressed after those of phase A. Unlike phase A mRNAs, they are all translated on free cytoplasmic polysomes (MLR class III, ). This scenario agrees with the previous biochemical description of short intermediates ; especially interesting is the observation that Cox5Ap, found here in phase A, was previously identified as the first structural protein added to the S2 complex . The properties of COX assembly described here are common to the other respiratory chain complexes. The mRNAs for assembly factors mostly peak in phase A and they are translated close to mitochondria, under the control of Puf3p; they initiate the formation of respiratory complexes by the successive addition of structural proteins whose mRNAs mostly peak in phase B. This is the first evidence that, at least in the conditions described in , the construction of the respiratory chain is one of the first steps of mitochondrial biogenesis; indeed, all the production machinery (assembly factors, translation, etc.) are available in phase A to produce and assemble the protein complexes in phase B.
Genes coding for mitochondrial proteins can be classified into two different regulatory systems. This dichotomy is well illustrated in the case of OXPHOS complexes coding genes. The first class corresponds to mRNAs translated to the vicinity of mitochondria, mainly present in phase A and which code, for instance, for assembly factors. Genes of the second class code for structural proteins, and are found mainly in phases B or C during which transcription regulation is the major mechanism. Previous studies suggested that genes coding for assembly factors are not transcriptionally regulated . We confirmed and extended these preliminary observations by showing that genes encoding assembly factors: (i) are expressed before genes encoding structural proteins, (ii) have a functional Puf3p binding site which controls localization/translation to the vicinity of mitochondria and may thus generate discrete foci on the matrix face of the mitochondrial membrane, and (iii) do not contain any evident signals in their 5′UTR, a feature which distinguishes them from the genes encoding structural proteins. The mRNAs for translation and assembly factors are all expressed only during phase A, but mRNAs for structural proteins are found during phases A, B and to a lesser extent C. This is likely to reflect the timing of the building of the various complexes. Thus, for instance, COX assembly requires an intact functional ATPase , which is in agreement with the fact that mRNAs for ATPase structural proteins are mostly found in phase A (see Dataset S4) whereas the COX equivalents are mostly in phase B (see Dataset S4). Also, unlike genes encoding assembly factors, genes coding for structural proteins of the respiratory chain complexes are mainly controlled transcriptionally. According to the environmental conditions, either Hap4p (depending on carbon availability ,) or Hap1p (depending on oxygen concentration ), regulate the transcription of nuclear genes coding for structural proteins. Binding sites for these two transcription factors are present significantly more frequently than expected from a random distribution in the genes of clusters A, B and C (Figure 5A). In addition, the amounts for both HAP4 mRNA and HAP1 mRNA also oscillate and peak in phases A and C, respectively (Figure 5C). HAP1 mRNA variation is interesting because Hap1p can repress its own transcription and may act either as a repressor or as an activator, depending on oxygen levels . It was observed that fluctuating levels of O2 dissolved in the culture, indicates changing activities of mitochondrial oxygen consumption and cellular redox switching . Thus, Hap1p, which is an oscillating redox sensor, is an excellent candidate to signal the transition between non-respiratory rebuilding and respiratory phases (Figure 5D).
Overall, we report a comprehensive picture of the biogenesis of yeast mitochondria and illustrate spatio-temporal differences between groups of nuclear genes. The unexpected finding that transcriptionally or post-transcriptionaly regulated groups of genes are expressed both at different times and translated in different places may be of relevance to mitochondria in other species. Indeed, mammalian β F1-ATPase mRNA is found in the outer membrane and is translated, under the control of 3′UTR signals and RNA-binding proteins , only during cell cycle phase G2/M ; this gives credence to the general applicability of our observations. Studies with human cells are currently underway to assess the similarities and differences between yeast and human cells regarding these aspects of mitochondrial biogenesis.
Figure that describes the principle of the EDPM procedure.
(0.38 MB PDF)
Detailed justifications for the use of EDPM algorithm together with several methodological controls.
(1.60 MB PDF)
Distribution of all Puf3p targets (as determined experimentally by Gerber et al.) across the A to F phases.
(0.43 MB PDF)
Complete gene lists with expression measurements from Tu et al. , for the 626 genes analyzed in this study.
(0.46 MB XLS)
EDPM results for each of the 626 genes analyzed in this study.
(0.20 MB XLS)
Distribution of genes into the six temporal phases of the mitochondrial cycle.
(0.21 MB XLS)
Distribution of genes into 11 major functional classes, and properties of the genes.
(0.12 MB XLS)
Detailed results obtained with the MatrixREDUCE algorithm, searching for 5′ regulatory signals.
(0.02 MB XLS)
We thank Frédéric Devaux for fruitful discussions, Catherine Etchebest for advices concerning Fourier analysis, Benjamin Tu for assistance about data normalization, and finally Adam Bussière for his helpful support during the manuscript revisions. We would also like to honor the memory of Prof. Serge Hazout, who came up with the initial idea for the EDPM algorithm.
The authors have declared that no competing interests exist.
ANR-06-BLAN-0234-01 and ARC (A08/2/1034). The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.