To understand molecular biology at a systems level, it is first necessary to learn the functions of genes by identifying their participation in specific cellular pathways and processes. While protein sequence and structural analyses can provide valuable insights into the biochemical roles of proteins, it has proven much more difficult to associate proteins with the pathways in which they perform these roles. Recently, high-throughput and whole-genome screens have been used to form basic hypotheses of protein participation in biological processes. However, the results of these studies are not individually reliable enough to functionally associate proteins with pathways. Many computational approaches have been developed to integrate data from such high-throughput assays and to generate more reliable predictions, but protein function cannot be confidently assigned without rigorous experimental validation targeted specifically to the predicted pathway or process. Surprisingly few follow-up laboratory efforts have been performed on the basis of computational predictions of protein function; as a result, these computational approaches remain largely unproven and consequently underutilized by the scientific community. Here, we demonstrate that computational predictions can successfully drive the characterization of protein roles using traditional experiments. To test the approach, we systematically measured the mitochondrial transmission rates of a tractable set of S. cerevisiae strains carrying deletions of genes predicted to be necessary for this biological process.
The mitochondrion is an organelle central to several key cellular processes including respiration, ion homeostasis, and apoptosis. Proper biogenesis and inheritance of mitochondria are critical for eukaryotes, as 1 in 5,000 humans suffers from a mitochondrial disease. S. cerevisiae has proven to be an invaluable system for studying a variety of human diseases, including cancer, neurologic disorders, and mitochondrial diseases. Yeast is a particularly attractive model system for studying mitochondrial biology because it can survive without respiration, permitting the characterization of mutants that impair mitochondrial function. The process of mitochondrial biogenesis and inheritance (hereafter, mitochondrial biogenesis) comprises a number of sub-processes that together ensure that new mitochondria are generated and segregated to a daughter cell. Mitochondrial biogenesis begins with the nuclear genes encoding mitochondrial proteins being transcribed, translated, and targeted to the mitochondria for import. The mitochondrion must also replicate its own genome and assemble the numerous membrane-bound complexes necessary for proper function. During mitochondrial transmission, the mitochondria are actively transported along actin cables to the bud neck, where they are then segregated between the mother and daughter cells. Beyond its experimental utility, yeast is well suited to computational prediction approaches because of the availability of manually curated annotations of yeast biology and the wealth of available genome-scale data.
Previous efforts have focused on identifying mitochondria-localized proteins through laboratory techniques such as mass spectrometry and 2D-PAGE and through computational predictions of cellular localization. These approaches have resulted in the identification of over 1,000 mitochondria-localized proteins in S. cerevisiae. However, despite yeast's convenience as a model system, the mitochondrial phenotypes of ~370 of these 1,000 localized proteins have not been characterized, so the mitochondrial roles of these proteins are unknown (over half of these 370 have no known function in any cellular process). Previous computational efforts have attempted to address this problem by predicting putative mitochondrial protein modules and examining expression neighborhoods around mitochondrial proteins. While valuable, these predictions of protein function have not been confirmed through laboratory efforts. Rather, these studies have performed assays for protein localization to the mitochondria, which is not sufficient to convert these predictions to concrete knowledge of protein roles.
Here, we describe a strategy that combines computational prediction methods with quantitative experimental validation in an iterative framework. Using this approach, we identify new genes with roles in the specific process of mitochondrial biogenesis by directly measuring the ability of cells carrying deletions of candidate genes to propagate functioning mitochondria to daughter cells. We assayed our 193 strongest predictions for which the literature contained no prior experimental evidence, in the form of phenotypes and interactions, establishing a function in mitochondrial biogenesis. Through these assays, we discovered an additional 109 proteins required for proper mitochondrial biogenesis at a level of rigor acceptable for function annotation. Further, we identified more specific roles in mitochondrial biogenesis for several predicted genes through mitochondrial motility assays and measurements of respiratory growth rates. We also discovered genes with redundant mitochondrial biogenesis roles through targeted examination of double knockout phenotypes. These results demonstrate that using an ensemble of computational function prediction methods to target definitive, time-consuming experiments to a tractably sized set of candidate proteins can result in the rapid discovery of new functional roles for proteins. Our results also show that most mutants resulting in severe respiratory defects have already been discovered. This is likely to be the case for mutant screens in many fundamental biological processes, because saturating screens have discovered mutations with strong phenotypes. However, even in a well-studied eukaryote like S. cerevisiae, there are many processes that have not yet been fully characterized by identifying all proteins required for their normal function. As such, most of the remaining undiscovered protein functions are only identifiable by rigorous, quantitative assays that can detect subtle phenotypes, such as those used in our study.