Mitochondria play a central role in metabolism, energy production, ion homeostasis, and apoptosis [1] and are found in most eukaryotes. Not surprisingly, a large fraction of all characterized human Mendelian disease genes encode proteins localized to mitochondria [3]. An estimated 700–800 mitochondrial proteins are present in Saccharomyces cerevisiae, with a higher number in humans [2]. It is widely accepted that mitochondria originated from an endosymbiosis between an ancestral alpha-proteobacterium and a eukaryotic host [5]. During evolution, most of the genes encoded by the mitochondrial genome (mtDNA) were transferred to the nucleus or lost [7]; now only eight proteins are encoded by the mtDNA of yeast and 13 by that of humans. Thus, despite having their own genome, mitochondria are highly dependent on extra-mitochondrial processes for their function and biogenesis.
Genome-scale approaches have catalyzed the identification of mitochondrial proteins in different organisms through, for example, analysis of deletion phenotypes [8], subcellular localization [10], gene expression [4], and mass spectrometry-based proteomics [4]. Each systematic dataset surveyed different properties of mitochondrial proteins, identifying proteins that physically reside in mitochondria as well as genes functionally related to the organelle. Comparisons of two datasets in mouse [18] and of 22 datasets in yeast [4] demonstrated that different sources of experimental evidence provide varying degrees of complementary information on mitochondrial localization, phenotype, and regulation. Integrating all data types can therefore overcome the limited sensitivity and specificity of each individual dataset and yield a more accurate catalog of mitochondria-associated proteins [4]. Similar integrated analyses have recently been applied to identify human mitochondrial proteins, despite a much smaller collection of available genome-wide datasets [22].
Nevertheless, characterization of the mitochondrial proteome is still incomplete. To date, 533 proteins have been verified by single-gene studies as localized to mitochondria in yeast [22]. Yet even with this set of 533 annotated mitochondrial proteins, about one third of the expected 800 proteins remain to be verified. Beyond the identification of mitochondrial proteins, the functional role of all proposed candidates remains to be explored in the context of known mitochondrial proteins.
Protein networks, which describe the interrelationships among components, provide a context in which to functionally characterize candidate proteins. Functional links between proteins have been defined based on physical interactions [24], expression regulation [30], mutant phenotypes [8], phylogenetic profiles [35], literature mining [36], and orthology transfer of interaction evidence across species [37]. Analogous to the identification of mitochondrial proteins, integrating heterogeneous but complementary interaction data types improves both the accuracy and the coverage of detected protein associations [39] and has been implemented globally [40]. However, a comprehensive network reconstruction for mitochondria is missing, and a move from a list of proteins to their placement in a functional context is needed.
Here, we analyze yeast mitochondria at a systems level, first by defining an accurate and comprehensive list of mitochondrial components and then by integrating it with diverse data sources on protein associations to construct a network of functional interactions. The network yields a comprehensive map of mitochondrial modules and a functional context for hundreds of uncharacterized components. Analyses of systems properties (conditioned by, but not easily deduced from, the individual parts of the system) suggest hypotheses about expression regulation and evolution. Our survey has implications beyond yeast, in particular for identifying candidate genes for human mitochondrial disorders.