We predicted gene function using synthetic lethal genetic interactions between null alleles in Saccharomyces cerevisiae. Phenotypic and protein interaction data indicate that synthetic lethal gene pairs function in parallel or compensating pathways. Congruent gene pairs, defined as sharing synthetic lethal partners, are in single pathway branches. We predicted benomyl sensitivity and nuclear migration defects using congruence; these phenotypes were uncorrelated with direct synthetic lethality. We also predicted YLL049W as a new member of the dynein–dynactin pathway and provided new supporting experimental evidence. We performed synthetic lethal screens of the parallel mitotic exit network (MEN) and Cdc14 early anaphase release pathways required for late cell cycle. Synthetic lethal interactions bridged genes in these pathways, and high congruence linked genes within each pathway. Synthetic lethal interactions between MEN and all components of the Sin3/Rpd3 histone deacetylase revealed a novel function for Sin3/Rpd3 in promoting mitotic exit in parallel to MEN. These in silico methods can predict phenotypes and gene functions and are applicable to genomic synthetic lethality screens in yeast and analogous RNA interference screens in metazoans.
The robustness of a biological network to defects can be probed by synthetic lethality, which reveals that a cell survives individual gene deletions, but cannot survive deletion of specific gene pairs. Synthetic lethal interactions have been rationalized with two hypotheses: (i) two genes in a single linear pathway can show synthetic lethality; (ii) synthetic lethal genes act in parallel or compensating pathways (Tucker and Fields, 2003). These two hypotheses predict distinctly different patterns of synthetic lethality: enrichment of interactions within single pathways versus depletion of interactions within pathways and enrichment between pathways. These two hypotheses also make different predictions for the non-lethal phenotypes of the underlying single gene deletions: a shared phenotype for genes in a single pathway, or possibly differing phenotypes for genes in parallel pathways.
Hypothesis (i) is possible only when alleles are hypomorphic but not complete loss-of-function mutants: each mutation reduces flux partially, but the combined reduction from two mutations leads to lethality. Hypothesis (i) does not apply to synthetic lethality between null alleles, with complete loss of function. Hypothesis (ii) is expected in this case, with each null mutation knocking out one of the two parallel pathways that sustain normal growth. In this view, an essential protein complex that retains function when single non-essential subunits are deleted (but not multiple subunits simultaneously) is formally represented by multiple pathways, one for each functional stoichiometry, connected in parallel.
Data sets to test these rationales are arising from high-throughput synthetic lethality screens accomplished in Saccharomyces cerevisiae using synthetic genetic array (SGA) and synthetic lethality analysis on microarrays (SLAM). These screens test a deletion of interest (query gene) against all possible viable yeast single-deletion strains (target genes) (Tong et al, 2001; Ooi et al, 2003; Pan et al, 2004). As human disease susceptibility may encompass gene mutations in multiple pathways, synthetic lethality is relevant to human disease processes (Tucker and Fields, 2003).
We focus on the subset of genetic interactions restricted to synthetic lethal interactions and synthetic fitness (slow growth) defects between null alleles. These interactions are easier to interpret than more general genetic interactions (enhancer, suppressor screens) or other types of mutant alleles (e.g., hypomorphs of essential genes). Null mutants constructed by the International Yeast Gene Deletion Consortium represent the vast majority currently under study by the yeast community (Giaever et al, 2002). For brevity, we use the term synthetic lethal to include both the lethal and reduced fitness phenotypes.
Synthetic lethal interactions have been used to predict that interaction partners share function in the same pathway (Tong et al, 2001, 2004; Wong et al, 2004). Here, we emphasize the alternative hypothesis suggested above, that synthetic lethal interactions bridge parallel pathways, which are in a sense orthogonal to direct synthetic lethal interactions (Figure 1A). This concept is formalized computationally as follows. Pathway membership is inferred using the hypergeometric P-value for a shared pattern of interaction partners, which we abbreviate as the congruence score (Figure 1B). We present evidence that functional associations inferred from the congruence score are stronger than associations between the synthetic lethal interaction partners themselves. Two types of functional associations are explored: biochemical participation in protein complexes, through joint analysis of synthetic lethal interactions (Tong et al, 2004) with protein complex data (Gavin et al, 2002; Ho et al, 2002; Mewes et al, 2004) (see Supplementary information, Supplementary Figures S1 and S2); and phenotypes of the underlying single gene deletion mutants, including nuclear migration and drug sensitivity. The nuclear migration assay and the physical interaction detected between Jnm1p and Yll049wp confirm our prediction that the previously uncharacterized yeast gene YLL049W is a new member of the dynein–dynactin pathway.
As has been noted previously, only ~1% of synthetic lethal interactions occur between genes whose products reside in a single protein complex (Tong et al, 2001). While, as pointed out by the authors of that paper, this is a greater fraction than would be expected by chance, it is clear that the vast majority of synthetic lethal interactions are not explained by common protein complex membership and we would argue that this 1% represents the exception and not the rule. The parallel pathway model suggests that genes sharing synthetic lethal interaction partners may function in a single pathway, and their gene products should have an increased probability to reside in a single protein complex.
The raw number of shared genetic interaction partners has been used previously to rank the probability of a physical interaction between the corresponding gene products (Tong et al, 2004). Here, we instead use the hypergeometric P-value for the number of shared neighbors, which accounts for the number of interaction partners of each gene (Figure 1B). To convert this value to a convenient scale, we define the congruence score as the negative log10 of the P-value; related measures have been used to analyze protein interaction networks (Goldberg and Roth, 2003; Schlitt et al, 2003) and multiple characters from single RNA interference (RNAi) screens (Gunsalus et al, 2004). The congruence score has the benefit of providing a natural significance threshold that incorporates the size of the network. The performance of a predictive method can be visualized by plotting the number of true positives versus the number of false positives as a function of the number of predictions made, known as a receiver operating characteristic (ROC) curve. Based on the area under the ROC curve, the congruence score method outperforms counting the number of shared partners in predicting protein complex membership in the stringent regime (Supplementary Figure S3 and Supplementary Table S1).
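The definition above can be illustrated with a minimal sketch. It assumes an upper-tail hypergeometric test (the probability of sharing at least the observed number of partners) and takes the pool of possible partners to be the set of screened genes; the function name, argument names, and example counts are hypothetical, and the exact formulation in Figure 1B may differ.

```python
import math

def congruence_score(n_universe, n_a, n_b, n_shared):
    """Return -log10 of the hypergeometric P-value for two genes sharing
    at least n_shared partners.

    n_universe: number of possible partners (assumed: all screened genes)
    n_a, n_b:   number of partners of gene A and of gene B
    n_shared:   number of partners common to both genes
    """
    # Exact integer arithmetic avoids floating-point underflow for tiny P-values.
    denom = math.comb(n_universe, n_b)
    numer = sum(
        math.comb(n_a, i) * math.comb(n_universe - n_a, n_b - i)
        for i in range(n_shared, min(n_a, n_b) + 1)
    )
    if numer == 0:
        # No configuration can produce this much overlap.
        return math.inf
    return math.log10(denom) - math.log10(numer)

# Hypothetical example: two target genes hit by 10 and 12 of 126 queries,
# with 8 queries in common.
print(congruence_score(126, 10, 12, 8))
```

Because the P-value depends on the partner counts of both genes, two highly connected genes need more shared partners to reach the same score than two sparsely connected genes.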
We separated the synthetic lethal interaction data into ‘query’ and ‘target’ sets, based on whether each gene node represents a non-essential query gene (126 are included in the published data) or a target gene (982 of which are synthetic lethal partners of at least one query). We calculate congruence scores for each pair of target genes (Supplementary Figure S4).
The fraction of target gene pairs in the same protein complex (Gavin et al, 2002; Ho et al, 2002) increases with congruence score, rising to 100% at the highest values (Figure 2A). Analysis using the MIPS database of curated complexes (Mewes et al, 2004) yields similar results (Supplementary Figure S5). Even for the smallest non-zero congruence scores, the observed fractions of pairs within the same complex are greater than expected by chance (P<0.005). Gene products of pairs with congruence score ≥5 have a higher probability of protein complex co-residence than products of synthetic lethal interaction partners. Moreover, using synthetic lethal interactions to predict complex co-residence yields a higher false positive rate ([false positives]/[false positives+true negatives]) and a higher false discovery rate ([false positives]/[false positives+true positives]) than using the congruence score (Supplementary Figure S3).
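The two error rates defined above can be computed as in the following sketch. It assumes each gene pair carries a boolean prediction (for example, congruence score above a cutoff) and a boolean truth label (co-residence in a known complex); the function name and the toy labels are hypothetical.

```python
def error_rates(predictions, truths):
    """Return (false positive rate, false discovery rate).

    predictions, truths: equal-length sequences of booleans, one per gene pair.
    """
    pairs = list(zip(predictions, truths))
    tp = sum(1 for p, t in pairs if p and t)
    fp = sum(1 for p, t in pairs if p and not t)
    tn = sum(1 for p, t in pairs if not p and not t)
    # FPR = FP / (FP + TN); FDR = FP / (FP + TP). NaN when undefined.
    fpr = fp / (fp + tn) if (fp + tn) else float("nan")
    fdr = fp / (fp + tp) if (fp + tp) else float("nan")
    return fpr, fdr

# Toy labels for five gene pairs (not data from the study).
predicted = [True, True, False, False, True]
in_same_complex = [True, False, False, False, False]
print(error_rates(predicted, in_same_complex))
```

The false positive rate is normalized by the number of true negatives in the data, whereas the false discovery rate is normalized by the number of predictions made, so the two can diverge sharply when true positives are rare.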
Functional associations, determined by extracting Gene Ontology (GO) (Ashburner et al, 2000) annotations and calculating correlations based on the depth of the deepest parent term (see Materials and methods), are greater for congruent genes than for synthetic lethal pairs. Biological Process and Cellular Component correlations increase with congruence score and are greater than the similarity between direct genetic interaction partners (Figure 2B). As is typically the case, the GO Molecular Function annotations have smaller correlation as they refer to molecular, rather than biological, roles. For congruence scores ≥7, ≥10, and ≥6, the GO process, function, and component correlations for congruent gene pairs, respectively, are significantly higher than the corresponding correlations for the raw synthetic lethal pairs (0.25, 0.05, and 0.31; P<0.05). Calculations based on semantic similarity of GO terms (Lord et al, 2003) show even stronger performance of the congruence score relative to synthetic lethality (Supplementary Figure S12).
In summary, a congruence interaction with score ≥10 provides a tighter functional relationship than synthetic lethality, consistent with our interpretation of single versus parallel pathways. Although individual synthetic lethal gene pairs may share synthetic lethal partners (as observed by Tong et al, 2004), a high congruence score typically excludes direct synthetic lethal interaction, in agreement with our model (Figure 2C). When the congruence score is greater than or equal to 14, the binomial P-value for the observed number of synthetic lethal interactions becomes insignificant given the overall frequency of synthetic lethal interactions observed in the entire congruence data set (P>0.05).
A network generated by setting a threshold congruence value of ≥10 recapitulates known functional associations and suggests novel associations (Figure 2D). Sets of genes known to function within the same pathway tend to cluster together. As expected, the congruence links overlap known protein interactions, whereas synthetic lethal links do not. For example, a prefoldin complex gene cluster inferred from congruence links (PAC10, GIM3, GIM4, GIM5, and YKE2) corresponds to the PAC10 complex shown in Supplementary Figure S1B.
In some cases where proteins encoded by genes with congruence links were not detected within the same protein complex by high-throughput studies (Gavin et al, 2002; Ho et al, 2002), other experiments have indicated physical interactions. SWR1, SWC1, VPS71, VPS72, SIF2, and ARP6 encode subunits of SWR1 chromatin remodeling complex catalyzing exchange of histone H2A with histone variant Htz1p (Mizuguchi et al, 2004). Genes in a highly connected congruence cluster may function in the same pathway through transient physical interactions, or they may participate in a pathway as separate physical entities. For example, Cin1p, Cin2p, and Pac2p are all tubulin folding factors that function in a pathway leading to microtubule stability (Hoyt et al, 1997). Physical interaction between Pac2p and Cin1p has been reported (Fleming et al, 2000). Cin8p is a kinesin motor protein involved in mitotic spindle assembly and chromosome segregation, and interacts with microtubules (Gheber et al, 1999). Possibilities include that Cin1p, Cin2p, Pac2p, and Cin8p interact transiently during mitosis, or that they influence the same molecular environment independently. For example, activities of Cin1p, Cin2p, and Pac2p might generate an optimal microtubule substrate for Cin8p.
The largest connected component in Figure 2D includes known members of the dynein–dynactin spindle orientation pathway (ARP1, NUM1, DYN1, PAC11, PAC1, DYN2, JNM1, YMR299C, and NIP100) and corresponds to a group observed previously using clustering (Tong et al, 2004). The dynactin protein complex (Arp1p, Jnm1p, and Nip100p) defined by biochemical studies is required for proper spindle orientation and chromosome partitioning to daughter cells during anaphase (Kahana et al, 1998). Additional reported protein–protein interactions in this congruence cluster include Jnm1p–Yll049wp, Nip100p–Pac11p, Pac11p–Dyn2p, and Pac11p–Num1p (Uetz et al, 2000; Farkasovsky and Kuntzel, 2001; Ito et al, 2001). We predict YLL049W as a new component of the dynein–dynactin spindle orientation pathway, which is consistent with previous observation (Tong et al, 2004). We have experimentally validated the functional prediction of YLL049W by showing that its null mutant allele exhibits a nuclear migration defect similar to dynactin component JNM1. Furthermore, we have successfully detected a physical interaction between Jnm1p and Yll049wp using a directed two-hybrid test. Both experiments will be described in detail in the next section. The second uncharacterized open reading frame (ORF), YDR149C, is also congruent to dynein–dynactin components. Its ORF overlaps the beginning of its neighbor NUM1, and we suggest that the ydr149cΔ phenotype is in fact due to concomitant mutation of NUM1.
Distinct lesions to a single pathway branch should result in similar systems-level perturbations. We reasoned that similarity of a numeric phenotype of a deletion mutant should be better predicted by congruence score than by a direct synthetic lethal interaction.
We investigated the ability of the congruence score to predict the penetrance of nuclear migration defects in a population of mutant cells. Mutations in the dynein–dynactin spindle orientation pathway are known to increase the nuclear migration defect rate. We selected six genes in the pathway as landmarks (DYN1, ARP1, DYN2, JNM1, NUM1, and NIP100) and then measured the defect rate at 13°C for 59 mutants of genes with congruence score ≥4 to at least one of the landmarks (Supplementary Figure S6 and Supplementary Table S2). To summarize the relationship between phenotype and congruence score, each mutant's migration defect (% abnormal) was plotted as a function of its congruence scores to the landmark genes (Figure 3A). The average congruence score is highly correlated with the defect rate (Spearman correlation coefficient=0.51, two-sided P=3.9 × 10⁻⁵). Additionally, at or above a congruence score of 10, all mutants exhibit moderate to severe nuclear migration defects (14–80% abnormal cells).
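The rank correlation used above can be sketched as follows, assuming the standard definition of the Spearman coefficient as the Pearson correlation of average ranks (ties share the mean of their positions). The function names and the toy numbers are hypothetical, and the sketch does not compute the two-sided P-value.

```python
import math

def average_ranks(values):
    """Return 1-based ranks; tied values receive the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the whole group of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def spearman(x, y):
    """Spearman coefficient; undefined (division by zero) if x or y is constant."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = math.sqrt(sum((a - mx) ** 2 for a in rx))
    sy = math.sqrt(sum((b - my) ** 2 for b in ry))
    return cov / (sx * sy)

# Toy values: average congruence to landmarks vs. % abnormal cells
# (illustrative only, not measurements from the study).
avg_congruence = [4.2, 5.0, 7.5, 10.1, 15.2]
percent_abnormal = [3.0, 2.0, 12.0, 20.0, 45.0]
print(spearman(avg_congruence, percent_abnormal))
```

A rank-based coefficient is a natural choice here because the relationship between congruence score and defect rate need only be monotonic, not linear.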
Among the mutants found to exhibit a nuclear migration defect was one representing the unstudied gene YLL049W (Supplementary Table S2). Further analysis of the yll049w mutant showed that the observed defects are temperature-dependent, similar to jnm1 mutants, whereas a mutant for the Kinesin-related KIP2 gene displayed temperature-independent defects (Supplementary Table S3). Notably, the JNM1–YLL049W congruence score (15.2) is higher than the JNM1–KIP2 congruence score (10.8), consistent with more similar phenotypes.
It is evident from this analysis that uncharacterized ORF YLL049W is required for robust nuclear migration. High-throughput yeast two-hybrid results suggested a protein–protein interaction between Yll049wp and dynactin subunit Jnm1p (Ito et al, 2001). We have experimentally confirmed this physical interaction between Yll049wp and Jnm1p using a different two-hybrid system (Supplementary Figure S7). These results provide supporting evidence for interaction between the two proteins, but do not address whether the association is stable, transient, or bridged by other proteins. The dynein–dynactin pathway for nuclear positioning includes many protein components that are not dynein or dynactin complex members, whose contributions influence microtubule dynamics, the formation of a capture site on the cell cortex, and proteins that regulate spatial and temporal steps in the determination of nuclear orientation and migration during the cell cycle (Sheeman et al, 2003; Knaus et al, 2005; Li et al, 2005). Kip2p acts to ensure nuclear positioning within the dynein–dynactin pathway (Miller et al, 1998) by transporting dynein to the microtubule plus ends (Lee et al, 2003; Carvalho et al, 2004).
Our data indicate that YLL049W is a previously unknown component of the dynein–dynactin spindle orientation pathway and suggest that it might be a subunit of yeast dynactin. Elucidation of the specific molecular function of YLL049W will require further study.
To test the general applicability of the congruence score as a phenotype predictor, we chose sensitivity to benomyl, a microtubule-depolymerizing agent, as our second phenotype assay for deletion mutants. The microtubule biogenesis gene CIN1 (Hoyt et al, 1990) was selected as the benomyl-sensitive landmark. Null mutants of 31 genes with congruence scores ≥4 for CIN1 were tested for growth defects on medium containing 5 μg/ml of benomyl at 25°C (Supplementary Table S4). With increasing congruence score cutoff, the fraction of benomyl-sensitive null mutants rises to 1 (Figure 3B). We again observed significant correlation between the congruence score and the fraction of benomyl-sensitive mutants (Spearman correlation coefficient=0.49, two-sided P=0.006).
To validate the hypothesis that congruence inferred from synthetic lethality indicates a closer functional association between genes than direct synthetic lethality, we selected seven benomyl-sensitive mutant strains as landmarks (cin1Δ, yml094c-aΔ, pac10Δ, pfd1Δ, gim3Δ, tub3Δ, and gim5Δ) from the top of the list of 451 candidate benomyl-sensitive mutant strains from a recent high-throughput genetic screen (Pan et al, 2004). We then ranked genes based on their average congruence score with the seven landmarks (Supplementary Table S5). As a test of the competing hypothesis that synthetic lethal interactions themselves indicate direct functional associations, we also ranked genes by the raw number of synthetic lethal interactions with the seven landmarks (Supplementary Table S6). The congruence score and the raw number of interactions were then tested for correlation with benomyl LD50, the dose that is lethal to at least 50% of the cells, equivalent to the control/experimental hybridization signal ratio ≥2 used as a threshold by Pan et al (2004). The congruence score is significantly correlated with LD50 (Spearman correlation coefficient=−0.17, two-sided P=0.04), but the number of synthetic lethal links is not (Spearman correlation coefficient=0.06, two-sided P=0.22) (Figure 3C and D). These results support the idea that genetic congruence correlates better with a given phenotype than direct synthetic lethal interaction and indicate that congruence is a superior measure for predicting certain phenotypes.
All genes having high congruence scores with landmarks are involved in direct microtubule biogenesis. For example, PAC10, YKE2, GIM3, GIM4, and GIM5 all belong to the prefoldin complex that acts to deliver unfolded proteins to cytosolic chaperonin (Geissler et al, 1998; Vainberg et al, 1998). On the other hand, we noticed that some genes with multiple synthetic lethal interaction links with landmarks tend to function in a distinct pathway from microtubule biogenesis. For example, SWC1 and ARP6 are subunits of SWR1 chromatin remodeling complex catalyzing exchange of histone H2A with histone HTZ1 (Mizuguchi et al, 2004).
Because increasing congruence score is related to protein complex co-residence, we predicted that genes encoding proteins known to co-reside in a complex would have similar synthetic lethal interaction profiles. We verified this hypothesis using PFD1 as a dSLAM (diploid-based synthetic lethality analysis on microarrays) query; the remaining prefoldin complex members have been characterized as queries in the SGA study. We identified 33 PFD1 synthetic lethal partners (Supplementary Table S7). High congruence values between PFD1 and other prefoldin components, GIM3, GIM4, GIM5, PAC10, and YKE2, equal to 14, 14, 9, 15, and 16, demonstrate the overlap between congruence links and protein complex membership (Supplementary Table S8). The five prefoldin members used as query genes in SGA exhibit much more significant overlap among themselves (congruence scores in the range of 23–67) than to PFD1. However, this may arise from systematic biases between the SGA and dSLAM methods rather than a biological distinction for PFD1. Additionally, 13 of 33 PFD1 synthetic lethal partners map to reported protein complexes (Supplementary Table S7). Notably, none of the 33 PFD1 synthetic lethal partners is a prefoldin component. This supports the hypothesis that physical and synthetic lethal interactions are generally orthogonal.
We further tested the hypothesis that synthetic lethal interactions between null alleles define parallel pathways by performing dSLAM screens of genes required for mitotic exit. Two parallel pathways, the Cdc14 early anaphase release (FEAR) network and the mitotic exit network (MEN), are required for release of the essential protein phosphatase Cdc14p from the nucleolus during the yeast cell cycle (Stegmeier et al, 2002). Components of the FEAR network include SLK19 and SPO12, whereas those of MEN include LTE1 and CLA4. Cells mutant in both pathways fail to release Cdc14p from the nucleolus and arrest in telophase with a large-budded morphology.
To test the parallel pathway model, we performed dSLAM experiments using SLK19, SPO12, and LTE1 as queries; CLA4 was previously used as a query in the SGA study (Tong et al, 2004) (Supplementary Table S9). We re-identified known synthetic lethality interactions between the FEAR and MEN pathways (Stegmeier et al, 2002; Goehring et al, 2003). High congruence was observed between SLK19 and SPO12, and between LTE1 and CLA4, but not across FEAR/MEN pathways (Supplementary Table S10). In addition, our genome-wide screens discovered synthetic lethal interactions between LTE1 and SIN3, RPD3, PHO23, and SAP30, the components of the Sin3/Rpd3 histone deacetylase complex (Loewith et al, 2001) (Figure 4A and Supplementary Table S9). Although most were initially not identified in the previous study (Tong et al, 2004), we also observed synthetic lethal interactions between CLA4 and all four components of the Sin3/Rpd3 complex (data not shown). These interactions were specific to MEN because synthetic lethality was not observed between the Sin3/Rpd3 histone deacetylase components and FEAR network components in either dSLAM or individual assays. These results led us to predict that the Sin3/Rpd3 histone deacetylase might play an important role during mitotic exit when the MEN pathway is mutated. In support of this, cells of the double mutants, lte1Δ rpd3Δ, lte1Δ sin3Δ, lte1Δ sap30Δ, lte1Δ pho23Δ, were unable to exit mitosis, and arrested with a dumbbell-shaped morphology typical of a mitotic exit defect. Furthermore, the viability of these double mutants was restored when TAB1-6, a dominant allele of CDC14 that binds weakly to the negative regulator Cfi1p/Net1p (Shou et al, 2001), but not the wild-type CDC14 was expressed (Figure 4B). Interestingly, this TAB1-6 allele also suppressed the lethality of an lte1Δ slk19Δ double mutant (Figure 4B). Thus the Sin3/Rpd3 histone deacetylase likely acts in parallel with MEN in promoting exit from mitosis.
Synthetic lethal interaction provides evidence for compensating gene function. This compensation has been rationalized as buffering within a single pathway, or buffering between two parallel or compensating pathways (Tong et al, 2001, 2004; Wong et al, 2004). We find that the parallel pathway model permits successful inference of protein complex membership from synthetic lethal data. The parallel pathway model, but not the single pathway model, yields successful predictions for phenotypes including nuclear migration defect rates and drug sensitivity. The parallel pathway model is also consistent with known pathways comprising genes identified in synthetic lethal screens. The model motivated our confirmation of YLL049W as participating in the dynein–dynactin nuclear migration pathway by phenotypic analysis, permitted identification of benomyl-sensitive strains based on congruence to landmark genes, and yielded a novel prediction of Sin3/Rpd3 histone deacetylase as a new module for mitotic exit that acts in parallel with MEN.
Using a different analysis strategy, Kelley and Ideker (2005) recently reported that synthetic lethal interactions are typically ‘between pathway’, whereas ‘within-pathway’ interactions occur infrequently. For their purposes, all subsets of proteins that are densely connected by physical interactions in non-mutant cells were considered ‘within pathway’. If a pathway is defined strictly by its components, however, the view that null allele synthetic lethality must always occur between parallel pathways can be enforced, precluding ‘within-pathway’ explanations. In such a view, members of a protein complex that functions in the absence of either of two subunits, but not both, would participate in three parallel pathways: one that includes all possible components, and one for each ‘incomplete’ complex (all of which might function in non-mutant cells). More generally, methods that summarize synthetic lethal relationships are often more useful than raw synthetic lethal pairs.
This recent analysis also predicted that Yll049wp associates with dynactin during spindle orientation (Kelley and Ideker, 2005), consistent with our observation from congruence analysis that YLL049W is functionally related to the dynein–dynactin pathway. Our characterization includes experimental validations that support the prediction, and provides evidence from the congruence score and detailed phenotype that the function of YLL049W is more similar to that of JNM1 than to that of KIP2. Confirmation of a physical interaction between Yll049wp and Jnm1p further suggests that the prediction will be useful in future detailed analysis of the molecular role of YLL049W.
The congruence score metric compares favorably with other methods for inferring functional associations from synthetic lethal data. First, it produces stronger inference of gene function than the underlying direct genetic interactions. For example, direct interactions are unable to predict benomyl sensitivity, whereas congruence is a strong predictor of similar sensitivity. Second, the congruence metric naturally provides a P-value and can give improved performance relative to the raw count of the number of shared interaction partners. Finally, the P-values provided by the congruence score can provide an advantage over methods such as hierarchical clustering, which continue to depend on visual inspection of clusters and definition of cluster boundaries.
The quantitative score attached to each congruent pair allows interactions to be filtered at a chosen threshold, so that experimentalists can focus on the network features supported by the most significant evidence in the data set and include less significant observations when desired. Importantly, a congruence summary at any significance level quantitatively relates genes according to their functional similarity as reflected in their interaction profiles, not according to individual synthetic lethal pairings. To identify congruent gene pairs with greater or lesser significance, the interaction linkages can be annotated, or the map can be redrawn at differing congruence cutoff scores. For example, Supplementary Figure S8, Figure 2D, and Supplementary Figure S9 show the target gene congruence network at congruence score thresholds of 8, 10, and 15, respectively. This aspect of network analysis will become increasingly important as the information summarized within it grows. Some biologically important relationships may inherently be present in the genetic congruence network only at relatively low significance overall. These can be viewed by extracting a local network containing first-degree congruence relationships in much the same way as the current large-scale interaction network is commonly viewed in subsections (Tong et al, 2001, 2004; Ooi et al, 2003).
A possible limitation of our analysis is the low coverage of the synthetic lethal network, with only ~2% screened by high-throughput methods using query genes selected on the basis of specific biological themes (Tong et al, 2004). To assess the sensitivity of our analysis to missing data, and also to possible false positives, we repeated our analysis with data sets modified to contain up to 30% false positives (random interactions added to the data) and 30% false negatives (observed interactions removed from the data) (Supplementary Figure S10). Note that the false-positive rate is quite low for the SGA data owing to confirmation by tetrad or random spore analysis; false negatives are estimated in the range of 17–41% (Tong et al, 2004). Although the congruence scores shift to lower values, the overall performance is similar to using the original data set (compare Figure 2 and Supplementary Figure S10). These observations suggest that the congruence score method is robust to noisy and incomplete data.
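The perturbation test described above can be sketched as follows. This is a minimal illustration under stated assumptions: both noise fractions are taken relative to the number of observed interactions, false positives are drawn uniformly from unobserved query–target pairs, and enough unobserved pairs exist to satisfy the request. The function name, parameters, and toy network are hypothetical and may not match the procedure behind Supplementary Figure S10.

```python
import random

def perturb(edges, queries, targets, fp_frac, fn_frac, seed=0):
    """Return a noisy copy of a query-target interaction set.

    edges:   iterable of (query, target) pairs observed in the screen
    fp_frac: number of random pairs to add, as a fraction of len(edges)
    fn_frac: number of observed pairs to drop, as a fraction of len(edges)
    """
    rng = random.Random(seed)
    original = sorted(set(edges))
    n = len(original)
    # False negatives: keep a random subset of the observed interactions.
    kept = set(rng.sample(original, n - round(fn_frac * n)))
    # False positives: add random pairs that were not observed.
    observed = set(original)
    n_add = round(fp_frac * n)
    added = set()
    while len(added) < n_add:
        pair = (rng.choice(queries), rng.choice(targets))
        if pair[0] != pair[1] and pair not in observed:
            added.add(pair)
    return kept | added

# Toy network: 3 queries, 10 targets, 10 observed interactions.
queries = ["q1", "q2", "q3"]
targets = ["t%d" % i for i in range(10)]
edges = [(queries[i % 3], targets[i]) for i in range(10)]
noisy = perturb(edges, queries, targets, fp_frac=0.3, fn_frac=0.3)
print(len(noisy), len(noisy & set(edges)))
```

Congruence scores would then be recomputed on the perturbed network and compared with those from the original data.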
Continuing genetic interaction screens will generate increasing volumes of data. A critical challenge is to develop computational approaches for integrating these data and, eventually, understanding gene function. Several hurdles will need to be surmounted. Essential genes are missing from the synthetic lethal network, although they may be probed eventually using non-null mutant alleles. Certain higher-order redundancies may also require deletion of more than two genes to be observed. The most promising way to ease these limitations may be to combine different types of networks for improved inference. We have performed joint analysis of the genetic and physical networks to argue that the correct functional links between genes should be orthogonal to the synthetic lethal interactions (see Supplementary information). Future studies that combine other types of heterogeneous network data, such as gene expression and phylogenetic information, should further improve our inference of biological systems.
This work in budding yeast, made possible by the development of the comprehensive deletion collection, massively parallel phenotyping techniques, and quantitative analysis of synthetic lethal interaction data within a statistical framework, will create a template for testing and improving our understanding of biological buffering and genetic robustness in many systems as researchers gather similar data sets from other organisms. Genome-wide synthetic lethality screens using RNAi are becoming available in other organisms (van Haaften et al, 2004) and may eventually allow analysis similar to the one we have performed in yeast. Full-genome RNAi screens have been conducted for Caenorhabditis elegans and Drosophila melanogaster (Kamath et al, 2003; Boutros et al, 2004), and genome-wide screens in other metazoans are in progress. In instances where RNAi knockdown is complete, the congruence score method should provide a quantitative metric for shared gene function by calculating the probability of a gene pair sharing phenotypic defects in the RNAi screens. Therefore, the methodology we have applied to predict gene functions from yeast genomic synthetic lethality can certainly be extended to analogous RNAi screens for the discovery of novel gene functions in higher organisms.
Synthetic lethal interactions, including lethal and sick phenotypes, were derived from SGA analysis in budding yeast, S. cerevisiae (Tong et al, 2004). We removed six essential query genes (MYO2, SCC1, CDC2, CDC7, CDC42, and CDC45) from the original 132-query gene network. The intermediate (viable) phenotypes exhibited by conditional alleles of essential genes may include loss-of-function, unregulated function, and gain-of-function aspects. In contrast, null alleles of non-essential genes are by definition solely loss-of-function mutations. We ascertained that our results and conclusions do not change when these six essential genes are included in the analysis. Yeast protein complex data were collected from two high-throughput studies, TAP and HMS-PCI, both of which use affinity purification of a tagged bait protein to pull down complexes, followed by mass spectrometry analysis (Gavin et al, 2002; Ho et al, 2002). Protein complexes that contain two or more proteins encoded by non-essential genes were used (353 complexes from TAP and 427 complexes from HMS-PCI). We defined a protein complex to include the bait protein and all prey proteins detected by the bait. A similar analysis was also performed using the curated MIPS protein complex data set (‘complexcat.scheme’, June 12, 2003, 145 complexes with two or more proteins encoded by non-essential genes) (Mewes et al, 2004), and results are provided in the Supplementary information. Pairwise protein interactions in S. cerevisiae derived from high-throughput yeast two-hybrid assays (Uetz et al, 2000; Ito et al, 2001) were also analyzed and found to support our conclusions (results not shown).
Synthetic lethal interactions from SGA were reported as pairs of genes directed from the query gene to the target gene. A randomized network was generated by keeping the query gene list unchanged and matching each query to one of the 982 target genes identified in the SGA screen, drawn at random with replacement with probability proportional to the frequency of that target gene in the interaction list. Duplicate query–target pairs and self-interaction pairs were rejected during randomization. Results depict the average over 10 randomizations.
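The randomization procedure described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' code: the function name `randomize_network`, the `max_tries` safeguard, and the representation of interactions as a list of (query, target) tuples are all hypothetical.

```python
import random
from collections import Counter

def randomize_network(pairs, rng=None, max_tries=1000):
    """Return a randomized list of (query, target) pairs.

    The query column is kept unchanged. Each target is redrawn with
    probability proportional to its frequency in the observed list
    (sampling with replacement). Self-pairs and duplicate pairs are
    rejected and redrawn. `max_tries` is an assumed safeguard.
    """
    rng = rng or random.Random()
    counts = Counter(target for _, target in pairs)
    targets = list(counts)
    weights = [counts[t] for t in targets]
    seen = set()
    randomized = []
    for query, _observed_target in pairs:
        for _attempt in range(max_tries):
            target = rng.choices(targets, weights=weights, k=1)[0]
            if target != query and (query, target) not in seen:
                seen.add((query, target))
                randomized.append((query, target))
                break
        else:
            raise RuntimeError(f"no valid target found for query {query!r}")
    return randomized
```

Averaging a statistic over 10 calls with different seeds would correspond to the 10 randomizations mentioned in the text.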
We separated the SGA interaction data into query and target sets, based on whether each gene node represents a non-essential query gene (126 are included in the published data) or a target gene (982 of which are synthetic lethal partners of at least one query). We depict results for the target genes, as the number of primary nodes is much larger and should, in principle, include the query genes.
The probability of a gene pair sharing at least k synthetic lethal interaction partners was derived from the hypergeometric distribution:
\[ p(x \ge k) = \sum_{x=k}^{\min(m,n)} \frac{C(m,x)\,C(t-m,\,n-x)}{C(t,n)} \]
in which C(j,k) is the combinatorial factor j!/[k!(j−k)!], m is the number of synthetic lethal interaction partners for gene 1, n is the number of synthetic lethal interaction partners for gene 2, and t is the total number of query genes (126 genes) if the calculation is for a target pair or the total number of target genes (4700 genes) if the calculation is for a query pair. The congruence score is −log10[p(x ≥ k_obs)], where k_obs is the observed number of shared partners. High-scoring pairs from query genes reveal similar patterns as target genes (Supplementary Figure S11) from the data set of Tong et al (2004).
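As an illustration of the congruence score, the following sketch evaluates the upper tail of the hypergeometric distribution using exact integer arithmetic. The function names are hypothetical, and returning infinity when the probability underflows to zero is an assumption of this sketch rather than something specified in the text.

```python
from math import comb, log10

def shared_partner_pvalue(k, m, n, t):
    """P(x >= k) shared partners for two genes with m and n partners
    drawn from a pool of t genes (hypergeometric upper tail)."""
    upper = min(m, n)
    # math.comb returns 0 when the lower index exceeds the upper one,
    # so impossible terms drop out of the sum automatically.
    tail = sum(comb(m, x) * comb(t - m, n - x) for x in range(k, upper + 1))
    return tail / comb(t, n)

def congruence_score(k_obs, m, n, t):
    """-log10 of the tail probability; inf if the probability underflows."""
    p = shared_partner_pvalue(k_obs, m, n, t)
    return float("inf") if p == 0.0 else -log10(p)

# Hypothetical target pair: 10 partners each, 5 shared, among 126 queries.
score = congruence_score(5, 10, 10, 126)
```

For a target pair, `t` would be 126 (the number of query genes); for a query pair, 4700 (the number of target genes), following the definitions above.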
To correct for multiple testing of target pairs, we estimate that a final P-value of 0.01 requires a per-link P-value of ~0.01/982², or 10⁻⁸, corresponding to a congruence score of 8 or more. For illustrative purposes, we selected a more stringent threshold of 10 (Figure 2D). At this significance, the congruence network contains only 68 nodes with 138 first-degree interactions, summarizing relationships among 1184 synthetic lethal pairs overall.
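As a worked check of the threshold arithmetic, taking the number of tests to be the square of the 982 target genes, as the text implies:

```latex
\[
\frac{0.01}{982^{2}} = \frac{0.01}{964\,324} \approx 1.04 \times 10^{-8},
\qquad
-\log_{10}\!\left(1.04 \times 10^{-8}\right) \approx 7.98 \approx 8 .
\]
```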
Network figures were created using Cytoscape 1.1 (Shannon et al, 2003).
GO is structured as a directed acyclic graph (DAG) that describes attributes of gene products in three ontologies: biological process, molecular function, and cellular component (Ashburner et al, 2000). To calculate the GO term similarity between a pair of genes, the depths of the different sub-branches of the GO DAG were recorded for each gene. Here, we assume that all of the links in the GO DAG are of equal weight. The deepest level in the GO DAG at which the pair of genes share an annotation was then found and defined as depth d. Gene pairs in which either gene lacked annotation were discarded. The maximal depth Max(depths) and minimal depth Min(depths) for all genes in the synthetic lethal data set were calculated for each of the three ontologies. The GO annotation correlation for a pair of congruent genes with depth d was defined as (d−Min(depths))/(Max(depths)−Min(depths)). For example, the maximal depth is 17 and the minimal depth is 1 for the biological process ontology. The deepest shared annotation for the gene pair JNM1 and KIP2 is at depth 11. Thus, the GO annotation correlation for JNM1 and KIP2 for biological process is calculated as (11−1)/(17−1)=0.63. This is similar to the GO depth correlation in a previous study of Drosophila physical interactions (Giot et al, 2003), except that the previous study normalized the depth correlation to fall in the range 0–1. This method differs from the semantic similarity method (Lord et al, 2003) in two ways: (1) it weights GO terms by depth, whereas semantic similarity weights terms by frequency; (2) it uses the depth of the deepest annotated term, whereas semantic similarity averages over annotations. Results from the two methods are consistent (Supplementary Figure S12).
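The GO annotation correlation can be sketched on a toy DAG as follows. Several choices here are assumptions made for illustration only: the depth of a term is taken as the number of links on its longest path to a root, a gene is treated as annotated to every ancestor of its direct terms, and the function names and the `parents` mapping (term to list of parent terms) are hypothetical.

```python
def ancestors(term, parents):
    """All terms reachable from `term` through parent links, including itself."""
    seen, stack = set(), [term]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(parents.get(current, ()))
    return seen

def term_depth(term, parents, cache):
    """Links on the longest path from `term` to a root (assumed definition)."""
    if term not in cache:
        ps = parents.get(term, ())
        cache[term] = 1 + max(term_depth(p, parents, cache) for p in ps) if ps else 0
    return cache[term]

def deepest_shared_depth(terms_a, terms_b, parents):
    """Depth d of the deepest term shared by two genes, or None if no term is shared."""
    all_a = set().union(*(ancestors(t, parents) for t in terms_a))
    all_b = set().union(*(ancestors(t, parents) for t in terms_b))
    shared = all_a & all_b
    if not shared:
        return None  # e.g. unannotated genes; such pairs are discarded
    cache = {}
    return max(term_depth(t, parents, cache) for t in shared)

def go_correlation(d, min_depth, max_depth):
    """(d - Min(depths)) / (Max(depths) - Min(depths))."""
    return (d - min_depth) / (max_depth - min_depth)

# Worked example from the text: JNM1 and KIP2, biological process ontology.
jnm1_kip2 = go_correlation(11, 1, 17)  # 10/16 = 0.625, reported as 0.63
```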
To account for 17–41% false negatives in the SGA data set, we randomly removed 30% of interactions from the original data, assuming that reported interactions are all correct. To account for potential false positives (although the SGA data set contains very few false positives, as every interaction has been individually confirmed), we randomly replaced 30% of the original interactions with random interactions. These two data sets, containing false negatives and false positives respectively, were used to repeat the congruence analysis, and this process was repeated 10 times (Supplementary Figure S10).
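The two perturbations can be sketched as follows. The text does not specify how the random replacement interactions were drawn; this sketch assumes they are sampled uniformly from query and target genes, excluding self-pairs and pairs already in the original data. The function names are hypothetical, and `pairs` is assumed to be a list of unique (query, target) tuples.

```python
import random

def drop_interactions(pairs, fraction=0.30, rng=None):
    """Simulate false negatives: remove a random fraction of interactions."""
    rng = rng or random.Random()
    n_keep = len(pairs) - round(fraction * len(pairs))
    return rng.sample(pairs, n_keep)

def replace_interactions(pairs, queries, targets, fraction=0.30, rng=None):
    """Simulate false positives: swap a random fraction of interactions
    for random pairs absent from the original data (assumed scheme)."""
    rng = rng or random.Random()
    n_replace = round(fraction * len(pairs))
    original = set(pairs)
    result = set(rng.sample(pairs, len(pairs) - n_replace))
    while len(result) < len(pairs):
        candidate = (rng.choice(queries), rng.choice(targets))
        if candidate[0] != candidate[1] and candidate not in original:
            result.add(candidate)
    return sorted(result)
```

Repeating each perturbation with 10 different seeds and recomputing congruence scores on each perturbed data set would mirror the 10 repetitions described above.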
Null mutants of 59 genes with congruence scores greater than or equal to 4 for six landmark genes (NUM1, DYN1, DYN2, ARP1, JNM1, or NIP100) were tested for nuclear migration defects at 13°C. Deletion mutants were grown in YPD at 30°C until low-log phase and then cultures were shifted to 13°C for 24 h. Formaldehyde was added to 3.7% and cells were incubated at room temperature overnight. Cells were washed in 1 M sorbitol/50 mM potassium phosphate pH 7.5 (SK), permeabilized in SK+3.7% formaldehyde+0.5% Triton X-100 for 7 min, washed in SK, and then stained in SK+DAPI (100 ng/ml). Cells were examined under a fluorescence microscope, and 50 or 100 single large budded cells were scored for nuclear morphology. Normal cells had one DAPI mass at or through the bud neck or two DAPI masses, one in each cell body.
Null mutants of 31 genes with congruence scores greater than or equal to 4 for CIN1 were tested for growth defects on media containing low concentrations of benomyl at 25°C. Deletion mutants were grown on YPD agar, equal amounts of yeast (by OD600) were suspended in water in a 96-well plate and five-fold dilutions were performed. A 96-pin device was used to transfer yeast from each well to a YPD agar plate containing benomyl (5 μg/ml in DMSO) and to YPD agar with DMSO only. Plates were incubated at 25°C for 3 days and scored for growth defects on benomyl versus DMSO alone.
dSLAM was performed using PFD1 as query gene and a pool of ~6000 heterozygous diploid knockout strains. The detailed method is described elsewhere (Pan et al, 2004). Briefly, the heterozygous deletion collection was transformed with a PFD1 knockout construct as a pool, sporulated, and haploid double mutants were selected. Knockout-specific barcode tags were amplified with Cy3-labeled primers and hybridized to a microarray with Cy5-labeled control tags from haploid single mutants. Mutants were scored as positive only if both UPTAG and DNTAG had ratios greater than 2.0.
dSLAM was performed using LTE1, SPO12, and SLK19 as query genes. The procedure was the same as for the PFD1 experiment described above. The data presented are the results of individual confirmation by random spore analysis or tetrad analysis.
Yeast two-hybrid experiments were performed using activation and binding domain vectors pOAD (LEU2-marked) and pOBD-2 (TRP1-marked), respectively, and yeast strains PJ69-4a and PJ69-4alpha (James et al, 1996). Materials were kindly provided by Stanley Fields, Yeast Resource Center. GAL4-binding domain fusions were transformed into PJ69-4alpha and GAL4-activation domain fusions were transformed into PJ69-4a. The two strains were mated and diploids were selected on SC −Leu −Trp. The resulting diploids were plated on SC −Ade −His media in two dilutions (2 μl of 0.1 OD600/ml and 0.02 OD600/ml) at 30°C. Growth at 4 days demonstrated a strong physical interaction between Jnm1p and Yll049wp (Supplementary Figure S7).
The constructs used were JNM1-BD, JNM1 fusion with GAL4-binding domain; YLL049W-AD, YLL049W fusion with GAL4-activation domain; BD, binding domain alone; AD, activation domain alone.
Two independent JNM1-BD and two independent YLL049W-AD transformants supported growth when appropriately combined. YLL049W-BD+AD alone resulted in growth owing to self-activation and was therefore not informative (data not shown).
The plasmids and strains used for this study are distinct from those used by Ito et al (2001), who reported high-throughput yeast two-hybrid interaction between JNM1 and YLL049W.
PJ69-4a: MAT a trp1-901 leu2-3,112 ura3-52 his3-200 gal4Δ gal80Δ LYS2::GAL1-HIS3 GAL2-ADE2 met2::GAL7-lacZ
PJ69-4alpha: MAT alpha trp1-901 leu2-3,112 ura3-52 his3-200 gal4Δ gal80Δ LYS2::GAL1-HIS3 GAL2-ADE2 met2::GAL7-lacZ
We attempted further confirmation of the physical interaction between Yll049wp and Jnm1p with both co-immunoprecipitation (co-IP) and GST pull-down experiments. We first attempted to make yeast strains expressing fusion proteins (Yll049w-3HA, Yll049w-13Myc, Jnm1-3HA, and Jnm1-13Myc) by genomic integration using the Pringle cassettes that confer G418 resistance (Longtine et al, 1998). For all four cases, multiple G418-resistant integrants were selected and confirmed by diagnostic PCR. In each case, yeast extracts were prepared from two representative candidate clones and analyzed by Western blot for expression of the fusion protein. While the Jnm1-3HA and Jnm1-13Myc fusion proteins were easily detected, we were unable to detect either Yll049w-3HA or Yll049w-13Myc. One possible explanation is that the expression level from the endogenous YLL049W promoter is so low that the fusion proteins cannot be detected. We thus obtained from Dr Heng Zhu a plasmid (with URA3 as the selectable marker) that has been reported to overexpress GST-Yll049w under control of the robust galactose-inducible GAL1 promoter (Zhu et al, 2001). We transformed this plasmid into yeast strains expressing both Jnm1-3HA and Jnm1-13Myc and grew the transformants in synthetic medium lacking uracil (to select for the plasmid). A standard galactose induction protocol was followed to induce expression of GST-Yll049w (Zhu et al, 2001). Again, we were unable to detect the GST-Yll049w fusion protein in these strains. In contrast, GST-Ctf4 and GST-Jnm1 fusion proteins were expressed at high levels from strains harboring the corresponding GAL1-GST fusion plasmids under the same conditions. This result suggests that the Yll049w protein might become extremely unstable when tagged with epitope tags. Given that the Yll049w fusion proteins were not expressed at detectable levels, we were unable to perform co-IP or GST pull-down experiments to confirm a physical interaction between Yll049w and Jnm1.
We also note that a yeast strain expressing Yll049w-TAP was not available from the collection of TAP-tagged yeast strains made by O'Shea and Weissman's group (Ghaemmaghami et al, 2003), possibly because such a strain did not express detectable fusion protein.
JSB acknowledges support from the Whitaker Foundation, NIH/NIGMS, and NIH/NCRR. FAS, JDB, XP, and BDP were supported in part by NHGRI grant HG02432 and by the Technology Center for Networks and Pathways (RR020839). BP acknowledges support from an NIH/NIGMS training grant. XP was partly supported by a postdoctoral fellowship from the Leukemia & Lymphoma Society. We also thank Dr David Cutler and Dr Angelika Amon for invaluable discussions, Dr Raymond Deshaies for providing the TAB1-6 allele, and Dr Stanley Fields for providing vectors pOAD and pOBD-2 and strains PJ69-4a and PJ69-4α.
Statement of contributions
PY developed statistical and computational methods and generated information-based predictions. BDP developed the congruence calculation, conducted the nuclear migration and benomyl screens, and performed the two-hybrid test for interaction between Yll049wp and Jnm1p. XP conducted the PFD1, LTE1, SPO12, and SLK19 dSLAM screens and the suppression of synthetic lethality between LTE1 and the Sin3/Rpd3 components by TAB1-6.
JDB and FAS helped initiate and supervised the experimental work.
FAS and JSB helped initiate the theoretical work, and JSB supervised the theoretical and computational work.