The interactome of an organism is the network formed by the complete set of binary physical interactions that can occur between all of its proteins. Low-throughput protein-protein interaction experiments are of immense value for understanding cellular processes at the molecular level. However, high-throughput approaches can substantially increase the pace and scale of discovery while permitting standardized and systematic quality control. Initial steps towards binary interactome mapping in metazoans have been undertaken [1–5], and the resulting partial interactome maps (i) provide insights into the organization of biological networks, (ii) assist in determining the functions of many proteins and complexes, and (iii) identify hundreds of novel connections to proteins associated with human diseases.
High-throughput interactome mapping is particularly needed for C. elegans, a major model organism for which the set of protein-protein interactions derived from small-scale experiments and accessible in public databases is limited to fewer than 500. The first proteome-scale version of the Worm Interactome (WI5) [3] combines several sources of protein-protein interactions: literature-curated interactions; yeast two-hybrid (Y2H) “module” maps, each devoted to a specific biological process [1,6–11]; “interolog” interactions, i.e., predicted pairs of interactors whose respective orthologs interact in another organism; and, lastly, Y2H interactions derived from a high-throughput screen (WI-2004) performed with ~2,000 “metazoan” proteins as baits [3]. WI5 represents a key resource for elaborating biological hypotheses and investigating the properties of the C. elegans interaction network. However, WI5 includes non-binary interactions derived from the literature, interologs that have not been experimentally confirmed, and some lower-confidence Y2H interactions.
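The interolog definition given above can be illustrated with a short sketch. This is not the procedure used to build WI5. It assumes a hypothetical ortholog table mapping each protein of a reference organism to a set of worm proteins, and all identifiers are invented toy values.

```python
from itertools import product


def predict_interologs(reference_interactions, orthologs):
    """Predict worm protein pairs whose orthologs interact in a reference organism.

    reference_interactions: iterable of (a, b) pairs observed in the reference organism
    orthologs: dict mapping a reference protein to a set of worm orthologs (assumed input)
    """
    predicted = set()
    for a, b in reference_interactions:
        # Every combination of worm orthologs of a and b is a candidate interolog.
        for worm_a, worm_b in product(orthologs.get(a, ()), orthologs.get(b, ())):
            # frozenset makes the pair unordered, so (x, y) and (y, x) coincide.
            predicted.add(frozenset((worm_a, worm_b)))
    return predicted


# Toy data: identifiers are made up for illustration only.
reference = [("yA", "yB"), ("yB", "yC")]
orthologs = {"yA": {"w1"}, "yB": {"w2", "w3"}}
interologs = predict_interologs(reference, orthologs)
```

Pairs obtained this way are predictions only, which is why the text treats interologs lacking experimental confirmation as a lower-confidence component of WI5.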
Our updated Worm Interactome map (WI8) combines several techniques and strategies that are critical for generating high-quality protein-protein interaction data on a proteome scale. First, we expanded the worm interactome map by screening a matrix of ~10,000 × ~10,000 proteins. Second, we developed new standards to deliver a dataset of unprecedented quality. These standards involve a highly stringent high-throughput yeast two-hybrid (HT-Y2H) assay, strict methods for filtering and updating existing datasets, independent measurement of technical quality, and evaluation of biological relevance. Importantly, since worm genome annotations are improved frequently [12], we updated previous protein-protein interaction data according to recent gene models. Finally, we provide an empirical estimate of the full size of the C. elegans interactome, through the implementation of a novel interaction mapping framework based exclusively on protein-protein interaction data [13].
To extend the use of WI8 beyond protein-protein interaction analysis and to place WI8 into a broader biological context, we integrated the resulting protein-protein interactions with complementary datasets such as physical and genetic interactions from curated literature, our new interolog dataset, phenotypic profiling data and a co-expression compendium. We also identified the tissue localizations and developmental stages in which interacting pairs are most likely to be physiologically relevant whenever ‘anatomical annotation’ [14] or ‘spatiotemporal expression patterns’ [15] were available for both proteins.
Our new dataset, WI-2007, provides 1,816 high-confidence binary protein-protein interactions. Previously published high-quality C. elegans binary protein-protein interactions were integrated with WI-2007 into the updated WI8 version of the worm interactome, providing 3,816 high-quality binary physical interactions between 2,528 proteins. We demonstrated that WI8 is significantly enriched for functionally linked protein pairs, confirming its high biological relevance and demonstrating the value of unbiased large-scale Y2H screens in inferring protein function.
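As a rough illustration of how interaction and protein counts arise when binary datasets are integrated, the sketch below merges several lists of pairs into one non-redundant set. It is a simplified assumption about the bookkeeping, not the pipeline used to assemble WI8, and the identifiers are toy values.

```python
def merge_interactions(*datasets):
    """Merge binary interaction datasets into a non-redundant set of unordered pairs.

    Each dataset is an iterable of (protein_a, protein_b) tuples. Returns the merged
    set of interactions and the set of proteins taking part in at least one of them.
    """
    merged = set()
    for dataset in datasets:
        for a, b in dataset:
            # Unordered pair: (a, b) and (b, a) are counted once; a self-interaction
            # (a, a) becomes a one-element frozenset.
            merged.add(frozenset((a, b)))
    proteins = set().union(*merged) if merged else set()
    return merged, proteins


# Toy data: identifiers are made up for illustration only.
new_screen = [("p1", "p2"), ("p2", "p3")]
previous = [("p2", "p1"), ("p4", "p4")]
merged, proteins = merge_interactions(new_screen, previous)
print(len(merged), "interactions between", len(proteins), "proteins")
```

Treating each pair as unordered means that an interaction reported in both datasets, or in both bait-prey orientations, contributes only once to the total.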