Many proteins within cells do not function in isolation, but associate with specific partners to form multisubunit modules with specific functions. These in turn may associate with other functional modules to form a multifunctional macromolecular complex. While the subunits of such complexes can be identified through a combination of protein purification and proteomics, it is more challenging to ascertain how individual subunits interact and are spatially arranged within these macromolecular complexes. High-resolution characterization of multi-protein assemblies using any single experimental or computational method is generally very difficult, especially since traditional methods such as X-ray crystallography or NMR are limited in their ability to characterize large, dynamic protein complexes. However, even if it is not feasible to determine the structure of whole protein complexes at atomic or amino-acid resolution, methods that predict lower-resolution macromolecular models accurately positioning proteins and their connections will accelerate our understanding of protein complexes and their cellular functions. Here, we describe a method capable of determining the architectural organization of multi-protein complexes. It combines computational approaches with a systematic collection of quantitative proteomics data from wild-type and deletion strain purifications. We applied this approach to a data set generated in this study, with the aim of gaining novel insights into the Saccharomyces cerevisiae Spt–Ada–Gcn5 histone acetyltransferase (HAT) (SAGA) complex.
SAGA is a well-studied multi-protein complex involved in regulating histone post-translational modifications. Originally identified in yeast, the SAGA complex was subsequently shown to be evolutionarily conserved from yeast through humans (
Lee and Workman, 2007). Early on, through the use of genetics and conventional biochemistry approaches, SAGA was recognized to be a multi-protein complex that is made up of smaller functional modules (
Grant et al, 1997,
1998,
1999;
Sterner et al, 1999). The HAT module, which carries out the HAT activity of the SAGA complex, was the first module to be described and its catalytic subunit Gcn5 was shown to harbor limited substrate recognition and specificity (
Grant et al, 1999). Subsequently, the Ada2 and Ada3 proteins were shown to also be part of this module (
Horiuchi et al, 1997;
Saleh et al, 1997;
Balasubramanian et al, 2002). Early work already recognized the existence of three distinct Gcn5-containing complexes that have since been characterized as SAGA, a variant of the SAGA complex, named SLIK/SALSA, and ADA (
Grant et al, 1997). All three complexes share the Gcn5/Ada2/Ada3 HAT module. SAGA and SLIK also share all other subunits, except that SLIK contains a C-terminally truncated form of Spt7 and lacks Spt8 (
Pray-Grant et al, 2002;
Sterner et al, 2002). On the other hand, only a single unique subunit, Ahc1, was known to exist in the ADA complex (
Eberharter et al, 1999) in addition to the HAT module. More recently, a second catalytic module, the deubiquitinylation (DUB) module, was identified within SAGA/SLIK (SALSA), which is important for the deubiquitinylation of histone H2B (
Henry et al, 2003;
Daniel et al, 2004). Work from many laboratories has led to the identification of several subunits of this module, namely Ubp8, Sgf11, Sus1 and Sgf73 (
Ingvarsdottir et al, 2005;
Lee et al, 2005,
2009;
Kohler et al, 2006,
2008). In addition, Chd1 was shown to be part of SAGA (
Pray-Grant et al, 2005); however, it was not identified in our purifications.
Given the complexity of the SAGA/ADA protein complex network, we reasoned that it is an ideal system to test our approach. Furthermore, partial structural information has been established for the SAGA complex, which provides an objective benchmark against which to evaluate our method. Using electron microscopy (EM),
Wu et al (2004) determined the first low-resolution 3D model of the SAGA complex; however, this study localized only 9 of the 19 known subunits of SAGA, and the DUB module was not known to be part of SAGA at that time. More recently, two studies determined the high-resolution structure of the four subunits of the DUB module (
Kohler et al, 2010;
Samara et al, 2010). Since these studies characterized only portions of the SAGA complex, there is no complete model for the architecture of SAGA. Here, we aimed to improve our understanding of the organization of proteins within the complex as well as to identify any components missing from earlier studies.
Using our method, we confirmed all known components of the DUB and HAT modules, and furthermore revealed that the HAT module contains an additional protein, Sgf29, that is present in all Gcn5 complexes. Sgf29 mutants resemble those in Ada2, Ada3 and Gcn5 by displaying classic ADA phenotypes (
Berger et al, 1992). We also identified a novel subunit of the ADA complex, which we termed Ahc2. The most intriguing observation revealed through our analysis is that the SAGA complex consists of five distinct modules. In addition to the previously described DUB, HAT/Core and ADA modules, we identified two novel modules, which we termed SA_SPT (i.e. Saga-associated SuPpressors of Ty) and SA_TAF (i.e. Saga-associated TATA-binding protein-associated factors). Unexpectedly, these modules, which are responsible for the different functions of the SAGA complex, are capable of assembling independently of the remaining modules of the complex.