Many soluble and membrane proteins form homooligomeric complexes in a cell. These complexes contribute to the diversity and specificity of many pathways and may mediate and regulate gene expression, the activity of enzymes, ion channels and receptors, and cell adhesion processes. The evolutionary and physical mechanisms of oligomerization are very diverse, and general principles have not yet been formulated. Homooligomeric states may be conserved within certain protein subfamilies and might be important in providing specificity to certain substrates while minimizing interactions with unwanted partners. Moreover, recent studies have led to a greater awareness that transitions between different oligomeric states may regulate protein activity and provide the switch between different pathways. In this review we summarize the biological importance of homooligomeric assemblies, the physico-chemical properties of their interfaces, and the experimental and computational methods for their identification and prediction. We particularly focus on homooligomer evolution and describe the mechanisms by which new specificities develop through the formation of different homooligomeric complexes. Finally, we discuss the possible role of oligomeric transitions in the regulation of protein activity and compile a set of experimental examples with such regulatory mechanisms.
Recent findings have led to a greater awareness that only a small fraction of proteins function in isolation, while the majority form complexes with identical or very similar chains (called “homooligomers” hereafter) or with different non-homologous chains (called “heterooligomers”). Many soluble and membrane-bound proteins form homooligomeric complexes in a cell [1–6] (Figure 1). For example, a majority of the enzymes in the BRENDA Enzyme Database are homooligomers, and analysis of high-throughput protein–protein interaction networks has shown that there are significantly more self-interacting proteins than expected by chance. Despite the importance and abundance of homooligomers in a cell, the mechanisms of oligomerization are not well understood and general principles have not been formulated. One explanation is that homooligomers are difficult to characterize unambiguously by experiment and difficult to predict computationally. Indeed, numerous papers are dedicated to the analysis of protein–protein interaction networks, but the computational methods employed in these studies cannot properly handle self-interactions and usually neglect them. In this review we attempt to summarize the biological importance of homooligomeric assemblies, their evolution, their physico-chemical properties, and their role in the regulation of cellular processes.
It is difficult to overestimate the functional importance of protein homooligomerization, which provides the diversity and specificity of many pathways and may mediate and regulate gene expression, activity of enzymes, ion channels, receptors, and cell-cell adhesion processes [9–15]. It has been suggested that large assemblies consisting of many identical subunits have advantageous regulatory properties as they can undergo sensitive phase transitions. Formation of homooligomers can also provide sites for allosteric regulation, generate new binding sites at interfaces to increase specificity, and increase diversity in the formation of regulatory complexes. In addition, oligomerization allows proteins to form large structures without increasing genome size and provides stability, while the reduced surface area of the monomer in a complex can offer protection against denaturation [2,6,17].
The main experimental techniques used to study the oligomeric states of proteins include X-ray and neutron scattering, mass spectrometry, gel filtration, dynamic light scattering, analytical ultracentrifugation, and fluorescence resonance energy transfer (FRET) (see also the references for Table 1). For example, analytical ultracentrifugation and gel filtration chromatography provide data on the molecular mass distribution, the subunit stoichiometry of the complexes, and equilibrium constants. FRET characterizes the kinetics and dynamics of complex formation by monitoring the extent of energy transfer between donor and acceptor, while X-ray and neutron scattering offer the atomic details of interaction interfaces. Nowadays proteins are crystallized using high-throughput techniques, very often without extensive biochemical or biophysical characterization of their oligomeric states. Different computational methods have been proposed to identify biological oligomeric complexes, but only a few of them can decipher biological assemblies from crystalline states with sufficiently high accuracy [19–22].
Such methods reconstruct both biological and crystal-packing interfaces by applying crystallographic symmetry operations, then differentiate the biological from the crystal-packing interfaces by computational criteria. For example, the PISA algorithm applies graph theory to find the set of stable assemblies, which fill all the crystal space in a systematic manner, where nodes and edges correspond to protein monomers and interfaces between them. To distinguish “biologically relevant” from crystal packing interactions one can use ad-hoc scoring schemes which are based on interface area, amino acid composition, number of contacts, topological complementarity, and other characteristics [23–28]. Another group of methods tries to verify and predict oligomeric states and binding modes of a protein based on evolutionary conservation of homooligomeric binding modes of its homologs or on the presence of specific sequence and structural features [29–34]. For example, the NCBI IBIS server infers the interacting partners and the locations of binding sites for a given unknown protein by inspecting protein complexes formed by homologous proteins. To ensure the biological relevance of binding sites, similar sites of homologous proteins are clustered together based on their sequence and structure conservation.
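The graph view described above, with monomers as nodes and interfaces as edges, can be illustrated with a deliberately simplified sketch. This is not the PISA algorithm, which evaluates the stability of candidate assemblies. Here a single hypothetical interface-area cutoff (`min_area`, in Å², chosen only for illustration) stands in for the scoring step, and candidate assemblies are taken as the connected components that remain after weak interfaces are discarded. The chain labels and areas in the usage note are made up.

```python
from collections import defaultdict


def candidate_assemblies(chains, interfaces, min_area=800.0):
    """Group chains into candidate assemblies.

    chains:     iterable of chain labels (graph nodes)
    interfaces: iterable of (chain_a, chain_b, buried_area) tuples (graph edges)
    min_area:   hypothetical cutoff; interfaces below it are treated as
                crystal-packing contacts and ignored
    """
    # Keep only the edges that pass the (assumed) area criterion.
    adjacency = defaultdict(set)
    for a, b, area in interfaces:
        if area >= min_area:
            adjacency[a].add(b)
            adjacency[b].add(a)

    # Connected components of the filtered graph, found by depth-first search.
    seen = set()
    assemblies = []
    for chain in chains:
        if chain in seen:
            continue
        component = set()
        stack = [chain]
        while stack:
            node = stack.pop()
            if node in component:
                continue
            component.add(node)
            stack.extend(adjacency[node] - component)
        seen |= component
        assemblies.append(sorted(component))
    return assemblies
```

With toy input `chains = ["A", "B", "C", "D"]` and `interfaces = [("A", "B", 1200.0), ("B", "C", 300.0), ("C", "D", 950.0)]`, the small B-C contact is dropped and the chains fall into two candidate dimers, A-B and C-D. A realistic scheme would combine several of the interface characteristics listed above rather than area alone.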
The amino acid composition of homooligomeric interfaces differs from that of crystal packing interfaces, heterooligomeric interfaces and solvent-exposed surfaces, and largely depends on the type of homooligomeric complex [23–27,35–38]. For example, obligate homooligomers (complexes where monomers are unstable and usually non-functional upon isolation) and weakly associated dimers are characterized by a large fraction of hydrophobic and, to a lesser extent, aromatic residues on their interfaces, while non-obligate (transient) complexes include more polar and charged residues [6,24,35,39]. Moreover, interfaces of obligate homooligomers are usually larger but contain fewer hydrogen bonds per residue compared to heterooligomers. The analysis of kinetic and equilibrium data on dimeric proteins shows that the per-residue interface and surface areas of “three state dimers” (monomers are stable on their own) are significantly smaller than those of “two state dimers” (monomers are not stable when separated from the complex).
As can be seen in Figure 1, homooligomers mostly form dimers (38%), trimers (4%) and tetramers (10%), which have predominantly cyclic or dihedral symmetries [2,3,5,41]. Several explanations have been proposed to account for these observations of self-attraction and symmetry, including stability, foldability and evolutionary optimization arguments [42–44]. The physical effect of a statistically enhanced self-attraction was modeled to show that interactions between identical random surfaces are stronger than attractive interactions between different random surfaces of the same size. Furthermore, it was demonstrated that the efficiency of co-aggregation between different monomers and protein domains decreases with decreasing sequence identity. Binding arrangements involving isologous homooligomeric interfaces with a two-fold symmetry axis seem to be more frequently conserved in evolution than non-isologous interfaces. In addition, symmetrical dimers were shown to contain more residues in disordered regions than heterodimers, which might modulate the high specificity of interactions between the dimer complexes and their interacting partners and play an important role in allosteric regulation [47,48].
Different scenarios of protein oligomerization have been discussed in the literature. Some evolutionary pathways may follow kinetic pathways of two-state or three-state folding [49,50]. At the same time, assembly pathways of oligomers may mimic evolutionary pathways, and homooligomers with dihedral symmetry may evolve and assemble through their cyclic intermediates [41,51]. One of the major mechanisms of oligomerization is gene duplication with subsequent diversification, when one copy of the gene retains its original function whereas the other copy is under relaxed evolutionary constraints and so may develop new functional properties. This mechanism may lead to the formation of oligomeric paralogs and may create protein complexes with novel specificities [46,52–55]. It has been shown that in S. cerevisiae up to 20% of complexes evolved by step-wise partial duplications, whereas this mechanism was found to be less prevalent in E. coli.
Similarity in protein sequences, folds and functions between two orthologous proteins does not necessarily imply that they will have the same interacting partners. Although homooligomeric states and binding modes have an overall tendency to be conserved within the clades on phylogenetic trees, they can only be reliably transferred from very close homologs (sharing higher than 30% sequence identity for oligomeric state inference and higher than 50–70% identity for binding mode inference) [41,46]. Indeed, in some cases the oligomeric state can be evolutionarily conserved while the binding arrangement can be quite different. This points to the possibility that interactions and binding arrangements between paralogs are not necessarily inherited from the ancestral homooligomer but rather can develop anew in evolution. For example, proteins from the glycosyltransferase family may function as monomers, dimers, or tetramers in different organisms. While dimeric and tetrameric proteins appear to be as ancient as monomers, the binding modes can differ even between very similar proteins, for instance engaging completely opposite sites on the molecule. Despite this diversity, evolutionarily conserved homooligomeric binding modes share certain features. For example, it has been found that binding modes with larger interfaces are more frequently conserved in evolution, whereas smaller interfaces are often acquired more recently [41,46].
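The identity thresholds quoted above can be turned into a small decision helper. The sketch below is only an illustration of how such cutoffs might be applied: the function name and return format are hypothetical, the 30% cutoff for oligomeric state is taken from the text, and because the text gives a 50–70% range for binding modes, the binding-mode cutoff is left as a parameter (defaulting to the lower bound of that range).

```python
def transferable_annotations(percent_identity, binding_mode_cutoff=50.0):
    """Report which homooligomer annotations may be transferred from a homolog.

    percent_identity:    pairwise sequence identity to the homolog, in percent
    binding_mode_cutoff: assumed cutoff for binding-mode transfer; the text
                         quotes a 50-70% range, so this is adjustable
    """
    if not 0.0 <= percent_identity <= 100.0:
        raise ValueError("percent_identity must be between 0 and 100")
    return {
        # Oligomeric state: transfer only above 30% identity (from the text).
        "oligomeric_state": percent_identity > 30.0,
        # Binding mode: requires a closer homolog.
        "binding_mode": percent_identity > binding_mode_cutoff,
    }
```

For a homolog at 40% identity this suggests transferring the oligomeric state but not the binding mode. Passing `binding_mode_cutoff=70.0` applies the stricter end of the quoted range.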
Next we discuss several evolutionary mechanisms that play key roles in homooligomerization: domain swapping, formation of leucine zippers, amino acid substitutions, and insertions/deletions on oligomeric interfaces.
Domain swapping involves opening up of the monomeric conformation and exchange of identical regions between two monomers. Overall, more than 60 examples of domain swapping are reported in protein structures from the Protein Data Bank. Although swapping of terminal regions seems to be more common, the swapped regions can be located anywhere in the structure and the number of swapped regions can be greater than one [58,59]. For example, RNase A has two swapping regions at the N- and C-termini and can form a dimer, trimer, or higher-order oligomers (Figure 2A). A number of studies have analyzed the properties of inter-domain linker regions, since these regions are flexible and might be responsible for domain swapping. It has been shown that insertions/deletions and substitutions of certain residues (especially Pro) in the inter-domain linker region might affect the monomer-dimer equilibrium [61–63].
There are several known structural motifs used by proteins for oligomerization. The most common motif is the alpha-helical coiled coil. The coiled-coil motif is observed as a series of continuous heptad repeats (abcdefg)n in protein sequences. When the motif forms an alpha-helical structure, hydrophobic residues at the “a” and “d” positions interact with each other to form a helix bundle. The “Leu zipper” motif is a type of coiled-coil structure where leucine is frequently observed at the “d” position (Figure 2B). Coiled-coil motifs may form different oligomeric states depending on the amino acids in the heptad repeat. For example, the yeast transcription factor GCN4 forms a dimeric coiled coil; however, the replacement of Ile by Leu at the “a” positions and of Leu by Ile at the “d” positions enables GCN4 to form a tetramer. In addition, it was shown that other combinations of Leu, Ile, and Val at the “a” and “d” positions may cause trimerization [66,67].
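The heptad bookkeeping described above can be sketched in a few lines. The example assumes that the register of the first residue is already known (in practice it must be predicted or taken from a structure), and the function names are hypothetical. The sequence in the usage note is a made-up toy repeat, not the GCN4 sequence.

```python
HEPTAD = "abcdefg"


def heptad_positions(sequence, start_register="a"):
    """Pair each residue with its heptad position.

    start_register is the (assumed known) heptad position of the first residue.
    """
    offset = HEPTAD.index(start_register)
    return [(res, HEPTAD[(offset + i) % 7]) for i, res in enumerate(sequence)]


def core_residues(sequence, start_register="a"):
    """Collect the residues at the hydrophobic core positions 'a' and 'd'."""
    core = {"a": [], "d": []}
    for res, pos in heptad_positions(sequence, start_register):
        if pos in core:
            core[pos].append(res)
    return core
```

For the toy sequence `"IAALKQEIAALKQE"` read from register "a", `core_residues` collects Ile at the "a" positions and Leu at the "d" positions. Comparing these two lists between variants is one simple way to see which core substitutions distinguish sequences with different oligomeric states.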
Amino acid substitutions introduced on the protein surface/interface may cause association or dissociation of homooligomers. A simple mechanism by which amino acid substitutions modulate oligomeric states was proposed recently: the replacement of solvent-exposed residues by more hydrophobic and larger protruding residues may shift the equilibrium toward the formation of oligomers (Figure 2C). The artificial design of new oligomers by amino acid substitution was performed on examples of four assemblies. Some of these proteins needed only a single amino acid replacement to associate into a higher-order complex, and it was shown that introducing large non-polar side chains, such as phenylalanines or tryptophans, facilitated complex formation. Amino acid substitutions on homooligomer interfaces may change the functional activity of a protein and have been implicated in several diseases. For example, mutations leading to the disease fructose intolerance were shown to destabilize the tetramer of the enzyme fructose-1,6-bisphosphate aldolase A and decrease its activity. Certain substitutions in glutamate receptors may stabilize their dimer interface and reduce desensitization, while amino acid substitutions in nuclear hormone receptors may disrupt the salt bridge stabilizing the functional dimer [71,72].
Insertions and deletions at oligomer interfaces provide another important mechanism to modulate protein oligomeric states in evolution (Figure 2D). The inspection of insertions and deletions in homologous proteins with different oligomeric states revealed that about one quarter of them are located on interaction interfaces and are responsible for enabling or disabling the formation of oligomers [33,34,54]. According to these studies, insertions and deletions which differentiate monomers and dimers are located preferentially in loop regions and to a lesser extent on alpha-helices and beta-strands. It has been shown that insertions and deletions modulating oligomerization usually have a lower aggregation propensity and contain a larger fraction of polar and charged residues compared to conventional interfaces and protein surfaces. Moreover, the removal of enabling regions from protein structures may result in the complete or partial loss of stability. Interestingly, different oligomerization mechanisms can be employed by different proteins from the same family. For example, in the dihydrofolate reductase family, a loop in the bacteriophage T4 enzyme leads to the formation of a homodimer, while homodimerization of the same enzyme from Thermotoga maritima is achieved by amino acid substitutions.
As we described in the previous sections, homooligomeric states and binding modes can be highly conserved within specific clades and protein families. At the same time, protein regions modulating oligomerization may also be preserved in evolution. For example, the presence and locations of enabling insertions and deletions might be typical only for a given protein subfamily, which in turn can be characterized by a well-defined oligomeric state. This implies that important homooligomerization features might function as specificity determinants, providing high binding affinity to certain interacting partners while minimizing interactions with unwanted partners. Such a mechanism would be essential for the separation of functional pathways of close paralogs, either by preventing similar surface regions from interacting with the same or very similar partners or by facilitating, through specific features, interactions with novel partners. As shown in Table 1, proteins in different oligomeric states might have quite different binding affinities and therefore functional activities. Figure 3 and the two following examples illustrate the development of new protein specificities through homooligomerization in different organisms or in paralogs from the same organism.
For example, proteins from the human p53 C-terminal domain family function as homotetramers. Two of these family members (p63 and p73) can also form mixed heterotetramers; however, p53 cannot associate with either p63 or p73. Interestingly, a recent study showed that p63 and p73 differ from p53 in that they have an additional alpha-helix which stabilizes the tetramer. The absence of this helix in p53 explains its less promiscuous binding, which results in the separation of the p53 pathway from the paralogous p63/p73 pathways. Another example involves the LIM-domain-binding protein Ldb1, a nuclear adaptor protein which interacts with diverse proteins containing LIM domains and plays essential roles in development and cellular differentiation [75,76]. Humans have two paralogs, Ldb1 and Ldb2. Since the loss of Ldb1 causes severe developmental defects in embryos that are not compensated for by Ldb2, Ldb1 and Ldb2 are likely to participate in different pathways. Interestingly, it has been shown that Ldb1 forms a trimer while Ldb2 exists in a monomer-tetramer-octamer equilibrium, suggesting that the oligomeric differences might enable Ldb1 and Ldb2 to interact with different partners and in different pathways.
Cellular processes are extremely complex, requiring many factors and regulators to provide desired outcomes and prevent inefficient waste of energy. Control mechanisms change the rate of particular cellular processes, which in turn can be modulated at the level of gene expression or protein-protein interactions, by changing protein activity or post-translational modifications, by production of secondary messengers, or by other mechanisms. Oligomeric complexes between homologous proteins can integrate different pathways and provide the cross talk between them. Moreover, proteins might exist in dynamic equilibrium between different oligomeric states, which can be controlled by physiological conditions (pH, ionic strength, temperature), ligands and post-translational modifications. Therefore transitions between different oligomeric states may regulate protein activity and provide the switch between different pathways. It was shown that certain mutations may induce changes in oligomeric state and activity without compromising stability. It has also been shown that the properties of weak transient homooligomers which exist in dynamic equilibrium are different from those of permanent dimers; the former contain smaller, more planar and more polar interfaces. In addition, reversible transitions between discrete conformations and oligomeric states might account for protein cooperative binding properties and allosteric mechanisms in signal transduction.
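To make the idea of a dynamic equilibrium between oligomeric states concrete, the sketch below works through the simplest case, an ideal monomer-dimer equilibrium 2M ⇌ D with dissociation constant Kd = [M]²/[D]. This is standard mass-action arithmetic and is not taken from any of the studies cited here; the function name is hypothetical, and real systems with higher-order states or ligand coupling need a fuller model. With total protomer concentration Ct = [M] + 2[D], the free monomer concentration is the positive root of 2[M]²/Kd + [M] − Ct = 0.

```python
import math


def dimer_fraction(total_conc, kd):
    """Fraction of protein chains found in the dimer for 2M <-> D.

    total_conc: total protomer concentration Ct = [M] + 2[D]
    kd:         dissociation constant Kd = [M]^2 / [D], in the same units
    """
    if total_conc <= 0 or kd <= 0:
        raise ValueError("total_conc and kd must be positive")
    # Positive root of 2[M]^2/Kd + [M] - Ct = 0.
    monomer = kd * (math.sqrt(1.0 + 8.0 * total_conc / kd) - 1.0) / 4.0
    return 1.0 - monomer / total_conc
```

In this model half of the chains are dimeric when the total concentration equals Kd, and the dimer fraction rises with concentration. Anything that changes Kd, such as a ligand, a modification at the interface, or a change in pH, therefore shifts the population between states at a fixed protein concentration.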
Here we summarize different scenarios for how shifting the equilibrium between oligomeric states might serve as a regulatory mechanism. These scenarios include: a) homooligomerization might be important for protein self-activation; b) conformational changes accompanying the transition may lead to exposure or suppression of active or protein binding sites; c) the formation of the oligomer may inhibit binding of a monomer to its substrate; d) post-translational modifications or binding of small molecules at or near the homooligomer interface may shift the equilibrium between different oligomeric states. In addition to the experimental examples from the previous study, we manually compiled a list of experimentally verified examples of these mechanisms in Table 1.
We now describe a specific example of how the transition between the dimeric and tetrameric forms of pyruvate kinase can be implicated in tumor formation. Pyruvate kinase is a key glycolytic enzyme and its activity is consistently altered during tumorigenesis. It has been shown that during tumor formation the M2 isoform of pyruvate kinase is overexpressed. The active tetrameric form of this enzyme from normal cells has high affinity for its substrate phosphoenolpyruvate (PEP), associates with the glycolytic enzyme complex, and produces high levels of ATP. The dimeric form has low affinity for PEP, does not associate with the glycolytic enzyme complex, and is accompanied by low levels of ATP. Having the inactive dimeric form is advantageous for tumor cells, as phosphometabolites upstream of pyruvate kinase in the glycolytic pathway accumulate and are then available as precursors for the synthesis required by tumorigenesis. The tetramer-dimer ratio is regulated by different factors, for example by fructose 1,6-P2 and serine concentrations. Moreover, Rous sarcoma virus may also phosphorylate M2-PK and lead to its dimerization and dissociation from the glycolytic enzyme complex.
The experimental characterization of homooligomeric structures, the dynamic equilibrium between different oligomeric states, and computational inference from the crystalline state have always been a challenge. Moreover, the diversity of homooligomeric binding modes among homologous proteins complicates the study of their evolutionary pathways and their computational prediction. Thanks to advances in experimental techniques, invaluable experimental data on homooligomers and their dynamics have been filling this gap, but most of these data are focused on individual systems and do not provide a broader picture of the specific role of homooligomers in cellular function, especially in the regulation of protein functional activity. There remains an urgent need for new and improved experimental and computational methods for identifying, modeling, and predicting homooligomers.
We thank Tom Madej for careful reading of the manuscript. This work was supported by National Institutes of Health/DHHS (Intramural Research program of the National Library of Medicine). K.H. was supported by a JSPS Research Fellowship from the Japan Society for the Promotion of Science.