B-ZIP transcription factors (98) are exclusively eukaryotic proteins that bind to sequence-specific double-stranded DNA as homodimers or heterodimers to either activate or repress gene transcription (34). We have examined both of the recently published DNA sequences of the human genome (51, 95) and identified 56 genes that contain the B-ZIP motif. Three sequences were identical, giving a total of 53 unique B-ZIP domains with the potential to form 2,809 dimers. This creates the possibility for a tremendous range of transcriptional control (23, 50, 52). While significant effort has been directed at identifying dimerization partners of B-ZIP proteins, the full complement of dimerization partners remains to be elucidated. This review highlights two topics: (i) the known structural rules that regulate leucine zipper dimerization specificity and (ii) experimental data addressing mammalian B-ZIP dimerization partners.
We have annotated the leucine zippers of all human B-ZIP domains, highlighting amino acids in the a, d, e, and g positions that appear critical for leucine zipper dimerization specificity. These data were used to group B-ZIP proteins into 12 families with similar dimerization properties: (i) those that strongly favor homodimerization within the family (PAR, CREB, Oasis, and ATF6), (ii) those that have the ability to both homodimerize and heterodimerize with similar affinities (C/EBP, ATF4, ATF2, JUN, and the small MAFs), and (iii) those that favor heterodimerization with other families (FOS, CNC, and large MAFs).
In the late 1980s, several mammalian B-ZIP proteins were purified by double-stranded DNA affinity chromatography, and the genes encoding these proteins were cloned. Among the first cloned were the AP-1 (c-FOS and c-JUN) heterodimer (4), the CREB homodimer (65), and the C/EBP homodimer (37, 53). These newly isolated genes were used as probes in low-stringency DNA hybridizations to identify new sequence-related B-ZIP proteins (5, 78, 101). In addition, new B-ZIP proteins (25, 36, 102) were isolated by screening lambda phage protein expression libraries with radiolabeled DNA binding elements (29, 85, 97). These functional DNA binding assays successfully isolated B-ZIP proteins because the B-ZIP motif is compact and refolds easily (87). The wealth of new sequences led to a confusing nomenclature, because multiple groups independently isolated and named the same B-ZIP proteins. Moreover, initial classification into families was often based on apparent DNA binding activity, resulting in the grouping of proteins with different dimerization properties. Several reviews have helped clarify these issues, including a comprehensive review by Hurst in 1995 (34), one focusing on ATF proteins (24), and another focusing on FOS and JUN proteins (6). We hope this review will further contribute to a systematic B-ZIP classification.
Amino acid alignment of B-ZIP proteins allowed the identification of the B-ZIP motif, a bipartite α-helix that is 60 to 80 amino acids long (98). The N-terminal half contains two clusters of basic amino acids responsible for sequence-specific DNA binding, while the C-terminal half contains an amphipathic protein sequence of variable length with a leucine every seven amino acids. The shorter leucine zippers have less protein sequence flexibility, because amino acids must be optimized for dimerization stability. Longer leucine zippers allow better regulation of dimerization specificity, because they can contain amino acids that are suboptimal for stability but favor interaction with a particular partner. This amphipathic sequence, termed the “leucine zipper” (52), mediates homo- and heterodimerization of B-ZIP proteins (3, 20, 41, 44, 54, 77, 86, 91). Figure 1 shows the X-ray crystal structure of the B-ZIP domain from yeast GCN4 bound to DNA (15). B-ZIP DNA binding stabilizes the basic region, inducing the random coil to form an α-helical extension of the leucine zipper (73, 84). Several B-ZIP proteins, including the small and large MAF proteins (13, 42), contain additional DNA binding elements N terminal to the basic region that increase the number of specific DNA bases that can be bound.
The leucine zipper dimerization domain forms a parallel coiled coil (75) that consists of four to five heptads, in which each heptad is composed of two α-helical turns or seven amino acids, labeled a, b, c, d, e, f, and g (61). Amino acids in the a, d, e, and g positions regulate leucine zipper oligomerization, dimerization stability, and dimerization specificity.
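The heptad labeling described above can be sketched in a few lines of Python. This is only an illustration: the function names, the choice to write each heptad in g-a-b-c-d-e-f order, and the toy sequence (an idealized repeat, not a real B-ZIP zipper) are all assumptions made for the example.

```python
HEPTAD = "abcdefg"

def heptad_register(sequence, first_position="a"):
    """Pair each residue with its heptad position (a-g).

    first_position is the heptad position assigned to the first residue.
    """
    start = HEPTAD.index(first_position)
    return [(res, HEPTAD[(start + i) % 7]) for i, res in enumerate(sequence)]

def residues_at(sequence, positions="adeg", first_position="a"):
    """Collect the residues found at the requested heptad positions."""
    labeled = heptad_register(sequence, first_position)
    return {p: [res for res, pos in labeled if pos == p] for p in positions}

# Hypothetical idealized zipper: three identical heptads written g-a-b-c-d-e-f,
# with glutamic acid in g, isoleucine in a, leucine in d, and lysine in e.
toy_zipper = "EIAQLKR" * 3

interface = residues_at(toy_zipper, "adeg", first_position="g")
for position, residues in interface.items():
    print(position, "".join(residues))
```

Extracting the a, d, e, and g residues in this way is the first step toward the kind of annotation discussed in the text; a real analysis would also need to locate the start of the zipper within each protein, which this sketch does not attempt.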
Amino acids in the a and d positions are on the same surface of the α-helix and are typically hydrophobic. The a and d amino acids from one monomer interact with the complementary a′ and d′ amino acid positions in the opposite monomer (′ refers to the second α-helix in the dimer). This interaction creates a hydrophobic core essential for dimer stability (87). The g and e positions typically contain charged amino acids (8, 96). X-ray crystallography reveals g↔e′ interhelical interactions between amino acids in the g position and oppositely charged amino acids in the e′ position, which is five amino acids C terminal (2, 15, 19, 21, 75, 80). Electrostatic interactions between the amino acids in the g↔e′ pair can be either attractive or repulsive and thus can regulate both homodimerization and heterodimerization. Furthermore, van der Waals interactions between the g and e′ methylene groups and the underlying a and d amino acids contribute to stability (2).
We have examined two versions of the DNA sequence of the human genome (51, 95) and identified 56 genes that contain the B-ZIP motif. Three of these motifs were identical, resulting in 53 unique B-ZIP domains. Four of the proteins were found only in the Celera database. Table 1 provides the chromosomal location, unique database search identifiers, and the number of amino acids found N terminal and C terminal of the B-ZIP domain. We have generated three dendrograms that examine the relatedness of the amino acid sequences of the 53 human B-ZIP domains. One dendrogram is based on the entire B-ZIP domain, one is based on the basic region that is critical for DNA binding, and the last is based on the leucine zipper region that regulates dimerization specificity (Fig. 2). The three dendrograms are similar. The differences are interesting, because they reveal whether similarities are based on DNA binding properties or dimerization specificities. For example, the basic region dendrogram places the CREB, ATF6, and Oasis families together, reflecting their binding to the CRE DNA sequence (5′-TGACGTCA-3′). In contrast, the leucine zipper dendrogram places the Oasis family separately from the CREB and ATF6 families, reflecting their different dimerization properties. It is possible that these families could compete for binding to the CRE DNA sequence to give a range of transcriptional control. Another example is the XBP protein, which is not related to any sequence when its basic region is examined, but clusters with the ATF6 proteins when its leucine zipper is examined.
To achieve a functional classification of the B-ZIP proteins, we have considered the dimerization properties of the B-ZIP domains. We have examined the amino acids in the g, a, d, and e positions of the leucine zipper region of the B-ZIP domains to rationalize the known and predict the unknown dimerization properties of the 53 human B-ZIP domains whose amino acid sequences are presented in Fig. 3. Using this analysis, we have grouped B-ZIP proteins with similar dimerization properties into 12 families. Our grouping is consistent with the dendrograms and agrees particularly with the dendrogram based on the leucine zipper sequence. An examination of the human genome by Green and colleagues (90) grouped B-ZIP proteins into eight previously identified families (PAR, CREB, C/EBP, FOS, JUN, MAF, CNC, and ATF6) based on amino acid similarity throughout the B-ZIP domain. Our analysis divides their FOS family into FOS, ATF2, and ATF4; the MAF family into the small MAFs and large MAFs; and the CREB family into the CREB and Oasis families. We also reassigned two sequences: XBP1 was moved from the FOS family to the ATF6 family, and NFIL3 was moved from the C/EBP family to the PAR family.
We divide these 12 families into three general groups: (i) those that strongly favor homodimerization within the family (PAR, CREB, Oasis, and ATF6), (ii) those that homodimerize and heterodimerize (C/EBP, ATF4, ATF2, JUN, and the small MAFs), and (iii) those that strongly favor heterodimerization with other families (FOS, CNC, and large MAFs). Amino acids in the leucine zipper region of each human B-ZIP domain that regulate attractive and repulsive interactions are color-coded in Fig. 3 and reveal similar interaction patterns within each family. Figure 4 presents a helical wheel representation of homo- and heterodimers to help explain the color code used in Fig. 3. A similar annotation for Drosophila B-ZIP proteins indicates that the 12 families we have identified are conserved in the insects (17).
An examination of the g↔e′ pairs in the first four heptads of human B-ZIP proteins results in several general observations. Thirty percent are attractive, with acidic-basic pairs (orange) predominating; 23% are repulsive with acidic-acidic (red) pairs predominating; and 30% contain a single charged amino acid that can stabilize either homodimerization or heterodimerization.
The d position of the hydrophobic interface typically contains leucine with very few polar and no charged amino acids. The 2nd heptad a position typically contains asparagine. The a position in other heptads typically contains hydrophobic amino acids, but asparagine and basic amino acids are occasionally observed, which are critical for regulating dimerization specificity. The surprisingly limited diversity of amino acids in the a, d, e, and g positions suggests that a limited set of rules might regulate leucine zipper dimerization specificity.
In this section, we will review what is known about the contribution of individual amino acids to leucine zipper dimerization specificity and apply these rules to all human B-ZIP proteins. B-ZIP dimerization stability can be measured by circular dichroism spectroscopy (CD). Ellipticity at 222 nm indicates the presence of α-helical structure, and ellipticity decreases as the B-ZIP structure is unfolded by denaturants such as heat or urea. Thermal- or denaturant-induced unfolding of B-ZIP dimers occurs cooperatively and reversibly from an α-helical dimer to an unfolded monomer. The homodimerizing B-ZIP domains typically have a melting temperature (Tm [midpoint of thermal transition]) of approximately 50°C, which represents 10 kcal/mol/dimer or 100 nM affinity at 37°C (1, 48, 49). The thermal denaturation is well described by a two-state model so that mutational analysis of specific amino acids allows their contribution to dimerization stability to be measured (47, 87).
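The correspondence quoted above between 10 kcal/mol/dimer and roughly 100 nM affinity at 37°C can be checked with a short calculation. The sketch below assumes a simple two-state dimer-to-monomer equilibrium with a 1 M standard state, so that Kd = exp(−ΔG/RT); the function name and sign convention (positive ΔG meaning a stable dimer) are choices made for this illustration, not taken from the cited studies.

```python
import math

R_KCAL = 1.987e-3  # gas constant in kcal/(mol*K)

def kd_from_stability(dg_kcal_per_mol, temp_celsius=37.0):
    """Dissociation constant (M) for a dimer of the given stability.

    dg_kcal_per_mol is the free energy of dissociation (positive = stable dimer),
    assuming a two-state dimer <-> 2 monomer equilibrium and a 1 M standard state.
    """
    temp_kelvin = temp_celsius + 273.15
    return math.exp(-dg_kcal_per_mol / (R_KCAL * temp_kelvin))

kd = kd_from_stability(10.0)
print(f"Kd at 37 C for 10 kcal/mol: {kd * 1e9:.0f} nM")
```

Under these assumptions the result is on the order of 100 nM, in line with the figure given in the text. Because the dependence is exponential, each additional 1.4 kcal/mol of stability at this temperature tightens the affinity by about a factor of 10.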
Seventy-six percent of the g and e positions in the leucine zipper of human B-ZIP proteins contain one of four long-side-chain amino acids: the acidic glutamic acid (E), the basic arginine (R) or lysine (K), or the polar glutamine (Q) (2, 8, 74, 96). Changing the g and e positions in two heptads of the PAR B-ZIP domain from charged amino acids to alanine resulted in the formation of tetramers instead of dimers (48). Thus, charged amino acids in the g and e positions inhibit the higher-order oligomerization that would occur if the g and e positions formed a continuous hydrophobic surface with the a and d positions.
Charged amino acids in the g and e positions contribute to leucine zipper stability. The most common g↔e′ pair has glutamic acid (E) in the g position and an oppositely charged arginine (R) or lysine (K) in the following e′ position (E↔R and E↔K). These oppositely charged amino acids produce a g↔e′ pair that contributes −1.3 (E↔R) or −0.9 (E↔K) kcal/mol/pair more energy to dimer stability than a reference alanine pair (A↔A) (47). The g↔e′ pairs R↔R and K↔K are stabilizing, contributing −0.1 and −0.3 kcal/mol/pair, respectively, relative to the A↔A pair. Presumably the repulsive electrostatic energies between the like charges in arginine or lysine pairs are overcome by the favorable Van der Waals interactions between the methylenes of arginine or lysine side chains and hydrophobic amino acids in the a and d positions (2). E↔E is the only pair less stable than A↔A, contributing +0.4 kcal/mol/pair.
In addition to affecting stability, charged amino acids in the g and e positions also regulate dimerization specificity (2, 3, 48, 57, 67, 74, 96, 104). In the analysis of stability presented above, we considered the stability of a charged g↔e′ pair relative to an A↔A pair without addressing any potential energetic interaction between the amino acids in the g↔e′ pair; the contribution of the individual E or R side chain to stability can be determined by examining the E↔A and A↔R pairs. Any excess stability conferred by the E↔R pair is due to the energetic interaction between E and R, termed the “coupling energy.” This can be calculated by using a double-mutant thermodynamic cycle (31, 81) that involves the analysis of four proteins. This idea is presented graphically in Fig. 5. For example, the coupling energy of the E↔R pair is derived from a comparison of a protein containing the E↔R pairs with three additional proteins mutated to contain the pairs A↔R, E↔A, and A↔A (47). The levels of stability of the pairs relative to A↔A are as follows: E↔R, −1.3 kcal/mol; E↔A, −0.1 kcal/mol; and A↔R, −0.7 kcal/mol. The additional energy of the E↔R pair compared to the A↔R and E↔A pairs is −0.5 kcal/mol/salt-bridge [E↔R − (A↔R + E↔A) = coupling energy] [−1.3 − (−0.7 + −0.1) = −0.5] and represents the energy of interaction (coupling energy) between E and R. Table 2 lists the stability of each g↔e′ pair relative to A↔A, and Table 3 gives their coupling energy.
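The double-mutant cycle arithmetic for the E↔R pair can be written out as a small function. This is a minimal sketch: the function and argument names are hypothetical, and the only numbers used are the three stabilities quoted in the text, each already expressed relative to the A↔A reference.

```python
def coupling_energy(pair_xy, pair_xa, pair_ay, pair_aa=0.0):
    """Coupling energy (kcal/mol) from a double-mutant thermodynamic cycle.

    pair_xy: stability of the full g<->e' pair (e.g. E<->R)
    pair_xa: stability with only the g residue present (e.g. E<->A)
    pair_ay: stability with only the e' residue present (e.g. A<->R)
    pair_aa: stability of the alanine reference pair (0.0 if the other
             values are already given relative to A<->A)
    """
    return (pair_xy - pair_aa) - ((pair_xa - pair_aa) + (pair_ay - pair_aa))

# Values quoted in the text, relative to A<->A (kcal/mol).
er_coupling = coupling_energy(pair_xy=-1.3, pair_xa=-0.1, pair_ay=-0.7)
print(f"E<->R coupling energy: {er_coupling:.1f} kcal/mol")
```

A negative result means the two side chains stabilize the dimer more together than the sum of their individual contributions; a positive result would indicate a repulsive pair.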
The larger coupling energy for the E↔R pair (−0.5 kcal/mol) than that of the E↔K pair (−0.3 kcal/mol) indicates that the E↔R pair contributes more to dimerization specificity than E↔K. The like-charged E↔E, K↔K, and R↔R pairs all have destabilizing coupling energies (+0.7, +0.6, and +0.8 kcal/mol, respectively) that are larger than the E↔R and E↔K attractive coupling energies (Table 3). This suggests that preventing a repulsive g↔e′ pair is more important for driving dimerization specificity than forming an attractive pair. The energetic basis of the coupling energy for g↔e′ pairs is a subject of ongoing debate in the literature (47, 56, 59), with some studies suggesting the measured coupling energy does not have a strong electrostatic component.
The suggestion that coupling energy may not be driven by charge interactions is highlighted by the polar glutamine that has a repulsive coupling energy in pairs with either acidic E or basic K (Table 3). Because of the positive calculated coupling energies, we have color-coded E↔Q and Q↔E pairs in red as depicting repulsive acidic pairs and K↔Q, R↔Q, and Q↔K in blue as depicting repulsive basic pairs (Fig. 3).
Many B-ZIP families, including JUN, CNC, and C/EBP, have only one charged amino acid in the g↔e′ pair. These charged amino acids contribute to the stability of the homodimer, as seen in the A↔R and E↔A pairs in the double-mutant thermodynamic analysis. Heterodimers, however, may be preferred if they form an attractive g↔e′ pair, because of the coupling energy. Thus, incomplete g↔e′ pairs can stabilize both homodimeric and heterodimeric interactions.
Amino acids in the a and d positions of the leucine zipper are typically hydrophobic, with a variety of amino acids in the a positions and leucine in 84% of the d positions. An exception is the 2nd heptad a position, which contains asparagine in most homodimerizing B-ZIP proteins. The a and d amino acids are packed in the “holes and knobs” pattern predicted by Crick in 1953 (10), which is essential for leucine zipper dimer stability (87). When the GCN4 leucine zipper a and d positions were changed from valine and leucine to other aliphatic residues, trimers and tetramers formed (26). Similar to the role of amino acids in the e and g positions, amino acids in the a and d positions function to prevent higher-order oligomerization.
Several groups have examined the contribution of amino acids in the a and d positions to leucine zipper stability (63, 88). Leucine in the 4th d position is 9.2 kcal/mol/dimer more stable than alanine and 5.9 kcal/mol/dimer more stable than isoleucine. The additional stability conferred by leucine relative to isoleucine, an amino acid of similar size, likely results from the unique packing interactions of the two leucines with each other and with neighboring amino acids (63). The preferential energetic contribution of leucine over other aliphatic amino acids is not observed in the a position (99, 105; Vinson laboratory, unpublished data).
In addition to g↔e′ pairs, amino acids in the a position can also regulate dimerization specificity. The asparagine side chains, which are common in the 2nd heptad a position (Fig. 3), interact interhelically to form a polar pocket in the hydrophobic interface that limits oligomerization and directs the orientation of the α-helices (69). Hu and coworkers (103) mutated various a position asparagines to isoleucine and found that the asparagine-containing molecules interact preferentially with each other; thus, asparagine a position interactions are important for homodimerization specificity. In contrast, the basic amino acids lysine and arginine in the a position result in repulsive interactions that destabilize homodimerization and thus favor heterodimerization.
Coupling energies for the g↔e′ pair can be measured in the context of a homodimerizing system because the g and e positions in the monomer (and thus the g and e′ positions in the dimer) can be changed independently. In contrast, changing a and a′ or d and d′ independently requires a heterodimerizing system. The contribution of the d position to heterodimer formation remains to be examined. However, the prevalence of leucine in the d position in mammalian B-ZIPs suggests that this position contributes to stability rather than to specificity. The a position, in contrast, is more variable than the d position, suggesting that this position may be important for regulating dimerization specificity.
We used a heterodimerizing leucine zipper system to determine the contribution of alanine, leucine, isoleucine, valine, asparagine, and lysine in the a position to dimerization specificity. Pairs comprising any combination of the three aliphatic amino acids leucine, isoleucine, and valine have similar coupling energies. Asparagine, in contrast, prefers to interact with itself and not with the aliphatic amino acids. The N↔N and V↔V interactions are more stable than the N↔V interactions by 2.3 and 5.3 kcal/mol, respectively. Therefore, asparagine drives interactions with asparagine in the a position. Asparagines in the a position of the 4th heptad of the Oasis family and the 3rd and 5th heptads of the ATF6 family appear to be critical for their predicted homodimerization. In contrast, K↔K interactions in the a position are repulsive, so that lysine preferentially interacts with asparagine and the aliphatic amino acids (Vinson laboratory, unpublished data). The FOS, CNC, large MAF, and small MAF B-ZIP families use basic amino acids in the a position to create heterodimerizing B-ZIP families.
A problem with analyzing leucine zipper interactions by CD spectroscopy is that the stability of heterodimers can only be measured if the heterodimer is significantly more stable than either homodimer. To gain further insights into dimerization specificity, we have developed an A-ZIP protein consisting of a leucine zipper and a designed amphipathic acidic α-helical sequence that replaces the B-ZIP basic region (Fig. 6). This acidic extension forms a coiled-coil structure with the basic region in the B-ZIP|A-ZIP heterodimer and stabilizes the complex by up to 8 kcal/mol (1, 22, 49, 64, 71). The B-ZIP|A-ZIP heterodimer drives interactions between weakly attractive or even somewhat repulsive leucine zippers and gives us access to measuring a range of dimerization affinities. This is important because in vivo dimerization is often driven by DNA binding. The B-ZIP|A-ZIP heterodimer is more stable than the B-ZIP protein bound to DNA, so that the A-ZIPs specifically prevent B-ZIP DNA binding at equimolar concentrations. Because the acidic extension interacts with all basic regions, the specificity of interaction between A-ZIP and B-ZIP domains is primarily leucine zipper dependent (1, 22, 64, 70). The inhibition of DNA binding provides an assay for dimerization.
Competition between A-ZIP protein and DNA for interaction with a B-ZIP protein can be analyzed with the gel shift assay. Figure 6 shows the FOS|JUND heterodimer and the C/EBPα, CREB, and PAR homodimers binding to their canonical DNA binding sites. One molar equivalent of A-PAR inhibits the DNA binding of PAR; it does not inhibit the DNA binding of FOS|JUND, C/EBPα, or CREB, even at 100 molar equivalents (Fig. 7, top panel). Similarly, 1 molar equivalent of A-CREB, A-C/EBP, or A-FOS specifically inhibits the DNA binding of CREB, C/EBP, or FOS|JUND, respectively. In contrast, A-ATF4 has somewhat promiscuous dimerization properties, inhibiting the DNA binding of FOS|JUND and C/EBPα at 1 molar equivalent (Fig. 7E).
We can rationalize the specificity seen in Fig. 7 based on the leucine zipper sequence of each protein. The homodimerizing PAR, CREB, and C/EBP families have similar a and d positions with an asparagine in the a position of the 2nd heptad. The families differ in their attractive g↔e′ pairs: PAR has four attractive g↔e′ pairs, while CREB and C/EBP each have two attractive g↔e′ pairs that are subsets of the PAR pattern. Dimerization between PAR and CREB or PAR and C/EBP is prevented, because PAR preferentially homodimerizes due to the 4.0 kcal/mol/dimer of coupling energy from the eight attractive g↔e′ pairs in the four heptads. CREB and C/EBP are therefore left to homodimerize. C/EBP and CREB do not interact because their g↔e′ pairs are in different heptads. To test the validity of this idea, we mutated only three amino acids in the g and e positions of the C/EBP leucine zipper to confer the PAR pattern of g↔e′ pairs. The mutated C/EBP displays dimerization properties similar to those of PAR (64).
We can predict dimerization partners from Fig. 3 by using the following strategy. Heptads that interact to produce attractive g↔e′ pairs and drive dimerization are green-green, orange-orange, red-blue, and blue-red. Heptads that interact to give repulsive g↔e′ pairs and discourage dimerization are green-orange, orange-green, red-red, or blue-blue. As noted previously, biophysical measurements show that repulsive g↔e′ pairs are more important than attractive g↔e′ pairs in driving dimerization specificity. The a and d positions that affect dimerization are colored black. Asparagine in the a position will preferentially dimerize with another a position asparagine, while basic amino acids in the a position repel each other and thus drive heterodimerization. It should be appreciated that these rules are an oversimplification! We expect more detailed work will show more interactions that are critical for mediating dimerization specificity.
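As a rough illustration of how such rules might be applied computationally, the sketch below classifies individual g↔e′ pairs by charge and sums the coupling energies quoted earlier in the text. Everything about the code's structure is an assumption made for the example: the function names are hypothetical, the coupling energies are treated as independent of which residue sits in g and which in e′, pairs without a quoted value are scored as zero, and the a and d positions are ignored entirely.

```python
ACIDIC = set("DE")
BASIC = set("KR")

# Coupling energies (kcal/mol) quoted in the text; assumed here to be
# symmetric with respect to which residue occupies g and which e'.
COUPLING = {
    frozenset("ER"): -0.5,
    frozenset("EK"): -0.3,
    frozenset("E"): 0.7,   # E<->E
    frozenset("K"): 0.6,   # K<->K
    frozenset("R"): 0.8,   # R<->R
}

def charge(residue):
    """Return -1 for acidic, +1 for basic, 0 for any other residue."""
    if residue in ACIDIC:
        return -1
    if residue in BASIC:
        return 1
    return 0

def classify_pair(g, e_prime):
    """Classify a g<->e' pair as attractive, repulsive, incomplete, or neutral."""
    cg, ce = charge(g), charge(e_prime)
    if cg * ce < 0:
        return "attractive"
    if cg * ce > 0:
        return "repulsive"
    if cg or ce:
        # Following the text, glutamine paired with a charged residue
        # is treated as repulsive; other single-charge pairs are incomplete.
        partner = e_prime if cg else g
        return "repulsive" if partner == "Q" else "incomplete"
    return "neutral"

def total_coupling(pairs):
    """Sum the quoted coupling energies over a list of (g, e') pairs."""
    return sum(COUPLING.get(frozenset(pair), 0.0) for pair in pairs)

# Eight E<->R-like pairs, as in the idealized PAR homodimer discussed above.
par_like = [("E", "R")] * 8
print(classify_pair("E", "R"), total_coupling(par_like))
```

With eight E↔R-like pairs the sum reproduces the 4.0 kcal/mol/dimer of coupling energy cited for the PAR homodimer. A realistic predictor would also have to weigh the a position asparagine and lysine effects described in the text, which this sketch leaves out.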
In the following section, we describe the known dimerization properties of each B-ZIP family and rationalize these properties based on the leucine zipper amino acid sequences.
The homodimerizing B-ZIP families are PAR, CREB, Oasis, and ATF6. Homodimerizing leucine zippers have two defining properties. First, each B-ZIP family has a distinct pattern of attractive g↔e′ pairs. Second, all have asparagine in the a position of the 2nd heptad. The Oasis family has an additional a position asparagine in the 4th heptad, while the ATF6 family has two additional a position asparagines in the 3rd and 5th heptads that help prevent heterodimerization with other B-ZIP proteins.
The PAR family, named for a conserved proline- and acid-rich domain N terminal of the basic region (14), consists of four family members, TEF/VBP, HLF, DBP, and the more distant NFIL3. PAR family members dimerize within the family (14, 33), but not with C/EBP family members (36). The dimers bind the palindromic DNA consensus sequence (5′-ATTACGTAAT-3′), which differs from the CREB binding site by 1 bp per half-site. We view PAR with its four attractive g↔e′ pairs as the canonical homodimerizing leucine zipper. The 1st heptad contains an R↔E pair (green), and the 2nd, 3rd, and 4th heptads contain E↔R or E↔K pairs (orange). The a and d positions contain aliphatic amino acids, except for an asparagine in the 2nd a position. The attractive g↔e′ pairs and the absence of destabilizing a and d amino acids suggest that these proteins will homodimerize.
The CREB family, which has one of the shortest leucine zippers, contains three members: ATF1, CREM, and CREB. ICRE is an alternative splice product of the CREB gene. These proteins form dimers within the family (11, 12, 60). CREB homodimers bind the palindromic DNA consensus sequence 5′-TGACGTCA-3′, termed the cyclic AMP responsive element (CRE).
This family has conserved attractive R↔E pairs (green) in the 1st heptad and E↔K pairs (orange) in the 3rd heptad. The crystal structure of CREB bound to a consensus CRE identified the R↔E and E↔K interhelical interactions, and an additional Y↔E interaction one heptad N-terminal of the leucine zipper, as being important in dimer stability (80). A single asparagine in the 2nd heptad a position and the unique pattern of g↔e′ pairs result in homodimerization among CREB members. Although the g↔e′ pairs are a subset of those found in the PAR family, heterodimerization between PAR and CREB is not predominant, because the PAR|PAR homodimer has greater coupling energy than a PAR|CREB heterodimer.
The Oasis family has five proteins, Oasis (30), CREB-H (72), CREB3 (18, 55), hcp201085, and hcp1698600. These proteins also bind the CRE DNA sequence. The 2nd and 3rd heptads contain attractive g↔e′ pairs, and the a positions of the 2nd and 4th heptads contain asparagine. Two characteristics are consistent with this being a homodimerizing family. First, there is a unique pattern of attractive g↔e′ pairs. Second, the asparagine in the 4th a position will encourage dimerization within the family, but not with the other homodimerizing families without an asparagine in the 4th a position.
ATF6 proteins have a unique pattern of attractive g↔e′ pairs (orange) in the 2nd, 3rd, and 5th heptads. These three heptads also contain asparagine in the a position. This novel combination of attractive g↔e′ pairs and asparagine placements is likely to promote dimerization within the family and discourage interactions with other B-ZIP proteins. Xbp-1 has a 4th heptad a position threonine that may result in exclusive homodimerization of this protein. However, the interactions in the 2nd, 3rd, and 5th heptads remain the same, so that interaction of Xbp-1 with other ATF6 proteins is a possibility.
The C/EBP, ATF4, ATF2, JUN, and the small MAF families homodimerize and heterodimerize with other B-ZIP families. These proteins have properties found in both the homodimerizing and heterodimerizing proteins.
The C/EBP family consists of seven members: C/EBPα, C/EBPβ, C/EBPγ, C/EBPδ, C/EBPε, CAA60698, and CHOP10. C/EBP proteins form dimers within the family (5, 101) that bind to the palindromic DNA consensus sequence 5′-ATTGCGCAAT-3′ and the related CRE and PAR sites (16). The CHOP10 basic region is unique among the listed B-ZIP proteins in having a proline in the basic region that likely distorts the α-helical basic region and alters DNA binding. The CHOP10|C/EBP dimer binds a unique DNA sequence (92). Dimerization within the family is driven by attractive g↔e′ (orange) pairs in the 2nd and 4th heptads and a single asparagine at the 2nd heptad a position.
Heterodimeric interactions have been reported between C/EBP and other B-ZIP families, such as the FOS, JUN (32), ATF4 (93, 96), and ATF2 (83) families. C/EBPβ has also been reported to interact with CREB (89); however, A-C/EBP and A-CREB fail to inhibit the DNA binding of CREB and C/EBP, respectively (Fig. 6), suggesting repulsion between the leucine zippers of C/EBP and CREB. Heterodimerization between C/EBP and other B-ZIP proteins may be promoted by incomplete g↔e′ pairs found in the 1st and 3rd heptad g positions of C/EBP.
One of the C/EBP members, C/EBPγ, may have dimerization properties similar to those found in the FOS and JUN families described later. These properties include a repulsive R↔K (blue) pair in the 1st heptad found in JUN, a repulsive E↔E (red) pair in the 2nd heptad found in FOS, and a histidine in the 5th d position, as found in JUN, FOS, and ATF2.
The ATF4 family has three members, ATF4, ATF5, and hcp1709392. Besides homodimerizing, ATF4 heterodimerizes with C/EBP (93, 96), FOS (23), and NRF2 (27). This family is noteworthy in having acidic and basic repulsive g↔e′ pairs as well as attractive g↔e′ pairs, which may explain their promiscuous dimerization properties. The interface contains the 2nd heptad asparagine in the a position found in homodimerizing B-ZIP proteins.
The ATF2 family contains three proteins: ATF2, ATF7, and CRE-BPa (68). These proteins contain an asparagine in the 2nd heptad a position and attractive g↔e′ pairs (orange) in the 3rd and 4th heptads, structural features that favor homodimerization. The 5th heptad d position contains a histidine that is also found in the FOS and JUN families and may be important for interaction with both of these families.
ATF2 has been reported to heterodimerize with JUN (23, 58), FOS (28), and C/EBP (83) family members. Heterodimerization with FOS could be driven by an incomplete g↔e′ pair in the 1st heptad of C/EBP interacting with the E↔E pair in the 1st heptad of FOS to form an attractive K↔E pair.
Heterodimerization between C/EBP and ATF2 has been observed on a chimeric DNA sequence composed of a C/EBP half site and an ATF2 half site (83). C/EBPα and ATF2 homodimers have four attractive interhelical salt bridges and no repulsive g↔e′ pairs, while an ATF2|C/EBP heterodimer has two attractive and one repulsive g↔e′ pair, suggesting that these two proteins would prefer to homodimerize. The formation of heterodimers on a chimeric site demonstrates the importance of DNA sequence in modulating dimerization specificity.
The JUN family is comprised of three proteins, c-JUN, JUND, and JUNB. The best-studied JUN partner is FOS. JUN and FOS heterodimerize to form the AP-1 transcription factor, originally isolated as a biochemical activity, that binds the 5′-TGAGTCA-3′ DNA sequence, termed the “TRE” (12-O-tetradecanoylphorbol-13-acetate response element) (79, 82). Many other proteins have been reported to heterodimerize with JUN family members, among them CNC, ATF2 (58, 68), ATF3, and c-Maf. JUN family members can also homodimerize, but these complexes bind DNA poorly, bringing into question their biological function (70).
The amino acid sequence of the JUN leucine zipper is consistent with the experimentally observed promiscuous dimerization. Properties that drive heterodimerization are repulsive K↔K (blue) and Q↔K pairs (blue) in the 1st and 4th heptads, respectively. An incomplete g↔e′ pair in the 3rd heptad also promotes promiscuous heterodimerization. In contrast, the asparagine in the 2nd heptad a position is commonly found in homodimerizing B-ZIP proteins. Heterodimerization with ATF2 creates a canonical interface and attractive g↔e′ pairs. An elegant study (94) showed that by changing amino acids in the e and g positions, the promiscuous dimerization of JUN could be restricted to either FOS or ATF2 with a corresponding change in the biological activity of the JUN mutants. The large number of basic amino acids in the g and e positions encourages heterodimerization with acidic proteins such as FOS. A histidine (H) in the 5th heptad d position is conserved among the JUN, FOS, and ATF2 proteins and contributes to dimer stability (9), but its contribution to dimerization specificity has yet to be elucidated.
There are three small MAF (S-MAF) musculoaponeurotic fibrosarcoma proteins: MafF, MafG, and MafK. The S-MAFs homodimerize but do not contain a transactivation domain and thus repress transcription. However, they also heterodimerize with CNC and FOS family members to activate gene expression (35, 40, 43). The S-MAF leucine zipper contains attractive E↔R pairs (orange) in the 3rd and 4th heptads that favor homodimerization (also observed in the ATF2 family) and an asparagine in the 3rd heptad a position. Features that promote heterodimerization include a lysine in the 1st heptad a position (also found in the L-MAFs) and an incomplete glutamic acid g↔e′ pair (red) in the 2nd heptad.
The heterodimerizing B-ZIP families are FOS, CNC, and the large MAFs. Three general properties are apparent for these proteins: (i) repulsive g↔e′ pairs that inhibit homodimerization; (ii) lysine or arginine in the a positions, which discourages homodimerization; and (iii) incomplete g↔e′ pairs that promote promiscuous heterodimerization.
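As a rough illustration, the three properties listed above can be expressed as simple checks on an annotated zipper. The sketch below is hypothetical: the function name `heterodimer_flags`, its input format, and the example residues are all assumptions made for illustration and do not correspond to any specific protein. As in the labels used in this review, glutamine opposite a charged residue is counted as repulsive.

```python
BASIC = {"K", "R"}
ACIDIC = {"E", "D"}


def _charge(residue):
    """Return +1 for basic, -1 for acidic, 0 otherwise."""
    if residue in BASIC:
        return 1
    if residue in ACIDIC:
        return -1
    return 0


def heterodimer_flags(a_positions, homodimer_ge_pairs):
    """Report which of the three heterodimerizing properties are present.

    a_positions: residues at the a position of each heptad.
    homodimer_ge_pairs: (g, e') tuples as they would pair in a homodimer.
    Both inputs are assumed to be annotated by hand beforehand.
    """
    repulsive = 0
    incomplete = 0
    for g, e_prime in homodimer_ge_pairs:
        cg, ce = _charge(g), _charge(e_prime)
        if cg * ce > 0:
            repulsive += 1                  # like charges
        elif (cg == 0) != (ce == 0):        # exactly one charged residue
            uncharged = e_prime if cg != 0 else g
            if uncharged == "Q":
                repulsive += 1              # assumption: Q pairs count as repulsive
            else:
                incomplete += 1
    return {
        "repulsive_ge_pairs": repulsive > 0,                             # property (i)
        "basic_a_position": any(r in BASIC for r in a_positions),        # property (ii)
        "incomplete_ge_pairs": incomplete > 0,                           # property (iii)
    }


# Invented annotation showing all three properties.
flags = heterodimer_flags(
    a_positions=["V", "K", "N", "K"],
    homodimer_ge_pairs=[("E", "E"), ("E", "A"), ("E", "K")],
)
print(flags)
```

These flags only indicate whether each feature is present. They do not weigh the features against one another, which would require the quantitative attraction and repulsion data that are still lacking.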
There are nine FOS family proteins: c-Fos, FosB, Fra1, Fra2, hcp34067, ATF3, JDP2, SNFT, and BATF. FOS family members heterodimerize with the JUN, CNC, and small MAF families (reviewed in reference 6). The FOS family has acidic amino acids in the g and e positions and heterodimerizes with basic JUN zippers (74). Specifically, FOS dimerization properties can be divided into three sets of characteristics that are slightly variable. Five of the proteins (c-Fos, FosB, Fra1, Fra2, and hcp34067) contain conserved repulsive E↔E or Q↔E pairs (red) in the 1st and 4th heptads and repulsive E↔Q or E↔E pairs (red) in the 2nd and 3rd heptads. The hydrophobic interface is composed of a threonine in the 1st a position, lysines in the 2nd and 4th a positions, and a histidine in the 5th heptad. The lysines in the a position inhibit homodimerization and drive heterodimerization.
The remaining FOS family members, ATF3, JDP2, SNFT, and BATF, are not as acidic and do not contain as many repulsive lysines in the a position as the prototypical FOS leucine zipper. They can be further divided into two groups: ATF3 and JDP2, which contain a repulsive basic amino acid in the 4th heptad a position, and SNFT and BATF, which do not.
The second acidic leucine zipper family is named after the founding member, cap'n'collar (CNC), a Drosophila protein (62). There are six members: BACH1, CNC1/NRF1, NF-E2, BACH2, CNC2/NRF2, and NF-E2L3. These proteins heterodimerize with the S-MAF family (35, 66). They have either a repulsive acidic g↔e′ pair or an acidic incomplete g↔e′ pair in the 1st heptad. The presence of multiple incomplete g↔e′ pairs is expected to confer promiscuous dimerization properties. Like the FOS family, the CNC family has lysines in the a positions that drive heterodimerization. However, in contrast to the FOS family, where the lysines are in the 2nd and 4th heptad a positions, CNC proteins contain a basic amino acid in the 2nd and 3rd heptad a positions. This arrangement should alter the heterodimerization partners for the CNC proteins compared to FOS proteins. Interestingly, the CNC|S-MAF heterodimer forms a 3rd heptad a position interaction between N and K, similar to the 2nd heptad interaction found between FOS and JUN.
L-MAF family members form heterodimers driven by the repulsive Q↔K (blue) or Q↔R pairs (blue) in the 2nd heptad, while homodimerization is mediated through attractive E↔K pairs (orange) in the 4th heptad. Along with the small MAF proteins, these are the only B-ZIP proteins with aliphatic amino acids in the 2nd a position. The 1st and 4th heptad a position lysine or arginine discourages homodimerization. Features that promote promiscuous heterodimerization include incomplete g↔e′ pairs composed of glutamic acid (red) in the 1st and 3rd heptads.
HCF has a unique pattern of interactions that suggests it will homodimerize. This protein contains attractive g↔e′ pairs (orange) in the 1st and 4th heptads. The a positions contain asparagine in the 1st and 2nd heptads and serine (a small polar amino acid similar to asparagine) in the 4th heptad. HCF was found to interact with Luman (18, 55).
We have reviewed the literature on the dimerization properties of mammalian B-ZIP proteins and made predictions about the amino acids in the a, d, e, and g positions of the leucine zipper that mediate their known dimerization specificities. We have extended these predictions to all the identified B-ZIP proteins in the human genome. These predictions appear more robust for the homodimerizing proteins than for the heterodimerizing proteins. This type of analysis will be valuable in predicting the dimerization properties of B-ZIP proteins from newly sequenced genomes for which far fewer experimental data exist. Additional experimental data are needed to quantify the attraction and repulsion between different B-ZIP leucine zippers to gain insight into which dimers may form in vivo. In addition, the contribution of DNA binding to B-ZIP stability needs to be quantified to gain insight into how DNA sequences can regulate B-ZIP dimer partner choice.
We thank Tsonwin Hai, Jon Shuman, and an anonymous reviewer for comments on the manuscript.