Bromodomains (BRDs) are protein interaction modules that specifically recognize ε-N-lysine acetylation motifs, a key event in the reading process of epigenetic marks. The 61 BRDs in the human genome cluster into eight families based on structure/sequence similarity. Here, we present 29 high-resolution crystal structures, covering all BRD families. Comprehensive crossfamily structural analysis identifies conserved and family-specific structural features that are necessary for specific acetylation-dependent substrate recognition. Screening of more than 30 representative BRDs against systematic histone-peptide arrays identifies new BRD substrates and reveals a strong influence of flanking posttranslational modifications, such as acetylation and phosphorylation, suggesting that BRDs recognize combinations of marks rather than singly acetylated sequences. We further uncovered a structural mechanism for the simultaneous binding and recognition of diverse diacetyl-containing peptides by BRD4. These data provide a foundation for structure-based drug design of specific inhibitors for this emerging target family.
► Human bromodomain family characterized with 29 high-resolution crystal structures ► Peptide arrays establish core histone binding preferences of BRD ► Interactions with histone-acetylated lysine sites are quantified ► Flanking posttranslational modifications greatly impact acetylated lysine recognition
ε-N-acetylation of lysine residues (Kac) is one of the most frequently occurring posttranslational modifications (PTMs) in proteins (Choudhary et al., 2009). Acetylation has a profound effect on the physicochemical properties of modified lysine residues, neutralizing the positive charge of the ε-amino group (Kouzarides, 2000). Lysine acetylation is abundant in large macromolecular complexes that function in chromatin remodeling, DNA damage, and cell-cycle control (Choudhary et al., 2009) and particularly in histones. Cellular acetylation levels are stringently controlled by two enzyme families: the histone acetyltransferases (HATs) and histone deacetylases (HDACs) (Shahbazian and Grunstein, 2007). Histone acetylation has been associated with transcriptional activation, but specific marks have also been linked to DNA repair (Kouzarides, 2007).
Bromodomains (BRDs) are protein interaction modules that exclusively recognize acetylation motifs. BRDs are evolutionarily conserved and present in diverse nuclear proteins comprising HATs (GCN5, PCAF), ATP-dependent chromatin-remodeling complexes (BAZ1B), helicases (SMARCA), methyltransferases (MLL, ASH1L), transcriptional coactivators (TRIM/TIF1, TAFs), transcriptional mediators (TAF1), nuclear-scaffolding proteins (PB1), and the BET family (Muller et al., 2011) (Figure 1A and Table 1). Despite large sequence variations, all BRD modules share a conserved fold that comprises a left-handed bundle of four α helices (αZ, αA, αB, αC), linked by loop regions of variable length (ZA and BC loops), which line the Kac binding site and determine binding specificity. Cocrystal structures with peptides have demonstrated that Kac is recognized by a central deep hydrophobic cavity, where it is anchored by a hydrogen bond to an asparagine residue present in most BRDs (Owen et al., 2000).
Dysfunction of BRD proteins has been linked to development of several diseases. For instance, recurrent t(15;19) chromosomal translocations that result in a fusion protein comprising BRD4 or BRD3 and the NUT (nuclear protein in testis) protein lead to an aggressive form of human squamous carcinoma (French, 2010a; French et al., 2001). Deregulation of transcription as a consequence of altered protein acetylation patterns is a hallmark of cancer, a mechanism that is currently targeted by HDAC inhibitors (Lane and Chabner, 2009). It is likely that selective inhibitors capable of targeting BRDs will find broad application in medicine and basic research, as exemplified by the recent development of highly specific and potent acetyl-lysine competitive BET BRD inhibitors (Chung et al., 2011; Dawson et al., 2011; Delmore et al., 2011; Filippakopoulos et al., 2010; Mertz et al., 2011; Nicodeme et al., 2010).
To date, only a small number of lysine acetylation marks have been identified to specifically interact with individual BRDs, and the often weak affinities reported for BRD interactions with their potential target sites have been determined by a variety of different techniques making data comparison difficult (Muller et al., 2011). Reported affinities range from nano- to millimolar dissociation constant (KD) values raising the issue of which affinity window is relevant for specific BRD-peptide interactions. For instance, the BRDs of BRD2 have been shown to bind histone 4 acetylated at lysine 12 (H4K12ac) (Kanno et al., 2004), with KD values that range from 360 μM for the diacetylated peptide H4K5acK12ac (Umehara et al., 2010) to 2.9 mM for the monoacetylated H4K12ac peptide (Huang et al., 2007). Closely spaced multiple Kac sites have also been shown to significantly increase affinity of the histone H4 N terminus for BRDT by simultaneous binding to the same BRD (Morinière et al., 2009). Thus, the field would greatly benefit from a more systematic analysis of BRD structure and peptide binding properties in order to better understand acetylation-mediated signaling as interpreted through BRDs.
Here, we present a comprehensive structural characterization of the human BRD family together with identified Kac-specific interaction sites of these essential protein recognition modules with their target sites in histones. Using available sequence databases, we identified 61 BRDs in the human proteome that are present in 46 diverse proteins. High-throughput cloning led to the establishment of 171 expression systems that yielded functional recombinant proteins. Using these reagents, we crystallized and determined the structures of 29 BRDs, including 25 structures that had not been previously published. We performed a SPOT blot analysis that covered all possible Kac sites of human histones (Nady et al., 2008) for 43 members of the BRD family. We identified 485 linear Kac-dependent BRD-binding motifs, and determined accurate binding affinities in solution for 81 known cellular histone marks by isothermal titration calorimetry (ITC). Furthermore, we found that BRD peptide recognition is dependent on patterns of multiple modifications rather than on a single acetylation site. This study provides a comprehensive structural comparison of this protein family interpreted in the context of a large array of histone interaction data, establishing a powerful resource for future functional studies of this family of epigenetic reader domains.
Analysis of sequence databases (NCBI, UniProt, PFAM) identified 46 diverse human proteins that contain a total of 61 diverse BRDs. BRD-containing proteins are large multidomain proteins associated with chromatin remodeling, transcriptional control, methyl or acetyltransferase activity, or helicases (Figure 1A and Table 1). The domain organization in BRD-containing proteins is evolutionarily highly conserved, and the BRD motif is often flanked by other epigenetic reader domains. Most frequently observed combinations include the presence of plant homeodomains (PHDs) N terminal to the BRD, multiple BRDs, as well as various other domains that generally mediate protein interactions such as bromo-adjacent homology (BAH) domains (Goodwin and Nicolas, 2001).
Phylogenetic analysis of the BRD family outside the two central core helices was complicated by the low sequence homology and nonconserved insertions in BRD loop regions. We therefore used three-dimensional structure-based alignments, including available NMR models, together with secondary structure prediction (Jones, 1999) and manual curation of the aligned sequences to establish an alignment of all human BRDs (Figure S1). The derived phylogram clustered into eight major BRD families designated by Roman numerals (I–VIII) (Figure 1B). Key references for all BRD-containing proteins are included in Table 1.
Multiple protein interaction modules can be tightly linked to form a single stable interaction domain, or they can be connected by flexible linker sequences allowing conformational adaptation to diverse sequence motifs. An example of a tightly linked dual-domain reader is the PHD-BRD of TRIM24, in which both domains interact through a large interface that orients both peptide binding cavities to the same side of the protein (Tsai et al., 2010). In contrast, the two BRDs in TAF1 are free to orient independently, as shown by the different domain orientations in the dual-domain structure determined here and a previously published model (Jacobson et al., 2000) (Figure 1C). The frequent combination of multiple interaction modules in the same protein suggests that the epigenetic reading process involves concomitant recognition of several PTMs.
In order to establish a platform of recombinant BRDs for functional and structural studies, we subcloned all human BRDs into bacterial expression systems in frame with a cleavable N- (or C-) terminal His6 tag. A total of 1,031 constructs resulted in the identification of 171 expression systems covering 44 BRDs that yielded stable and soluble proteins. Details of the cloned constructs are summarized in Table S1, and descriptions of one representative expression system per BRD are summarized in Table S2. The expressed proteins provide an excellent coverage of representative BRDs of all eight families.
A total of 133 recombinant BRD constructs covering 44 unique BRDs were expressed at levels sufficient for structural studies, resulting in the determination of a total of 29 crystal structures of apo-BRDs (Table S3), or BRDs in complex with acetylated peptides (Table S4). Together with previously published structural information, each BRD family is represented by at least one structural model, and families I, II, and VIII are either completely or nearly completely covered (Figure 1B). All structures presented here were refined at high resolution. A summary of the crystallization conditions, data collection, and refinement statistics is compiled in Table S5.
Despite the low degree of overall sequence homology, all BRDs shared a conserved overall fold comprising four α helices (αZ, αA, αB, αC) linked by highly variable loop regions (ZA and BC loops) that form the docking site for interacting recognition motifs (Figures 1D and S2A). The C and N termini are highly diverse and may comprise additional helices that extend the canonical BRD fold (e.g., the sixth BRD of PB1 has an additional C-terminal helix) or largely extended kinked helices that are present as C- or N-terminal extensions (for instance in TAF1L or ATAD2). The four helices form a deep cavity that is extended by the two loop regions (ZA and BC loops), creating a largely hydrophobic Kac binding pocket. The most notable structural difference within the BRD core fold is a hairpin insertion located between helix αZ and the ZA loop that is present in all family VIII members. The proximity to the Kac binding site suggests that this insert may play a role in recruitment of acetylated binding partners. Loop insertions are frequently found within the ZA loop, resulting in substantial differences in the rim region of the binding pocket. Hydrophobic residues in the ZA loop may contribute to protein instability and the low crystallization success rate observed in our work for BRDs of families VI and V. Indeed, in the recently published structure of the MLL tandem PHD-BRD module, a flexible insertion found in the MLL ZA loop was deleted in order to generate a more stable construct (Wang et al., 2010).
In stark contrast to the conserved fold of BRDs (Figure S3A), their surface properties are highly diverse. The electrostatic potential of the surface area around the Kac binding site ranges from highly positively to strongly negatively charged, suggesting that BRDs recognize largely different sequences (Figure 2). Based on their surface properties, interactions with highly basic histones are not likely for BRDs with highly positive surfaces, as observed for instance for the third BRD of PB1.
Structural superimposition of 33 BRD crystal structures and 4 NMR models revealed conserved motifs throughout the folded protein domain. To refer to specific sites, we chose the first BRD of BRD4 as a reference sequence for numbering of residues (Figure S2A). The N-terminal helix αZ is highly diverse, but it contains three conserved hydrophobic residues oriented toward the core of the helical bundle. This conserved motif follows the generic sequence ϕ1x1x2(x3)ϕ2x3x4x5(x6)ϕ3, where ϕi are hydrophobic residues, and xj represent any amino acid. The insertions at x3 are present in the N-terminal domain of BET family members. Insertions x6 are present in the C-terminal BRDs of TAF1 and TAF1L and possibly PRKCBP1 (Figures S2A and S2B). Helix αZ is flanked by a diverse sequence region and a β hairpin insert present in all family VIII BRDs (Figures S1 and S3B). These diverse loop inserts are typically followed by a short helical segment in the ZA loop. The C terminus of the helical segment is stabilized by a highly conserved phenylalanine (F83 in BRD4(1)) that is deeply buried by hydrophobic residues present in helix αC, bridging both sides of the helical bundle (Figure S2C). The ZA loop also harbors three conserved proline residues in addition to hydrophobic residues, such as the conserved V87/Y97 pair, that pack closely against hydrophobic residues present in αC, stabilizing the loop conformation. A conserved tyrosine (Y97) defines the N terminus of the ZA loop helix present in all BRDs except TRIM28 and the sixth BRD of PB1, which have unusually short ZA loops that have lost this structural element (Figures 3A and S2C).
Helix αA is preceded by a Pϕ1D motif (ϕ1 is a hydrophobic residue). The conserved aspartate caps the helix αA forming a hydrogen bond with a backbone amide. Also for this helix, the main sites of conservation are hydrophobic residues that contribute to the stability of the core of the structure (Figure 3B). The loop region AB contains a highly conserved tyrosine (Y119) that hydrogen bonds to a conserved aspartate (D128) located in helix αB, presumably stabilizing the loop-helix fold. The long helix αB shows a conserved pattern of the sequence ϕxxDϕxxϕϕxNϕxxY/F (Figure 3C). A conserved asparagine (N135) hydrogen bonds with the ZA loop backbone linking to this αB loop region that is additionally stabilized by a small hydrophobic core formed around the conserved aromatic residue (Y139) preceding the Kac docking residue (N140). An asparagine residue that anchors Kac by formation of a critical hydrogen bond initiates the BC loop. Structural comparison suggested that this asparagine can be replaced by other hydrogen bond donors, such as threonine or tyrosine side chains. In MLL, however, an aspartate occupies this position suggesting that this domain either does not bind acetylated lysine residues or has a significantly different mechanism to recognize its target sequence. Similar to helix αZ, the C-terminal helix αC exhibits little sequence conservation apart from a number of hydrophobic core residues (Figure 3D). In summary, we have identified several highly conserved sequence motifs in BRDs that serve to stabilize the structural fold and conformation of loop regions flanking the Kac binding pocket. An overview of the sequence conservation is shown in Figure S2D.
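The αB consensus described above (ϕxxDϕxxϕϕxNϕxxY/F) can be read as a simple pattern over a protein sequence. The following Python sketch is illustrative only: the set of residues treated as hydrophobic (ϕ), the function name, and the toy test sequences are assumptions made here, not definitions from this study, and a regular expression cannot replace the structure-based alignment used to derive the motif.

```python
import re

# Assumption: residues counted as hydrophobic (phi); the study does not define this set.
HYDROPHOBIC = "AVILMFWYC"

# phi-x-x-D-phi-x-x-phi-phi-x-N-phi-x-x-[Y/F], with x = any residue.
ALPHA_B_MOTIF = re.compile(
    rf"[{HYDROPHOBIC}]..D[{HYDROPHOBIC}]..[{HYDROPHOBIC}]{{2}}.N[{HYDROPHOBIC}]..[YF]"
)

def find_alpha_b_motifs(sequence):
    """Return (start_index, matched_segment) for each non-overlapping motif hit."""
    return [(m.start(), m.group()) for m in ALPHA_B_MOTIF.finditer(sequence.upper())]

# Toy sequences (hypothetical, not taken from any real BRD).
print(find_alpha_b_motifs("GSLKEDIAALVSNLGTYPK"))   # one hit starting at index 2
print(find_alpha_b_motifs("GSLKEDIAALVSDLGTYPK"))   # conserved N replaced by D: no hit
```

Within a hit, the conserved aspartate sits at offset 3, the conserved asparagine at offset 10, and the terminal Y/F at offset 14, mirroring the D128, N135, and Y139 spacing given in BRD4(1) numbering.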
Histone tails are hot spots of PTMs that play key roles in regulation of transcription and all aspects of chromatin biology. However, to date, no systematic study has addressed binding specificity of reader domains. Here, we used SPOT peptide arrays that cover all possible Kac sites of the human histones (H1.4, H2A, H2B, H3, and H4) in order to identify interaction sites for 33 representative BRDs. To distinguish between Kac-dependent and independent binding, we also included all corresponding unmodified peptides. In general, affinities of Kac for BRDs are low, suggesting that additional interaction domains may be required for higher affinity target-specific binding, in vivo. In some cases we observed Kac-independent interaction of BRDs with nonacetylated control peptides. To date, it is not clear whether BRDs participate in Kac-independent protein interactions as it has been described for PHDs that recognize a broad variety of differently methylated, acetylated, and nonmodified peptides (Lan et al., 2007; Org et al., 2008; Tsai et al., 2010).
We identified 485 interactions of BRDs to histone peptides that depend on the presence of a single Kac site (Figures 4 and S4; Table S6). The nuclear body protein SP140 as well as the related protein LOC93349 and PCAF showed nonspecific binding to most peptides. In contrast, the second, fourth, fifth, and sixth BRD of PB1, MLL, and TRIM28 interacted with only a few histone Kac peptides. Also, a number of promiscuous sequences were identified, such as the H2AK36 and H2BK85-containing peptides that interacted with most BRDs. To validate the detected interactions and to obtain accurate binding constants in solution, we synthesized 53 singly acetylated peptides and determined binding constants by ITC (Table S7). We also included 14 peptides that did not bind to BRDs in the SPOT array. As expected, these peptides did not show measurable interactions by ITC, suggesting that false negatives are not a major concern in the SPOT array study. Also in agreement with the array study, binding of 20 identified interacting peptides was confirmed by ITC experiments showing KD values between 3 and ~300 μM. However, 16 peptides that were selected based on published recognition sites did not give rise to detectable interactions in the SPOT array and still exhibited KD values between 10 and 730 μM by ITC. The detection limit of SPOT arrays is about 500 μM, but the data suggest that SPOT arrays do not detect all possible interacting motifs. Steric constraints of the immobilized peptides and potentially the lack of sufficient N- and C-terminal flanking regions are the most likely reasons for the failure to detect BRD recognition motifs in SPOT arrays. In addition, 35 (12%) of the acetyl-lysine-containing peptides were not recognized by acetyl-lysine-specific antibodies. However, most of these peptides contained proline residues in close proximity to the Kac site, a likely reason for the failure of the antibody to recognize these sites. Other peptides showed crossreactivity with the His6 antibody and have been removed from the analysis.
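The validation counts quoted above can be combined into a rough estimate of how often the array reported a binder that ITC confirmed. The short Python sketch below uses only the two counts given in the text (20 array hits confirmed by ITC, 16 literature-derived peptides missed by the array but bound in ITC). The function name is hypothetical, and because the peptides were not chosen at random, the resulting fraction describes this validation set only and should not be read as a general sensitivity of SPOT arrays.

```python
def detected_fraction(confirmed_by_both, missed_by_array):
    """Fraction of ITC-confirmed binders that the array also reported."""
    total = confirmed_by_both + missed_by_array
    if total == 0:
        raise ValueError("no ITC-confirmed binders supplied")
    return confirmed_by_both / total

# Counts quoted in the text: 20 array hits confirmed by ITC,
# 16 published recognition sites missed by the array but binding in ITC.
print(f"{detected_fraction(20, 16):.0%}")
```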
The false negative rate was particularly high for BET family members. Recently, it was demonstrated that murine BRDT preferentially recognized diacetylated motifs, whereas most monoacetylated peptides tested did not bind tightly to mBRDT BRDs (Morinière et al., 2009). This observation prompted us to design a systematic histone H3 array in which we explored combinations of acetylated and trimethylated (Kme3) lysines as well as phospho-serine/phospho-threonine (pS/pT) modifications around each acetylated lysine (Figures 5 and S5).
Interactions previously reported for singly acetylated lysine sites were largely confirmed. Interestingly, most of the 43 BRDs tested were highly sensitive to modifications flanking the Kac mark. For instance, BRD4(2) did not interact with H3 peptides singly acetylated on K4. In contrast, this domain showed strong interaction with diacetylated H3 (H3K4acK9ac) but not with the same peptide acetylated at K4 but trimethylated at K9 (H3K4acK9me3). The strongest interaction was observed using diacetylated H3K4acK9ac in combination with phosphorylation at T3. Similarly, the BRD of FALZ showed no interaction with non- or singly acetylated K4 but interacted strongly with H3 pT3K4acK9ac. Also, WDR9(2) and EP300 exclusively interacted with the triply modified H3 pT3K4acK9ac peptide. The WDR9(2) interaction with H3K14ac showed strong dependence on S10 and T11 phosphorylation as well as acetylation at K18. Indeed, ITC experiments showed that the binding affinity of many BRDs was significantly increased for multiply modified peptides (Tables S7 and S8). For example, the KD of CREBBP decreased from 733 μM for H3K14ac to 131 μM for H3pS10K14acK18ac suggesting that many BRDs recognize a pattern of modifications rather than a single Kac mark.
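To put the CREBBP example in perspective, the two KD values can be converted into a fold change and an approximate change in binding free energy using the standard relation ΔG = RT ln KD. The Python sketch below is a worked illustration under stated assumptions: the function names are hypothetical, and the temperature of 298.15 K is assumed here because the text does not give the ITC temperature.

```python
import math

R_KJ_PER_MOL_K = 8.314462618e-3  # gas constant in kJ/(mol*K)

def fold_change(kd_reference_um, kd_modified_um):
    """How many times tighter the modified peptide binds (ratio of KD values)."""
    return kd_reference_um / kd_modified_um

def delta_delta_g(kd_reference_um, kd_modified_um, temperature_k=298.15):
    """Change in binding free energy in kJ/mol from dG = RT ln(KD).

    Negative values mean the modified peptide binds more tightly.
    The default temperature is an assumption, not a value from the study.
    """
    return R_KJ_PER_MOL_K * temperature_k * math.log(kd_modified_um / kd_reference_um)

# CREBBP values quoted in the text: H3K14ac (733 uM) vs. H3pS10K14acK18ac (131 uM)
print(round(fold_change(733.0, 131.0), 1))    # roughly 5.6-fold tighter
print(round(delta_delta_g(733.0, 131.0), 1))  # about -4.3 kJ/mol at the assumed 25 °C
```

A gain of a few kJ/mol is modest in absolute terms, which is consistent with the view that flanking marks tune an intrinsically weak Kac interaction rather than create a high-affinity site on their own.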
To obtain better insight into BRD recognition of multiply acetylated histone tails, we designed a systematic μ-SPOT array of 11-mer peptides that harbored multiple Kac sites of the N-terminal tails of histones H3 and H4 (Figure 6A). Screening against BRDs of the BET family showed that BRD4(2) interacted with most combinations of two and three acetylated lysines, whereas BRD4(1) seemed to specifically recognize multiple marks found on the H4 tail. A tetra-acetylated H4 peptide that contained the acetylation sites K5, K8, K12, and K16 bound with single-digit micromolar KD values to the first BRDs of BRD2 and BRD4, increasing affinity at least 20-fold when compared to single marks. The second BET BRDs bound to tetra-acetyl H4 peptides with about 10-fold weaker affinities, suggesting that the first BRD in BET proteins recognizes the H4 tail (Figure 6B). Recently, it was demonstrated that BRDT requires two Kac residues for high-affinity binding (Morinière et al., 2009). Our peptide binding data suggest that the BET family and several other BRDs may also recognize multiply acetylated peptides. However, our binding data cannot discriminate between simultaneous recognition of two Kac as opposed to increased avidity for a multiply modified peptide. In order to determine whether the diverse sequence and spacing of histone Kac residues can be accommodated by a single BRD, we systematically determined cocrystal structures of BRD4(1) with the diacetylated peptides H4K5acK8ac, H4K12acK16ac, and H4K16acK20ac. In all cases the two acetylated lysines bound simultaneously and with identical conformations to the BRD4(1) Kac binding site (Figure 6C). The N-terminal Kac always formed the anchoring hydrogen bond with the conserved asparagine (N140). In the N-terminal region of H4, flexible glycine residues allow variable peptide conformations with two (H4(1–11)K5acK8ac; Figure S6A) or three (H4(11–21)K12acK16ac; Figure S6B) linking residues, whereas the large side chains in H4(15–25)K16acK20ac (Figure S6C) fit perfectly into surface grooves created by the ZA and BC loops. These structures explain the similar affinities observed for the various combinations of di- and triacetylated H4 peptides. They also suggest that the greater apparent affinity of BRD4(1) and BRD2(1) for tetra-acetylated H4 peptides is an avidity effect. However, not all diacetylated H4 sequences are compatible with this bidentate recognition process. The cocrystal structure of H4(7–17)K8acK12ac with BRD4(1) revealed a canonical monoacetylated recognition mode (Figure S6D), suggesting that the H4 linker sequence spanning residues 9–11 is not suitable for simultaneous recognition of the two Kac by a single BRD. Consistent with this notion, ITC experiments revealed a binding stoichiometry (N) of 0.5, indicating binding of two BRDs to the H4K8acK12ac peptide, whereas only a single binding event with significantly increased affinity was observed for the H4K5acK8ac peptide (Table S8). A representative set of ITC data is shown in Figure 6D.
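The linker residues referred to above can be listed directly from the H4 tail sequence. The Python sketch below is a small illustration: the 25-residue sequence of the human histone H4 N terminus (initiator methionine removed) is supplied here as an assumption, since the text does not spell it out, and the function name is hypothetical.

```python
# N-terminal residues 1-25 of human histone H4 (initiator methionine removed).
# Supplied here as an assumption; the text itself does not list the sequence.
H4_TAIL = "SGRGKGGKGLGKGGAKRHRKVLRDN"

def linker_between(first_site, second_site, tail=H4_TAIL):
    """Return the residues strictly between two lysines given by 1-based position."""
    for site in (first_site, second_site):
        if tail[site - 1] != "K":
            raise ValueError(f"residue {site} is not a lysine")
    return tail[first_site:second_site - 1]

# Adjacent pairs of acetylation sites discussed in the text.
for pair in [(5, 8), (8, 12), (12, 16), (16, 20)]:
    print(pair, linker_between(*pair))
```

Under this assumed sequence, K5/K8 are separated by two glycines, K12/K16 by a glycine-rich three-residue linker, and K16/K20 by three residues with large side chains, in line with the structural description. The K8/K12 pair also has a three-residue linker, so linker length alone does not explain why that peptide was bound in a monoacetylated mode.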
We were interested in the sequence requirements of the diacetyl-lysine BET recognition and designed a systematic peptide array in which we modulated the spacer sequence and residue properties of residues located between the two Kac binding sites (Figures 7A and S7). For the first BRDs of the BET family, a spacer of two glycine residues was optimal. However, BRD2(1) also tolerated longer linker sequences. For two-residue linkers, bulky amino acids in the first linker position were not tolerated, but changes of residue properties in the second linker position did not strongly influence binding. Intriguingly, the wild-type sequence “GG” of the H4 K5/K8 linker region seems highly optimized for interaction of the first BRD of BET family members. Binding of di-Kac marks separated by three-residue spacers, as found in sequences linking the H4 K12/K16 and K16/K20 sites, required a glycine or a hydrophobic residue in the first linker position for optimal binding to the first BET BRDs. Acidic residues in any linker position led to loss of interaction with H4 histone tail peptides. In contrast, the second BRDs of BET proteins bound either weakly (BRD2), not at all (BRD3), or promiscuously (BRD4) to histone sequences and their variants present in this array. ITC data collected on the first BRD of BRD4 showed a 30-fold increase in affinity from the singly acetylated peptide H4K5ac to the optimal wild-type peptide H4K5acK8ac. In contrast, diacetylation had only a modest effect on binding affinities of the second BRD of BRD4. As reported for interactions with single acetylation sites in the case of the BRDs of BRD2 (Umehara et al., 2010), alanine mutants of the conserved asparagine (N140 and N433 in the first and second BRDs in BRD4) also abolished binding of diacetylated peptides in both SPOT assays and ITC (Figures 7B and 7C; Table S9).
In order to address the question of whether full-length BRD4 also interacts with the identified Kac sites in the context of intact nucleosomes, we performed pull-down assays on nucleosomal preparations using Flag-tagged BRD4 and antibodies that specifically recognize Kac sites. In agreement with our peptide array studies, we identified histone interaction of BRD4 with the H4 sites K5ac, K8ac, K12ac, K16ac, and H3 K14ac (Figure 7D). Unfortunately, no antibodies are currently available that specifically recognize diacetylated marks in histones H3 and H4. We therefore analyzed Kac-enriched tryptic digests prepared from pull-downs of salt-extracted histone using C-terminally biotinylated BRD4(1) and BRD4(2) by mass spectrometry (Figure 7E). We were able to detect numerous polyacetylated histone peptides associated with BRD4 BRDs; incubation with the active BRD4 inhibitor (+)-JQ1 (Filippakopoulos et al., 2010), but not its inactive stereoisomer (−)-JQ1, abrogated interaction with acetylated histones, indicating that the purifications were specific. Importantly, we observed that the first BRD, BRD4(1), interacted mostly with polyacetylated histone H4 peptides and that the majority of peptides identified contained at least two Kac sites. These results are in good agreement with the strong increase in binding affinity and the preference of BRD4(1) for histone diacetyl marks observed in our in vitro binding studies.
Recent developments in biotechnology and structural biology have facilitated rapid generation of structural data enabling determination of high-resolution structures of most members of certain protein families within a short time frame (Barr et al., 2009). The study presented here represents a comprehensive structural description of the entire human BRD family with at least one representative structural model for each branch in the BRD phylogenetic tree. Structural coverage of families I, II, and VIII is complete or nearly complete. These crystal structures enabled a detailed sequence comparison of this highly diverse domain family. Importantly, although the protein family database Pfam (Finn et al., 2010) extended the BRD fold from the initially predicted central helices αA and αB (Haynes et al., 1992) to a 110 residue motif (Jeanmougin et al., 1997), sequence-based tools still fail to predict correct domain boundaries for BRDs that contain long ZA and BC loop insertions. The excellent structural coverage of the BRD family enabled the identification of BRD signature motifs and family-specific secondary structure elements, such as the ZA loop helix αAZ and the subfamily VIII-specific β hairpin insert.
Peptide arrays offer a rapid technology for screening protein-peptide interactions. The technology was developed more than a decade ago (Reineke et al., 2001) and has recently been applied to study epigenetic methyl-lysine reader domain interactions with histone tails (Nady et al., 2008, 2011). Recent progress in array technology allows peptide densities of up to 40,000 spots per square centimeter of solid support, enabling in principle genome-wide analysis of reader domains with peptidic recognition motifs (Beyer et al., 2009). To date, more than 100 histone PTMs that function as recruitment platforms for chromatin proteins have been described (Kouzarides, 2007). We chose, therefore, a systematic peptide array that covered all possible histone acetylation sites to characterize a representative set of BRD reader domains. However, many BRDs may interact with acetylation sites present in nonhistone proteins. In fact, we did not observe interaction with histone peptides for a number of BRDs.
Recognition sites for only a few BRDs have been previously characterized, and reported substrate affinities range from the low micromolar to the millimolar KD range. Specific recognition sites in histones identified by our SPOT arrays that contained only a single Kac site per peptide had binding affinities between 3 and 350 μM, which fall into the affinity range that has been reported for other BRD-Kac interactions (Shen et al., 2007; Zeng et al., 2008). However, SPOT intensities did not always correlate with binding constants determined in solution by ITC, suggesting that the cellulose-linked peptides used in this study do not allow quantification of binding affinities. In contrast, a recent study found good correlation between SPOT intensities and substrate KM values for deacetylases, indicating improved correlation for proteins with enzymatic activity (Smith et al., 2011).
The weak contribution of the Kac mark to the binding affinity of BRDs to their target sites makes BRD interactions particularly sensitive to changes in the environment of the Kac site. The high density of PTMs in histones and other signaling molecules results in a large number of potential combinations of marks that regulate chromatin-templated recognition processes. In this study we selected a limited but systematic set of combinations that may be present in histone H3 and comprehensively profiled this array against the human BRD family. The observed strong influence of neighboring PTMs, such as phosphorylation, on recognition of their target sites by BRDs suggests tight coupling of phosphorylation signaling with epigenetic mechanisms of regulation. Many examples of this coupling have been reported for chromatin-modifying enzymes. For instance, H3S10 phosphorylation has been shown to be functionally linked to GCN5-mediated acetylation at H3K14 (Lo et al., 2000, 2001), and crosstalk of the three marks, H3K9ac, H3pS10, and H4K16ac, regulates transcriptional elongation of certain genes by providing a nucleosome platform that recruits BRD4/P-TEFb (Zippo et al., 2009). Also, H3pS10 is a prerequisite for H3K4 trimethylation (Li et al., 2011), which in turn has been shown to prevent phosphorylation at H3T3 by haspin (Eswaran et al., 2009). These data strongly suggest that combinatorial motifs rather than single PTMs determine the cellular outcome of processes regulated by epigenetic reader domains. This hypothesis would also explain the large amount of contradictory results in studies where single marks have been assigned specific function such as transcriptional activation or silencing. Thus, the reading process of the “histone code” is a sophisticated, nuanced chromatin language that recognizes combinations of marks rather than single PTMs (Berger, 2007).
Recent structural and biophysical studies demonstrated that murine BRDT requires at least two adjacent acetylation sites for tight interaction with the histone H4 tails (Morinière et al., 2009). Our SPOT array and ITC data showed that multiple Kac sites are generally required for specific recognition of the histone H4 tail by all human BET family members.
We asked whether interactions with diacetylated lysine motifs also occur outside the BET BRD family. Rigid docking of the H4K5acK8ac peptide onto all available crystal structures revealed that a number of other BRDs have an acetyl-lysine binding site architecture compatible with binding of this diacetylated peptide (data not shown).
The presence of multiple reader modules in chromatin modification complexes led to the proposal that distinct epigenetic signatures are interpreted by a multivalent reading process that engages diverse binding modules (Ruthenburg et al., 2007). For instance, recently, the dual-reader module PHD-BRD in BPTF has been shown to specifically recognize a combination of H4K16ac and H3K4me3 at the mononucleosome level (Ruthenburg et al., 2011). Similarly, the tandem Tudor domain of UHRF1 recognizes H3 when K9 is trimethylated, and K4 is unmodified, a histone modification state associated with heterochromatin (Nady et al., 2011). Combinations of PHD and BRDs are particularly frequent and are a hallmark of BRD proteins in families V and VI, and the PHD-BRD structure showed that the two reader domains form a single, stable functional unit (Tsai et al., 2010). The work by Tsai et al. (2010) also suggests that the TRIM24 PHD-BRD di-domain binds two different histone tails in opposite orientations. Similarly, our array and ITC studies on BRD4 showed that the first BRD of this protein has high affinity for the histone H4 tail, whereas the second BRD most likely recognizes multiply acetylated marks in histone H3. This would be consistent with the notion that proteins that harbor multiple reader domains act as integration platforms for different chromatin proteins.
BRDs have recently emerged as promising targets for the development of protein interaction inhibitors (Chung et al., 2011; Filippakopoulos et al., 2010; Hewings et al., 2011; Nicodeme et al., 2010). The acetylation of lysine residues neutralizes the charge of the primary amine. As a consequence, BRD acetyl-lysine binding sites are deep and largely hydrophobic binding pockets that represent attractive targeting sites for the development of Kac competitive inhibitors. Proteins containing epigenetic reader modules have been implicated in the development of many diseases (Baker et al., 2008; Muller et al., 2011; Reynoird et al., 2010).
The recent development of potent and highly specific Kac competitive inhibitors for BET BRDs provides a compelling case for targeting these BRDs for the treatment of an extremely aggressive subtype of squamous cell carcinoma that is caused by chromosomal rearrangement of BRD3 or BRD4 with NUT (Filippakopoulos et al., 2010; French, 2010b). Recent data strongly suggested that targeting BET BRDs will be beneficial for many diverse cancer types due to downregulation of oncogenes such as c-Myc (Dawson et al., 2011; Delmore et al., 2011; Mertz et al., 2011). The structural data presented here provide the foundation for the rational design of selective BRD inhibitors that will be valuable tools for our understanding of the role of epigenetic reader modules in health and disease.
BRD constructs were subcloned into pET28-derived expression vectors. All proteins were expressed as His6-tagged fusions and were purified using Ni-chelating affinity chromatography. Analytical details for construct design, protein expression, and purification are given in the Extended Experimental Procedures and in Tables S1 and S2.
cDNAs encoding human BRD-containing proteins were obtained from different sources. Most were synthesized (Genscript) for codon optimization; others were provided by the MGC collection, the IMAGE collection, or commercial sources. The obtained cDNA sequences were used as templates to amplify BRD regions by polymerase chain reaction (PCR) in the presence of Platinum Pfx DNA polymerase (Invitrogen, UK). All relevant details are listed in Table S1. PCR products were purified (QIAquick PCR Purification Kit, QIAGEN Ltd. UK) and further subcloned into pET28-derived expression vectors, pNIC28-Bsa4 (gi|124015065) or pNIC-CTHF (gi|124015079), using ligation-independent cloning (Stols et al., 2002). Constructs were transformed into competent Mach1 cells (Invitrogen, UK) to yield the final plasmid DNA and were verified by sequencing.
Constructs were transformed into competent BL21 (DE3) cells (Invitrogen) or into BL21 (DE3)-R3-pRARE2 cells (phage-resistant derivative with a pRARE plasmid encoding rare codon tRNAs). Cells were grown at 37°C either in Luria-Bertani medium (LB-broth, Merck) or in Terrific Broth (Merck) from overnight cultures. Protein expression was induced overnight with 0.1 mM isopropyl-β-D-thiogalactopyranoside (IPTG) at 18°C at an OD600 of 0.9 or 3.0, respectively. Cultures were harvested by centrifugation (8,700 × g for 15 min at 4°C) on a Beckman Coulter Avanti J-20 XP centrifuge and then resuspended in lysis buffer (50 mM HEPES, pH 7.5 at 20°C, 500 mM NaCl, 5 mM imidazole, 5% glycerol, and 0.5 mM tris(2-carboxyethyl)phosphine [TCEP]) in the presence of 1:200 (v/v) Protease Inhibitor Cocktail III (Calbiochem). Cells were lysed at 4°C using an EmulsiFlex-C5 high pressure homogenizer (Avestin - Mannheim, Germany) and the DNA was removed by precipitation on ice for 30 min with 0.15% (v/v) of PEI (polyethyleneimine). Lysates were cleared by centrifugation (16,000 × g for 1 hr at 4°C, JA 25.50 rotor, on a Beckman Coulter Avanti J-20 XP centrifuge) and were applied to a nickel affinity column (nickel nitrilotriacetic acid (Ni-NTA) resin, QIAGEN Ltd., 5 ml, equilibrated with 20 ml lysis buffer). Columns were washed once with 30 ml of lysis buffer and then twice with 10 ml of lysis buffer containing 30 mM imidazole. Proteins were eluted using a step elution of imidazole in lysis buffer (50, 100, 150, 2 × 250 mM imidazole). All fractions were collected and monitored by SDS-polyacrylamide gel electrophoresis (Bio-Rad Criterion Precast Gels, 4%–12% Bis-Tris, 1.0 mm, from Bio-Rad, CA. Gel run conditions: 180 V, 400 mA, 55 min in XT MES buffer).
The eluted proteins were treated overnight at 4°C with TEV (Tobacco Etch Virus) protease to remove the hexa-histidine expression tag and were further purified by size exclusion chromatography on Superdex 75 16/60 HiLoad gel filtration columns (GE/Amersham Biosciences) on ÄktaPrime plus systems (GE/Amersham Biosciences). Proteins in 10 mM HEPES, 500 mM NaCl and 5% glycerol were concentrated to 10 mg/ml with Amicon® Ultra (Millipore) concentrators with a 10 kDa molecular weight cut-off (10 MWCO), flash frozen in liquid nitrogen, and stored at −80°C. Protein concentration was estimated using a NanoDrop ND-1000 spectrophotometer (also see Table S2).
Soluble BRD constructs were expressed in LB media (5 × 50 ml) in the presence of 50 μg/ml kanamycin. Cells were grown at 37°C to an optical density (OD600) of about 0.5. Protein expression was induced with 1 mM IPTG overnight at 18°C. Cells were harvested by centrifugation at 4,000 × g for 15 min at 4°C and resuspended in 4 ml of binding buffer (50 mM HEPES, pH 7.5 at 20°C; 500 mM NaCl; 5% glycerol; 5 mM imidazole) supplemented with 0.5 mM TCEP and 1:200 (v/v) Protease Inhibitor Cocktail III (Calbiochem). Cells were disrupted with an ultrasonic processor (SONICS Vibra-Cell, amplitude 60, 10 s ON, 10 s OFF, for 2 min). Lysates were cleared by centrifugation and were applied to Ni-NTA columns (QIAGEN Ltd., 2 ml, equilibrated with 20 ml lysis buffer). Columns were washed once with 30 ml of lysis buffer and then twice with 10 ml of the same buffer containing 30 mM imidazole. Proteins were eluted using a step elution of imidazole in lysis buffer (50, 100, 150, 2 × 250 mM imidazole). All fractions were collected and monitored by SDS-polyacrylamide gel electrophoresis. Fractions containing the recombinant proteins were pooled and concentrated using Amicon® Ultra (Millipore) concentrators (10 MWCO) to a final volume of 1 ml before being loaded onto a NAP-10 column (GE-Healthcare) in order to exchange the buffer to 25 mM HEPES (pH 7.5 at 20°C), 150 mM NaCl and 5% glycerol. Samples were flash frozen in liquid nitrogen and stored at −80°C until used (Table S2).
Three different types of SPOT membrane were utilized in order to probe preferences of BRD recognition sites: the first array contained peptides from all four core histones (H2A, H2B, H3 and H4) covering all possible lysine acetylation sites, with peptide sizes ranging from 10 to 14 amino acids. The second array contained short peptides (11 residues) from human histone 3 exploiting possible posttranslational modifications around each acetyl-lysine epitope, including phosphorylation (on serine and threonine residues), acetylation (on lysine residues) and trimethylation (on lysine residues). The third array covered all possible lysine acetylation sites found on the four core human histones, as well as peptides containing multiple adjacent acetyl-lysine epitopes, employing a microSPOT (μSPOT) technology using a smaller foot-print.
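To make the layout of the second array more concrete, the following Python sketch enumerates 11-residue peptides around each lysine of the histone H3 N-terminal tail. It is only an illustration: the function name `lysine_windows`, the choice to center each window on the lysine, and the clamping of windows at the ends of the sequence are assumptions, not the authors' documented array design. Modified variants (phosphorylated, acetylated, or trimethylated flanking residues) would then be derived from each window.

```python
# N-terminal tail of human histone H3 (residues 1-40, initiator Met removed).
H3_TAIL = "ARTKQTARKSTGGKAPRKQLATKAARKSAPATGGVKKPHR"


def lysine_windows(seq, length=11):
    """Return {site: peptide} with one fixed-length peptide per lysine.

    Each window is centered on the lysine where possible and shifted
    inward when the lysine lies too close to either end of `seq`
    (an assumed convention, used here for illustration only).
    """
    if length > len(seq):
        raise ValueError("window length exceeds sequence length")
    half = length // 2
    windows = {}
    for i, residue in enumerate(seq):
        if residue != "K":
            continue
        # Clamp the start so the window stays inside the sequence.
        start = min(max(i - half, 0), len(seq) - length)
        windows[f"K{i + 1}"] = seq[start:start + length]
    return windows


windows = lysine_windows(H3_TAIL)
for site, peptide in windows.items():
    print(site, peptide)
```

Each returned peptide would correspond to one unmodified parent spot; the combinatorial PTM variants described above are not enumerated here.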
Peptides for the first two arrays were synthesized directly on cellulose membranes (Intavis) using a MultiPep SPOT peptide arrayer (Intavis) and commercially available standard (Intavis) or modified (Bachem) L-amino acid precursors as previously described (Nady et al., 2008). The quality of the synthesized arrays was evaluated as follows. Immunoreactive peptides were identified by incubating the membrane with an anti-His antibody in the absence of any His6-tagged BRD; eight peptides were recognized by this antibody and were excluded from the analysis.
In all cases a general monoclonal primary antibody against acetylated lysine (#9681, Cell Signaling Technology) was used to probe the proper incorporation of acetylated lysine on each membrane, using the protocol provided by the manufacturer for Western blotting. Phosphorylation was probed using a primary antibody against phosphorylated Ser10 (ab47297, Abcam) and Thr11 (ab5168, Abcam), employing an anti-rabbit HRP fragment secondary antibody (Amersham Biosciences).
Membranes were washed 3 × 5 min with PBST (3.2 mM Na2HPO4, 0.5 mM KH2PO4, 1.3 mM KCl, 135 mM NaCl and 0.1% Tween 20, pH 7.4) and were subsequently blocked with 5% milk in PBST overnight at 4°C in order to minimize nonspecific binding of the proteins to the membranes. After 2 washes with PBST (5 min each) followed by a single wash with PBS (3.2 mM Na2HPO4, 0.5 mM KH2PO4, 1.3 mM KCl, 135 mM NaCl, pH 7.4) for 5 min, His6-tagged BRD proteins were added to a final concentration of 1 μM and the membranes were incubated overnight at 4°C in PBS. Each membrane was washed 3 times in PBST, blocked for 1 hr with 10% milk in PBST, and washed again 3 × 5 min with PBST. HRP-conjugated anti-His-tag antibodies (Novagen #71841) were added in 5% milk/PBST solution at a dilution of 1:2000. After 1 hr incubation, membranes were washed 3 × 20 min in PBST. The assay was developed with an ECL kit (Pierce ECL Western Blotting substrate, Thermo Scientific) following the manufacturer's protocol. Chemiluminescence was detected with an image reader (Fujifilm LAS-4000 ver.2.0) with an incremental exposure time of 5 min for a total of 80 min. Intensities of the resulting spots were quantified with the Kodak 1D ver.3.6.2 Scientific Imaging System. Unless otherwise stated, all steps were performed at room temperature.
Membranes were washed 3 × 10 min in water and 2 × 30 min in Stripping Buffer A (6 M guanidinium HCl, 1% Triton X-100), followed by an overnight incubation with Stripping Buffer A and Talon beads at room temperature. The next day each membrane was washed twice with Stripping Buffer B (500 mM imidazole, 500 mM NaCl, 20 mM TRIS-HCl, pH 7.5) for 30 min each, followed by a series of washes with deionized and distilled water (ddH2O) at room temperature and at 60°C. Finally, a series of washes was performed, 10 min each, with ddH2O, 10% TFA (trifluoroacetic acid), ddH2O, 20% EtOH, 50% EtOH and 95% EtOH. Membranes were dried overnight and stored at −20°C until they were reused.
Individual proteins were crystallized in sitting drops at either 4°C or 20°C. Crystals were cryo-protected and flash frozen, and X-ray diffraction data were collected at 100 K on beam line X10SA at the Swiss Light Source (SLS), at Diamond (beam lines I02, I03, I04, I04.1), or on a Rigaku FRE Superbright home source. Diffraction images were indexed and integrated using MOSFLM (Leslie and Powell, 2007), HKL2000 or XDS (Kabsch, 2010b) and data were scaled using SCALA (Evans, 2007), SCALEPACK (Otwinowski and Minor, 1997), or XSCALE (Kabsch, 2010a), respectively. Structures were solved by molecular replacement using PHASER (McCoy et al., 2005) and were refined against maximum likelihood targets using REFMAC (Murshudov et al., 1997). Iterative rounds of refinement were interspersed with manual rebuilding in COOT (Emsley and Cowtan, 2004). Thermal motions were analyzed using TLSMD (Painter and Merritt, 2006) and hydrogen atoms were included in late refinement cycles. Crystallization conditions, data collection and refinement statistics, and PDB accession codes are compiled in Table S5.
Multiple sequence/structural alignments were carried out using STRAP (Gille and Frömmel, 2001) and ICM Pro (MolSoft LLC version 3.7-2c) (Abagyan et al., 1994) and were further manually edited (Figure S1). In the absence of an X-Ray or NMR structure model the PFAM boundaries for each BRD were extended using PSIPRED (version 2) (Jones, 1999) for secondary structure prediction and the secondary structure elements were further used to guide the STRAP/ICM alignment. A phylogenetic tree of the resulting structure based alignment was generated using the ClustalW2 program (Larkin et al., 2007) and is given in Figure 1B. Sequence conservations were visualized using the WebLogo (Crooks et al., 2004) online web server.
Experiments were carried out on a VP-ITC microcalorimeter or an ITC200 (MicroCal, LLC Northampton, MA). All experiments were performed at 10 or 15°C in 50 mM HEPES pH 7.5, 150 mM NaCl. All titrations were conducted using an initial injection of 2 μl followed by 29 identical injections of 8 μl (VP-ITC), or an initial injection of 0.3 μl followed by identical injections of 1 μl (ITC200). The dilution heats were measured in separate experiments and were subtracted from the titration data. Thermodynamic parameters were calculated using ΔG = ΔH − TΔS = −RT ln KB, where ΔG, ΔH and ΔS are the changes in free energy, enthalpy and entropy of binding, respectively, KB is the binding constant, R is the gas constant, and T is the absolute temperature. In most cases a single binding site model, supplied with the MicroCal Origin software package, was employed. Multiple binding events were also confirmed with the software package SEDPHAT (Houtman et al., 2007). Binding constants and thermodynamic parameters are given in Tables S7 (single Kac marks), S8 (multiple marks) and S9 (linker sequences).
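The relation ΔG = ΔH − TΔS = −RT ln KB can be applied directly to a fitted binding constant and enthalpy. The short Python sketch below shows one way to do this; the function name `binding_thermodynamics`, the unit choices (J/mol, M⁻¹, °C), and the example values of KB and ΔH are illustrative assumptions and are not taken from Tables S7 to S9.

```python
import math

R = 8.314462618  # molar gas constant in J/(mol*K)


def binding_thermodynamics(k_b, delta_h, temp_c):
    """Derive free energy and entropy terms from ITC fit parameters.

    k_b     : binding (association) constant in 1/M
    delta_h : binding enthalpy in J/mol
    temp_c  : experimental temperature in degrees Celsius
    Returns (delta_g, t_delta_s, delta_s) in J/mol, J/mol, J/(mol*K).
    """
    if k_b <= 0:
        raise ValueError("binding constant must be positive")
    temp_k = temp_c + 273.15
    delta_g = -R * temp_k * math.log(k_b)   # dG = -RT ln K_B
    t_delta_s = delta_h - delta_g           # from dG = dH - T dS
    return delta_g, t_delta_s, t_delta_s / temp_k


# Hypothetical example: K_B = 1e6 1/M (K_D = 1 uM), dH = -40 kJ/mol, 15 degrees C.
delta_g, t_delta_s, delta_s = binding_thermodynamics(1e6, -40_000.0, 15.0)
print(f"dG = {delta_g / 1000:.1f} kJ/mol, TdS = {t_delta_s / 1000:.1f} kJ/mol")
```

Because ΔG depends only on KB and T, any error in the fitted ΔH propagates entirely into the entropy term, which is worth keeping in mind when comparing weak interactions.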
HEK293 cells were grown in DMEM with 10% FBS and transfected with Brd4-Flag (UniProt: O60885, residues 1-1362, cloned in pcDNA5) vector using GeneJuice (EMD) according to the manufacturer's instructions. The transfected cells were treated with 500 nM (+)-JQ1 or (-)-JQ1 (Filippakopoulos et al., 2010) for 16 hr and collected by scraping. The nucleosome isolation protocol was based on that of Ruthenburg and co-workers (Ruthenburg et al., 2011), with modification at the end of the nuclease treatment. Briefly, the cells were washed in buffer A (10 mM HEPES pH 7.9, 10 mM KCl, 1.5 mM MgCl2, 340 mM sucrose, 10% (v/v) glycerol, protease inhibitors (Roche), 1 μg/ml TSA, 5 mM beta-mercaptoethanol). The cell pellet was resuspended in buffer A with 0.1% Triton X-100 and incubated on ice for 10 min. The nuclei were pelleted, washed twice with buffer A, and resuspended in buffer A to a nucleic acid concentration of 1.2 μg/μl, measured as previously described (Brand et al., 2008). CaCl2 was added to 2 mM, followed by micrococcal nuclease (Worthington) (1 U/50 μg DNA), and the reaction was incubated at 37°C for 10 min. The reaction was stopped by adding 4 mM EGTA on ice. To facilitate the release of digested nucleosomes, NaCl was added to a final concentration of 200 mM and the reactions were spun down at 13,000 rpm for 5 min. The soluble nucleosomes in the supernatant were collected and diluted (1:5) in the IP buffer (100 mM KCl, 5% glycerol, 10 mM TRIS-HCl pH 8.0, 10 mg/ml BSA, protease inhibitors, 1 μg/ml TSA, 5 mM beta-mercaptoethanol). Anti-Flag M2 antibody (Sigma) or normal mouse IgG (Abcam) and protein G Dynabeads (Invitrogen) were added and incubated at 4°C overnight. The reactions were washed 5 times in the IP buffer and eluted with 100 mM TRIS-HCl pH 8.0, 150 mM NaCl, 2 mM EDTA, 5 mM DTT, 1% Triton, 3% SDS, and 15 mM beta-mercaptoethanol.
The immunoprecipitated proteins were run on Tris-Bis PAGE (Invitrogen) and transferred to nitrocellulose membranes (Pall) that were probed with the following antibodies: H4K5ac (Abcam 61236), H4K12ac (Upstate 07-595), H4 (Abcam 7311), H3K9ac (Abcam 12179), H4K16ac (Active Motif 39167), H4K18ac (Cell Signaling Technology 2594) and secondary HRP conjugated antibodies (Cell Signaling Technology).
HeLa cells were grown in DMEM supplemented with 10% fetal bovine serum and antibiotics (penicillin-streptomycin cocktail). To generate cells arrested in M phase, cells at ~50% confluency were first treated with 2 mM thymidine for 24 hr; the media was then removed, the cells were washed with 1X PBS, and fresh media was added for 3 hr to allow release from the thymidine block. Nocodazole (100 ng/ml) was then added for 16 hr to arrest cells in M phase. Rounded mitotic cells were then collected by “mitotic shake-off,” washed with cold PBS, and frozen as a dry pellet at −80°C.
Histones were extracted from asynchronous or M phase cells using high salt as previously described (Shechter et al., 2007) with minor modifications. Briefly, frozen cell pellets from 20 × 15 cm plates of HeLa cells (corresponding to approximately 4 × 10^8 cells) were resuspended in 10 ml of cold extraction buffer (10 mM HEPES-NaOH pH 8.0, 10 mM KCl, 1.5 mM MgCl2, 0.34 M sucrose, 10% glycerol, 0.1 mM PMSF, 10 mM sodium butyrate) with 0.2% NP-40 and incubated on ice for 10 min with occasional mixing. Samples were then centrifuged (6,500 × g, 5 min at 4°C) and the supernatant discarded. An additional 10 ml of cold extraction buffer without NP-40 was used to resuspend the pellet, and the mixture was centrifuged (6,500 × g, 5 min at 4°C). The supernatant was completely removed and the pellet resuspended in 10 ml of no-salt buffer (3 mM EDTA, 0.2 mM EGTA), vortexed intermittently for 2 min and further incubated at 4°C on a nutator for 30 min. The samples were then centrifuged (6,500 × g, 5 min at 4°C) and the supernatant discarded. Histones were extracted by vortexing the pellet in 10 ml of high-salt solubilization buffer (50 mM TRIS-HCl pH 8.0, 2.5 M NaCl, 0.5% NP-40) for 2 min, followed by incubation at 4°C on a nutator for 30 min. DNA was pelleted by centrifugation (16,000 × g, 10 min at 4°C) and the supernatant containing histones transferred to a fresh tube. Three buffer exchanges were performed with a spin filter device of 5,000 Da molecular weight cut-off to reduce salt concentration before storing at −80°C until further use.
500 μg of salt-extracted histones from asynchronous or M phase HeLa cells were mixed with 50 μg of recombinant biotinylated BRD4(1) or BRD4(2) and incubated at 4°C on a nutator for 60 min in the presence of 1 μM (+)-JQ1 or 1 μM (-)-JQ1. Samples were transferred to a fresh tube containing 30 μl of streptactin sepharose beads (IBA BioTAGnology) and incubated for 60 min at 4°C on a nutator. The beads were then washed five times with 1 ml of wash buffer (50 mM HEPES-NaOH pH 8.0, 500 mM KCl, 2 mM EDTA, 0.1% NP-40, 10% glycerol) and two times with 1 ml of no-salt wash buffer (20 mM TRIS-HCl pH 8.0, 2 mM CaCl2). Bound histones were eluted by incubating the beads in 1 ml of 0.5% Trifluoroacetic acid (TFA) at 4°C on a nutator. The supernatants were transferred to fresh tubes and then evaporated to dryness and stored at −80°C.
Dried samples were resuspended in 100 μl of 20 mM TRIS-HCl pH 8.0, and 1 μg of trypsin (Sigma-Aldrich; Singles) was added to each sample. Samples were incubated overnight at 37°C with agitation and supplemented with an extra 0.5 μg of trypsin before another incubation of 4 hr. The samples were then boiled for 10 min to inhibit trypsin activity and subsequently left to cool to room temperature for approximately 15 min. Samples were then diluted to 400 μl with peptide wash buffer solution (50 mM MOPS pH 7.2, 10 mM NaPO4, 50 mM NaCl) and incubated with 30 μl of anti-acetyl-lysine agarose beads (ImmuneChem Pharmaceuticals Inc.) overnight at 4°C. The next morning, the beads were collected by gentle centrifugation and the supernatant (unbound fraction) transferred to a fresh tube for subsequent analysis by mass spectrometry. The beads were washed once with 1 ml of peptide wash buffer solution and once with 1 ml of no-salt wash buffer (20 mM TRIS-HCl pH 8.0, 2 mM CaCl2). Peptides were eluted by incubating the beads with 1 ml of 0.5% TFA for 30 min at 4°C and then transferred to a fresh tube before being evaporated to dryness.
Dried peptides were dissolved in 5% formic acid and analyzed by LC-MS/MS using a NanoLC-Ultra 2D plus HPLC system (Eksigent, Dublin, USA) coupled to a LTQ-Orbitrap Velos (Thermo Electron, Bremen, Germany) equipped with a nanoelectrospray ion source (Proxeon Biosystems, Odense, Denmark). The LTQ-Orbitrap Velos instrument under Xcalibur 2.0 was operated in the data-dependent mode to automatically switch between MS and up to 10 subsequent MS/MS acquisitions. Raw MS and MS/MS spectra were processed using Prohits (Liu et al., 2010). Peptides and proteins were identified using the Mascot software (Matrix Science, London, UK) and the human RefSeq database (version 45, released on February 2nd 2010, containing 34,604 sequences). A mass tolerance of 7 ppm in MS mode and 0.6 Da in MS/MS mode with trypsin specificity was used, and 4 missed cleavage sites were allowed. No fixed modification was selected, but N-acetyl protein, N-pyroglutamine, oxidized methionine, acetylation of lysine and phosphorylation of serine, threonine and tyrosine were searched as variable modifications. Relative quantitation of acetylated peptides using MS spectra was achieved with Proteome Discoverer 1.2 (Thermo Electron, Bremen, Germany). The peak areas for acetylated peptides co-purifying with BRD4(1) or BRD4(2) were calculated first by summing the areas under the curve for a given acetylated peptide in the asynchronous or nocodazole-arrested cells (for a given bromodomain), then by computing the area of all acetylated peptides associated with either bromodomain. The relative peak intensity of each acetylated peptide was then determined by expressing (in percent) the ratio of the area under the peak of a given peptide to the total area under the peaks of all peptides.
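The relative quantitation described above amounts to a simple normalization, which can be sketched in Python as follows. This is an illustrative reading of the procedure, not the Proteome Discoverer workflow itself: the function name `relative_peak_intensities`, the nested-dictionary input format, the peptide labels, and all peak areas are hypothetical, and the total is assumed to be taken over the acetylated peptides recovered with a single bromodomain.

```python
def relative_peak_intensities(areas):
    """Express each acetylated peptide as a percentage of the total area.

    areas: {peptide: {condition: peak_area}} for one bromodomain, where
           conditions are e.g. asynchronous and nocodazole-arrested cells.
    Returns {peptide: percent of the summed area of all peptides}.
    """
    # Step 1: sum the area of each peptide across conditions.
    per_peptide = {pep: sum(by_cond.values()) for pep, by_cond in areas.items()}
    # Step 2: total area of all acetylated peptides for this bromodomain.
    total = sum(per_peptide.values())
    if total <= 0:
        raise ValueError("total peak area must be positive")
    # Step 3: ratio of each peptide area to the total, in percent.
    return {pep: 100.0 * area / total for pep, area in per_peptide.items()}


# Hypothetical peak areas (arbitrary units), for illustration only.
example_areas = {
    "peptide_A": {"asynchronous": 3.0e6, "nocodazole": 1.0e6},
    "peptide_B": {"asynchronous": 0.5e6, "nocodazole": 0.5e6},
}
percentages = relative_peak_intensities(example_areas)
for peptide, pct in percentages.items():
    print(f"{peptide}: {pct:.1f}%")
```

The percentages sum to 100 by construction, so they describe the composition of the bound acetylated fraction rather than absolute recovery; comparing (+)-JQ1 and (−)-JQ1 samples therefore still requires the unnormalized areas.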
The efficiency of purification with each bromodomain was monitored by analyzing the fraction unbound to the anti-acetyl lysine beads; specificity in the interaction was ascertained by analyzing by quantitative MS each of the samples incubated with the inhibitors (-)-JQ1 and (+)-JQ1.
Peptides were synthesized on cellulose membranes using a MultiPep SPOT peptide arrayer (Intavis). His6-tagged BRDs were added to a final concentration of 1 μM, and blots were developed using an ECL kit (Thermo Scientific) following the manufacturer's protocol.
Experiments were carried out on a VP-ITC or an ITC200 microcalorimeter (MicroCal, Northampton, MA, USA). In most cases a single binding site model was employed, supplied with the MicroCal Origin software package.
Individual proteins were crystallized at either 4°C or 20°C, and X-ray diffraction data were collected on beamlines listed in Table S5. Structures were solved by molecular replacement and were refined as described in detail in the Extended Experimental Procedures. Crystallization conditions, data collection, refinement statistics, and PDB accession codes are listed in Tables S3, S4, and S5.
HEK293 cells were grown in DMEM with 10% fetal bovine serum (FBS) and transfected with BRD4-Flag (UniProt: O60885, residues 1–1,362, cloned in pcDNA5) vector using GeneJuice (EMD) according to the manufacturer's instructions. Nucleosome isolation protocol was based on Ruthenburg et al. (2011) with modification as described in the Extended Experimental Procedures.
HeLa cells were grown in DMEM supplemented with 10% FBS and antibiotics (penicillin-streptomycin). Histones were extracted with high salt (Shechter et al., 2007) and incubated with recombinant biotinylated BRDs in the presence of (+)-JQ1 or (−)-JQ1 prior to purification on Strep-Tactin Sepharose beads. After elution of bound histones using trifluoroacetic acid (TFA) and sample lyophilization, trypsin digestion was performed, and trypsin was inhibited. Acetylated peptides were purified using anti-Kac agarose beads (ImmuneChem Pharmaceuticals), and both the unbound fraction and the bound fraction (eluted in TFA) were prepared for mass spectrometry (MS). LC-MS/MS was performed using a NanoLC-Ultra 2D plus HPLC system (Eksigent) coupled to a LTQ-Orbitrap Velos (Thermo Electron) equipped with a nanoelectrospray ion source (Proxeon Biosystems). Spectra were assigned by Mascot (Matrix Science, v2.3) against the human RefSeq database (version 45). Relative quantitation of acetylated peptides was achieved with Proteome Discoverer 1.2 (Thermo Electron). The efficiency of purification with each BRD was monitored by analyzing the fraction unbound to anti-Kac; specificity was ascertained by analyzing the samples incubated with the inhibitor (+)-JQ1.
The SGC is a registered charity (number 1097737) that receives funds from the Canadian Institutes for Health Research, the Canada Foundation for Innovation, Genome Canada, GlaxoSmithKline, Pfizer, Eli Lilly, the Novartis Research Foundation, the Ontario Ministry of Research and Innovation, and the Wellcome Trust. P.F. is supported by a Wellcome Trust Career Development Fellowship (095751/Z/11/Z). T.P. and A.-C.G. are supported by a Global Leadership Round in Genomics and Life Sciences from the Ontario Ministry of Health and Innovation. J.-P.L. is supported by a fellowship from the Natural Sciences and Engineering Research Council of Canada. A.-C.G. and C.H.A. hold Canada Research Chairs in Functional Proteomics and Structural Proteomics, respectively.
The models and structure factors of the reported bromodomain proteins have been deposited in the Protein Data Bank with the PDB accession codes 2NXB (BRD3(1)), 2OO1 (BRD3(2)), 2OSS (BRD4(1)), 2OUO (BRD4(2)), 2RFJ (BRDT(1)), 3HME (BRD9), 3D7C (GCN5L2), 3DAI (ATAD2), 3G0L (BAZ2B), 3GG3 (PCAF), 3DWY (CREBBP), 3UV2 (FALZ), 3IU5 (PB1(1)), 3HMF (PB1(2)), 3K2J (PB1(3)), 3TLP (PB1(4)), 3G0J (PB1(5)), 3IU6 (PB1(6)), 3I3J (EP300), 3HMH (TAF1L(2)), 3UV4 (TAF1(2)), 3LXJ (KIAA1240), 3MQM (ASH1L), 3MB3 (PHIP(2)), 3UVD (SMARCA4), 3NXB (CECR2), 3Q2E (WDR9(2)), 3UV5 (TAF1(1/2)), 3RCW (BRD1), 3UVW (BRD4(1)/H4K5acK8ac), 3UVX (BRD4(1)/H4K12acK16ac), 3UVY (BRD4(1)/H4K16acK20ac), 3UW9 (BRD4(1)/H4K8acK12ac), 3P1C (CREBBP/Kac), 3P1D (CREBBP/N-Methyl-2-pyrrolidone), and 3MB4 (PB1(5)/N-Methyl-2-pyrrolidone).