The ANL superfamily of adenylating enzymes contains acyl- and aryl-CoA synthetases, firefly luciferase, and the adenylation domains of the modular Non-Ribosomal Peptide Synthetases (NRPSs). Members of this family catalyze two partial reactions, the initial adenylation of a carboxylate to form an acyl-AMP intermediate, followed by a second partial reaction, most commonly, the formation of a thioester. Recent biochemical and structural evidence has been presented that supports the use by this enzyme family of a remarkable catalytic strategy for the two catalytic steps. The enzymes use a 140° domain rotation to present opposing faces of the dynamic C-terminal domain to the active site for the different partial reactions. Support for this Domain Alternation strategy is presented along with an explanation of the advantage of this catalytic strategy on the reaction catalyzed by the ANL enzymes. Finally, the ramifications of this domain rotation in the catalytic cycle of the modular NRPS enzymes are discussed.
The results of numerous investigations on luciferase and other acyl adenylate synthetases indicate that large conformational changes occur in the protein when the specific substrates combine at the active site.
W.D. McElroy, M. DeLuca, J. Travis, (1967)
From a comparison of the allosteric manifestations of the overall reaction and partial reaction (b) it appears that the same conformational state (R), showing a greater affinity for CoA, catalyses both the overall reaction and partial step (b), whereas the T-configuration, favoured in the presence of PPi, seems to catalyse the ATP-forming step (reversal of partial reaction a).
J. Bar-Tana and G. Rose, (1968)
These statements were made over 40 years ago about a family of enzymes that has recently garnered much attention. This family of ligases, which now includes acyl- and aryl-CoA synthetases, the adenylation domains of Non-Ribosomal Peptide Synthetases (NRPSs), and firefly luciferase, catalyzes the activation of a carboxylate substrate with ATP to form an acyl adenylate intermediate that is used in a diverse set of second partial reactions. The study of this family of adenylating enzymes has a long history, dating back to work in the 1950s with acyl-CoA synthetases. A decade later, biochemical studies of the reactions catalyzed by different members of this family allowed McElroy et al. (1) as well as Bar-Tana and Rose (2) to suggest that large conformational changes play a role in the catalysis of the complete two-step reaction. Bar-Tana and Rose even suggested that the two partial reactions of the acyl-CoA synthetases might be catalyzed by different conformational states. Forty years later, the structural biochemistry of these enzymes has been thoroughly investigated and it is remarkable how prescient these early statements appear to have been. This review describes the biochemical and structural data for a novel catalytic mechanism used by these enzymes and the chemical requirements of the two-step reactions that benefit from this strategy.
The activation of biological carboxylates as thioesters with CoA has been known for over 60 years. The conversion of acetate to acetyl-CoA was first described by Lipmann (3) using a partially purified bacterial enzyme. The mechanism proceeded through an acetyl-phosphate intermediate, which was then converted to acetyl-CoA. A second enzymatic mechanism for the synthesis of acetyl-CoA was identified in the next decade by Berg (4), who purified from baker's yeast a protein he called aceto-CoA kinase that activated acetate not through an acyl-phosphate intermediate but rather through an acyl-adenylate. This study demonstrated the exchange of 32PPi into ATP in an acetate-dependent manner and showed that the acetyl-AMP intermediate could react with PPi to form ATP or with CoA to form acetyl-CoA and AMP. The formation of acetyl-AMP from ATP and acetate occurred in the absence of CoA, demonstrating the independence of the two partial reactions. Rigorous attempts to separate the active fraction into two distinct enzymes that catalyzed the adenylation and the thioester-forming reactions were unsuccessful, and a single enzyme was proposed to be responsible for the production of acetyl-CoA, AMP, and PPi directly from acetate, ATP, and CoA, proceeding through the acetyl-AMP intermediate. A related medium chain acyl-CoA synthetase was subsequently demonstrated to use a bi-uni-uni-bi ping-pong mechanism (2). Such ping-pong kinetics have since been confirmed for many acyl-CoA synthetases, including enzymes with specificity for acetate (5), malonate (6), long chain fatty acids (7), and more complex aryl acids (8, 9).
The possibility that this adenylating enzyme family contained members beyond the acyl- and aryl-CoA synthetases was raised as early as 1967 by McElroy and colleagues who noted the functional similarities of the acyl-CoA synthetases with firefly luciferase, as well as with the reaction of pantoic acid in pantetheine cofactor biosynthesis and the activation of amino acids by amino acyl-tRNA synthetases (1). As sufficient sequence information became available, several reports (10, 11) noted that the acyl-CoA synthetases and luciferase enzymes shared numerous conserved sequence motifs. The amino acyl-tRNA synthetase and pantothenate synthetase enzymes, while functionally similar, share no sequence homology. In the late 1980s, several multidomain proteins that produce bacterial peptide antibiotics or siderophores were also identified that share similar sequence motifs (12-14). This new class of modular enzymes became known as non-ribosomal peptide synthetases (NRPSs) and formed the third subfamily of this adenylating enzyme superfamily.
We and others have referred to this enzyme family as the “adenylate-forming superfamily” of enzymes or the “acyl-AMP forming family of adenylation enzymes”. Neither of these names is particularly satisfying as other acyl-adenylating enzymes exist that do not belong to this family. Using the description of divergent superfamilies described by Gerlt and Babbitt (15), luciferase, acyl-CoA synthetases, and NRPS adenylation domains comprise a Mechanistically Diverse Enzyme Superfamily. The enzymes share ~20% sequence identity, are structurally homologous, and catalyze different overall reactions, while sharing a conserved mechanistic step, the adenylation partial reaction. Other adenylate forming enzymes, such as the amino acyl-tRNA synthetases (16) or the NRPS-independent siderophore (NIS) adenylating enzymes (17), do not belong to this enzyme superfamily and are simply unrelated enzymes that catalyze a similar overall reaction.
In the interests of providing a clear way to describe the enzyme superfamily that is the focus of this review, I propose a new designation of the “ANL superfamily of adenylating enzymes”. This name is derived from the three main subfamilies, namely the Acyl-CoA synthetases, the NRPS adenylation domains, and the Luciferase enzymes.
Although distinct in the overall reactions catalyzed, the enzymes within all three subfamilies use a two-step reaction in which a carboxylate substrate is first activated by reaction with ATP to form the acyl-adenylate and inorganic PPi (Figure 1). The adenylate, a high energy acid anhydride, provides the activation energy for the second partial reaction. For the acyl-CoA synthetases and NRPS adenylation domains, a pantetheine thiol group attacks the carboxylate carbon, displacing the AMP leaving group, in the second half of the reaction. The hydrolysis of ATP in the first partial reaction therefore creates a higher energy adenylate intermediate that is utilized for the thioester-forming step. In the reaction catalyzed by luciferase, the activated luciferyl-adenylate undergoes an oxidative decarboxylation that results in the formation of an intermediate that subsequently decomposes within the enzyme active site to yield a photon of light. As described below, the unique chemical properties of these reactions have been facilitated by an interesting catalytic mechanism.
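The two partial reactions of the thioester-forming family members can be summarized schematically as follows. The notation is an illustrative shorthand and is not taken from Figure 1: E denotes the enzyme, R-COO⁻ the carboxylate substrate, and R'-SH the pantetheine thiol of CoA or of a carrier protein. The labels (a) and (b) follow the usage in the Bar-Tana and Rose quotation above.

```latex
\begin{align*}
\text{(a)}\quad & \mathrm{E} + \mathrm{R{-}COO^{-}} + \mathrm{ATP}
   \;\rightleftharpoons\; \mathrm{E\cdot R{-}CO{-}AMP} + \mathrm{PP_i} \\
\text{(b)}\quad & \mathrm{E\cdot R{-}CO{-}AMP} + \mathrm{R'{-}SH}
   \;\longrightarrow\; \mathrm{E} + \mathrm{R{-}CO{-}SR'} + \mathrm{AMP}
\end{align*}
```

For luciferase, step (a) is the same with luciferin as the carboxylate, while step (b) is replaced by the oxidative decarboxylation described above.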
Many excellent reviews have been written describing the fascinating modular architecture of the NRPS enzymes (18-21) and these enzyme assembly lines will be described only briefly here. The multi-domain NRPS enzymes generally contain a single module for the incorporation of each amino acid into the peptide product (Figure 2). During synthesis, the amino acid and peptide intermediates are bound to the pantetheine cofactor of an integrated peptidyl carrier protein domain (22, 23). Within each module is an adenylation domain responsible for activating the amino acid substrate and transferring it to the pantetheine cofactor of the neighboring carrier protein domain. Finally, a condensation domain is required in all modules except the first to catalyze peptide bond formation and transfer the peptide from an upstream to a downstream carrier protein domain, increasing the peptide length by a single residue. This assembly line biosynthesis terminates with a thioesterase domain that cleaves, and often cyclizes, the peptide product. Reflecting the diversity of the peptide products that are produced, some NRPS proteins contain additional internal catalytic domains that are responsible for epimerization or N-methylation of the constituent amino acids. The ACV synthetase enzyme (24), for example, is a three-module protein involved in the synthesis of isopenicillin N, a β-lactam antibiotic. The ACV synthetase NRPS produces a tripeptide from α-aminoadipic acid, cysteine, and valine that is then cyclized by the isopenicillin N synthase (25).
The choreography that delivers the substrate bound to the NRPS carrier protein domain to the neighboring catalytic domains is undoubtedly complex and is not clearly defined. Recent studies using alanine-scanning mutagenesis have defined the binding surface used by the E. coli EntB PCP domain for interactions with its functional partners, EntF and EntD (26, 27). Additionally, NMR studies have demonstrated that both the PCP domain and the catalytic domains will undergo local conformational rearrangements during their interactions (28-30). In particular, these studies suggest that the PCP domains are dynamic and can adopt multiple conformations and that the catalytic domains are able to stabilize one of the conformations selectively for a functional interaction. Nonetheless, these conformational changes are limited in scale and it appears from a recent multi-domain NRPS structure (31) that more significant conformational rearrangements are required for the proper delivery of amino acyl and peptide substrates to the different catalytic centers.
NRPS adenylation domains can be subdivided into two classes. Most NRPS adenylation domains are integrated into the catalytic module (Figure 2) and activate and load the amino acid on the pantetheine cofactor of an adjacent carrier domain, referred to alternatively as the thiolation domain and the PCP domain (22). Other NRPS adenylation domains, however, are self-standing and transfer the amino acyl substrate in trans to a separate amino acyl carrier domain. In many cases these isolated adenylation domains load an acyl or aryl capping group that initiates the NRPS peptide (32).
The substrates of the acyl- and aryl-CoA synthetase subfamily of the ANL enzymes are chemically diverse, ranging in size from small acids like acetate and propionate to medium- and long-chain fatty acids and aromatic compounds. The aryl-CoA ligases have been identified in a variety of pathways for the degradation of aromatic compounds, as well as for the biosynthesis of plant metabolites (33).
A subfamily of fatty acyl-AMP ligases (FAALs) has been characterized by Gokhale and colleagues (34, 35). These enzymes share functional features with both the acyl-CoA synthetases and the NRPS adenylation domains. The acyl substrates of these enzymes are fatty acids that are used in the formation of complex lipopeptides, some of which play a role in virulence. Unlike the acyl-CoA synthetases, however, the enzymes of this family transfer the activated fatty acid from the adenylate directly to the pantetheine cofactor of an acyl carrier protein of an NRPS or polyketide synthase cluster (34). In this latter regard, these enzymes share more in common with the NRPS adenylation domains than with the fatty acyl-CoA synthetases. In particular, these enzymes are reminiscent of the self-standing aryl-activating adenylation domains of many NRPS siderophore clusters (36) that activate salicylate or 2,3-dihydroxybenzoate and transfer the aromatic acid to an aryl carrier protein domain (32).
As more members of the adenylate-forming family of enzymes were identified, several groups proposed a number of conserved sequence motifs to be important for catalytic activity. The first identified sequence was a serine-, threonine-, and glycine-rich motif (10). This region was deemed a signature sequence for the ANL enzyme family and was designated Motif I. Two additional regions (Motif II and Motif III) were also identified based on sequence conservation (37). A more detailed comparison of the conserved regions exclusively within the NRPS adenylation domains identified ten conserved regions, named A1-A10 (38). The determination of crystal structures of members of this enzyme family allowed a preliminary understanding of the roles of these conserved motifs in the catalytic residues (Table 1). Interestingly, the conservation of several regions that were located at some distance from the active site of the first structures can now be rationalized on the basis of the domain rearrangements described below.
As of July 2009, there are 47 crystal structures deposited in the Protein Data Bank of members of the ANL adenylating enzyme family (Table 2). These structures represent 16 different proteins and have been crystallized in a variety of liganded states, providing a detailed view of the catalytic strategy used by this enzyme family.
The first crystal structure of a member of the adenylate-forming family of enzymes was of firefly luciferase from P. pyralis (39). This structure identified an overall two-domain architecture with a larger N-terminal domain composed of the first 430 residues and a smaller C-terminal domain of the final 120 residues (Figure 3A). The larger domain contained an αβαβα domain structure with two large eight-stranded β-sheets that surround two α-helices. The N-terminal domain ends with a distorted β-sheet. Following a short disordered loop in the luciferase structure, the C-terminal domain begins with an antiparallel β-sheet that contained two strands, followed by a central three-stranded β-sheet that was surrounded by helices. The residues connecting the two domains form the A8 motif (Table 1) and are collectively referred to as the A8 loop. Luciferase was crystallized in the absence of ligands. The conserved sequence motifs (40) were used to propose a location of the enzyme active site. In particular, the Gly- and Ser-rich motif I was located at the interface between the N- and C-terminal domains. Conti et al. (39) noted that the cleft is likely “too big to accommodate the substrates” and predicted closure of the interface upon substrate binding.
As noted above, early experiments predicted a large conformational change for acyl-CoA synthetases and luciferase (1) on the basis of tritium exchange and thermal inactivation in the presence and absence of ligands. A conformational change was also invoked to explain the stabilization of phenylacetyl-CoA ligase (41). The crystal structure of firefly luciferase thus provided a structural framework to envision this large conformational change. The single structure, however, provided only an initial look at the protein and many additional structural studies were necessary to understand the full conformational mechanism.
The structure of an initiating NRPS adenylation domain, the phenylalanine-activating domain (PheA) of gramicidin S synthetase (GrsA), was determined the following year (42). Importantly, this structure was determined in the presence of the amino acyl substrate phenylalanine and a molecule of AMP (Figure 3B), confirming the predicted location of the active site. The C-terminal domain was rotated by ~90° compared to the orientation seen in the luciferase structure. A universally conserved lysine from the A10 region formed hydrogen bonds to the ribose ring oxygen, the 5′-bridging oxygen, and a carboxylate oxygen of phenylalanine. This suggested a possible catalytic role for this lysine, which was indeed supported by prior biochemical studies of tyrocidine synthetase (40).
These initial structures provided the foundation for a number of studies that investigated the roles of catalytic or substrate specificity residues. In particular, the interaction of the C-terminal lysine from the A10 region (Table 1) was studied by mutagenesis in both luciferase (43) and PrpE, a propionyl-CoA synthetase (44). For example, no activity could be detected for the complete reaction of the K592E mutant of PrpE, and activity was reduced by over four orders of magnitude for the reverse of the adenylation reaction; the rate of production of propionyl-CoA from the propionyl-AMP intermediate, however, was reduced by only a factor of 2. These studies demonstrated a role of this residue specifically in the adenylation partial reaction.
Subsequent to the structural characterization of PheA, a number of additional structures of enzymes in this family were determined that demonstrated a similar tertiary structure and a similar conformational orientation between the N- and C-terminal domains. These structures included DhbE, the self-standing adenylation domain from the bacillibactin NRPS cluster (45), yeast acetyl-CoA synthetase (46), two enzymes that catalyze aryl-CoA synthesis that are involved in the metabolic breakdown of 4-chlorobenzoate (47) and benzoic acid (48), luciferase from L. cruciata (49), and the enzyme DltA from B. cereus, which is involved in the activation of alanine for subsequent alanylation of teichoic acid in cell wall biosynthesis of Gram-positive bacteria (50, 51). These structures were all determined in the absence of ligands, or in the presence of the acyl substrate, the adenylate, or AMP. Notably, none of these structures contained a bound CoA or thiol acceptor for the second partial reaction (Table 2).
Insights into CoA binding were derived from the crystallization of a bacterial acetyl-CoA synthetase (Acs) bound to adenosine-5′-propylphosphate, a non-hydrolyzable mimic of the adenylate intermediate, and CoA (52). This structure, which appeared to show the enzyme poised to catalyze the thioesterification reaction, located the nucleotide of CoA at the surface of the protein with the pantetheine portion of CoA passing through a pantetheine tunnel that runs between the N- and C-terminal domains to enter the mostly buried adenylate binding site.
The most intriguing feature of this structure was that the C-terminal domain of Acs adopted a dramatically different conformation compared to that seen in the prior structures of PheA and DhbE (Figure 3B). The C-terminal domain of Acs packed against the N-terminal domain, forming a more lobular enzyme (Figure 4). Multiple interactions were observed between the two domains, and between the C-terminal domain and the reaction intermediates. In this conformation, the loop containing the A10 lysine residue is 25 Å from the active site. In contrast, the A8 β-sheet that initiates the C-terminal domain was rotated into the active site. The luciferase C-terminal domain, which was also observed in a different conformation than that observed in PheA, did not make any interactions with the N-terminal domain and did not seem to be in a functionally relevant conformation.
The biochemical data implicating the A10 lysine from the C-terminal domain in catalyzing the adenylate-forming reaction specifically (43, 44) and the structural observation of this new conformational state of Acs bound to CoA provided preliminary support for a novel catalytic strategy (52). In this proposed catalytic strategy, the members of this adenylate-forming family would adopt the PheA-like structure to catalyze the adenylation partial reaction. Upon formation of the adenylate and the release of pyrophosphate, the C-terminal domain would rotate to the orientation observed in the bacterial Acs to form a second conformation that would be used to catalyze the thioester-forming reaction. We adopted the term Domain Alternation, which had been used to describe a large-scale domain rearrangement in methionine synthase (53), to describe this mechanism.
The crystal structures of several other members of this enzyme family have since been determined in this conformation. These structures include the long chain fatty acyl-CoA synthetase from T. thermophilus (54), DltA from B. subtilis (55), and an acyl-adenylating enzyme from the methanogenic archaeon Methanosarcina acetivorans (56). Interestingly, while the Acs structure contained CoA, these latter structures did not contain bound coenzyme A; in the case of the fatty acyl-CoA synthetase, CoA was included in the crystallization conditions yet was not bound in the crystal structure.
In the last year, two examples of a single enzyme being crystallized in both conformations have been reported. The 4-chlorobenzoyl-CoA ligase (4CBL) from Alcaligenes sp. AL3007 was the subject of extensive structural and kinetic evaluation, as will be discussed below. As part of this study, the enzyme was trapped both in the adenylate-forming conformation, bound to the adenylate intermediate, and in the thioester-forming conformation, bound to AMP and the product analog 4-chlorophenacyl-CoA (57). More recently, the structure of the human medium chain acyl-CoA synthetase has also been determined in both conformational states (58). These alternate structures demonstrate that the domain rotation is a dynamic feature of the ANL enzymes and does not simply reflect differences in tertiary organization between different superfamily members.
Not surprisingly, crystallization of these conformationally flexible enzymes has been challenging and several of the deposited structures exhibit significant disorder in the C-terminal domain. Indeed, a recent structure of an FAAL (34, 59) required the removal of the C-terminal domain altogether to achieve crystallization. As with all structural studies, the careful selection of appropriate inhibitors can support the crystallization of enzymes trapped in relevant conformations. The use of alkyl phosphate esters (52) or adenosyl sulfamate analogs (49) as mimics of the adenylate intermediate, or a substituted phenacyl-CoA thioether (57) as a mimic of the CoA thioester product has enabled the determination of some of the highest resolution structures of members of this conformationally dynamic enzyme family.
Acs (52, 60), 4CBL (57), DltA (50, 51, 55), and a human medium chain acyl-CoA synthetase (58) have now been studied structurally with numerous ligand complexes or structures of site-directed mutants. These structures give insights into the catalysis of the individual reactions and illustrate residues responsible for substrate binding. The active sites of these proteins will be described as representative members of the family.
The carboxylate substrate binds in a pocket located within the N-terminal domain (Figures 3B, 4). The residues that form the binding pocket are quite diverse, reflecting the differences in substrate specificity between different enzyme family members. In fact, we have recently noted (56) that the core of the N-terminal domain of five structurally characterized acyl-CoA synthetases contains only 14 conserved residues (out of 250). Interestingly, a conserved glycine is observed throughout the superfamily except in acetyl- and propionyl-CoA synthetase, where this glycine is replaced by a tryptophan residue that truncates the acyl binding pocket for the smaller substrates (47).
The amino acyl binding pocket of NRPS adenylation domains is better characterized than the acyl-CoA synthetase pockets. The substrate α-amino group is positioned by a conserved aspartic acid residue that directs the amino acid side chain into a pocket that is chemically complementary to the specific substrate. This feature of the adenylation domain binding pocket has allowed the clustering (61-63) and prediction (64) of specificity of NRPS adenylation domains.
The ATP binding site contains several highly conserved motifs that are present in all family members and perform similar roles in positioning the nucleotide (Figure 5A). The aspartic acid residue of the A7/motif III region, Asp385 of 4CBL, is universally conserved and interacts with one or both ribose hydroxyls. Fifteen residues downstream is the completely conserved Arg residue of the A8 motif, Arg400 in 4CBL, that also interacts with the ribose hydroxyls. An aromatic residue from the A5 motif, Tyr304 in 4CBL, stacks against the adenine base. Two recent structures, DltA and the human medium chain acyl-CoA synthetase (51, 58), illustrate the interactions that occur between protein and the triphosphate moiety. The motif I residues that surround the β- and γ-phosphates of the human medium chain acyl-CoA synthetase (Figure 5B) are well-conserved suggesting that the ATP binding position will be similar in all family members.
The structure of DltA bound to ATP was recently determined and compared to previously characterized structures in the adenylate-forming conformation (51). The authors note that even within this conformation, differences of as much as 40° exist in the orientation of the C-terminal domain. The different orientation causes the invariant A10 lysine residue to interact with different ligand atoms, including the bridging oxygen between the ribose and phosphate and a carboxylate oxygen in PheA (42) and DhbE (45), or a β-phosphate oxygen in the medium chain acyl-CoA synthetase (58) or DltA (51). The authors raise the intriguing possibility that the ANL enzymes adopt a pre-adenylation state bound to ATP and a post-adenylation state upon completion of the adenylation partial reaction. The A10 lysine residue is proposed to track the accumulation of negative charge on the initial attack complex, the transition state for adenylate formation, and finally the pyrophosphate product prior to product release.
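Angular differences in domain orientation such as the 40° quoted above, or the larger rotation between the adenylate-forming and thioester-forming conformations, are typically expressed as the angle of the rotation that relates the two domain orientations. The text does not state how the cited studies computed their angles, so the following is only a generic sketch. It assumes that each C-terminal domain orientation is already available as a 3x3 rotation matrix (for example, from superposing the C-terminal domains after the N-terminal domains have been aligned); the function names and the `rot_z` test input are hypothetical.

```python
import math

def relative_rotation(rot_a, rot_b):
    """Rotation taking orientation a to orientation b, i.e. R_b * R_a^T."""
    return [[sum(rot_b[i][k] * rot_a[j][k] for k in range(3)) for j in range(3)]
            for i in range(3)]

def rotation_angle_deg(rot):
    """Rotation angle (degrees) of a proper 3x3 rotation matrix, from its trace."""
    trace = rot[0][0] + rot[1][1] + rot[2][2]
    # Clamp to guard against round-off pushing the cosine outside [-1, 1].
    cos_theta = max(-1.0, min(1.0, (trace - 1.0) / 2.0))
    return math.degrees(math.acos(cos_theta))

def rot_z(angle_deg):
    """Hypothetical test input: a rotation about the z axis by angle_deg."""
    a = math.radians(angle_deg)
    return [[math.cos(a), -math.sin(a), 0.0],
            [math.sin(a), math.cos(a), 0.0],
            [0.0, 0.0, 1.0]]

if __name__ == "__main__":
    # Two made-up orientations that differ by 40 degrees about a common axis.
    rel = relative_rotation(rot_z(10.0), rot_z(50.0))
    print(f"relative rotation: {rotation_angle_deg(rel):.1f} degrees")
```

The trace formula returns only the magnitude of the rotation (0° to 180°); identifying the rotation axis or hinge residues would require additional analysis.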
The CoA binding site can be divided into two regions, a nucleotide binding site that is located on the surface of the protein (Figure 5C), and the previously mentioned pantetheine tunnel that runs between the two domains. The tunnel contains a β-sheet like interaction that occurs between the conserved glycine on the A8 loop and the β-alanine group of the pantetheine.
In the three protein structures bound to CoA in a productive conformation, the CoA nucleotide moiety is located in different positions (52, 57, 58). The adenine group of the CoA in the structures of the two acyl-CoA synthetases interacts most closely with the N-terminal domain, while the nucleotide moiety of the CoA ligand in 4CBL binds primarily to the C-terminal domain. In 4CBL, it is sandwiched between two aromatic side chains that are well conserved within other 4CBL enzymes (9) but not in the larger subfamily of acyl-CoA synthetases.
Catalysis of the thioester-forming reaction likely requires the deprotonation of the thiol of CoA to increase the nucleophilicity for attack on the carboxylate in the adenylate intermediate. Surprisingly, no conserved residues were identified in the area surrounding the CoA thiol. Instead, the enzymes may use the helix dipole of the helix that starts with the A4 aromatic residue to provide a positive dipole that could reduce the pKa of the thiol (9).
The observations of the distinct structures support the domain alternation hypothesis; however, kinetic studies have been equally important for understanding the catalytic mechanism. In particular, these studies support the involvement of residues from the opposite faces of the C-terminal domain in the distinct partial reactions and identified the catalytic advantage that is gained by the domain alternation.
Following the original experiments on luciferase and PrpE (43, 44), studies have since been performed with all three subfamilies of the acyl-AMP forming adenylating enzymes that support the hypothesis that the two C-terminal conformations are used for the different partial reactions. Mutagenesis studies with luciferase (65), Acs (60), and the EntE self-standing adenylation domain (66) all demonstrate that residues on the A8 loop are important specifically for the thioester-forming partial reaction. As the A8 loop is rotated into the active site only in the thioester-forming conformation, this supported the use of the conformation observed with Acs for the thioester-forming reaction.
To analyze more rigorously the effects of mutations on this enzyme family, the 4CBL enzyme was extensively mutated and subjected to kinetic analyses (9) that complement the structural characterization of the two conformations (57). Mutations were constructed in over 20 different residues of the 4CBL protein and were analyzed by steady-state kinetics. Rate constants were determined for the individual partial reactions for a subset of 10 of the mutant enzymes. The mutations were studied for their effects on k1 and k2, the forward rate constants for the adenylation and thioesterification steps of the reaction, respectively. Not surprisingly, mutations in the ATP binding pocket had dramatic effects on k1. Mutations in two residues that are located on the “thioester-forming face” of the C-terminal domain that stack against the adenine ring of CoA (Trp440 and Phe473) showed no effect on the adenylation partial reaction but 200-fold decreases in k2, the rate of the thioester-forming partial reaction.
The most interesting results from this study relate to His207 and Glu410. His207 is conserved as an aromatic residue as part of the A4 motif (38). The precise identity of this residue correlates with the subfamily within the ANL family, with NRPS adenylation domains most often containing a phenylalanine, small chain acyl-CoA synthetases containing a tryptophan, and larger chain acyl-CoA synthetases and luciferase enzymes containing a histidine. The His207 residue of 4CBL exhibits a side chain torsional rotation in the two observed crystal structures. In the adenylate-forming conformation, the side chain is rotated towards the acyl carboxylate with a side chain χ1 torsion angle of −166°. In the thioester-forming conformation, the χ1 torsion angle is −60°, with the side chain rotated away from the carboxylate (Figure 6). Upon further consideration, this initially minor observation provided one explanation for the catalytic advantage of the domain alternation strategy.
In the thioester-forming conformation of 4CBL, His207 forms a hydrogen bond with Glu410, which is located on the A8 loop and follows a universally conserved glycine residue (Gly409 in 4CBL). The His207 mutation resulted in a ~100-fold decrease in both the k1 and k2 rate constants. The Glu410 mutation had no effect on k1 but caused an 800-fold decrease in k2 despite the fact that this residue does not contact any of the reacting ligands. These results were interpreted to demonstrate that His207 played a role in both partial reactions, while Glu410 played a role in only the thioester-forming step. The interaction between His207 and Glu410 in the second conformation (Figure 5C) was seen to be highly important for stabilizing this state of the enzyme (9).
The A8 motif contains a conserved hinge residue, most commonly an aspartic acid but also commonly present as a lysine. This residue undergoes main chain torsion angle rotations that are responsible for the majority of the difference between the two conformations. Indeed, the main chain dihedral angles of the neighboring residues are not altered between the two studies (67). The dynamics of this hinge residue have been explored structurally and kinetically by mutation of the 4CBL hinge residue, Asp402, to a proline residue (67). The ϕ/ψ angles of the proline residue allow the enzyme to adopt the adenylate-forming conformation; indeed, the crystal structure of the D402P mutant showed that the overall fold of the enzyme was nearly identical to the wild-type enzyme in conformation I. The restraints imposed by the proline ring, however, prevent the enzyme from easily transitioning from conformation I to the thioester-forming conformation. Kinetic measurements showed that while the rate of the adenylation partial reaction was reduced ~3-fold, the rate constant for the thioester-forming reaction was reduced by four orders of magnitude.
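To put the fold decreases reported for the 4CBL mutants on a common scale, they can be converted into apparent changes in activation free energy using the standard transition-state-theory relation ΔΔG‡ = RT ln(k_wt/k_mut). This conversion is an illustration only and is not an analysis taken from the cited studies. The temperature (25 °C), the 10-fold cutoff used to label a step as impaired, and the dictionary labels are assumptions; the fold values are the approximate figures quoted in the text, with "no effect" entered as 1.

```python
import math

R_KCAL = 1.987e-3  # gas constant in kcal/(mol*K)
TEMP_K = 298.15    # assumed temperature (25 °C); not stated in the text

# Approximate fold decreases in (k1, k2) as quoted in the text.
# k1: adenylation step; k2: thioester-forming step; 1.0 means "no effect".
FOLD_DECREASE = {
    "W440 / F473 (CoA adenine stack)": (1.0, 200.0),
    "H207 (A4 aromatic residue)": (100.0, 100.0),
    "E410 (A8 loop)": (1.0, 800.0),
    "D402P (A8 hinge)": (3.0, 1.0e4),
}

def ddg_kcal(fold, temp_k=TEMP_K):
    """Apparent barrier increase (kcal/mol) for a fold decrease in a rate constant."""
    return R_KCAL * temp_k * math.log(fold)

def affected_step(fold_k1, fold_k2, threshold=10.0):
    """Label the impaired partial reaction; the threshold is an arbitrary cutoff."""
    hit1, hit2 = fold_k1 >= threshold, fold_k2 >= threshold
    if hit1 and hit2:
        return "both"
    if hit2:
        return "thioester-forming"
    if hit1:
        return "adenylation"
    return "neither"

if __name__ == "__main__":
    for name, (f1, f2) in FOLD_DECREASE.items():
        print(f"{name}: ddG(k1) = {ddg_kcal(f1):.1f}, ddG(k2) = {ddg_kcal(f2):.1f} "
              f"kcal/mol; impaired step: {affected_step(f1, f2)}")
```

On this scale, a mutation that selectively slows k2 by several hundred-fold or more corresponds to a barrier increase of several kcal/mol for the thioester-forming step with little or no change for adenylation, which is the pattern the domain alternation model predicts for residues on the thioester-forming face and the hinge.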
Careful consideration of the structural and kinetic results provides insight into the role of the domain movement in the catalytic cycle. The side chain torsion of the A4 aromatic residue is conserved in the homologous crystal structures. In the adenylate-forming conformation, the A4 motif aromatic residue is located close to the carboxylate carbon where it positions this moiety to attack the α-phosphate and displace PPi. The χ1 side chain torsion angle for the aromatic residue is −176±6° in six members of the family that were crystallized in the adenylate-forming conformation bound with the adenylate intermediate or with ligands that mimic this state. Three exceptions were observed: the yeast Acs (46), human medium chain acyl-CoA synthetase (58), and the SrfA-C multi-domain NRPS (31), all of which lack both AMP and the acyl substrate. Thus it appears that when the acyl-adenylate is present, the aromatic residue from the A4 motif is directed at the carboxylate. In contrast, in the thioester-forming state, the A4 aromatic residue is rotated out of the active site and exhibits an average χ1 torsion angle of −59±14° in six different structures.
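One practical point when averaging torsion angles clustered near ±180°, as in the adenylate-forming set, is that a plain arithmetic mean fails across the wraparound. The sketch below shows a circular mean and a wrapped spread. The example angles are invented for illustration and are not the values measured in the structures cited above.

```python
import math


def circular_mean_deg(angles_deg):
    """Mean direction of a set of angles, returned in (-180, 180]."""
    s = sum(math.sin(math.radians(a)) for a in angles_deg)
    c = sum(math.cos(math.radians(a)) for a in angles_deg)
    return math.degrees(math.atan2(s, c))


def circular_spread_deg(angles_deg):
    """RMS deviation from the circular mean, with each deviation
    wrapped into [-180, 180) before squaring."""
    m = circular_mean_deg(angles_deg)
    devs = [((a - m + 180.0) % 360.0) - 180.0 for a in angles_deg]
    return math.sqrt(sum(d * d for d in devs) / len(devs))


# Invented chi1 values straddling the +/-180 boundary.
example = [-170.0, 178.0, -176.0, 174.0]

if __name__ == "__main__":
    naive = sum(example) / len(example)  # 1.5, which is misleading
    print("naive mean:", naive)
    print("circular mean:", circular_mean_deg(example))
    print("spread:", circular_spread_deg(example))
```

For a tight cluster far from the boundary, such as angles near −60°, the circular and arithmetic means agree closely, so the distinction matters mainly for the trans rotamer.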
Examining the structures of the family members that contain CoA provides an explanation for this necessary rotation. The pantetheine tunnel is obstructed by the A4 aromatic residue in the side chain conformation observed in the adenylate-forming conformation (Figure 6). The rotation of the side chain allows the pantetheine group access to the carboxylate. The domain rotation brings the A8 loop into the active site where it provides an environment that promotes the rotation of the A4 aromatic residue to the −60° side chain orientation. The specific environment that the A8 residues provide depends on the identity of the A4 residue. In proteins that contain an A4 histidine, a glutamic acid side chain is most commonly present on the A8 loop to form a hydrogen bond with the histidine imidazole group. In proteins that contain a phenylalanine or tryptophan A4 residue, the A8 loop more commonly contains a tyrosine or histidine that promotes the hydrophobic environment.
Thus, one role of the domain alternation appears to be to provide a new environment for the aromatic residue of the A4 motif to remove it from the pantetheine tunnel and allow access to the adenylate intermediate. Associated with this movement in the 4CBL enzyme, there are numerous residues on the thioester-specific face of the C-terminal domain that are involved in binding the CoA nucleotide (57). Two hydrophobic residues sandwich the nucleotide adenine ring. Interestingly, the two other protein structures containing CoA, the bacterial Acs (52) and the human medium chain acyl-CoA synthetase (58), both bind to CoA in a manner that is different from 4CBL and from each other. In both of these structures the CoA nucleotide does not interact with the mobile C-terminal domain.
Considering the chemical requirements for the partial reactions performed by this enzyme family, one can understand the role that domain alternation plays in the catalytic mechanism. The adenylation reaction requires the nucleophilic attack of the negatively charged carboxylate from the acyl substrate on the negatively charged α-phosphate of ATP, displacing PPi. To accomplish this reaction, the enzyme must properly orient the acyl substrate. The enzyme thus appears to use the A4 aromatic residue to constrict the substrate binding pocket, properly positioning the carboxylate for nucleophilic attack on the α-phosphate. This required feature of the adenylation active site becomes a hindrance to the second partial reaction. In the thioester-forming reaction, the thiol of the CoA or pantetheine group must be able to approach the carboxylate carbon to displace the AMP leaving group and form the thioester linkage. Domain alternation provides the A4 residue with an appropriate new environment to induce its rotation out of the active site.
The domain rotation also creates the pantetheine tunnel, appropriate to bind the thiol substrate. In all three CoA bound structures, the backbone residues on the A8 loop, the universally conserved glycine and the residue preceding it, hydrogen bond to the amide nitrogen of the β-alanine moiety of the pantetheine. Additionally, as noted above, the C-terminal domain also can contain residues that are involved in binding the CoA nucleotide (57).
The structures of these proteins also provide insights into the timing of the domain rotation with respect to the catalytic cycle. Recent structures of members of the ANL family bound to ATP (51, 58) confirm an earlier proposal (57) that the β- and γ-phosphates would bind in a cavity filled by the A8 loop in the thioester-forming conformation. This prediction of the binding position of the phosphates assumed the inline displacement of the PPi upon nucleophilic attack of the carboxylate on the α-phosphate. The steric clash between the A8 loop and the PPi binding pocket dictates that PPi must be released prior to the domain rotation from the adenylate-forming to the thioester-forming conformation. Additionally, the rotation of the A4 aromatic residue suggests that the pantetheine group cannot bind productively to the enzyme in the adenylate-forming conformation. The aromatic side chain occludes the thiol of the pantetheine from approaching the active site and suggests that the domain rotation to the thioester-forming conformation must precede binding of CoA.
Much structural and biochemical evidence exists for the domain alternation hypothesis in the acyl-CoA synthetases. Limited evidence supports a role for the A8 loop specifically in the thioester-forming reaction of the NRPS adenylation domains (66). Very recently the crystal structure of a four-domain NRPS was determined (31). The 1274 residue SrfA-C protein contains a full NRPS module organized as condensation, adenylation, peptidyl carrier, and thioesterase domains. This remarkable structure illustrated that the adenylation domain was positioned in a conformation that was similar, though not identical, to the adenylate-forming conformation (Figure 7A). The C-terminal domain is rotated away from the N-terminal domain by ~40° compared to the PheA structure, resulting in a more open active site. The active site of the adenylation domain contains the amino acyl substrate leucine, but does not contain AMP.
The PCP domain of the SrfA-C protein is positioned to interact productively with the upstream condensation domain (Figure 7B). The serine residue on which the pantetheine would be placed was mutated to an alanine to produce homogeneous apo-protein. Nonetheless, the serine residue is close enough to the condensation domain active site that the structure observed is likely the conformation used in the condensation domain reaction. In contrast, the PCP domain is not positioned where it could donate the pantetheine arm to either the adenylation domain or the thioesterase domain. The authors note that a conformational rearrangement may be required to reposition the PCP to interact with the alternate catalytic sites. The C-terminal domain rotation of the adenylation domain was suggested as an attractive candidate to play a role in the progression of the nascent peptide from the active site of one catalytic domain to the next. Extensive interactions exist between the condensation domain and the N-terminal sub-domain of the adenylation domain, which may serve as a “foundation” for each module. Upon completion of the adenylation partial reaction, the rotation of the C-terminal sub-domain of the adenylation domain could be used to transport the pantetheine cofactor into the adenylation domain active site where it would become amino acylated in the thioesterification partial reaction using the conformation observed in Acs. Release of the loaded PCP domain, the thioester product of the adenylation domain complete reaction, would accompany a rotation of the adenylation C-terminal sub-domain back to the conformation observed in PheA or SrfA-C. This would enable the delivery of the aminoacylated pantetheine cofactor to the upstream condensation domain for peptide bond formation. The conformational mechanism to direct the substrate to the downstream thioesterase domain remains to be determined.
This required domain rearrangement was modeled in a recent structural report of DltA (55). This structure was determined in the thioester-forming conformation bound to AMP. Extensive manual modeling was performed to generate a potential cycle of the DltA-catalyzed reaction (55). The cycle was initiated in the unliganded apo state, in which the C-terminal domain was in the open orientation seen in the original luciferase structure (39). The C-terminal domain of DltA was then modeled into the orientation observed in PheA to mimic the adenylation partial reaction. The orientation of the triphosphates of modeled ATP is consistent with the subsequently determined structure of the human medium chain Acs bound to ATP (58). The structure of DltA in the thioester-forming conformation was also used to model the interaction of DltA with the carrier protein DltC using the pantetheine group from the structure of Acs (52). The other constraint used in modeling the interaction derived from the distance between the C-terminus of the adenylation domain and the N-terminus of the carrier protein, which we had suggested to be a maximum of 20-25 Å if the peptide linker were fully extended (66). This value was determined by comparison of the sequences of the linker joining the adenylation and PCP domains in multidomain NRPSs. In the structure of the SrfA-C multi-domain NRPS, the distance between the termini is 14 Å.
Crystallographic studies of multi-domain NRPS enzymes have been challenged by the difficulty in creating a conformationally uniform population of protein molecules. The SrfA-C structure does show the overall architecture of a complete termination module of an NRPS (31); however, it was necessary to mutate the phosphopantetheine binding site to obtain a uniform apo population of protein. Mutation of the hinge residue of the adenylation domains is one way to reduce the conformational flexibility of these large multi-domain enzymes. The structure of the D402P mutant of 4CBL demonstrates that the proline mutation forces 4CBL to adopt the adenylate-forming conformation (67). A similar mutation to the hinge of NRPS adenylation domains may be a useful tool for reducing the conformational flexibility of adenylation domains and may allow the crystallization of larger, multi-domain NRPS proteins.
The domain alternation hypothesis is presented as a strategy that enzymes of the ANL superfamily have adopted that allows them to catalyze the two-step adenylation and thioesterification reactions. As other ligases use an adenylation step to activate an acyl substrate, one might ask if other enzymes use a similar mechanism to stabilize the adenylation partial reaction and then allow the nucleophilic displacement of AMP in a second step.
A detailed description of other adenylate-forming ligases, including amino acyl-tRNA synthetases (16, 68), NIS synthetases (17, 69), and ubiquitin ligases (70), is beyond the scope of this review. An analysis of the structures of multiple members of these families at different steps along the reaction coordinate identifies, in some cases, limited domain movements; however, the domain alternation strategy described for the ANL family of enzymes seems unique.
Why then have the ANL enzymes adopted the domain alternation catalytic strategy when other unrelated enzymes apparently accomplish similar chemistry without the dramatic conformational change? In these ligases, the energetically difficult step in the reaction is the initial nucleophilic attack of the carboxylate substrate on the α-phosphate of ATP. The ATP is held firmly in place through specific interactions with the triphosphate, the ribose hydroxyls, and the adenine base. The challenge to the ligase then is to direct the carboxylate substrate to the α-phosphate to promote the adenylate formation. It is reasonable to propose that the earliest members of the ANL enzyme superfamily were acyl-CoA synthetases that use chemically uninteresting fatty acids. Because the acyl substrates did not contain additional groups that the enzyme could use to position the carboxylate, the ANL enzymes evolved the A4 aromatic residue to position the carboxylate substrate and use domain alternation to allow access to this atom in the thioester-forming reaction.
The enzyme ligases mentioned above do not require a domain alternation strategy because their carboxylate substrates (amino acids in amino acyl-tRNA synthetases, di- or tricarboxylate compounds like citrate or α-ketoglutarate in the NIS synthetases, or protein molecules for ubiquitin ligases) contain numerous functional groups that can provide specific binding interactions between the adenylating enzyme and the nucleophilic substrate. The presence of these groups obviates the need to tightly surround the carboxylate carbon, leaving this atom accessible for attack in the second partial reaction. Even the glycyl-tRNAGly synthetase enzyme (71), which has a relatively simple amino acyl substrate, uses multiple interactions, including a hydrogen bond to the glycine α-proton, to position the carboxylate substrate precisely. These multiple interactions leave the glycine carboxylate carbon open for attack by the tRNA acceptor chain in the second partial reaction.
The importance of dynamics and conformational changes to proteins is now well established and allows enzymes to shield reactive intermediates and to induce the alignment of substrates and reactive catalytic groups from the protein. Several examples of large scale domain rotations have been described that are distinct from the simple closing of catalytic loops that are often on the order of 10-20°. Generally, conformational changes that involve domain rotations larger than 50° are used to transport a substrate or intermediate between active sites. What makes the domain alternation of the ANL enzyme family unique is that this strategy allows the enzyme to present two different faces of a single protein domain to a single active site that catalyzes both reactions.
The extensive structural and functional investigations of the ANL enzyme family over the last decade have not only identified this catalytic strategy but have also explained the catalytic advantage that is derived from the use of the domain alternation. This is yet another example of the ways that enzymes continue to fascinate us with the ability to use unpredictable mechanisms to catalyze challenging chemical reactions. That this understanding explains observations first made over 40 years ago is particularly satisfying and points once again to the value of the complementary approaches of detailed kinetic and structural investigation.
Finally, the study of multiple members of this enzyme family has provided tremendous insights into the catalytic cycle of the modular NRPS proteins. These assembly line proteins require the transfer of substrates between different catalytic domains and the recent structural advances demonstrate that the pantetheine cofactor is not sufficiently long to act solely as a swinging arm to transport the substrates. Instead, it appears that coordinated conformational movements are required to carry out this elaborate dynamic cycle. The domain alternation strategy of the NRPS adenylation domains is likely a necessary component of this modular protein family that ensures proper delivery of the bound intermediates to the catalytic domains.
I am very grateful to Drs. Debra Dunaway-Mariano and Bruce Branchini for helpful comments and discussion throughout the course of our study of the ANL family of adenylating enzymes. I would also like to thank the members of my laboratory, Albert Reger, Eric Drake, Manish Shah, Jesse Sundlov, and Carter Mitchell, for their work on this project and their contributions made to the understanding of these enzymes.
†Work from my laboratory was supported in part by NIH grant GM-068440.