This chapter describes structural and associated enzymological studies of polyketide synthases, including isolated single domains and multidomain fragments. The sequence–structure–function relationship of polyketide biosynthesis, compared with homologous fatty acid synthesis, is discussed in detail. Structural enzymology sheds light on sequence and structural motifs that are important for the precise timing, substrate recognition, enzyme catalysis, and protein–protein interactions leading to the extraordinary structural diversity of naturally occurring polyketides.
The past decade has witnessed significant advances in PKS structural biology for different types of PKSs that help visualize polyketide biosynthesis at all stages, including chain initiation, elongation, reduction, cyclization, and chain termination. Using x-ray crystallography or NMR, these studies help correlate PKS three-dimensional structures with substrate specificity and also help in elucidating the sequence–function–structure relationships that predict product outcome. With the recent publication of the striking porcine FAS crystal structure, we summarize advances in PKS structural biology during the past decade, and compare different types of FASs and PKSs, as well as their individual enzyme domains.
The major methods used to determine PKS protein structures are NMR and x-ray crystallography. With the current advances in NMR instrumentation, software, and pulse sequence development, the upper limit of protein molecular weight for a given target can be as high as 80 kDa (Redfield, 2004; Zeeb and Balbach, 2004). Various pulse sequences offer a further powerful probe for protein conformational change, as well as residues that are important for protein–protein interaction and protein-ligand binding (Redfield, 2004; Zeeb and Balbach, 2004). The application of these NMR methodologies to PKS enzymology is presented in section 4 on the structure of the acyl-carrier protein (ACP). The general methodology for x-ray crystallography includes protein expression/purification, crystallization, crystal harvesting, x-ray diffraction, data processing, structure solution, model refinement and model validation. While mammalian fatty acid synthase (FAS) architecture can be visualized by electron microscopy (EM) (Asturias et al., 2005), PKS flexibility has resulted in highly variable conformations as detected by electron microscopy (Grant Jensen, personal communication), and such flexibility may also contribute to the relative difficulty in obtaining PKS megasynthase crystals (as compared to crystallizing the mammalian FASs). Nevertheless, the majority of single and didomain PKS fragments can be cloned into standard expression vectors (such as pET vectors), expressed in large quantity (up to 200 mg/l culture), and purified to more than 95% purity using Ni-affinity chromatography. The pure single and didomain PKS fragments can then be crystallized using standard vapor diffusion methods (Derewenda, 2004; McPherson, 2004; Rupp and Wang, 2004), followed by crystal harvest in cryoprotectant and cooling in liquid nitrogen (Pflugrath, 2004). Robotic automation of crystallization has greatly accelerated the discovery of initial crystal leads (Bard et al., 2004). 
Due to the high structural homology to FAS domains, many PKS domains have been solved by molecular replacement, using programs such as CNS (Brunger et al., 1998) or the CCP4 suite (Xx, 1994), although heavy-atom methods are also applied using programs such as SOLVE (Terwilliger, 2004). Protein models are then built and refined by programs such as Coot (Emsley and Cowtan, 2004), further refined by programs such as CNS (Brunger et al., 1998) or the CCP4 suite (Xx, 1994), and verified by programs such as PROCHECK (Laskowski et al., 1993). As discussed in detail in Section 4, the conformational flexibility of many PKS domains renders the binding of enzyme inhibitors or substrates very useful to stabilize protein conformation for crystallization and improve diffraction data quality. However, the reactivity of these PKS enzyme domains sometimes also makes it difficult to detect electron density for the bound substrate. The above issues in the methodology of PKS structural enzymology, as well as the application and subsequent outcome of these methodologies, are discussed in detail in Sections 3 and 4.
The fatty acid synthase (FAS) is a multidomain protein complex (Maier et al., 2006) consisting of seven conserved protein domains (MAT, KS, KR, DH, ER, ACP and TE) that catalyze more than 50 reactions en route to the final fatty acid product. FASs can be classified as type I or type II (Fig. 2.1A). The crystal structures of all type II FAS domains have been solved, including the type II KS, MAT, KR, ER, DH, ACP, and TE domains (Leesong et al., 1996; Olsen et al., 1999; Price et al., 2001; Serre et al., 1995), and White et al. have provided an excellent review of type II FAS structural biology (White et al., 2005). The crystal structure of a type I FAS TE domain, as well as the NMR structure of a type I FAS ACP domain, have also been reported (Chakravarty et al., 2004) and reviewed (Smith and Tsai, 2007). The recently published full-length mammalian type I FAS, solved to 3.2 Å, has greatly expanded our knowledge about the complicated domain–domain interactions in the megasynthase (Maier et al., 2008) (Fig. 2.1B, C), showing a homodimer that confirms the head-to-head model based on electron microscopy (Asturias et al., 2005) and biochemical results (Joshi et al., 1997, 1998). The porcine FAS is separated into two portions: the lower condensing portion which contains the KS and MAT domains, and the upper chain-modifying portion which contains the DH, ER, and KR domains (Fig. 2.1B, C). Two additional nonenzymatic domains, termed ψMe (an inactive methyltransferase) and ψKR (a truncated KR), lie at the periphery of the upper portion (Fig. 2.1B, C). The central core of the X-shaped architecture of type I FAS consists of KS, DH, and ER domains from both monomers, with an extensive dimer interface between KS, DH, and ER domains. The dimer is held together by KS–KS, DH–DH and KS–DH interactions. 
The head-to-head arrangement also implies that the ACP in either monomer (A or B) can interact with active sites of both monomers, consistent with previous biochemical studies (Joshi et al., 1997; Witkowski et al., 2004). The DH–KS interaction connects the top and bottom of the X-shaped architecture, and strongly suggests that the DH fold is important in maintaining FAS architecture. The AT and KR domains extend from the bottom and top of the protein, respectively, to form two asymmetric reaction chambers, one on either side of the central KS–DH–ER core. The flexible linker regions between KS–AT, AT–DH, DH–ER and ER–KR (Fig. 2.1C) facilitate the opening and closing of each reaction chamber, thus allowing the ACP-bound fatty acyl intermediate access to the active site of each domain. Although there is no solid proof, the type II FAS is proposed to adopt a similar X-shaped architecture (Smith and Tsai, 2007) due to the high degree of conservation between type I and II FAS domains.
Based on domain architecture, there are at least four distinct PKS types: modular type I, iterative type I, type II, and type III PKSs (Shen, 2003). Extensive progress has been made on the structural biology of DEBS (the erythromycin PKS) (Khosla et al., 2007), including crystal structures of KS3–AT3 (Tang et al., 2007), KS5–AT5 (Tang et al., 2006), KR1 (Keatinge-Clay and Stroud, 2006), DH4 (Keatinge-Clay, 2008), and TE (Tsai et al., 2001), and an NMR structure of ACP2 (Alekseyev et al., 2007) (the number following the domain name abbreviation indicates the module number), as well as KR1 of the tylosin PKS (Keatinge-Clay, 2007) and the TE domain from the pikromycin PKS (PIKS) (Akey et al., 2006; Giraldes et al., 2006; Pan et al., 2002; Tsai et al., 2002). An NMR structure of the intermodular linker region in DEBS has also been reported (Weissman, 2006). For the type II PKS, extensive progress has been made on the structural biology of Streptomyces coelicolor MAT (Keatinge-Clay et al., 2003), actinorhodin (act) KS/CLF (Keatinge-Clay et al., 2004), act KR (Hadfield et al., 2004; Korman et al., 2004, 2008), the R1128 priming ketosynthase ZhuH (Pan, et al., 2002; Witkowski et al., 2004), and the NMR structure of holo and apo act ACP (Crump et al., 1997; Evans et al., 2008), frenolicin (fren) ACP (Li et al., 2003), and oxytetracycline (otc) ACP (Findlow et al., 2003). The crystal structures of the tetracenomycin (tcm) ARO/CYC, the whiE ARO/CYC, and the ZhuI ARO/CYC have been solved (Ames et al., 2008). Three fourth-ring cyclase structures have also been reported that are not included in this chapter (Kallio et al., 2006; Sultana et al., 2004; Thompson et al., 2004). For iterative type I PKSs, the unpublished crystal structures of the PksA PT and TE domains are discussed briefly. The linker regions in type I PKSs have been reviewed elsewhere (Smith and Tsai, 2007; Weissman, 2006).
The crystal structures of DEBS AT3 and AT5, S. coelicolor MAT, E. coli MAT (FabD) and the porcine FAS AT domain have been reported (Keatinge-Clay et al., 2003; Serre et al., 1995; Tang et al., 2006, 2007). All five structures are highly similar, with an RMSD of 1.3 to 1.9 Å. The AT structure has two domains: the larger core subdomain is similar to an α/β-hydrolase fold, with a parallel β-sheet flanked on two sides by α-helices (Fig. 2.2A in grey); the smaller subdomain, an insertion typically between residues 130 to 200, has a ferredoxin fold that consists of a four-stranded antiparallel β-sheet capped by two helices (Fig. 2.2A in black). The active site lies in a cleft formed between the two subdomains. Protein–protein docking simulations were conducted using the DEBS AT3 and ACP3 homology models (Tang et al., 2007), although no apparent solution was obtained. The most important structural difference between the type I modular PKS AT domains and other MAT domains is that DEBS AT3 and AT5 have a much longer C-terminal helix (residues 857 to 867), presumably important for protein–protein interactions between AT and the KS-AT linker (Tang et al., 2006, 2007). Based on enzymological studies of the type II FAS MAT domain, a ping-pong bi bi mechanism is proposed for both type I and type II ATs of FAS and PKS that involves the active site Ser and His (Fig. 2.2B) (Ruch and Vagelos, 1973), while the 3-carboxylate of malonyl- or methylmalonyl-CoA can form charge–charge interactions with the side-chain of a highly conserved Arg (Keatinge-Clay et al., 2003; Rangan and Smith, 1997; Serre et al., 1995). In the S. coelicolor MAT structure, the backbone amides of Gln9 and Val98 were identified as the oxyanion hole (Keatinge-Clay et al., 2003). Remarkably, for both type I and type II AT domains, the acyl-enzyme complex is stable enough for detection (Dreier et al., 2001; Lau et al., 2000; Liou et al., 2003; Szafranska et al., 2002). 
The deacylation of acyl-enzyme occurs only in the presence of specific thiol nucleophiles such as phosphopantetheinylated ACP. This is a unique behavior among structurally related enzymes with an α/β hydrolase fold (Serre et al., 1995). In addition, based on the crystal structures of all five AT domains, the hydrophobic nature of the substrate-binding pocket should presumably discourage water binding in the cleft.
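For reference, the ping-pong bi bi mechanism proposed above for the AT domains corresponds, under steady-state conditions and in the absence of products, to the standard two-substrate rate law sketched below. The symbols are assumptions introduced here for illustration: A is the acyl-CoA donor, B is the holo-ACP acceptor, K_A and K_B are their Michaelis constants, and V_max is the limiting rate. No kinetic constants are taken from the cited studies.

```latex
v = \frac{V_{\max}\,[A][B]}{K_B\,[A] + K_A\,[B] + [A][B]}
```

The denominator has no constant term, which is what gives the parallel lines in double-reciprocal plots that are usually taken as the kinetic signature of a covalent acyl-enzyme intermediate.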
For type I modular PKSs, the loading AT domain is promiscuous and can accept acetyl-, propionyl-, isopropionyl-, isobutyryl-, crotonyl-, phenylacetyl-, hydroxybutyryl-, and isopentyl-CoA both in vivo and in vitro (Del Vecchio et al., 2003; Hong et al., 2005; Lau et al., 1999, 2000; Liou and Khosla, 2003; Liou et al., 2003). On the other hand, the AT domains in the extender modules (such as DEBS modules 1 to 6) are highly specific toward (2S)-methylmalonyl-CoA and the (2S)-methylmalonyl thioester of N-acetylcysteamine (NAC) (Lau et al., 1999; Liou and Khosla, 2003; Marsden et al., 1994). Therefore, the extending AT in DEBS serves as an important gatekeeper in macrolide biosynthesis (Khosla et al., 1999). Past biochemical and structural work identified four motifs to explain the observed AT specificity (Fig. 2.2A): (1) the “RVDVVQ” motif lies 30 residues upstream of the active-site Ser (Haydock et al., 1995; Yadav et al., 2003); (2) the GHSXG motif around the catalytic Ser (Haydock et al., 1995; Yadav et al., 2003); (3) the YASH motif 100 residues downstream of the active-site Ser (Haydock et al., 1995; Reeves et al., 2001), which based on a systematic mutational analysis of DEBS AT domains is the dominant substrate specificity motif in type I modular PKSs among motifs 1, 2, and 3 (Reeves et al., 2001); and (4) the C-terminal region shown to be important for substrate specificity from domain swapping experiments (Lau et al., 1999). A detailed review of the above four motifs can be found in Smith and Tsai (2007). The recent 3.2-Å porcine FAS structure further demonstrates that F682 (part of Motif 1) and F553 (part of Motif 3) form a hydrophobic cavity, which may allow M499 to flip in and out to accommodate both methylmalonyl and malonyl moieties. In conclusion, the specificity between malonyl- versus methylmalonyl-CoA (or methylmalonyl- vs. propionyl-CoA) is likely to be a combinatorial result of different structural elements that interact throughout the entire protein fold, rather than an influence of a limited number of residues.
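As a small illustration of how a sequence motif such as GHSXG (motif 2 above) can be located in practice, the sketch below scans a protein sequence with a regular expression and reports the position of the putative catalytic Ser. The function name and the toy sequence are hypothetical and invented for this example; the sequence is not a real AT domain, and a motif hit alone does not establish substrate specificity, as the paragraph above stresses.

```python
import re

def find_catalytic_ser(sequence: str) -> list[int]:
    """Return the 1-based position of the Ser in every GHSXG match."""
    # X is any residue, so it becomes '.' in the pattern. The lookahead
    # lets overlapping matches be reported as well.
    return [m.start() + 3 for m in re.finditer(r"(?=GHS.G)", sequence)]

# Hypothetical fragment for illustration only (not a real AT sequence).
toy_sequence = "MAALVFPGQGSQGHSLGEYAAA"
print(find_catalytic_ser(toy_sequence))
```

The same approach extends to the other motifs by changing the pattern, for example checking whether a YASH-like motif occurs roughly 100 residues downstream of the reported Ser position.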
The crystal structures of DEBS ketosynthase (KS) 3 (Tang et al., 2007) and KS5 (Tang et al., 2006), the actinorhodin KS/CLF (act KS/CLF) (Keatinge-Clay et al., 2004) and the R1128 priming KS (ZhuH) (Pan et al., 2002) are compared with those of the type III PKS (Austin and Noel, 2003), type I (the porcine FAS) (Maier et al., 2008), and type II (the E. coli FabH, FabB, and FabF) KS domains (Olsen et al., 1999; Price et al., 2003; Qiu et al., 2005). All KS crystal structures from FASs and PKSs reveal a highly similar thiolase fold (Austin and Noel, 2003), consisting of two copies of α-β-α-β-α folds that form a five-layered core (2α-5β-2α-5β-2α): three layers of α-helices interspersed by two layers of β-sheet, with extensive connecting loops. A pseudo-twofold axis lies between Nα3 and Cα3 parallel to the dimer axis. The KSs can be divided into three subfamilies: (1) KAS I and II, including FabB, FabF, DEBS KS3 and KS5, the porcine FAS KS domain, and the act KS/CLF (Keatinge-Clay et al., 2004; Maier et al., 2008; Olsen et al., 1999; Price et al., 2003; Tang et al., 2006, 2007); (2) KAS III and the CHS-like type III PKS enzymes, including the priming KS ZhuH (Pan et al., 2002) from the R1128 PKS; (3) the biosynthetic and degradative thiolases. All three subfamilies conserve the core structural features, the extensive dimer interface, and the location of the active-site residues. There is also absolute conservation of the active-site Cys for covalent attachment of substrates and intermediates. However, there are major structural differences among the three subfamilies, mainly concerning the extent and structure of the loops on the opposite side of the substrate-binding pocket. These loops affect the position and identity of key catalytic residues (except the universally conserved Cys) as well as the different substrate chain-length specificities for CoA-linked or ACP-linked thioesters (Austin and Noel, 2003). 
The KSs are dimers with an extensive dimer interface stabilized by a pair of hydrogen-bonded, antiparallel β-strands, which creates a 14-stranded β-sheet spanning both monomers. In the DEBS KS5 structure, there is a long helix at the N-terminus that may facilitate KS dimerization and serve as a docking point for the C-terminal docking domain of the preceding module (module 4) (Tang et al., 2006). In the type I FAS and PKS, KS deletion experiments show that the dimeric nature of the KS domains is key in facilitating dimerization of the entire megasynthase (Witkowski et al., 2004).
Similarly to the well-studied FabB and FabF KSs of E. coli, the extending KSs, including DEBS KS3 and KS5 and act KS/CLF, employ a Cys-His-His triad at the active center. The active-site triad and oxyanion hole of DEBS KS3 and KS5, act KS/CLF and porcine FAS KS domains can be overlapped perfectly, suggesting a similar catalytic mechanism (Witkowski et al., 2002; Zhang et al., 2006), initiated by the docking of acyl-ACP to an electropositive patch of KS. The proposed mechanism is similar between the priming (ZhuH) and extending (DEBS KS3 and KS5 and act KS/CLF) KSs, except that the catalytic triad of a priming KS consists of His-Asn-Cys. Furthermore, acyl-CoA, rather than acyl-ACP, first binds to ZhuH. The Asn versus His difference between the priming and elongation KSs helps explain why inhibitors such as cerulenin and thiolactomycin preferentially bind the elongation KSs (White et al., 2005).
Despite a similar thiolase fold and enzyme mechanism, the specificity of type I and II FAS and PKS KSs varies significantly. The type I FAS KS domains are highly specific toward saturated acyl chains (Witkowski et al., 2002). In contrast, type I modular KS domains, such as the six KS domains in DEBS, have a wide range of substrate specificities that vary in length from diketide to decaketide, although some PKS KS domains appear to possess some specificity with regard to different β-carbon status (Khosla et al., 1999). The type II systems possess highly specific chain-length control that is pathway-specific for the priming and extending KSs. Examples include the C2–C4 priming preference of ZhuH (Pan et al., 2002; Qiu et al., 2005), or the act, tcm, and whiE KS/CLF which extend the polyketide chain to 16, 20, and 24 carbons, respectively (Tang et al., 2003). The observed substrate specificity of each KS domain can be explained by the size and shape of the KS substrate-binding channel, which can be divided into two halves, corresponding to the substrate-binding and PPT-binding regions (Fig. 2.3). The PPT-binding region stretches from the enzyme surface to the active-site Cys and this region is relatively well conserved, reflecting its universal role in binding the PPT moiety. In contrast, the acyl-binding region varies significantly. While the binding pockets of FAS KS domains are hydrophobic and promote fatty acyl binding (Maier et al., 2008; Olsen et al., 1999; Price et al., 2003; Qiu et al., 2005), the acyl-binding pockets of PKS KS domains (such as DEBS KS3 and KS5, act KS/CLF and ZhuH) are amphipathic and allow hydrogen-bonding interactions with the carbonyl groups of the growing polyketide chain (Keatinge-Clay et al., 2004; Pan et al., 2002; Tang et al., 2006, 2007). The FAS KS domains have a hydrophobic, narrow pocket of a suitable size to specifically accommodate their corresponding fatty acyl substrates (Fig. 2.3B) (Maier et al., 2008; Olsen et al., 1999; Price et al., 2003; Qiu et al., 2005), whereas the substrate pockets in DEBS KS3 and KS5 are much wider (Fig. 2.3F) (Tang et al., 2006, 2007), consistent with the substrate tolerance reported for type I modular PKS KS domains. In contrast, both priming and extending KSs of the type II PKS are more substrate-specific, reflected by the narrower acyl-binding pocket, similar to those in FAS KS domains. Significantly, mutations of four residues that define the bottom of the acyl pocket in act and tcm KS/CLF confirmed that the pocket size and shape indeed control polyketide-product chain length (Tang et al., 2003), with mutations in CLF being sufficient to alter chain length.
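The chain lengths quoted above map directly onto the number of two-carbon ketide units, which the following minimal sketch makes explicit. It assumes a two-carbon starter unit and malonyl-derived extender units only, so that every carbon in the backbone comes from a two-carbon unit; the function names are hypothetical and introduced only for this example.

```python
def ketide_units(chain_carbons: int) -> int:
    """Number of two-carbon ketide units in an unbranched polyketide backbone.

    Assumes a two-carbon starter and malonyl-derived extenders only.
    """
    if chain_carbons <= 0 or chain_carbons % 2:
        raise ValueError("expected a positive, even number of backbone carbons")
    return chain_carbons // 2

def extension_cycles(chain_carbons: int) -> int:
    """Condensation cycles needed after the starter unit has been loaded."""
    return ketide_units(chain_carbons) - 1

NAMES = {8: "octaketide", 10: "decaketide", 12: "dodecaketide"}

# Chain lengths reported in the text for the act, tcm, and whiE KS/CLF.
for system, carbons in (("act", 16), ("tcm", 20), ("whiE", 24)):
    units = ketide_units(carbons)
    print(f"{system}: C{carbons} = {units} units ({NAMES[units]}), "
          f"{extension_cycles(carbons)} extension cycles")
```

Under these assumptions the act, tcm, and whiE KS/CLF pairs stop after 7, 9, and 11 condensations, respectively, which is the quantity the acyl-pocket mutations described above are reported to change.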
Three ketoreductase (KR) crystal structures have been reported: for type I modular PKSs, crystal structures of the DEBS KR1 (EryKR1) (Keatinge-Clay and Stroud, 2006) and KR1 of tylosin PKS (TylKR1) (Keatinge-Clay, 2007) have been solved (Fig. 2.4A–B) and for type II PKS, the actinorhodin KR (act KR) crystal structure has been reported (Hadfield et al., 2004; Korman et al., 2004, 2008) (Fig. 2.4D). EryKR1 and TylKR1 reduce a diketide substrate C=O to C-OH with “2R, 3R” and “2R, 3S” stereochemistry, respectively, so these two KRs choose opposite diketide epimers (at the 2 position) for the reduction reaction. In contrast, the actinorhodin KR (act KR) specifically reduces the C9 carbonyl group of a 16-carbon (octaketide) preassembled polyketide chain, which folds into the C7–C12 first-ring cyclized shunt-product mutactin when expressed without downstream enzymes. The ketoreduction catalyzed by act KR, as well as by other type II PKS KR domains, is chemically identical to the corresponding fatty acid ketoreduction, yet with very different regio-specificities (O’Hagan, 1993).
In both FASs and PKSs, the type I KR has two domains with the same protein fold: the catalytic subdomain and a truncated, noncatalytic structural subdomain. Both EryKR1 and TylKR1 are monomeric in solution and in crystal structure (Keatinge-Clay, 2007; Keatinge-Clay and Stroud, 2006). In contrast, the type II KR exists as a tetramer (Fig. 2.4) (Hadfield et al., 2004; Korman et al., 2004), and each monomer contains a single domain. Each domain (or each subdomain in type I KR) contains a short-chain dehydrogenase/reductase (SDR) fold consisting of a highly conserved Rossmann fold with two right-handed α-β-α-β-α motifs connected by α3, and the core region consists of a seven-stranded β-sheet flanked by α-helices (Fig. 2.4). The cofactor NADPH is bound at the junction of two α-β-α-β-α motifs in a highly conserved groove characteristic of the Rossmann fold (Persson et al., 2003). The polyketide substrate-binding pocket consists of a large cleft formed by helices α6-α7 and the loops between α4 and α5 (Fig. 2.4). The catalytic subdomains of EryKR1 and TylKR1, as well as act KR, have the typical SDR motifs, such as the TGxxxGxG motif (residues 2 to 19), the D63 and NNAG motifs (residues 89 to 92), the active-site tetrad Asn–Ser–Lys–Tyr, and the PG motif (residues 187 to 188) (Persson et al., 2003). The biggest difference between the type I and II PKS KRs is a long insertion between helices 6 and 7 for act KR, and this may account for the different substrate specificities of type I and II PKS KRs. The monomeric type I KR orients its two subdomains (Fig. 2.4A–B) very similarly to a type II KR dimer (Fig. 2.4D), with extensive, mainly hydrophilic, interactions. The structural subdomains in EryKR1 and TylKR1 lack a cofactor-binding motif and the substrate-binding portion, thus rendering the structural subdomain inactive (Keatinge-Clay, 2007; Keatinge-Clay and Stroud, 2006). 
The type I KRs have additional β1-β8 and α4-αF interactions bridging the structural and catalytic subdomains and stabilizing the pseudodimeric KR protein fold.
The act KR active-site tetrad consists of N114–S144–Y157–K161 (42 to 44). In contrast, the Asn and Lys positions are switched in EryKR1 and TylKR1 (K1776–S1800–Y1813–N1817) (Keatinge-Clay, 2007; Keatinge-Clay and Stroud, 2006). The KR tetrad lies near the nicotinamide ring of NADPH, where Tyr and Lys form hydrogen bonds with the NADPH ribose and nicotinamide ring. In act KR, four crystalline water molecules form extensive hydrogen bonds with N114 and K161 (Fig. 2.5A). These waters form a proton-relay network that is very similar to the one observed in E. coli FabG-NADP+ (Price et al., 2004), leading to the hypothesis that the water-relay mechanism for FabG may also be applicable to act KR (Fig. 2.5A). In vitro assays indicate that type II PKS KRs have a different substrate specificity from that of FAS (Dutler et al., 1971; Joshi and Smith, 1993) and modular type I PKS KRs (Ostergaard et al., 2002), both of which can reduce linear and monocyclic ketones. The strong preference of act KR toward bicyclic polyketides supports the conclusion that its natural substrate is a cyclic polyketide, and detailed kinetic analysis showed that act KR proceeds through an ordered bi bi mechanism (Korman et al., 2008), in which the cofactor NADPH binds KR prior to the substrate trans-1-decalone. The above results imply that the first ring is cyclized prior to ketoreduction, thus sterically constraining the ketoreduction and leading to a highly specific C9-reduction. The inhibitor emodin-bound act KR structure also revealed that act KR can exist with at least two different conformations (Korman et al., 2008): open and closed forms that differ in the 10-residue loop region (residues 199 to 209) between helices 6 and 7 (Fig. 2.5B). EryKR1 and TylKR1 also show similar conformational changes and loop movement in this region (Keatinge-Clay, 2007; Keatinge-Clay and Stroud, 2006), which may reflect different binding motifs during ketoreduction (Keatinge-Clay, 2007; Korman et al., 2008).
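For reference, the ordered bi bi mechanism reported for act KR corresponds, under steady-state conditions and in the absence of products, to the standard rate law sketched below. The symbols are assumptions introduced here for illustration: A is NADPH (which binds first), B is the ketone substrate (trans-1-decalone in the assays described above), K_A and K_B are Michaelis constants, K_{iA} is the dissociation constant of the enzyme–NADPH complex, and V_max is the limiting rate. No numerical values are taken from the cited study.

```latex
v = \frac{V_{\max}\,[A][B]}{K_{iA}K_B + K_B\,[A] + K_A\,[B] + [A][B]}
```

The constant term K_{iA}K_B in the denominator produces intersecting lines in double-reciprocal plots, which is how a sequential mechanism of this kind is typically distinguished from a ping-pong mechanism.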
The stereoselective signature motifs for the modular PKS KRs were previously proposed to be “LDD” and PxxxN (Caffrey, 2003; Reid et al., 2003), and the presence or absence of these motifs produces the 3R or 3S stereoisomer, respectively. The EryKR1 crystal structure shows that the 93–95 “LDD” motif lies in a loop adjacent to the active site (Keatinge-Clay and Stroud, 2006), and KR1 mutation indeed resulted in a switch of alcohol stereochemistry (Baerga-Ortiz et al., 2006; O’Hare et al., 2006). The 2-position stereochemistry is also affected by KR and its upstream KS domain. Based on extensive bioinformatic analysis guided by the crystal structures of EryKR1 and TylKR1, Keatinge-Clay categorized the type I KRs into six types to explain their observed stereochemistry at the 2- and 3-positions and developed a protocol to assign substituent stereochemistry accordingly (Keatinge-Clay, 2007). Similar studies on type II KRs (Korman et al., 2008) support the hypothesis that type II KR substrate specificity is defined by a combination of enzyme conformation, specific molecular interactions between the substrate and active-site residues, and substrate and protein flexibility due to the dynamic nature of the binding cleft.
DEBS DH4 catalyzes dehydration of a 2R-methyl-3R-OH pentaketide to afford a trans double bond (Keatinge-Clay, 2008). The recent report of the 1.8-Å DEBS DH4 crystal structure, combined with the porcine FAS structure, reveals that the DH domain in type I PKSs and FASs consists of two subdomains with limited sequence homology, yet each subdomain consists of the “hot-dog-in-a-bun” fold (Dillon and Bateman, 2004). The double hot-dog (DHD) fold exhibited by type I FAS and PKS DHs is similar to that of the dimeric type II bacterial DHs (such as E. coli FabA and FabZ (Kimber et al., 2004; Leesong et al., 1996)), consisting of one hot-dog fold per monomer (Fig. 2.6A–B). The E. coli and human TE II are also reported to contain the DHD fold (Li et al., 2000). In each hot-dog subdomain, the long central helix (the hot-dog) is packed against a seven-stranded antiparallel β-sheet that forms the bun. In both the porcine FAS and DEBS DH4 structures, the two hot-dog subdomains of DH interact extensively to form a 14-strand β-sheet with additional interactions between the hot-dog helices from each subdomain. In DH4, the catalytic H44 lies in the first subdomain as part of the HXXXGXXXXP motif (Joshi and Smith, 1993), where the conserved Gly is necessary to make a turn that enables van der Waals interactions between H44 and P53. The catalytic D206 within the DXXX(Q/H) motif lies in the second subdomain and hydrogen-bonds with the side chain of Q210 (itself anchored to Y158). The organization and interaction of catalytic residues described for DH4 are conserved in the porcine FAS DH. Additionally, both the “GYXYGPXF” and “LPFXW” motifs are highly conserved and help define the substrate pocket.
The reaction catalyzed by type I and II FAS DHs is freely reversible with equilibrium favoring hydration (Heath and Rock, 1995; Witkowski et al., 2004). In type I PKS DHs, the equilibrium may also lean toward the hydrated polyketide, and a downstream KS or TE may pull the reaction forward toward the dehydrated polyketide (Tang et al., 1998; Wu et al., 2005). However, there is no obvious sequence motif associated with DHs that dehydrate substrates with or without α-substituents, indicating that the α-substituents may not be recognition factors for the PKS DHs. A catalytic mechanism was proposed for DH4 (Fig. 2.6D) in which H44 serves as the active-site base to deprotonate the α-carbon, while the β-hydroxyl group may be polarized by the helix-1 dipole, facilitating water elimination. The DH4 structure shows that the stereochemistry of the β-hydroxyl group in an incoming polyketide substrate may be the primary factor that determines if a cis or trans double bond is produced by the DH domain. For example, when an A-type KR provides DH with the 3S-OH substrate, the 3S-OH is hydrogen-bonded to D206 and the polyketide chain must rotate 120 degrees about the Cα–Cβ bond. Thus, an elimination results in the formation of a cis double bond, as in the phoslactomycin DH1 and DH2 (Palaniappan et al., 2008) or rifamycin DH10 (Tang et al., 1998). Sequence comparison indicates that if DH does catalyze epimerization, the Leu and Pro in the “LPFXW” motif may be important (Keatinge-Clay, 2008). Further work is necessary to distinguish the above hypotheses.
Based on previous studies of first-ring cyclization, there are at least three different classes of ARO/CYCs (Fig. 2.7A): (1) C9–C14 cyclization associated with monodomain ARO/CYCs, such as tcm ARO/CYC, WhiE-ORFVI (whiE ARO/CYC), and RemI (Alvarez et al., 1996; Fritzsche et al., 2008; Moore and Piel, 2000; Motamedi and Hutchinson, 1987; Zawada and Khosla, 1997); (2) C7–C12 cyclization in the absence of KR by monodomain ARO/CYCs such as ZhuI from the R1128 biosynthetic pathway (Tang et al., 2004), or didomain ARO/CYCs such as MtmQ from the mithramycin pathway (Lombó et al., 1996; Zhang et al., 2008); and (3) the didomain ARO/CYCs associated with KR-containing type II PKSs that aromatize C7–C12 first-ring cyclized polyketides, such as the actinorhodin (McDaniel et al., 1995) and griseusin (Zawada and Khosla, 1997) ARO/CYCs. The best-studied monodomain ARO/CYC is tcm ARO/CYC, which consists of the N-terminal 160 residues of the bifunctional protein, TcmN (McDaniel et al., 1995; Zawada and Khosla, 1999). Following the production of the linear decaketide intermediate, tcm ARO/CYC is proposed to fold, cyclize and aromatize the ACP-tethered polyketide via aldol condensation and dehydration reactions (Shen and Hutchinson, 1996). Based on genetic analyses, it had been argued that the first cyclization of a linear polyketide chain may occur either in the active site of KS–CLF (Keatinge-Clay et al., 2004), in solution (without enzyme catalysis) (Hertweck et al., 2004), or in the binding pocket of KR or ARO/CYC (Zawada and Khosla, 1999). Similar observations on more than ten aromatic PKSs (except in enterocin, discussed below (Hertweck et al., 2004)) have led to the general conclusion that KR promotes C7–C12 cyclization (Crump et al., 1997; McDaniel et al., 1994), whereas the monodomain ARO/CYCs in nonreducing type II systems promote C9–C14 cyclization. 
The crystal structure and mutagenesis of act KR and tcm ARO/CYC support the hypothesis that cyclization in the KR or ARO/CYC substrate pocket may be a likely event. However, further validation is necessary.
Three ARO/CYC structures have been solved: the tcm, whiE, and ZhuI ARO/CYCs, all three of which have a helix-grip fold (Gajhede et al., 1996; Iyer et al., 2001) consisting of a seven-stranded antiparallel β-sheet that partially surrounds (“grips”) a long C-terminal α-helix (Fig. 2.8). Two small helices between β1 and β2 form a helix-loop-helix motif that acts to seal one end of the β-sandwich. The topology of ARO/CYC is highly similar to that of members of the Bet v1-like superfamily (Iyer et al., 2001; Radauer et al., 2008), which commonly includes a large solvent-accessible pocket that binds small molecules such as phytosteroid hormones, lipids, enediyne, and cholesterol (Markovic-Housley et al., 2003; Mogensen et al., 2002; Pasternak et al., 2005; Radauer et al., 2008; Tsujishita and Hurley, 2000). However, unlike many Bet v1-like proteins with hydrophobic pockets, the ARO/CYC pocket is amphipathic with an approximately equal distribution of hydrophobic and hydrophilic residues. The ARO/CYC pocket dimensions and residue composition are appropriate for binding cyclized polyketide intermediates (Fig. 2.8E). Whereas the type II PKS ARO/CYC contains a helix-grip fold, all FAS DHs (such as E. coli FabA) contain a hot-dog fold (Dillon and Bateman, 2004; Leesong et al., 1996) (Fig. 2.8A–B) that is topologically different but similar in shape. In the hot-dog fold, the β-sheets have strand-order 1–2–3–5–6–7–4, whereas the strand-order is 1–7–6–5–4–3–2 for the helix-grip fold. Also, the central helix is tucked against the β-sheets in FabA, thus precluding formation of an interior pocket between the β-sheet and central helix αC. As multiple dehydration events are necessary in order to aromatize the rings formed during polyketide biosynthesis, it has been suggested that PKS ARO/CYCs may act as DHs to catalyze aromatic ring formation (Hopwood, 1997). 
That both ARO/CYC and DH catalyze dehydration reactions in evolutionarily related complexes suggests that they may play similar biological roles, as implied by their similar overall shapes.
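The difference in sheet topology can be made concrete with a minimal sketch that compares the two strand orders quoted above. It only records which strands sit next to each other in each sheet; strand direction and the position of the helices are ignored, and the function names are illustrative rather than taken from any structural biology package.

```python
# Strand orders as quoted in the text (order of strands across the beta-sheet).
HOT_DOG = [1, 2, 3, 5, 6, 7, 4]      # FAS DH, e.g. E. coli FabA
HELIX_GRIP = [1, 7, 6, 5, 4, 3, 2]   # type II PKS ARO/CYC


def neighbor_pairs(order):
    """Return the set of strand pairs that lie side by side in the sheet."""
    return {frozenset(pair) for pair in zip(order, order[1:])}


def shared_neighbors(order_a, order_b):
    """Return strand pairs that are adjacent in both sheets, as sorted tuples."""
    common = neighbor_pairs(order_a) & neighbor_pairs(order_b)
    return sorted(tuple(sorted(pair)) for pair in common)


# Only three of the six neighboring pairs are common to both folds.
print(shared_neighbors(HOT_DOG, HELIX_GRIP))  # [(2, 3), (5, 6), (6, 7)]
```

In this simplified picture, half of the strand pairings differ between the two folds, which is one way to see that the sheets are wired differently even though the overall shapes are similar.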
Based on site-directed mutagenesis, the tcm ARO/CYC crystal structure, and computer-simulated docking, a catalytic mechanism was proposed for tcm ARO/CYC (Ames et al., 2008) in which the polyketide carbonyl oxygens are anchored in close proximity to S67, R69, Y35, and R82, and subsequent aldol cyclization is promoted by electrostatic stabilization by pocket residues (with special attention to the possible involvement of the strictly conserved pocket residue, R69). This proposal is further supported by the crystal structures of whiE and ZhuI ARO/CYC (Fig. 2.8C, D). The whiE ARO/CYC promotes C9–C14 first-ring and C7–C16 second-ring cyclizations of a 24-carbon polyketide, and ZhuI is the only reported monodomain C7–C12 first-ring-only cyclase (Marti et al., 2000; Yu et al., 1998). The overall structure and residue composition are similar between the whiE and tcm ARO/CYCs, but subtle conformational and residue changes in the whiE ARO/CYC pocket increase the pocket space (compared to tcm ARO/CYC) to accommodate the binding of a C24 polyketide. In contrast, ZhuI has a much smaller pocket that accommodates a monocyclic C7–C12 cyclized intermediate (Fig. 2.8C, D).
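The carbon numbering used for these cyclizations can be checked with a small graph sketch. It assumes a simplified model that is not part of the original analysis: the polyketide backbone is a linear chain of carbons numbered from 1, oxygens and the ACP linkage are ignored, and each cyclization adds one carbon–carbon bond. The function name `ring_size` is hypothetical.

```python
from collections import deque


def ring_size(n_carbons, existing_bonds, new_bond):
    """Return the number of carbons in the ring closed by new_bond.

    The backbone is modeled as carbons 1..n_carbons bonded in sequence, plus
    any earlier ring-closing bonds. The ring size is the shortest path (in
    bonds) between the two carbons before the new bond forms, plus one.
    """
    neighbors = {c: set() for c in range(1, n_carbons + 1)}
    backbone = [(c, c + 1) for c in range(1, n_carbons)]
    for a, b in backbone + list(existing_bonds):
        neighbors[a].add(b)
        neighbors[b].add(a)

    start, goal = new_bond
    dist = {start: 0}
    queue = deque([start])
    while queue:  # breadth-first search for the shortest path
        current = queue.popleft()
        if current == goal:
            return dist[current] + 1
        for nxt in neighbors[current]:
            if nxt not in dist:
                dist[nxt] = dist[current] + 1
                queue.append(nxt)
    raise ValueError("carbons are not connected")


# whiE: C9-C14 first ring, then C7-C16 second ring, on a 24-carbon chain.
first = ring_size(24, [], (9, 14))
second = ring_size(24, [(9, 14)], (7, 16))
print(first, second)  # 6 6
```

Under this model both whiE cyclizations close six-membered rings; the second ring is six-membered only because the C9–C14 bond already shortens the path between C7 and C16. The same function gives a six-membered ring for a C7–C12 closure on any chain of at least 12 carbons.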
There are two observed folding patterns of aromatic polyketides, which generally lead to unique cyclization regio-specificities (Thomas, 2001). S-type folding, promoted by type II PKSs, leads to C7–C12 or C9–C14 first-ring cyclization depending on whether KR is present. In contrast, F-type folding patterns, promoted by the product template (PT) domain of fungal nonreducing type I PKSs (Crawford et al., 2006, 2008; Udwary et al., 2002), include C2–C7, C4–C9, and C6–C11 first-ring cyclization. The 1.8-Å crystal structure and mutational analyses of the PksA PT domain (manuscript in preparation) revealed that PT also has a DHD fold (Fig. 2.6C) and that nearly all secondary structure elements align with both DEBS DH4 and the porcine FAS DH domain, with an RMSD of only 3 Å. Significantly, like ARO/CYC, PT has an interior pocket (Fig. 2.6C), and the reported F-type folding patterns may be directly related to the corresponding PKS PT domains, in which the cyclization pattern is determined by pocket shape, while chain length is correlated with pocket size. In conclusion, PT may bind a fully extended linear polyketide that is “kinked” in the cyclization chamber to promote the F-folded cyclization pattern, while the ARO/CYC likely bends the ACP-bound polyketide into a hairpin, thus promoting the S-folded cyclization pattern.
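For readers less familiar with the RMSD figure quoted for the PT domain alignment, the quantity can be sketched as follows. This is a minimal illustration that assumes the two structures have already been superposed and that equivalent atoms (for example, Cα atoms) are paired in the same order; it does not perform the superposition itself, and the coordinates are made-up toy values, not data from any PKS structure.

```python
import math


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two paired sets of 3D coordinates.

    Both inputs are sequences of (x, y, z) tuples in angstroms, already
    superposed and listed in the same atom order.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate sets must be non-empty and equal in length")
    squared = sum(math.dist(a, b) ** 2 for a, b in zip(coords_a, coords_b))
    return math.sqrt(squared / len(coords_a))


# Toy example (hypothetical coordinates): every atom is displaced by 3 angstroms.
reference = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0)]
shifted = [(0.0, 3.0, 0.0), (3.8, 3.0, 0.0), (7.6, 3.0, 0.0)]
print(rmsd(reference, shifted))
```

Because the deviations are squared before averaging, a few badly matched loops can dominate the value, so an RMSD of a few angstroms over a whole domain is compatible with closely matching secondary structure elements.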
Recently, the first solved structure of an acyl carrier protein (ACP) from a type I modular PKS was reported for DEBS ACP2 (Alekseyev et al., 2007). Similar to the type I FAS ACP structure, the 10-kDa DEBS ACP2 contains a three-helical bundle, and an additional short helix in the second loop also contributes to core helical packing. The conserved Ser in the universal “DSL” motif (where PPT is covalently attached) lies at the N-terminal end of helix-2, which is regarded as a universal “recognition helix” involved in interactions with other proteins (Crump et al., 1997; Findlow et al., 2003; Li et al., 2003; Reed et al., 2003; Zhang et al., 2003). Homology models of DEBS ACP domains (ACP1–6) (Alekseyev et al., 2007) suggest that protein–protein recognition of ACP domains is highly specific for their corresponding KS domains (Chen et al., 2006). Similar results were reported for type II PKS ACP domains, such as the act apo-ACP (Crump et al., 1997), the fren holo-ACP, and the otc ACP (Findlow et al., 2003; Li et al., 2003). These results are consistent with the “switch blade” theory based on the yeast FAS crystal structure (Leibundgut et al., 2008), in which the growing acyl chain (the blade) switches its nestling cavity from the ACP to the binding pocket of the KS (or other PKS enzymes). The timing of blade switching is closely related to the degree of solvent exposure of the polyketide intermediate, which depends not only on the properties of the ACP “recognition helix” but also on the chemical structure of a given polyketide intermediate.
Three PKS thioesterase (TE) structures have been reported: the DEBS TE (Tsai et al., 2001), the homologous pikromycin (PIKS) TE (Akey et al., 2006; Giraldes et al., 2006; Tsai et al., 2002), and the PksA TE from a nonreducing iterative type I PKS involved in aflatoxin biosynthesis (manuscript in preparation). All three structures exhibit the classic features of the α/β-hydrolase fold (Fig. 2.9A–B), which consists of a central seven-stranded β-sheet with the second strand (β2) antiparallel to the remaining strands (Chakravarty et al., 2004; Giraldes et al., 2006; Tsai et al., 2001, 2002). While the active-site triad (Ser-His-Asp) and nearly all important secondary structure components are highly conserved, substrate specificity and regio-specificity vary significantly among different TEs. The PKS TE domains lack the characteristic αD helix of the α/β-hydrolase family, instead having an inserted “lid” region that is consistently observed in the DEBS (Tsai et al., 2001), PIKS (Akey et al., 2006; Giraldes et al., 2006; Tsai et al., 2002), surfactin synthetase (SrfA-C) (Bruner et al., 2002), fengycin synthetase (FenB) (Samel et al., 2006), and enterobactin synthetase (EntF) (Frueh et al., 2008) TE structures. The lid region is also the most variable region among the TE domains (Fig. 2.9C). Because the substrate-binding region of the megasynthase TE is formed between the α/β-hydrolase core and the lid region inserted between β6 and β7, variability of the lid region is reflected in the highly variable substrate channel shape among different TEs. In the modular DEBS and PIKS TEs (Akey et al., 2006; Giraldes et al., 2006; Tsai et al., 2001, 2002), an unusual, 20-Å-long amphipathic substrate channel runs through the entire protein (Fig. 2.9C), implying that the substrate passes through the protein. The catalytic triad, which carries out macrolactonization of a hydrophilic polyketide substrate, is located at the center of the channel.
In contrast, the PksA TE substrate pocket adopts a closed conformation, sealed at both ends of the channel, which turns the pocket into a hydrophobic “cyclization chamber” and protects the polyketide substrate from hydrolysis. The flexibility of the lid region is evident from a comparison of a series of PIKS TE structures solved from crystals grown at different pH values (Tsai et al., 2002), which shows that the size of the substrate channel increases with increasing pH. The variable channel geometry is proposed to influence the regio-selectivity between the hydrolysis and macrocyclization/Claisen cyclization activities of type I PKS TEs. The crystal structures of the DEBS and PIKS TEs help rationalize the observed substrate specificity of modular PKS TEs (Giraldes et al., 2006; Tsai et al., 2001, 2002), and the chemical structures of the polyketide substrates are as important as the TE substrate-binding residues in determining cyclization versus hydrolysis activity (Gokhale et al., 1999; Lu et al., 2002). Further work is necessary to fully determine the residues important for PKS TE substrate specificity.
Recent revelations from crystallographic analysis of type I and type II PKSs have raised awareness of the extraordinary architecture of these megasynthases and offered a new perspective in visualizing some of the unsolved questions concerning polyketide biosynthesis. The 3.2-Å porcine FAS crystal structure has provided a framework for megasynthase architecture that may also apply to type I PKSs (Maier et al., 2008). Further, the DEBS KS-AT structures clearly show that the KS-AT and post-AT linkers are highly structured and closely interacting with both KS and AT domains, so that the linker regions contribute extensively to stabilization of the overall KS-AT structure (Tang et al., 2006, 2007). Clearly, we need to reconsider the original notion that the linkers merely serve as semi-flexible tethers that hold adjacent domains in proximity (Gokhale et al., 1999). In the future, studies of interdomain, intermodule, and interpolypeptide linker regions in type I PKSs should further determine their importance to dimer formation, polyketide chain transfer, and reaction timing. In the arena of type II PKSs, the detection of multienzyme complex formation should help shed light on how the ACPs gain access to each of the PKS enzyme domains. Because of the dynamic nature of this complex, x-ray crystallography, if successful at all, may only trap one snapshot of such a transient complex, and other techniques such as electron microscopy or NMR may be necessary to capture a series of protein motions during different stages of polyketide biosynthesis. The early successful application of electron microscopy to capture the FAS dynamic structures (Asturias et al., 2005) can serve as an excellent example for a similar study with type I and type II PKSs.
Our sincere thanks to David Hopwood, Chaitan Khosla, Joel Bruegger, Pouya Javidpour, and Ming Lee for their helpful suggestions and critical reading of the manuscript. Sheryl Tsai is supported by the Pew Foundation, the American Heart Association (0665164Y), and National Institutes of Health R01GM076330 and R21GM077264.