HHS Public Access. Author manuscript; accepted for publication in a peer-reviewed journal.
Biochemistry. Author manuscript; available in PMC 2011 January 12.
Published in final edited form as:
PMCID: PMC2806640



Prolyl 4-hydroxylases (P4H) catalyze the posttranslational hydroxylation of proline residues and play a role in collagen production, hypoxia response, and cell wall development. P4Hs belong to the Fe(II)/αKG oxygenases and require Fe(II), α-ketoglutarate (αKG), and O2 for activity. We report the 1.40 Å structure of a P4H from Bacillus anthracis, the causative agent of anthrax, whose immunodominant exosporium protein BclA contains collagen-like repeat sequences. The structure reveals the double stranded β-helix core fold characteristic of Fe(II)/αKG oxygenases. This fold positions Fe-binding and αKG-binding residues in what is expected to be catalytically-competent orientations and is consistent with proline peptide substrate binding at the active site mouth. Comparisons of the anthrax-P4H structure with Cr-P4H-1 structures reveal similarities in a peptide surface groove. However, sequence and structural comparisons suggest differences in conformation of adjacent loops may change the interaction with peptide substrates. These differences may be the basis of substantial disparity between the KM values for the Cr-P4H-1 vs. the anthrax and human P4H enzymes. Additionally, while previous structures of P4H enzymes are monomers, Bacillus anthracis P4H forms an α2 homodimer and suggests residues important for interactions between the α2 subunits of the α2β2 human collagen P4H. Thus the anthrax-P4H structure provides insight into the structure and function of the α subunit of human-P4H, which may aid in the development of selective inhibitors of the human-P4H enzyme involved in fibrotic disease.

Prolyl 4-hydroxylase (P4H) enzymes are involved in the post-translational formation of trans-4-hydroxyproline (Hyp) from peptidyl proline. In plants, P4H enzymes act on prolines present in extensins, in proline-rich proteins, and in arabinogalactan proteins to form hydroxyproline-rich glycoproteins (HRGPs) that stabilize plant cell walls (1, 2). Vertebrates have two types of P4H enzymes with different functions. One form of P4H (HIF-P4H) hydroxylates proline residues in the hypoxia-inducible transcription factor (HIF), and is responsible for changes in gene expression in response to hypoxic conditions (3). The second form of vertebrate P4H (C-P4H) is responsible for modification of the third proline of (Gly-Pro-Pro)n repeats in collagen and collagen-like proteins. In collagen the hydroxylation of proline to Hyp is essential to stabilize the collagen triple helical structure (4). This process occurs in the lumen of the endoplasmic reticulum (5) and is the rate limiting step in collagen biosynthesis. For this reason human-P4H has gained much attention as a potential therapeutic target for inhibitors of fibrotic disease.

Mammalian C-P4H enzymes exist as α2β2 tetramers. The alpha subunit consists of an N-terminal peptide substrate binding domain (6) and a C-terminal catalytic domain, which is responsible for proline hydroxylation. The beta subunit has protein disulfide isomerase activity and is responsible for retention and solubility of the alpha subunit in the endoplasmic reticulum. Without the beta subunit, the alpha subunits aggregate (5) and formation of the tetramer is required for stability and activity. Although there is a structure of the peptide binding domain of the human type I C-P4H alpha subunit (Gly138-Ser244 (5, 7)), no structures of the full length alpha subunit of human type I C-P4H have been reported. The lack of this knowledge hinders understanding of substrate positioning, the hydroxylation mechanism, and the design of inhibitors. Structure determination of the full-length human type I C-P4H alpha subunit has been frustrated by insolubility of the alpha subunit in the absence of the beta subunit and the difficulty in generating α2β2 tetramer suitable for crystallographic studies. The structure of the other vertebrate P4H, HIF-P4H-2, has been determined (PDB:2G1M), but was found to belong to a structural subgroup distinct from C-P4H enzymes (8, 9).

P4Hs belong to a class of enzymes known as the α-ketoglutarate dependent non-heme iron oxygenases (Fe(II)/αKG oxygenases) (10). In all Fe(II)/αKG oxygenases, αKG undergoes decarboxylation to form succinate and CO2 concomitant with the hydroxylation of the substrate, which is proline in the case of P4H enzymes (Scheme 1) (11, 12). The structures of the Fe(II)/αKG-dependent oxygenases all share a common motif, a double-stranded β-helix core fold also termed a jellyroll or double Greek key motif (13). These enzymes require Fe(II) for catalytic activity, which binds in a conserved facial triad binding site, His1-X-Asp/Glu-Xn-His2 (10).
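The facial triad motif His1-X-Asp/Glu-Xn-His2 described above can be illustrated as a simple sequence pattern search. The following is a minimal sketch, not a method used in this work: the spacer bounds (40 to 160 residues), the function name `find_facial_triads`, and the toy sequence are all assumptions made for illustration. The toy spacer of 63 residues mirrors the spacing between Asp129 and His193 reported for anthrax-P4H in the Results.

```python
import re

# His-X-Asp/Glu-Xn-His pattern. The spacer bounds (40-160) are an
# illustrative assumption, not a value taken from the paper.
FACIAL_TRIAD = re.compile(r"H.[DE].{40,160}?H")

def find_facial_triads(sequence, pattern=FACIAL_TRIAD):
    """Return 1-based (His1, Asp/Glu, His2) positions for each
    non-overlapping match of the facial triad pattern."""
    hits = []
    for match in pattern.finditer(sequence):
        his1 = match.start() + 1   # first His of the motif
        acid = match.start() + 3   # Asp/Glu two residues later
        his2 = match.end()         # distal His closes the match
        hits.append((his1, acid, his2))
    return hits

# Hypothetical toy sequence (not a real protein): His at 5, Asp at 7,
# then a 63-residue spacer and the distal His at 71.
toy = "AAAA" + "HGD" + "A" * 63 + "H" + "AAAA"
print(find_facial_triads(toy))
```

A pattern match only flags candidate residues; whether they form a metal-binding site depends on the three-dimensional fold.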

Scheme 1
Proposed reaction mechanism of collagen-P4H. Hydroxylation of the substrate is coupled to the decarboxylation of α-ketoglutarate to yield succinate and CO2.

Several structures are available of a P4H catalytic domain from Chlamydomonas reinhardtii (Cr-P4H-1). This algal enzyme shares 30% sequence identity to the C-terminal catalytic domain of the type I human C-P4H alpha subunit and has provided the first insights into the structure and function of the catalytic domain of P4H enzymes. Complexes of the algal P4H with Zn(II) and inhibitor 2,4-dicarboxylate pyridine have revealed the metal- and cofactor-binding sites (9) within a jellyroll core fold, as expected for an Fe(II)/αKG-dependent oxygenase. Very recently a structure of this same algal P4H complexed with a (Ser-Pro)5 peptide substrate also elucidated the binding interactions that place the proline substrate in the proper location in the active site (14). While these structures have significantly advanced our understanding of P4H active sites, the algal P4H is strictly monomeric and thus is silent on the relationship of quaternary structure interactions that might be informative in terms of the human enzyme.

A C-P4H from Bacillus anthracis (anthrax-P4H) has been identified and characterized as an α2 homodimer (unpublished work). Like the monomeric Cr-P4H-1, anthrax-P4H also shares 30% sequence identity with the C-terminal end of human C-P4H. Anthrax-P4H is the first C-P4H found in prokaryotes, and its specific role in B. anthracis is currently being investigated. B. anthracis is an endospore-forming, gram-positive bacterium that is the causative agent of anthrax. Under starvation conditions, the vegetative cells develop a hardy spore whose immunodominant protein is the glycoprotein BclA (Bacillus collagen-like protein of anthracis) (15). BclA possesses an internal collagen-like region of Gly-Xaa-Yaa repeats of varying length, many of which contain Gly-Pro-Thr triplets. BclA has also been shown to assemble into triple helical structures similar to animal collagens (16). Together, this information hints that the physiological role of anthrax-P4H may be associated with sporulation.

In addition to similarities in their quaternary states, in vitro studies have shown that anthrax-P4H can bind the substrate (Gly-Pro-Pro)10 with a KM value similar to human type I C-P4H (unpublished data). In contrast, Cr-P4H-1 acts on this common substrate with a KM 83-fold higher than the human enzyme (17). This functional similarity and presence of higher oligomeric state make the anthrax-P4H an improved model for human C-P4H. We report here the X-ray crystal structure of the homodimeric anthrax-P4H and interpret the data to provide insights into the structure and function of human-P4H.


Cloning, Expression, and Purification of Selenomethionine Labeled Anthrax-P4H

Cloning, expression, and purification of anthrax-P4H and selenomethionine anthrax-P4H (SeMet-anthrax-P4H) were performed as described previously (18).

MALDI-TOF mass spectrometry analysis revealed the mean incorporation of 3 selenium atoms per anthrax-P4H monomer out of a potential 4 methionine residues.

Protein Crystallization

The crystallization of SeMet-anthrax-P4H was described previously (18). Briefly, the purified SeMet-anthrax-P4H was concentrated to 24 mg/mL in 50 mM Tris buffer (pH 7.4) containing 150 mM KCl and 5 mM β-mercaptoethanol. The hanging-drop vapor diffusion method was used at 20 °C under aerobic conditions. Crystals grew from a mixture of equal volumes of protein in the solution described above and well solution [16% (w/v) PEG-8000, 40 mM potassium phosphate (monobasic), pH 4.0–4.2, and 20% (v/v) glycerol]. Crystals reached a typical size of 0.8 × 0.5 × 0.1 mm over 1 to 2 weeks and were cryo-cooled by direct immersion into liquid nitrogen.
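Because the drop is set up from equal volumes of protein and well solution, each component starts at half of its stock concentration. The sketch below works through that arithmetic for the concentrations stated above; the function name `mix_equal_volumes` and the dictionary keys are hypothetical labels chosen for illustration. It describes only the starting drop, since vapor diffusion subsequently concentrates the drop toward the well condition.

```python
def mix_equal_volumes(protein_solution, well_solution):
    """Initial drop concentrations after mixing equal volumes.

    Each stock is diluted 2-fold; a component present in both
    solutions would contribute from each.
    """
    drop = {}
    for solution in (protein_solution, well_solution):
        for name, concentration in solution.items():
            drop[name] = drop.get(name, 0.0) + concentration / 2
    return drop

# Stock concentrations as stated in the text (units in the key names).
protein_solution = {
    "protein_mg_per_mL": 24.0,
    "Tris_mM": 50.0,
    "KCl_mM": 150.0,
    "beta_mercaptoethanol_mM": 5.0,
}
well_solution = {
    "PEG8000_percent_wv": 16.0,
    "potassium_phosphate_mM": 40.0,
    "glycerol_percent_vv": 20.0,
}

for name, value in mix_equal_volumes(protein_solution, well_solution).items():
    print(f"{name}: {value:g}")
```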

Data Collection

A three-wavelength anomalous dataset of SeMet-anthrax-P4H was collected at beamline BL9-2 at the Stanford Synchrotron Radiation Laboratory (SSRL) to 1.4 Å resolution, and the structure was solved using multi-wavelength anomalous dispersion (MAD) phasing (19). The data set was collected at remote, peak, and inflection wavelengths (0.912 Å, 0.9793 Å, and 0.9795 Å, respectively). All images were processed using MOSFLM and SCALA of the CCP4 suite (20, 21). The space group was determined as P2₁ with cell dimensions a = 41.38 Å, b = 63.80 Å, c = 98.50 Å, β = 98.74°. Data collection statistics are summarized in Table 1.
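Two quantities follow directly from the numbers above: the photon energy at each wavelength (E = hc/λ) and the volume of the monoclinic unit cell (V = a·b·c·sin β). The sketch below computes both; the function names are hypothetical and the value of hc in keV·Å is a standard physical constant rather than a number from this work. The peak and inflection energies it returns lie close to the selenium K absorption edge (about 12.66 keV), as expected for a SeMet MAD experiment.

```python
import math

HC_KEV_ANGSTROM = 12.398419843  # Planck constant times c, in keV*Angstrom

def wavelength_to_kev(wavelength_angstrom):
    """Photon energy in keV for an X-ray wavelength in Angstrom."""
    return HC_KEV_ANGSTROM / wavelength_angstrom

def monoclinic_cell_volume(a, b, c, beta_deg):
    """Unit cell volume in cubic Angstrom for a monoclinic cell
    (alpha = gamma = 90 degrees)."""
    return a * b * c * math.sin(math.radians(beta_deg))

# Wavelengths and cell dimensions as reported in the text.
wavelengths = {"remote": 0.912, "peak": 0.9793, "inflection": 0.9795}
for label, wavelength in wavelengths.items():
    print(f"{label}: {wavelength_to_kev(wavelength):.4f} keV")

volume = monoclinic_cell_volume(41.38, 63.80, 98.50, 98.74)
print(f"cell volume: {volume:.0f} cubic Angstrom")
```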

Table 1
Data collection and refinement statistics for (SeMet) anthrax-P4H.

Structure Determination and Validation

Shelx (22) was used to calculate the heavy atom substructure. Phases were calculated using SOLVE (23), which was followed by electron-density modification using RESOLVE (24). The model was built using COOT (25) and Arp/wArp 6.1 (26, 27) and crystallographic refinement performed with the program REFMAC5 (28). Refinement statistics are summarized in Table 1. The final structure was validated using Procheck (29), WHATIF (30), and Molprobity (31). The secondary structure was assigned using DSSP (32). The Ramachandran plot showed 90.5% of the amino acids in the most favored region and 9.5% in the allowed region. All molecular figures were produced with the program PyMOL (33).


Composition of the Structure

Anthrax-P4H crystallized with two molecules in the asymmetric unit. The N-terminal portions of both molecule A (residues Met1 – Asn10) and B (residues Met1 – Lys11) had poor electron density and could not be modeled. In addition, there is one disordered loop region present in both monomers. This missing region corresponds to residues Arg66–Arg73 for molecule A, and Arg66-Asp74 for molecule B. Thus the final models consist of amino acids Lys11-Ala65 and Asp74-Lys216 for molecule A and Glu12-Ala65 and Val75-Lys216 for molecule B. As molecules A and B are very similar, differing by a core RMSD of only 0.285 Å, only molecule B is discussed in most cases.
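The residue ranges above determine how much of each chain is present in the final model. The sketch below tallies the modeled residues per chain from those ranges. The chain length of 216 is an assumption based on Lys216 being the last modeled residue, and the names `MODELED_SEGMENTS` and `count_modeled` are hypothetical.

```python
# Modeled segments (first residue, last residue), inclusive, from the text.
MODELED_SEGMENTS = {
    "A": [(11, 65), (74, 216)],
    "B": [(12, 65), (75, 216)],
}

# Assumption: Lys216 is the C-terminal residue of the construct.
ASSUMED_CHAIN_LENGTH = 216

def count_modeled(segments):
    """Number of residues covered by a list of inclusive ranges."""
    return sum(last - first + 1 for first, last in segments)

for chain, segments in MODELED_SEGMENTS.items():
    modeled = count_modeled(segments)
    missing = ASSUMED_CHAIN_LENGTH - modeled
    print(f"molecule {chain}: {modeled} modeled, {missing} not modeled")
```

Under that assumption, molecule A has 198 modeled residues and molecule B has 196, so roughly 8 to 9% of each chain is absent from the model.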

Overall Structure

The core structure of anthrax-P4H consists of a double stranded β-helix (DSBH or “jelly roll”) motif (Figure 1), characteristic of the larger family of Fe(II)/αKG-dependent oxygenases (13). This DSBH core is composed of eight β-strands (I-VIII), which typically form the structural basis for positioning a conserved triad of iron-binding residues in the active site. The major sheet of the DSBH motif is composed of β-strands I, VIII, III, and VI, while the minor sheet is made up of β-strands II, VII, IV, and V. In anthrax-P4H the major sheet of this conserved core is extended at the N-terminus by three additional β-strands (β1′, β1, and β2) and at the opposite end by a short two-residue β-strand (β5). Additionally, the DSBH is flanked by three α-helices (α1-α3) and three 3₁₀ helices (α1′-α3′). Three of these helices (α1, α2, α2′) form one side of the DSBH major sheet, as commonly seen among the Fe(II)/αKG-dependent oxygenases (13). Additional helices are located on the solvent-exposed loop between II/III, in a loop created by VI/VII, and prior to β5.

Figure 1
Structure of the anthrax-P4H monomer. The eight β-strands of the DSBH core fold are shown as gold arrows labeled with Roman numerals, with strands I, VIII, III, and VI making up the major sheet while strands V, IV, VII, and II make up the minor ...

Active site of Anthrax-P4H

In αKG/Fe(II)-oxygenases the residues responsible for binding Fe(II) are found within the minor β-sheet of the DSBH fold as the facial triad within the conserved motif H1-X-D/E-Xn-H2 (13). In anthrax-P4H these facial triad residues are His127, Asp129, and His193 (Figure 2A). The H1-X-D of this motif is located at the end of the second strand of the DSBH (II) motif, and the distal histidine (H2) is located at the start of the seventh strand of the DSBH (VII) motif. Positioning of the Asp129 side chain is constrained by hydrogen bonds with Thr145 (2.7 Å) of III and Trp209 (3.2 Å) of VIII. The imidazole ring of His193 is oriented by interactions between the δ nitrogen and two different backbone carbonyls, while the imidazole of His127 is less constrained by interactions with the rest of the protein. Comparison of the anthrax-P4H structure with the Cr-P4H-1 and the αKG/Fe(II)-dependent taurine dioxygenase (TauD) structures reveals that the positions of these three facial triad residues are highly conserved (Figure 2B), as would be expected to preserve their iron-binding function.

Figure 2
Active site of anthrax-P4H. A, Anthrax-P4H active site highlighting facial triad residues (cyan sticks), residues that coordinate the facial triad Asp129 (green sticks), bound glycerol (green sticks), an active site water (cyan sphere), and other important ...

In the current structure, all three facial triad side chains interact with electron density consistent with a well-defined glycerol molecule in the active site (Figure 2A). In addition, one water molecule is present in each active site and interacts with the side chains of Thr159, Lys203, and Tyr124. TauD is the closest structure with Fe and αKG in the active site, and alignment with anthrax-P4H reveals that the glycerol and water molecule locations within the anthrax-P4H active site overlap the regions where Fe and αKG would be expected to bind in the catalytically competent enzyme. The central oxygen of the glycerol molecule occupies essentially the same space as iron in the TauD structure and interacts with all three facial triad residues in nearly the same manner. Another of the glycerol oxygens interacts with both His193 and Asp129 to form a very stable arrangement.

In TauD the 2-oxo and the adjacent carboxylate oxygens both coordinate the active site iron. The latter carboxylate oxygen also interacts with the facial triad Asp and Arg270 (Figure 2B). In the anthrax-P4H active site each of these interactions with αKG is likely to be preserved, except that the residue corresponding to Arg270 is the much shorter Thr207, which does not extend far enough to interact with αKG. Instead, in the anthrax structure, Trp209 occupies this location and is likely to fulfill the role of interacting with αKG (Figure 2A). In the TauD structure the carboxylate at the opposite end of αKG forms a bidentate interaction with Arg266, but also interacts with a water molecule and Thr126. In anthrax-P4H, the active site water occupies the approximate location of the C5 carboxylate and interacts with Tyr124. It also interacts with a conserved Thr (Thr159) and with Lys203, which replaces Arg266. Lys203 is a conserved basic residue corresponding to Lys493 in human type I C-P4H, which is essential in binding αKG at the 5-carboxylate moiety (34).

Putative Peptide Binding Groove

Since P4H enzymes catalyze the hydroxylation of prolines in collagen or collagen-like peptides, understanding the binding of these substrates is critical to understanding protein function. The binding mode of such peptides to P4H enzymes has been poorly understood, but recently a structure of Cr-P4H-1 was determined with bound (Ser-Pro)5 (14). This structure confirmed that a number of residues with previously suspected roles in collagen-like peptide recognition/binding (9) were indeed found in the peptide binding groove including Arg93, Ser95, Glu127, Tyr140, Arg161, and His245 (Cr-P4H-1 numbering). The corresponding residues are strictly conserved in anthrax-P4H (Arg79, Ser81, Glu111, Tyr124, Arg142; anthrax-P4H numbering), suggesting that comparison of these structures can be used to draw inferences about peptide binding in anthrax-P4H.

At the position where peptide binds within a groove of Cr-P4H-1, a distinct groove also extends across the face of anthrax-P4H. In anthrax-P4H this groove extends from Phe85 of β5 to Lys125 of II (Figure 3A). This groove is of sufficient size for a large peptide substrate such as (Gly-Pro-Pro)10 to bind. Viewed in cross section, the groove in the current anthrax-P4H structure appears deeper than the corresponding groove in the algal P4H (Figure 3B vs. 3C), though this may be because the algal enzyme is missing more residues due to poor or missing density (Supplemental Figure 1). The anthrax-P4H groove is flanked by projections consisting of the 3₁₀ helix 3 (α3′) on one side and by the extended sequence between 3₁₀ helix 1 (α1′) and the unmodeled region on the opposite side. The unmodeled region may become ordered upon peptide substrate binding.

Figure 3
Peptide binding. A, Anthrax peptide binding groove in stereo with groove residues shown in yellow over the invagination into the active site with the glycerol shown in cyan sticks. B, Anthrax-P4H groove, side view from the β5 end with groove residues ...

This groove in the anthrax-P4H protein surface is centered above the active site such that an invagination of this surface connects this groove to the active site cavity containing glycerol and the facial triad (Figures 3A and 3D). The short channel between the groove and the active site proper contains a water molecule (Figure 3D). This water molecule is just distal to the facial triad His127 and hydrogen bonds to facial triad residue Asp129 and to Arg142. An overlay of the Cr-P4H-1/(Ser-Pro)5 peptide complex with the anthrax-P4H structure reveals that there is good overall complementarity between the conformation of the peptide and the anthrax-P4H surface (Figure 3D). Additionally, the substrate proline that would be hydroxylated is positioned directly over the active site cleft. Finally, the channel water overlaps with Cγ of the ring of this proline. In the algal-P4H structures an equivalent water is displaced by peptide binding. Additionally, in structures of the Fe(II)/αKG enzymes clavaminate synthase, factor inhibiting hypoxia inducing factor (FIH), and TauD, a similarly located water is displaced by binding of the respective substrates (13). Thus, this water marks the proposed site of the proline that is hydroxylated by anthrax-P4H.

Figure 4
Oligomeric state of anthrax-P4H. A, Overview of interactions between molecule A (green) and molecule B (blue) with the facial triad residues shown as magenta sticks. B, Detailed view of half of the symmetrical network of interactions between molecule ...

Individual amino acids composing the anthrax-P4H groove include Ser81, Phe85, Glu111, His114, Tyr124, Lys125, Tyr128, Arg142, Lys163, Trp209, and Arg211 (Figure 3E). Four of these groove residues are completely conserved in the algal, human, and anthrax P4H proteins: Glu111, Tyr124, Arg142, and Trp209. The position and orientation of the Glu111 side chain is conserved in the anthrax structure compared with the algal structure, but it does not appear positioned to interact directly with the peptide. Instead it hydrogen bonds with both Nε and NH2 of the adjacent conserved Arg142 and may stabilize the Arg142 interactions with peptide. In the anthrax structure, Arg142 interacts with two waters that overlap the peptide position in the algal complex. One of these waters is the water that overlaps the proline ring that is hydroxylated. A third conserved amino acid in the groove, Trp209 (Trp243 in algal), may also play an indirect role in peptide binding by packing against Arg142, in addition to hydrogen bonding with facial triad residue Asp129. The fourth conserved amino acid in the groove is Tyr124 (Tyr140, algal numbering). In the anthrax structure, this Tyr side chain is flipped by ~65° “down” into the active site adjacent to the glycerol. This side chain is flipped “up” toward the peptide substrate position in the algal apo, Zn/inhibitor, and Zn/peptide substrate structures. In the latter case, the side chain hydroxyl interacts with both the amide N of the proline being hydroxylated and the carbonyl of the next residue in the peptide substrate. However, in the SeMet and Zn algal structures, Tyr140 is flipped “out” toward the solvent-exposed surface. Thus the orientation of this residue may play an important role in binding and positioning of cofactors and substrate. This residue may be repositioned in anthrax-P4H when substrate is bound. The remaining (nonconserved) groove residues are generally positioned and oriented similarly in anthrax-P4H and the algal P4H. Since anthrax-P4H and human P4H have much lower KM values for the (Gly-Pro-Pro)10 substrate than Cr-P4H-1, the remaining groove residues were examined to determine whether any are conserved in the human and anthrax enzymes but not in the algal P4H. None of the residues fit this profile.

Oligomerization of Anthrax-P4H

Anthrax-P4H is a dimer in solution (unpublished data) and crystallizes with extensive interactions between molecules A and B in the asymmetric unit. The two molecules pack against each other so that the beta sheet cores form a two-layered extended beta sheet unit across the faces of both molecules (Figure 4A). The short V strand of the DSBH minor sheet and the subsequent coil of one molecule pack up against the longer β1 strand on the major sheet side of the DSBH motif in the opposite molecule. The interface primarily consists of residues Gln31-Glu37 from one molecule (β1 side) interacting with Asn165-Arg171 from the other side (V side) in a symmetrical arrangement that buries 789 Å² per monomer (Figure 4B). Smaller contributions to the interface are made by interactions between residues 19–22 in one monomer and 153–158 in the other monomer.

This dimer interface is stabilized by an extensive network of interactions including van der Waals interactions and hydrogen bonding. There are hydrogen bonds between backbone residues as the two beta sheets pack next to each other as well as a number of interactions mediated by side chains. In half of the symmetric interactions between monomers there are 9 direct hydrogen bonds and 4 additional interactions mediated by two water molecules (Figure 4B), for a total of 26 such interchain interactions in the dimer. The side chains of Asn21, Glu37, Ser167, His169, and Arg171 contribute to one or more of these interchain bonds. None of these side chains are conserved in the monomeric Cr-P4H-1, but all five are conserved or have a similar amino acid at each corresponding position in human P4H (Supplemental Figure 1). The human residues corresponding to anthrax Asn21 and Arg171 retain terminal amines at these positions with Arg and Lys residues, respectively. Anthrax Glu37 is substituted with Asp in human P4H, conserving the carboxylic acid moiety involved in dimerization with Ser167. Anthrax Ser167 is identical in human P4H. Finally, anthrax His169 is also a large nitrogen-containing heterocycle in the human enzyme, Trp. Additionally, the side chain of Ser34 forms an additional water-mediated interaction with itself across the dimer interface, but this residue is not conserved in human P4H.
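The interface totals quoted above follow from the twofold symmetry of the dimer: each half of the interface contributes the same set of interactions and the same buried area. The sketch below simply reproduces that bookkeeping from the reported per-half and per-monomer values; the function name `interface_totals` and the dictionary keys are hypothetical.

```python
def interface_totals(direct_hbonds_per_half, water_mediated_per_half,
                     buried_area_per_monomer):
    """Double the per-half interaction counts and the per-monomer
    buried area to describe the full symmetric dimer interface."""
    per_half = direct_hbonds_per_half + water_mediated_per_half
    return {
        "interactions_per_half": per_half,
        "interactions_total": 2 * per_half,
        "buried_area_total": 2 * buried_area_per_monomer,
    }

# 9 direct hydrogen bonds and 4 water-mediated interactions per half,
# and 789 square Angstrom buried per monomer, as reported in the text.
totals = interface_totals(9, 4, 789)
print(totals)
```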


Overall structure

Though Fe(II)/αKG-dependent oxygenases catalyze a wide range of reactions, all of the known structures contain most components of a characteristic core fold consisting of a double stranded β-helix (9, 13, 35). The structure of the Fe(II)/αKG-dependent oxygenase prolyl 4-hydroxylase from anthrax is also consistent with this core fold, containing a β-sandwich made of major and minor sheets each consisting of 4 anti-parallel β-strands. The second β-strand in the DSBH (II) motif of anthrax-P4H is short but ordered, unlike a number of oxygenases, including the algal P4H-1/Zn structure (PDB 2JIG), where it is missing altogether (13, 36). Thus, all eight β-strands of the DSBH motif are present in the anthrax-P4H structure. Within Fe(II)/αKG-dependent oxygenases, however, there is often significant diversity outside of this core fold.

Comparison of anthrax-P4H with other αKG/Fe oxygenases revealed that the closest structural homolog of anthrax-P4H is the algal P4H from C. reinhardtii, specifically the algal Cr-P4H-1 Zn/peptide structure (PDB 3GZE). These two enzymes have a Z-score of 23.2 using the DALI server and an RMSD of 1.47 Å. By comparison, structural alignment with HIF-P4H-2, which is involved in the hypoxic response, yielded a Z-score of 13.6 and an RMSD of 2.53 Å. Anthrax-P4H and Cr-P4H-1 share some similar structural characteristics beyond the DSBH core fold (Figure 5), including the β2, α1, α2, and α2′ units. Similarly, neither structure has secondary structure after β-strand VIII of the DSBH. In some αKG/Fe(II) oxygenase structures this C-terminus is involved in oligomerization (37, 38), but Cr-P4H-1 is a monomer and the anthrax-P4H dimer involves other interactions discussed below.
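The RMSD values quoted above summarize how far apart equivalent atoms lie after two structures are superposed. The sketch below shows only the final step of that calculation: the root-mean-square deviation over matched coordinate pairs that are assumed to be already aligned and superposed. It does not reproduce the residue matching or superposition performed by the DALI server, and the function name `rmsd` and the toy coordinates are hypothetical.

```python
import math

def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two equal-length lists of
    (x, y, z) coordinates that are already matched and superposed."""
    if len(coords_a) != len(coords_b):
        raise ValueError("coordinate lists must have the same length")
    if not coords_a:
        raise ValueError("at least one coordinate pair is required")
    squared = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        squared += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(squared / len(coords_a))

# Toy example: every atom in the second set is displaced by 1 Angstrom in z.
set_a = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0)]
set_b = [(0.0, 0.0, 1.0), (1.0, 0.0, 1.0), (2.0, 0.0, 1.0)]
print(rmsd(set_a, set_b))
```

Because the value depends on which atoms are paired, RMSDs from different alignment programs or atom selections are not directly comparable.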

Figure 5
Overall comparison of structures of anthrax-P4H (blue ribbons), the Cr-P4H-1/Zn complex (yellow ribbons), the Cr-P4H-1/Zn/inhibitor complex (orange ribbons), and the Cr-P4H-1/Zn/(Ser-Pro)5 complex (red ribbons with peptide in sticks with cyan carbon atoms ...

There are also many differences between the algal and anthrax structures that may contribute to the differences in their function and oligomeric states. The most substantial differences occur either on the face of the protein involved in peptide substrate binding or on the face of the protein involved in dimerization. As part of the dimerization face, anthrax-P4H has an additional β-strand (β1′) at its N-terminus that is not present in the algal structure, and β1 appears to be considerably longer than in the algal structure, although this may be due to lack of electron density at the N-terminus of the algal structure. The internal region that could not be modeled in anthrax-P4H corresponds to a region that also could not be modeled in the algal P4H Zn complex, but which forms two β-strands (β3-β4) in the algal Zn/inhibitor and Zn/peptide complexes. Thus, it is likely that these strands are involved in peptide binding/recognition and may become ordered in anthrax-P4H when a peptide is bound. The loop between II in the minor sheet and III in the major sheet has also been noted to play a role in substrate binding/recognition and differs between the anthrax-P4H and Cr-P4H-1 structures. In the algal structure this loop undergoes a 19 Å conformational change from an “open” conformation to a “closed” one upon binding of Zn(II) and an αKG analog inhibitor. In this region, the anthrax-P4H structure is similar to the “open” algal structure, with the central cavity exposed to solvent. This loop in anthrax-P4H is four residues shorter than in Cr-P4H-1, which may result in less flexibility and conformational change upon binding of the cofactors or substrate. Finally, in anthrax-P4H the loop between IV and V is a tight hairpin turn similar to that seen with some members of the Fe(II)/αKG-dependent oxygenases such as deacetoxycephalosporin C synthase (DAOCS) (37). Cr-P4H-1 has an extended loop at this position, formed by 15 additional residues, some of which form a short 3₁₀ helix (α3′) not present in the anthrax-P4H structure. Conservation of the structures resumes after the loop.

Active Site

One of the roles of the DSBH motif is to provide a rigid scaffold for Fe(II) binding. The canonical facial triad residues involved in Fe(II) binding are conserved among all members of the αKG/Fe(II) oxygenases, including anthrax-P4H. In addition to the H1-X-D-Xn-H2 facial triad involved in Fe(II) binding, the Lys residue (Lys203, anthrax-P4H numbering) required for stabilizing the C5 carboxylate of αKG is also spatially conserved. In the anthrax structure a water occupies the approximate position where the C5 carboxylate is thought to bind, interacting not only with Lys203 and a conserved Thr, but also with Tyr124. In the anthrax structure the Tyr124 side chain is oriented with the hydroxyl directed into the active site. This differs significantly from the orientation of the corresponding Tyr140 in the algal structures. In the Cr-P4H-1/Zn complex the peptide backbone containing Tyr140 is positioned very differently so that this side chain is pointing toward solvent. However, in the algal apo, Zn/inhibitor, and Zn/peptide complexes, the backbone is rearranged similar to that seen in the anthrax structure, although the Tyr140 side chain is rotated by ~90° with respect to anthrax-P4H to interact with the (Ser-Pro)5 peptide backbone (9, 14) instead of projecting into the active site. It is likely that in the catalytic state of anthrax-P4H this side chain is rotated out of the anthrax active site to provide the space for αKG binding and to adopt a similar role in peptide substrate binding at the enzyme/substrate surface interface.

Self-Hydroxylation at the Active Site Leads to Inactivation

In anthrax-P4H and other members of the Fe(II)/αKG oxygenases, the addition of αKG and Fe(II) in the absence of substrate causes inhibition due to self-hydroxylation. The formation of the characteristic chromophore is attributed to the self-hydroxylation of a Phe, Tyr, or Trp residue in the active site (39–41). The alkylated DNA repair protein (AlkB) is self-inactivated by hydroxylation of Trp178. Comparison of the AlkB structure (PDB 2FD8) with the current anthrax-P4H structure suggests that in the anthrax enzyme Phe178 (Figure 2B) may be the target for self-hydroxylation, a hypothesis that is currently being examined experimentally.

Peptide Binding Site

Comparison of the structures of anthrax-P4H and the algal-P4H shows grooves extending from β5 to II in both structures where peptide binds to Cr-P4H-1. Structural alignment of the Cr-P4H-1/(Ser-Pro)5 complex with the anthrax-P4H structure reveals significant complementarity between the peptide and the anthrax groove surface, as well as placing the proline to be hydroxylated in the right position for catalysis.

However, several regions flanking this groove are the primary large-scale structural differences between the two enzymes and may be partially responsible for the very different KM values for the substrate (Gly-Pro-Pro)10. These regions may be involved in packing around large peptide substrates when they are bound. Adjacent to the groove region, the β3-β4 loop is ordered only in the algal structures with Zn and inhibitor or with peptide substrate. In the algal inhibitor complex the β3-β4 unit is extended, and it forms around the peptide when it is present. This region in anthrax-P4H may become ordered when substrate is present, but a number of indicators suggest that the interactions will be substantially different than in the algal enzyme. First, there are four amino acid residue deletions in anthrax-P4H compared to the algal P4H in both the β3-β4 and II-III flexible loop regions. These deletions may make the loops in anthrax-P4H less flexible and shorter. Second, the sequence identity between anthrax-P4H and Cr-P4H-1 in the remaining loop regions is minimal. Third, Koski et al. (14) suggest that two often-conserved sequences are structurally important in this region, but neither is highly conserved in the anthrax enzyme. However, viral P4H (Paramecium bursaria Chlorella virus-1 P4H), HIF-P4H-2 (PDB 2G1M), AlkB (PDB 2FD8), and PAHX (phytanoyl-CoA 2-hydroxylase) (PDB 2A1X) are also missing one or both of these sequences, and where structures are available these enzymes still reveal a topologically similar β3-β4 loop region. Thus, either the loop conformation and not the specific sequence is important for peptide binding, or the specific interactions this region might make with the peptide are different in anthrax-P4H.

However, because both the β3-β4 and II-III loops are in a substrate-free open conformation in the anthrax-P4H structure, there are no observed interactions between the peptide substrate and residues in these loops. A substrate-bound anthrax-P4H structure would be expected to significantly increase our understanding of this interaction.


Members of the αKG/Fe(II)-oxygenase family of enzymes exist in a wide range of oligomeric forms. Structures of proline 3-hydroxylase (PDB 1E5S) and factor inhibiting HIF (PDB 1H2L) reveal dimers that are stabilized through hydrophobic interactions (42, 43). Members of the collagen P4H subfamily also exist in various oligomeric forms. Plant C-P4H from A. thaliana and viral P4H are monomeric, while C-P4Hs from C. elegans are heterodimers (44). Anthrax-P4H is the first C-P4H known to form an α2 homodimer, while human C-P4H is an α2β2 heterotetramer (45–47). Little is known structurally about the α2 dimer interaction of human C-P4H. The structure of a portion of human C-P4H is known (PDB 1TJC) and forms tetratricopeptide repeat domains of the kind found in protein-protein interactions, but in C-P4H these repeats are thought to be involved in peptide substrate binding rather than dimerization (6, 7). Site-directed mutagenesis studies have suggested that the human α2 dimer may be stabilized by two intrachain disulfide bonds (C486-C511 and C276-C293) (48, 49). These disulfide bonds are required for tetramer assembly with PDI (49), and additional interactions such as those identified in the anthrax-P4H α2 dimer may also support interactions between human C-P4H α subunits.

In anthrax-P4H the extensive dimer interface buries almost 790 Å² per monomer. The arrangement packs the double-layered β sheets alongside each other such that each β1 sheet interacts with the DSBH V sheet of the opposite monomer. A network of hydrogen bonding interactions exists between the monomers. Sequence alignment (Supplementary Figure 1) shows that five of the six amino acid side chains that form the anthrax α2 dimer interface are identical or homologous between the anthrax and human C-P4H α subunits, but not in the monomeric Cr-P4H-1. This strongly suggests that these residues are also important in forming the human P4H α2 interactions within the larger α2β2 tetramer.
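
As an illustration of how a per-monomer buried area of this kind is conventionally obtained, the sketch below computes it from the solvent-accessible surface areas (SASA) of the two isolated monomers and of the assembled dimer. This is not necessarily the procedure used for the value quoted above: the function name is hypothetical, and the input areas are placeholder numbers chosen only to make the arithmetic easy to follow, not measurements from the anthrax-P4H structure.

```python
def buried_area_per_monomer(sasa_a, sasa_b, sasa_dimer):
    """Return the interface area buried per monomer, in square angstroms.

    sasa_a, sasa_b: SASA of each monomer calculated in isolation.
    sasa_dimer:     SASA of the assembled dimer.
    The surface lost on complex formation is shared between the two
    chains, so the total is halved to give a per-monomer figure.
    """
    total_buried = sasa_a + sasa_b - sasa_dimer
    if total_buried < 0:
        raise ValueError("dimer SASA exceeds the sum of the monomer SASAs")
    return total_buried / 2.0


# Placeholder inputs (hypothetical, for illustration only).
area = buried_area_per_monomer(10000.0, 10000.0, 18400.0)
print(f"Buried area per monomer: {area:.0f} square angstroms")
```

In practice the three SASA values would come from a surface-area program run on the deposited coordinates with a fixed probe radius, and the result depends on that choice.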

In conclusion, we have obtained the crystal structure of dimeric anthrax-P4H. Although attempts to crystallize the enzyme in the presence of metal, cofactor, or substrate have been unsuccessful to date, the apoenzyme retains the canonical Fe(II) and αKG binding sites, and its structure is consistent with proline peptide substrate binding adjacent to the active site. Additionally, the structure suggests that self-hydroxylation in the absence of substrate may modify Phe178. This enzyme is functionally and structurally similar to the human collagen P4H catalytic subunit, for which structures beyond the putative peptide binding site are not known. Since anthrax-P4H and human-P4H-1 have comparable binding affinity for (Gly-Pro-Pro)10, a structure of the anthrax complex with this substrate may provide additional insight into the details of substrate recognition, binding, and specificity among the different members of the C-P4H family. Finally, sequence and structural comparisons have identified a set of five amino acids proposed to play key roles in dimerization. Design of molecules that inhibit dimerization of human P4H is a potential direction for the treatment of fibrotic diseases.



Crystals were grown and initially screened using the facilities of the Protein Structure Laboratory core facility at The University of Kansas. Portions of this research were carried out at the Stanford Synchrotron Radiation Laboratory, a national user facility operated by Stanford University on behalf of the United States Department of Energy Office of Basic Energy Sciences. The SSRL Structural Molecular Biology Program is supported by the Department of Energy, Office of Biological and Environmental Research, and by the National Institutes of Health, National Center for Research Resources, Biomedical Technology Program, and the National Institute of General Medical Sciences.

The abbreviations used are: P4H, prolyl 4-hydroxylase; Cr-P4H-1, P4H from Chlamydomonas reinhardtii; DSBH, double stranded beta helix.


The atomic coordinates and structure factors (code 3ITQ) have been deposited in the Protein Data Bank, Research Collaboratory for Structural Bioinformatics, Rutgers University, New Brunswick, NJ (

This work was supported, in whole or in part, by National Institutes of Health Grants GM079446 (JL), 5P20 RR17708 (COBRE Center in Protein Structure and Function) (JL), and T2 GM 08454 (MAC).


An alignment of the anthrax, human, and algal prolyl 4-hydroxylase amino acid sequences is provided as supplemental material. This alignment is annotated with conserved amino acids, secondary structure, important functional and structural features, and missing segments from the various structures available. This material is available free of charge via the Internet at


1. Hieta R, Myllyharju J. Cloning and characterization of a low molecular weight prolyl 4-hydroxylase from Arabidopsis thaliana. J Biol Chem. 2002;277:23965–23971. [PubMed]
2. Tiainen P, Myllyharju J, Koivunen P. Characterization of a second Arabidopsis thaliana prolyl 4-hydroxylase with distinct substrate specificity. J Biol Chem. 2005;280:1142–1148. [PubMed]
3. Myllyharju J. Prolyl 4-hydroxylases, key enzymes in the synthesis of collagens and regulation of the response to hypoxia, and their roles as treatment targets. Ann Med Apr. 2008;23:1–16. [PubMed]
4. Myllyharju J, Kivirikko KI. Collagens, modifying enzymes and their mutations in humans, flies, and worms. Trends Genet. 2004;20:33–43. [PubMed]
5. Hieta R, Kukkola L, Permi P, Pirila P, Kivirikko KI, Kilpelainen I, Myllyharju J. The peptide-substrate-binding domain of human collagen prolyl 4-hydroxylases - backbone assignments, secondary structure, and binding of proline-rich peptides. J Biol Chem. 2003;278:34966–34974. [PubMed]
6. Myllyharju J, Kivirikko KI. Identification of a novel proline-rich peptide-binding domain in prolyl 4-hydroxylase. EMBO J. 1999;18:306–312. [PubMed]
7. Pekkala M, Hieta R, Bergmann U, Kivirikko KI, Wierenga RK, Myllyharju J. The peptide-substrate-binding domain of collagen prolyl 4-hydroxylases is a tetratricopeptide repeat domain with functional aromatic residues. J Biol Chem. 2004;279:52255–52261. [PubMed]
8. McDonough MA, Li V, Flashman E, Chowdhury R, Mohr C, Lienard BMR, Zondlo J, Oldham NJ, Clifton IJ, Lewis J, McNeill LA, Kurzeja RJM, Hewitson KS, Yang E, Jordan S, Syed RS, Schofield CJ. Cellular oxygen sensing: Crystal structure of hypoxia-inducible factor prolyl hydroxylase (PHD2). Proc Natl Acad Sci USA. 2006;103:9814–9819. [PubMed]
9. Koski MK, Hieta R, Bollner C, Kivirikko K, Myllyharju J, Wierenga RK. The active site of an algal prolyl 4-hydroxylase has a large structural plasticity. J Biol Chem. 2007;282:37112–37123. [PubMed]
10. Hausinger RP. Fe(II)/α-ketoglutarate-dependent hydroxylases and related enzymes. Crit Rev Biochem Mol Biol. 2004;39:21–68. [PubMed]
11. Rhoades RE, Udenfriend S. Decarboxylation of alpha-ketoglutarate coupled to collagen proline hydroxylase. Proc Natl Acad Sci USA. 1968;60:1473–1478. [PubMed]
12. Kivirikko KI, Prockop DJ. Hydroxylation of proline in synthetic polypeptides with purified protocollagen hydroxylase. J Biol Chem. 1967;242:4007–4012. [PubMed]
13. Clifton IJ, McDonough MA, Ehrismann D, Kershaw NJ, Granatino N, Schofield CJ. Structural studies on 2-oxoglutarate oxygenases and related double-stranded β-helix fold proteins. J Inorg Biochem. 2006;100:644–669. [PubMed]
14. Koski MK, Hieta R, Hirsila M, Ronka A, Myllyharju J, Wierenga RK. The crystal structure of an algal prolyl 4-hydroxylase complexed with a proline-rich peptide reveals a novel buried tripeptide binding motif. J Biol Chem. 2009;284:25290–25301. [PMC free article] [PubMed]
15. Sylvestre P, Couture-Tosi E, Mock M. A collagen-like surface glycoprotein is a component of the Bacillus anthracis exosporium. Mol Micro. 2002;45:169–178. [PubMed]
16. Boydston JA, Chen P, Steichen CT, Turnbough CL. Orientation within the exosporium and structural stability of the collagen-like glycoprotein BclA of Bacillus anthracis. J Bacteriol. 2005;187:5310–5317. [PMC free article] [PubMed]
17. Keskiaho K, Hieta R, Sormunen R, Myllyharju J. Chlamydomonas reinhardtii has multiple prolyl 4-hydroxylases, one of which is essential for proper cell wall assembly. The Plant Cell. 2007;19:256–269. [PubMed]
18. Miller MA, Scott EE, Limburg J. Expression, purification, crystallization and preliminary X-ray studies of a prolyl-4-hydroxylase protein from Bacillus anthracis. Acta Cryst F. 2008;64:788–791. [PMC free article] [PubMed]
19. Hendrickson WA. Determination of macromolecular structures from anomalous diffraction of synchrotron radiation. Science. 1991;254:51–58. [PubMed]
20. Leslie AG. Integration of macromolecular diffraction data. Acta Cryst D. 1999;55:1696–1702. [PubMed]
21. Collaborative Computational Project, Number 4. The CCP4 suite: Programs for protein crystallography. Acta Cryst D. 1994;50:760–763. [PubMed]
22. Sheldrick GM. A short history of SHELX. Acta Cryst A. 2008;64:112–122. [PubMed]
23. Terwilliger TC, Berendzen J. Automated MAD and MIR structure solution. Acta Cryst D. 1999;55:849–861. [PMC free article] [PubMed]
24. Terwilliger TC. Maximum likelihood density modification. Acta Cryst D. 2000;56:965–972. [PMC free article] [PubMed]
25. Emsley P, Cowtan K. Coot: Model-building tools for molecular graphics. Acta Cryst D. 2004;60:2126–2132. [PubMed]
26. Perrakis A, Sixma TK, Wilson KS, Lamzin VS. wARP: Improvement and extension of crystallographic phases by weighted averaging of multiple refined dummy atomic models. Acta Cryst D. 1997;53:448–455. [PubMed]
27. Perrakis A, Morris RM, Lamzin VS. Automated protein model building combined with iterative structure refinement. Nat Struct Biol. 1999;6:458–463. [PubMed]
28. Murshudov GN, Vagin AA, Dodson EJ. Refinement of macromolecular structures by the maximum-likelihood method. Acta Cryst D. 1997;53:240–255. [PubMed]
29. Laskowski RA, MacArthur MW, Moss DS, Thornton JM. PROCHECK: A program to check the stereochemical quality of protein structures. J Appl Cryst. 1993;26:283–291.
30. Vriend G. WHAT IF: A molecular modeling and drug design program. J Mol Graph. 1990;8:52–56. [PubMed]
31. Lovell SC, Davis IW, Arendall WB III, de Bakker PIW, Word JM, Prisant MG, Richardson JS, Richardson DC. Structure validation by Cα geometry: φ, ψ, and Cβ deviation. Proteins. 2003;50:437–450. [PubMed]
32. Kabsch W, Sander C. Dictionary of protein secondary structure: Pattern recognition of hydrogen-bonded and geometrical features. Biopolymers. 1983;22:2577–2637. [PubMed]
33. DeLano WL. The PyMOL molecular graphics system. DeLano Scientific; San Carlos, CA, USA: 2002.
34. Myllyharju J, Kivirikko KI. Characterization of the iron- and 2-oxoglutarate-binding sites of human prolyl 4-hydroxylase. EMBO J. 1997;16:1173–1180. [PubMed]
35. You Z, Omura S, Ikeda H, Cane DE, Jogl G. Crystal structure of the non-heme iron dioxygenase PtlH in pentalenolactone biosynthesis. J Biol Chem. 2007;282:36552–36560. [PMC free article] [PubMed]
36. McDonough MA, Kavanagh KL, Butler D, Serals T, Oppermann U, Schofield CJ. J Biol Chem. 2005;280:41101–41110. [PubMed]
37. Valegard K, van Scheltinga ACT, Lloyd MD, Hara T, Ramaswamy S, Perrakis A, Thompson A, Lee HJ, Baldwin JE, Schofield CJ, Hajdu J, Andersson I. Structure of a cephalosporin synthase. Nature. 1998;394:805–809. [PubMed]
38. Zhang Z, Ren JS, Clifton IJ, Schofield CJ. Chem Biol. 2004;11:1383–1394. [PubMed]
39. Henshaw TF, Feig M, Hausinger RP. Aberrant activity of the DNA repair enzyme AlkB. J Inorg Biochem. 2004;98:856–861. [PubMed]
40. Ryle MJ, Liu A, Muthukumaran RB, Ho RYN, Koehntop KD, McCracken J, Que L Jr, Hausinger RP. O2- and alpha-ketoglutarate-dependent tyrosyl radical formation in TauD, an alpha-keto acid-dependent non-heme iron dioxygenase. Biochemistry. 2003;42:1854–1862. [PubMed]
41. Lindstedt S, Rundgren M. Blue color, metal content, and substrate binding in 4-hydroxyphenylpyruvate dioxygenase from Pseudomonas sp. strain P.J. 874. J Biol Chem. 1982;257:11922–11931. [PubMed]
42. Clifton IJ, Hsueh LC, Baldwin JE, Harlos K, Schofield CJ. Structure of proline 3-hydroxylase. Eur J Biochem. 2001;268:6625–6636. [PubMed]
43. Lee C, Kim SJ, Jeong DG, Lee SM, Ryu SE. Structure of human FIH-1 reveals a unique active site pocket and interaction sites for HIF-1 and von Hippel-Lindau. J Biol Chem. 2003;278:7558–7563. [PubMed]
44. Myllyharju J. Prolyl 4-hydroxylases, the key enzymes of collagen biosynthesis. Matrix Biol. 2003;22:15–24. [PubMed]
45. Kivirikko KI, Myllyharju J. Prolyl 4-hydroxylases and their protein disulfide isomerase subunit. Matrix Biol. 1998;16:357–368. [PubMed]
46. Pihlajaniemi T, Helaakoski T, Tasanen K, Myllyla R, Huhtala ML, Koivu J, Kivirikko KI. Molecular cloning of the beta-subunit of human prolyl 4-hydroxylase. This subunit and protein disulfide isomerase are products of the same gene. EMBO J. 1987;6:643–649. [PubMed]
47. Vuori K, Pihlajaniemi T, Marttila M, Kivirikko KI. Characterization of the human prolyl 4-hydroxylase tetramer and its multifunctional protein disulfide-isomerase subunit synthesized in a baculovirus expression system. Proc Natl Acad Sci USA. 1992;89:7467–7470. [PubMed]
48. John DCA, Bulleid NJ. Prolyl 4-hydroxylase-defective assembly of alpha-subunit mutants indicates that assembled alpha-subunits are intramolecularly disulfide-bonded. Biochemistry. 1994;33:14018–14025. [PubMed]
49. Lamberg A, Pihlajaniemi T, Kivirikko KI. Site-directed mutagenesis of the alpha subunit of human prolyl 4-hydroxylase. J Biol Chem. 1995;270:9926–9931. [PubMed]