IPCT crystallized in space group P21 (a = 154.9, b = 83.9, and c = 127.9 Å) with 6 molecules in the asymmetric unit using citrate as precipitant at pH 8.2 (Matthews coefficient [Vm] is 2.6 Å³, and solvent content is ~53%) (24). The structure was solved by SIRAS based on an Hg derivative and refined to an R-factor of 23.3% and R-free of 27.0% at a 2.4-Å resolution. The final IPCT model comprises 1,183 amino acid residues, 204 water molecules, and 4 citrate molecules. Data collection and refinement statistics are shown in the accompanying table. No clear electron density was visible at either terminus (16 to 19 amino acid residues in the N terminus and 10 to 12 residues in the C terminus, depending on the chain, were not built in the model) or for a flexible loop composed of residues 27 to 37. We attempted to soak or cocrystallize IPCT with its substrates, but no well-defined electron density was observed in the calculated maps for either inositol-1P or CTP near the active site; instead, a citrate molecule was observed.
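The relation between the Matthews coefficient and the solvent content quoted above can be checked with a short calculation. The sketch below is illustrative only: it assumes the conventional constant of 1.23 Å³/Da (derived from a typical protein partial specific volume of about 0.74 cm³/g), which is not stated in the text, and the function name is hypothetical.

```python
# Solvent content estimated from the Matthews coefficient.
# Assumption (not from the text): the conventional constant 1.23 A^3/Da,
# which corresponds to a protein partial specific volume of ~0.74 cm^3/g.
PROTEIN_VOLUME_PER_DALTON = 1.23  # A^3 occupied by protein per Da


def solvent_fraction(vm):
    """Return the solvent fraction for a Matthews coefficient given in A^3/Da."""
    if vm <= 0:
        raise ValueError("Matthews coefficient must be positive")
    return 1.0 - PROTEIN_VOLUME_PER_DALTON / vm


# Vm of 2.6 as reported for the citrate crystal form.
print(f"{solvent_fraction(2.6):.1%}")  # -> 52.7%
```

Under this assumption a Vm of 2.6 gives a solvent fraction of about 0.53, in line with the ~53% reported.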
Data collection and refinement statistics
Fig. 3. (a) Cartoon representation of IPCT monomer. The color scheme follows the topology diagram shown in panel b. The active site loop is shown as a thicker purple loop here and in subsequent figures, and citrate is drawn as sticks (carbon is yellow and oxygen red).
We then searched for another crystallization condition without citrate and optimized one containing malonate as the precipitant at pH ~7. Under this condition, IPCT crystals belonged to space group P2, with 12 molecules in the asymmetric unit, corresponding to a Vm value of 2.5 Å³ and a solvent content of ~50% (24). Soaking and cocrystallization experiments were also unsuccessful under this condition, since no clear electron density for CTP and/or inositol-1P was visible in the active site. The final model comprises 2,419 amino acid residues, 1,171 water molecules, and 8 glycerol molecules and was refined to a 1.9-Å resolution with an R-factor of 20.6% (R-free of 23.7%). The electron density maps are clearly defined, apart from the N and C termini of all subunits (as in citrate-IPCT, chains start at residues 16 to 19 and end at residues 220 to 222 out of 232) and a loop. This loop (residues 27 to 37) is disordered in most chains and contains gaps in the final model, except in chains E and F. Interestingly, these two loops are quite close to each other in the crystallographic dimer (e.g., the distance between the Cα atoms of Gly34E and Gly33F is only 6.6 Å), similar to the other pairs: A-B, C-D, and so on. The final models of citrate- and malonate-IPCT are almost identical, yielding root mean square deviation (RMSD) values between 0.24 and 0.69 Å for Cα superposition of the different chains (14). Subsequent descriptions will refer to chains E and/or F of the malonate-IPCT structure, unless otherwise stated.
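As a reminder of what the Cα distance and RMSD figures measure, the following minimal sketch computes both quantities from atomic coordinates. It is not the procedure used in the paper: it assumes the two chains have already been optimally superposed and that atoms are matched pairwise, and all coordinates are hypothetical values chosen for illustration.

```python
import math


def ca_distance(atom_a, atom_b):
    """Euclidean distance (in A) between two atoms given as (x, y, z) tuples."""
    return math.dist(atom_a, atom_b)


def rmsd(coords_a, coords_b):
    """RMSD over matched atom pairs; assumes the sets are already superposed."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    squared = sum(math.dist(p, q) ** 2 for p, q in zip(coords_a, coords_b))
    return math.sqrt(squared / len(coords_a))


# Hypothetical Calpha coordinates for three matched residues of two chains.
chain_x = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0)]
chain_y = [(0.0, 0.0, 0.3), (3.8, 0.4, 0.0), (7.6, 0.0, 0.0)]

print(round(rmsd(chain_x, chain_y), 2))  # -> 0.29
```

In practice the superposition step (for example a least-squares fit) precedes this calculation and is handled by dedicated structure-comparison software.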
The Ramachandran plot, as assessed with RAMPAGE (23), shows that all nonglycine amino acid residues lie within allowed regions, with the exception of Arg91A and Asp109E for citrate-IPCT. PROCHECK (22) analysis showed no bad contacts and a final model with good geometry and stereochemistry.
Overall fold of IPCT and related structures.
Each monomer of IPCT has overall dimensions of ~45 Å by 45 Å by 40 Å and is organized into two domains: a core domain and a sugar-binding domain. The core domain (residues 16 to 135 and 173 to 222) consists of a central seven-stranded mixed β-sheet (β1), with order 3214657, where β6 is antiparallel to the rest, surrounded by six helices, a fold reminiscent of the dinucleotide-binding Rossmann fold (28). At one of its ends, the central β-sheet is topped by a two-stranded β-sheet (β5a) that participates in a 30-residue-long stretch connecting strands β5. One face of the central β-sheet is packed against helices α1, and α4, and the other face, which binds the nucleotide, stacks against helices α3 (this helix being typical of nucleotide-binding proteins). The sugar-binding domain (residues 136 to 172) comprises a short antiparallel three-stranded β-sheet, which sits against the exposed face of the central β-sheet (where the nucleotide binds), with several conserved residues involved in recognition of the sugar-phosphate moiety (e.g., Leu24, Gly174, and Trp216; IPCT numbering). A topology diagram drawn with the program TOPS is presented in Fig. 3b (35).
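The strand order notation used for the central β-sheet can be unpacked with a few lines of code. This is a sketch under stated assumptions: the order string 3214657 is read as the sequence of strand numbers encountered when moving across the sheet from one edge to the other, the "up"/"down" labels are arbitrary names for the two strand directions, and both function names are hypothetical.

```python
def sheet_layout(order, antiparallel):
    """Return (strand number, direction) pairs in spatial order across the sheet."""
    layout = []
    for label in order:
        strand = int(label)
        # Strands listed as antiparallel run opposite to all the others.
        direction = "down" if strand in antiparallel else "up"
        layout.append((strand, direction))
    return layout


def spatial_neighbours(order, strand):
    """Return the strands lying directly beside `strand` in the sheet."""
    strands = [int(label) for label in order]
    i = strands.index(strand)
    return [strands[j] for j in (i - 1, i + 1) if 0 <= j < len(strands)]


# Order and antiparallel strand as given in the text for the IPCT core domain.
print(sheet_layout("3214657", antiparallel={6}))
print(spatial_neighbours("3214657", 6))  # -> [4, 5]
```

Read this way, strand 6 sits between strands 4 and 5 and is the only one running in the opposite direction, which is what makes the sheet mixed rather than fully parallel.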
The citrate-IPCT crystallographic structure contains 6 molecules in the asymmetric unit, whereas the malonate one has 12 molecules. Despite the different types of packing in the two crystal forms, the dimeric arrangements between two molecules (e.g., chains A-B, C-D, and E-F) are quite similar ( and d). According to PISA (18), a dissociation energy (ΔGdiss) of 6.4 kcal·mol⁻¹ is predicted for this dimeric assembly in solution. The interface is mainly hydrophilic, with an area around 1,110 Å² corresponding to ca. 11% of the total solvent-accessible area of each monomer. Most residues that contribute to the interface stabilization are located in loops between β1 and at the C-terminal loop. The interface is stabilized by 14 H-bonds, 2 of which are salt bridges. Several hydrogen bonds are established between main chain atoms of both chains, namely, between Arg41-Asp220, Gly43-Asp218, and Asp144-Gly43. Additional stabilization is achieved by side chain-main chain interactions between Arg50-Phe142, Thr221-Leu39, and Asp218-Gly44. Moreover, Arg41E/F (NH2) is H-bonded to Glu147F/E (OE2). Molecular mass determinations by gel filtration indicate that there is a mixture of monomers and dimers in solution, with the monomeric form prevailing. At this stage, we think that the relevance of the interface interactions in the crystal structure for the physiological assembly of the enzyme cannot be fully assessed before the structure of the whole IPCT/DIPPS bifunctional protein is characterized.
Comparison of one molecule of IPCT (monomer) with other structures in the Protein Data Bank (PDB) by using the DALI server (14) reveals that the highest matches are found with the N-terminal domain of the bifunctional N-acetylglucosamine-1-phosphate uridylyltransferases (GlmU) from different sources (PDB codes 1G97, and 2V0I) and with glucose-1-phosphate thymidylyl/uridylyl-transferases (PDB codes 2PA4, and 2UX8), showing Z scores between 20 and ~18 and RMSDs ranging from 2.5 to 2.8 Å for ~190 superimposed Cα atoms. Other cytidylyltransferases (CTP:phosphocholine-, 3-deoxy-manno-octulosonate-, and α-D-glucose-1-phosphate-), with PDB codes 1JYL, and 1TZF, respectively, have Z values from 17.7 down to ~14.5. Interestingly, GlmU is also a bifunctional enzyme, containing a homologous nucleotidyltransferase domain at the N terminus but with an acetyltransferase domain at the C terminus, which adopts a left-handed parallel β-helix (LβH) structure. The biological assembly of the bifunctional GlmU is a trimer, whereas most other nucleotidyltransferases are dimers or tetramers. These structurally related enzymes share between 15% and 22% sequence identity with IPCT and show conservation in the core region, consisting of a mixed β-sheet of seven strands with the same order and orientation as IPCT, a fold typical of nucleotide-diphospho-sugar transferases (SCOP superfamily 53448).
Interestingly, the 32 best hits retrieved in a BLAST search using the sequence of A. fulgidus IPCT belong to hyperthermophilic or thermophilic organisms, probably reflecting the specialized function of IPCT in DIP synthesis (see Fig. S1 in the supplemental material); the best match relates to proteins of Thermococcus spp. that display sequence identities between 49% and 56%. As expected, the protein from A. fulgidus appears in a cluster dominated by members of the Euryarchaeota; curiously, three representatives of the domain Bacteria also cluster in this group, which is clearly separated from a second major group comprising bacteria of the order Thermotogales. The closest mesophilic homologues share ca. 30% sequence identity with A. fulgidus IPCT and belong to the bacterial genera Gluconacetobacter and Granulibacter. Whether these mesophilic counterparts are able to use inositol-1P and CTP remains elusive.
The active site.
A citrate molecule is observed in the active site pocket in three out of six molecules of IPCT crystallized using citrate as precipitant (chains A, D, and E). The citrate forms hydrogen bonds with the main-chain nitrogen atoms of Gly27, Leu28, Lys37, and Arg68. Its presence in only some chains might be related to the position of the disordered loop comprising Gly27-Lys37. This loop is not visible in most chains of both citrate- and malonate-IPCT structures, and, when visible, it adopts different conformations. This loop encompasses the signature sequence Gly-X-Gly-Thr-(Arg/Ser)-X4-Pro-Lys of nucleotidyltransferases, and in IPCT, it starts at Gly27 and has the sequence Gly-Leu-Gly-Thr-Arg-Leu-Gly-Gly-Val-Pro-Lys. In the apo-RmlA structure (PDB code 1FZW), this loop is also disordered but becomes traceable in the thymidine-containing complexes (PDB codes 1G2V
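The nucleotidyltransferase signature sequence quoted above, Gly-X-Gly-Thr-(Arg/Ser)-X4-Pro-Lys, translates directly into a pattern over one-letter amino acid codes. The sketch below is illustrative: the regular expression and the helper name are assumptions introduced here, and the only sequence used is the 11-residue IPCT loop given in the text.

```python
import re

# Signature Gly-X-Gly-Thr-(Arg/Ser)-X4-Pro-Lys in one-letter code,
# with X standing for any residue.
SIGNATURE = re.compile(r"G[A-Z]GT[RS][A-Z]{4}PK")

# IPCT loop from the text: Gly-Leu-Gly-Thr-Arg-Leu-Gly-Gly-Val-Pro-Lys.
IPCT_LOOP = "GLGTRLGGVPK"
IPCT_LOOP_START = 27  # the loop starts at Gly27 (IPCT numbering)


def find_signature(sequence, first_residue=1):
    """Return (residue number of motif start, matched segment) for each hit."""
    return [(m.start() + first_residue, m.group())
            for m in SIGNATURE.finditer(sequence)]


print(find_signature(IPCT_LOOP, first_residue=IPCT_LOOP_START))
# -> [(27, 'GLGTRLGGVPK')]
```

The match spans 11 residues, so a motif starting at Gly27 ends at Lys37, consistent with the loop boundaries given for IPCT.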
Attempts to obtain structures of IPCT in complex with substrates CTP, inositol-1P, and CTP:inositol-1P, by either cocrystallization or soaking experiments, did not succeed in either crystal form. However, due to the core fold conservation among sugar nucleotidyltransferase structures, and based on the DALI results, we were able to identify the catalytic region of IPCT. The structure most similar to IPCT, besides GlmU, is glucose-1-phosphate thymidylyltransferase (RmlA) from Pseudomonas aeruginosa (RMSD of 2.57 Å for 188 Cα atoms). RmlA is involved in the synthesis of dTDP-L-rhamnose, an important component in the cell wall of many microorganisms. Superposition of various structures of RmlA apo forms and complexes (dTTP, glucose-1-phosphate, and dTDP-D-glucose) shows no relevant structural changes upon ligand binding, as also observed with other nucleotidyltransferase structures. We identified the relevant catalytic residues of IPCT based on various structures of RmlA. Moreover, this structural analysis was extended to include other homologous nucleotidyltransferases, such as CTP:phosphocholine cytidylyltransferase from Streptococcus pneumoniae (PDB code 1JYL) and α-D-glucose-1-phosphate cytidylyltransferase from Salmonella enterica serovar Typhi (PDB code 1TZF), leading to similar results. For the sake of simplicity, we will refer to the comparison with RmlA throughout the article.
Fig. 4. (a) Superposition of IPCT (cartoon in blue) with several structures of RmlA: apo form (PDB code 1FZW; ribbon in dark green), and in complex with different ligands (PDB codes 1FXO, 1G0R, 1G2V, 1G1L, 1G23, and 1G3L; ribbons in different shades of green).
The active site of IPCT is thus located in a pocket formed by the sugar- and nucleotide-binding domains of the enzyme. The pocket entrance comprises a series of polar or charged residues: Lys37, Arg68, Glu93, Lys163, Glu194, Ser198, and Asp220. The cavity is formed by residues from strands β1, β2, β4, and β6 (namely Leu24, Ala25, Val65, Ala66, Met115-His118, Asp172, Gly174, and Phe176); helices α3 and α6 (residues Gly95, Asn96, and Val201); and residues Pro38, Ser99, and Trp216.
The binding sites of nucleotides are highly conserved, comprising a Rossmann fold motif (αβαβα). In IPCT, the CTP should be nested in a groove formed by the N termini of helices α2 and the C termini of strands β1, and β4. Based on the structure of RmlA in complex with dTTP (PDB code 1G2V), we have fitted a CTP molecule into the active site of IPCT (Fig. 5a). In RmlA, several conserved amino acid residues, namely, Gly10, Thr14, Arg15, and Lys25 (RmlA numbering), are involved in the stabilization of the nucleotide. In IPCT, the phosphate moiety may be stabilized by Thr30, Arg31, and Lys37 (corresponding to Thr14, Arg15, and Lys25 in RmlA), of which Thr30 and Arg31 are located in the disordered loop. Furthermore, Asp117 and Ala26 (Asp110 and Gly10 in RmlA) may establish several H-bonds to the ribose and pyrimidine ring, respectively. Gln82, a catalytically important residue in RmlA (replaced by His89 in IPCT), along with Gly87 (Gly95 in IPCT) and Gly10 (Ala26 in IPCT), provides specificity for thymidine, while a tight loop formed by Gln82 to Gly87 is responsible for the specificity of pyrimidine over purine bases (3).
Fig. 5. Structural superposition in stereo view of IPCT fitted with CTP and RmlA with dTTP (1G2V) (a), IPCT fitted with inositol-1-P and RmlA with G-1-P (1G23) (b), and IPCT structure fitted with CDP-inositol and RmlA with dTDP-D-glucose (1G1L) (c).
IPCT is highly specific for CTP, as no activity was detected with UTP, ATP, or GTP as nucleotide donors (31). The structural basis for this substrate preference becomes apparent from our CTP-fitted model: the NH2 amino group of cytidine is able to form an H-bond with the main chain oxygen of Pro92, which is not possible with UTP (NH2 replaced by a carbonyl group). By analogy with RmlA (2), the tight loop (His89-Gly95 in IPCT) is proposed to be responsible for the specificity of cytosine over purines.
Inositol-1P was docked into IPCT by superposition with the structure of RmlA bound to glucose-1-phosphate (PDB code 1G23), where the ligand is stabilized by several hydrogen bonds and van der Waals interactions (3). The same type of interactions is expected in the inositol-1P-fitted structure of IPCT, since most of the residues involved in substrate binding are conserved among nucleotidyltransferases. The side chains of Thr149, His118, and Asp172 (Tyr145, Asn111, and Val172 in RmlA) can establish H-bonds with the inositol ring, and those of Asn96 and Lys163 (Leu88 and Lys162 in RmlA) with the phosphate moiety. Additionally, several hydrophobic residues are present in the surroundings of the ligand, such as Met115, Gly174, Leu197, and Trp216 in IPCT (Leu108, Gly174, Ile199, and Trp223 in RmlA). The structure of IPCT fitted with CDP-inositol (Fig. 5c) reveals protein interactions quite similar to those found in the inositol-1P- and CTP-docked models and hence will not be discussed further.
Mg²⁺ is known to be essential for the catalytic reaction of nucleotidyltransferases, but its role is not fully understood. Blankenfeldt et al. (3) proposed that this cation could coordinate the β- and γ-phosphates of dTTP in a chelated manner, leaving the active site as a complex with pyrophosphate. However, several structures, such as glucose-1-phosphate uridylyltransferase from E. coli and glucose-1-phosphate cytidylyltransferase from Salmonella enterica serovar Typhi (17), show the magnesium bound to the final product of each enzyme, specifically to the α- and β-phosphoryl oxygens of the NDP-sugar product. A more direct role of Mg²⁺ could be to accelerate catalysis by positioning the phosphate oxygen of the sugar-phosphate adjacent to the α-phosphate of CTP to align the incoming second substrate for in-line nucleophilic attack, as suggested for CTP:phosphocholine cytidylyltransferase of Streptococcus pneumoniae.
The first 3D structure of CTP:inositol-1-phosphate cytidylyltransferase (IPCT) revealed an overall architecture similar to the dinucleotide-binding Rossmann fold and a fairly good match with the structures of glucose-1-phosphate thymidylyl/uridylyl-transferases. It is surprising that inositol-1P, a novel substrate for nucleotidyltransferases, does not imply a distinctive structural layout for catalysis. On the other hand, it is curious that IPCT is structurally less related to known cytidylyltransferases, namely, those recognizing other polyol-phosphates, such as glycerol-, ribitol-, or methylerythritol-phosphate.
A BLAST search in the genome databases with the A. fulgidus IPCT sequence clearly shows a strong correlation of the closest matches with thermophilic or hyperthermophilic host organisms. This feature underlies the commitment of CDP-inositol to the synthesis of DIP, a compatible solute thus far exclusive to (hyper)thermophiles. This apparently specialized role of CDP-inositol is an intriguing finding that challenges the usual, multitasking character of most metabolites.