Enzymes synthesized by hyperthermophiles (bacteria and archaea with optimal growth temperatures of >80°C), also called hyperthermophilic enzymes, are typically thermostable (i.e., resistant to irreversible inactivation at high temperatures) and are optimally active at high temperatures. These enzymes share the same catalytic mechanisms with their mesophilic counterparts. When cloned and expressed in mesophilic hosts, hyperthermophilic enzymes usually retain their thermal properties, indicating that these properties are genetically encoded. Sequence alignments, amino acid content comparisons, crystal structure comparisons, and mutagenesis experiments indicate that hyperthermophilic enzymes are, indeed, very similar to their mesophilic homologues. No single mechanism is responsible for the remarkable stability of hyperthermophilic enzymes. Increased thermostability must be found, instead, in a small number of highly specific alterations that often do not obey any obvious traffic rules. After briefly discussing the diversity of hyperthermophilic organisms, this review concentrates on the remarkable thermostability of their enzymes. The biochemical and molecular properties of hyperthermophilic enzymes are described. Mechanisms responsible for protein inactivation are reviewed. The molecular mechanisms involved in protein thermostabilization are discussed, including ion pairs, hydrogen bonds, hydrophobic interactions, disulfide bridges, packing, decrease of the entropy of unfolding, and intersubunit interactions. Finally, current uses and potential applications of thermophilic and hyperthermophilic enzymes as research reagents and as catalysts for industrial processes are described.
Hyperthermophiles grow optimally at temperatures between 80 and 110°C. Only represented by bacterial and archaeal species, these organisms have been isolated from all types of terrestrial and marine hot environments, including natural and man-made environments. Enzymes from these organisms (or hyperthermophilic enzymes) developed unique structure-function properties of high thermostability and optimal activity at temperatures above 70°C. Some of these enzymes are active at temperatures as high as 110°C and above (349). Thermophilic organisms grow optimally between 50 and 80°C. Their enzymes (thermophilic enzymes) show thermostability properties which fall between those of hyperthermophilic and mesophilic enzymes. These thermophilic enzymes are usually optimally active between 60 and 80°C. Active at high temperatures, thermophilic and hyperthermophilic enzymes typically do not function well below 40°C.
Current theory and circumstantial evidence suggest that hyperthermophiles were the first life-forms to have arisen on Earth (318). Hyperthermophilic enzymes can therefore serve as model systems for use by biologists, chemists, and physicists interested in understanding enzyme evolution, molecular mechanisms for protein thermostability, and the upper temperature limit for enzyme function. This knowledge can lead to the development of new and/or more efficient protein engineering strategies and a wide range of biotechnological applications.
This review will encompass the sources and uses of thermophilic and hyperthermophilic enzymes, as well as the molecular determinants for protein stability. Emphasis will be placed on hyperthermophilic enzymes, because most current research is focused on these enzymes and on hyperthermophiles. What is the upper temperature for life? Back in 1969, when T. D. Brock and colleagues discovered Thermus aquaticus—now known for its Taq polymerase in PCR techniques—T. aquaticus was considered an extreme thermophile since it grew optimally at 75°C (41). Today, of course, hyperthermophiles such as Pyrolobus fumarii, which grows at up to 113°C (28), are considered extreme.
Thermophilic and hyperthermophilic enzymes (also called thermozymes [see reference 349]) are part of another enzyme category called extremozymes, which evolved in extremophiles. Extremozymes can function at high salt levels (halozymes), under highly alkaline conditions (alkalozymes), and under other extreme conditions (pressure, acidity, etc.) (see references 4, 144, 223, and 371). Intrinsically stable and active at high temperatures, thermophilic and hyperthermophilic enzymes offer major biotechnological advantages over mesophilic enzymes (i.e., enzymes optimally active at 25 to 50°C) or psychrophilic enzymes (i.e., enzymes optimally active at 5 to 25°C): (i) once expressed in mesophilic hosts, thermophilic and hyperthermophilic enzymes are easier to purify by heat treatment, (ii) their thermostability is associated with a higher resistance to chemical denaturants (such as a solvent or guanidinium hydrochloride), and (iii) performing enzymatic reactions at high temperatures allows higher substrate concentrations, lower viscosity, a lower risk of microbial contamination, and often higher reaction rates.
Already the object of extensive reviews (140, 317, 319, 320), hyperthermophiles are only briefly described here. No exhaustive description of all the enzymes isolated and characterized from thermophiles and hyperthermophiles is presented, since that information is available elsewhere (2, 3, 139, 263, 349). Instead, we will focus on the latest findings that explain the molecular determinants of extreme protein thermostability and on the thermophilic and hyperthermophilic enzymes with the highest commercial relevance.
The interest shown by the scientific community in hyperthermophiles has constantly increased over the last 30 years. This growing interest is demonstrated by the increasing number of hyperthermophilic species that have been described (from 2 in 1972 [40, 372] to more than 70 at the end of 1999 [140, 320]), by the exponentially growing number of publications on the subject, and by the central place occupied by hyperthermophiles in worldwide genome-sequencing projects (six completed genome sequences, and at least four genome-sequencing projects in progress) (see Table 1 and http://www.tigr.org). Studies of environmental 16S rRNA sequences (18, 19) in samples originating from a single continental hot spring (Obsidian Pool at Yellowstone National Park) and environmental lipid analysis (128) suggest that known hyperthermophiles represent only a fraction of hyperthermophilic species diversity.
Now that we are able to collect samples almost routinely from deep-sea floors, access to hyperthermophilic biotopes is not the limiting factor in studying hyperthermophile diversity. Isolating and growing pure cultures of new hyperthermophiles has been—and remains—a challenge. A striking example of this difficulty is the bacterium Thermocrinis ruber (147). This pink-filament-forming bacterium was described as early as 1967 by Brock (39), but it took more than 25 years to successfully cultivate this organism (147). A major task for scientists in the near future will be to develop new isolation techniques for microorganisms with different, unforeseen metabolic requirements. Huber et al. (145) took the lead by cloning a new archaeal hyperthermophile by using optical tweezers.
Hyperthermophiles have been isolated almost exclusively from environments with temperatures in the range of 80 to 115°C. Hot natural environments include continental solfataras, deep geothermally heated oil-containing stratifications, shallow marine and deep-sea hot sediments, and hydrothermal vents located as far as 4,000 m below sea level (Table 1). Hyperthermophiles have also been isolated from hot industrial environments (e.g., the outflow of geothermal power plants and sewage sludge systems). Deep-sea hyperthermophiles thrive in environments with hydrostatic pressures ranging from 200 to 360 atm. Some of these species are barotolerant (281) or even barophilic (95, 233, 257). The most thermophilic organism known, P. fumarii, grows in the temperature range of 90 to 113°C. The upper temperature at which life is possible is still unknown, but it is probably not much above 113°C. Above 110°C, molecules such as amino acids and metabolites become highly unstable (ATP is spontaneously hydrolyzed in aqueous solution at temperatures below 140°C) and hydrophobic interactions weaken significantly (163).
Of the more than 70 species, 29 genera, and 10 orders of hyperthermophiles that have been described (320), most are archaea. Thermotogales and Aquificales are the only bacteria (Table 1). Thermotogales and Aquificales are the deepest branches in the bacterial genealogy, and for this reason they are of obvious interest in evolutionary studies (1). One of the most striking findings extracted from the complete Thermotoga maritima genome sequence (258) is the abundance of evidence supporting lateral gene transfer between archaea and bacteria: (i) 24% of the T. maritima open reading frames (versus 16% in Aquifex aeolicus) encode proteins that are more similar to archaeal than to bacterial proteins; (ii) these archaea-like genes are not uniformly distributed among the biological categories; (iii) 81 of these genes are clustered in 15 4- to 20-kb regions, in which the gene order can be the same as in archaea; and (iv) the T. maritima genome sequence does not have a homogeneous G+C content: among the 51 regions having significantly different G+C contents, 42 contain “archaea-like” genes.
The archaeal domain is composed so far of two branches: the Crenarchaeota and the Euryarchaeota. A 16S rRNA that is not related to any other archaeal rRNA was recently sequenced from a hyperthermophilic environment. This new rRNA species suggests the existence of a third branch in the archaeal domain, the Korarchaeota, that branches deeper in the archaeal tree than the Crenarchaeota and the Euryarchaeota (18). Hyperthermophiles are represented in the Crenarchaeota and Euryarchaeota, and they systematically represent the deepest and shortest lineages in these two branches (see references 140 and 320 for phylogenetic trees). In addition to thermoacidophiles, Crenarchaeota include halophiles. Among the Euryarchaeota, methanogens have mesophilic relatives.
Hyperthermophile communities are complex systems of primary producers and decomposers of organic matter. All hyperthermophilic primary producers are chemolithoautotrophs (i.e., sulfur oxidizers, sulfur reducers, and methanogens) (104, 223). In relation to the high sulfur content of most hot natural biotopes, most hyperthermophiles are facultative or obligate chemolithotrophs: they either reduce S0 with H2 to produce H2S (the anaerobes) or oxidize S0 with O2 to produce sulfuric acid (the aerobes). Extremely acidophilic hyperthermophiles belong to the order Sulfolobales. They are all strict aerobes (e.g., Sulfolobus) or facultative aerobes (e.g., Acidianus), and they have been isolated almost exclusively from continental solfataras (Table 1). While most heterotrophs are obligate sulfur reducers, all members of the Thermotogales and most members of the Pyrococcales and Thermococcales can grow independently of S0, obtaining their energy from fermentations (Table 1). Because of the extremely low organic matter content of their submarine environments, hyperthermophilic heterotrophs typically obtain their energy and carbon from complex mixtures of peptides derived from the decomposition of primary producers. A few species are able to use polysaccharides (e.g., starch, pectin, glycogen, and chitin); to date, Archaeoglobus profundus is the only known species that uses organic acids.
Thermostability and optimal activity at high temperatures are inherent properties of hyperthermophilic enzymes. Enzyme thermostability encompasses thermodynamic and kinetic stabilities. Thermodynamic stability is defined by the enzyme's free energy of stabilization (ΔGstab) and by its melting temperature (Tm, the temperature at which 50% of the protein is unfolded). For the enzymes that unfold irreversibly, only Tm can be determined. Kinetic stability depends on the energy barrier to unfolding (i.e., the activation energy [Ea] of unfolding). An enzyme's kinetic stability is often expressed as its half-life (t1/2) at defined temperatures. In this review, an enzyme will be called mesophilic if it originates from a mesophilic organism, thermophilic if it originates from a thermophile, and hyperthermophilic if it originates from a hyperthermophile. Further, we will say that enzyme X is more thermophilic than enzyme Y if enzyme X is optimally active at higher temperatures than enzyme Y.
Most enzymes characterized from hyperthermophiles are optimally active at temperatures close to the host organism's optimal growth temperature, usually 70 to 125°C (see references 139 and 349 for lists of purified hyperthermophilic enzymes and their properties). Extracellular and cell-bound hyperthermophilic enzymes (i.e., saccharidases and proteases) are optimally active at temperatures above—sometimes far above—the host organism's optimum growth temperature and are, as a rule, highly stable. For example, Thermococcus litoralis amylopullulanase is optimally active at 117°C, which is 29°C above the organism's optimum growth temperature of 88°C (43). While they are usually less thermophilic than extracellular enzymes purified from the same host, intracellular enzymes (such as xylose isomerases) are usually optimally active at the organism's optimal growth temperature. Only a few enzymes have been described that are optimally active at 10 to 20°C below the organism's optimum growth temperature (108, 197, 278). While most hyperthermophilic enzymes are intrinsically very stable, some intracellular enzymes get their high thermostability from intracellular factors such as salts, high protein concentrations, coenzymes, substrates, activators, or general stabilizers such as thermamine.
Arrhenius plots for hyperthermophilic and mesophilic enzymes are typically linear (20, 29, 62), suggesting that mesophilic and hyperthermophilic enzyme functional conformations remain unchanged throughout their respective temperature ranges. If enzyme structures changed in a catalytically significant manner with increasing temperature, one would expect to find (i) nonlinear Arrhenius plots for most enzymes and (ii) different types of plots for different enzyme classes. Biphasic Arrhenius plots reported for a number of hyperthermophilic enzymes (58, 98, 101, 133, 366) represent an important exception to the typical Arrhenius-like behavior. Biphasic Arrhenius plots can often be correlated with functionally significant conformational changes, detected by spectroscopic methods (101, 133, 222). Although not much information is typically available on the effect of temperature on the activity of mesophilic enzymes, a few examples exist of mesophilic enzymes showing bent Arrhenius plots (110), suggesting that such discontinuities are not a specific trait of hyperthermophilic enzymes.
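The linearity argument above can be made concrete with a small calculation. The following is a minimal sketch, not a method taken from the cited studies: it fits ln(rate) against 1/T by ordinary least squares and reports the apparent activation energy and the coefficient of determination. The function name, the example temperatures, and the rate values are all hypothetical; the rates are generated from an exact Arrhenius law so that the fit is expected to be linear.

```python
import math

R_KCAL = 1.987e-3  # gas constant in kcal/(mol*K)

def arrhenius_fit(temps_c, rates):
    """Least-squares fit of ln(rate) versus 1/T (T in kelvin).

    Returns (Ea in kcal/mol, ln A, R^2). A value of R^2 close to 1
    corresponds to a linear Arrhenius plot; a clearly lower value
    would be one hint of a bent or biphasic plot.
    """
    xs = [1.0 / (t + 273.15) for t in temps_c]
    ys = [math.log(r) for r in rates]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    syy = sum((y - mean_y) ** 2 for y in ys)
    slope = sxy / sxx                 # slope equals -Ea/R
    intercept = mean_y - slope * mean_x
    r_squared = (sxy * sxy) / (sxx * syy) if syy > 0 else 1.0
    return -slope * R_KCAL, intercept, r_squared

# Hypothetical data: rates computed from an ideal Arrhenius law
# with Ea = 15 kcal/mol and an arbitrary pre-exponential factor.
temps = [40, 50, 60, 70, 80, 90]
rates = [1e10 * math.exp(-15.0 / (R_KCAL * (t + 273.15))) for t in temps]
ea, ln_a, r2 = arrhenius_fit(temps, rates)
```

With real activity data, a biphasic plot would typically be examined by fitting the low- and high-temperature ranges separately and comparing the two apparent Ea values.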
With the exception of phylogenetic variations, what differentiates hyperthermophilic and mesophilic enzymes is only the temperature ranges in which they are stable and active. Otherwise, hyperthermophilic and mesophilic enzymes are highly similar: (i) the sequences of homologous hyperthermophilic and mesophilic proteins are typically 40 to 85% similar (79, 350); (ii) their three-dimensional structures are superposable (16, 63, 143, 160, 227, 284, 327); and (iii) they have the same catalytic mechanisms (22, 350, 386).
More than 100 genes from hyperthermophiles have been cloned and expressed in mesophiles. Most of this work has been done in the last 5 years. Only a small fraction of these genes have been isolated by direct expression and activity screening (i.e., by complementation of growth or activity assay) of a genomic library in Escherichia coli (Table 2). Most other genes from hyperthermophiles have been isolated by hybridization or have been directly cloned after PCR amplification. Since archaeal transcription systems (including promoter sequences) are more closely related to eucaryal than to bacterial systems, it is not surprising that most archaeal genes are expressed in E. coli only when they are cloned under the control of strong promoters (plac, ptac, or T7 RNA polymerase promoter). Pyrococcal intergenic regions are particularly AT-rich, and E. coli consensus promoter-like sequences can be found that explain why some P. furiosus genes are directly expressed in E. coli (85, 86, 343). Another difficulty encountered in expressing archaeal genes in E. coli can be low expression due to a significantly different codon usage in the expressed gene. This difficulty is often alleviated by the expression in E. coli of rare tRNA genes together with the target gene (344). A few genes from hyperthermophilic archaea have been successfully expressed in yeast systems (77). They are able to complement yeast mutations (90, 275, 282).
When the properties of the native and recombinant hyperthermophilic enzymes are compared, the majority of hyperthermophilic enzymes expressed in E. coli retain all of the native enzyme's biochemical properties, including proper folding (121), thermostability, and optimal activity at high temperatures (8, 14, 115, 338, 350). Thus, while a few proteins from hyperthermophiles might require extrinsic factors (e.g., salts or polyamines) or posttranslational modifications (e.g., glycosylation) to be fully thermostable, most proteins from hyperthermophiles are intrinsically thermostable, and they can fold properly even at temperatures 60°C below their physiological conditions. The fact that most hyperthermophilic enzymes are properly expressed and folded in E. coli has greatly facilitated their study, since they can be purified from E. coli rather than from an often hard-to-grow hyperthermophilic organism. Additional indirect evidence for the correct folding of recombinant hyperthermophilic proteins is the fact that crystal structures of recombinant hyperthermophilic proteins are typically similar to those of their mesophilic homologues (160, 183, 227, 284, 327, 368). The idea that recombinant and native hyperthermophilic protein structures are identical has become so widely accepted that in some studies the native and recombinant enzymes are used interchangeably in crystallization studies (5).
It is unclear whether all hyperthermophilic proteins can be expressed in a mesophilic environment, since unsuccessful experiments are typically not reported. So far, fewer than 10% of all the hyperthermophilic enzymes expressed in E. coli have been reported to have stability, catalytic, or structural properties different from those of the enzyme purified from the native organism (51, 239). The recombinant P. furiosus ornithine carbamoyltransferase was as stable as the native enzyme when it was expressed in Saccharomyces cerevisiae but was less stable when expressed in E. coli. When expressed in E. coli, the Sulfolobus solfataricus 5′-methylthioadenosine phosphorylase (a hexameric enzyme containing six intersubunit disulfide bridges) forms incorrect disulfide bridges and is less stable and less thermophilic than the native enzyme (51). The recombinant P. furiosus glutamate dehydrogenase (GDH) is a partially active hexamer that can be fully activated upon incubation at 90°C but remains less stable than the native P. furiosus GDH (202). Such hyperthermophilic enzymes might require posttranslational modifications (e.g., glycosylation) or specific chaperones to reach their fully functional and stable folded state.
A current working hypothesis is that hyperthermophilic enzymes are more rigid than their mesophilic homologues at mesophilic temperatures and that rigidity is a prerequisite for high protein thermostability. This hypothesis is supported by a growing body of experimental data that includes frequency domain fluorometry and anisotropy decay (229), hydrogen-deuterium exchange (35, 164, 370), and tryptophan phosphorescence (114) experiments. Figure 1 illustrates one of the hydrogen-deuterium exchange experiments. At 20°C a much smaller fraction of the amide protons in Sulfolobus acidocaldarius adenylate kinase (53%) are exchanged than in the porcine cytosolic enzyme (83%), indicating that considerably more amide protons are involved in stable hydrogen bonds in the thermophilic enzyme. Temperatures of 80 to 90°C are needed before S. acidocaldarius adenylate kinase can show an exchange level comparable to that of the catalytically active mesophilic enzyme (35). In protein structure determination, atomic temperature factors provide an adequate representation of local flexibility. In a 1987 study, Vihinen (351) calculated protein flexibility indexes for mesophilic and thermophilic proteins, starting from normalized atomic temperature factors. His results showed that flexibility decreased as thermostability increased. This study needs to be updated since Vihinen's sample was small and did not include data on hyperthermophilic proteins. A computer simulation showed that a mesophilic rubredoxin was more flexible, on the picosecond timescale, than its P. furiosus homologue at room temperature (201).
While most flexibility comparisons of mesophilic and hyperthermophilic proteins have reached the same conclusion, that hyperthermophilic proteins are more rigid, one recent study (134) does not support this conclusion. Using amide hydrogen exchange data, Hernández et al. show that (i) all the hydrogen bonds involving the amide hydrogens of P. furiosus rubredoxin are disrupted in less than 1 s at temperatures close to P. furiosus rubredoxin's temperature of maximal thermodynamic stability; (ii) conformational opening for solvent access takes place in the millisecond range for the entire protein; and (iii) at alkaline pHs, the maximum enthalpy contributed by hydrogen-bonded amides accounts for less than 5% of the total activation enthalpy normally associated with protein unfolding. These results suggest that the most stable protein characterized so far shows a degree of conformational flexibility comparable to that of mesophilic proteins.
Lazaridis et al. (201) argue that there is no single measure of flexibility (a protein can be rigid on a nanosecond scale but flexible on a millisecond scale) and that there is no fundamental reason for stability and rigidity to be correlated. Flexibility implies increased conformational entropy of the folded state, and it should therefore be favorable to thermodynamic stability. More studies on hyperthermophilic enzyme flexibility at various temperatures are needed before we can get a better understanding of the role of conformational rigidity in protein stability.
It has also been proposed that excessive rigidity explains why hyperthermophilic enzymes are often inactive at low temperatures (i.e., around 20 to 37°C). One set of evidence that tends to support this hypothesis is that denaturants (e.g., guanidinium hydrochloride and urea) (23, 195, 364), detergents (e.g., Triton X-100 and sodium dodecyl sulfate) (82, 283, 290), and solvents (78, 195) often activate hyperthermophilic enzymes at suboptimal temperatures. This activation tends to disappear as the temperature gets closer to the enzyme's temperature of maximal activity (Topt) (23). At that temperature, the enzyme is flexible enough in the absence of a denaturant to show full activity. Recent findings that show increasing levels of hydrogen tunneling with increasing temperature in a thermophilic alcohol dehydrogenase provide additional evidence for the role of thermally induced protein motions in modulating enzyme activity (190). A few hyperthermophilic enzymes have been characterized that are more active than their mesophilic counterparts, even at 37°C (156, 246, 315). Since they are thermostable, these enzymes are expected to be quite rigid at mesophilic temperatures. Their high catalytic activity at mesophilic temperatures suggests that these enzymes combine local flexibility in their active site (which is responsible for their activity at low temperatures) with high overall rigidity (which is responsible for their thermostability). The existence of such enzymes (and of highly stable, engineered mesophilic enzymes [116, 374]) also suggests that thermostability is not incompatible with high activity at moderate temperatures. Hyperthermophiles probably only need enzymes with activities at their optimal temperatures comparable to that of their mesophilic homologues. While there is probably no evolutionary pressure for an organism to have more efficient enzymes, this does not mean that more efficient thermostable enzymes cannot be engineered in the laboratory.
The free energy of stabilization (ΔGstab, where ΔGstab = ΔHstab − TΔSstab) of a protein is the difference between the free energies of the folded and the unfolded states of that protein. It directly measures the thermodynamic stability of the folded protein. ΔHstab (the stabilization enthalpy) and ΔSstab (the stabilization entropy) are large numbers that vary almost linearly with temperature in the temperature range of the activities of most enzymes. Also a function of temperature, ΔGstab is usually small (83, 162) (Table 3). The ΔGstab of globular mesophilic proteins is typically between 5 and 15 kcal/mol at 25°C (Table 3). Not many proteins have been studied to determine the free energies of stabilization. Such studies are hindered by the fact that the thermal denaturation of most proteins is irreversible: complete denaturation is often almost immediately followed by aggregation and precipitation (see below). Thus, most ΔGstab data are for small monomeric proteins (277) (Table 3). ΔGstab calculations are made even more difficult for hyperthermophilic proteins, since their denaturation transitions take place outside the temperature range of most calorimeters (141, 274). To overcome this difficulty, most thermodynamic studies of hyperthermophilic protein stability are performed in the presence of guanidinium hydrochloride (168) or at pHs outside the physiological conditions (241). These various conditions allow the temperature of the denaturation transition to become accessible to physical measurement, and in some cases they allow the enzyme to unfold reversibly. In one case, the stability parameters of a hyperthermophilic protein were determined under native conditions using hydrogen exchange to measure the reversible cycling between the native and unfolded proteins (141). Table 3 shows that in most cases the difference in ΔGstab values of hyperthermophilic and mesophilic proteins is small, usually in the range of 5 to 20 kcal/mol.
Stability studies of enzyme mutants (173, 261), showing that differences in ΔGstab as small as 3 to 6.5 kcal/mol can account for thermostability increases of up to 12°C, are in complete agreement with the stability data listed in Table 3.
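To illustrate how a small ΔGstab relates to the definition of Tm as the temperature at which 50% of the protein is unfolded, the relation ΔGstab = ΔHstab − TΔSstab can be combined with a two-state equilibrium. The sketch below is an illustration only. It assumes a strictly two-state transition, treats ΔHstab and ΔSstab as constants over the temperature range (a simplification, since both vary with temperature), and uses hypothetical numbers and function names.

```python
import math

R_KCAL = 1.987e-3  # gas constant in kcal/(mol*K)

def delta_g_stab(delta_h, delta_s, temp_k):
    """Free energy of stabilization, dG = dH - T*dS.

    delta_h in kcal/mol, delta_s in kcal/(mol*K), temp_k in kelvin.
    Positive values mean the folded state is favored.
    """
    return delta_h - temp_k * delta_s

def fraction_unfolded(dg_stab, temp_k):
    """Two-state model: K_unfold = exp(-dG/RT), fraction = K / (1 + K)."""
    k_unfold = math.exp(-dg_stab / (R_KCAL * temp_k))
    return k_unfold / (1.0 + k_unfold)

# Hypothetical protein: dH = 100 kcal/mol and Tm = 350 K,
# so dS is fixed by the condition dG = 0 at Tm.
dh = 100.0
tm = 350.0
ds = dh / tm

dg_25c = delta_g_stab(dh, ds, 298.15)   # stabilization free energy at 25 C
dg_tm = delta_g_stab(dh, ds, tm)        # approximately zero at Tm
half = fraction_unfolded(0.0, tm)       # half of the protein unfolded at Tm
```

Under these assumptions, a ΔGstab of roughly 15 kcal/mol at 25°C corresponds to a vanishingly small unfolded fraction, while the unfolded fraction is exactly one-half where ΔGstab is zero.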
As a consequence of the enthalpic and/or entropic stabilizations occurring in a hyperthermophilic protein, the ΔGstab-versus-T curve of this protein will be different from that of its mesophilic counterpart. Figure 2 illustrates the three theoretical ways by which increased protein thermodynamic stability can be achieved (265): (a) the ΔGstab-versus-T curve of a hyperthermophilic protein can be shifted toward higher ΔGstab values, (b) it can be shifted toward higher temperatures, or (c) it can be flattened (due to a smaller difference in partial molar heat capacity between the protein's folded and unfolded states [ΔCp]). As seen in Table 3, a majority of thermophilic and hyperthermophilic proteins use various combinations of these three mechanisms to reach their superior thermodynamic stabilities. For example, the ΔGstab-versus-T curve of the P. furiosus histone is shifted by approximately 12°C toward higher temperatures and by 10 kcal/mol toward higher ΔGstab values, compared to the ΔGstab-versus-T curve of the Methanobacterium formicicum histone. The most common stabilization mechanism among both thermophilic and hyperthermophilic proteins is the shift of their ΔGstab-versus-T curves toward higher ΔGstab values.
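The three mechanisms can be explored numerically with the standard Gibbs-Helmholtz stability curve. The sketch below is a hedged illustration, not a reproduction of the data in Table 3. It assumes a temperature-independent ΔCp and writes the curve relative to the temperature of maximal stability (called t_s here), where ΔGstab equals its maximum (dg_max). All parameter values and function names are hypothetical.

```python
import math

def dg_stab(temp_k, t_s, dg_max, dcp):
    """dGstab (kcal/mol) at temp_k, assuming a constant dCp.

    t_s    : temperature of maximal stability (K)
    dg_max : dGstab at t_s (kcal/mol)
    dcp    : heat capacity change on unfolding, kcal/(mol*K)
    """
    return dg_max + dcp * ((temp_k - t_s) - temp_k * math.log(temp_k / t_s))

def melting_temp(t_s, dg_max, dcp, upper=600.0):
    """Temperature above t_s where dGstab crosses zero (bisection)."""
    lo, hi = t_s, upper
    if dg_stab(hi, t_s, dg_max, dcp) > 0:
        raise ValueError("no zero crossing below the upper bound")
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        if dg_stab(mid, t_s, dg_max, dcp) > 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Hypothetical reference (mesophile-like) protein.
tm_base = melting_temp(t_s=293.0, dg_max=7.0, dcp=2.0)

# (a) curve shifted toward higher dGstab values
tm_up = melting_temp(t_s=293.0, dg_max=12.0, dcp=2.0)
# (b) curve shifted toward higher temperatures
tm_right = melting_temp(t_s=305.0, dg_max=7.0, dcp=2.0)
# (c) curve flattened by a smaller dCp
tm_flat = melting_temp(t_s=293.0, dg_max=7.0, dcp=1.0)
```

In this parameterization each mechanism changes exactly one argument, and each of the three changes raises the computed melting temperature relative to the reference curve.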
Native, active proteins are held together by a delicate balance of noncovalent forces (e.g., H bonds, ion pairs, and hydrophobic and van der Waals interactions). When high temperatures disrupt these noncovalent interactions, proteins unfold. Protein unfolding can be observed by different techniques, including differential scanning calorimetry, fluorescence, circular dichroism spectroscopy, viscosity, and migration patterns. The Tm values determined by calorimetry and by spectroscopic techniques are typically the same (216). Numerous studies have shown that inactivation becomes significant only a few degrees below the Tm. In most cases, the loss of secondary and tertiary structures is concomitant with enzyme inactivation at high temperature. Small monomeric proteins commonly unfold via a two-state transition (i.e., unfolding intermediates are barely detectable or not detectable). Some proteins might regain their native, active conformation upon cooling. This unfolding is called thermodynamically reversible unfolding, and the thermodynamic parameters describing the folded and unfolded states can be determined (most easily from calorimetry data) (17, 277).
Most mesophilic proteins, however, unfold irreversibly. They unfold into inactive but kinetically stable structures (scrambled structures), and they often form aggregates (intermolecular mechanism). During aggregation, the hydrophobic residues that are normally buried in the native protein become exposed to the solvent and interact with hydrophobic residues from other unfolding protein molecules to minimize their exposure to the solvent (354). Such irreversible unfolding usually follows the general model proposed by Tomazic and Klibanov (334):
This model is consistent with an intramolecular rate-determining step in thermal inactivation. The natural logarithm of the residual activity is a linear function of the inactivation time:
ln(residual activity) = −kt
where k is the inactivation rate constant and t is the inactivation time. In this model, the inactivation rate constant is independent of the initial protein concentration.
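The first-order relationship described above also links the inactivation rate to the half-life (t1/2) used to express kinetic stability. The following minimal sketch assumes that residual activity is expressed as a fraction of the initial activity (so that it equals 1 at time zero) and fits a line through the origin. The function names and the data are hypothetical; the activities are generated from an ideal first-order decay.

```python
import math

def inactivation_rate(times, residual_activities):
    """Estimate k from ln(residual activity) = -k * t.

    Uses least squares constrained through the origin, which assumes
    the residual activity is normalized to 1.0 at t = 0.
    """
    num = sum(t * math.log(a) for t, a in zip(times, residual_activities))
    den = sum(t * t for t in times)
    if den == 0:
        raise ValueError("at least one nonzero time point is required")
    return -num / den

def half_life(k):
    """Half-life of a first-order process with rate constant k."""
    if k <= 0:
        raise ValueError("rate constant must be positive")
    return math.log(2) / k

# Hypothetical time course (minutes) at a fixed temperature,
# generated with a true rate constant of 0.05 per minute.
times_min = [0, 10, 20, 30, 60]
true_k = 0.05
activity = [math.exp(-true_k * t) for t in times_min]

k_fit = inactivation_rate(times_min, activity)
t_half = half_life(k_fit)
```

Because the model is first order, the fitted k (and therefore the half-life) should not change when the same experiment is repeated at a different initial protein concentration; a concentration dependence would point to an intermolecular step such as aggregation.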
Hyperthermophilic proteins that denature reversibly are probably as rare as reversibly denaturing mesophilic proteins. High Ea values for inactivation of hyperthermophilic enzymes (above 100 kcal/mol) suggest that the limiting step in their inactivation is still unfolding (55, 268, 352). These different observations suggest that chemical modifications (e.g., deamidation, cysteine oxidation, and peptide bond hydrolysis) take place only once the protein is unfolded. Accelerated at elevated temperatures, chemical modifications are another process that makes denaturation irreversible.
While there have been numerous studies of mesophilic enzymes affected by deamidation in vivo (reference 367 and references therein), it is still unclear whether some hyperthermophilic proteins are inactivated via covalent mechanisms. Studies performed with a few enzymes (e.g., hen egg white lysozyme, RNase A, and Bacillus α-amylases) at temperatures neighboring or even above their melting temperatures clearly showed that elevated temperatures trigger chemical modifications that irreversibly inactivate reversibly denatured proteins (6, 334, 335, 369).
Two deamidation mechanisms are known for Asn and Gln residues (367), but it is not often known which mechanism is responsible for a given enzyme's deamidation. In the general acid-base mechanism, a general acid (HA) protonates the Asn (or Gln) amido (-NH2) group. A general base (A− or OH−) attacks the carbonyl carbon of the amido group or activates another nucleophile (Fig. 3). The transition state is supposed to be an oxyanion tetrahedral intermediate. The order of the acid and base attacks varies with pH. In the β-aspartyl shift mechanism, the Asn side chain amide group is attacked by the n + 1 peptide nitrogen (acting as a nucleophile). The succinimide intermediate then breaks down to yield an α-linked (Asp) or β-linked (isoAsp) residue, typically in the ratio 1:3 (Fig. 3). In this mechanism, Gly, Ser, and Ala are favored in n + 1 because their small side chains do not obstruct the cyclization into the succinimide intermediate. In both deamidation mechanisms, conformation and rigidity seem to be instrumental in limiting the extent of deamidation. Conformation probably also explains the approximately 10-fold-higher propensity of Asn to deamidate than Gln.
An RNase Asn residue located in a β-turn, with its side chain mobile in the solvent, was shown to be much more susceptible to deamidation once the enzyme was unfolded (362). In one of the few studies of chemical degradation in hyperthermophilic proteins, Methanothermus fervidus and Pyrococcus woesei glyceraldehyde 3-phosphate dehydrogenases (GAPDHs) were shown to inactivate significantly faster than they deamidated (132), indicating that deamidation was not a major inactivation mechanism. Once unfolded, the P. woesei GAPDH deamidated at a much higher rate than the native enzyme did. Zale and Klibanov (369) showed that deamidation rates were similar in a few selected enzymes and suggested that deamidation was not affected by local structure. Their studies, however, were always performed under conditions in which the enzyme would be mostly unfolded; thus, their results cannot be interpreted in terms of the role of local conformation in a residue's susceptibility to deamidation. Indirect evidence for the role of conformation and rigidity in controlling the rate of deamidation is found in the existence of hyperthermophilic proteins that are functional and stable up to 120°C. In these proteins, noncovalent structural interactions are strong enough to protect the Asn residues from deamidation. Deamidation can nevertheless take place in native enzymes (reference 367 and references therein), but all known examples are of mesophilic proteins. It is not clear if deamidation is a major inactivation process for hyperthermophilic proteins.
Hydrolysis of peptide bonds happens most often at the C-terminal side of Asp residues, with the Asp-Pro bond being the most labile of all (354). Two factors seem to be responsible for this lability. The proline nitrogen is more basic than that of other residues, and Asp has an increased propensity for α-β isomerization when linked on the N side of a proline. Peptide chain cleavage can also occur at Asn-Xaa linkages in a β-aspartyl shift-like mechanism (367). In this reaction the Asn amido (—NH2) group acts as the nucleophile, attacking its own main-chain carboxyl carbon (Fig. 3) (132). Such cleavage occurs in five positions in the M. fervidus GAPDH when conditions favor unfolding (i.e., temperatures above 85°C and low salt concentrations). Less susceptible to hydrolysis, the more thermostable P. woesei GAPDH contains substitutions in three of these cleavage positions. Cleavage at the two remaining Asn-Xaa locations is probably inhibited by the higher conformational rigidity of the P. woesei enzyme.
Destruction of disulfide bridges under alkaline conditions is known to occur via a β-elimination reaction, yielding dehydroalanine and thiocysteine. Dehydroalanine then reacts with nucleophilic groups, especially the ε-amino group of lysine, to form lysinoalanine. The fate of thiocysteine is not completely understood (354). The β-elimination reaction produces free thiols that can catalyze disulfide interchange and further inactivate the enzyme (369).
Cysteines are the most reactive amino acids in proteins. Their autooxidation, usually catalyzed by metal cations (especially copper), leads to the formation of intramolecular and intermolecular disulfide bridges or to the formation of sulfenic acid (354). Cysteines can also catalyze disulfide interchange, causing disulfide bond reshuffling as well as important structural variations. The recombinant S. solfataricus 5′-methylthioadenosine phosphorylase forms incorrect intersubunit disulfide bridges that make it less stable and less thermophilic than the native enzyme (51).
Aside from the above commonly observed degradative reactions, other, less frequent chemical inactivation mechanisms have been identified (354). Methionine can be oxidized to its sulfoxide counterpart, and some residues (Asp and Ser, in particular) can be racemized to their d-form. Lysine can react with reducing sugars via the Maillard reaction (279). Last, thermolysin-like neutral proteases are susceptible to autolysis. Local unfolding of one of their surface loops determines their inactivation by autolysis (93).
The hydrophobic effect is considered to be the major driving force of protein folding (83). Hydrophobicity drives the protein to a collapsed structure from which the native structure is defined by the contribution of all types of forces (e.g., H bonds, ion pairs, and Van der Waals interactions). Dill (83) reviewed the evidence supporting this theory: (i) nonpolar solvents denature proteins; (ii) hydrophobic residues are typically sequestered into a core, where they largely avoid contact with water; (iii) residues and hydrophobicity in the protein core are more strongly conserved and related to structure than any other type of residue (replacements of core hydrophobic residues are generally more disruptive than other types of substitutions); and (iv) protein unfolding involves a large increase in heat capacity. Given the central role of the hydrophobic effect in protein folding, it was easy to assume that the hydrophobic effect is also the major force responsible for protein stability. The sequencing, structure, and mutagenesis information accumulated in the last 20 years confirms that hydrophobicity is, indeed, a main force in protein stability. Two observations suggest that mesophilic and hyperthermophilic homologues have a common basic stability afforded by the conserved protein core: (i) hydrophobic interactions and core residues involved in secondary structures are better conserved than surface area features, and (ii) numerous stabilizing substitutions are found in solvent-exposed areas (as observed in comparisons of mesophilic and hyperthermophilic protein structures and in protein directed-evolution experiments; see below). The high level of similarity encountered in the core of mesophilic and hyperthermophilic protein homologues suggests that even mesophilic proteins are packed almost as efficiently as possible and that there is not much room left for stabilization inside the protein core.
Stabilizing interactions in hyperthermophilic proteins are often found in the less conserved areas of the protein. As illustrated below, factors such as surface ion pairs, decrease in solvent-exposed hydrophobic surface, and anchoring of “loose ends” (i.e., the N and C termini and loops) to the protein surface seem to be instrumental in hyperthermophilic protein thermostability.
Enough experimental evidence (e.g., sequence, mutagenesis, structure, and thermodynamics) has been accumulated on hyperthermophilic proteins in recent years to conclude that no single mechanism is responsible for the remarkable stability of hyperthermophilic proteins. Increased thermostability must be found, instead, in a small number of highly specific mutations that often do not obey any obvious traffic rules.
Protein amino acid composition has long been thought to be correlated to its thermostability. The first statistical analyses comparing amino acid compositions in mesophilic and thermophilic proteins indicated trends toward substitutions such as Gly→Ala and Lys→Arg. A higher alanine content in thermophilic proteins was supposed to reflect the fact that Ala was the best helix-forming residue (10). As more experimental data accumulate (in particular, complete genome sequences), it is becoming obvious that “traffic rules of thermophilic adaptation cannot be defined in terms of significant differences in the amino acid composition” (31). The comparison of residue contents in hyperthermophilic and mesophilic proteins based on the genome sequences of eight mesophilic and seven hyperthermophilic organisms shows only minor trends (Table 4). More charged residues are found in hyperthermophilic proteins (+3.24%) than in mesophilic proteins, mostly at the expense of uncharged polar residues (−4.98%; in particular Gln, −2.21%). Hyperthermophilic proteins also contain slightly more hydrophobic and aromatic residues than mesophilic proteins do. These data obtained from genome sequencing cannot be generalized, since large variations exist among hyperthermophile genomes themselves: the Aeropyrum pernix protein pool actually contains fewer charged residues (23.64%), fewer large hydrophobic residues (27.29%), and fewer aromatic residues (7.42%) than do the mesophiles listed in Table 4. Instead, A. pernix proteins contain more Ala, Gly, Pro, Ser, and Thr residues. Thus, a bias in a hyperthermophilic protein amino acid composition might often be evolutionarily relevant, rather than an indication of its adaptation to high temperatures. Probably more relevant to thermostability than amino acid composition are the distribution of the residues and their interactions in the protein.
The two homologous proteases Bacillus amyloliquefaciens subtilisin BPN′ and Thermoactinomyces vulgaris thermitase contain the same number of charged residues, but the thermophilic enzyme thermitase contains eight more ion pairs (331).
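Composition statistics of the kind compared above (percentage of charged residues, cysteine content, Arg/Lys ratio) are straightforward to compute from a sequence. The following is a minimal sketch; the function name is hypothetical, and the grouping of Asp, Glu, Lys, and Arg as "charged" is an assumption made for illustration that may not match the grouping used in Table 4.

```python
from collections import Counter

# Assumed grouping for illustration; the review's Table 4 may group residues differently.
CHARGED = set("DEKR")


def composition_summary(sequence):
    """Return percent charged residues, percent Cys, and the Arg/Lys ratio."""
    seq = sequence.upper()
    total = len(seq)
    if total == 0:
        raise ValueError("empty sequence")
    counts = Counter(seq)
    charged = sum(counts[aa] for aa in CHARGED)
    lys = counts["K"]
    return {
        "percent_charged": 100.0 * charged / total,
        "percent_cys": 100.0 * counts["C"] / total,
        # The ratio is undefined when the sequence contains no Lys.
        "arg_lys_ratio": counts["R"] / lys if lys else None,
    }
```

As the thermitase example shows, two proteins can return identical numbers from such a summary and still differ in how many of those residues form ion pairs, so composition alone says little about stabilizing interactions.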
In relation to the idea that protein stability was determined by the stability and tight packing of its core, the propensity of the individual residues to participate in helical or strand structures was studied as a potential stability mechanism. In their comparison of mesophilic and thermophilic protein structures, Facchiano et al. (99) observed that helices of thermophilic proteins are generally more stable than those of mesophilic proteins. The only trend they detected was a decreasing content in β-branched residues (Val, Ile, and Thr) in the helices of thermophilic proteins (β-branched residues are not as well tolerated in helices as linear residues are) (99). A number of examples exist in which this trend is not followed. The P. furiosus and T. litoralis GDHs contain a larger number of isoleucines. If Leu and Ile residues are compared, these two residues have the highest (and equivalent) partial specific volumes. In proteins, the Leu side chain is most often found in one of two rotamer conformations (χ1 of 180° and 300°) but not in the one with χ1 = 60°. The Ile side chain frequently adopts four different rotamer conformations, and the three χ1 values are found. With this conformational flexibility, Ile might be better able to fill various voids that can occur during protein core packing (38). Dill (83) also noted that context effects (e.g., salt bridge formation, aromatic interactions, burial of hydrophobic surface, and cavity filling) could be as important as the intrinsic helical propensity. In many cases, secondary structures found in protein structures do not correspond to the secondary structures predicted by intrinsic propensity, suggesting that intrinsic propensity is not enough to account for the stability of α-helices in proteins (83).
Several properties of Arg residues suggest that they would be better adapted to high temperatures than Lys residues: the Arg δ-guanido moiety has a reduced chemical reactivity due to its high pKa and its resonance stabilization. The δ-guanido moiety provides more surface area for charged interactions than the Lys amino group does. Figure 4 illustrates the ability of Arg to participate in multiple noncovalent interactions. Because the Arg side chain contains one fewer methylene group than Lys, it has the potential to develop fewer unfavorable contacts with the solvent. Last, because its pKa (approximately 12) is 1 unit above that of Lys (11.1), Arg more easily maintains ion pairs and a net positive charge at elevated temperatures (pKa values drop as the temperature increases) (252, 354). The average Arg/Lys ratios in the protein pools of the mesophiles and hyperthermophiles listed in Table 4 (0.73 ± 0.37 and 0.87 ± 0.60, respectively) are associated with large standard deviations. (Among hyperthermophiles, Arg/Lys ratios vary from 0.52 in Aquifex aeolicus proteins to 2.19 in Aeropyrum pernix proteins.) These results suggest that if an increased Arg content is indeed stabilizing, this mechanism is not universally used among hyperthermophiles.
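The pKa argument can be made concrete with the Henderson-Hasselbalch relation, which gives the fraction of a basic group that remains protonated at a given pH. The sketch below uses the pKa values quoted in the text; the pH and the size of the temperature-induced pKa drops are hypothetical values chosen only to show the trend, and the same drop is assumed for both residues.

```python
PKA_ARG = 12.0  # approximate value quoted in the text
PKA_LYS = 11.1  # value quoted in the text


def fraction_protonated(pka, ph):
    """Henderson-Hasselbalch: fraction of a basic group carrying its proton at a given pH."""
    return 1.0 / (1.0 + 10.0 ** (ph - pka))


if __name__ == "__main__":
    ph = 9.0  # arbitrary pH, for illustration only
    for shift in (0.0, -1.0, -2.0):  # hypothetical pKa drops at higher temperature
        arg = fraction_protonated(PKA_ARG + shift, ph)
        lys = fraction_protonated(PKA_LYS + shift, ph)
        print(f"pKa shift {shift:+.1f}: Arg {arg:.3f} charged, Lys {lys:.3f} charged")
```

Whatever shift is assumed, the group with the higher pKa stays charged over a wider range of conditions, which is the point made for Arg.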
An indirect indication that deamidation affects hyperthermophilic proteins (156) is the high activity of T. maritima l-isoaspartyl methyltransferase. This enzyme methylates l-isoAsp residues that result from Asn deamidation or from Asp isomerization. Its high activity suggests that it has been adapted for the high load of protein damage that could occur at high temperatures. Resistance to deamidation seems to result from at least three adaptation mechanisms. (i) Some hyperthermophilic enzymes contain less Asn than their mesophilic homologues do. P. woesei 3-phosphoglycerate kinase (PGK) contains less Asn than the Methanobacterium bryantii enzyme does. In both Asn-Ala and two of the three Asn-Gly sequences present in M. bryantii PGK, the Asn residue is substituted in the P. woesei enzyme (136). The only conserved Asn-Gly sequence is conserved in all PGKs. It is possible that the four nonconserved sequences would have been susceptible to deamidation at high temperatures and that they have been selected against in the hyperthermophilic PGK. A direct correlation was also shown between the Asn+Gln content in type II d-xylose isomerases and their respective temperatures of maximal activity (ranging from 55 to 95°C) (350). (ii) Other hyperthermophilic enzymes contain as many Asn residues, but these residues are in locations and in conformations in which they are not susceptible to deamidation. The resistance of P. woesei GAPDH to deamidation and peptide bond hydrolysis was shown to be related to the enzyme's higher conformational stability (132). S. solfataricus 5′-methylthioadenosine phosphorylase is optimally active at 120°C, and its Tm is 132°C. It is not inactivated after 2 h at 100°C (52). It is interesting that it contains twice as many Asn as a related enzyme from E. coli, including one Asn in the sequence Asn-Gly, a sequence normally highly susceptible to deamidation.
The Asn and Gln contents listed in Table 4 suggest that hyperthermophilic proteins do not acquire their resistance to deamidation only through a decreased Asn content. Instead, it is curious that the seven hyperthermophiles show the same significant decrease in Gln residues in their proteins.
Cysteine's high sensitivity to oxidation at high temperature suggests that hyperthermophilic enzymes contain fewer cysteines than their mesophilic counterparts do. While Table 4 indicates that hyperthermophilic proteins on average contain fewer cysteines than mesophilic proteins do, large variations exist among species. Archaeoglobus fulgidus and Methanococcus jannaschii proteins contain more cysteines (1.17 and 1.27%, respectively), in fact, than an average mesophile protein pool does (1.10%). Of the seven hyperthermophilic organisms included in Table 4, A. aeolicus and A. pernix are microaerophilic and aerophilic organisms, respectively, whereas the others are strict anaerobes. Interestingly, A. aeolicus and A. pernix proteins contain more cysteines (0.79 and 0.93%, respectively) than Pyrococcus abyssi, P. horikoshii, and T. maritima proteins do (0.55, 0.63, and 0.71%, respectively). One would expect a high selection pressure against the presence of cysteines in proteins from aerobic hyperthermophiles (and the absence of such selection pressure in anaerobic hyperthermophiles). Cysteines that are present in proteins from aerobic hyperthermophiles are often involved in specific stabilizing interactions (e.g., disulfide bridges and metal liganding) and/or are inaccessible to the solvent. Drastic denaturing conditions are required (2 h at 70°C in the presence of 6 M guanidinium HCl) for 10 mM dithiothreitol to reduce most of the six intersubunit disulfide bridges in native S. solfataricus 5′-methylthioadenosine phosphorylase (51). In contrast, the GAPDH from the anaerobe T. maritima contains three Cys residues, one of them essential in the active site and two others described by Schultes et al. as “unnecessary” (299).
Disulfide bridges are believed to stabilize proteins mostly through an entropic effect, by decreasing the entropy of the protein's unfolded state (237). The entropic effect of the disulfide bridge increases in proportion to the logarithm of the number of residues separating the two cysteines bridged.
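The logarithmic dependence can be illustrated numerically. The review does not give an equation, so the sketch below assumes one polymer-theory approximation that is often used in the protein stability literature, ΔS = −2.1 − (3/2)R ln n in cal/(mol·K), where n is the number of residues in the loop closed by the bridge. The function names are hypothetical, and the numbers should be read as order-of-magnitude estimates only.

```python
import math

R_CAL = 1.987  # gas constant in cal/(mol*K)


def unfolded_state_entropy_change(n_residues):
    """Entropy change of the unfolded chain, cal/(mol*K), for a loop of n residues.

    Assumed approximation (not taken from the review): dS = -2.1 - 1.5 * R * ln(n).
    """
    if n_residues < 1:
        raise ValueError("loop must contain at least one residue")
    return -2.1 - 1.5 * R_CAL * math.log(n_residues)


def entropic_stabilization_kcal(n_residues, temperature_k):
    """Estimated stabilization, -T * dS, in kcal/mol at the given temperature."""
    return -temperature_k * unfolded_state_entropy_change(n_residues) / 1000.0


if __name__ == "__main__":
    for n in (10, 20, 40, 80):  # arbitrary loop sizes
        print(n, round(entropic_stabilization_kcal(n, 373.15), 2))
```

Under this assumed form, each doubling of the loop length adds the same increment to the estimated stabilization, which is what a logarithmic dependence implies.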
Because of the susceptibility of cysteines and disulfide bridges to destruction at high temperatures, 100°C was believed to be the upper limit for the stability of proteins containing disulfide bridges (353). This notion was based on the fact that early studies characterizing protein inactivation mechanisms were performed with the only enzymes available at that time: mesophilic enzymes. These studies determined that all proteins studied that contained disulfide bridges had the same rate of β-elimination at 100°C. This rate was independent of the protein structure and was higher at pH 8.0 (t1/2 of 1 h) than at pH 6.0 (t1/2 of 12.4 h). The limitation of these studies was that at 100°C all the proteins studied were in the unfolded state. The recent characterization of disulfide bridge-containing proteins that are optimally active and stable at temperatures above 100°C suggests that disulfide bridges can be a stabilization strategy above 100°C and that conformational environment and solvent accessibility are determining factors in the protection of disulfide bridges against destruction. When expressed in E. coli, S. solfataricus 5′-methylthioadenosine phosphorylase forms incorrect, destabilizing disulfide bridges. This observation indirectly suggests that the disulfide bridges present in the native enzyme are stabilizing (52). An Aquifex pyrophilus serine protease was recently described that contains eight cysteines (none are present in subtilisin BPN') (64). A dithiothreitol treatment reduced its t1/2 at 85°C from 90 h to less than 2 h. This destabilization by dithiothreitol at high temperature suggests that this enzyme indeed contains disulfide bridges and that they are highly inaccessible. The enzyme's 6-h t1/2 at 105°C and pH 9.0, which is much longer than the t1/2 calculated for disulfide bridges in unfolded proteins at pH 8.0 (1 h), suggests that this enzyme's disulfide bridges are protected from destruction by their inaccessibility in the protein. 
Thus, not all disulfide bridges have equal susceptibility to thermal destruction.
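The half-lives quoted above can be compared on a common footing by converting them to first-order rate constants. The sketch below assumes simple first-order decay, which is an assumption made for illustration; the function names are hypothetical. Note also that the 6-h value reported for the A. pyrophilus protease is a half-life of enzyme activity at 105°C and pH 9.0, not a directly measured rate of disulfide destruction.

```python
import math


def rate_constant_per_hour(half_life_h):
    """First-order rate constant k = ln(2) / t1/2, in 1/h."""
    if half_life_h <= 0:
        raise ValueError("half-life must be positive")
    return math.log(2.0) / half_life_h


def fraction_remaining(half_life_h, elapsed_h):
    """Fraction left after elapsed_h hours of first-order decay."""
    return 0.5 ** (elapsed_h / half_life_h)


if __name__ == "__main__":
    # Half-lives taken from the text (hours).
    cases = {
        "unfolded proteins, pH 8.0, 100 C": 1.0,
        "unfolded proteins, pH 6.0, 100 C": 12.4,
        "A. pyrophilus protease, pH 9.0, 105 C": 6.0,
    }
    for label, t_half in cases.items():
        k = rate_constant_per_hour(t_half)
        left = fraction_remaining(t_half, 6.0)
        print(f"{label}: k = {k:.3f} 1/h, fraction left after 6 h = {left:.3f}")
```

With a 1-h half-life, less than 2% of the bridges would survive the 6 h over which the protease still retains half of its activity, which is why the authors infer that its bridges are protected.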
As suggested in Table 4 and illustrated in Table 5, hydrophobic interactions are a stabilization mechanism in hyperthermophilic proteins. An average increase in stability of 1.3 (± 0.5) kcal/mol was calculated for each additional methyl group buried in protein folding (269) (based on cavity-creating mutations in which a large aliphatic residue was replaced with a smaller aliphatic residue). Mutations attempting to fill cavities are often less stabilizing when they create unfavorable Van der Waals interactions that need local rearrangements (158). While Table 5 gives crystallographic evidence for the potential role of hydrophobic interactions in thermostability, not much direct, experimental evidence is available to confirm the stabilizing role of hydrophobic interactions in hyperthermophilic proteins. The stability properties of an enzyme chimera constructed between the Methanococcus voltae and M. jannaschii adenylate kinases indicated that a larger and more hydrophobic enzyme core (which is due to an increase in aliphatic residue content and in aliphatic side chain volume) may be responsible for M. jannaschii adenylate kinase's thermostability (124). The 3-isopropylmalate dehydrogenase from the thermophile Thermus thermophilus contains intersubunit hydrophobic interactions that do not exist in the E. coli enzyme. Thermus 3-isopropylmalate dehydrogenase Leu246Glu/Val249Met and E. coli Glu256Leu/Met259Val mutant derivatives were constructed that destabilized and stabilized the Thermus and E. coli enzymes, respectively. Polyacrylamide gel electrophoresis of the mutant and wild-type enzymes in the presence of urea showed that the hydrophobic interactions made the dimer more resistant to dissociation (180).
Aromatic-aromatic interactions (aromatic pairs) are defined by a distance of less than 7.0 Å between the phenyl ring centroids. The following characteristics of aromatic pairs were extracted from the analysis of 272 aromatic pairs in 34 high-resolution structures of mesophilic proteins: in two-thirds of the pairs, the interacting rings are not far from perpendicular; most are involved in a network; most link distinct secondary structural elements (i.e., nonlocal interactions); most are energetically favorable (80% have potential energies between 0 and −2 kcal/mol); and most take place between buried or partially buried residues (50). Among the hyperthermophilic proteins whose structures have been solved (Table 5), at least one might be stabilized by extra aromatic interactions. P. furiosus α-amylase also contains 5% more aromatic residues than the homologue from Bacillus licheniformis, but it is unknown whether these additional residues are involved in stabilizing interactions (85). A few examples also exist among thermophilic proteins. Thermitase, the serine proteinase produced by Thermoactinomyces vulgaris, contains 16 aromatic residues involved in aromatic pairs; the mesophilic homologue Bacillus amyloliquefaciens subtilisin BPN' contains only 6 aromatic pairs (331). Two clusters of aromatic interactions also exist in the Thermus RNase H that are not present in the E. coli enzyme (159). The solvent-exposed aromatic pair, Tyr13-Tyr17, in B. amyloliquefaciens RNase was replaced with Ala or Phe residues (single and double mutations). Both Tyr-Tyr and Phe-Phe pairs contributed approximately −1.3 kcal/mol toward thermodynamic stabilization (303).
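The geometric definition of an aromatic pair lends itself to a short calculation. The sketch below computes ring centroids and reports every pair closer than the 7.0-Å cutoff. The function names, residue labels, and coordinates are hypothetical; in practice the ring atom coordinates would come from a solved structure.

```python
import math
from itertools import combinations

CENTROID_CUTOFF_A = 7.0  # centroid-centroid distance cutoff quoted in the text


def centroid(atoms):
    """Mean position of a list of (x, y, z) ring atom coordinates."""
    n = len(atoms)
    return tuple(sum(atom[i] for atom in atoms) / n for i in range(3))


def aromatic_pairs(rings, cutoff=CENTROID_CUTOFF_A):
    """Return (name_a, name_b, distance) for ring centroids closer than the cutoff.

    rings maps a residue label to the coordinates (in angstroms) of its ring atoms.
    """
    centers = {name: centroid(atoms) for name, atoms in rings.items()}
    pairs = []
    for a, b in combinations(sorted(centers), 2):
        distance = math.dist(centers[a], centers[b])
        if distance < cutoff:
            pairs.append((a, b, distance))
    return pairs


if __name__ == "__main__":
    # Made-up coordinates: two atoms per "ring" are enough to exercise the code.
    demo = {
        "Phe10": [(0.0, 0.0, 0.0), (2.0, 0.0, 0.0)],
        "Tyr20": [(5.0, 0.0, 0.0), (7.0, 0.0, 0.0)],
        "Trp30": [(20.0, 0.0, 0.0), (22.0, 0.0, 0.0)],
    }
    for a, b, d in aromatic_pairs(demo):
        print(a, b, round(d, 2))
```

The distance test identifies candidate pairs only; ring orientation and burial, which the survey above also considers, would need separate checks.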
Another type of interaction involving aromatic residues exists in proteins, but it has not been studied in relation to thermostability. In cation-π interactions, positive charges (most often metal cations but possibly cationic side chains of Arg and Lys) typically interact with the center of the aromatic ring (an example is shown in Fig. 4). The stabilization energy of the cation-π interaction does not decrease as a function of 1/r^3 but, rather, exhibits a 1/r^n dependence with n < 2, which resembles more a Coulombic (1/r) than a hydrophobic interaction. The low dependence of the cation-π interaction on distance, together with the fact that Phe, Tyr, and Trp do not have high desolvation energies and can easily be accommodated in hydrophobic environments, makes these interactions a potential stabilization mechanism (88).
H bonds are typically defined by a distance of less than 3 Å between the H donor and the H acceptor and by a donor-hydrogen-acceptor angle below 90°. The effect of hydrogen bonds on RNase T1 stability has been carefully studied (307). RNase T1 contains 86 H bonds with an average length of 2.95 Å. Their contribution to RNase T1 stability (approximately 110 kcal/mol, as determined by mutagenesis and unfolding experiments) was found to be comparable to the contribution of hydrophobic interactions; individual H bonds contributed an average of 1.3 kcal/mol to the stabilization (307). Because the identification of H bonds is highly dependent on the distance cutoff and because a number of hyperthermophilic protein structures have not been refined to sufficiently high resolutions, studying the role of H bonds in thermostability by structure analysis has not provided clear-cut answers.
One study done by Tanner et al. showed a strong correlation between GAPDH thermostability and the number of charged-neutral H bonds (i.e., between a side chain atom of a charged residue and either a main chain atom of any residue or a side chain atom of a neutral residue) (330). Tanner et al. list two reasons why this type of H bond might be particularly thermodynamically stabilizing: (i) the desolvation penalty associated with burying such H bonds is less than the desolvation penalty for burying an ion pair (that involves two charged residues), and (ii) the enthalpic reward of a charged-neutral H bond is greater than that of a neutral-neutral H bond because of the charge-dipole interaction. This correlation between charged-neutral H bonds and GAPDH stability suggests that the role of charged residues in protein stabilization may not be limited to forming ion pairs. An increased number of charged-neutral H bonds was also found in the T. maritima ferredoxin (Table 5). These H bonds either stabilize the structure of turns or anchor turns to one another.
Because ion pairs are usually present in small numbers in proteins and because they are not highly conserved, they are not a driving force in protein folding (83). Earlier work by Perutz (272) had suggested, however, that electrostatic interactions represent a significant stabilizing force in folded proteins. He stated that ion pairs are stronger in proteins than in solvents because they are formed between fixed charges. (In bulk water, solvation makes the stability of opposite charges almost independent of distance.) A single ion pair was calculated to be responsible for a 3 to 5-kcal/mol stabilization of T4 lysozyme (7). The desolvation contribution [ΔΔG(desolvation)] to the free energy of folding associated with bringing oppositely charged side chains together is large and unfavorable. It has been suggested that ion pairs are destabilizing in proteins, because this ΔΔG(desolvation) is not sufficiently compensated by the electrostatic energy provided by the ion pair. This unfavorable ΔΔG(desolvation), however, decreases at high temperatures, partially because of a decrease in the water dielectric constant. This reduction is almost entirely electrostatic, primarily affecting the surface charged residues (the water molecules are less ordered and, on average, farther away from charged residues at high temperatures). Thus, charged residues tend to rearrange their conformations to improve their direct electrostatic interactions with one another, and the loss in solvation free energy is almost exactly compensated by a gain in interaction energy with other charged residues in the protein (80, 94). While ion pairing might not be the optimum stabilizing mechanism (or might even be destabilizing) for mesophilic proteins, it can represent a strong stabilizing mechanism for hyperthermophilic proteins, as illustrated in Table 5. P. furiosus GDH is 34% identical to the Clostridium symbiosum enzyme.
The main difference between these two enzyme structures is found in their ion pair contents (Table 6). A higher percentage of charged residues participate in ion pairs, in particular Arg (90% of all the Arg residues in the P. furiosus GDH form ion pairs). The P. furiosus enzyme contains 0.11 ion pairs per residue against 0.06 in the C. symbiosum GDH (the average for mesophilic enzymes is approximately 0.04). Arg residues form ion pairs plus H bonds with the carboxylic acids. The ion pairs form large networks that crisscross the protein surface and the subunit interfaces. P. furiosus GDH's largest ion pair network (Fig. 5) is composed of 24 residues (belonging to four different subunits) connected by 18 ion pairs. Ion pair networks are energetically more favorable than an equivalent number of isolated ion pairs, because for each new pair the burial cost is cut in half: only one additional residue must be desolvated and immobilized (368).
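The bookkeeping behind the network argument can be sketched as a graph problem: residues are nodes, ion pairs are edges, and each connected component is one network. The code below groups a list of ion pairs into networks and counts the residues and pairs in each. The function name and the residue labels in the example are hypothetical and do not describe any real structure.

```python
from collections import defaultdict


def ion_pair_networks(pairs):
    """Group ion pairs into networks (connected components).

    pairs is a list of (residue_a, residue_b) labels, each pair listed once.
    Returns a list of (sorted residue labels, number of ion pairs) tuples.
    """
    neighbors = defaultdict(set)
    for a, b in pairs:
        neighbors[a].add(b)
        neighbors[b].add(a)

    seen = set()
    networks = []
    for start in sorted(neighbors):
        if start in seen:
            continue
        # Depth-first walk over every residue reachable from this one.
        component = set()
        stack = [start]
        while stack:
            residue = stack.pop()
            if residue in component:
                continue
            component.add(residue)
            stack.extend(neighbors[residue] - component)
        seen |= component
        n_pairs = sum(1 for a, _ in pairs if a in component)
        networks.append((sorted(component), n_pairs))
    return networks


if __name__ == "__main__":
    # Made-up example: one three-residue network and one isolated pair.
    demo = [("R20", "E30"), ("R20", "D40"), ("K50", "E60")]
    for residues, n_pairs in ion_pair_networks(demo):
        print(len(residues), "residues,", n_pairs, "ion pairs:", residues)
```

Applied to the figures in the text, 18 isolated ion pairs would require 36 desolvated residues, whereas the largest P. furiosus GDH network forms the same number of pairs with only 24 residues.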
The stabilizing potential of buried ion pairs has been investigated, but it remains controversial because of the large ΔΔG(desolvation) associated with burying two charged residues. In a recent study using continuum electrostatic calculations, an average ΔΔG(desolvation) of +12.9 ± 5.6 kcal/mol was calculated for buried ion pairs. The large ΔΔG(desolvation) was compensated for by the large Coulombic energy created by the ion pair. (Buried ion pairs are in a low dielectric-constant environment and thus are not exposed to a large screening.) The study's conclusion was that salt bridges with favorable geometry were likely to be stabilizing anywhere in the protein (196). Four additional buried ion pairs between α-helices have been suggested as a stabilization mechanism in P. kodakaraensis O6-methylguanine-DNA methyltransferase. For these four pairs, distances are short: between 2.74 and 3.02 Å. Residues Arg50 and Glu93 form a double ion pair NH1-O1 (2.74 Å) and NH2-O2 (2.83 Å) that connects the N- and C-terminal domains (Table 5). A stabilizing function has also been proposed for buried ion pairs in Thermosphaera aggregans β-glycosidase (Table 5).
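The effect of screening on the Coulombic term can be illustrated with a deliberately crude point-charge estimate. The sketch below applies Coulomb's law in a uniform dielectric; it is not the continuum electrostatic method used in the cited study. The dielectric constants (4 for a protein interior, 80 for bulk water near room temperature) are assumed, commonly used round values, and the function name is hypothetical.

```python
COULOMB_KCAL = 332.06  # kcal*angstrom/(mol*e^2), standard unit-conversion constant


def coulomb_energy(q1, q2, distance_a, dielectric):
    """Point-charge Coulomb energy in kcal/mol for charges in units of e.

    Assumes a uniform dielectric, which is a strong simplification for a protein.
    """
    if distance_a <= 0 or dielectric <= 0:
        raise ValueError("distance and dielectric constant must be positive")
    return COULOMB_KCAL * q1 * q2 / (dielectric * distance_a)


if __name__ == "__main__":
    for eps in (4.0, 80.0):  # assumed values: protein interior, bulk water
        for r in (2.74, 3.02):  # shortest and longest distances quoted in the text
            energy = coulomb_energy(+1, -1, r, eps)
            print(f"dielectric {eps:5.1f}, r = {r:.2f} A: {energy:7.2f} kcal/mol")
```

Even this rough estimate shows why a short ion pair in a low-dielectric environment can produce a Coulombic term of the same order as the desolvation penalty, whereas the same pair in bulk water gains very little.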
The P. furiosus methyl aminopeptidase was shown to contain more ion pairs and ion pair networks than the E. coli enzyme (Table 5). The P. furiosus enzyme stability decreased at low pH values (where acidic residues are protonated, which disrupts favorable ionic interactions) and at high salt concentrations (salts are known to destabilize protein ion pairs). These results suggest that ion pairs are essential in maintaining this enzyme's stability at high temperatures (266). In similar experiments, NaCl was shown to destabilize S. solfataricus carboxypeptidase at pH 7.5, but not at pH 9.0 (where the stabilizing ion pairs probably do not exist any more), suggesting that ion pairs are involved in the stabilization of this S. solfataricus enzyme (352). Since other S. solfataricus enzymes are destabilized by NaCl, ion pairing might represent a general stabilization strategy in this organism (as Sulfolobale strains do not thrive in the presence of high salt concentrations).
Ion pairing's involvement in hyperthermophilic protein stabilization has already been extensively studied by site-directed mutagenesis (SDM). Because all studied hyperthermophilic enzymes unfold irreversibly, the only stability data available refer to kinetic stability. A 4-residue surface ion pair network that connects the N- and C-terminal helices was shown by SDM to participate in the stabilization of T. maritima GAPDH. In this network, Arg20 is connected to three other residues by ion pairs or H bonds. The mutations Arg20Ala and Arg20Asn increased the enzyme denaturation rate at 100°C by a factor of 3.5 (270). In T. maritima indoleglycerol phosphate synthase, the stabilization provided by ion pair Arg241-Glu73 (between [α/β]8 barrel helices α8 and α1) was also tested by SDM. At 85.5°C, mutation Arg241Ala increased the enzyme denaturation rate by a factor of almost 3. The enzyme Ea of unfolding at 85°C decreased by 3.2 kJ/mol, suggesting that the Arg241-Glu73 pair participates in the kinetic stabilization of this enzyme (246).
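The link between a change in Ea and a change in denaturation rate can be checked with the Arrhenius equation. The sketch below assumes that the mutation changes only the activation energy and leaves the pre-exponential factor untouched, which is a simplifying assumption; the function names are hypothetical.

```python
import math

R_KJ = 8.314e-3  # gas constant in kJ/(mol*K)


def rate_ratio_from_ea_drop(delta_ea_kj, temperature_c):
    """Mutant/wild-type rate ratio if Ea drops by delta_ea_kj (kJ/mol).

    Assumes an unchanged Arrhenius pre-exponential factor.
    """
    temperature_k = temperature_c + 273.15
    return math.exp(delta_ea_kj / (R_KJ * temperature_k))


def ea_drop_from_rate_ratio(ratio, temperature_c):
    """Inverse calculation: Ea decrease (kJ/mol) implied by a given rate increase."""
    temperature_k = temperature_c + 273.15
    return R_KJ * temperature_k * math.log(ratio)


if __name__ == "__main__":
    # Indoleglycerol phosphate synthase Arg241Ala: Ea decrease of 3.2 kJ/mol at 85 C.
    print(round(rate_ratio_from_ea_drop(3.2, 85.0), 2))
    # GAPDH Arg20 mutants: 3.5-fold faster denaturation at 100 C.
    print(round(ea_drop_from_rate_ratio(3.5, 100.0), 2))
```

Under that assumption, a 3.2-kJ/mol decrease in Ea at 85°C corresponds to a rate increase of a little under 3-fold, in line with the "factor of almost 3" reported for the Arg241Ala mutant.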
The ion pair networks identified in the P. furiosus, P. kodakaraensis, and T. litoralis GDH structures (Table 5) were studied by SDM. The three enzymes are 83 to 87% identical, but their thermostabilities decrease in the direction P. furiosus GDH > P. kodakaraensis GDH > T. litoralis GDH. They all contain the same 18-ion-pair network at their hexamer interface (Fig. 5). The mutation Glu158Gln, which removed two ion pairs at the center of this network, significantly destabilized P. kodakaraensis GDH (280). One ion pair network involving six charged residues is present only in P. furiosus GDH. The same ion pair network was created in P. kodakaraensis GDH and T. litoralis GDH by SDM. Both enzymes were stabilized by the newly introduced ion pair network (280, 348). These studies confirmed the role of ion pair networks in the P. furiosus, P. kodakaraensis, and T. litoralis GDH thermostabilities. Lebbink et al. (203) introduced a 16-residue ion pair network at the subunit interface in T. maritima GDH to create an interface similar to the 18-ion-pair network in P. furiosus GDH. The combination of three destabilizing mutations yielded a triple mutant enzyme (Ser128Arg-Thr158Glu-Asn117Arg) that was slightly more stable and thermophilic than the wild-type enzyme. This result illustrates the high level of cooperation that exists among the different members of this ion pair network. This result also supports the role of the 18-residue ion pair network in P. furiosus GDH stabilization.
In an earlier study, Tomschy et al. (337) had removed two ion pairs located on the surface of two α-helices in T. maritima GAPDH. Because these mutations did not affect the enzyme stability, the authors concluded that surface ion pairs could not be considered a general strategy of thermal adaptation. Both ion pairs chosen in this study were intrahelical ion pairs. These two pairs might have been located in protein areas that were overconstrained and that were not among the protein areas most susceptible to unfolding. In contrast, the other examples described above illustrate the thermostabilization effect of non-local ion pairs and ion-pair networks, which link nonadjacent residues (and secondary structures) in the sequence.
Additional, indirect evidence for the role of ion pairing in thermostability comes from genome sequencing. The major trend observed in Table 4 is toward an increased number of charged residues in hyperthermophilic proteins compared to mesophilic proteins, mostly at the expense of uncharged polar residues.
Matthews et al. (238) proposed that proteins of known three-dimensional structure could be stabilized by decreasing their entropy of unfolding. In the unfolded state, glycine, without a β-carbon, is the residue with the highest conformational entropy. Proline, which can adopt only a few configurations and restricts the configurations allowed for the preceding residue (313), has the lowest conformational entropy. Thus, the mutations Gly→Xaa or Xaa→Pro should decrease the entropy of a protein's unfolded state and stabilize the protein, as long as the engineered residue does not introduce unfavorable strains in the protein structure. This technique has been used to engineer enzymes that are more thermodynamically stable. For example, B. stearothermophilus neutral protease inactivates by autolysis, which targets a specific flexible surface loop (residues 63 to 69) (93). Prolines were introduced in that loop to make it less susceptible to unfolding. Only positions 65 and 69 were suitable for proline substitutions. In other positions, a proline would eliminate noncovalent interactions, create conformational strains, or have inappropriate torsion angles. Only mutations Ser65Pro and Ala69Pro proved thermostabilizing, as had been predicted by modeling (125). A number of thermophilic and hyperthermophilic proteins also use this stabilization mechanism (255) (Table 5). The Thermoanaerobium brockii secondary alcohol dehydrogenase contains eight more prolines than its Clostridium beijerinckii mesophilic homologue. Residues Pro177 and Pro316 at the N termini of two helices and Pro24 in position 2 of a β-turn were shown to be stabilizing (215). (Prolines were introduced in the corresponding locations in the C. beijerinckii enzyme.) There are at least 22 locations in which prolines occur only in thermophilic Bacillus oligo-1,6-glucosidases.
The majority of these prolines are in position 2 of solvent-exposed β-turns (seven of these prolines), in coils within loops (nine of them), or at the N cap of α-helices in the barrel structure (four of them). Prolines were introduced at the corresponding locations in the mesophilic Bacillus cereus oligo-1,6-glucosidase. Thermostability usually increased with the number of prolines introduced. The stability increase was most significant when prolines were added at position 2 of β-turns or at N caps of α-helices. The less stabilizing mutations probably introduced unfavorable van der Waals interactions or removed stabilizing H bonds (361).
The Thermotoga neapolitana xylose isomerase contains two prolines in a loop that is involved in intersubunit interactions. These prolines are absent in the less stable Thermoanaerobacterium thermosulfurigenes enzyme (Fig. 6). The kinetic stability properties of the two T. thermosulfurigenes xylose isomerase mutants Gln58Pro and Ala62Pro illustrate how important the mutation location is for the outcome of SDM (313). Both Gln58 and Ala62 had backbone dihedral angles which allowed for prolines, neither was involved in noncovalent stabilizing interactions, and Asp57 and Lys61 had dihedral angles that allowed for residues preceding prolines. The conformation of the Gln58 side chain was very close to that of the proline pyrrolidine ring (Fig. 6), and so no conformational strain was introduced by Pro; the mutation Gln58Pro stabilized the protein mainly by decreasing the entropy of unfolding. In contrast, the mutation Ala62Pro created a volume interference between the proline pyrrolidine ring (Cδ atom) and the Lys61 side chain (Cβ atom) that probably led to destabilizing conformational changes. The mutation Ala62Pro reduced the enzyme's t1/2 at 85°C by a factor of 10.
For almost half of the hyperthermophilic proteins listed in Table 5, intersubunit interactions are mentioned as a potential major stabilization mechanism. The only strong experimental evidence available that supports the role of intersubunit interactions in the stability of hyperthermophilic proteins was obtained with P. kodakaraensis and T. litoralis GDHs: ion pairs were created to match the structure of the more thermostable P. furiosus GDH (280, 348) (see “Ion Pairs” above). More work has been done using thermophilic enzymes as models. The mutation Gly281Arg in B. subtilis GAPDH (matching the sequence of the more stable B. stearothermophilus GAPDH) created kinetically stabilizing intersubunit ion pairs. The enzyme t1/2 increased from 19 min at 50°C to 198 min at 75°C (252). Compared to the E. coli 3-isopropylmalate dehydrogenase, the Thermus thermophilus enzyme shows a more hydrophobic subunit interface. SDM experiments showed that the hydrophobic interactions present in the Thermus enzyme made the dimer more resistant to dissociation (180, 250) (see “Hydrophobic Interactions” above). Mutagenesis of a hydrophobic core at the dimer interface in the T. thermophilus elongation factor EF-Ts showed that this hydrophobic core contributed to the enzyme dimerization and that dimer formation considerably contributed to the thermodynamic stability of T. thermophilus EF-Ts (259). Methanopyrus kandleri formylmethanofuran:tetrahydromethanopterin (H4MPT) formyltransferase (MkFT) is monomeric and inactive at low salt concentrations. It adopts active dimeric and tetrameric forms at higher salt concentrations. The activity of MkFT in dimeric or tetrameric forms in the presence of potassium phosphate is stimulated by the addition of NaCl, suggesting that oligomerization is a prerequisite for activity and that the mechanisms of salt-induced activation and salt-induced oligomerization are different. In the enzyme crystal structure, subunit interfaces are mostly hydrophobic.
Lyotropic salts increase hydrophobic interactions and probably strengthen subunit interactions. Oligomer formation requires higher concentrations of NaCl than of potassium phosphate (a stronger lyotropic salt), suggesting a dominant role for salting-out effects in MkFT thermostability (306). Both MkFT thermodynamic and kinetic stabilities increased with oligomerization (306). From the examples listed in Table 5 and above, it appears that intersubunit interactions indeed play a major role in the stabilization of hyperthermophilic proteins. Interestingly, there is no single type of intersubunit interaction responsible for this stabilization.
An ever-increasing number of hyperthermophilic proteins are known that have a higher oligomerization state than their mesophilic homologues (Table 7). Only for T. maritima phosphoribosylanthranilate isomerase is experimental evidence available demonstrating that dimerization is a stabilization factor (332). Thoma et al. (332) engineered monomeric variants of this enzyme by SDM. These monomeric variants remained as active as the wild-type enzyme. Their X-ray structure differed from that of the wild-type enzyme only in the restructured interface, but their kinetic stability at 85°C decreased by factors of 60 to 100 (from a t1/2 of 310 min for the wild-type enzyme to 3 to 5 min for the variants). For M. kandleri methenyl H4MPT cyclohydrolase (MkCH), trimerization probably increases the enzyme stability, since it leads to an enlarged buried surface area and increased packing density. Not only are the hydrophobic interactions between subunits strengthened but also several loops and the N and C termini are fixed by contacts to the neighboring subunits (120). Triosephosphate isomerase (TIM) is only expressed as a fusion enzyme with PGK in T. maritima. The T. maritima PGK-TIM fusion and the tetrameric structure were shown to enhance the stability and activity of TIM but not the stability of PGK (23). Based on stability studies of dimeric globular proteins, Neet and Timm (256) calculated that quaternary interactions could provide 25 to 100% of the conformational stability in protein dimers. This study, although performed with mesophilic proteins, suggests that oligomerization can be a significant stabilizing mechanism for hyperthermophilic enzymes. T. maritima xylanase XynA is organized in five domains. Domains N1 and N2 were shown to be necessary for optimal kinetic thermostability (365). These two domains are also present in T. saccharolyticum XynA (207) and are also needed for that enzyme's kinetic stability.
Modular organization may be another factor contributing to stability.
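The half-lives quoted above for the T. maritima phosphoribosylanthranilate isomerase variants can be related to the reported destabilization factors with a short calculation. The following sketch assumes simple first-order irreversible inactivation, which is an assumption made here for illustration and is not stated in the text. The function names and the 30-min incubation time are hypothetical.

```python
import math


def inactivation_rate_constant(half_life_min):
    """First-order inactivation rate constant (per minute) from a half-life.

    Assumption: activity decays as exp(-k * t), so k = ln(2) / t1/2.
    """
    return math.log(2) / half_life_min


def destabilization_factor(half_life_wild_type, half_life_variant):
    """Ratio of variant to wild-type inactivation rate constants."""
    return (inactivation_rate_constant(half_life_variant)
            / inactivation_rate_constant(half_life_wild_type))


def residual_activity(half_life_min, minutes):
    """Fraction of activity remaining after a given incubation time."""
    return 0.5 ** (minutes / half_life_min)


# Wild-type dimer: t1/2 of 310 min at 85 deg C; monomeric variants: 3 to 5 min.
for variant_half_life in (5.0, 3.0):
    factor = destabilization_factor(310.0, variant_half_life)
    print(f"t1/2 = {variant_half_life} min -> {factor:.0f}-fold faster inactivation")

# Hypothetical 30-min incubation at 85 deg C.
print(f"wild type: {residual_activity(310.0, 30.0):.2f} of initial activity")
print(f"variant (5 min): {residual_activity(5.0, 30.0):.3f} of initial activity")
```

With these inputs the ratios come out at about 62 and about 103, in line with the factors of 60 to 100 given in the text.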
Residues in the left-handed helical conformation (Φ = 40 to 60°, Ψ = 20 to 80°) have marginal conformational stability unless they are stabilized by intramolecular noncovalent interactions. (Non-Gly residues with a left-handed helical conformation are supposed to be less stable than the right-handed conformation by 0.5 to 2.0 kcal/mol.) The close contact between the β-carbon and the carbonyl oxygen within the residue in the left-handed helical conformation creates a local conformational strain on the protein structure. Two residues in the left-handed helical conformation, Glu15 in B. subtilis DNA binding protein HU and Lys95 in E. coli RNase H1, both in turn regions, are replaced by Gly residues in their thermophilic counterparts. Mutations Glu15Gly and Lys95Gly in B. subtilis DNA binding protein HU and E. coli RNase H1, respectively, eliminated the conformational strain created by the residues in the left-handed helical conformation, and significantly increased the thermodynamic stabilities of the two proteins (173, 179). In these two examples, the stability gained by the conformational strain release was enhanced by its stabilizing effect on secondary structure interactions. E. coli RNase H1 contains two additional non-Gly residues with left-handed helical conformation. Residues Trp90 and Asn100, in contrast to Lys95, point to the interior of the enzyme, and they make compensatory polar or hydrophobic interactions. Mesophilic ferredoxins contain three residues in the left-handed helical conformation in their [4Fe-4S] cluster binding region. In the T. maritima and T. litoralis homologs, the steric hindrance in the cluster binding region is released by the substitution of the residues in the left-handed helical conformation with three Gly residues. These three Gly residues are involved in H bonds with the cluster sulfur atoms (226).
Other types of conformational strain releases have been proposed as stabilizing mechanisms. In α-helices, for example, residues with a low helical propensity can be replaced by residues that have a high helical propensity. Such substitutions usually take place when a residue's side chain is not well accommodated in the α-helix. One particular substitution location in α-helices is the C terminus (or C cap). Gly is the most favorable residue at the C cap, because its lack of side chain allows it to adopt a left-handed helical conformation without strain and because the main chain carbonyl oxygen can form H bonds with solvent molecules. The P. furiosus citrate synthase contains at least seven helices that have a C-cap Gly. Their effect on stability is still unknown (254). In general, though, these types of conformational strain releases are not expected to provide significant stabilization, and they have not been characterized in detail in hyperthermophilic protein structures. They also compete with other stabilization mechanisms (e.g., propensity for hydrophobic interactions, for H bonds, or for ion pairs) (83).
Helix dipoles can be stabilized by negatively charged residues near their N-terminal end, as well as by positively charged residues near their C-terminal end. In the S. solfataricus indole-3-glycerol phosphate synthase, every helix dipole in the (α/β)8 barrel is stabilized versus six in the E. coli enzyme (130). The helices' dipoles are also stabilized in the B. stearothermophilus and T. maritima PGKs: while mesophilic enzymes have only 9 N caps and 12 C caps (pig PGK) and 10 N caps and 9 C caps (yeast PGK) stabilized by opposite charges, the numbers of stabilized N and C caps increase to 16 N caps and 13 C caps in the B. stearothermophilus PGK and to 17 N caps and 14 C caps in the T. maritima PGK (15). Nicholson et al. showed that N-cap stabilization could increase an enzyme's ΔGstab by approximately 0.8 kcal/mol (262). In general, though, N and C capping compete with other stabilization and destabilization mechanisms (e.g., propensity for H bonds or ion pairs, and conformational strain), and the stabilization provided is often marginal.
In a recent study, Karshikoff and Ladenstein (169) compared the partial specific volumes, voids, and cavity volumes in a set of 80 nonhomologous mesophilic, 20 thermophilic, and 4 hyperthermophilic proteins. They concluded that none of these factors could be considered a common thermostabilization mechanism. A few examples exist, however, of hyperthermophilic proteins that gain part of their stability from better packing (Table 5). For one of these proteins, thermodynamic stabilization by better packing was demonstrated experimentally: a solvent-accessible cavity in the Methanobacterium formicicum histone is partially filled by bulkier hydrophobic side chains in the M. fervidus protein (375). The mutations Ala31Ile and Lys35Met increased the M. formicicum protein Tm by 11 and 14°C, respectively, while the mutations Ile31Ala and Met35Lys decreased the M. fervidus histone Tm by 4 and 17°C, respectively (216). Britton and colleagues (38) suggested that the strongly increased Ile content in P. furiosus GDH was consistent with a general increase in packing when compared to the mesophilic C. symbiosum GDH. With the ability of Ile to adopt more conformations than Leu in proteins (see “Amino Acid Composition and Intrinsic Propensity” above), it is better able to fill various voids in the protein core (38). While better core packing is often linked to increased hydrophobicity, in some cases it can affect stability in different ways. Based on the ability of high short peptide densities to form α-helices in crystals, it has been proposed that an increasing peptide concentration increases the stability of helices (83). Internal packing, therefore, could be involved in protein thermostability either as a general stabilizing force or as a factor altering the stability of secondary structures.
Since surface hydrophobic residues cannot participate in stabilizing interactions with the solvent, they are detrimental to protein stability and solubility. A number of hyperthermophilic proteins show significantly reduced hydrophobic accessible surface areas (ASA): T. maritima lactate dehydrogenase, P. kodakaraensis O6-methylguanine-DNA methyltransferase, M. kandleri MkCH, and S. acidocaldarius anthranilate synthase (Table 5). In S. acidocaldarius superoxide dismutase, the hydrophobic ASA represents only 18% of the total ASA, as opposed to 26.8% in mesophilic enzymes. This decrease in hydrophobic ASA is balanced by an increase in polar ASA (184).
Loops and N and C termini are usually the regions with the highest thermal factors in a protein crystal structure. They are likely to unfold first during thermal denaturation. Going against the earlier belief that loops had no bearing on protein stability, loops in hyperthermophilic proteins show structural features that could lead to protein stabilization. Two loop-stabilizing trends have been observed (Table 5): loops are either shortened or better anchored to the rest of the protein. Loop shortening can be the consequence of the extension or the creation of a secondary structure (T. maritima PGK and lactate dehydrogenase in Table 5). Loop anchoring is achieved through ion pairing, H bonding, or hydrophobic interactions. Stabilization of the N and C termini involves mechanisms similar to those in loop stabilization. In Fig. 7, the N terminus of T. maritima ferredoxin is fixed to the protein core by H bonds. N and C termini often interact with each other for mutual stabilization (T. maritima PGK and phosphoribosyl anthranilate isomerase in Table 5). N and C termini, as well as loops, can also be anchored by participating in subunit interfaces (T. maritima PGK and MkCH in Table 5).
One of the best examples of multiple dockings is found in M. kandleri MkFT. An insertion region that is loosely attached to its subunit interacts extensively with the β-meander region from another subunit, docking each area to the other. Loops are linked to adjacent regions in the same or different subunit by multiple interactions (mainly H bonds); the N- and C-terminal residues are connected to each other and to the protein (they even have average B factors) (96). In A. pyrophilus superoxide dismutase, loop 2 is extended and plays a key role in forming a compact tetramer. This loop is not flexible (it has low B factors), and it makes extensive contacts with other subunits in the tetramer. The C terminus is 10 or 11 residues longer than in other superoxide dismutases. These additional residues extend the C-terminal α-helix by two more turns, and this C-terminal helix makes extended contacts with the C-terminal helix of another subunit (220). Trimerization of MkCH is probably a stabilization mechanism in itself, but it also allows several loops and the N and C termini to be fixed by contacts to the neighboring subunits (120).
Metals have long been known to stabilize and activate enzymes. Xylose isomerases bind two metal ions (chosen from Co2+, Mg2+, and Mn2+). One cation is directly involved in catalysis; the second is mainly structural (232, 363). The two metal binding sites have different specificities, and replacing one cation with another often significantly alters enzyme activity, substrate specificity, and thermostability (171, 232). A study of Bacillus licheniformis xylose isomerase stability in the presence and absence of metals showed that the evolution of kinetic stability followed that of thermodynamic stability and that both types of stabilities were functions of the nature of the metal present (Table 8). These observations suggest that major stabilizing forces are associated with the presence of metal in the holoenzyme.
Indirect evidence for the role of metals in the stability of hyperthermophilic proteins is the difficulty encountered in removing the metals from the enzymes. α-Amylases specifically bind Ca2+. The α-amylase catalytic site is located in a cleft between two domains (an [α/β]8 barrel and a large loop). Coordinated by ligands belonging to these two domains, Ca2+ is essential for the enzyme's catalytic activity and thermostability (30). The P. furiosus extracellular α-amylase was initially described as a Ca2+-independent enzyme, because room temperature EDTA treatments had no effect on its activity (85). Further characterization revealed that this enzyme contains at least two Ca2+ cations that cannot be removed by EDTA at temperatures below 70°C. A 30-min EDTA treatment at 90°C removed approximately 60 to 70% of the bound Ca2+ (A. Savchenko, C. Vieille, and J. G. Zeikus, unpublished results). Similar observations were made with the Thermococcus profundus α-amylase (approximately 80% identical to the P. furiosus extracellular α-amylase). This enzyme was activated and stabilized by Ca2+, but room temperature EDTA treatments had no effect on activity (65).
Some thermophilic and hyperthermophilic enzymes have been described that contain metal atoms that are not present in their mesophilic homologs. The ferredoxin from Sulfolobus sp. strain 7 contains an extra 40-residue N-terminal extension that is linked to the protein core by a Zn binding site. The zinc atom is liganded by three His residues from the N-terminal domain and one Asp residue from the core domain. This structure (N-terminal extension plus Zn binding site) is absent in eubacterial homologs but is conserved in all other thermoacidophiles (107). Progressive N-terminal deletions and SDM of two of the three His ligands showed that both the N-terminal extension and the zinc atom are important for thermodynamic stability. Their presence or absence has no effect, though, on ferredoxin function. The zinc atom is responsible for a 9°C increase in Tm. It is so tightly bound inside the protein that it cannot be removed without removing the two FeS clusters (191). Thermoactinomyces vulgaris subtilisin-type serine-protease thermitase contains three Ca2+-binding sites; one of them is not present in its mesophilic homologues (331). A thermophilic homologue of thermitase, the Bacillus Ak1 protease, contains one more Ca2+ than thermitase does, and it is significantly more kinetically stable than thermitase in the presence of Ca2+ (t1/2 of 15 h at 80°C versus 19 min for thermitase) (311). Since Ca2+ preferentially binds carboxylate and other oxygen ligands (which are the metal ligands most likely to be located on the protein surface), this metal is more likely than others to play a significant stabilizing role in proteins.
Perutz and Raidt (273) suggested that ion pairs linking portions of the protein that are juxtaposed in the structure but nonadjacent in the sequence can significantly contribute to protein thermostability. Recent information accumulated on hyperthermophilic proteins strongly supports this hypothesis. Figure 4 shows three types of nonlocal interactions involving residues from one α-helix (Tyr93) and three different loops (Arg19, Thr84, and Asp111). While most information available concerns ion pairs (Table 5 and references therein), the hypothesis of Perutz and Raidt might also extend to other types of noncovalent interactions. Chimeras were constructed between P. furiosus and Clostridium pasteurianum rubredoxins. Their relative stabilities (compared to the P. furiosus and C. pasteurianum rubredoxins) indicate that essential stabilizing interactions exist between the protein core and the β-sheet (H bonds or hydrophobic interactions). Individually, neither the core nor the β-sheet provides extensive stabilization (92).
The stabilization provided by a few nonlocal ion pairs was tested by SDM. In T. maritima indoleglycerol phosphate synthase, the ion pair Arg241-Glu73 links helices α8 and α1 in the (α/β)8 barrel. At 85.5°C, the mutation Arg241Ala increased the enzyme denaturation rate by a factor of almost 3. The enzyme Ea of unfolding at 85°C decreased by 3.2 kJ/mol, suggesting that the Arg241-Glu73 ion pair participates in the kinetic stabilization of this enzyme (246). In Bacillus polymyxa β-glucosidase A, the mutation Glu96Lys created a stabilizing surface salt bridge (Lys96-Asp28) between two distant parts of the protein sequence: a loop (Asp28) and the N terminus of a helix (Lys96) (293). The docking of loops and N and C termini on the protein surface also involves numerous nonlocal interactions. Figure 7 illustrates the docking of a protein N terminus to a surface turn.
Protein glycosylation is widespread among eucaryal enzymes, and a number of bacterial extracellular enzymes are glycosylated. Only a few examples are known of hyperthermophilic proteins that are glycosylated, and their carbohydrate moieties have not been extensively characterized (100, 138). Most glycosylated enzymes (bacterial, archaeal, and eucaryal), though, retain their catalytic and stability properties when expressed in bacteria. A few studies using naturally glycosylated eucaryal proteins showed that glycosylation could cause significant thermal stabilization without affecting the protein folding pathways or their conformations. The higher tendency of the deglycosylated enzymes to aggregate during thermal inactivation suggested that glycosylation could also prevent partially folded or unfolded proteins from aggregating (163, 359). Bovine pancreatic RNase A and RNase B differ only by a carbohydrate moiety attached on Asn34 in RNase B. This carbohydrate moiety is responsible for the higher kinetic and thermodynamic stability of RNase B. Progressive hydrolysis of the carbohydrate moiety showed that the stability difference was due to the attachment of the first carbohydrate unit to Asn34 (13).
The effect of glycosylation on thermostability was also studied with two Bacillus β-glucanases expressed in E. coli and in Saccharomyces cerevisiae. One of the two enzymes was strongly kinetically stabilized by glycosylation at 70°C, and its optimum temperature for activity was higher. The thermostabilization level depended more on the location of the carbohydrate moiety on the protein than on the extent of glycosylation (267). While glycosylation is probably not a thermostabilization method commonly found in nature, the few examples cited above suggest that it could represent an alternative method for either enzyme thermostabilization or solubilization.
Posttranslational lysine methylation (formation of N-ε-monomethyllysine) has been described for a number of Sulfolobus proteins (91, 231, 240). The native small DNA binding protein Sac7d from S. acidocaldarius (monomethylated on Lys5 and Lys7) reversibly denatures at 100°C (pH 7.0) (91). The recombinant Sac7d denatures at 92.7°C. The 7°C difference in Tm between native and recombinant Sac7d has been attributed to Lys methylation, which is absent in the recombinant protein (240). It is unclear, though, if Lys methylation is a general thermostabilization mechanism in members of the Sulfolobales, since the stability of Sso7d (the Sac7d homologue in S. solfataricus) is methylation independent.
While most pure hyperthermophilic enzymes are intrinsically very stable, some intracellular hyperthermophilic proteins get their high thermostability from intracellular environmental factors such as salts, high protein concentrations, coenzymes, substrates, activators, polyamines, or an extracellular environmental factor such as pressure.
Inorganic salts stabilize proteins in two ways: (i) through a specific effect, where a metal ion interacts with the protein in a conformational manner (see “Metal binding” above), and (ii) through a general salt effect, which mainly affects the water activity. Thauer and colleagues studied the effect of salts on the thermostability and the activity of five M. kandleri methanogenic enzymes (36, 37, 181, 224, 225). While the five enzymes are activated and kinetically stabilized by salts, the extent of the salt effect is enzyme dependent. K+ and NH4+ typically stabilize enzymes more efficiently than other cations do. Of all the anions, SO42− and HPO42− have the strongest activating effect (36). Enzyme salt requirements are not always satisfied by the intracellular salt concentration. The M. kandleri intracellular salt concentration (>1 M potassium plus 1 M cyclic 2,3-diphosphoglycerate [cDPG]) seems to favor MkCH activity (maximal at 1.5 M salt) over its stability (optimal below 0.1 M salt) (37). The effects of salts on CHO-tetrahydromethanopterin (H4MPT) formyltransferases from M. kandleri, M. thermoautotrophicum, Archaeoglobus fulgidus, and Methanosarcina barkeri were compared. The difference in formyltransferase activation by salts was directly correlated to the intracellular cDPG concentration in the different organisms (36). The structure of MkFT was analyzed in terms of its stabilization by salts. Two features were suggested to be related to this property: (i) MkFT presents a decrease in accessible surface hydrophobicity, as well as intersubunit interfaces that are largely hydrophobic; and (ii) the tetramer surface presents an excess of negatively charged residues (48 versus 24 basic residues). Acidic residues can form strong H bonds and multiple H bonds to water molecules, enabling these residues to compete with inorganic cations or water. Of all the residues, Glu has the highest capacity to bind water molecules. 
Of the 48 surface negative residues, 33 are Glu and 15 are Asp (96). High lyotropic salt concentrations are supposed to enhance the surface ionic interactions due to an increasing number of inorganic cations at the negatively charged surface and to enhance intersubunit hydrophobic interactions due to the salting-out effect. MkFT oligomer formation was shown to require higher concentrations of NaCl than of potassium phosphate (a stronger lyotropic salt), suggesting a dominant role for salting-out effects in MkFT thermostability (306). This protein might have evolved to be optimally stable in the presence of a high intracellular salt concentration. M. kandleri cells contain approximately 1 M cDPG when grown at 98°C. Potassium salts of cDPG, 2,3-DPG, and phosphate are equally effective at activating M. kandleri cyclohydrolase. However, at equal ion concentrations cDPG is more effective at stabilizing MkFT. In M. kandleri, the cDPG concentration is optimal for the activities and stabilities of MkCH and MkFT. Synthesizing cDPG requires 4 ATP molecules. The last reaction in this synthesis is the only one exergonic enough to drive the synthesis up to cDPG rather than to its precursor 2,3-DPG. Also, at pH 7.0, cDPG is a trianion whereas 2,3-DPG is a penta-anion; thus cDPG has a smaller effect on the ionic strength than would 2,3-DPG (305).
M. fervidus GAPDH is intrinsically kinetically stable only up to 75°C. A study of this enzyme's thermostabilization by salts indicated that the relative salt effects—K3PO4 > Na3PO4 > K2SO4 > Na2SO4 > KCl > NaCl—were consistent with their respective abilities to reduce the enzyme solubility in an aqueous solvent. Their action was attributed to their salting-out effects (98). M. fervidus GAPDH is probably stabilized in vivo by cDPG, which is present in this organism at approximately 0.2 to 0.3 M (305). It is interesting that other M. fervidus enzymes are only stable up to temperatures below the organism's optimal growth temperature, suggesting that stabilization by salts is a common mechanism in this organism (98).
The Thermoanaerobacterium thermosulfurigenes xylose isomerase was also stabilized by K+. This enzyme's t1/2 increased sevenfold in the presence of 100 mM KCl (244).
Substrate molecules have long been known to stabilize enzymes specifically by stabilizing their active site. This observation is also valid for hyperthermophilic enzymes (9, 98, 182, 187, 309, 328). T. maritima dihydrofolate reductase was shown to be strongly kinetically stabilized by substrates, in particular by NADPH (sixfold increase in t1/2 at 80°C) (364). NADPH is quite unstable at high temperatures. This strong stabilization of T. maritima dihydrofolate reductase by NADPH could also be associated with a strong stabilization of the cofactor.
Because many high-temperature environments are also high-pressure environments and because microorganisms cannot evade pressure and temperature, all the macromolecular cell components have to be adapted to high pressures. Thus, it is not surprising to find hyperthermophilic organisms that are also barophilic (such as Thermococcus barophilus [Table 1]) and to find enzymes that are stabilized and activated by high pressures (e.g., M. jannaschii protease and hydrogenase) (129, 247, 248). The theory behind stabilization by pressure says that pressure favors the structure with the smallest volume. Proteins stabilized mainly by hydrophobic interactions are therefore expected to be stabilized at high pressure, whereas proteins stabilized by ionic interactions should be destabilized (247). P. furiosus rubredoxin, for example, is stabilized mostly by electrostatic interactions. This enzyme is destabilized by high pressures. Since numerous chemical reactions are performed at high temperatures and pressures, enzyme stability at high pressures has great potential benefits for biocatalysis.
The Q10 rule indicates that many reaction rates double with each 10°C increase in temperature. According to this rule, one would expect hyperthermophilic enzymes to have specific activities between 50 and 100 times higher than those of mesophilic enzymes. What is observed instead is that hyperthermophilic and mesophilic enzymes have approximately the same activities and catalytic efficiencies under their respective physiological conditions. The fact that hyperthermophilic enzymes are not as catalytically optimized as their mesophilic homologues was attributed to the principle that there had to be a trade-off between thermostability and activity, i.e., that a protein could not be both hyperstable and catalytically optimized. This principle came from the observation of natural proteins. Numerous protein engineering studies performed in the last 10 years suggest instead that protein stability can be enhanced without deleterious effects on activity and that stability and activity can actually be increased simultaneously (116, 346, 374).
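The 50- to 100-fold expectation can be checked with a short calculation. The sketch below simply applies the Q10 rule as stated above; the function name and the example temperatures (37°C for a mesophile, 95 and 100°C for a hyperthermophile) are illustrative assumptions, not values taken from a particular study.

```python
def q10_fold_increase(t_low_c: float, t_high_c: float, q10: float = 2.0) -> float:
    """Rate ratio predicted by the Q10 rule between two temperatures (in °C).

    q10 is the factor by which the rate increases per 10°C (2.0 = doubling).
    """
    return q10 ** ((t_high_c - t_low_c) / 10.0)


if __name__ == "__main__":
    # Hypothetical comparison: mesophile at 37°C vs. hyperthermophile at 95 or 100°C.
    for t_high in (95.0, 100.0):
        fold = q10_fold_increase(37.0, t_high)
        print(f"37 -> {t_high:.0f} °C: predicted {fold:.0f}-fold rate increase")
```

With these assumed temperatures, the predicted ratios fall inside the 50- to 100-fold window quoted in the text, which is the increase that is not observed experimentally.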
The facts that mesophilic enzymes are not optimized in terms of stability and that hyperthermophilic enzymes are not catalytically optimized are probably only a reflection of the absence of selection pressure for these characteristics. Organisms need to have proteins that they are able to degrade, in order to rapidly adapt to changes in environmental stimuli. Hence, mesophilic as well as hyperthermophilic enzymes are only marginally stable under their respective physiological conditions. Unless their substrate is highly unstable, there is no selection pressure in nature for hyperthermophilic enzymes to be highly active. A protein engineer should be able to increase an enzyme's thermostability without negatively affecting its catalytic properties. The limit to this thermostability increase is the upper temperature of protein stability (which is still unknown).
Two types of protein stability (thermodynamic and long term) are of interest from an applied perspective. Increasing the thermodynamic thermostability is the main issue when an enzyme is used under denaturing conditions (i.e., high temperatures or organic solvents). Industrialists need active enzymes rather than enzymes that are in a reversibly inactivated state. For other enzymes, for example diagnostics enzymes, it is often long-term stability that needs to be improved (251).
Depending on an enzyme's first inactivation step, i.e., chemical inactivation or unfolding, stabilizing the native, active conformation should involve either substituting temperature-sensitive residues with chemically more stable residues or increasing the enzyme's resistance to unfolding, respectively. A number of attempts at stabilizing (or destabilizing) proteins by SDM have failed because they did not target protein areas that were critical for the unfolding process (173, 174, 337). On the other hand, mutations targeted at areas whose unfolding is limiting in the protein denaturation process can provide extensive stabilization. Good illustrations can be found in the stability studies of Bacillus stearothermophilus thermolysin-like protease. This enzyme is irreversibly inactivated by autolysis that is made possible by partial unfolding of local surface areas. Stabilizing mutations were all located on the surface, around one flexible loop located in the β-pleated N-terminal domain (125, 230, 347). The association of eight mutations in the same area resulted in a 340-fold kinetic stabilization of B. stearothermophilus thermolysin-like protease at 100°C and did not affect the catalytic activity at 37°C (346). In this enzyme, since the target for autolysis is a loop belonging to the N-terminal domain, any attempt to stabilize the enzyme introducing mutations into the C-terminal domain would have been doomed to fail.
While improving thermodynamic thermostability can have a beneficial effect on long-term stability, other strategies can also be used to increase long-term stability. In this case, the focus for stabilization is decelerating the irreversible inactivation process that usually follows reversible unfolding. The strategies that can be used include (i) eliminating protein diffusion to block aggregation and other bimolecular processes (the most effective approach so far is immobilization), (ii) replacing temperature-sensitive residues with chemically more stable residues (reference 251 and references therein), and (iii) stabilizing the reversibly unfolded state by introducing more hydrophilic residues on the enzyme surface or by adding low-molecular-weight compounds to the solute (i.e., inorganic and organic salts, organic cosolvents, or classical denaturants). Hen egg-white lysozyme slowly deamidates once reversibly unfolded (6). In their 1995 study, Tomizawa et al. (336) replaced Gly residues with Ala in the potentially deamidable Asn-Gly sequences of lysozyme. These mutations generally increased the rate of reversible unfolding, but they decreased the rate of irreversible inactivation and, as a result, stabilized the enzyme against irreversible inactivation. This study is a good illustration of the fact that resistance against irreversible inactivation is not synonymous with thermodynamic thermostability.
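The difference between resistance to unfolding and resistance to irreversible inactivation can be illustrated with a minimal kinetic sketch. It assumes a two-step scheme, N <=> U -> I, with a fast equilibrium between the native (N) and reversibly unfolded (U) states; this is a common simplification and is not the model used in the lysozyme study. The function name and all numerical values are hypothetical.

```python
def observed_inactivation_rate(k_unfold_eq: float, k_irrev: float) -> float:
    """Apparent first-order rate constant for the scheme N <=> U -> I.

    k_unfold_eq: equilibrium constant [U]/[N] (dimensionless).
    k_irrev: rate constant of the irreversible step U -> I (per unit time).
    Assumes the N <=> U equilibrium is fast compared with U -> I.
    """
    fraction_unfolded = k_unfold_eq / (1.0 + k_unfold_eq)
    return fraction_unfolded * k_irrev


if __name__ == "__main__":
    # Hypothetical values: the variant unfolds more readily (larger [U]/[N])
    # but its unfolded state is chemically more robust (smaller k_irrev).
    wild_type = observed_inactivation_rate(k_unfold_eq=0.10, k_irrev=1.0)
    variant = observed_inactivation_rate(k_unfold_eq=0.20, k_irrev=0.2)
    print(f"wild type: {wild_type:.4f}, variant: {variant:.4f}")
```

Under these assumed numbers the variant is thermodynamically less stable yet inactivates more slowly overall, which is the qualitative behavior described for the Asn-Gly substitutions.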
The different stabilization strategies (e.g., better core packing, surface ion pairing, surface loop stabilization, and reduction of the entropy of unfolding) discussed in this review do not have comparable potential for protein stability engineering today. The high conservation of the protein core (mostly defined by α-helices and β-strands) between mesophilic and hyperthermophilic protein homologues suggests that the protein core is already quite optimized for stability, even in mesophilic enzymes. For this reason, mutations in the protein core are often destabilizing, with stabilizing effects often being masked by destabilizing conformational constraints or repulsive van der Waals interactions. The stability gain from α-helix stabilization by introducing residues with high helix propensity is also usually small (373). More promising strategies are directed at the protein surface loops and turns: surface residues are typically involved in fewer intramolecular interactions than internal residues, and newly introduced residues are less likely to create volume interferences. Substitutions in these areas are accommodated by rearrangements of the neighboring residues more easily than in rigid parts of the protein. Successes in substituting left-handed helical residues with Gly or Asn, in introducing prolines in surface turns or loops, in introducing nonlocal surface ion pairs, and in creating disulfide bridges docking loops to the protein surface are well documented (173, 179, 238, 293, 346). Natural examples along these lines are the docking of the N and C termini and the anchoring of “loose ends” observed in the structures of many hyperthermophilic enzymes (Table 5).
The most promising strategies for thermostabilization using SDM should focus on the surface areas, mostly on loops and turns, and on creating additional nonlocal ion pairs. Loops can be made more rigid by decreasing their intrinsic entropy of unfolding. Two types of mutations, Gly→Xaa and Xaa→Pro, can be introduced (238). In the first case, the newly introduced β-carbon should not interfere with neighboring atoms. In the second case, the substitution site should have specific dihedral angles (ϕ and ψ) in the regions −50 to −80 and 120 to 180 or −50 to −70 and −10 to −50 and the residue preceding the potential proline should also have a specific conformation. In addition, the proline ring should not interfere with neighboring atoms, and the substitution should not eliminate stabilizing noncovalent interactions. Another method is to anchor the loops to the protein surface, either by noncovalent interactions or with a disulfide bridge. Introducing a disulfide bridge in a semiflexible area of the protein should help compensate for any conformational strain created by the disulfide bridge (230). Ion pairs linking nonadjacent sequences in a protein have a great stabilizing potential, and since they can be designed on the protein surface, they do not tend to create as many destabilizing conformational constraints or repulsive van der Waals interactions as substitutions of buried residues. Metal-mediated protein cross-linking can also be a stabilizing strategy. Such cross-linking stabilizes the protein by reducing the entropy of the denatured state; therefore, it depends on the size of the loop formed by the cross-link. β-sheets offer both the geometry and rigidity required for metal ion chelating by dihistidine sites (253).
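The dihedral-angle criterion for Xaa→Pro substitutions lends itself to a simple screen of candidate sites. The sketch below encodes only the two ϕ/ψ windows quoted above; it does not check the conformation of the preceding residue, steric clashes with the proline ring, or the loss of noncovalent interactions, all of which the text also requires. The function name and the example residues are hypothetical.

```python
def is_proline_compatible(phi: float, psi: float) -> bool:
    """Return True if (phi, psi), in degrees, falls in one of the two windows
    quoted in the text for a proline substitution site:
      window A: phi -80 to -50 and psi 120 to 180
      window B: phi -70 to -50 and psi -50 to -10
    """
    window_a = -80.0 <= phi <= -50.0 and 120.0 <= psi <= 180.0
    window_b = -70.0 <= phi <= -50.0 and -50.0 <= psi <= -10.0
    return window_a or window_b


if __name__ == "__main__":
    # Hypothetical loop residues with backbone angles measured from a structure.
    candidates = {"Ser45": (-65.0, 150.0), "Ala46": (-120.0, 130.0), "Thr47": (-60.0, -30.0)}
    for name, (phi, psi) in candidates.items():
        verdict = "candidate" if is_proline_compatible(phi, psi) else "rejected"
        print(f"{name}: phi={phi}, psi={psi} -> {verdict}")
```

Sites that pass this filter would still need to be inspected in the three-dimensional structure before mutagenesis.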
Although not reviewed here in detail, a number of computer algorithms based on physical and chemical principles are being developed to predict protein rigidity and stability and to design stabilizing mutations. Initial results are encouraging and suggest that computer algorithms will become a powerful tool for protein engineers in the near future. Using a computer algorithm based on the dead-end elimination theorem, Malakauskas and Mayo (228) designed a hyperthermostable variant of the streptococcal protein G β1 domain. The variant had a Tm that was increased by more than 20°C and a ΔGstab that was increased by 4.3 kcal/mol at 50°C. Structural analysis of the variant indicated that the main stabilization mechanisms were the release of a strained rotamer conformation, the increased burial of hydrophobic area, and higher helical propensity.
The FIRST software was developed to predict flexible and stable regions in proteins of known structure (161). Based on the hypothesis that thermostable enzymes are more rigid than their less stable counterparts, we are now using mesophilic, thermophilic, and hyperthermophilic adenylate kinases as model enzymes to test the ability of FIRST to predict the interactions important for thermostability. The energy levels at which breakups occur in the hydrogen bond networks are related to the relative thermostability of the enzymes. We will attempt to identify H bonds that could be responsible for the higher stabilities of the thermophilic and hyperthermophilic adenylate kinases. The potential role of these H bonds in stabilization will be tested by SDM.
The last 10 years have seen the development of molecular dynamics (MD) simulations applied to protein unfolding (53, 54, 71, 74, 155, 200, 201, 212–214, 360). Due to limited computing power, these simulations have been confined to the study of very small proteins (e.g., barnase and rubredoxin) or of protein fragments. Multiple MD trajectories of the same protein under identical conditions confirm the newest description of protein unfolding as a funnel-like pathway (155, 200, 214): the trajectories typically differ widely from one another, but a statistically preferred unfolding pathway emerges from these comparisons, at least up to an early unfolding intermediate (200, 214). The structural properties of this intermediate are often similar to those deduced from experimental unfolding data (200, 214). They are also often similar for simulations run with different algorithms, under different environmental conditions (e.g., different temperatures, different solvent conditions) (71, 214, 360), or with homologous proteins of different thermostabilities (201). These observations suggest that MD simulations can provide clues about how enzymes start to unfold and which regions to target for stabilization (71, 201). Comparing MD simulations that reproduce thermal movements at room temperature with MD simulations that induce unfolding makes it possible to distinguish movements related to catalysis from movements related to unfolding and to identify regions that could sustain stabilization without affecting activity (71).
Continual algorithmic improvements and improvements in computer power and speed should extend the use of MD simulations to bigger proteins (73). Investigation by SDM of hypotheses drawn from MD simulations and comparisons between MD simulations and nuclear magnetic resonance (NMR) or hydrogen exchange data should soon provide us with a clearer understanding of how MD can be used to study protein thermostability and flexibility.
Directed evolution is a powerful engineering method, and it is now often used to design enzymes with increased thermostability (297). This method has also been used for a variety of other needs, such as developing enzymes active in solvents, enzymes with altered substrate specificity (12, 61), or thermostable enzymes with high activity at 20 to 37°C. Enzymes improved by directed evolution have already been commercialized (297). A major advantage of this engineering method over SDM is that no knowledge about enzyme structure is necessary. Since there is still much to be learned about thermostabilization mechanisms, SDM approaches often yield disappointing results. To date, directed evolution has proven to be a much more powerful engineering method than SDM.
Two directed-evolution methods have been developed. The first one, DNA shuffling, involves random fractionation of a gene with DNase I followed by PCR-mediated reassembly of the full gene. This method introduces point mutations at a rate of approximately 0.7% (314). Each round of mutagenesis is followed by screening for the desirable trait. This mutagenesis procedure can be accelerated by shuffling a family of genes together (70). One limitation to any random-mutagenesis method is the generation of deleterious mutations. For this reason, mutagenesis rates must be kept low to avoid multiple mutations in a single gene copy. Shuffling several homologous genes provides sequence diversity as well as functional diversity, and hence it can increase the mutagenesis rate without increasing the risk of creating deleterious variants. Genes from mesophiles and hyperthermophiles can be shuffled to select for the combination of high catalytic activity at mesophilic temperatures and high stability.
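The reason mutagenesis rates must be kept low can be made concrete with a rough estimate of the mutation load per gene copy. The sketch below reads the 0.7% figure as a per-base rate and treats mutations as independent events (a Poisson approximation); both are assumptions made for illustration, as are the function names and the 1,000-bp gene length.

```python
import math


def expected_mutations(gene_length_bp: int, rate_per_bp: float) -> float:
    """Mean number of point mutations per gene copy."""
    return gene_length_bp * rate_per_bp


def prob_multiple_mutations(gene_length_bp: int, rate_per_bp: float) -> float:
    """Probability that a gene copy carries two or more point mutations,
    assuming independent mutations (Poisson approximation)."""
    lam = expected_mutations(gene_length_bp, rate_per_bp)
    # P(k >= 2) = 1 - P(0) - P(1) = 1 - exp(-lam) * (1 + lam)
    return 1.0 - math.exp(-lam) * (1.0 + lam)


if __name__ == "__main__":
    # Hypothetical 1,000-bp gene at the shuffling rate quoted in the text
    # and at a hundredfold lower rate for comparison.
    for rate in (0.007, 0.00007):
        mean = expected_mutations(1000, rate)
        p_multi = prob_multiple_mutations(1000, rate)
        print(f"rate {rate}: mean {mean:.2f} mutations, P(>=2) = {p_multi:.4f}")
```

Under these assumptions, nearly every copy of a gene of this size would carry several mutations at the higher rate, which is why the rate has to be tuned to the gene length and to the screening capacity.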
The second evolution method involves error-prone PCR together with DNA shuffling. In short, sequence diversity is created by one or several cycles of error-prone PCR, with each cycle being followed by screening for the desirable trait. Variants with the best characteristics are then recombined by the DNA-shuffling technology. Two examples clearly illustrate the power of this technology. Bacillus subtilis subtilisin E was converted into an equivalent of its thermophilic homologue thermitase through a succession of one error-prone PCR, one step of DNA shuffling (to combine the properties of the best variants), and four additional rounds of error-prone PCR. The evolved enzyme was 15 times more active than subtilisin E at 37°C, it showed a 16°C increase in Topt, and its t1/2 at 65°C was more than 200 times that of subtilisin E (Fig. 8) (374). Sequence information and structural analysis indicated that most of the stabilizing mutations were in loops connecting elements of secondary structure (i.e., the most variable regions). The fact that some of these substitutions could not be modeled illustrates the limitations of SDM engineering approaches. In another experiment, B. subtilis p-nitrobenzyl esterase thermostability was enhanced through five cycles of error-prone PCR followed by one step of DNA shuffling (116). The evolved esterase showed a 14°C increase in Tm and a 10°C increase in Topt, and it was more active than the wild-type enzyme at any temperature. Although activity was screened at 30°C (instead of higher temperatures) after each mutagenesis cycle, increases in Tm always resulted in increased Topt. In both sets of experiments, B. subtilis subtilisin E and esterase variants could be generated that were significantly more thermostable while still as active at low temperatures as the wild-type enzyme.
These results suggest that activity and thermostability are at least partially independent properties and that they can be optimized in the same enzyme (116).
The characterization of T. aquaticus Taq DNA polymerase followed by the quick popularization of PCR-related technologies was instrumental in the ever-growing interest of the scientific and industrial communities in thermophilic and hyperthermophilic enzymes. Only a few of today's industrial and specialty enzymatic processes utilize thermophilic and hyperthermophilic enzymes. The ever-growing number of enzymes characterized from hyperthermophilic organisms and the recent advent of powerful protein engineering tools suggest that thermophilic and hyperthermophilic enzymes will see more and more use in a variety of applications. This section assesses the interest thermophilic and hyperthermophilic enzymes hold for a few major industrial and specialty enzyme applications.
The cloning and expression of T. aquaticus Taq DNA polymerase in E. coli was instrumental in the development of the PCR technology. Thermophilic DNA polymerases have since been cloned and characterized from a number of thermophiles and hyperthermophiles. The multiple applications of the PCR technology make use of two major properties of these DNA polymerases: processivity and fidelity. Devoid of 3′-5′ proofreading exonuclease activity, the Taq polymerase synthesizes DNA faster (but with a higher error rate) than do enzymes with 3′-5′ proofreading activity. Taq DNA polymerase's high processivity makes it the enzyme of choice for sequencing or detection procedures. Proofreading enzymes (such as Vent and Deep Vent polymerases [Table 9]) are preferred when high fidelity is required. While thermophilic DNA polymerases have partially replaced mesophilic enzymes in a few applications, most applications were developed after the advent of PCR (e.g., PCR in situ hybridization and reverse transcription-PCR). These applications have been extensively reviewed (135, 198, 308).
Thermophilic DNA ligases are commercially available (Table 9). Optimally active in the range 45 to 80°C, they represent an excellent addition to PCR technology. These enzymes are perfect for ligating adjacent oligonucleotides that are hybridized to the same target DNA. This property can be used for ligase chain reaction (a DNA amplification method), for mutational analysis (by oligonucleotide ligation assay), or for gene synthesis (from overlapping oligonucleotides).
A number of thermophilic and hyperthermophilic proteases are now used in molecular biology and biochemistry procedures. Some proteins, in particular thermophilic proteins, resist proteolytic digestion at moderate temperatures (20 to 60°C). They only start to unfold and to become sensitive to proteolytic attack above 70°C. Proteases like the Thermus Rt41A serine protease PreTaq (Table 9), which is rapidly inactivated by EGTA, can be used in DNA and RNA purification procedures. Once inactivated, PreTaq will not interfere with other enzymes during further treatment of the DNA or RNA. This enzyme can be used as an adjunct to PCR to break down cellular structures prior to PCR. The P. furiosus protease S has a broad specificity, so it is used to fragment proteins before peptide sequencing. Other hyperthermophilic proteases are used for protein N- or C-terminal sequencing (Table 9). Numerous thermophilic restriction endonucleases are now commercialized. Most of them, isolated from Bacillus and Thermus strains, are optimally active in the range of 50 to 65°C.
Most industrial starch processes involve starch hydrolysis into glucose, maltose, or oligosaccharide syrups. These syrups are then used as fermentation syrups to produce a variety of chemicals (e.g., ethanol, lysine, and citric acid). High-fructose corn syrup (HFCS) is produced by the enzymatic isomerization of high-glucose syrup. Starch bioprocessing usually involves two steps, liquefaction and saccharification, both run at high temperatures. During liquefaction, starch granules are gelatinized in a jet cooker at 105 to 110°C for 5 min in aqueous solution (pH 5.8 to 6.5) and then partially hydrolyzed at α-1,4 linkages with a thermostable α-amylase at 95°C for 2 to 3 h. Temperature and pH controls are critical at this stage. If the gelatinization temperature drops below 105°C, incomplete starch gelatinization occurs, which causes filtration problems in the downstream process. If the gelatinization temperature increases much above 105°C, the α-amylases typically used (from Bacillus licheniformis and B. stearothermophilus) are inactivated. The enzymes are also inactivated at pHs below 5.5, and higher pH values cause by-product and color formation. After liquefaction, the pH is adjusted to 4.2 to 5.0 and the temperature is lowered to 55 to 60°C for the saccharification step. During saccharification (which runs for 24 to 72 h), the liquefied starch is converted into low-molecular-weight saccharides and ultimately into glucose or maltose. Glucose syrups (up to 95 to 96% glucose) are produced using pullulanase and glucoamylase in combination, while maltose syrups (up to 80 to 85% maltose) are produced using pullulanase and β-amylase (25, 69).
The natural pH of the starch slurry is approximately 4.5. Present starch-processing methods require adjusting the pH of the starch slurry to 5.8 or above for starch liquefaction and then reducing it to 4.2 to 4.5 for the saccharification step. These two pH adjustments increase chemical costs. They also create the need for ion-exchange refining of the final product to remove the added salts. An α-amylase able to work at lower pH would reduce these costs, simplify the process, and reduce the formation of high-pH by-product (e.g., maltulose) (69). The pullulanase, isoamylase, β-amylase, and glucoamylase used in industrial starch processing originate from mesophilic organisms and are only marginally stable at 60°C. There is a need today for thermostable pullulanases, β-amylases, and glucoamylases. α-Amylases which do not require added Ca2+ and which operate above 100°C at acid pH values are also targeted for improved processing. Increasing the saccharification process temperature would result in many benefits: (i) higher substrate concentrations, (ii) decreased viscosity and lower pumping costs, (iii) limited risks of bacterial contaminations, (iv) increased reaction rates and decrease of operation time, (v) lower costs of enzyme purification, and (vi) longer catalyst half-life, due to increased enzyme thermostability.
Able to grow at temperatures in the range 80 to 110°C, hyperthermophiles are great potential sources for α-amylases functioning in the same temperature range. Representatives of the most thermophilic α-amylases known are listed in Table 10 (see also reference 263). Their optimal activities range from 80 to 100°C at pH 4.0 to 7.5. An optimal catalyst for starch liquefaction should be optimally active at 100°C and pH 4.0 to 5.0 and should not require added Ca2+ for stability. None of the enzymes listed in Table 10 present these combined characteristics. Greater characterization of some of these enzymes is needed to determine if they are stable and retain significant activity at pH 4.0. With the recent development of powerful engineering tools (see above), we can expect that an α-amylase with these features will soon be available. Some of the α-amylases listed in Table 10 were initially believed to be independent of calcium. EDTA has no effect on P. furiosus α-amylase activity and stability at temperatures below 90°C. A 30-min EDTA treatment at 90°C followed by extensive dialysis removes part of the enzyme-linked Ca2+ and partially inactivates the enzyme. Full activity is restored by adding CaCl2 to the enzyme and heating the enzyme solution for 30 min at 90°C (A. Savchenko, C. Vieille, and J. G. Zeikus, unpublished). At least 80% identical in sequences to the P. furiosus α-amylase, the P. kodakaraensis and T. profundus enzymes are likely to be calcium dependent. Despite its Ca2+ dependency, P. furiosus α-amylase is highly stable and active at 100°C in the absence of added Ca2+ (Table 10), suggesting that starch liquefaction could soon be performed in the absence of Ca2+.
Due to the risk of unwanted side-reactions at alkaline pHs and to the length of the saccharification processes (48 to 72 h), thermophilic β-amylases will improve starch saccharification only if they are active at acidic pHs and only if they can reduce the saccharification time by increasing the reaction rate. Two sources of thermophilic β-amylases exist (Table 10). If intermediate temperatures (70 to 80°C) are required to limit the browning side reactions, T. thermosulfurigenes β-amylase is an option, since it is stable and 70% active at pH 4.0 (154). Optimally active at 95°C and pH 4.3 to 5.5 and in the absence of Ca2+, T. maritima β-amylase could be a good enzyme for testing the impact of high temperatures on the saccharification process. This enzyme, however, has not been cloned or characterized in detail. Production of maltose syrups using these β-amylases would still require a compatible debranching enzyme.
Hyperthermophiles typically hydrolyze starch via an α-amylase and/or an amylopullulanase (Table 10). Oligosaccharides are then degraded intracellularly by an α-glucosidase. Like β-amylases, glucoamylases are rare in thermophiles and hyperthermophiles. Glucoamylases have been purified from only a few anaerobes, and a putative glucoamylase gene has been identified in the M. jannaschii genome (21). Extensive work remains to determine if the putative M. jannaschii glucoamylase gene is functional and if its product has catalytic properties close to the properties required for starch saccharification. Stable in the presence of starch, the T. thermosaccharolyticum glucoamylase (Table 10) represents an alternative catalyst for the development of a starch saccharification process at 70 to 75°C.
Type I pullulanases (hydrolyzing only α-1,6-glucosidic linkages) are used as debranching enzymes in starch saccharification. Until recently, type I pullulanases were known only in mesophilic organisms and in thermophilic aerobic bacteria. A selection of thermophilic type I pullulanases and some of their properties are listed in Table 10. The Thermotoga maritima enzyme is the only one characterized in a hyperthermophile. These pullulanases are all optimally active at acidic pHs. Their potential for starch saccharification remains to be tested. These pullulanases seem to have temperature and pH requirements compatible with those of recently characterized thermophilic glucoamylase and β-amylases (Table 10).
Amylopullulanases (or type II pullulanases) show dual specificity for starch α-1,4- and α-1,6-glucosidic linkages (234). For this reason, they cannot be used as debranching enzymes in maltose and glucose syrup productions. Amylopullulanases have been suggested, however, as alternative enzymes to replace α-amylases during starch liquefaction for producing fermentation syrups. Since certain amylopullulanases specifically produce maltose, maltotriose, and maltotetraose (DP2 to DP4) as the major end products of starch degradation, they have been suggested as catalysts in a one-step liquefaction-saccharification process for the production of high-DP2-to-DP4 syrups (289). Because amylopullulanases purified from the hyperthermophiles P. furiosus, ES4, Thermococcus litoralis, and T. hydrothermalis are active at high temperatures (105 to 120°C) and at low pHs and because they are exceptionally thermostable, they are strong candidates for this process (Table 10). Their low activity levels on starch, however, represent a major limitation to their use in starch liquefaction. As an example, at 98°C the P. furiosus α-amylase is approximately 44 times more active on starch than the P. furiosus amylopullulanase (Table 10).
Cyclomaltodextrin glycosyltransferases (CGTases) convert oligodextrins into cyclodextrins (CDs). α-, β-, and γ-CDs are cyclic compounds composed of 6, 7, or 8 α-1,4-linked glucose molecules, respectively. The internal cavities of CDs are hydrophobic, and they can encapsulate hydrophobic molecules. This property makes CDs suitable for numerous applications in the food, cosmetic, and pharmaceutical industries, where they are used to capture undesirable tastes or odors, stabilize volatile compounds, increase a hydrophobic substance's water solubility, and protect a substance against unwanted modifications. CD production involves α-amylase-catalyzed starch liquefaction followed by CD formation using a mesophilic CGTase. A CGTase was recently characterized in a Thermococcus species (Table 10). This enzyme is highly stable at 100 to 105°C. Optimally active at 90 to 100°C under acidic conditions, it also shows high α-amylase activity. This Thermococcus CGTase could probably be used to develop a one-step CD production in which it would replace α-amylase for starch liquefaction.
Used in HFCS production, xylose isomerases (also called glucose isomerases) catalyze the reversible isomerization of glucose into fructose. Xylose isomerases represent the first large-scale industrial use of immobilized enzymes (72). The isomerization process is typically run in packed-bed reactors at 58 to 60°C for 1 to 4 h, and the converted syrup reaches 42% fructose. An additional strong acid cation-exchange chromatographic step further increases the fructose concentration to 55%, the concentration required by most of today's HFCS applications. The glucose-to-fructose equilibrium is shifted toward fructose at high temperatures: at 60 and 90°C, the fructose contents at equilibrium are 50.7% and 55.6%, respectively. Two major factors are responsible for the moderate temperatures used in this process: (i) the xylose isomerases currently used are only moderately stable at 60°C (due to enzyme inactivation, the reactors need to be repacked every 2 months), and (ii) unwanted side reactions (Maillard reactions) occur at high temperatures and alkaline pHs. For this last reason, HFCS producers are interested in a process that would take place at temperatures close to those of today's processes but at a lower pH. HFCS producers are also interested in using a more stable enzyme.
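The temperature dependence of the glucose-fructose equilibrium quoted above can be turned into a rough estimate of the reaction enthalpy with the integrated van 't Hoff equation. The sketch assumes a binary glucose/fructose mixture (so that K = fructose fraction / glucose fraction) and a temperature-independent ΔH between 60 and 90°C; the function names are hypothetical, and the result is only an order-of-magnitude illustration.

```python
import math

R = 8.314  # gas constant, J mol^-1 K^-1


def equilibrium_constant(fructose_fraction: float) -> float:
    """K = [fructose]/[glucose], assuming only these two sugars are present."""
    return fructose_fraction / (1.0 - fructose_fraction)


def vant_hoff_enthalpy(k1: float, t1_c: float, k2: float, t2_c: float) -> float:
    """Reaction enthalpy (J/mol) from two equilibrium constants, using
    ln(K2/K1) = -(dH/R) * (1/T2 - 1/T1) with dH assumed constant."""
    t1_k = t1_c + 273.15
    t2_k = t2_c + 273.15
    return R * math.log(k2 / k1) / (1.0 / t1_k - 1.0 / t2_k)


if __name__ == "__main__":
    # Equilibrium fructose contents quoted in the text.
    k_60 = equilibrium_constant(0.507)
    k_90 = equilibrium_constant(0.556)
    d_h = vant_hoff_enthalpy(k_60, 60.0, k_90, 90.0)
    print(f"K(60°C) = {k_60:.3f}, K(90°C) = {k_90:.3f}, dH = {d_h / 1000:.1f} kJ/mol")
```

The positive sign of the estimated ΔH is what one expects for an equilibrium that moves toward fructose as the temperature rises.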
Highly thermophilic and thermostable xylose isomerases have been characterized from Thermus thermophilus, Thermus aquaticus, Thermotoga maritima, and Thermotoga neapolitana (Table 10). Despite their optimal activity at elevated temperatures (95 to 100°C) and their attractive, high catalytic efficiency at 90°C, the T. maritima and T. neapolitana xylose isomerases are active only at neutral pH and are only marginally active (10 to 15%) between 60 and 70°C (350). Based on structural information, Meng et al. (245) have mutagenized the Thermoanaerobacterium thermosulfurigenes xylose isomerase and increased its catalytic efficiency on glucose (Table 11). Introducing the same substitutions in the T. neapolitana xylose isomerase also significantly increased this enzyme's catalytic efficiency on glucose (Table 11). The T. neapolitana xylose isomerase Val185Thr mutant derivative is now being used as the template to generate variant enzymes with increased activity at acidic pHs.
Cellulose is the most abundant and renewable nonfossil carbon source on Earth (67). Considerable effort has been spent to create an economically feasible ethanol production from cellulose, but without much success. Typically embedded in a network of hemicellulose and lignin, cellulose requires an alkaline pretreatment to become accessible to enzyme action. An enzymatic saccharification step makes cellulose and its degradation products suitable for ethanologenic yeast or bacterial fermentations. One of the main limitations to this three-step process is the low activity (and high cost) of the cellulases used. Since cellulose's alkaline pretreatment is performed at high temperatures, hyperthermophilic cellulases should be the best candidate catalysts for cellulose degradation. The production of cellulases by hyperthermophiles is rare, however. Only recently have endoglucanases and cellobiohydrolases been characterized in the Thermotogales (Table 12). Among the enzymes characterized, pairings of endoglucanase and cellobiohydrolase, optimally active either at 95 or at 105°C, represent interesting enzyme combinations to be tested in cellulose processing.
Industrial ethanol production is currently based on corn starch that is first liquefied and saccharified (see above). The oligosaccharide syrup is then used as a feedstock for ethanologenic yeast fermentation. The use of cellulases to increase the yields of starch liquefaction and saccharification has been described (211). Since starch liquefaction is performed at high temperatures, using thermophilic endoglucanases during this step is an option.
In the paper production process, pulping is the step during which wood fibers are broken apart and most of the lignin is removed. Pulping often corresponds to a chemical hot-alkali treatment of the wood fibers. The remaining lignin is removed by a multistep bleaching process. Performed with chlorine and/or chlorine dioxide at high temperatures, pulp bleaching generates high volumes of polluting wastes (333). The amount of chemicals used (and, therefore, the resulting pollution) can be reduced if the paper pulp is pretreated with hemicellulases. Since pulping and bleaching are both performed at high temperatures, the paper industry needs thermophilic hemicellulases, preferably those active above pH 6.5 or pH 7.0 (358). Hyperthermophilic hemicellulases have only been characterized in the Thermotogales (Table 12). These enzymes are active at pHs around 7.0. It is not known if they can withstand higher pHs. It is interesting that the Bacillus 3D endoxylanase is at least 100 times more active than the Thermotoga thermarum enzyme (Table 12). Although less thermophilic than the Thermotoga endoxylanases, the Bacillus enzyme is highly stable under alkaline conditions. Other hemicellulases (e.g., α-glucuronidase, β-mannanase, α-L-arabinofuranosidase, and galactosidases) have been shown to contribute to the enzymatic treatment of the pulp. Not many of these enzymes have been characterized from thermophiles and hyperthermophiles (Table 12).
Production of the dipeptide aspartame (L-aspartyl-L-phenylalanine methyl ester) by using thermolysin (68, 172) is the only chemical synthesis process that uses a thermophilic enzyme on an industrial scale. Other thermophilic and hyperthermophilic enzymes have been suggested as potential catalysts for a variety of synthetic processes. A number of thermophilic enzymes (including hydantoinase, cytochrome P450, secondary alcohol dehydrogenase, and various glycosyl hydrolases) show regioselective and/or stereoselective reaction mechanisms that are highly desirable for synthetic chemistry (see examples in Table 12). Active at elevated temperatures and highly resistant to solvent denaturation, hyperthermophilic proteases are also strong candidates for synthesis applications where the highly temperature-dependent solvent viscosity and substrate solubility affect the reaction rate.
The use of enzymes (including horseradish peroxidase, alkaline phosphatase, and glucose phosphate dehydrogenase) in immunoassays in the pharmaceutical and food industries is constantly increasing. Highly stable enzymes are desirable for these diagnostic applications only if they are active at moderate temperatures (i.e., under conditions compatible with the biological activity and stability of the other reagents involved in the assay). The thermostable alkaline phosphatase recently characterized from Thermotoga neapolitana is highly active at high temperatures (Table 12) but shows almost no activity at room temperature. This enzyme could become valuable if its activity at room temperature is engineered to levels comparable to currently available mesophilic alkaline phosphatases and if its stability can be retained.
Pectin is a branched heteropolysaccharide abundant in plant tissues. Its main chain is a partially methyl-esterified (1,4)-α-D-polygalacturonate chain. Along the main chain are rhamnopyranose residues, which are the binding sites for side chains composed of neutral sugars. There are two types of pectinolytic enzymes: methylesterases and depolymerases (hydrolases and lyases). These enzymes are widely used in the food industry. In fruit juice extraction and wine making, pectinolytic enzymes increase juice yield, reduce viscosity, and improve color extraction from fruit skin. A few thermophilic pectinolytic enzymes isolated from thermophilic anaerobes (Table 12) show catalytic and stability properties compatible with industrial needs.
Chitin (a linear β-1,4 homopolymer of N-acetylglucosamine) is also an abundant carbohydrate in the biosphere. Chitinases could be used for the utilization of chitin as a renewable resource and for the production of oligosaccharides as biologically active substances. Some chitooligosaccharides can be used in phagocyte activation or as growth inhibitors of certain tumors. A few thermophilic chitinases have been characterized. Their potential for an economically competitive chitin degradation process remains to be tested.
Animal feedstock production processes include heat treatments that inactivate potential viral and microbial contaminants. Using thermophilic enzymes (e.g., arabinofuranosidase and phytase) in feedstock production would enhance digestibility and nutrition of the feed while allowing the combination of heat treatment and feed transformation in a single step. Table 12 shows other examples of processes where it may be desirable to use a thermophilic enzyme (including keratinase and others).
Hyperthermophilic enzymes have become model systems to study enzyme evolution, enzyme stability and activity mechanisms, protein structure-function relationships, and biocatalysis under extreme conditions. These applications result from the discovery that molecular biology and biochemical studies such as protein purification and characterization are facilitated by the cloning and expression of genes from hyperthermophiles in mesophilic hosts. The great diversity of archaeal and bacterial hyperthermophiles represents a large pool of enzymes to choose from for developing new biotechnological applications.
The future of this field is fascinating and boundless. Three questions are particularly intriguing:
First, what is the upper temperature limit for enzyme activity and stability? An answer to this question may come from the discovery of new, natural hyperthermophilic enzymes that are active above 125°C. Another approach is to use hyperthermophilic enzymes whose substrates are stable at very high temperatures and to use genetic engineering tools to select for mutants with higher stability. This approach is being tested by directed evolution with the P. furiosus α-amylase gene (85) in F. Arnold's laboratory to determine the upper limits to α-amylase activity.
Second, numerous reports suggest that the stability and activity of thermophilic enzymes can be controlled by separate molecular determinants. Can hyperthermophilic enzymes be used as molecular templates to design highly stable enzymes that have high activity at low temperatures? Such an achievement could greatly enhance the range of applications for hyperthermophilic enzymes in areas including medicine, food, and research reagents. Our laboratory is currently using directed-evolution techniques to transform hyperthermophilic xylose isomerase and alkaline phosphatase into thermostable catalysts that are highly active at moderate temperatures.
Third, how do rigidity and flexibility relate to thermostability and activity, respectively? In other words, is rigidity at room temperature a requirement for enzyme thermostability, and is flexibility a requirement for activity?
As opposed to X-ray crystallography, which gives access only to a static, average enzyme structure, tools such as molecular dynamics, hydrogen exchange, and NMR allow the study of protein flexibility and thermostability and the identification of regions susceptible to unfolding. Coupled with SDM, these tools should help us explore many aspects of enzyme thermostability and activity. In particular, these tools will help answer the three questions mentioned above.
The ever-increasing number of fully sequenced genomes will be an invaluable help in deciphering which sequence variations among homologous proteins are related to stability and which ones are simply a result of evolution. Many algorithms used in computational methods are created using parameters calculated from known protein structures. Despite the many advances in computer algorithms, protein structure prediction remains among the most challenging tasks in computer modeling. While homologous folds (i.e., with the same ancestor) are better predicted than analogous folds (i.e., convergent evolution), totally unknown protein structures are orders of magnitude harder to predict. With protein thermostability often depending on only a small number of noncovalent interactions, even the best predictions of homologous folds might fall short of providing clues for stability. For this reason, using structural genomics to study protein thermostability will probably first answer questions such as the following: are the different protein folds populated to the same level among the hyperthermophilic and mesophilic enzymes? In other words, are there folds that are favored at high temperatures? In the foreseeable future, the comparison of individual protein thermostabilities will still heavily rely on crystallographic and NMR structural studies.
This research was supported by grants 94–34189–0067 and 9901423 from the U.S. Department of Agriculture and by grant NSF-BES-9529047–63143-Zeikus from the National Science Foundation.
We gratefully acknowledge Paweena Limjaroen for her assistance with the literature search and Dinlaka Sriprapundh for preparing Fig. 6. We also thank our many students and postdoctoral fellows who have worked on and contributed to our knowledge of thermophilic enzymes during the past years, including Dinlaka Sriprapundh, Maris Laivenieks, Doug Burdette, Guoqiang Dong, Chanyong Lee, Yong-Eok Lee, Saroj Mathupala, Ramesh Mathur, Meng-Hsiao Meng, Cindy Petersen, Badal Saha, Alexei Savchenko, Gwo-Jenn Shen, and Vladimir Tchernajenko. We express our deep gratitude to Christopher B. Jambor for his repeated encouragement and his expert editing. Any remaining mistakes are ours.