Arylamine N-acetyltransferase 1 (NAT1) and 2 (NAT2) exhibit single nucleotide polymorphisms (SNPs) in human populations that modify drug and carcinogen metabolism. This paper updates the identity, location, and functional effects of these SNPs and then follows with emerging concepts for understanding why pharmacogenetic findings may not be replicated consistently. Using this paradigm as an example, laboratory-based mechanistic analyses can reveal complexities such that genetic polymorphisms become biologically and medically relevant when confounding factors are more fully understood and considered. As medical care moves to a more personalized approach, the implications of these confounding factors will be important in understanding the complexities of personalized medicine.
The N-acetylation polymorphism was first identified in patients administered isoniazid for the treatment of tuberculosis. Subsequently, it was discovered that N-acetyltransferase (E.C. 2.3.1.5) also catalyzes the N-acetylation of a diverse array of aromatic amine drugs and carcinogens. Two arylamine N-acetyltransferase isozymes, NAT1 and NAT2, have been identified in humans, and comprehensive reviews have been reported recently [3–10]. Single nucleotide polymorphisms (SNPs) in the NAT1 and NAT2 genes have been identified and tested for their functional effects. This paper updates the identity, location, and functional effects of these SNPs and then follows with emerging concepts which can serve as a paradigm for understanding why pharmacogenetic findings may not be replicated consistently.
NAT1 and NAT2 are 290 amino acid products of intronless 870 base pair open reading frames located within 170 kb on chromosome 8p22 [3–10]. A pseudogene (NATP) resides between the NAT1 and NAT2 coding exons. NAT1 and NAT2 open reading frames are 87% identical and the NAT1 and NAT2 proteins differ only in 55 amino acids . In the pharmacogenetic literature, combinations of SNPs are identified as alleles or haplotypes. A consensus N-acetyltransferase gene nomenclature has been established  with NAT1*4 and NAT2*4 designated as the reference alleles for NAT1 and NAT2, respectively. SNPs are identified by designating “A” of the ATG translation initiation codon as number 1. SNPs upstream of the site are designated by negative numbers and SNPs downstream of this site are designated by positive numbers. NAT1*4 and NAT2*4 are the most common NAT1 and NAT2 alleles reported in some but not all ethnic groups and thus it is not appropriate to designate them as “wild-type” . An international nomenclature committee publishes an internet website for updates in NAT1 and NAT2 alleles at http://N-acetyltransferasenomenclature.louisville.edu .
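The numbering convention described above can be illustrated with a minimal sketch. The function name, the example sequence, and the choice to skip position 0 for upstream bases are assumptions made for illustration; they are not taken from the nomenclature committee's materials.

```python
def snp_position(index: int, atg_index: int) -> int:
    """Convert a 0-based sequence index to the SNP numbering described above.

    The "A" of the ATG translation initiation codon is position 1 and
    downstream bases count upward from it.
    Assumption: upstream bases are numbered -1, -2, ... with no position 0.
    """
    offset = index - atg_index
    return offset + 1 if offset >= 0 else offset


# Hypothetical sequence: five upstream bases followed by the start codon.
sequence = "GGCTTATGGAC"
atg_index = sequence.index("ATG")
positions = [snp_position(i, atg_index) for i in range(len(sequence))]
```

Under these assumptions, the base immediately upstream of the "A" is numbered -1 and the "G" of the ATG is numbered 3.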
As reviewed recently [10,13], NAT1 and NAT2 differ in transcriptional regulation. For example, NAT1*10, which possesses SNPs solely in the 3′-UTR, was associated with slightly elevated NAT1 activity levels in human bladder and colon tissues [14,15] and increased N-acetylation capacity in vivo. One of the SNPs present in the 3′-UTR of NAT1*10 (1088T>A) alters a consensus polyadenylation signal (identified as polyA-1), consistent with the suggestion that increased activity may relate to a change in polyadenylation and subsequent enhanced mRNA stability. However, several studies have failed to replicate the increased catalytic activity associated with NAT1*10 [17–19], and most published NAT1 expressed sequence tags possessing poly A tails reflect a polyadenylation site (polyA-2) located downstream of 1088T>A. Further research on SNPs in the 3′-UTR is needed to better understand their functional effects, which may be tissue-specific. Additional non-genetic factors such as substrate-dependent inhibition, drug interactions, and cellular redox conditions also can modify N-acetyltransferase activities. These factors may modify SNP effects and the relationships between inherited genotype and phenotype, particularly for NAT1, where functional effects for SNPs in the 3′-UTR are poorly understood.
Knowledge of the functional effects of coding region SNPs has been enhanced substantially by molecular modeling and crystallization of N-acetyltransferase enzymes, initially a prokaryotic N-acetyltransferase  followed more recently by human NAT1 and NAT2 proteins . Both NAT1 and NAT2 possess a functional Cys-His-Asp catalytic triad, which resembles that of cysteine proteases . In a ping-pong bi-bi reaction mechanism, the catalytic Cys68 is acetylated by acetyl-coenzyme A (AcCoA), followed by acetyl group transfer to the substrate . Docking experiments have revealed important insights into isozyme-specific substrate specificity . The NAT1 catalytic pocket is about 40% smaller than that of NAT2  and substrate selectivity is strongly influenced by the three key active site loop residues F125, Y127, and R129 [22–25]. The functional effects of SNPs in the NAT1 and NAT2 coding regions have been determined primarily by characterization of recombinant NAT1 and NAT2 allozymes. Our understanding of these experimental observations has been enriched by studies of molecular structure and modeling. They provide explanations for the relative capacity of NAT1 and NAT2 to catalyze not only N-acetylation, but also AcCoA-dependent O-acetylation of N-hydroxyarylamines and AcCoA-independent N,O-acetylation of N-hydroxy-N-acetylarylamines; reactions also catalyzed by NAT1 and NAT2 [10,26]. The effects of selected SNPs on structure and function are outlined below.
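The ping-pong bi-bi mechanism mentioned above is commonly described by a two-substrate steady-state rate law. The sketch below uses that textbook form; it is not an equation or parameter set reported in this paper, and the function name, argument names, and numerical values are hypothetical.

```python
def ping_pong_rate(vmax, km_accoa, km_substrate, accoa, substrate):
    """Initial rate for a ping-pong bi-bi reaction (textbook form).

    v = Vmax*[A]*[B] / (Km_A*[B] + Km_B*[A] + [A]*[B])
    where A is AcCoA (acetyl donor) and B is the arylamine acceptor.
    All concentrations and Km values must share the same units.
    """
    denominator = (km_accoa * substrate
                   + km_substrate * accoa
                   + accoa * substrate)
    if denominator == 0:
        return 0.0  # no donor and no acceptor present
    return vmax * accoa * substrate / denominator


# Hypothetical values: doubling the AcCoA Km lowers the rate
# when AcCoA is not saturating.
reference_rate = ping_pong_rate(10.0, 1.0, 1.0, 1.0, 1.0)
higher_km_rate = ping_pong_rate(10.0, 2.0, 1.0, 1.0, 1.0)
```

In this form, a variant that raises the apparent Km for AcCoA without changing Vmax would show reduced activity mainly at subsaturating AcCoA concentrations.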
At least 15 SNPs in the NAT1 coding region have been identified in human populations (Table 1). The sites for many of these on the NAT1 protein are illustrated in Figure 1. Analysis of the structural features and functional effects of many of these NAT1 SNPs was reported recently and is summarized below. Data in support of their effects on NAT1 protein expression, Vmax, and substrate Km following recombinant expression of NAT1 variants in mammalian cells are shown in Figure 2.
The hydrogen bonding interactions between positively charged R64 and negatively charged E38 are salt bridges which are stronger than normal hydrogen bonds. Replacing arginine with either glutamine or tryptophan at residue 64 reduces these interactions in turn altering conformation and dynamics of the first domain tertiary structure. Structural changes in this region could affect the positioning of the catalytic triad C68, and thereby alter its interactions with catalytic H107, and/or alter its acetylation status leading to increased proteasomal degradation . Some hydrogen bonding interactions may be preserved with W64, but tryptophan is largely hydrophobic and does not have the hydrogen bonding capacity of arginine. The native interactions formed by R64 are likely important to the integrity of the tertiary structure. Functional studies of the R64W variant in yeast demonstrated reduced catalytic activity, reduced protein levels, and reduced thermostability [31,32]. Variant NAT1 proteins are more susceptible to proteasomal degradation due to altered acetyl-CoA binding capability, and reduced thermostability . A loss of critical hydrogen bonding interactions in the R64W variant is consistent with reduced protein thermostability. A loss of structural integrity could lead to protein aggregation  or enhanced protein degradation.
V149I and S214A exist together as a result of SNPs in the NAT1*11 allele. Reports of the functional effects of these two SNPs have been inconsistent. Recombinant expression of the NAT1*11 coding region in bacteria and yeast [31,32] caused no change in catalytic activity, protein levels, or protein stability. Following recombinant expression in COS-1 cells, the NAT1*11 haplotype possessing both V149I and S214A resulted in elevated protein levels and catalytic activity in one study but not another. Red blood cells and leukocytes from individuals possessing the NAT1*11 allele showed equivalent and lower catalytic activities, respectively [35,36].
The side-chain of V149 is located on the surface of the second domain beta-barrel. Hydrophobic interactions are possible with P144 and I164, but because these residues are on adjacent beta strands, they should not contribute to secondary structure stability. Substitution of isoleucine for valine at 149 is conservative and thus inconsistent with structural changes. When recombinantly expressed in bacterial cells, the V149I variant resulted in acetylation rates of up to 2-fold higher without changes in protein level or thermostability .
The side-chain of residue S214 is located in the inter-domain region near the active site pocket and has no apparent molecular interactions with surrounding residues. Replacing serine with a smaller alanine residue should not modify active site access nor affect structure since the side-chain is near the surface and does not interact with other residues. Analysis of the NAT2 structure complexed with CoA demonstrates that Threonine 214 is involved in hydrogen bonding to CoA. Since the NAT1 S214 also may interact with CoA, substituting alanine for serine could affect this interaction. Since S214 is located adjacent to the active site and may be involved in AcCoA binding, S214A could play a role in the increased activity of the NAT1 protein via a mechanism that increases the steady state level of acetylated NAT1 enzyme.
The side-chain of R187 is located in the second domain beta barrel and is partially exposed both to the protein surface and the active site pocket. The arginine side-chain forms hydrogen bonds with the side chain of E182 and backbone of K188 in the second domain, and with the side-chain of C-terminal residue T289 in the third domain. These interactions help shape the active site pocket and stabilize the conformations of both the C-terminal tail and the second domain loop (amino acids 165–185). Changing this residue to glutamine may result in partial loss of these interactions, although the smaller glutamine residue may be capable of maintaining some of these hydrogen bonds. Loss of the R187 interactions could lead to a change in the dynamics or conformations of the C-terminal tail and/or the second domain loop resulting in destabilization of NAT1 structure. Changes in C-terminal tail conformation also may influence AcCoA binding and C68 acetylation in the active site. Since the active site is largely shaped by its interactions with both the C-terminal tail and the second domain loop, changes in the conformation and/or dynamics of either of these structures could influence substrate selectivity and catalytic activity.
Functional studies of the R187Q variant expressed in yeast demonstrated reduced catalytic activity, protein levels, and thermostability [31,32]. Reduced Vmax and increased substrate Km were observed following recombinant expression in bacteria  and mammalian cells . These studies suggest that R187Q destabilizes the NAT1 structure and influences substrate binding by altering the size and shape of the active site. R187Q did not change AcCoA Km, indicating that altered interactions with the C-terminal tail do not significantly influence AcCoA binding .
The side-chain of M205 is located in the interdomain region adjacent to the active site entrance and the second domain beta barrel. The side chain has no apparent interactions with surrounding residues, but is in close proximity to the backbone and side-chain of I106. Replacing the methionine with a valine maintains the hydrophobicity of residue 205, and should not introduce steric clashes with surrounding residues. Therefore, it is not expected to influence the position of catalytic triad residue H107. Since no interactions are lost and no clashes induced by replacing hydrophobic methionine with a smaller hydrophobic valine, no changes in protein stability or function are expected. Functional studies of the M205V variant in yeast [31,32] and mammalian cells  demonstrated no change in enzyme expression, catalytic activity, or thermostability. These findings are consistent with no alterations of important molecular interactions in the M205V variant.
The side-chain of D251 is located on the third domain beta sheet and its side-chain is oriented into the protein core. The D251 side-chain forms hydrogen bonds with R166 on the second domain loop, with R242 of the third domain beta sheet, and with the backbone of N245 in the third domain. Although the interactions with R242 and N245 are not necessary for maintaining the stability of the beta sheet, hydrogen bonding with R242 may provide additional stability to loop-stabilizing hydrogen bonding interactions. The hydrogen bond between D251 and R166 influences the conformation of the second domain loop, providing support for its interaction with the backbone of V146. These interactions between the second domain loop and the third domain beta sheet contribute to protein stability. Replacing aspartate with a hydrophobic valine residue should reduce all of the D251 interactions, which could affect the dynamics and conformation of the second domain loop, thereby altering protein stability. Functional characterization of the D251V variant in yeast [31,32] and mammalian cells [29,30] demonstrated reduced protein stability, levels and catalytic activity.
The side-chain of E261 is located on the protein surface in the third domain helix. The E261 side-chain forms hydrogen bonds with the side chain of S259 in the coil between the third domain beta sheet and helix. It is unlikely that this interaction is required for stability of the helix or the third domain tertiary structure. Although hydrogen bonding with S259 may be lost, replacing glutamate with lysine at residue 261 should not cause major changes in dynamics or conformation, because the helix secondary structure is not dependent on this interaction and the third domain helix is stabilized by hydrophobic forces. Functional studies of the E261K variant in yeast [31,32] and mammalian [29,30] cells demonstrated no reduction in protein levels, catalytic activity, or thermostability. These data are consistent with observations that residue 261 is located on the protein surface, with a single interaction that makes no apparent contribution to the dynamics or structural conformations in that region.
I263 is located on the third domain helix. Its side-chain is part of a hydrophobic core located at the interface between the third domain helix and beta sheet, and the C-terminal tail coil. Substitution of the smaller valine for isoleucine at position 263 preserves the hydrophobic interactions without introducing steric clashes. Because the hydrophobic forces are not disturbed, this substitution should not affect NAT1 structural dynamics or conformation. Functional studies of the I263V variant in yeast [31,32] and mammalian cells showed no change in protein level or catalytic activity. The yeast cell studies also demonstrated no reduction in protein thermostability [31,32].
Deduction of NAT1 phenotypes has remained problematic because the relationship between NAT1 genotype and phenotype is poorly understood and, in contrast with NAT2 (discussed later), is more strongly influenced by factors other than coding region SNPs. This also follows from the fact that SNPs in the NAT1 coding region, in contrast to those in the NAT2 coding region, are relatively uncommon. We presently recommend that individuals homozygous or heterozygous for “slow” acetylator alleles be identified as slow acetylator phenotypes. This includes individuals who possess 97C>T, 190C>T, 559C>T, 560G>A, and 752A>T and therefore allele/haplotypes NAT1*14A,B; NAT1*17, NAT1*19, and NAT1*22. NAT1*10 is a fairly common allele with SNPs in the 3′-UTR but none in the coding region. The functional effects of these SNPs, as described earlier, are not well understood. NAT1*10 may be designated by some investigators as a “rapid” acetylator phenotype because of the data that support this designation [14–16]. NAT1*11 is also sometimes designated as a “rapid” [29,37] or “slow” allele. Because of the ambiguity involving the NAT1*10 and NAT1*11 phenotypes, we recommend designating them as “at risk” alleles or haplotypes until their phenotypes are clarified.
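The recommendation above can be written as a small lookup, shown here as a sketch rather than a validated genotyping rule. The function name, the returned labels, and the precedence (a slow allele takes priority over NAT1*10 or NAT1*11 in the same individual) are assumptions made for illustration.

```python
# Alleles named in the text as "slow" acetylator alleles.
SLOW_NAT1_ALLELES = {"NAT1*14A", "NAT1*14B", "NAT1*17", "NAT1*19", "NAT1*22"}

# Alleles whose phenotype the text describes as ambiguous.
AT_RISK_NAT1_ALLELES = {"NAT1*10", "NAT1*11"}


def deduce_nat1_phenotype(allele_1: str, allele_2: str) -> str:
    """Deduce a NAT1 phenotype label from two alleles.

    Assumption: one slow allele (heterozygous or homozygous) is enough
    to assign "slow", and it takes precedence over an "at risk" allele.
    """
    alleles = {allele_1, allele_2}
    if alleles & SLOW_NAT1_ALLELES:
        return "slow"
    if alleles & AT_RISK_NAT1_ALLELES:
        return "at risk"
    return "not assigned"


heterozygous_slow = deduce_nat1_phenotype("NAT1*4", "NAT1*14B")
ambiguous = deduce_nat1_phenotype("NAT1*4", "NAT1*10")
```

Alleles outside both sets, including the reference allele NAT1*4, are left as "not assigned" because the text does not give them a phenotype label.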
Over 25 SNPs in the NAT2 coding region have been identified in human populations (Table 2). The locations for many of these on the NAT2 protein are illustrated in Figure 1. A brief description and possible explanation for the functional effects follows below, supported with data illustrating their effects on NAT2 protein expression, Vmax, and substrate Km (Figure 3). Analysis of the structural features and functional effects of many of these NAT2 SNPs was reported recently and is summarized below.
The side-chain of R64 bonds twice to E38 and twice to N41 in a stretch of the first domain coil (V35-G51) that is mostly absent of secondary structure. As for NAT1, the hydrogen bonding interactions between positively charged R64 and negatively charged E38 are salt bridges, and are stronger than normal hydrogen bonds. Substitution of arginine with either glutamine or tryptophan at residue 64 causes loss of these interactions which contribute to the conformation and dynamics of the first domain tertiary structure. Functional studies of the R64Q variant in bacterial [42,43], yeast , and mammalian cells  demonstrated reduced activity and protein levels due to reduced protein stability. These data suggest that the R64 interactions are necessary for structural stability. Since glutamine is less bulky than arginine, the loss of enzyme stability cannot be attributed to steric clashes. Functional studies of the R64W variant in yeast demonstrated reduced protein activity and protein levels due to reduced protein stability . A loss of critical hydrogen bond and salt bridge interactions in the R64W variant is consistent with reduced protein thermostability.
The I114 side chain shares hydrophobic interactions with residues L21 and L24 in the first domain, and F84 and V112 in the second domain. Changing this hydrophobic residue to a polar hydrophilic threonine residue may alter hydrophobic interactions at the interface between the second domain beta barrel and the first domain helix. However, because of the peripheral location of I114, and the surrounding protein structure that is highly organized into secondary and tertiary structures, it is unlikely that a reduction of hydrophobic forces in this region will result in major structural changes. I114T causes large reductions in catalytic activity when recombinantly expressed in bacteria [42,43] and yeast . Recombinant expression of the I114T variant in mammalian cells did not result in changes in protein stability or apparent kinetic parameters, but led to a reduction of active enzyme possibly due to enhanced protein degradation . These functional data are consistent with a structural change that increases protein aggregation and/or targeting for degradation without altering the protein’s stability.
D122 is part of the catalytic triad and completely buried in the protein. It forms hydrogen bonds to the side chains of N72, H107, S125, Y190, and the backbone of G124. These multiple hydrogen bonds likely contribute to the stability of the active site loop conformation and the interaction between the first and second domains. Because D122 is a catalytic triad residue, any change will adversely affect the function of the catalytic triad . Recombinant expression of the D122N variant in mammalian cells resulted in undetectable levels of catalytic activity with reduced protein probably due to protein degradation pathways . Disruption of the catalytic triad likely also affects enzyme acetylation, thereby increasing proteasomal degradation .
The side-chain of L137 is oriented toward the interior of the second domain beta-barrel. L137 shares hydrophobic interactions with I120, L152, W159, F192, and L194. The L137F substitution is likely to affect the beta barrel structure due to steric clashes that result from replacing leucine with a larger phenylalanine residue. It is also possible that aromatic interactions between the F137 and W159 side chains alter the folding of the second domain. Functional studies of the L137F variant in mammalian cells demonstrated reduction in protein levels with no change in protein stability, possibly the result of proteasomal degradation . These data are consistent with a change in secondary structure enhancing protein degradation.
The side-chain of Q145 forms hydrogen bonds with the backbone of W132, Q133 and Q145 on an adjacent strand in the second domain. Q145P is likely to result in disruption of secondary structure due to the loss of stabilizing hydrogen bonding interactions and the introduction of a rigid proline residue . Since W132 and Q133 are part of the coil that becomes the active site loop, altering their backbone interactions with residue 145 may affect the conformation of the active site loop and thereby alter enzymatic activity and/or substrate selectivity. Functional studies of the Q145P variant in yeast demonstrated reduced or undetectable catalytic activity due to reduced protein levels, although the protein stability was not affected . This reduction in protein is probably due to enhanced protein degradation that is triggered by these structural changes.
E167 is part of the unstructured “loop” that plays a role in stabilizing mammalian N-acetyltransferases . The side-chain of E167 forms weak hydrogen bonds with K185 which is in an adjacent strand of the second domain loop. Removing this interaction may affect the dynamics or conformation of the loop, which largely lacks defined secondary structure and is therefore more susceptible to dynamic and conformational changes. Functional studies of the E167K variant in mammalian cells demonstrated a reduction in catalytic activity due to reduced protein levels, with no reductions in protein thermostability . It is possible that the E167K variant has small structural changes that cause protein aggregation and/or increased degradation.
The R197 side-chain is located near the protein surface of the second domain near the inter-domain helix. Electrostatic interactions are likely between R197 side-chain and E195, and with the lone pairs of electrons on the M105 side-chain sulfur. Replacement of the positively charged arginine with a neutral glutamine residue results in loss of these electrostatic interactions. Steric forces or van der Waals forces are also involved in the close interactions of E195 with R197. Functional studies of the R197Q variant in bacteria, yeast, and mammalian cells demonstrated reduced NAT2 activity and protein levels due to reduced protein thermostability [41–44]. These results are consistent with loss of the relatively weak electrostatic interactions of R197 with E195 and M105.
The side-chain of residue 268 is located on the protein surface of the third domain alpha helix and has no interactions with surrounding residues or symmetry related crystal neighbors. Replacing K268 with arginine is not expected to affect the alpha helical structure. Functional studies of the K268R variant recombinantly expressed in bacterial [42,43], yeast [44,50], and mammalian cells showed no changes in protein expression, protein stability, or catalytic activity.
G286 is located on the C-terminal tail in the third domain directly adjacent to the active site and does not directly interact with other residues. Replacement of glycine with a much larger glutamate at residue 286 could significantly alter the conformation of the C-terminal tail adjacent to the active site opening due to steric clashes with nearby residues and loss of the highly flexible glycine residue. Since the C-terminal tail has an important role in defining the size and shape of the active site cavity, the G286E variant protein is likely to have altered active site access and altered substrate selectivity. Such a significant change to a C-terminal residue adjacent to the active site is also likely to affect AcCoA binding and C68 acetylation. The NAT2 crystal structure (PDB ID# 2PFR) has CoA bound in its active site with hydrogen bonding between CoA and S287. The C-terminal tail conformational change that may accompany the G286E variant would likely influence this interaction between S287 and CoA.
Differences between NAT1 and NAT2 in the size of the active site , and in substrate selectivities and/or catalytic activities  are likely influenced by the difference in bulkiness of the smaller NAT2 G286 residue compared to the larger NAT1 R286 residue side chain, which is oriented toward the active site opening. The addition of a bulky glutamate side-chain at residue 286 in the G286E variant should alter substrate selectivity and/or catalytic activity.
Functional studies of the G286E variant in mammalian cells demonstrated reduced activity for some substrates but not for others, and reduced protein and thermostability . The observation that the G286E residue change could alter the size and shape of the active site pocket is consistent with the substrate-dependent activity changes observed experimentally . In addition, the finding that the G286E alters the Km for AcCoA  is consistent with a conformational change in the C-terminal tail which may alter interactions with AcCoA. It is possible that reduced protein acetylation contributes to the overall reduction of protein levels through proteasomal degradation .
Following recombinant expression in yeast, 190C>T (R64W), 191G>A (R64Q), 341T>C (I114T), 434A>C (Q145P) and 590G>A (R197Q) reduce NAT2 catalytic activities, whereas 111T>C (F37F), 282C>T (Y94Y), 481C>T (L161L), 759C>T (V253V), and 803A>G (K268R) do not [43,44]. The effects of 845A>C (K282T) and 857G>A (G286E) differ with substrate, as they reduce catalytic N- and O-acetyltransferase activities towards some substrates and not others. The reduction in NAT2 protein appeared to be related to instability of the protein for 190C>T (R64W), 191G>A (R64Q), 590G>A (R197Q) and 857G>A (G286E), whereas 341T>C (I114T), 411A>T (L137F), and 499G>A (E167K) did not appear to increase NAT2 instability [44,45]. Following recombinant expression in mammalian cells, the SNPs which reduced NAT2 activities each did so by reductions in expression of recombinant NAT2 protein but not recombinant NAT2 mRNA. 857G>A (G286E) changed apparent Km, reducing it towards SMZ and increasing it towards AcCoA.
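The SNP and residue pairs listed above follow from the numbering convention in which the "A" of the ATG codon is nucleotide 1: nucleotides 1 to 3 form codon 1, nucleotides 4 to 6 form codon 2, and so on. The following sketch performs that arithmetic; the function name and the simple SNP string format it accepts are assumptions made for illustration.

```python
import re

# Matches coding-region SNP labels of the form "190C>T".
SNP_PATTERN = re.compile(r"^(\d+)([ACGT])>([ACGT])$")


def codon_number(snp: str) -> int:
    """Return the codon (amino acid residue) number affected by a coding SNP.

    Position 1 is the "A" of the ATG start codon, so the codon number is
    the nucleotide position divided by three, rounded up.
    """
    match = SNP_PATTERN.match(snp)
    if match is None:
        raise ValueError(f"unrecognized SNP label: {snp!r}")
    position = int(match.group(1))
    return (position + 2) // 3  # integer form of ceil(position / 3)


codons = {snp: codon_number(snp)
          for snp in ("190C>T", "191G>A", "341T>C", "590G>A", "857G>A")}
```

For example, both 190C>T and 191G>A fall in codon 64, which is why they produce two different substitutions (R64W and R64Q) at the same residue.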
As previously reviewed, NAT2 phenotypes are deduced based on co-dominant expression of rapid and slow acetylator NAT2 alleles or haplotypes. Individuals homozygous for rapid NAT2 acetylator alleles are deduced as rapid acetylators, individuals homozygous for slow acetylator NAT2 alleles are deduced as slow acetylators, and individuals possessing one rapid and one slow NAT2 allele are deduced as intermediate acetylators.
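The co-dominant rule above amounts to counting rapid alleles. A minimal sketch follows; it takes each allele's rapid or slow classification as input rather than an allele name, because the assignment of individual NAT2 alleles to each class is not listed here. The function name and string labels are assumptions made for illustration.

```python
def deduce_nat2_phenotype(allele_1_class: str, allele_2_class: str) -> str:
    """Deduce the NAT2 acetylator phenotype from two allele classes.

    Each argument is "rapid" or "slow". Two rapid alleles give a rapid
    acetylator, two slow alleles give a slow acetylator, and one of
    each gives an intermediate acetylator.
    """
    classes = (allele_1_class, allele_2_class)
    for allele_class in classes:
        if allele_class not in ("rapid", "slow"):
            raise ValueError(f"unknown allele class: {allele_class!r}")
    rapid_count = classes.count("rapid")
    return {2: "rapid", 1: "intermediate", 0: "slow"}[rapid_count]


phenotype = deduce_nat2_phenotype("rapid", "slow")
```

The order of the two alleles does not matter, which reflects the co-dominant model described in the text.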
Associations between acetylator polymorphism and drug toxicity or disease frequency are often inconsistent and not replicated. A confounding issue is the poor understanding of the many factors, in addition to SNPs in the coding region, which modify N-acetyltransferase capacity. Some examples are discussed below:
Although much has been learned regarding the identity, location, and functional effects of SNPs in NAT1 and NAT2, additional research is needed. Emerging concepts regarding the functional effects of SNPs in NAT1 and NAT2 suggest that pharmacogenetic effects that are not replicated may ultimately be confirmed when all factors affecting the phenotype are better understood.
An increased understanding of pharmacogenetic principles is leading to advances in personalized drug treatment (higher efficacy and lower toxicity) together with more individualized risk assessments to disease and/or toxicities associated with carcinogen exposures. The latter falls within the discipline of “molecular epidemiology” and emerging concepts suggest that confounding variables can result in lack of replication across studies leading to conclusions that genetic polymorphisms are not sufficiently biologically or medically relevant to merit attention. Using arylamine N-acetyltransferase as a paradigm, laboratory-based mechanistic analyses are revealing additional complexities leading to the alternative conclusion that genetic polymorphisms are biologically and medically relevant when the confounding factors are more fully understood and considered. As medical care moves to a more personalized approach, the implications of these confounding factors will be very important in understanding the complexities of personalized medicine.
The author acknowledges the essential contributions of all investigators in the arylamine N-acetyltransferase field, particularly those with whom I have worked and who generated data presented in this paper. Also acknowledged are the relevant research grants from the National Cancer Institute (R01-CA034627), the National Institute of Environmental Health Sciences (P30-ES014443) and the National Institute of Child Health and Development (U10-HD045934).
Job Titles: Peter K. Knoefel Endowed Chair of Pharmacology, Distinguished University Scholar, Professor and Chairman of the Department of Pharmacology and Toxicology, Leader, Cancer Prevention and Control Program, James Graham Brown Cancer Center, and Special Assistant to the Provost for Strategic Planning, University of Louisville.