The post-translational modification of proteins is a well-known endogenous mechanism for regulating protein function and activity. Cellular proteins are also susceptible to post-translational modification by xenobiotic agents that possess, or whose metabolites possess, significant electrophilic character. Such non-physiological modifications to endogenous proteins are sometimes benign, but in other cases they are strongly associated with, and are presumed to cause, lethal cytotoxic consequences via necrosis and/or apoptosis. The Reactive Metabolite Target Protein Database (TPDB) is a searchable, freely web-accessible (http://tpdb.medchem.ku.edu:8080/protein_database/) resource that attempts to provide a comprehensive, up-to-date listing of known reactive metabolite target proteins. In this report we characterize the TPDB by reviewing briefly how the information it contains came to be known. We also compare its information to that provided by other types of “-omics” studies relevant to toxicology, and we illustrate how bioinformatic analysis of target proteins may help to elucidate mechanisms of cytotoxic responses to reactive metabolites.
The post-translational modification of proteins is a well-known endogenous mechanism for regulating protein function and activity (1, 2). Modifications such as disulfide bond formation, acylation, phosphorylation, glycosylation, metal ion binding and others can affect enzymatic activity, intracellular signaling via protein-protein interactions (PPIs), subcellular trafficking and localization, or even protein degradation. Cellular proteins are also susceptible to post-translational modification by xenobiotic agents that possess, or whose metabolites possess, significant electrophilic character (e.g., epoxides, Michael acceptors, aldehydes, acylating agents, benzylic sulfate esters, etc.). Such non-physiological modifications to endogenous proteins are sometimes benign, but in other cases they are strongly associated with (and presumed to cause) lethal cytotoxic consequences via necrosis and/or apoptosis. Understanding the mechanisms by which protein covalent binding leads to cytotoxicity has been a major concern in Toxicology for nearly 40 years.
During this time, as technology evolved, peak interest progressed from identifying the structures of reactive metabolites and the chemistry and enzymology of their formation, to elucidating the nature of the adducts formed with proteins and biological nucleophiles, to the identification of the target proteins modified by particular reactive metabolites. With information on the latter becoming more readily available, interest is now turning to the elucidation of mechanisms connecting protein covalent binding events to downstream cytotoxic events and cell death. Although our knowledge about reactive metabolite target proteins is not yet extensive, it has grown rapidly in the past several years. This rapid growth encouraged us to build the Reactive Metabolite Target Protein Database (TPDB), a searchable, freely web-accessible resource (http://tpdb.medchem.ku.edu:8080/protein_database/) that attempts to provide a comprehensive, up-to-date listing of reported target proteins. In this report we characterize the TPDB by reviewing briefly how the information it contains came to be known. We also compare its information to that provided by other types of “-omics” studies relevant to toxicology. Finally, we illustrate how bioinformatic analysis of target proteins may help to elucidate mechanisms of cytotoxic responses to reactive metabolites.
The origins of reactive metabolite toxicity go back to studies done by James and Elizabeth Miller at the University of Wisconsin in the 1940s. While investigating the metabolism of the carcinogenic azo dye 4-dimethylaminoazobenzene (DAB, also called “butter yellow” and formerly used as a food coloring!), the Millers observed that liver proteins of animals treated with DAB became permanently yellow colored (3, 4). Upon treatment with alkali these proteins released 3-methylthio-N-methyl-4-aminoazobenzene, suggesting that covalent binding involved modification of protein methionine residues by an “active form” of a mono-demethylated metabolite of DAB (5).
Some years later B. B. Brodie and his colleagues at NIH were studying the organ-selective toxicity of certain drugs and organic chemicals. They became interested in bromobenzene, which was first recognized to be hepatotoxic in 1935 (6), even though its in vivo conversion to a mercapturic acid had been described as early as 1879 (7, 8). Combining excellent chemical insights with broad knowledge of biology, Brodie took note of the fact that formation of the aryl-sulfur bond in p-bromophenyl mercapturic acid would require metabolic activation of the aromatic ring. He was also aware that small allergenic molecules such as penicillin or fluorodinitrobenzene first had to react chemically with proteins in order to provoke an immune response. Pulling these observations together, he then asked the key question: Could the protein binding of reactive intermediates formed during bromobenzene metabolism be connected to its cytotoxicity?
Brodie and co-workers supported this hypothesis by showing that one could manipulate the covalent binding and toxicity of bromobenzene in parallel by manipulating its metabolism and/or the availability of glutathione (the precursor to mercapturic acids). Since the landmark 1971 paper describing this work (9), and a series of follow-up papers relating acetaminophen hepatotoxicity to the covalent binding of its reactive metabolites (10, 11), a great many other works have supported this basic connection between chemical reactivity, covalent binding and cytotoxicity (12-14). Nowadays, it is fair to say that most researchers regard the existence of such correlations as prima facie evidence that protein covalent binding can cause cytotoxicity. On the other hand, it is equally clear that not all covalent binding leads automatically to cytotoxicity.
Once the idea that reactive metabolites could be cytotoxic took hold, much effort went into identifying their structures and describing their reactivity. Early ideas about reactive metabolite structures came from analyzing the structures of stable, isolable end-product metabolites. For example, Eric Boyland correctly (and against much skepticism) hypothesized arene oxide intermediates as precursors to mercapturic acids, phenols and dihydrodiol metabolites of aromatic compounds 20 years ahead of their first synthesis (15, 16). Such hypotheses were often reinforced by substituent and/or kinetic isotope effect studies showing that chemical manipulations altered the biology in a parallel fashion. As early examples, the observations that deuterochloroform (CDCl3) is significantly less toxic than chloroform (CHCl3) (17), and that “methylchloroform” (1,1,1-trichloroethane) is essentially non-toxic (18), both point to a role for C-H bond breaking in the bioactivation of chloroform. Today we understand that this occurs by enzymatic hydroxylation to the highly unstable intermediate HOCCl3, followed by non-enzymatic solvolysis to form HCl and phosgene (Cl2C=O), a highly reactive cross-linking reagent (19).
Eventually, chemical trapping of reactive metabolites, usually in vitro with nucleophiles such as glutathione, N-acetylcysteine, N-acetyllysine, semicarbazide or even cyanide ion, allowed the isolation of stable derivatives amenable to structure elucidation by NMR and mass spectrometry. Starting in the 1990s, success in this area was greatly aided by continuing improvements in NMR field strength (resolution) and sensitivity, and by the development of quadrupole mass filters and time-of-flight mass analyzers, techniques for soft ionization of delicate molecules including electrospray and matrix-assisted laser desorption ionization (MALDI), and the development of hybrid tandem mass spectrometers for MS/MS analysis of complex polar molecules including intact proteins. As noted above, chemical degradation of adducted proteins using acid, base, Raney nickel or protease enzymes often released small, chemically stable, adduct-derived molecules amenable to isolation and structure elucidation using ordinary organic chemical techniques (20, 21). Eventually, many reactive metabolites even yielded to chemical synthesis. Noteworthy examples include naphthalene-1,2-oxide (22, 23) and the quinonimine metabolite of acetaminophen NAPQI (24).
The chemical synthesis of reactive metabolites not only confirmed their existence and allowed the characterization of their behavior toward proteins and other nucleophiles, it also enabled the synthesis of immunogens and the preparation of anti-hapten antibodies. The latter proved useful for detecting adducted proteins in serum samples, tissue sections, ELISA plates and electrophoresis gels, and for immunoprecipitation of adducted target proteins and their interacting partners. This in turn enabled many qualitative and quantitative studies of dose-response, time-course and the anatomical distribution of covalent binding and cytotoxicity at the tissue and cellular level. Finally, chemical synthesis also allowed direct testing for toxicity when the metabolite was not so reactive that it could not be administered to a biological system without being destroyed by isomerization or hydrolysis en route to its targets.
Chemically speaking, reactive metabolites fall into two basic categories: free radicals (usually formed by reductive processes) and electrophiles (usually formed by oxidative processes). The electrophiles can be subdivided further into alkylating agents and acylating agents. The former group includes epoxides, benzylic sulfate esters, sulfur mustards derived from 1,2-dihaloalkanes and glutathione, and Michael acceptors including quinones, quinonimines and quinone methides. The latter group includes acyl halides derived by hydroxylation of polyhaloalkanes such as chloroform (vide supra) or halothane, thioacyl halides derived by action of cysteine conjugate beta-lyase on haloalkylcysteine conjugates (25, 26), acyl glucuronides (27), and iminosulfinic acids derived by S-oxidation of thioamides and thioureas (28, 29). Electrophiles vary in their selectivity for biological nucleophiles based in part on principles of soft-hard acid-base chemistry (30), but in general, the major protein side chains they target are cysteine, methionine, lysine, histidine and to a lesser extent glutamate and aspartate.
In contrast to electrophilic metabolites that form covalent protein adducts directly, free radical metabolites were suggested to damage proteins by initiating “oxidative stress” resulting, inter alia, in the formation of carbonyl groups detectable by dinitrophenylhydrazine or other carbonyl reagents. While proteins and nucleic acids can be attacked and degraded by free radicals, poly-unsaturated fatty acid side chains present in bilayer lipid membranes are particularly susceptible to free radical induced autoxidation in aerobic cells and tissues. Much of the protein carbonylation originally attributed to free radical oxidation is now understood to be due to the covalent modification of protein nucleophiles by electrophilic products of lipid peroxidation (e.g. Michael acceptors) (31, 32). The range of possibilities here is very large, and the details are only now starting to be elucidated.
Finally, it should be noted that although the classification of electrophilic metabolites as either alkylating or acylating agents may facilitate the understanding of their chemistry, such distinctions are probably meaningless with respect to understanding their biology, which almost certainly derives from the changes they inflict in the structures and/or properties of their protein targets, and the implications of those changes for protein–protein interactions.
As the small-molecule chemistry of reactive electrophilic metabolites was being elucidated, methods for protein separation and analysis were also improving greatly. The first protein adduct of a model reactive metabolite to be structurally identified was the hemoglobin adduct of ethylene oxide (used industrially to sterilize bulk solid materials) (33). This was a notable achievement in terms of analytical chemistry, and it became an important paradigm in industrial hygiene and molecular dosimetry for worker protection (34). Shortly thereafter Wendel and Cikryt (35) reported that incubating [14C]-acetaminophen with mouse liver homogenate resulted in the formation of protein-associated radioactivity that co-eluted with glutathione transferase activity. Although this target protein identification is relatively “soft” by today’s standards, it has been amply confirmed by later investigations (36). The rigorous identification of reactive metabolite target proteins began around 1985 with studies in the laboratory of Lance Pohl at the NIH. Typically, after treating an animal with a protoxin, adducted proteins in target tissues were isolated and purified chromatographically and identified by Edman sequencing. In this way the Pohl group first identified a microsomal esterase (37) and grp94 (38) as protein targets of halothane metabolites. Shortly thereafter the research groups of Steve Cohen (39) and Jack Hinson (40) independently reported a mysterious “selenium binding protein” as an acetaminophen target protein.
During the next five years the Pohl group and others identified approximately 20 more reactive metabolite target proteins by following this one-at-a-time approach, and by 1997, studies with six small-molecule protoxins (acetaminophen, bromobenzene, diclofenac, halothane, 1,1,2,2-tetrafluoroethylcysteine, and zomepirac) had identified a total of 28 reactive metabolite target proteins in the liver or kidneys of either rat, mouse or human (Table 1). These proteins fell broadly into three main categories: drug metabolizing enzymes, enzymes of intermediary metabolism, and chaperones or structural proteins. No “Achilles heel” target was apparent, nor were targets common to multiple chemicals observed; the broad range of proteins adducted did not immediately suggest a single mechanism of toxicity.
The year 1998 was a watershed in reactive metabolite proteomics. Qiu, Benet and Burlingame (41) introduced the use of 2D gel electrophoresis combined with peptide mass mapping and MS/MS sequencing for target protein identification, and they illustrated the power of this method by identifying 29 acetaminophen target proteins in mouse liver from a single 2D gel. In the subsequent decade this technique enabled several research groups, including our own (29, 42-47), to bring the total number of known target proteins to at least 268 as of May 2008.
An unexpected finding in these covalent binding studies was that some proteins would appear in multiple well-separated spots on the 2D gel, usually with the same apparent mass but differing pI values (41, 44, 45). This may be due to differences in post-translational modification such as phosphorylation, but the exact reason is unknown. Another unexpected finding was that adduct densities vary widely among different proteins; some low-abundance proteins become highly labeled while some abundant proteins acquire relatively little label. Thus, adduction is not a purely random process (29, 45, 46, 48). Average adduction levels are typically low, around 1 adduct per 10 molecules of protein. Distribution of such low levels of adduction across multiple nucleophilic sites in a protein molecule could be one reason that it has been very difficult to observe adducted peptides from proteins labeled in vivo (49). An exception to this occurs with thiobenzamide, the binding of which, at toxic doses, approaches 1 adduct per molecule of protein (29). This higher level of binding, coupled with use of deuterium labeling, made it possible to observe multiple labeled peptides from liver proteins of thiobenzamide treated rats, which in turn confirmed that some proteins have multiple nucleophilic sites susceptible to adduction (29). Nevertheless, it remains an important if difficult goal to observe (and, if possible, sequence) adducted peptides in digests of spots cut from gels. Only then can one avoid the ambiguity that the spot could be composed of a minor protein component that bears the radioactive or immuno-reactive adduct moiety plus a major protein component that (incorrectly) gets “identified” as the target.
As the outpouring of newly identified target proteins accelerated we found it convenient, and later necessary, to use a spreadsheet to keep track of them. In 2006 the spreadsheet evolved into a searchable, Oracle-based database that is now freely available online (36). In May 2008 the TPDB contained information on 268 distinct proteins (comprising 997 synonyms in the literature!) targeted by one or more of 23 small molecules or their metabolites. Our criteria for including a protein in the TPDB are that it be adducted in-life, by one or more metabolites of a single well-defined chemical substance, and that the protein be identified rigorously. Users can access the TPDB and conduct custom searches by chemical, target tissue, species, protein name, or combinations of these criteria. Since these search functions have been described previously (50) and are illustrated in the help function on the website, only one will be mentioned here: the Commonality Matrix function (Table 2). On its diagonal this matrix shows the number of target proteins known for any given chemical (listed alphabetically on the row and column headings). The off-diagonal elements give the number of protein targets in common between any two chemicals; clicking on any number provides a table listing all the relevant proteins along with references to the original literature and links to other databases such as SwissProt or the PDB. The number of zeros (blanks) among the off-diagonal elements, and the small size of many of the numbers on the diagonal, indicate that we are a long way from having an in-depth view of reactive metabolite target proteins, much less their mechanistic connection to reactive metabolite induced cytotoxicity. Nevertheless, signs of commonality are increasing as the TPDB grows.
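The structure of the Commonality Matrix can be illustrated with a minimal sketch. This is not the TPDB implementation; the chemical and protein names below are hypothetical placeholders, and the function name `commonality_matrix` is introduced here only for illustration. The sketch assumes each chemical is associated with a set of target proteins and counts the set intersections.

```python
# Hypothetical toy data: chemical -> set of reported target proteins.
# These names are placeholders, not entries taken from the TPDB.
targets = {
    "chemical_A": {"protein_1", "protein_2", "protein_3"},
    "chemical_B": {"protein_2", "protein_3", "protein_4"},
    "chemical_C": {"protein_5"},
}

def commonality_matrix(targets):
    """Return (chemicals, matrix), where matrix[i][j] is the number of
    target proteins shared by chemicals i and j. The diagonal therefore
    holds the number of targets known for each chemical."""
    chemicals = sorted(targets)  # alphabetical row and column headings
    matrix = [
        [len(targets[a] & targets[b]) for b in chemicals]
        for a in chemicals
    ]
    return chemicals, matrix

chemicals, matrix = commonality_matrix(targets)
for name, row in zip(chemicals, matrix):
    print(f"{name:12s}", row)
```

In this toy example, chemical_A and chemical_B share two targets, while chemical_C shares none with either, which corresponds to the zeros (blanks) among the off-diagonal elements described above.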
We and others (29, 42-44, 47) have attempted to make sense of lists of target proteins by arbitrarily grouping them into categories according to protein function, but as alluded to above, this does little to reveal a unifying mechanism of toxicity caused by reactive metabolites. Another way to analyze target proteins is to sort them into Gene Ontology categories (GO; www.geneontology.org) and determine whether any categories show an over-abundance of target proteins; this could perhaps indicate a functional connection between protein adduction and biological outcome. For example, in February 2008, when the number of proteins in the TPDB had reached 171, we analyzed the entire list this way. We found that 163 of the 171 proteins were represented (by their genes) in the Molecular Function category of the GO classification system. This category contains 8252 sub-categories of functions containing variable numbers of genes, some of which appear in several sub-categories proceeding down the hierarchy. For instance, 115 of the 163 target proteins sorted into the Catalytic Activity category, which contains 5085 genes overall. Although this result is highly significant statistically (p = 1.45E-19), the 115 targets represented only 2% of all proteins in the category, and the scope of this category is so broad and non-specific that this finding can at best hint at a mechanistic connection to toxicity. In contrast, 5 of the 163 target proteins sorted into the category Peroxiredoxin Activity, which contains only 6 members. This result too is highly significant (p = 9.43E-08), but the finding that 83% of the proteins in this category are targeted by reactive metabolites suggests that perturbation of this Molecular Function by protein adduction could perhaps contribute to the cytotoxicity associated with reactive metabolites.
Such findings can help form hypotheses about reactive metabolite cytotoxicity, but each will need extensive follow-up testing before it can be taken seriously as a significant contributor.
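The kind of over-representation test described above is commonly framed as a hypergeometric tail probability, and a minimal sketch under that assumption follows. The text does not state which test or which background gene set was used, so both the choice of test and the `BACKGROUND` value are assumptions made here for illustration; the printed p-values should not be expected to reproduce the published ones. The category counts (115 of 5085, 5 of 6) and the 163 annotated targets are taken from the text.

```python
from math import comb

def hypergeom_tail(k, n, K, N):
    """P(X >= k): chance of seeing at least k category members among n
    proteins drawn without replacement from N, of which K are in the category."""
    upper = min(n, K)
    hits = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return hits / comb(N, n)

# Counts quoted in the text: (targets in category, category size).
ANNOTATED_TARGETS = 163
categories = {
    "Catalytic Activity": (115, 5085),
    "Peroxiredoxin Activity": (5, 6),
}

# ASSUMPTION: the size of the annotated background is not stated in the
# text; 15000 is a hypothetical placeholder, so the p-values printed here
# are illustrative only.
BACKGROUND = 15000

coverage = {}
for name, (k, K) in categories.items():
    coverage[name] = k / K  # fraction of the category that is targeted
    p = hypergeom_tail(k, ANNOTATED_TARGETS, K, BACKGROUND)
    print(f"{name}: coverage {coverage[name]:.1%}, p = {p:.3g}")
```

Reporting the coverage fraction next to the p-value reflects the point made in the text: both categories are statistically over-represented, but only the small, specific category is largely covered by targets.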
Gene chip microarray and related technology that evolved in the 1990s has enabled the measurement of changes in the expression of thousands of genes in response to treatment of animals with drugs, chemicals and protoxicants. The impetus for these studies was largely two-fold: 1) to use the data to categorize new chemical entities vis-a-vis known toxins of various kinds in order to accelerate discovery toxicology and eliminate potential bad actors from drug discovery pipelines as early as possible, and 2) to use the data to gain insights into cellular mechanisms of chemical toxicity. The results of a number of large scale experiments of this kind have been published (51-56), and the emerging consensus is that expression changes in a moderate number of well-selected genes (hundreds rather than thousands) can indeed be used to categorize chemicals as similar or dissimilar to various well-known paradigm hepatotoxicants. An excellent example of this can be seen in the work of Steiner et al. (54). On the other hand, toxicogenomic data has not yet yielded much insight into mechanisms of toxicity (51, 52, 55-57).
Compared to the effort expended in toxicogenomics, studies of changes in the relative abundances of cellular proteins by quantitative proteomic methods such as differential gel electrophoresis (by scanning or with differential dye labeling) (55, 56, 58) or isotope coded affinity tagging (ICAT) (59, 60) have been fewer in number, often less extensive in scope, and correspondingly less informative concerning mechanisms of toxicity. Even in the few cases in which both transcriptomic and proteomic changes were measured in the same study, the changes were modest and there was surprisingly little overlap between the responding genes, the responding proteins, and the target proteins in the TPDB. This situation could change as more data is gathered, but in the meantime, it may be more useful to look at the existing data in new ways, as discussed in the following section.
In cells, proteins interact extensively with other proteins. For example, the Human Protein Reference Database (www.hprd.org) lists over 9,000 proteins that participate in over 36,500 PPIs, some of which serve important signaling functions. Since examination of target proteins does not immediately suggest plausible direct mechanisms for reactive metabolite cytotoxicity, and since relatively few of the 268 proteins in the TPDB are common targets for multiple cytotoxic metabolites, we decided to examine the interacting partners of target proteins for clues to downstream events important to cytotoxicity. This work is just beginning, but it is already yielding encouraging or at least interesting results (manuscript in preparation). From the TPDB we selected 28 proteins that were common targets of multiple reactive metabolites. A BLAST search indicated that their human orthologs were highly similar, and 21 of the 28 were represented in the HPRD. Cytoscape software (61) indicated that these 21 target proteins had 165 interacting partners, and among the total set of 186 proteins there were 538 individual interactions. We then sorted the 186 proteins into GO categories and KEGG pathways to link them to biological functions. Quite remarkably, we found highly significant over-representation of our proteins in the GO categories related to apoptosis, protein folding and response to unfolded proteins, and in KEGG pathways related to MAP kinase signaling, and antigen processing and presentation.
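The bookkeeping behind such a network (targets, their direct interacting partners, and the interactions among the combined set) can be sketched as follows. This is not the Cytoscape workflow itself; the protein labels and the interaction list are hypothetical toy data, not HPRD records, and counting partner-partner interactions within the combined set is an assumption about how the total was tallied.

```python
# Hypothetical toy interaction list (undirected pairs); not HPRD data.
interactions = [
    ("T1", "P1"), ("T1", "P2"), ("T2", "P2"), ("T2", "P3"),
    ("P1", "P2"), ("P3", "X1"), ("X1", "X2"),
]
targets = {"T1", "T2"}  # placeholder target proteins

def first_neighbor_network(targets, interactions):
    """Collect the direct interacting partners of the targets, then the
    interactions whose two endpoints both lie in targets | partners."""
    partners = set()
    for a, b in interactions:
        if a in targets and b not in targets:
            partners.add(b)
        elif b in targets and a not in targets:
            partners.add(a)
    nodes = targets | partners
    # frozenset makes each undirected pair count once; self-loops are skipped.
    edges = {
        frozenset((a, b))
        for a, b in interactions
        if a in nodes and b in nodes and a != b
    }
    return partners, nodes, edges

partners, nodes, edges = first_neighbor_network(targets, interactions)
print(len(targets), "targets,", len(partners), "partners,",
      len(nodes), "proteins,", len(edges), "interactions")
```

In the toy data, proteins X1 and X2 are excluded because neither interacts directly with a target, which mirrors restricting the analysis to the targets and their immediate partners.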
Independent experimental evidence also strongly supports a role for MAP kinase signaling in apoptosis and other pathological cellular responses to toxic chemicals (62-66). Inhibitors of these pathways are even being sought as therapies for conditions such as inflammation, arthritis, and neurodegenerative and vascular disease (67, 68). The unfolded protein response (UPR) is an evolutionarily conserved mechanism that is activated by misfolded or abnormal proteins in the ER (69-71). Such proteins are usually created by exposure of cells to physical stresses such as heat- or osmotic shock, UV radiation, or expression of folding-deficient mutant proteins. Cells respond with increased synthesis of stress-response proteins (chaperones, Hsps, protein disulfide isomerases), decreased synthesis of most other proteins, and increased ER-associated protein degradation. When the cell loses its ability to cope, cytotoxicity sets in. We believe that exposure of cells to metabolically-generated lipophilic electrophiles constitutes another form of protein-damaging stress, mediated and propagated through alterations in protein properties and interactions. Additional discussion of cell signaling and the unfolded protein response in relation to cytotoxicity can be found in reference (29).
Our knowledge about reactive metabolite target proteins has increased enormously in just the past three years, yet we are still a long way from having what could even be called a “pretty good” picture of the scope of the situation, much less the details. Many fundamental questions remain to be answered about the proteins themselves, as well as the connection between their modification and the triggering of cytotoxic responses. For example, why do some proteins appear in multiple spots on 2D gels? There must be more to this than just differences in phosphorylation or deamidation. Does covalent adduction necessarily imply modification of function? Here we must consider that “function” extends beyond just enzymatic activity into the realm of PPIs and intracellular signaling. How much modification of a given protein is required before the cellular situation changes from normal to abnormal? In most cases we have no quantitative information about target protein adduction, such as number of adducts per molecule of protein, or fraction of copy number modified. Finally, thus far there is little apparent commonality among the target proteins hit by a range of different chemical toxins. As we uncover more targets of more reactive metabolites, will such commonality eventually become apparent and perhaps point to a common pathway for cytotoxic responses? If not, can such commonality be found at the level of interacting partner proteins? The encouraging thing is that these questions are all eminently answerable, and some of the answers will move the field forward.
Support for this research was provided by NIH grants GM-21784 (to RPH) and RR016475 (subaward to JF; J. Hunt, PI).