The results of our study illustrate the great power and utility of the public genome databases and database search programs. Moreover, they provide important novel insights into the molecular structure and evolution of the pART gene family.
Our results differ in some details from those of a recent report by Ame and coworkers [11]. These discrepancies can be explained by errors in the draft sequence of the human genome available at the time of the previous report. For example, the database entry AK023746 given by Ame et al. for PARP-5c evidently represents a truncated cDNA for pART6 (alias tankyrase 2 or PARP-5b). This entry contains two point mutations and a 65 bp deletion in the 3' UTR relative to the cDNA and genomic sequences of pART6. BLAST analyses of the high-quality sequence of the human genome and of the EST database with the AK023746 sequence provide no evidence for a distinct copy of this gene in the human genome. We conclude that the PARP-5c gene identified by Ame et al. represents an allelic variant or a cloning/sequencing error rather than a genuine pART gene family member, i.e. that the total number of human pART genes is 17 rather than the 18 suggested in the previous report. Large discrepancies also exist in the number of amino acids assigned in the two reports for pART7/PARP-15 (444 vs. 989) and for pART16/PARP-8 (854 vs. 501). The earlier database entries for PARP-8 (XM_018395) and PARP-15 (XM_093336) have since been removed as a result of standard genome annotation processing because these entries evidently contained frameshift mutations and/or fused cDNA sequences that led to erroneous amino acid assignments. Similarly, the small differences in assignments for five other PARPs/pARTs can be accounted for by differences in the draft vs. high-quality sequence of the human genome (Ame et al./our study): pART2/PARP2 (583/570), pART3/PARP3 (540/533), pART10/PARP10 (1020/1025), and pART14/PARP7 (657/680).
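The size of these discrepancies can be tabulated directly from the amino acid counts quoted above. The following is a minimal sketch in Python, not part of the original analysis; the names `LENGTHS`, `LARGE_CUTOFF` and `length_gaps` are hypothetical, and the cutoff of 100 residues separating "large" from "small" differences is an assumption chosen only for illustration.

```python
# Amino acid counts quoted in the text, as (name, first value, second value).
# For the first two entries the text does not state which report gave which
# value; for the last four the order is Ame et al. / our study.
LENGTHS = [
    ("pART7/PARP-15", 444, 989),
    ("pART16/PARP-8", 854, 501),
    ("pART2/PARP2", 583, 570),
    ("pART3/PARP3", 540, 533),
    ("pART10/PARP10", 1020, 1025),
    ("pART14/PARP7", 657, 680),
]

# Hypothetical threshold (assumption): differences of at least this many
# residues are labelled "large".
LARGE_CUTOFF = 100


def length_gaps(entries):
    """Return the absolute difference in assigned length for each protein."""
    return {name: abs(a - b) for name, a, b in entries}


gaps = length_gaps(LENGTHS)
large = sorted(name for name, gap in gaps.items() if gap >= LARGE_CUTOFF)

for name, gap in gaps.items():
    label = "large" if gap >= LARGE_CUTOFF else "small"
    print(f"{name}: {gap} residues ({label})")
```

Under this assumed cutoff, only the two entries attributed to frameshifts or fused cDNA sequences are labelled large; the remaining differences are each below 25 residues.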
We assigned the 17 human pARTs to five distinct subgroups (Fig. ). This assignment is supported by several independent lines of evidence. Firstly, members of a particular subgroup show higher amino acid sequence identities to one another than to members of other subgroups (Fig. ). This is reflected in the tiling paths of PSI-BLAST searches, where members of the same subgroup were detected in the first iteration, whereas members of other subgroups generally were detected in later iterations (Fig. ). Secondly, members of a particular subgroup typically share one or more associated domains not found in members of other subgroups (Fig. ); pARTs 8, 10 and 15 are exceptions to this rule. Thirdly, members of a particular subgroup typically share one or more intron positions not found in members of other subgroups (Fig. ); pARTs 1–4 are notable exceptions to this rule. Fourthly, when genes of two or more pARTs are physically linked in a cluster on the same chromosome, they belong to the same subgroup, possibly reflecting regional duplications (Fig. ). Finally, the results of all phylogenetic analyses converged on topologies with clearly distinct clades for each of the subgroups (Fig. ). Members of subgroups 1 and 2 evidently are more closely related to one another than to other subgroups (Figs. and ). Similarly, members of subgroups 3 and 4 are sister groups, indicating a close relationship.
Members of the pART family are found fused to a striking variety of associated domains (Fig. ). It is reasonable to hypothesize that the associated domains direct the respective pARTs to subcellular structures and/or target proteins. Genetic fusion of group 1 and group 2 pARTs with DNA-binding domains is in line with their established roles in DNA repair, chromosome remodeling, and mitotic spindle formation [9]. Moreover, the SAM and ankyrin domains of pARTs 5 and 6 have been shown to mediate interactions with target proteins in telomere-associated protein complexes [45]. Similarly, the C-terminal domain of pART4 evidently plays a role in targeting pART4 to the major vault particles [46]. The presence of several domains implicated in the ubiquitination pathway points to a possible connection between ubiquitination and ADP-ribosylation. Indeed, it has recently been reported that ADP-ribosylation of TRF1 by tankyrase (pART5) results in the release of the protein from telomeres and its subsequent ubiquitination [47]. Strikingly, pARTs from the microfungi G. zea and A. nidulans provide examples of the genetic fusion of two enzyme domains catalyzing these post-translational protein modifications into a single polypeptide.
So far, only a single example of a 'naked' pART catalytic domain akin to the isolated catalytic domain of the vertebrate ecto-ARTs 1–5 [27] was recovered from the public database. This putative pART from Chilo iridescent virus clusters with the mammalian pARTs of subgroup 1 (Fig. ), suggesting that this large double-stranded DNA virus [48] may have acquired its pART by horizontal gene transfer.
The definition of the pART catalytic domain proposed in this paper is somewhat smaller than that commonly used in the field [11]. We used the position of the common phase 0 intron upstream of the first conserved β sheet to set the N-terminal end of the catalytic domain (e.g. see Figs. and ). The pARTs of subgroup 1 are extended N-terminally of this position by an alpha helical domain (Fig. ) which is often included as part of the PARP-1 catalytic domain. However, since other pART family members lack this region, we propose to omit it from the proper pART catalytic domain. Moreover, this N-terminal delineation of the catalytic domain corresponds well to the N-terminus of the 'naked' pART of Chilo iridescent virus as well as to those of Diphtheria toxin and Pseudomonas exotoxin A after proteolytic processing of the signal sequence or translocation domain (Fig. ).
With the exception of pART4, the group 1 pARTs are extended upstream of this helical region by another domain named after its conserved motif of tryptophan (W), glycine (G), and arginine (R) residues. This WGR domain is also found in poly-A-polymerases; its function is unknown. Many group 1 pARTs from distantly related organisms, e.g. plants, insects, nematodes, and microfungi, also contain these two domains. Interestingly, in Drosophila melanogaster pARTa these three domains (WGR, helical, catalytic) are encoded by a single, large exon (Fig. ). Human pARTs 5–17 lack the WGR and helical domains. However, pART5/6 (tankyrase)-like pARTs from C. elegans (Ce.pARTc) and D. discoideum (Dd.pARTb) contain the WGR and helical domains, whereas a SAM domain is found at this position in human pARTs 5 and 6 (Fig. ).
A puzzling finding is the lack of conservation of the classic H-Y-E motif found in the catalytic cores of PARP-1, PARP-2, Diphtheria toxin and Pseudomonas exotoxin A (Fig. ). This motif is conserved only in members of subgroups 1 and 2. All other human pARTs carry notable variations from this motif. In particular, all other pARTs carry a replacement of the glutamic acid residue in β 5, i.e. the residue that was shown to be critical for the catalytic activities of DT, PARP-1 and many other pARTs and mARTs [6]. In six cases, this glutamic acid is replaced by an isoleucine residue, in two cases by leucine, and in one case each by threonine, valine, or tyrosine. Enzyme activity has been reported recently for two of the six pARTs that carry an H-Y-I motif instead of the H-Y-E motif (pARTs 10 and 14) [32]. Thus, it is quite possible that the four other pARTs carrying the H-Y-I motif (pARTs 11, 12, 16, and 17) will also turn out to be active enzymes. Mouse pART8 also carries an H-Y-I motif, whereas its human orthologue, like pART7, carries an H-Y-L variant motif. H-Y-I and H-Y-L variant motifs are also found in pARTs from the slime mold (Dd.pARTg) and amoeba (Eh.pARTf) (Fig. ). Human pART15 carries an H-Y-Y variant motif, which is conserved in its orthologues from mouse and the malaria mosquito (Fig. ). It will be interesting to determine whether and how site-directed mutagenesis of the H-Y-E motif in pARTs 1–6 to the variant motifs of pARTs 7–17, and vice versa, affects their enzyme activities. Moreover, it remains to be determined whether the most striking variation of the H-Y-E motif, to Q-Y-T in human and mouse pART9, is compatible with enzyme activity.
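The tally of replacements at the glutamic acid position can be checked against the individual assignments given in the text. The sketch below is illustrative only: the names `THIRD_RESIDUE`, `REPORTED_ACTIVE` and `untested_isoleucine` are hypothetical, and the valine assigned to human pART13 is an assumption inferred by elimination, since the text names the replacement residue for every other member of pARTs 7–17 but not for pART13.

```python
from collections import Counter

# Residue found at the position of the H-Y-E glutamic acid in human pARTs 7-17,
# using one-letter amino acid codes, as stated in the text.
THIRD_RESIDUE = {
    7: "L",   # H-Y-L
    8: "L",   # H-Y-L (human; mouse pART8 carries H-Y-I)
    9: "T",   # Q-Y-T
    10: "I",  # H-Y-I
    11: "I",  # H-Y-I
    12: "I",  # H-Y-I
    13: "V",  # ASSUMPTION: inferred by elimination, not stated in the text
    14: "I",  # H-Y-I
    15: "Y",  # H-Y-Y
    16: "I",  # H-Y-I
    17: "I",  # H-Y-I
}

# pARTs with an H-Y-I motif for which enzyme activity has been reported.
REPORTED_ACTIVE = {10, 14}

counts = Counter(THIRD_RESIDUE.values())

# H-Y-I carriers whose enzyme activity has not yet been reported.
untested_isoleucine = sorted(
    part
    for part, residue in THIRD_RESIDUE.items()
    if residue == "I" and part not in REPORTED_ACTIVE
)

print(dict(counts))
print(untested_isoleucine)
```

With these assignments the counts reproduce the distribution stated above (six isoleucine, two leucine, and one each of threonine, valine, and tyrosine), and the untested H-Y-I carriers are pARTs 11, 12, 16, and 17.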
The results of our PSI-BLAST and PSIPRED analyses (Figs. , , and additional files 3) support the conclusions that the pART gene family described here and the mART gene family described in our previous study [27] constitute two distinct ART subfamilies, and further, that the family of tRNA:NAD 2'-phosphotransferases [24] constitutes a branch that is more closely related to the pART subfamily than to the mART subfamily. Our results illuminate the power and limits of PSI-BLAST searches: PSI-BLAST readily connected members of the pART subfamily in many different species, while DT, ETA and TpTs were found at or below the threshold. In contrast, PSI-BLAST searches never connected pART family members with members of the mART subfamily or vice versa. The results of PSI-BLAST searches thus are in accord with insights gained from the known 3D structures of representative ADP-ribosyltransferases (Fig. ), i.e. that certain conserved structural features clearly distinguish these two subfamilies. Is it possible that some of the pART gene family members described here actually possess mono-ADP-ribosyltransferase rather than poly-ADP-ribosyltransferase activity? Given the structural similarity to DT/ETA, this is a possibility. Moreover, it cannot be excluded that some family members may have lost enzyme activity and have acquired a new function. In any case, the respective proteins clearly are more closely related to the pART than to the mART gene family, in line with the nomenclature proposed here. Have all ARTs encoded in the human genome been identified? A number of ADP-ribosylation reactions have been described in mammalian cells that cannot yet be accounted for by the ARTs identified in this study or our previous study, e.g. mono-ADP-ribosylation of actin, rho, glutamate dehydrogenase, and of the alpha and beta subunits of heterotrimeric G proteins [3]. Given that the pART subfamily described here and the mART subfamily described in our previous study [27] could not be interconnected by PSI-BLAST, it remains an intriguing possibility that other ART subfamilies in the human genome still await identification.