We began this study with a bioinformatics survey of candidate transcriptional regulators of NAD metabolism in bacterial species that lack orthologs of the known regulators NadR and NiaR. We considered conserved chromosomal clustering with known NAD biogenesis genes as primary evidence implicating such candidates (32). Candidate genes were further prioritized by domain structure analysis for the presence of putative DNA-binding motifs, as well as by additional evidence of functional coupling, such as occurrence profiles and the presence of shared regulatory sites (8), since metabolic transcriptional regulators are often autoregulated.
Reconstruction of NAD metabolic and regulatory networks in ~400 bacterial species with completely sequenced genomes was performed using a subsystems-based approach (10) implemented in the SEED genomic platform (see ‘NAD regulation’ subsystem at http://theseed.uchicago.edu/FIG/subsys.cgi). Genome context analysis revealed a strong tendency of genes involved in NAD biogenesis (including the regulatory gene niaR) to form conserved operon-like clusters (Supplementary Data, Table S1). Among other genes with similar chromosomal clustering patterns, a family of Nudix hydrolase homologs was of particular interest. Members of this family were considered primary candidates for the NAD regulatory role and annotated as possible NrtR based on the following key observations:
- presence of a C-terminal domain with winged HTH-fold characteristic of many prokaryotic transcription factors (34);
- homology of the NrtR N-terminal domain to members of the Nudix hydrolase family related to NAD metabolism, such as ADPR pyrophosphatase involved in NAD recycling;
- lack of conservation within the Nudix hydrolase signature sequence suggesting that many NrtR members could have lost a catalytic activity, while potentially retaining an ability to specifically bind relevant metabolites (e.g. ADPR).
Chromosomal co-localization of nrtR genes
In most cases (42 out of 62 analyzed), nrtR genes are positionally linked to various NAD metabolism genes (Supplementary Table S1). The tendency of nrtR genes to cluster on the chromosome with NAD biosynthesis genes is illustrated in Figure 2A. For instance, nrtR genes occur in clusters with nadABC in Actinobacteria; nadD-nadE and pncB in Cyanobacteria; prs-nadV in γ-proteobacteria; nadR-pnuC in Streptococcus; and niaP in Bifidobacterium.
Figure 2. Genomic organization of nrtR-containing loci involved in NAD metabolism (A), pentose utilization (B) and other pathways (C). Genes encoding the predicted NrtR regulator are shown by red arrows; the color code and the abbreviations for other genes correspond ...
In Streptomyces species, the nrtR gene occurs next to a gene coding for a putative ADP-ribosyl-glycohydrolase (draG), which catalyzes the removal of the ADPR group covalently linked to target proteins. Protein ADP-ribosylation has been shown to be an indispensable regulatory mechanism in Streptomyces, even though its role remains to be elucidated.
Additional types of conserved chromosomal clusters were observed, pointing to possible functional coupling of NrtR with other metabolic pathways. For example, nine cases of nrtR chromosomal clustering with genes involved in the utilization of pentoses (l-arabinose and d-xylose) were detected in some γ-proteobacteria and in Bacteroidetes. Notably, the pentose utilization pathways are connected with NAD metabolism via the shared intermediate ribose 5-phosphate (Rib-P). Glycohydrolytic degradation of NAD generates ADPR, which is further converted to Rib-P by Nudix hydrolases. Rib-P can be recycled to generate NAD via PRPP formation. Rib-P is also produced by the pentose utilization routes after they merge into the pentose phosphate pathway. The result is a compact subnetwork that combines pentose utilization with NAD degradation and recycling.
Phylogenetic distribution and domain composition of NrtR proteins
We constructed a maximum likelihood phylogenetic tree for 62 NrtR protein family representatives selected from diverse bacterial species (Figure 3). The distribution of NrtR orthologs largely coincides with known taxonomic groups, with several exceptions. For example, two proteins from the γ-proteobacteria Pseudoalteromonas atlantica and Saccharophagus degradans cluster with the Bacteroidetes group, most likely as a result of lateral gene transfer. NrtR proteins from Actinobacteria and Firmicutes are split on the tree into two separate groups, which may reflect actual functional divergence, e.g. in the set of co-regulated genes or in the consensus DNA-binding motif (see next section). Moreover, some of these species contain two NrtR paralogs. For instance, the two NrtR paralogs in Streptomyces species (groups 2a and 2b) likely resulted from a recent duplication event. In another actinobacterial species, Kineococcus radiotolerans, the situation is different: the two NrtR paralogs are located on the most divergent branches of the tree (groups 1a and 2b).
Figure 3. Maximum likelihood phylogenetic tree and DNA recognition motifs for the NrtR family of transcriptional regulators. NrtR proteins recognizing the same DNA motif are grouped (the group names are given), and the corresponding motif sequence logos are shown ...
A common feature of the NrtR family is the invariant presence of the N-terminal Nudix domain (PF00293 or COG1051) fused with a characteristic C-terminal domain (PB002540), which is similar to the C-terminal part of proteins from the uncharacterized COG4111 family. However, COG4111 and NrtR proteins have extremely divergent N-terminal domains (see Discussion section for further details on COG4111). A schematic representation of the NrtR domain arrangement, compared with that of some known members of the Nudix hydrolase family (where the Nudix domain is combined with other domains) and of the COG4111 protein family, is shown in Supplementary Fig. S1. A multiple alignment of selected NrtR proteins, including two proteins with known 3D structure, is provided in Supplementary Fig. S2. The Nudix domain is typical of a family of hydrolases found in nearly all known species in all three domains of life. Typical substrates of Nudix hydrolases are nucleoside diphosphates with a large variation of residues (x) attached to the phosphate moiety (hence the name, nudix). These enzymes hydrolyze a pyrophosphate bond in a wide range of organic pyrophosphates, including nucleoside di- and triphosphates, dinucleoside polyphosphates and nucleotide sugars (such as ADPR), with varying degrees of substrate specificity. The number of Nudix-like genes in prokaryotic genomes varies significantly, reaching up to 30 copies, depending on the metabolic complexity and adaptability of the species [reviewed in (36)]. Among the various Nudix hydrolases, those with an established substrate preference for ADPR (i.e. ADPR pyrophosphatases, catalyzing ADPR hydrolysis to AMP and Rib-P) show the most significant similarity with the Nudix domain of NrtR proteins.
A detailed analysis of conserved sequence motifs within the NrtR family indicated that, in contrast to functional Nudix hydrolases, many members of this family could have lost catalytic activity. This conjecture is based on the apparent lack of conservation within the signature sequence GX5EX7REUXEEXGU (where U is a hydrophobic residue and X is any residue), which is strictly conserved in all active Nudix hydrolases identified so far (37). In most members of the NrtR family, one or more of the conserved signature residues are replaced in a more or less random fashion (Supplementary Fig. S2). Previously published biochemical characterization of two divergent members of the NrtR family from cyanobacteria, NuhA from Synechococcus sp. PCC 7002 and Slr1690 from Synechocystis sp., provided additional insights for the interpretation of functional motifs in this family. Whereas the NuhA protein, which has an intact Nudix signature, displayed high ADPR pyrophosphatase activity (38), the hydrolytic activity of its homolog Slr1690, which has a severely perturbed signature, was barely detectable (kcat ~ 1.4 × 10⁻⁴). Restoring the canonical Nudix signature by directed mutagenesis of the slr1690 gene led to a 600-fold increase in the catalytic rate (39). These facts suggest that the presence of an intact Nudix signature in NrtR proteins correlates with their catalytic activity.
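The signature check described above can be sketched in a few lines of Python. This is only an illustration, not the procedure used in the study: the function names are hypothetical, the set of residues accepted for U is an assumption (the text says only "hydrophobic"), and the two test sequences are invented.

```python
import re

# Assumption: residues accepted at the "U" (hydrophobic) positions.
HYDROPHOBIC = "AVILMFWY"

# G X5 E X7 R E U X E E X G U, written as a regular expression (23 residues).
NUDIX_SIGNATURE = re.compile(
    rf"G.{{5}}E.{{7}}RE[{HYDROPHOBIC}].EE.G[{HYDROPHOBIC}]"
)

# Conserved positions within a 23-residue window (0-based) and allowed residues.
CONSERVED = {
    0: "G", 6: "E", 14: "R", 15: "E", 16: HYDROPHOBIC,
    18: "E", 19: "E", 21: "G", 22: HYDROPHOBIC,
}


def has_nudix_signature(protein: str) -> bool:
    """True if the protein contains an intact Nudix signature."""
    return NUDIX_SIGNATURE.search(protein) is not None


def count_deviations(window: str) -> int:
    """Number of conserved signature positions that deviate in a 23-residue window."""
    if len(window) != 23:
        raise ValueError("window must be exactly 23 residues long")
    return sum(window[i] not in allowed for i, allowed in CONSERVED.items())


# Invented toy windows, not real NrtR sequences.
intact_window = "G" + "A" * 5 + "E" + "A" * 7 + "REIAEEAGL"
perturbed_window = "G" + "A" * 5 + "E" + "A" * 7 + "RQIAEKAGL"

print(has_nudix_signature("MK" + intact_window + "KK"))
print(count_deviations(perturbed_window))
```

Counting deviations in an aligned window, rather than only testing for an exact match, gives a rough measure of how strongly a given family member departs from the canonical signature.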
Structural analysis of NrtR proteins
A comparative structural analysis of the C-terminal domains of NrtR proteins provided the key evidence for their role in the regulation of transcription. This analysis was enabled by the availability of 3D structures determined at the Midwest Center for Structural Genomics (http://www.mcsg.anl.gov/) for two NrtR family members, from Bacteroides thetaiotaomicron (BT0354, PDB code 2FB1) and Enterococcus faecalis (EF2700, PDB code 2FML) (Figure 4). Importantly, while similarity searches for the NrtR C-terminal domain based solely on sequence comparison were rather inconclusive, a structure-based search with the SSM server (24) revealed a substantial similarity with winged helix-turn-helix (wHTH) domains. The three-stranded wHTH fold (α1-β1-α2-α3-β2-β3) of the C-terminal domain (Supplementary Fig. S2) is typical of DNA-binding domains present in many families of prokaryotic transcription factors (34).
Figure 4. Crystal structures and ligand-binding sites of E. faecalis EF2700 (A) and B. thetaiotaomicron BT0354 (B). Structures of EF2700 (PDB accession number 2FML) and BT0354 (2FB1) were solved at the Midwest Center for Structural Genomics. The C-terminal wHTH ...
The 3D structures of BT0354 and EF2700 show that both NrtR proteins form dimers with clear domain swapping. The Nudix domain of both proteins is very similar to the ADPR pyrophosphatase domain of the bifunctional enzyme NadM from Synechocystis sp., whose structure has recently been solved in complex with ADPR (Huang, N. and Zhang, H., unpublished data). In particular, root mean square deviations of 1.4 Å and 1.5 Å were calculated for 115 and 108 superimposed Cα atoms between NadM and EF2700 and between NadM and BT0354, respectively. The residues shown to be directly involved in ADPR binding in the NadM structure are well conserved in EF2700 (Figure 4A). Most of the residues interacting with ADPR are also conserved in the BT0354 active site; however, only the ribose-phosphate moiety of ADPR fits well into the binding pocket (Figure 4B). These structural considerations led us to propose ADPR and Rib-P as primary candidates for NrtR effector molecules.
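For readers unfamiliar with the metric, the root mean square deviation quoted above can be illustrated with a minimal Python sketch. It assumes the two sets of Cα coordinates are already superimposed and paired atom by atom; the superposition step itself is not shown, and the function name and toy coordinates are hypothetical.

```python
import math


def rmsd(coords_a, coords_b):
    """Root mean square deviation (same units as the input, e.g. Å) between
    two equally long lists of already-superimposed (x, y, z) coordinates."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    squared = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        squared += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(squared / len(coords_a))


# Invented toy coordinates: every atom is displaced by exactly 1 Å along z.
atoms_a = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
atoms_b = [(0.0, 0.0, 1.0), (1.0, 0.0, 1.0)]
print(rmsd(atoms_a, atoms_b))
```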
Prediction of NrtR-binding sites and reconstruction of NrtR regulons in bacterial genomes
To identify possible DNA motifs specifically recognized by NrtR in various taxonomic groups, we created training sets by combining the upstream regions of NAD metabolic genes from complete bacterial genomes containing nrtR genes. Based on the phylogenetic analysis, the NrtR family was divided into several taxonomic groups, some of which were further split into distinct branches (Figure 3). These groups were used to compile group-specific training sets. For nrtR genes that occur in conserved chromosomal clusters with genes other than those involved in NAD metabolism, the respective training sets included the upstream regions of these gene clusters (e.g. the nrtR-draG locus in Streptomyces and the ara or xyl loci in Bacteroidetes).
By applying the motif-detection program SignalX with the inverted repeat option (40), we identified candidate NrtR recognition sites conserved in each of the compiled training sets and constructed the corresponding NrtR profiles and sequence logos (Figure 3). Although the derived motifs differ substantially in consensus sequence and information content (depending on the number and diversity of species within groups and the number of candidate target genes), most of them share a 21-bp palindromic symmetry and a conserved core with the consensus GT-N7-AC. The most divergent NrtR consensus sequences were detected for two groups of the γ-proteobacteria (Vibrio) and one group of the Firmicutes (Streptococcus). For the Deinococcus species, we were unable to identify NrtR recognition profiles and regulons.
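To make the shared motif features concrete, the following Python sketch looks for the GT-N7-AC core in a DNA sequence and counts how many of the ten symmetric position pairs of the surrounding 21-bp window are complementary. It is not SignalX and does not reproduce its algorithm. The function names are hypothetical, and placing the core at the center of the 21-bp window (5 bp of flank on each side) is an assumption made for illustration.

```python
import re

COMPLEMENT = str.maketrans("ACGT", "TGCA")

# Lookahead so that overlapping core matches are also reported.
CORE = re.compile(r"(?=GT[ACGT]{7}AC)")


def reverse_complement(seq: str) -> str:
    return seq.translate(COMPLEMENT)[::-1]


def symmetric_pairs(site: str) -> int:
    """Count position pairs (i, n-1-i) whose bases are complementary,
    i.e. pairs consistent with an inverted repeat."""
    n = len(site)
    return sum(
        site[i] == site[n - 1 - i].translate(COMPLEMENT) for i in range(n // 2)
    )


def find_candidate_sites(seq: str, flank: int = 5):
    """Return (start, window, symmetric_pairs) for every GT-N7-AC core whose
    flanked window (assumed: core centered, `flank` bp each side) fits in seq."""
    hits = []
    for match in CORE.finditer(seq):
        start = match.start() - flank
        end = match.start() + 11 + flank  # the core itself is 11 bp long
        if start >= 0 and end <= len(seq):
            window = seq[start:end]
            hits.append((start, window, symmetric_pairs(window)))
    return hits


# Invented toy site: a perfect 21-bp inverted repeat around a central base.
left_half = "ATTATGTCAA"
toy_site = left_half + "T" + reverse_complement(left_half)
for hit in find_candidate_sites("CCCC" + toy_site + "GGGG"):
    print(hit)
```

A real search would rank windows with a position-specific profile rather than with a fixed core pattern, since the text notes that some groups have strongly divergent consensus sequences.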
The constructed NrtR-binding site recognition profiles were used to detect new candidate members of the NrtR regulons in the genomes containing nrtR genes. The table ‘Operon structure for nrtR genes and predicted NrtR sites in bacteria’ lists the genes and operons predicted to be under the control of NrtR. Detailed information about the sequence, position and score of each predicted NrtR site, as well as the genomic identification numbers of the downstream genes, is provided in Supplementary Table S2. The key features of the reconstructed NrtR regulons are outlined in detail by taxonomic group in Supplementary Text S1.
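Profile-based site detection of this kind can be sketched as follows. The example builds a generic log-odds position weight matrix from aligned training sites and scans a sequence with it. The weighting scheme, the pseudocount, the uniform background and the function names are all assumptions chosen for illustration; they are not the formulas of the software used in the study, and the training sites are invented.

```python
import math


def build_profile(sites, pseudocount=0.5, background=0.25):
    """Log-odds weight for each base at each position of the aligned sites.
    Assumes equal-length sites over the alphabet A, C, G, T."""
    profile = []
    for pos in range(len(sites[0])):
        column = [site[pos] for site in sites]
        weights = {}
        for base in "ACGT":
            freq = (column.count(base) + pseudocount) / (
                len(sites) + 4 * pseudocount
            )
            weights[base] = math.log2(freq / background)
        profile.append(weights)
    return profile


def score_site(profile, site):
    """Sum of positional weights; raises KeyError on non-ACGT characters."""
    return sum(weights[base] for weights, base in zip(profile, site))


def scan(profile, seq, threshold):
    """Return (position, site, score) for every window scoring >= threshold."""
    width = len(profile)
    hits = []
    for i in range(len(seq) - width + 1):
        window = seq[i:i + width]
        score = score_site(profile, window)
        if score >= threshold:
            hits.append((i, window, score))
    return hits


# Invented toy training set (real NrtR sites are about 21 bp long).
training_sites = ["GTACAC", "GTTCAC", "GTAGAC"]
toy_profile = build_profile(training_sites)
best_score = score_site(toy_profile, "GTACAC")
for hit in scan(toy_profile, "AAAGTACACAAA", threshold=best_score - 1e-9):
    print(hit)
```

In practice the threshold would be set from the score distribution of the training sites, so that weaker but still plausible sites upstream of candidate regulon members are retained.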
Operon structure for nrtR genes and predicted NrtR sites in bacteria
The functional gene content of the reconstructed NrtR regulons varies significantly between different taxonomic groups of bacteria. The NrtR-regulated pathways include the universal NAD biosynthesis and salvage I pathways in Cyanobacteria, γ-proteobacteria, Pirellula and Chloroflexi; the de novo NAD biosynthesis pathway in Actinobacteria; niacin uptake and the salvage I and III pathways in Lactobacillales (Firmicutes); and the pentose utilization pathways in Bacteroidetes. The taxonomic distribution of NrtR regulators is complementary to the distribution of two other transcriptional regulators of NAD metabolism, NadR (18) and NiaR [see the accompanying paper (9)], with the exception of some species of Firmicutes, where NrtR and NiaR appear to regulate non-overlapping aspects of NAD metabolism (Supplementary Table S1). For instance, in Clostridium acetobutylicum, NrtR regulates the nicotinamide salvage pathway (pncAB) and the universal NAD synthesis (nadDE), whereas NiaR controls the de novo NAD synthesis (nadBCA). The Lactobacillales species represent another example of the large variability in the content of NAD regulons. In Lactobacillus casei, NiaR and NrtR regulate the niaP genes, respectively, whereas in L. plantarum they control the pncB genes, respectively (9).
The position of the candidate NrtR-binding sites in the regulatory gene regions, either overlapping the predicted promoter elements or lying between the promoter and the translation start site of the downstream gene, strongly suggests that these regulators act as repressors of transcription (Supplementary Figure S3). They are expected to bind upstream of target genes via the C-terminal wHTH domain, and a postulated interaction of the N-terminal Nudix domain with an effector molecule is anticipated to weaken the NrtR–DNA complex, leading to derepression of the target genes.
While many aspects of the proposed mechanism are yet to be investigated, in this study we chose to experimentally validate selected NrtR family members in order to provide the minimal required support for the suggested novel functional annotations. For these validation experiments we chose two representatives of the NrtR family, Slr1690 from Synechocystis sp. and SO1979 from S. oneidensis. The candidate NrtR-binding sites in Synechocystis sp. precede the nadA genes, whereas the predicted NrtR regulon in S. oneidensis contains a single target operon, prs-nadV. The choice of these two species was dictated by the following considerations: (i) the unambiguous association of the chosen NrtR groups with NAD metabolism; (ii) the divergent nature of these groups, both in the taxonomic placement of the respective organisms and at the level of protein sequences (position on the NrtR tree and consensus DNA motifs); (iii) the availability of biochemical data for the Slr1690 protein (39) from the best-studied model cyanobacterium, Synechocystis sp., including a detailed analysis of its NAD biosynthetic machinery (41); and (iv) the status of S. oneidensis as an important model organism with a rapidly growing body of physiological and genomic data (42).
Experimental characterization of two NrtR family representatives
To experimentally test the ability of NrtR to bind specifically to the predicted DNA sites and to assess possible effectors, slr1690 from Synechocystis sp. (further referred to as syNrtR) and SO1979 from S. oneidensis (soNrtR) were cloned and overexpressed in E. coli. Both His6-tagged recombinant proteins were purified to homogeneity by Ni2+-chelating chromatography followed by gel filtration, as described in the Materials and Methods section. SDS-PAGE of the pure syNrtR and soNrtR proteins revealed molecular masses of about 32 and 29 kDa, respectively, in agreement with the expected sizes (Supplementary Figure S4).
As expected, no appreciable ADPR pyrophosphatase activity could be detected for the syNrtR and soNrtR proteins in the presence of Mg2+. This is consistent with the observed alterations in their Nudix signatures (Supplementary Figure S2), as well as with the previous report of extremely low enzymatic activity of the Slr1690 protein, several orders of magnitude lower than that measured for other ADPR pyrophosphatases in Synechocystis. Likewise, both NrtR proteins were unable to hydrolyze other nucleoside diphosphate compounds, including (2′)phospho-ADP-ribose (pADPR), NAD, flavin adenine dinucleotide, diadenosine polyphosphates (ApnA, n = 2, 4) and GDP-mannose.
We used an electrophoretic mobility shift assay (EMSA) to test the specific DNA binding of the purified syNrtR and soNrtR proteins to their predicted operator sites, derived from the upstream regions of the nadA and nadE genes and the nadM-nadV operon from Synechocystis sp. and of the prs-nadV operon from S. oneidensis. A substantial shift of the DNA band was observed in all cases upon incubation of the target DNA fragments with the respective proteins (Figure 5). A typical protein concentration dependence of DNA binding is illustrated in Figure 5A, which shows the increasing intensity of the shifted DNA band (corresponding to a predicted nadM-nadV target site) in the presence of increasing amounts of the syNrtR protein. The band shift was suppressed in the presence of a 200-fold excess of unlabeled DNA fragments but not in the presence of negative control DNA, poly(dC/dI), confirming the specific nature of the NrtR–DNA interactions (Figure 5A). Similar specific binding in the presence of competing DNA was observed between syNrtR and its cognate DNA sites from the upstream regions of the nadE and nadA genes (Figure 5B). Two distinctly shifted protein–DNA complexes were observed for the soNrtR interaction with the prs-nadV DNA target, confirming the functional competence of both predicted tandem NrtR-binding sites (Figure 5C).
Figure 5. EMSA demonstrating specific NrtR binding to DNA. DNA fragments used in the assays are defined by their genomic positions and are shown as dark circles in the top of each panel. (A) EMSA with nadM-nadV DNA fragment (0.7 ng) in the absence (lane 1) and ...
Of the tested intermediary metabolites associated with the NAD biosynthetic, salvage and recycling pathways, Nam, NA, quinolinic acid, NMN, NaMN, NAD, PRPP, Rib-P, ADP and AMP at 100 μM concentration did not interfere with complex formation between soNrtR and the prs-nadV tandem DNA site. This is illustrated for some of these compounds by the EMSA results (Figure 6A). In the same experiment, 100 μM ADPR and pADPR significantly suppressed soNrtR–DNA binding (Figure 6A), and a similar effect was observed with 100 μM nicotinate adenine dinucleotide (NaAD), nicotinate adenine dinucleotide phosphate (NaADP), NADH, NADPH and NADP (data not shown). As shown in the lower panel of Figure 6A, these metabolites exerted a similar effect on soNrtR–DNA complex formation even at 10 μM, with ADPR being the most effective of all. ADPR also suppressed the specific DNA binding of syNrtR, as illustrated in Figure 6B for the nadM-nadV target, whereas NAD and Rib-P had no effect under the same conditions.
Figure 6. Effect of NAD metabolites on NrtR–DNA binding. (A) Electrophoretic mobility of prs-nadV DNA fragment incubated with purified soNrtR (2 nM) in the absence (lane 1) and in the presence of 100 and 10 μM of the indicated compounds. (B) Electrophoretic ...