We have identified gene fusions of polyamine biosynthetic enzymes S-adenosylmethionine decarboxylase (AdoMetDC, speD) and aminopropyltransferase (speE) orthologues in diverse bacterial phyla. Both domains are functionally active and we demonstrate the novel de novo synthesis of the triamine spermidine from the diamine putrescine by fusion enzymes from β-proteobacterium Delftia acidovorans and δ-proteobacterium Syntrophus aciditrophicus, in a ΔspeDE gene deletion strain of Salmonella enterica sv. Typhimurium. Fusion proteins from marine α-proteobacterium Candidatus Pelagibacter ubique, actinobacterium Nocardia farcinica, chlorobi species Chloroherpeton thalassium, and β-proteobacterium Delftia acidovorans each produce a different profile of non-native polyamines including sym-norspermidine when expressed in Escherichia coli. The different aminopropyltransferase activities together with phylogenetic analysis confirm independent evolutionary origins for some fusions. Comparative genomic analysis strongly indicates that gene fusions arose by merger of adjacent open reading frames. Independent fusion events, and horizontal and vertical gene transfer contributed to the scattered phyletic distribution of the gene fusions. Surprisingly, expression of fusion genes in E. coli and S. Typhimurium revealed novel latent spermidine catabolic activity producing non-native 1,3-diaminopropane in these species. We have also identified fusions of polyamine biosynthetic enzymes agmatine deiminase and N-carbamoylputrescine amidohydrolase in archaea, and of S-adenosylmethionine decarboxylase and ornithine decarboxylase in the single-celled green alga Micromonas.
Metabolic pathways evolve through diverse genetic and molecular mechanisms: gene fusion, duplication, extension, loss, and recruitment of genes from elsewhere in the same genome or from other genomes via horizontal and endosymbiotic gene transfer (Caetano-Anolles et al., 2009, Fondi et al., 2009). Although gene fusions are not infrequent, there are relatively few examples where fusion proteins have been biochemically characterised. However, gene fusion to produce multidomain proteins is a key mechanism in protein evolution (Yanai et al., 2002), and enzyme fusions are found in pathways such as histidine biosynthesis (Fani et al., 2007). Koonin and colleagues concluded that gene fusions arise through an intermediate stage of physical clustering of the component genes (Yanai et al., 2002) within operons. Fusion of two enzymes in a biosynthetic pathway may not necessarily produce a functional fusion protein due to structural constraints, and non-functional or dysfunctional fusion genes would be unlikely to persist in genomes. Enzymes in metabolic pathways can physically associate without fusion events, to enhance biosynthetic efficiency by forming complexes known as metabolons, which may involve metabolic channeling (Srere, 2000, Srere, 1987, Welch, 1977). A metabolon is a transient multienzyme complex involving metabolically sequential enzymes, and metabolons have been observed for tricarboxylic acid cycle enzymes in Pseudomonas aeruginosa (Mitchell, 1996), for branched-chain amino acid catabolism in mammalian mitochondria (Islam et al., 2007, Islam et al., 2010), bicarbonate transport (Alvarez et al., 2005) and purine biosynthesis (An et al., 2008, An et al., 2010). Evidence of a polyamine biosynthetic metabolon (spermidine and spermine synthase) was found in the model plant Arabidopsis thaliana (Panicot et al., 2002).
Polyamine biosynthesis is particularly interesting for studying the evolutionary mechanisms that form new pathways and generate biosynthetic diversity. Polyamines are small organic polycations that are found in all cells in all three domains of life and are essential for cell growth and proliferation in eukaryotes (Fig. 1). Biosynthesis of polyamines is modular, consisting of diverse modules for diamine, triamine and higher order polyamine formation. The diamine module converts an amino acid (glutamate, arginine, ornithine or lysine) directly or indirectly to a diamine. Triamines are formed from diamines by addition of an aminopropyl or aminobutyl group to form norspermidine (Lee et al., 2009, Cacciapuoti et al., 1986), spermidine (Tabor & Tabor, 1984), homospermidine (Shaw et al.), aminopropylcadaverine (Igarashi et al., 1986) or aminobutylcadaverine (Fujihara et al., 1995, Shaw et al., 2010). Longer linear or branched polyamine biosynthetic modules also exist (Knott et al., 2007, Knott, 2009). Different modules can make the same polyamine and different polyamines can be made using the same module (Lee et al., 2009, Shaw et al., 2010). In eukaryotes, archaea and many species of bacteria, the triamine spermidine is synthesized by transfer of an aminopropyl group to putrescine (Fig. 1). The aminopropyl group is transferred from decarboxylated S-adenosylmethionine (dcAdoMet) to putrescine by the aminopropyltransferase spermidine synthase (SpdSyn) (Wu et al., 2007), and the dcAdoMet is formed by S-adenosylmethionine decarboxylase (AdoMetDC) (Pegg, 2009). In some bacteria and archaea, these same enzymes can synthesize norspermidine from diaminopropane (Cacciapuoti et al., 1986) (Fig. 1). An alternative pathway for norspermidine biosynthesis based on aspartate β-semialdehyde as the source of the aminopropyl group is also found in some bacteria (Lee et al., 2009, Deng et al., 2010).
AdoMetDC uses a pyruvoyl cofactor generated from an internal serine residue by autocatalytic self-cleavage of the proenzyme AdoMetDC protein to form an N-terminal β-subunit and a C-terminal α-subunit (Pegg, 2009). The pyruvoyl cofactor is formed at the N-terminus of the α-subunit. There are two classes of bacterial AdoMetDC: the Class 1a AdoMetDC is represented by the Escherichia coli enzyme and is Mg2+-activated (Toms et al., 2004, Bale & Ealick, Lu & Markham, 2007); the Class 1b AdoMetDC is represented by the Thermotoga maritima enzyme, is not activated by Mg2+ and is a dimer (Toms et al., 2004). Eukaryotic AdoMetDC has evolved from the fusion of two bacterial Class 1b-like AdoMetDC genes, fused in the same orientation, and the autocatalytic cleavage site of the C-terminal AdoMetDC fusion component has been lost (Toms et al., 2004). The two bacterial AdoMetDC components of the eukaryotic AdoMetDC can be resolved only at the structural and not at the sequence level. Once AdoMet is decarboxylated by AdoMetDC, it is committed to polyamine biosynthesis (Pegg, 2009).
Aminopropyltransferases such as spermidine synthase are also found in eukaryotes, archaea and many bacteria (Pegg & Michael, 2010). The molecular mechanisms most prominent in the evolution of aminopropyltransferases are gene duplication followed by change of substrate specificity. All aminopropyltransferases bind dcAdoMet and transfer an aminopropyl group to their diamine or polyamine co-substrate. The product of AdoMetDC, dcAdoMet, is thus a co-substrate of aminopropyltransferases, which may use diaminopropane, putrescine, cadaverine, agmatine, norspermidine, spermidine, and homospermidine as co-substrates as well as longer chain and branched polyamines (Fuell et al., 2010). Additionally, with spermine synthase and thermospermine synthase, the aminopropyl group can be transferred to the N1- or N8-amino group of spermidine, respectively (Knott et al., 2007).
We noticed that gene fusions of AdoMetDC and an aminopropyltransferase are present in diverse bacterial genomes. Considering that the unfused SpdSyn of T. maritima is a tetramer (Wu et al., 2007), and its AdoMetDC is a dimer, we sought to determine if fusion proteins of these two enzyme types are functional, how they evolved and how extensive gene fusion is in polyamine metabolism. Here we show that bacterial AdoMetDC-aminopropyltransferase fusion proteins are enzymatically active, possessing AdoMetDC activity and functionally diverse aminopropyltransferase domains. Some of the fusion genes have evolved independently in different bacterial phyla, although some have been acquired by horizontal and vertical gene transfer. We show that the fusion proteins from the β-proteobacterium Delftia acidovorans and δ-proteobacterium Syntrophus aciditrophicus are able to synthesize spermidine de novo from putrescine in cells lacking AdoMetDC and SpdSyn genes, demonstrating that the fusion proteins are functional covalent modules for polyamine biosynthesis.
We identified gene fusions encoding ORFs with N-terminal AdoMetDC and C-terminal aminopropyltransferase domains in genomes from diverse bacterial phyla (Fig. 2). All AdoMetDC domains of the fusion proteins are of the Class 1b type (Bale & Ealick, 2010), similar to the AdoMetDC of T. maritima, and there are no fusions containing the Class 1a, E. coli-like AdoMetDC genes. AdoMetDC-aminopropyltransferase gene fusions are found in the α-, β- and δ-Proteobacteria, Cyanobacteria, Chlorobi, Bacteroidetes, Firmicutes and Actinobacteria and the encoded proteins range in size from 346 aa for the α-proteobacterium Ca. P. ubique protein to 441 aa for the β-proteobacterium Aromatoleum aromaticum EbN1 protein. The genome sequence of the cyanobacterium Acaryochloris marina MBIC11017 is annotated as having a discrete aminopropyltransferase ORF with no AdoMetDC ORF nearby. However, close inspection of the relevant genomic vicinity reveals that the aminopropyltransferase is longer at the N-terminus than annotated and the N-terminal extension is an AdoMetDC orthologous sequence, which, for whatever reason, was not annotated in the genome sequence. This AdoMetDC domain is the only AdoMetDC orthologous sequence in the A. marina genome. The complete amino acid sequence of the A. marina MBIC11017 AdoMetDC-aminopropyltransferase fusion ORF is shown aligned with the other fusion proteins in Fig. S1.
The α-proteobacterium Ca. P. ubique is extremely abundant in the oceans (Rusch et al., 2007), and AdoMetDC-aminopropyltransferase gene fusion sequences are abundant in the marine metagenome. Using conservative criteria, we found in the marine metagenome 124 fusion proteins clustering with the three Ca. P. ubique fusion protein sequences, one clustering with the fusion protein from the chlorobi species Chloroherpeton thalassium ATCC 35110 and two with the fusion protein from α-proteobacterium HIMB114 (Fig. S2). Sequence identity between the two most diverged marine metagenome fusion proteins that cluster with Ca. P. ubique sequences is only 61%, each protein sequence being exactly the same size with no gaps in the alignment (Fig. S3).
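Because the two most diverged metagenome sequences are the same length and align without gaps, their percent identity reduces to a position-by-position comparison. The following is a minimal sketch of that calculation, not the method used for Fig. S3; the function name and the toy peptide sequences are hypothetical and chosen only for illustration.

```python
def percent_identity(seq_a: str, seq_b: str) -> float:
    """Percent identity of two equal-length, gap-free aligned protein sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be the same length (gap-free alignment)")
    if not seq_a:
        raise ValueError("sequences must not be empty")
    # Count positions where both sequences carry the same residue.
    matches = sum(a == b for a, b in zip(seq_a.upper(), seq_b.upper()))
    return 100.0 * matches / len(seq_a)


# Toy 10-residue peptides (hypothetical, not taken from the paper):
# they differ at two positions, so identity is 8/10.
print(percent_identity("MKTLAVGHYE", "MKSLAVGHFE"))  # 80.0
```

For alignments that contain gaps, a dedicated alignment tool and an explicit convention for counting gapped columns would be needed instead.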
A gene fusion in the marine α-proteobacterium Ca. P. ubique HTCC1002 encoding a 346 aa ORF (ZP_012649992) with N-terminal AdoMetDC and C-terminal aminopropyltransferase domains was synthesized with E. coli-optimised codons. An N-terminally T7-tagged Pelagibacter fusion ORF in pET21 was expressed in BL21 E. coli cells and induced at 30 °C for 24 h. Fig. 3A shows the polyamine profile, detected by HPLC, of E. coli cells expressing the Pelagibacter fusion protein compared to cells carrying the empty pET21 vector. The host E. coli strain accumulates putrescine, cadaverine and spermidine. Several novel peaks appeared due to the expression of the Pelagibacter AdoMetDC-aminopropyltransferase fusion gene, the most prominent of which was norspermidine (peak g, Fig. 3A), a non-native polyamine in E. coli. Surprisingly, accumulation of non-native diaminopropane was observed, although this diamine could not have been a direct product of the fusion protein. Novel peaks of unknown identity appeared, including a prominent peak eluting between putrescine and cadaverine (peak e, Fig. 3A).
For the AdoMetDC domain of the Pelagibacter fusion protein to be enzymatically active, autocatalytic cleavage of the AdoMetDC proenzyme sequence must occur. To analyse the proteolytic processing of the Pelagibacter fusion protein, the fusion ORF was recloned with an 11 aa N-terminal FLAG-tag and an 8 aa C-terminal His-tag in pQE80 to allow greater protein recovery. The 41.3 kDa proenzyme-containing Pelagibacter fusion protein, if processed, should produce an approximately 8 kDa β-subunit consisting of the N-terminal part of the processed AdoMetDC domain and a 33.3 kDa α-subunit consisting of the rest of the fusion protein. In Fig. 3B, prominent bands corresponding to the unprocessed fusion protein and the processed α- and β-subunits can be seen, and approximately half of the fusion protein is unprocessed after expression in E. coli. The AdoMetDC specific activity of the purified Pelagibacter fusion protein, which was a mix of unprocessed and processed forms, was determined in the absence and presence of different polyamines (Table 1). AdoMetDC activity was stimulated to some extent by all polyamines tested but not by magnesium. Unusually, the aminopropyltransferase activity of the Pelagibacter fusion protein was promiscuous and clearly detected with a number of amine acceptors including putrescine, cadaverine, agmatine and spermidine (Table 2).
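The expected fragment sizes follow from simple subtraction: the larger C-terminal fragment is the proenzyme mass minus the β-subunit mass. A minimal sketch of that arithmetic is given below; the function name is hypothetical, and the small mass change that accompanies conversion of the serine to the pyruvoyl group is assumed to be negligible at this resolution.

```python
def processed_subunit_masses(proenzyme_kda: float, beta_kda: float) -> tuple[float, float]:
    """Return (beta, alpha) masses in kDa expected after proenzyme self-cleavage.

    Assumption: the mass change from pyruvoyl formation is ignored.
    """
    if not 0 < beta_kda < proenzyme_kda:
        raise ValueError("beta-subunit mass must be positive and below the proenzyme mass")
    return beta_kda, proenzyme_kda - beta_kda


# Values stated in the text: 41.3 kDa tagged proenzyme, ~8 kDa beta-subunit.
beta, alpha = processed_subunit_masses(41.3, 8.0)
print(f"beta = {beta:.1f} kDa, alpha = {alpha:.1f} kDa")
```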
Genes encoding AdoMetDC-aminopropyltransferase fusion proteins from the chlorobi species C. thalassium ATCC 35110 (GenBank acc. no. YP_001996377), the actinobacterium Nocardia farcinica IFM 10152 (acc. no. YP_119998), the β-proteobacterium Delftia acidovorans SPH-1 (acc. no. YP_001561775) and the δ-proteobacterium Syntrophus aciditrophicus SB (acc. no. YP_460751) were synthesized with E. coli-optimised codons. Each fusion gene was cloned in pET21b and expressed in E. coli with an N-terminal T7 affinity tag. Extracts from E. coli cells expressing the induced fusion proteins were analysed by HPLC. Novel peaks appeared after expression of the fusion proteins (Fig. 4A): norspermidine accumulated with expression of the Chloroherpeton fusion protein, diaminopropane and norspermidine with the Delftia protein, diaminopropane and an unknown peak with the Nocardia protein but no novel peaks with the Syntrophus protein, although this probably reflects the insolubility of the Syntrophus protein, which was found almost entirely within the insoluble protein fraction. The abundant novel peak that accumulated on expression of the Nocardia fusion protein, eluting between the putrescine and cadaverine peaks, coincides with one of the Pelagibacter fusion protein peaks (peak e, Fig. 3A).
The relative amount of unprocessed and processed fusion protein in E. coli extracts (Fig. 4B) was assessed for the Chloroherpeton, Delftia, Nocardia and Pelagibacter fusion proteins (the Pelagibacter protein was expressed from pET21b with an N-terminal T7-tag for comparison with the other fusion proteins). The Delftia and Nocardia fusion proteins were almost fully processed, the Chloroherpeton protein was more than 50% processed, and the Pelagibacter protein less than 50% processed. However, the Syntrophus enzyme was too insoluble for the analysis. Affinity tag-purified proteins for the Chloroherpeton, Delftia and Nocardia fusions were assayed for AdoMetDC activity (Fig. 4C). The Delftia fusion protein exhibited barely detectable activity; the Chloroherpeton fusion protein exhibited AdoMetDC specific activity similar to the unstimulated Pelagibacter activity described above, whereas the Nocardia fusion protein exhibited AdoMetDC activity similar to the Pelagibacter protein stimulated by the presence of 5 mM spermidine (Table 1).
To assess whether AdoMetDC-aminopropyltransferase fusion proteins could synthesize spermidine, we deleted the AdoMetDC (speD) and SpdSyn (speE) genes of S. Typhimurium 4/74 using a homologous recombination approach and expressed the fusion genes in this deletion strain. The speD and speE genes are immediately adjacent in the genome of S. Typhimurium 4/74 (Fig. 5A) and are probably co-transcribed. Deletion of the speD/speE genes abolished spermidine accumulation and caused a massive over-accumulation of putrescine in the ΔspeDE gene deletion strain (Fig. 5B). The ΔspeDE gene deletion strain and the individual ΔspeD and ΔspeE gene deletion strains grew approximately 40% less efficiently than the parental strain in defined, minimal liquid medium (Fig. 5C). Each AdoMetDC-aminopropyltransferase fusion gene was expressed in S. Typhimurium 4/74 and the derived ΔspeDE gene deletion strain, from a low copy number plasmid pWSK29 using a constitutive T7 promoter. The Pelagibacter fusion gene appeared to be completely inactive or non-expressed in either the parental or ΔspeDE gene deletion strain (results not shown). Expression of the Delftia fusion protein in the parental S. Typhimurium 4/74 strain resulted in a small accumulation of diaminopropane and norspermidine (Fig. 6A). In contrast, expression of the Nocardia fusion protein in the parental S. Typhimurium 4/74 strain resulted in an accumulation of more diaminopropane than putrescine; however, this did not result in norspermidine accumulation. Due to technical difficulties we were unable to express the Syntrophus fusion gene in the S. Typhimurium 4/74 parental strain. When expressed in the ΔspeDE gene deletion strain, neither the Delftia nor Nocardia fusion genes caused an accumulation of diaminopropane or norspermidine but there was clearly accumulation of spermidine with expression of the Delftia fusion gene (Fig. 6B).
Close inspection of the HPLC polyamine profile of the ΔspeDE gene deletion strain expressing the Syntrophus fusion gene revealed a small accumulation of spermidine only (Fig. 6C).
Of the bacterial genomes containing the AdoMetDC-aminopropyltransferase gene fusions (Fig. 2), none contain another unfused AdoMetDC paralogous gene except the two Bacillus megaterium strains (QM B1551 and DSM319). The AdoMetDC domain of the fusion protein from both strains of B. megaterium possesses an aberrant sequence at the putative proenzyme processing site (Fig. S1), with a threonine rather than a glutamate residue immediately preceding the serine that generates the pyruvoyl cofactor. It is possible that the AdoMetDC domain of the B. megaterium fusion protein is therefore inactive, and the AdoMetDC function may be replaced by the paralogous unfused AdoMetDC gene, which possesses a normal proenzyme processing site.
Bacterial AdoMetDC-aminopropyltransferase gene fusions described above consist of an N-terminal Class 1b AdoMetDC domain, similar to AdoMetDC of T. maritima, and a C-terminal aminopropyltransferase domain. The same arrangement, but of unfused Class 1b AdoMetDC and aminopropyltransferase ORFs in the same order and orientation, is found in diverse bacterial and archaeal genomes (Fig. 7). An exception is Bacillus anthracis str. Ames, where the relative positions of the ORFs are reversed. Pairs of adjacent AdoMetDC and aminopropyltransferase ORFs are also found for the Class 1a AdoMetDCs typified by the E. coli AdoMetDC. However, in the case of the Class 1a AdoMetDC ORFs, the AdoMetDC ORFs are immediately downstream of the aminopropyltransferase ORFs (Fig. S4A). One possible mechanism for formation of the AdoMetDC-aminopropyltransferase fusions is point mutation of the AdoMetDC termination codon to allow translational readthrough into the downstream aminopropyltransferase ORF. This should produce a fusion protein if the two ORFs are in phase and there are no termination codons in the intervening linker region. Examples of where a simple nucleotide mutation of an AdoMetDC termination codon would result in an AdoMetDC-aminopropyltransferase fusion protein are shown in Fig. S5. For other pairs of unfused AdoMetDC and aminopropyltransferase ORFs, the downstream ORF is out of phase or there are several intervening termination codons. In such cases, short insertions or deletions would be required to form an AdoMetDC-aminopropyltransferase fusion protein.
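The two conditions for fusion by stop-codon mutation (ORFs in phase, no termination codon in the linker) can be checked directly on the intergenic sequence. The sketch below is illustrative only and is not the analysis behind Fig. S5; the function name and the toy linker sequences are hypothetical, and the linker is assumed to be the DNA lying strictly between the upstream termination codon and the first codon of the downstream ORF, read on the coding strand.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}  # standard bacterial termination codons


def fuses_after_stop_mutation(linker: str) -> bool:
    """Would mutating the upstream stop codon to a sense codon fuse the two ORFs?

    `linker` is the sequence between the upstream stop codon and the
    downstream ORF start (both excluded), so it is read in the upstream frame.
    """
    linker = linker.upper()
    # Condition 1: the downstream ORF must be in phase with the upstream ORF.
    if len(linker) % 3 != 0:
        return False
    # Condition 2: no in-frame termination codon within the linker.
    codons = (linker[i:i + 3] for i in range(0, len(linker), 3))
    return not any(codon in STOP_CODONS for codon in codons)


# Toy linkers (hypothetical sequences):
print(fuses_after_stop_mutation("GGAGGT"))     # True: in phase, no stop codon
print(fuses_after_stop_mutation("GGAGG"))      # False: downstream ORF out of phase
print(fuses_after_stop_mutation("GGATAAGGT"))  # False: in-frame TAA in the linker
```

A linker that fails either test corresponds to the cases described above in which short insertions or deletions, rather than a single point mutation, would be needed to create the fusion.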
That fusion of physically adjacent ORFs is likely to be the mechanism by which fusion proteins arise is clearly illustrated by the actinobacterial Nocardia AdoMetDC-aminopropyltransferase fusion. A cluster of discrete, unfused ornithine decarboxylase (ODC), AdoMetDC and aminopropyltransferase ORFs in the same orientation is found in a number of actinobacterial genomes, e.g. Saccharomonospora viridis DSM 43017 (Fig. 8A). In the genome of the actinobacterium Streptomyces sp. AA4, the ornithine decarboxylase ORF has fused with the immediately downstream AdoMetDC ORF to form a 498 aa ODC-AdoMetDC fusion protein while maintaining a separate downstream aminopropyltransferase ORF (Fig. 8A). In Nocardia farcinica IFM 10152, the upstream ODC ORF is maintained but the downstream AdoMetDC and aminopropyltransferase ORFs have fused to form the Nocardia AdoMetDC-aminopropyltransferase fusion protein described above. In the actinobacterium Mycobacterium kansasii ATCC 12478, the upstream ODC ORF has fused with the downstream aminopropyltransferase ORF to create a 647 aa ODC-aminopropyltransferase fusion protein but the intervening AdoMetDC ORF has been eliminated (Fig. 8A).
The mechanisms for formation of gene fusions proposed above suggest that fusions might arise independently in different bacterial lineages. Different polyamine products from the various AdoMetDC-aminopropyltransferase fusion proteins expressed in E. coli support distinct evolutionary origins. Phylogenetic comparison of the AdoMetDC domains of the fusion proteins indicates that the fusion proteins have arisen independently in the α- and β-Proteobacteria and in Actinobacteria and Firmicutes (Fig. S6). Horizontal gene transfer is probably responsible for the cluster of fusion proteins represented by the cyanobacteria UCYN-A and Acaryochloris marina MBIC11017, the Chlorobi species Chloroherpeton thalassium ATCC 35110, the Bacteroidetes species Fluviicola taffensis DSM16823 and the δ-proteobacterium Bdellovibrio bacteriovorus HD100 (Fig. S6). Adjacent pairs of unfused AdoMetDC and aminopropyltransferase ORFs exhibiting a high degree of identity to the horizontally acquired fusion proteins described above are present in the fusobacterium Ilyobacter polytropus DSM2926 (YP_003967334 and YP_003967333) and in the Deferribacteres species Deferribacter desulfuricans SSM1 (YP_003496197 and YP_003496196). No other species within the Fusobacteria or Deferribacteres phyla contain adjacent AdoMetDC and aminopropyltransferase ORFs, suggesting that the I. polytropus DSM2926 and D. desulfuricans SSM1 AdoMetDC and aminopropyltransferase pairs were originally acquired as a fusion protein which then underwent scission to recreate discrete AdoMetDC and aminopropyltransferase ORFs. Thus there is strong evidence for independent formation of the fusion proteins in different phyla, for horizontal (and, in the β-Proteobacteria, vertical) gene transfer of fusion protein genes, and for scission of horizontally acquired fusion proteins to recreate individual AdoMetDC and aminopropyltransferase ORFs.
Some AdoMetDC-aminopropyltransferase fusion genes are found in polyamine-related gene clusters/operons. The cyanobacterium UCYN-A AdoMetDC-aminopropyltransferase fusion ORF is flanked upstream by an alanine racemase-fold arginine decarboxylase ORF and downstream by an agmatine ureohydrolase ORF (Fig. 8B). In the cyanobacterium A. marina MBIC11017, and in the Bacteroidetes species F. taffensis DSM16823, the AdoMetDC-aminopropyltransferase fusion ORF is immediately upstream of an agmatine ureohydrolase ORF (Fig. 8B). An ORF for methylthioadenosine nucleosidase, which metabolises methylthioadenosine, a co-product of aminopropyltransferase activity, is found immediately upstream of the AdoMetDC-aminopropyltransferase fusion ORF in the Ca. P. ubique genomes (Fig. 8B).
Other polyamine-related gene fusions are present in different species. Gene fusions of agmatine deiminase and N-carbamoylputrescine amidohydrolase are found in four methanomicrobial species of the euryarchaeota (Fig. S4B). These two enzymes are responsible for the conversion of agmatine to putrescine via the intermediate metabolite N-carbamoylputrescine. The fusion protein has an N-terminal N-carbamoylputrescine amidohydrolase domain and a C-terminal agmatine deiminase domain. A similar arrangement of adjacent unfused N-carbamoylputrescine amidohydrolase and agmatine deiminase ORFs is found in diverse bacterial phyla, although no bacterial genome appears to contain a fusion of the two ORFs (Fig. S4B). In eukaryotes, we found a gene fusion of a eukaryotic AdoMetDC and an ornithine decarboxylase in two species of single-celled prasinophyte green algae, Micromonas pusilla CCMP1545 (EEH58717, 1019 a.a.) and Micromonas sp. RCC299 (XP_002502359, 1039 a.a.). The AdoMetDC domain is N-terminal and the ODC domain C-terminal, and the corresponding mRNA 5′ regions have a small upstream ORF similar to the upstream ORF in the mRNA 5′ ends of the related Ostreococcus AdoMetDC-encoding genes (Ivanov et al.). Although the Micromonas AdoMetDC-ODC gene fusions are arranged similarly to the Plasmodium AdoMetDC-ODC fusion genes (Muller et al., 2000), these fusion genes represent entirely independent eukaryotic gene fusion events because both domains of the Micromonas fusion protein exhibit greatest homology with the individual component genes from green algae and plants and present only distant similarity to the Plasmodium sequences. All gene fusions involved in polyamine biosynthesis discovered to date are displayed in Fig. S7.
Gene fusion is an important mechanism in the evolution of proteins and metabolic pathways. Polyamine biosynthesis is ideal for investigating the occurrence, evolution and function of gene fusions in metabolic pathways because polyamines are ubiquitous and the biosynthetic pathways are short. We asked how pervasive gene fusions are in polyamine metabolism, how the fusions evolved, whether the fusion proteins are functional, and whether the presence of fusions is due to vertical descent, horizontal gene transfer or independent gene fusion events. An AdoMetDC-aminopropyltransferase fusion protein encoded by the genome of Ca. P. ubique HTCC1002 was processed as expected within the AdoMetDC domain when expressed in E. coli. Purified recombinant Pelagibacter fusion protein exhibited AdoMetDC activity that was not stimulated by magnesium ions but was stimulated, albeit weakly, by diamines and spermidine. The aminopropyltransferase activity was highly promiscuous, recognising putrescine, cadaverine, spermidine and agmatine with similar efficiency as substrates. This substrate promiscuity explains the variety of products detected when the Pelagibacter gene fusion was expressed in E. coli.
Surprisingly, the E. coli cells expressing the Pelagibacter fusion gene accumulated diaminopropane and norspermidine, which are non-native polyamines in E. coli. Formation of norspermidine is explicable if diaminopropane is used as a substrate by the fusion protein. The accumulation of diaminopropane can only be explained by catabolism of spermidine. Spermidine dehydrogenase activities have been reported in the γ-proteobacteria Citrobacter freundii, Serratia marcescens and Pseudomonas aeruginosa (Hisano et al., 1992, Tabor & Kellogg, 1970, Dasu et al., 2006). Each of these spermidine dehydrogenases cleaves spermidine to release diaminopropane and 4-aminobutyraldehyde. It is conceivable that the expression of the Pelagibacter AdoMetDC-aminopropyltransferase in E. coli produces an unprecedented surfeit of spermidine that results in catabolism of the excess spermidine by a latent spermidine dehydrogenase activity belonging to a promiscuous dehydrogenase. There is no orthologue of the cloned P. aeruginosa spermidine dehydrogenase spdH gene in the E. coli genome and no spermidine dehydrogenase activity has been documented for E. coli, so another dehydrogenase is likely to be responsible. Indeed, the P. aeruginosa spermidine dehydrogenase, which has a Km for spermidine of 36 μM, is not involved in spermidine catabolism in P. aeruginosa under normal physiological conditions due to low expression levels (Dasu et al., 2006). It is likely that the AdoMetDC-aminopropyltransferase fusion protein circumvents normal polyamine biosynthetic homeostatic mechanisms in E. coli and S. Typhimurium, causing excess spermidine accumulation and the novel formation of diaminopropane through spermidine catabolism. Norspermidine could then be synthesized by aminopropylation of the diaminopropane by the Pelagibacter fusion protein.
It is also theoretically possible that accumulation of aminopropylcadaverine or aminopropylagmatine generated through the activity of the Pelagibacter fusion protein could result in catabolism of these compounds, in addition to spermidine, to release diaminopropane. When the Delftia fusion protein was expressed in the parental S. Typhimurium 4/74 strain, which normally accumulates spermidine, both diaminopropane and norspermidine were produced. In contrast, when the Delftia fusion protein was expressed in the ΔspeDE gene deletion mutant, in which endogenously produced spermidine is entirely absent, diaminopropane and norspermidine were not produced; however, spermidine was synthesized by the fusion protein through aminopropylation of putrescine. This result further supports the hypothesis that catabolism of excess spermidine, produced by the fusion protein in addition to the endogenous spermidine pool, through a latent promiscuous spermidine dehydrogenase activity, is responsible for the formation of diaminopropane.
Accumulation of the non-native norspermidine in E. coli was also observed after expression of the Chloroherpeton and Delftia fusion genes. Diaminopropane in E. coli was detected after expression of the Nocardia fusion gene but there was no accumulation of norspermidine; the same result was obtained in the parental S. Typhimurium strain with the Nocardia fusion protein. This suggests that norspermidine is synthesized in E. coli by the Pelagibacter, Chloroherpeton and Delftia fusion proteins rather than by the endogenous E. coli SpdSyn and that the Nocardia fusion protein does not efficiently recognise diaminopropane as a substrate. The various fusion proteins synthesized different polyamine profiles in E. coli, indicating different substrate specificities of their respective aminopropyltransferase domains and suggesting that the fusion proteins evolved by independent gene fusion events involving distinct aminopropyltransferase genes. Further supporting independent gene fusion events, the AdoMetDC domain of the fusion proteins exhibits a highly scattered phyletic distribution (Fig. S6). There is a strong possibility of horizontal gene transfer being involved in the distribution of the Bdellovibrio (δ-Proteobacteria), Chloroherpeton (Chlorobi), cyanobacterium UCYN-A, Acaryochloris (Cyanobacteria) and Fluviicola (Bacteroidetes) fusion proteins, which are closely related at the amino acid sequence level. The sequence similarity for these fusions extends across both the AdoMetDC and aminopropyltransferase domains, and the horizontal gene transfer includes an adjacent putrescine-synthesising agmatine ureohydrolase gene in three of the species (Fig. 8), further supporting one gene fusion event and subsequent horizontal gene transfer. There also appears to be vertical inheritance for the five β-proteobacterial fusion genes.
The presence of a fused AdoMetDC-aminopropyltransferase gene in the genome of Ca. P. ubique is interesting in the context of the highly streamlined genome of this SAR11 clade marine species, which is amongst the smallest free-living bacterial genomes known (Giovannoni et al., 2005). The fusion protein is also the smallest encoded by the various AdoMetDC-aminopropyltransferase fusion genes. Ca. P. ubique is also extremely abundant at the ocean surface, the SAR11 clade comprising 12% of total marine prokaryotic biomass (Morris et al., 2002), and it exhibits a high level of intra-population sequence diversity, which can also be seen from the diversity of the AdoMetDC-aminopropyltransferase fusion protein sequences from the Global Ocean Sampling ocean metagenome indicated in supplementary Fig. S2. The nitrogen-fixing marine cyanobacterium UCYN-A also possesses a highly streamlined genome (1.44 Mbp) with reduced metabolic capabilities (Tripp et al., 2010) and it too contains a fused AdoMetDC-aminopropyltransferase. However, the AdoMetDC-aminopropyltransferase gene fusions from these two pelagic species probably have independent origins. The cyanobacterium UCYN-A fusion protein is much more similar to that of the large-genomed (8.3 Mbp) marine cyanobacterium A. marina (Swingley et al., 2008) and it is possible that both the fusion protein and an adjacent agmatine ureohydrolase (Fig. 8B) have been horizontally transferred between the species as a gene cluster/operon. As discussed above, the cyanobacterium UCYN-A and A. marina fusion proteins are very similar to the fusion proteins from the marine chlorobi species C. thalassium and the river-dwelling species B. bacteriovorus (δ-Proteobacteria) and F. taffensis (Bacteroidetes). The acquisition by horizontal transfer of these related fusion genes may be a result of marine/aquatic interaction.
The mechanism of the gene fusion events between AdoMetDC and aminopropyltransferase genes is suggested by the prevalence of juxtaposed AdoMetDC and aminopropyltransferase gene pairs in diverse bacterial and archaeal phyla. In almost all cases, the class 1b AdoMetDC gene is found immediately upstream of the aminopropyltransferase gene and both genes are oriented in the same direction. Mutations allowing readthrough from the AdoMetDC ORF to the aminopropyltransferase ORF are the most probable cause of the fused protein. This putative mechanism is strongly supported by the different gene fusions found in some actinobacterial genomes where a cluster of unfused ODC/AdoMetDC/aminopropyltransferase ORFs has given rise to ODC-AdoMetDC, AdoMetDC-aminopropyltransferase, and ODC-aminopropyltransferase fusions. Intriguingly, although adjacent pairs of Class 1a AdoMetDC and aminopropyltransferase ORFs are widespread in γ-Proteobacteria, no fusions between the two ORFs are present. This may be due to the fact that the Class 1a AdoMetDC ORF is always downstream of the aminopropyltransferase ORF, and fusion might interfere with correct processing of the AdoMetDC proenzyme domain. Structural interference between protein domains may therefore be an important factor limiting gene fusion in general. There may be counter-selection of specific configurations of open reading frames in operons that, if fused, would produce a non-functional or dysfunctional fusion protein due to steric or mechanistic interference.
We also observed the presence of fusions of N-carbamoylputrescine amidohydrolase and agmatine deiminase in closely related methanomicrobial species of the Archaea. The NCPAH-AIH fusion protein is again probably a consequence of mutation and readthrough from juxtaposed NCPAH and AIH ORFs, which are widespread in bacterial phyla as adjacent pairs. Gene fusions between polyamine biosynthetic enzymes can be found for other steps in polyamine biosynthesis. Previously we showed that a functional gene fusion consisting of diaminobutyrate aminotransferase and diaminobutyrate decarboxylase domains is present in the marine γ-proteobacterium Vibrio vulnificus and that the fusion gene produced diaminopropane when expressed in E. coli (Lee et al., 2009). This gene fusion is present throughout the Vibrionales but not elsewhere and so the diaminobutyrate aminotransferase-diaminobutyrate decarboxylase ORF fusion event may have occurred only once, followed by vertical descent or horizontal gene transfer to closely related species. The E. coli acid-inducible ADC represents a gene fusion between a biosynthetic ADC (typified by the Bacillus subtilis biosynthetic ADC) and a receiver domain response regulator, fused to the N-terminus of the ADC ORF. The fusion of the response regulator to a biosynthetic ADC has modified the behaviour of the enzyme so that it forms a decamer rather than a dimer, and it is now active only in acid conditions (Burrell et al., 2010). Human spermine synthase has evolved through the fusion of a bacterial Class 1b-like AdoMetDC to the N-terminus of an aminopropyltransferase open reading frame (Wu et al., 2008). The AdoMetDC domain of spermine synthase is recognisable only through structural analysis and not through primary amino acid sequence; the AdoMetDC domain is not processed and does not possess AdoMetDC catalytic activity (Wu et al., 2008).
Although not catalytically active, the AdoMetDC domain is essential for spermine synthase activity due to its role in dimer formation. Orthologues of the human spermine synthase are found throughout the metazoa and in the single-celled choanoflagellate Monosiga brevicollis, a sister clade to the metazoa. It is likely that the human spermine synthase was originally acquired by horizontal transfer of an AdoMetDC-aminopropyltransferase fusion gene from bacteria to a common ancestor of metazoa and M. brevicollis.
Gene fusions involving polyamine biosynthetic genes can be found in bacteria, archaea and eukaryotes. Three of the covalent biosynthetic modules described above, AdoMetDC/aminopropyltransferase, agmatine deiminase/N-carbamoylputrescine amidohydrolase and diaminobutyrate aminotransferase/diaminobutyrate decarboxylase, are found only in prokaryotes. Limited phyletic distribution of the latter two fusions and their conserved sequences suggests a single origin for each of the two fusion genes. In contrast, the highly scattered phyletic distribution of the AdoMetDC-aminopropyltransferase fusion genes and their functional diversity indicates independent formation in diverse bacterial phyla. Most gene fusions in polyamine metabolism are between enzymes that are involved in the same biosynthetic module, i.e., the conversion of one intermediate to the next in sequence in the metabolic pathway. This suggests that an additional constraint on the formation of gene fusions in metabolic pathways is the requirement of component domains to participate in a shared biosynthetic step.
The phylogenetically broad but locally limited distribution of AdoMetDC-aminopropyltransferase gene fusions in bacteria suggests that fusion of these two enzymes is probably accompanied by eventual scission events. However, the abundance of the AdoMetDC-aminopropyltransferase fusions in the ocean metagenome (primarily Ca. P. ubique-related) indicates that there is positive selection to maintain the fusion in this environment. What are the possible consequences of the fusion of AdoMetDC and the aminopropyltransferase? The decarboxylation of AdoMet by AdoMetDC commits AdoMet to polyamine biosynthesis; thus the activity of AdoMetDC closely correlates with polyamine levels and is rate-limiting for polyamine biosynthesis (Pegg, 2009). Fusion of the aminopropyltransferase to AdoMetDC creates a problem because a side activity of AdoMetDC results eventually in irreversible transamidation of the AdoMetDC pyruvoyl cofactor, effectively killing the enzyme. The human AdoMetDC pyruvoyl group has been calculated to be transamidated after 15,000 turnover events (Bale & Ealick, 2010). Because of the transamidation, AdoMetDC has a relatively short activity half-life and so has to be synthesized at a relatively high rate. Fusion of the two enzymes will result in a rate of aminopropyltransferase synthesis that is likely to be more than required for the unfused enzyme. Nevertheless, fusion of these two enzymes could be beneficial independently of flux and biosynthetic cost considerations. The product of AdoMetDC, dcAdoMet, is an inhibitor of methyltransferase reactions and is known to inhibit DNA methylation (Frostesjo et al., 1997). A recent study has also implicated dcAdoMet in the inhibition of histone H3 lysine 9 dimethylation, as well as general DNA hypomethylation (Yamamoto et al., 2010).
Therefore, if the dcAdoMet product of AdoMetDC was channeled within the fusion protein to the aminopropyltransferase, the toxic impact of dcAdoMet on cellular transmethylation reactions would be greatly attenuated. Benefits of the AdoMetDC-aminopropyltransferase fusion might therefore be determined by the sensitivity of particular species to interference with transmethylation reactions by dcAdoMet. The importance of detoxification to Ca. P. ubique is indicated by the fact that methylthioadenosine nucleosidase, which removes and salvages the co-product (methylthioadenosine, another inhibitor of methyltransferases) of the AdoMetDC-aminopropyltransferase fusion protein, is clustered immediately adjacent to the fusion protein in this species.
In conclusion, we have found that fusions of AdoMetDC and aminopropyltransferases in bacteria arise stochastically as a consequence of clustering of these ORFs in structurally favourable configurations, and mutation to allow translational readthrough from one ORF to the next. The phylogenetic distribution of the fusion proteins is determined by independent fusion events in diverse phyla, horizontal transfer between species in disparate phyla, limited vertical inheritance, and stochastic scission events, suggesting that the persistence of the fusion proteins in some species is dependent on physiological parameters specific to those species and their environments.
Coding sequences for the fusion proteins were ligated into pET21b (Novagen) for expression in E. coli. The expression host strain E. coli BL21 (DE3) was transformed with expression plasmids, and 1 ml of a 10 ml LB broth overnight culture was used to inoculate 100 ml of LB broth containing ampicillin. Recombinant protein expression was induced by incubating the cells with components of the Overnight Express™ Autoinduction System 1 (Novagen) at 25°C for 16 hours. T7-tagged proteins were purified using the T7-tag affinity purification kit (Novagen), following the manufacturer’s instructions for the column procedure. Fractions containing the target protein, as determined by SDS-PAGE, were pooled, buffer-exchanged, and concentrated. Fusion genes were synthesised de novo with codon usage optimised for expression in E. coli and Saccharomyces cerevisiae (by Genscript Corporation, Piscataway, NJ, USA).
Cells from 1 ml of induced E. coli cultures were harvested by centrifugation (10,000 g, 10 min), and washed twice in 1 ml phosphate-buffered saline. MOPS lysis buffer (100 mM MOPS, 50 mM NaCl, 20 mM MgCl2) was added at 5 μl per mg cell fresh weight, and cells were subjected to three cycles of freeze/thawing. TCA was added to a final concentration of 10%, and cells were incubated on ice for 5 min. After centrifugation (18,000 g, 5 min, 4 °C), 5 μl of supernatant were derivatized using the AccQ-Fluor reagent kit for labelling amino acids (Waters). For normalisation, 1,7-diaminoheptane was included as an internal standard. Labelled polyamines were separated by HPLC using a Luna 5μm C18 (2) 100A column (250 × 4.6 mm; Phenomenex) with fluorescence detection (excitation 248 nm, emission 398 nm). Solvent A was 70 mM acetic acid, 25 mM triethylamine, pH 4.82; solvent B was 80% acetonitrile, 20% H2O (v/v); solvent C was methanol, and the gradient was run for 65 min at a flow rate of 1.2 ml/min with the following concentrations: t = 0 min, 100% A; t = 1 min, 78% A, 22% B; t = 27 min, 55% A, 39% B, 6% C; t = 27.5 min, 53% A, 33% B, 14% C; t = 34 min, 20% A, 10% B, 70% C; t = 37 min, 100% B; t = 58 min, 100% A.
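The solvent programme above is easier to check when written out as a table. The following is a minimal sketch, not part of the original method: it encodes the stated time points, verifies that the three solvent percentages sum to 100 at each point, and estimates the composition at an intermediate time. Linear ramps between time points and a hold of the final composition until the end of the 65 min run are assumptions, and the names GRADIENT, check_gradient and composition_at are hypothetical.

```python
# Solvent programme as stated in the text: (time in min, %A, %B, %C).
GRADIENT = [
    (0.0, 100, 0, 0),
    (1.0, 78, 22, 0),
    (27.0, 55, 39, 6),
    (27.5, 53, 33, 14),
    (34.0, 20, 10, 70),
    (37.0, 0, 100, 0),
    (58.0, 100, 0, 0),
]


def check_gradient(table):
    """Raise ValueError if times are not increasing or a row does not sum to 100%."""
    previous_time = None
    for time_min, a, b, c in table:
        if a + b + c != 100:
            raise ValueError(f"solvents at t = {time_min} min sum to {a + b + c}%")
        if previous_time is not None and time_min <= previous_time:
            raise ValueError("time points must be strictly increasing")
        previous_time = time_min


def composition_at(t, table=GRADIENT):
    """Return (%A, %B, %C) at time t, ASSUMING linear ramps between time points."""
    first, last = table[0], table[-1]
    if t <= first[0]:
        return tuple(float(x) for x in first[1:])
    if t >= last[0]:
        # Assumption: the final composition is held until the end of the run.
        return tuple(float(x) for x in last[1:])
    for start, end in zip(table, table[1:]):
        t0, t1 = start[0], end[0]
        if t0 <= t <= t1:
            fraction = (t - t0) / (t1 - t0)
            return tuple(
                x0 + fraction * (x1 - x0) for x0, x1 in zip(start[1:], end[1:])
            )
    raise ValueError("time outside the gradient table")


check_gradient(GRADIENT)
print(composition_at(14.0))  # halfway between the 1 min and 27 min time points
```

If the instrument uses step changes or curved ramps rather than linear ones, only composition_at needs to change; the table and the sum check are independent of that choice.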
Each purified T7-tagged recombinant fusion protein was present at 2 μg per assay. AdoMetDC activity was assayed by quantification of the released 14CO2 from S-adenosyl-L-[14C]methionine (GE Healthcare), as described previously (Michael et al., 1996). Background activity was determined from reactions with enzyme dilution buffer instead of protein.
Activity was measured by following the production of [35S]MTA from [35S]dcAdoMet in 100 mM sodium phosphate buffer (pH 7.5) in the presence of the amine acceptor indicated as described (Michael et al., 1996).
To deplete cells of polyamines acquired from growth in rich medium (LB), cells picked from a single colony grown on solid LB agar plates were subcultured four times (1:100 dilution) in M9 minimal medium with selective antibiotic with the addition of 0.4% glucose. Polyamine-depleted gene deletion strains and the parental strain were then grown in 50 ml of M9 defined minimal growth medium in 250 ml Erlenmeyer flasks at 37°C; triplicate 1.0 ml samples were taken at hourly intervals and the OD600 nm was determined. For growth curves performed using colony-forming units (cfu) per ml measurements, samples were taken every hour and, in triplicate, diluted according to calculated approximations of cfu/ml measurements. These dilutions were used to make 10 μl spots on LB agar plates, which were incubated overnight at 30 °C to prevent overgrowth, facilitating counting of resultant colonies. All data produced were converted into cfu/ml values using reversal of dilution factors. These data were plotted on a log10 scale over time to produce a growth curve.
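The conversion of spot colony counts to cfu/ml described above can be sketched as follows. This is an illustrative calculation only: the 10 μl spot volume is taken from the text, whereas the function names, the example counts and the 10^4-fold dilution are hypothetical, and averaging the triplicate values before taking log10 is an assumption about how the replicates were combined.

```python
import math


def cfu_per_ml(colony_count, dilution_factor, spot_volume_ul=10.0):
    """Convert colonies counted in one spot to cfu per ml of undiluted culture.

    dilution_factor is the fold dilution of the spotted sample (e.g. 1e4 for
    a 1:10,000 dilution); multiplying by it reverses the dilution.
    """
    if dilution_factor <= 0 or spot_volume_ul <= 0:
        raise ValueError("dilution factor and spot volume must be positive")
    spots_per_ml = 1000.0 / spot_volume_ul  # a 10 ul spot is 1/100 of a ml
    return colony_count * dilution_factor * spots_per_ml


def mean_log10_cfu(counts, dilution_factor, spot_volume_ul=10.0):
    """Mean cfu/ml of replicate spots, returned as log10 for plotting."""
    values = [cfu_per_ml(c, dilution_factor, spot_volume_ul) for c in counts]
    return math.log10(sum(values) / len(values))


# Hypothetical triplicate counts from spots of a 10^4-fold dilution.
print(cfu_per_ml(23, 1e4))
print(mean_log10_cfu([20, 23, 26], 1e4))
```

Plotting mean_log10_cfu against sampling time gives the log10 growth curve referred to in the text.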
The template plasmids pKD3 and pKD4 contain FRT (FLP-recombinase recognition target site)-flanked chloramphenicol or kanamycin resistance gene cassettes respectively (Datsenko & Wanner, 2000). These were used to generate a PCR product which contained the antibiotic resistance gene flanked by homologous regions on the targeted gene and the two FRT sites. The S. Typhimurium strain JH3006, which carries the Red helper plasmid pKD46, was grown at 30 °C to an OD600 nm of 0.6 and then made electrocompetent by washing three times in 10% glycerol. Electroporation was applied to 100 μl of cells mixed with 10 μl of PCR product, 1 ml of LB liquid growth medium was added and cells were incubated at 37 °C for 1 hour before being spread onto LB plates to select for kanamycin- or chloramphenicol-resistant colonies following incubation at 37 °C overnight. Verification of transformants was performed by PCR on single colonies. A verified colony for each mutant was then used to produce a P22int-4 HT λ-bacteriophage lysate; 50 μl of overnight culture for each mutant (in JH3006) was used to inoculate 5 ml of fresh LB and incubated at 37 °C until the OD600 nm reached ~0.15-0.2. To this, 5 μl of stock P22 phage was added and incubated overnight. Cells were lysed using chloroform and then incubated at 4 °C for two hours. This lysate was purified using a 0.22 μm syringe filter following centrifugation at 13,000 rpm for 15 minutes and stored at 4 °C under a layer of chloroform. Two hundred μl of a 5 ml overnight culture of recipient S. Typhimurium 4/74 (in LB at 37 °C) had 10 μl of phage lysate added and was incubated at 37 °C for 1 hour to allow expression of antibiotic resistance genes. Selective antibiotic LB plates were used to grow transductants overnight at 37 °C; negative controls were used to ensure there was no contamination of lysate and that there were no resistant recipient cells.
DNA fragments encoding the different AdoMetDC-aminopropyltransferase fusion ORFs were ligated into the plasmid pWSK29 (Wang & Kushner, 1991). This low copy number plasmid has an origin of replication derived from pSC101 (Stoker et al., 1982) which allows 5-8 copies per cell in E. coli and Salmonella. Vector and insert were digested with the restriction enzymes BamHI and XhoI and the insert DNA was ligated into the vector. Plasmid pWSK29 was introduced into S. Typhimurium 4/74 cells by electroporation.
Transformation of S. Typhimurium with pWSK29. Electrocompetent Salmonella cells for both wild type S. Typhimurium 4/74 and the spermidine biosynthetic mutant ΔspeDE were generated as follows: cells were grown at 37 °C to an OD600 nm of 0.6, washed three times in 10% glycerol, then snap frozen in liquid nitrogen and stored at −80 °C. Electroporation (3 kW for 0.5 seconds) was applied to 100 μl of cells mixed with 10 μl of plasmid preparation, 1 ml of LB was added and cells were incubated at 37 °C for 1 hour before being spread onto LB plates to select for chloramphenicol- and ampicillin-resistant colonies for ΔspeDE, and for ampicillin resistance alone for wild type Salmonella transformed with pWSK29.
AJM was supported by UT Southwestern Medical Center, a Biotechnological and Biological Sciences Research Council UK CSG grant and Institute Development Fellowship (BB/E024467/1). AEP was supported by grants CA-0181138 and GM-26290 from the National Institutes of Health. We thank Margaret A. Phillips for constructive criticism of the manuscript.