Beneficial microbial associations with insects are common and are classified either as one or a few vertically transmitted species that reside intracellularly within specialized organs or as microbial assemblages in the gut. Cockroaches and termites maintain at least one, if not both, of these beneficial associations. Blattabacterium is a flavobacterial endosymbiont of nearly all cockroaches and the termite Mastotermes darwiniensis and can use nitrogenous wastes in essential amino acid and vitamin biosynthesis. Key changes during the evolutionary divergence of termites from cockroaches are loss of Blattabacterium, diet shift to wood, acquisition of a specialized hindgut microbiota, and establishment of advanced social behavior. Termite gut microbes collaborate to fix nitrogen, degrade lignocellulose, and produce nutrients, and the absence of Blattabacterium in nearly all termites suggests that its nutrient-provisioning role has been replaced by gut microbes. M. darwiniensis is a basal, extant termite and the only one that retains Blattabacterium, which would show evidence of relaxed selection if it is being supplanted by the gut microbiome. This termite-associated Blattabacterium genome is ~8% smaller than cockroach-associated Blattabacterium genomes and lacks genes underlying vitamin and essential amino acid biosynthesis. Furthermore, the M. darwiniensis gut microbiome membership is more consistent between individuals and includes specialized termite gut-associated bacteria, unlike the more variable membership of cockroach gut microbiomes. The M. darwiniensis Blattabacterium genome may reflect relaxed selection for some of its encoded functions, and the loss of this endosymbiont in all remaining termite genera may result from its replacement by a functionally complementary gut microbiota.
Nutritional symbioses of insects with microorganisms are widespread and fall into two main categories. Intimate intracellular associations with bacteria that provision amino acids or vitamins are found in aphids, sharpshooters, psyllids, ants, tsetse flies, weevils, cockroaches, and others (2, 32, 39, 44, 51, 65, 71). In these systems, the symbionts are transmitted through eggs and exhibit reduced genomes in which functional capabilities dwindle over evolutionary time (50). In contrast, many animals, including some insects, harbor a complex mutualistic community of symbiotic bacteria and sometimes protozoans or fungi within the gut lumen; the most elaborate cases are found in termites, which can contain over 200 bacterial phylotypes, able to make all needed nutrients and to fix appreciable quantities of nitrogen, enabling termites to live on nutrient-deficient food, such as wood (25, 78). The insect group containing cockroaches and termites (Dictyoptera) is of particular interest in the evolution of insect symbiosis because all species in this group maintain at least one form of symbiosis. Considerable evidence supports the theory that termites are derived from cockroaches (29, 37), yet they differ in having advanced sociality and a specialized gut microbiota that facilitates a diet consisting primarily of wood and other plant material with a low nitrogen content (7, 9). This microbiota is comprised of characteristic groups of protozoans and bacteria that together are capable of cellulose degradation, nitrogen acquisition and recycling, and nutrient provisioning (61, 62). Although cockroaches have gut protozoans and bacteria, they also harbor Blattabacterium species, an obligate intracellular symbiont that resides solely within their fat bodies (4, 13, 38, 42). Blattabacterium is absent in nearly all termite species (5, 38), in which its nutrient-provisioning functions could be superseded by the activities of the hindgut microbiota.
Complete genome sequencing of Blattabacterium species from two distantly related cockroaches, Periplaneta americana and Blattella germanica, revealed that both genomes are highly reduced in size and exhibit near-perfect genomic synteny, features characteristic of insect-associated, obligate bacterial endosymbionts (16, 48, 50, 66, 73, 76). Despite their diminished size, both genomes encode genes for enzymes required to produce many vitamins and nearly all amino acids by using waste nitrogen products such as urea and ammonia (39, 65). In light of the genome reduction experienced by endosymbionts, retention of nutrient biosynthetic pathways reflects their importance in this host-bacterial mutualism.
The wood-feeding, “lower” termite Mastotermes darwiniensis is an exception among termites in that it, like cockroaches, maintains Blattabacterium within its fat bodies and transmits the endosymbiont transovarially to its offspring (38, 68). Mastotermes is a deeply branching termite lineage (6, 30), and it shares other morphological and reproductive similarities with cockroaches (14, 19, 55). One hypothesis that could explain the absence of Blattabacterium in all extant termites except Mastotermes is that the nutrient-supplying functions of Blattabacterium were replaced by the acquisition of bacteria and gut protozoans prior to termite diversification. This replacement would depend on the development of social behaviors that ensure reliable transmission of gut microbes among conspecifics. Several examples of replacement of ancestral symbionts are known, including replacement of the aphid symbiont Buchnera by a fungal species (21), replacement of the weevil symbiont Nardonella by a novel symbiont in grain weevils (35), and replacement of the ancestral symbiont Sulcia muelleri by fungal or bacterial symbionts in several lineages of sap-feeding hemipterans (49). Other lineages retain Sulcia but have added an additional symbiont that acts as a metabolic partner in sharing the functions of provisioning nutrients to hosts, as revealed by genome sequencing of cosymbionts in the same hosts (44–47, 81). This raises the possibility that before complete loss of an ancient symbiont, some ancestral functions will be lost individually as they are replaced by contributions from new associates.
As the basal branch of termites and the only termite known to retain Blattabacterium, Mastotermes may represent a stage in the transition from dependence on an intracellular, heritable symbiont to increased reliance on the gut microbiota. In cockroaches, the primary functions of Blattabacterium are to recycle waste nitrogen and produce amino acids and vitamins, and these functions may have been lost in part before the elimination of Blattabacterium. To explore these possibilities, we have used complete genome sequencing to compare the functional capabilities of the Blattabacterium derived from Mastotermes darwiniensis (here referred to as MADAR) to those of Blattabacterium derived from cockroaches.
Five micrograms of DNA was prepared from fat bodies dissected from four ethanol-preserved M. darwiniensis (Marlow Lagoon, Northern Territory, Australia) specimens, as previously described in reference 65, and was submitted to the Yale University Keck DNA Sequencing Lab for DNA sequencing with a Genome Analyzer IIx (Illumina). We confirmed that Blattabacterium was the sole bacterial occupant of the fat bodies by sequencing PCR products obtained using universal 16S rRNA gene bacterial primers. A single 100-bp paired-end library was constructed from a multiplexed DNA sample (Illumina) that included M. darwiniensis fat body DNA and loaded onto a single flow cell lane for sequencing-by-synthesis. A total of 19,264,920 paired and 59,294 single reads were quality filtered (parameters were sequencing primer and adapter trimming, >Q20-Phred score trimming, removal of reads <25 bp in length) and preassembled de novo using the kmer-based assembler within the CLC Workbench (v4.6.1; CLCBio). The purpose of preassembly was 2-fold: to obtain the average insert length for paired reads (~265 bp) and to generate contigs to assist with subsequent assemblies using velvet (v1.0.19) (82). CLC Workbench-assembled contigs (38,174 contigs, ~295.7 bp) were included in an assembly of all the paired and single reads in velvet (parameters used were velveth kmer length, 57; velvetg read-trkg, yes; min_contig_lgth, 250; ins_length, 265). A total of 2,912 contigs were returned from this initial velvet assembly (1st round); those with an average read per base pair coverage of >9× and a G+C% content of <31% were selected (160 contigs, 5%), and the individual reads that mapped back to them were obtained and reassembled in velvet. Contigs from this second assembly (2nd round) were also aligned to a Blattabacterium reference genome sequence database using blastn (12) to identify those with >70% identity, and reads that mapped to these contigs were obtained. 
Contigs from the 1st-round velvet assembly that did not meet the average read per base pair coverage and G+C% criteria were also aligned to the same databases using blastn to identify additional MADAR reads, which were added to the reads obtained from the 2nd-round assembly. A third assembly of all MADAR reads in velvet (3rd round) yielded 93 contigs that, following open reading frame prediction using Prodigal (v.2.50) (28), contained protein-coding genes highly similar (>80%) to those in the Blattabacterium genome of either Periplaneta americana (BPLAN) or Blattella germanica (Bge) or both, as revealed by blastx searches of custom protein databases. Contigs were mapped onto the Blattabacterium sp. BPLAN genome using a web implementation of Projector 2 (77) to automate design of gap-spanning primers for Sanger sequencing. Finally, sequenced amplicons spanning gaps between the 93 contigs were assembled with these contigs, resulting in a single, circular genome.
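The contig selection step described above (average coverage of >9× and G+C content of <31%) can be illustrated with a minimal Python sketch. This is not the pipeline used in the study; the `Contig` class, its field names, and the toy sequences are hypothetical, and only the two thresholds are taken from the text.

```python
from dataclasses import dataclass


@dataclass
class Contig:
    name: str
    sequence: str
    mean_coverage: float  # average reads per base pair (hypothetical field)


def gc_percent(sequence: str) -> float:
    """Return the G+C content of a nucleotide sequence as a percentage."""
    seq = sequence.upper()
    if not seq:
        return 0.0
    gc = seq.count("G") + seq.count("C")
    return 100.0 * gc / len(seq)


def select_candidate_contigs(contigs, min_coverage=9.0, max_gc=31.0):
    """Keep contigs with coverage > min_coverage and G+C% < max_gc."""
    return [
        c for c in contigs
        if c.mean_coverage > min_coverage and gc_percent(c.sequence) < max_gc
    ]


# Toy contigs (invented for illustration only).
contigs = [
    Contig("low_gc_high_cov", "ATATATTAGC" * 30, 42.0),   # 20% G+C
    Contig("high_gc_high_cov", "GCGCATGCGC" * 30, 55.0),  # 80% G+C
    Contig("low_gc_low_cov", "ATATATTAGC" * 30, 3.0),     # fails coverage
]
selected = select_candidate_contigs(contigs)
print([c.name for c in selected])
```

Only the first toy contig passes both filters. Combining a coverage cutoff with a G+C cutoff is a plausible way to enrich for an AT-rich, high-copy endosymbiont genome against host DNA, which is presumably why both criteria were applied before the blastn screen.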
Functional prediction of protein-coding regions was performed using a combination of blastp searches against the GenBank nr (downloaded February 2011) (8) and COG myva (75) databases and hmmpfam (HMMER v.3.0) searches against the TIGRFAM v. 10.0 (70) and Pfam v.25.0 (20) databases. blastn searches of the MADAR genome with 5S, 16S, and 23S rRNA gene sequences from BPLAN and Bge were used to identify the 16S rRNA operon. tRNAs were identified by tRNAscan-SE v.1.3 (40), using the bacterial model. A single transfer-messenger RNA (tmRNA) was identified using a web-based implementation of BRUCE (34) and confirmed by a blastn search against GenBank. Annotated genes with predicted functions were mapped onto predicted biosynthetic and catabolic pathways in BPLAN and Bge to infer the metabolic potential of MADAR. nucmer (33) was used for whole-genome alignments.
Protein sequences from 62 complete and draft Bacteroidetes genomes (see the supplemental material for genomes and accession numbers) were obtained from the JGI Integrated Microbial Genomes and NCBI Entrez Genome Project websites. Protein sequences from Sulcia muelleri GWSS were searched against the protein databases for each phylotype member using blastp. Using the blast results, 13 proteins (EngA, RpsE, GidA, ValS, FusA, Tuf, GyrA, MutS, InfB, RpoC, RpoB, RplB, and RplK) were selected because each had a bit-score ratio of ≥0.3, according to Lerat et al. (36), and was present in all 62 genomes. Each set of proteins was aligned with MAFFT (v.6.624b) (31), and custom Perl scripts were used to remove all gap-containing columns. ProtTest (1) was used to identify the best model of evolution for these data. The 13 proteins from each genome were concatenated (7,003 characters/genome), and a maximum-likelihood phylogenetic reconstruction was performed with PhyML (v.3.0) (23) using the JTT+I+Γ model of amino acid substitution with 100 bootstrap replicates.
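The ortholog filter, gap-column removal, and concatenation steps can be sketched as follows. The original work used custom Perl scripts that are not shown in the text, so this Python version is only an illustration. The function names and toy alignments are hypothetical, and the bit-score ratio is assumed here to be the bit score of a hit divided by the bit score of the query aligned to itself.

```python
def bit_score_ratio(hit_score, self_score):
    """Assumed definition: hit bit score / query self-hit bit score."""
    if self_score <= 0:
        raise ValueError("self-hit bit score must be positive")
    return hit_score / self_score


def remove_gap_columns(alignment):
    """Drop every alignment column in which any sequence has a gap ('-')."""
    names = list(alignment)
    rows = [alignment[n] for n in names]
    if len({len(r) for r in rows}) > 1:
        raise ValueError("aligned sequences must have equal length")
    kept = [col for col in zip(*rows) if "-" not in col]
    return {n: "".join(col[i] for col in kept) for i, n in enumerate(names)}


def concatenate(alignments):
    """Join per-protein alignments into one supermatrix keyed by genome."""
    genomes = list(alignments[0])
    return {g: "".join(a[g] for a in alignments) for g in genomes}


# Toy alignments for two proteins across three genomes (invented data).
protein_a = {"MADAR": "MK-LV", "BPLAN": "MKALV", "Bge": "MR-LI"}
protein_b = {"MADAR": "GT", "BPLAN": "GT", "Bge": "GS"}

trimmed = [remove_gap_columns(p) for p in (protein_a, protein_b)]
supermatrix = concatenate(trimmed)
print(supermatrix)
print(bit_score_ratio(120.0, 300.0) >= 0.3)
```

Removing every gap-containing column is a conservative choice: it discards some signal but guarantees that each retained position is represented in all genomes of the concatenated matrix.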
All Mastotermes darwiniensis, Heterotermes aureus (Tucson, AZ), and Cryptocercus punctulatus (Mountain Lake, VA) specimens were collected in the wild from individual colonies at single time points and preserved in ethanol. Wild Periplaneta americana specimens were caught (“wild-caught”) on the University of Arizona campus (Tucson, AZ) at a single time point and preserved in ethanol. P. americana specimens from a colony maintained in our laboratory (“lab-reared”) were also collected at a single time point and preserved in ethanol. Total DNA was prepared from hindguts dissected from three individuals each of the ethanol-preserved species and from the freshly dissected hindguts of three lab-reared P. americana adults using a PowerSoil DNA isolation kit (MoBio). DNA was used as template for amplification of the V6-V9 region of the bacterial 16S rRNA genes using barcoded primers for sample multiplexing prior to pyrosequencing using a Roche FLX-Ti system at the University of Arizona Genetics Core (see Table S1 in the supplemental material for bar code and primer sequences; see the supplemental material for the PCR amplification method). Sequences were quality filtered (reads with a Phred score of <Q20, <100% identity to the barcode sequence, and <350 bp were removed) and binned by barcode within the CLC Workbench. Pyrotag preprocessing and sequence analyses were performed using the mothur (69) software suite. Operational taxonomic units (OTUs) were generated by clustering sequences at 95% identity. Community coverage was calculated using Good's coverage estimator at 95, 97, and 100% OTU cutoffs (see Fig. S3 in the supplemental material). Single sequences representative of the top 100 OTUs ranked by abundance were used to search a SILVA-based bacterial 16S rRNA gene sequence database (63) using blastn.
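Good's coverage estimator, mentioned above, is computed as 1 minus the number of singleton OTUs divided by the total number of sequences. The following Python sketch illustrates the calculation on invented data; it is not the mothur implementation, and the function name and OTU labels are hypothetical.

```python
from collections import Counter


def goods_coverage(otu_assignments):
    """Good's coverage: 1 - (singleton OTUs / total sequences)."""
    counts = Counter(otu_assignments)
    n_reads = sum(counts.values())
    if n_reads == 0:
        raise ValueError("at least one sequence is required")
    singletons = sum(1 for c in counts.values() if c == 1)
    return 1.0 - singletons / n_reads


# Ten toy reads assigned to three OTUs; one OTU is seen only once.
reads = ["otu1"] * 6 + ["otu2"] * 3 + ["otu3"]
print(round(goods_coverage(reads), 2))  # 0.9
```

A value close to 1 indicates that few sequences belong to OTUs observed only once, so additional sequencing would be expected to reveal few new OTUs at that identity cutoff.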
The sequences generated and reported in this paper have been deposited in the GenBank database (MADAR genome, CP003000; MADAR plasmid, CP003095; pyrotag sequences, JN585351 to JN585643).
The complete Blattabacterium sp. MADAR genome is comprised of a chromosome of 587,248 nucleotide base pairs and a 3,088-bp plasmid with 27.5% and 31.9% G+C, respectively. Ninety-five percent of the MADAR genome is comprised of open reading frames (ORFs), with 544 protein-coding and 40 RNA-coding genes (Table 1). Specifically, 34 tRNAs capable of transferring all amino acids, a single, complete rRNA operon, a transfer-messenger RNA (tmRNA), the noncoding RNA-modifying ribozyme RNase P, and a signal recognition particle RNA are all present within the genome. Nearly all of the genes essential for DNA replication and mRNA transcription and translation machinery are intact. A phylogeny based on the analysis of sequences of 13 highly conserved proteins from 62 selected Bacteroidetes phylotypes, including MADAR, further supports the previous inference that Blattabacterium represents a distinct clade within the Flavobacteriales (see Fig. S1 in the supplemental material) and that MADAR and Periplaneta americana Blattabacterium (BPLAN) cluster together, with the Blattella germanica Blattabacterium (Bge) as an outgroup, reflecting previously published tree topologies based on both host genes and Blattabacterium 16S rRNA genes (4, 38).
Bacterial endosymbionts, MADAR included, require and encode the major sigma factor RpoD for transcription, but they typically lack alternative sigma factors found in their close, free-living relatives. Unlike most other endosymbionts, all known Blattabacterium species, including MADAR, encode the alternative sigma factor RpoN, an enhancer-activated transcriptional regulator of genes involved in nitrogen assimilation (11). Additional metabolic functionalities that MADAR shares with other Blattabacterium species are that it generates outer-membrane and cell wall components and produces carbon precursors and ATP by gluconeogenesis and aerobic respiration, respectively.
Metabolic pathway reconstruction reveals several striking differences in the biosynthetic capabilities of MADAR compared to those of BPLAN or Bge. Specifically, MADAR has lost genes required for production of six essential amino acids that are required for protein production but cannot be produced by insects. In particular, all genes for enzymes involved in biosynthesis of tryptophan (trpEGDFCAB), threonine (thrBC), methionine (metAB), and the three branched-chain amino acids (ilvABHCDE-leuADCB) are absent (Fig. 1). All sequenced Blattabacterium genomes, including that of MADAR, lack genes for the nonessential amino acids asparagine and glutamine (which insect hosts are expected to be able to produce); loss of these genes appears to have occurred in a common ancestor. Additional genes present in Bge and BPLAN but absent in MADAR include genes involved in DNA replication and repair (ung), transcriptional regulation (asnC, dksA), translation and protein biosynthesis (truB, efp), ABC-type transporter permeases (lolE), and eight conserved hypothetical proteins. Together, these losses result in a smaller overall genome size of MADAR (587 kb) than that of either Bge or BPLAN (636 kb).
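The size difference can be checked with simple arithmetic. The short Python sketch below uses only the rounded values quoted above (587 kb and 636 kb); the function name is hypothetical.

```python
def percent_reduction(reference_kb, reduced_kb):
    """Percent by which reduced_kb is smaller than reference_kb."""
    return 100.0 * (reference_kb - reduced_kb) / reference_kb


# Rounded genome sizes as quoted in the text.
print(round(percent_reduction(636, 587), 1))  # 7.7
```

The result, about 7.7%, agrees with the approximately 8% reduction stated for the termite-associated genome.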
MADAR shares genes exclusively with Bge, including three tricarboxylic acid cycle (gltA, icd, acnA) genes, the cysIJNDH operon that encodes nearly all of the enzymes required for elemental sulfur assimilation, a bifunctional siroheme synthase (cysG) gene, a lipoprotein transporter (lolD) gene, and a sulfite exporter (tauE) gene. MADAR also harbors a plasmid identical in gene content to that observed for plasmids of both BPLAN and Bge. While the gene complement of MADAR is more similar to that of Bge, average nucleotide sequence identities for orthologous gene pairs are greater for BPLAN (85%) than for Bge (83%), as expected based on the more recent evolutionary divergence of BPLAN and MADAR. The low percent identity supports an ancient divergence of the species (see Fig. S1 in the supplemental material), consistent with their codiversification with hosts (38).
MADAR has few genes that lack closest homologs in BPLAN or Bge. Of these few, one encodes a free-radical-scavenging flavin adenine dinucleotide (FAD)-utilizing NAD(P)H dehydrogenase that is most similar (42% identical) to an ortholog found in Caulobacter crescentus NA1000 (YP_002515580.1). Genes that encode proteins involved in vitamin biosynthesis (ribD, ubiE), one-carbon transfer (ygfA), protein translocation (secDF), lipid metabolism (mvaK), NAD metabolism (nadD), tRNA modification (trmH), and posttranslational modification (def) appear to have been inactivated in MADAR due to single-nucleotide insertions or deletions within homopolymeric regions that alter the downstream open reading frame. Potentially, these genes no longer encode intact proteins. However, in some endosymbionts, transcriptional slippage within homopolymeric regions can correct the reading frame in a subpopulation of transcripts that can then be translated into functional enzymes (74). Such frameshifts within homopolymers may reduce transcription in an organism lacking common gene expression regulation mechanisms. Of the enzymes encoded by these apparent pseudogenes, peptide deformylase (encoded by the def gene), nicotinate-mononucleotide adenylyltransferase (nadD), and bifunctional riboflavin biosynthesis deaminase-reductase (ribD) are all essential for growth in Escherichia coli (3, 22, 43, 72). If they are also essential for MADAR growth, then transcriptional slippage may allow recovery of their functions. Alternatively, enzymes encoded by intact genes with similar activities may complement functions lost or attenuated by the frameshift. SpoU and a CMP/dCMP deaminase, which are encoded by intact genes, may complement the tRNA 2′-O-methyltransferase activity of TrmH and deaminase activity of RibD, respectively, to recover these functions.
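The effect of a single-nucleotide insertion within a homopolymeric region can be illustrated with a toy example. The Python sketch below locates homopolymer runs and shows how one extra base shifts the downstream reading frame so that the original stop codon is no longer read in frame. The sequences, the function names, and the minimum run length of 5 are assumptions made for illustration and are not taken from the MADAR annotation.

```python
import re


def homopolymer_runs(seq, min_len=5):
    """Return (start, base, length) for each single-base run >= min_len."""
    pattern = re.compile(r"([ACGT])\1{%d,}" % (min_len - 1))
    return [(m.start(), m.group(1), len(m.group(0)))
            for m in pattern.finditer(seq.upper())]


def codons(seq):
    """Split a sequence into complete codons, reading from position 0."""
    return [seq[i:i + 3] for i in range(0, len(seq) - 2, 3)]


# Toy open reading frame with a run of six adenines (invented sequence).
intact = "ATGAAAAAAGCTGGTTAA"
# A single-base insertion inside the homopolymer lengthens the run to seven.
mutated = intact[:3] + "A" + intact[3:]

print(homopolymer_runs(intact))   # [(3, 'A', 6)]
print(codons(intact))             # ends with the stop codon TAA
print(codons(mutated))            # downstream codons are shifted
```

In the mutated sequence every codon after the homopolymer is read out of frame, which is why such genes are annotated as apparent pseudogenes unless a mechanism such as transcriptional slippage restores the frame in some transcripts.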
In addition to the observed gene loss, the MADAR chromosome has undergone an ~242-kb inversion not found in the BPLAN or Bge chromosomes. The genes flanking the two rearrangement sites (the ligA, serS, and sirBC genes and an M22 glycoprotease-encoding gene) appear intact. The only other rearrangement among the three genomes is an ~19-kb inversion that distinguishes BPLAN from Bge and MADAR (see Fig. S2 in the supplemental material).
One explanation for the eroded biosynthetic capabilities of MADAR is that M. darwiniensis depends on gut microbes capable of provisioning the nutrients which Blattabacterium can no longer produce. Dependence upon the gut microbiota would require their reliable transmission in order to ensure host fitness. Thus, we would expect M. darwiniensis guts to harbor a consistent microbial community typical of that of other termites that is transmitted socially, whereas cockroaches may harbor a more variable gut community that differs among individual hosts and contains more microbes typical of other environments. To explore these predictions, we used a “pyrotag” approach to profile the bacterial gut microbiome in hindguts of three adults each from two termite species, M. darwiniensis and H. aureus, wild-caught and lab-reared individuals of the cockroach P. americana, and a wood roach species, Cryptocercus punctulatus. Wood roaches are the sister group to termites and display an early stage of sociality and a specialized gut microbiota dependent on social transmission (37, 38, 58). Dissimilarity of microbial community membership among individuals was least for H. aureus, followed by M. darwiniensis and C. punctulatus (Fig. 2). P. americana individuals showed limited sharing of gut microbes, even when sampled from the same lab colony. These observations suggest that the termites maintain a more consistent microbiota. Furthermore, the majority of sequences retrieved from both H. aureus and M. darwiniensis corresponded to phylotypes most similar to known termite symbionts, including known symbionts of termite gut protozoans. C. punctulatus also appeared to maintain a consistent microbiome that included similar symbionts. In contrast, each of the P. americana gut communities, while most similar to one another, had a large proportion of sequences that were most closely related to environmental sequences and not to symbionts represented in current databases (Fig. 2). Taxonomic assignment of the top 5 OTUs for all five microbiomes revealed distinct community membership uniformity for the termite and wood roach microbiomes, with termite gut protozoan and/or tissue-associated Bacteroidales, Clostridiales, and Spirochaetales phylotypes being both consistently present and abundant in at least two-thirds of the M. darwiniensis, H. aureus, and C. punctulatus pyrotag libraries.
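The comparison of community membership among individuals can be illustrated with a presence/absence dissimilarity measure. The text does not state which metric was used, so the Jaccard dissimilarity in the Python sketch below is an assumption, and the OTU labels and sample names are invented.

```python
from itertools import combinations


def jaccard_dissimilarity(a, b):
    """1 - |intersection| / |union| for two sets of OTU labels."""
    a, b = set(a), set(b)
    union = a | b
    if not union:
        return 0.0
    return 1.0 - len(a & b) / len(union)


def mean_pairwise_dissimilarity(samples):
    """Average Jaccard dissimilarity over all pairs of individuals."""
    pairs = list(combinations(samples.values(), 2))
    if not pairs:
        raise ValueError("at least two individuals are required")
    return sum(jaccard_dissimilarity(x, y) for x, y in pairs) / len(pairs)


# Hypothetical OTU membership for three individuals per host (toy data).
termite = {
    "t1": {"o1", "o2", "o3", "o4"},
    "t2": {"o1", "o2", "o3", "o5"},
    "t3": {"o1", "o2", "o3", "o4"},
}
cockroach = {
    "r1": {"o1", "o6", "o7"},
    "r2": {"o1", "o8", "o9"},
    "r3": {"o10", "o11", "o12"},
}

print(round(mean_pairwise_dissimilarity(termite), 3))
print(round(mean_pairwise_dissimilarity(cockroach), 3))
```

In this toy example the termite individuals share most OTUs and therefore have a lower mean dissimilarity than the cockroach individuals, which is the qualitative pattern reported for the pyrotag libraries.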
Complete sequencing of the M. darwiniensis Blattabacterium (MADAR) genome has revealed an obligate endosymbiont that has lost numerous genes, many of which are required for essential amino acid biosynthesis (Fig. 1). Nutrient provisioning is a key function of bacterial endosymbionts of insects, as evidenced by the retention of many of these genes even in the smallest endosymbiont genomes. The tiny 160-kb genome of the psyllid endosymbiont Carsonella has over 17% of its genes devoted to amino acid biosynthesis (51). Although loss of essential amino acid biosynthesis genes has been observed for other insect endosymbionts, as in the case of the auchenorrhynchan symbiont Sulcia muelleri that lacks the ability to produce up to three of 10 essential amino acids, functional complementation of the absent biosynthetic pathways has consistently been performed by another, coresiding bacterial symbiont (47). However, Blattabacterium is the sole resident of the bacteriocytes in MADAR, BPLAN, and Bge. Furthermore, the M. darwiniensis diet of wood is not expected to be a source of the missing essential amino acids.
An alternative hypothesis is that members of the gut microbiota have replaced some of the nutrient-provisioning functions of Blattabacterium MADAR. Functional replacement of Blattabacterium by the gut microbiota necessitates a reliable mechanism for efficient transmission of gut microbes. Direct transfer of hindgut fluids between conspecifics (proctodeal trophallaxis) and consumption of sibling feces (filial coprophagy) are documented dietary behaviors of termites, including M. darwiniensis, that present opportunities for microbe acquisition (41, 56, 57, 60). Coprophagous behavior is also observed for cockroaches and is extended to filial coprophagy and proctodeal trophallaxis in the wood roach Cryptocercus (56). A coprophagous mode of microbiota transmission requires consumption of fresh fecal pellets prior to degradation of anaerobic and desiccation-sensitive microbes (15, 24, 27). Proctodeal trophallaxis allows direct transfer of gut microbes between conspecifics with the least exposure to the outside environment, improving the likelihood of microbe survival during transmission. Bacterial symbionts of anaerobic lignocellulolytic protozoans (10, 54, 59, 64, 80) and other termite gut bacteria (Bacteroidales, Clostridiales, and Spirochaetales phylotypes) (17, 18, 52, 53, 79) were consistently observed in M. darwiniensis pyrotag libraries and may be transmitted by proctodeal trophallaxis.
“Candidatus Azobacteroides pseudotrichonymphae,” a Bacteroidales endosymbiont of the Coptotermes formosanus (Rhinotermitidae) cellulolytic gut protozoan Pseudotrichonympha species, is capable of fixing nitrogen and of producing all of the essential amino acids (26). This suggests that other Bacteroidales-protozoan mutualisms may also be capable of provisioning amino acids and of recycling or fixing nitrogen. Given the refractory nature of their primarily herbivorous diet, many “lower” termites rely upon their gut protozoans for their ability to degrade lignocellulose. H. aureus, which lacks Blattabacterium, has a gut bacterial microbiome in which a “Ca. Azobacteroides pseudotrichonymphae” phylotype is abundant in our pyrotag libraries (Fig. 2), and it likely receives nutrients resulting from this bacterium-protozoan mutualism. In contrast, P. americana, which retains Blattabacterium, has a less restrictive diet and a less uniform gut phylotype membership between individuals, minimizing reliance on the gut microbiome for nutrient provisioning. Acquisition of lignocellulolytic protozoans with symbiotic bacteria and of other bacteria capable of generating and provisioning nutrients, combined with a reliable means for their transmission from parent to offspring, may have set the stage for reduced reliance upon Blattabacterium for amino acid biosynthesis, resulting in its absence in termites. Additionally, the presence of nutrient-supplying gut symbionts could have facilitated the transition from producing clusters of eggs in enveloped oothecae, as observed with cockroaches and M. darwiniensis, to single egg laying, as in all other termites, a transition that may have interfered with transmission of Blattabacterium to offspring (67). Further genomic and biochemical characterization of the various gut microbial symbionts of M. darwiniensis is necessary to determine if they can complement the biosynthetic pathways missing in Blattabacterium.
Funding was provided by the U.S. National Science Foundation (awards 0626716 and 1062363 to N.A.M.).
We thank J. Eli Powell and Yogeshwar Kelkar for technical assistance. We are grateful to Christine A. Nalepa and Xuguo Zhou for thoughtful conversations and Cryptocercus specimens and Gaelen R. Burke and Kevin J. Vogel for collection of wild Periplaneta specimens.
Published ahead of print 21 October 2011
Supplemental material for this article may be found at http://aem.asm.org/.