When complete sequences are unavailable, overall genomic content can be reconstructed quickly and inexpensively by hybridization to arrays available for closely related organisms. Although this technique has been successfully applied to gain a broad understanding of genome composition, it cannot identify events that render genes inactive, such as point mutations and frame shifts caused by deletions and insertions. Genes that are not found in the arrayed genome, as well as species-specific genes, also cannot be identified. Thus, a fraction of the genome of interest may remain unsampled when utilizing arrays of close relatives. Furthermore, the hybridization of genomes with nucleotide composition bias may lead to the detection of artifacts and result in discrepancies between gene array hybridization and genome-wide sequence analysis (1, 3). Like any comparative genome approach, this method is unable to detect nonorthologous gene displacement (29), by which unrelated or distantly related proteins can be recruited for the same functions. Despite these limitations, this technique allows us to adequately compare genomes of microbes with similar guanine and cytosine (G+C) contents and with close phylogenetic distances to the organism from which the arrays were constructed. The identification of the full repertoire of genes comprising operons can further validate the results. For this study, related symbionts from two insect orders, Coleoptera and Diptera, were compared to delineate what alterations in microbial genome contents may have occurred since divergence from a common ancestor due to distinct host environments. To our knowledge, this is the first report describing microbial genome disintegration during the early phases of symbiotic establishment and in relation to environmental constraints.
Multiple phylogenetic analyses of Enterobacteriaceae species have suggested a distinct lineage for Sodalis and SOPE, indicating that they are members of a single bacterial taxon that have diverged from a common ancestral organism (4, 24). Recent data on the Dryophthoridae endosymbiont phylogeny have placed the divergence between Sitophilus endosymbionts and Sodalis at less than 25 million years ago (C. Lefevre, personal communication). The fact that symbionts related to Sodalis and SOPE are found in other insect taxa suggests that a progenitor had the ability to enter into relations with a wide range of hosts, perhaps as an insect pathogen (15). With time, pathogenic effects may have been attenuated while functions and/or products important for the evolutionary success of both partners were retained. It will be interesting to identify the genes, and their functions, that are present in the symbionts but absent from E. coli. Equally intriguing is whether these genes were present in the common ancestor of enteric bacteria or were acquired horizontally after divergence.
Sodalis, known as the secondary symbiont of the tsetse fly, is transmitted vertically to intrauterine larvae through the mother's milk (14, 31) and has both intra- and extracellular localization in the midgut, muscle, salivary glands, fat body, and hemolymph (14). Although Sodalis maintains a broadly similar tissue tropism across tsetse fly species, its density appears to vary among them (14). In contrast to Sodalis, SOPE has a strict intracellular localization within specialized structures called bacteriomes (25). Bacteriomes differentiate early during insect embryonic development and remain attached to the intestine at the junction of the foregut and midgut during the four larval stages. In young adults, bacteriomes are found in the mesenteric ceca of the intestine. However, in 2- to 3-week-old individuals, bacteriomes disappear, remaining only in the female ovaries, from which bacteria are maternally transmitted to the offspring (25). SOPE has been shown to supply weevil larvae with vitamins such as pantothenic acid, riboflavin, and biotin (25) and is a source of amino acids such as phenylalanine and proline (19). The symbiont also interacts with mitochondrial oxidative phosphorylation by increasing mitochondrial enzymatic activity (23), thus extending the flight ability and other energy-dependent activities of its host (21, 25, 39). Yet the symbiosis can be disrupted without causing whole host population lethality (39).
Similarities in genomic makeup between SOPE and Sodalis are still highly evident, with retention of many of the same genetic components involved in housekeeping functions such as translation and posttranslational modification, cell processes, transcription, and nucleotide biosynthesis. The cellular machinery involved in these processes appears to be conserved in mutualists and commensals, in contrast to parasitic microbes, which exploit their hosts for many of these resources. Furthermore, the greater conservation of translational, as opposed to transcriptional, processes supports stronger genetic regulation at the translational level for Sodalis and SOPE.
Competence in complete nucleotide biosynthesis and metabolism, amino acid biosynthesis and metabolism, energy metabolism, and cofactor biosynthesis suggests that SOPE and Sodalis may be approaching mutualism in their symbiotic associations. The retention of genes involved in regulatory functions, such as sigma factors, supports their recent symbiotic establishment. Obligate mutualists that live intracellularly within their hosts, sheltered from environmental fluctuations, have lost such regulatory genes, presumably because they are no longer needed (3, 35, 44).
Despite the vast similarities in detected ORFs for SOPE and Sodalis, specialized modifications toward the host environment, particularly in metabolic functions, appear to have occurred that are reminiscent of the genome tailoring of ancient symbionts. The significantly greater number of energy metabolism and fatty acid metabolism genes detected for SOPE than for Sodalis may be due to the restricted cereal diet of SOPE's weevil host. Lipids, prominent in the tsetse fly blood meal, provide more energy than plant carbohydrates because they are in a more reduced form (53). The erosion of fatty acid metabolism pathways in Sodalis might reflect the natural abundance of such products in the host environment. SOPE, with a greater capacity for carbon catabolism, is capable of metabolizing plant sugars in the diet of its host, which is composed of as much as 70% starch but is very low in lipid components (http://www.nal.usda.gov). The purging of genes involved in plant sugar metabolism from the Sodalis genome can be interpreted as an adaptive response to tsetse fly nutritional behavior. Since tsetse flies do not feed on plant material but on blood, which is low in carbohydrates and rich in simple sugars such as glucose and trehalose, Sodalis has lost unnecessary pathways that catabolize plant sugars such as starch.
Sodalis has a higher number of unique ORFs corresponding to cellular processes (perhaps adaptations necessary for intra- and extracellular localization), central intermediate metabolism, and amino acid biosynthesis and metabolism, whereas SOPE has a higher number corresponding to carbon compound catabolism, cell structure, energy metabolism, and fatty acid metabolism and transport. This pattern may indicate differences in genome retention since their last common ancestor and suggests bacterial domestication by the host. These differences in retention will influence what is further maintained in the SOPE and Sodalis genomes as relations with their hosts further evolve.
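The per-category comparison of unique ORFs described above can be illustrated with a minimal sketch. It assumes each genome's array results have already been reduced to a set of detected ORF identifiers and that a mapping from ORF to functional category is available. The function name, the ORF identifiers, and the category assignments are hypothetical placeholders, not data or methods from this study.

```python
from collections import Counter


def unique_orfs_by_category(detected_a, detected_b, category_of):
    """Count ORFs detected in genome A but not in genome B, per functional category."""
    only_a = set(detected_a) - set(detected_b)
    # ORFs without a known category are grouped under "unclassified".
    return Counter(category_of.get(orf, "unclassified") for orf in only_a)


# Hypothetical toy data, for illustration only.
category_of = {
    "orf1": "translation",
    "orf2": "fatty acid metabolism",
    "orf3": "carbon compound catabolism",
    "orf4": "cell processes",
    "orf5": "amino acid biosynthesis",
}
sope_detected = {"orf1", "orf2", "orf3"}
sodalis_detected = {"orf1", "orf4", "orf5"}

sope_only = unique_orfs_by_category(sope_detected, sodalis_detected, category_of)
sodalis_only = unique_orfs_by_category(sodalis_detected, sope_detected, category_of)

for label, counts in (("SOPE only", sope_only), ("Sodalis only", sodalis_only)):
    for category, n in sorted(counts.items()):
        print(f"{label}\t{category}\t{n}")
```

ORFs detected in both genomes (here the shared translation ORF) drop out of both tallies, so the counts reflect only differential retention.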
Utilizing the intensively studied, 250-million-year-old Buchnera-aphid association (38) as a model, some authors have suggested that significant changes in microbial genome structure occur early after symbiotic establishment (36). Large deletions, which typically span multiple genes, occur during the initiation of symbiosis, resulting not from selection for DNA loss but from decreased selection to maintain locus functionality (36, 49). Such massive genome reduction early in symbiosis is supported by near perfect gene order conservation in the whole-genome sequences of three divergent strains of Buchnera (44, 47, 49). It has been inferred that the content of these early deletions determines the degree of selection on remaining loci and ultimately governs the eventual genetic inventory of the reduced genome. Only later, at an exponentially decreasing pace, are some genes eliminated through inactivation and gradual erosion (5, 45). These losses, resulting in the reduction of microbial functional flexibility, are expected to restrict the evolutionary options for the microbes, ultimately confining them to specific symbiotic lifestyles. Such drastic genome erosion may enable the recruitment of newer symbiotic associations to replace functions lost in the ancient obligate mutualists and potentially allow hosts to exploit new niches.
Despite their close taxonomic relatedness, the genomes of SOPE and Sodalis have been shaped differently by adaptation to their unique host environments. As a result, these organisms have diverged extensively and appear to be tailored to subsist on different metabolites provided in their host diets. These findings are relevant to our applied genetic engineering studies, in which we explore the use of symbionts to block transmission of pathogens in their insect hosts. Our results suggest that the symbionts described here are anchored tightly to host biology through restricted metabolic capabilities and therefore may not be able to undergo horizontal transmission and establishment in distant insect taxa.