Repetitive and Mobile DNA
The most striking feature of the
wMel genome is the presence of very large amounts of repetitive DNA and DNA corresponding to mobile genetic elements, which is unique for an intracellular species. In total, 714 repeats of greater than 50 bp in length, which can be divided into 158 distinct families (
Table S1), were identified. Most of the repeats are present in only two copies in the genome, although 39 are present in three or more copies, with the most abundant repeat being found in 89 copies. We focused our analysis on the 138 repeats of greater than 200 bp (Table 2). These were divided into 19 families based upon sequence similarity to each other. These repeats were found to make up 14.2% of the
wMel genome. Of these repeat families, 15 correspond to likely mobile elements, including seven types of insertion sequence (IS) elements, four likely retrotransposons, and four families without detectable similarity to known elements but with many hallmarks of mobile elements (flanked by inverted repeats, present in multiple copies). One of these new elements (repeat family 8) is present in 45 copies in the genome. It is likely that many of these elements are not able to autonomously transpose since many of the transposase genes are apparently inactivated by mutations or the insertion of other transposons (
Table S2). However, some are apparently recently active since there are transposons inserted into at least nine genes (
Table S2), and the copy number of some repeats appears to be variable between
Wolbachia strains (M. Riegler et al., personal communication). Thus, many of these repetitive elements may be useful markers for strain discrimination. In addition, the mobile elements likely contribute to generating the diversity of phenotypically distinct
Wolbachia strains (e.g., mod− strains [
McGraw et al. 2001]) by altering or disrupting gene function (
Table S2).
| Table 2. wMel DNA Repeats of Greater than 200 bp |
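The repeat content reported above is a fraction of genome length covered by repeat copies. A minimal sketch of that kind of calculation follows, assuming repeat copies are supplied as 0-based, end-exclusive (start, end) coordinates; the function name and the toy coordinates are hypothetical and are not taken from the wMel annotation. Overlapping copies are merged first so that no base is counted twice.

```python
def repeat_fraction(intervals, genome_length):
    """Fraction of the genome covered by at least one repeat copy.

    intervals: iterable of (start, end) pairs, 0-based, end-exclusive.
    """
    covered = 0
    current_start = current_end = None
    for start, end in sorted(intervals):
        if current_end is None or start > current_end:
            # Disjoint from the running block: bank it and start a new one.
            if current_end is not None:
                covered += current_end - current_start
            current_start, current_end = start, end
        else:
            # Overlapping or adjacent: extend the running block.
            current_end = max(current_end, end)
    if current_end is not None:
        covered += current_end - current_start
    return covered / genome_length


# Toy example (made-up coordinates): 300 covered bases in a 1,000 bp genome.
toy = [(0, 100), (50, 200), (900, 1000)]
print(repeat_fraction(toy, 1000))  # 0.3
```

Merging before summing matters here because nested insertions (one element inserted into another) would otherwise inflate the estimate.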
Three prophage elements are present in the genome. One is a small pyocin-like element made up of nine genes (WD00565–WD00575). The other two are closely related to and exhibit extensive gene order conservation with the WO phage described from
Wolbachia sp.
wKue (
Masui et al. 2001). Thus, we have named them
wMel WO-A and WO-B, based upon their location in the genome.
wMel WO-B has undergone a major rearrangement and translocation, suggesting it is inactive. Phylogenetic analysis indicates that
wMel WO-B is more closely related to the
wKue WO than to
wMel WO-A (
Figure S1). Thus,
wMel WO-A likely represents either a separate insertion event in the
Wolbachia lineage or a duplication that occurred prior to the separation of the
wMel and
wKue lineages. Phylogenetic analysis also confirms the proposed mosaic nature of the WO phage (
Masui et al. 2001), with one block being closely related to lambdoid phage and another to P2 phage (data not shown).
Genome Structure: Rearrangements, Duplications, and Deletions
The irregular pattern of GC skew in
wMel is likely due in part to intragenomic rearrangements associated with the many DNA repeat elements. Comparison with a large contig from a
Wolbachia species that infects
Brugia malayi is consistent with this (
Ware et al. 2002). While only translocations are seen in this comparison, genetic comparisons reveal that inversions also occur between strains (
Sun et al., 2003), which is consistent with previous studies of prokaryotic genomes that have found that the most common large-scale rearrangements are inversions that are symmetric around the origin of DNA replication (
Eisen et al. 2000). The occurrence of frequent rearrangement events during
Wolbachia evolution is supported by the absence of any large-scale conserved gene order with
Rickettsia genomes. The rearrangements in
Wolbachia likely correspond with the introduction and massive expansion of the repeat element families that could serve as sites for intragenomic recombination, as has been shown to occur for some other bacterial species (
Parkhill et al. 2003). The rearrangements in
wMel may have fitness consequences since several classes of genes often found in clusters are generally scattered throughout the
wMel genome (e.g., ABC transporter subunits, Sec secretion genes, rRNA genes, F-type ATPase genes).
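GC skew is conventionally computed per window as (G − C)/(G + C), and a running sum of the window values makes the overall pattern easier to read: a genome with few rearrangements typically shows one minimum and one maximum, whereas many rearrangements produce an irregular curve. The following is a minimal sketch under those standard definitions; the function names, window size, and toy sequence are illustrative assumptions, not the parameters used for wMel.

```python
def gc_skew_windows(seq, window=1000, step=1000):
    """Return (G - C) / (G + C) for successive windows along seq."""
    seq = seq.upper()
    skews = []
    for i in range(0, len(seq) - window + 1, step):
        chunk = seq[i:i + window]
        g, c = chunk.count("G"), chunk.count("C")
        # Windows with no G or C are assigned a skew of zero.
        skews.append((g - c) / (g + c) if g + c else 0.0)
    return skews


def cumulative(values):
    """Running sum of the window skews."""
    total, out = 0.0, []
    for v in values:
        total += v
        out.append(total)
    return out


# Toy sequence (made up): G-rich first half, C-rich second half.
toy = "G" * 10 + "C" * 10
print(cumulative(gc_skew_windows(toy, window=5, step=5)))  # [1.0, 2.0, 1.0, 0.0]
```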
Although the common ancestor of
Wolbachia and
Rickettsia likely already had a reduced, streamlined genome,
wMel has lost additional genes since that time (
Table S3). Many of these recent losses are of genes involved in cell envelope biogenesis in other species, including most of the machinery for producing lipopolysaccharide (LPS) components and the alanine racemase that supplies D-alanine for cell wall synthesis. In addition, some other genes that may have once been involved in this process are present in the genome, but defective (e.g., mannose-1-phosphate guanylyltransferase, which is split into two coding sequences [CDSs], WD1224 and WD1227, by an IS5 element) and are likely in the process of being eliminated. The loss of cell envelope biogenesis genes has also occurred during the evolution of the
Buchnera endosymbionts of aphids (
Shigenobu et al. 2000;
Moran and Mira 2001). Thus,
wMel and
Buchnera have lost some of the same genes separately during their reductive evolution. Such convergence means that attempts to use gene content to infer evolutionary relatedness need to be interpreted with caution. In addition, since
Anaplasma and
Ehrlichia also apparently lack genes for LPS production (
Lin and Rikihisa 2003), it is likely that the common ancestor of
Wolbachia,
Ehrlichia, and
Anaplasma was unable to synthesize LPS. Thus, the reports that
Wolbachia-derived LPS-like compounds are involved in the immunopathology of filarial nematode disease in mammals (
Taylor 2002) either indicate that these
Wolbachia have acquired genes for LPS synthesis or that the reported LPS-like compounds are not homologous to LPS.
Despite evident genome reduction in
wMel and in contrast to most small-genomed intracellular species, gene duplication appears to have continued, as over 50 gene families have apparently expanded in the
wMel lineage relative to that of all other species (
Table S4). Many of the pairs of duplicated genes are encoded next to each other in the genome, suggesting that they arose by tandem duplication events and may simply reflect transient duplications in evolution (deletion is common when there are tandem arrays of genes). Many others are components of mobile genetic elements, indicating that these elements have expanded significantly after entering the
Wolbachia evolutionary lineage. Other duplications that could contribute to the unique biological properties of
wMel include that of the mismatch repair gene
mutL (see below) and that of many hypothetical and conserved hypothetical proteins.
One duplication of particular interest is that of
wsp, which is a standard gene for strain identification and phylogenetic reconstruction in
Wolbachia (
Zhou et al. 1998). In addition to the previously described
wsp (WD0159),
wMel encodes two
wsp paralogs (WD0009 and WD0489), which we designate as
wspB and
wspC, respectively. While these paralogs are highly divergent from
wsp (protein identities of 19.7% and 23.5%, respectively) and do not amplify using the standard
wsp PCR primers (
Braig et al. 1998;
Zhou et al. 1998), their presence could lead to some confusion in classification and identification of
Wolbachia strains. This has apparently occurred in one study of
Wolbachia strain
wKueYO, for which the reported
wsp gene (gbAB045235) is actually an ortholog of
wspB (99.8% sequence identity and located at the end of the
virB operon [
Masui et al. 2000]) and not an ortholog of the
wsp gene. Considering that the
wsp gene has been extremely informative for discriminating between strains of
Wolbachia, we designed PCR primers to the
wMel
wspB gene to amplify and then sequence the orthologs from the related
wRi and
wAlbB
Wolbachia strains from
Drosophila simulans and
Aedes albopictus, respectively, as well as the
Wolbachia strain that infects the filarial nematode
Dirofilaria immitis to determine the potential utility of this locus for strain discrimination. A comparison of genetic distances between the
wsp and
wspB genes for these different taxa indicates that overall the
wspB gene appears to be evolving at a faster rate than
wsp and, as such, may be a useful additional marker for discriminating between closely related
Wolbachia strains (
Table S5).
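One simple way to compare how fast two loci are diverging between the same pair of strains is the uncorrected p-distance, the proportion of aligned sites that differ. The sketch below assumes pre-aligned sequences of equal length and ignores gapped columns; it is an illustration only and is not necessarily the distance measure used for Table S5. The function name and the toy sequences are hypothetical.

```python
def p_distance(a, b, gap="-"):
    """Proportion of differing sites between two aligned sequences.

    Columns where either sequence has a gap are skipped.
    """
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    compared = diffs = 0
    for x, y in zip(a.upper(), b.upper()):
        if x == gap or y == gap:
            continue
        compared += 1
        diffs += x != y
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return diffs / compared


# Toy alignment (made up): 1 difference over 7 ungapped columns.
print(p_distance("ATGCATGC", "ATGAAT-C"))
```

Computing this value for both loci across the same strain pairs, and then comparing the two sets of distances, gives a rough picture of which locus is accumulating differences faster.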
Inefficiency of Selection in wMel
The fraction of the genome that is repetitive DNA and the fraction that corresponds to mobile genetic elements are among the highest for any prokaryotic genome. This is particularly striking compared to the genomes of other obligate intracellular species such as
Buchnera,
Rickettsia,
Chlamydia, and
Wigglesworthia, which all have very low levels of repetitive DNA and mobile elements. The recently sequenced genome of the intracellular pathogen
Coxiella burnetii (
Seshadri et al. 2003) has both a streamlined genome and moderate amounts of repetitive DNA, although much less than
wMel. The paucity of repetitive DNA in these and other intracellular species is thought to be due to a combination of lack of exposure to other species, thereby limiting introduction of mobile elements, and genome streamlining (
Mira et al. 2001;
Moran and Mira 2001;
Frank et al. 2002). We examined the
wMel genome to try to understand the origin of the repetitive and mobile DNA and to explain why such repetitive/mobile DNA is present in
wMel, but not other streamlined intracellular species.
We propose that the mobile DNA in
wMel was acquired some time after the separation of the
Wolbachia and
Rickettsia lineages but before the radiation of the
Wolbachia group. The acquisition of these elements after the separation of the
Wolbachia and
Rickettsia lineages is suggested by the fact that most do not have any obvious homologous sequences in the genomes of other α-Proteobacteria, including the closely related
Rickettsia spp. Additional evidence for some acquisition of foreign DNA after the
Wolbachia–Rickettsia split comes from phylogenetic analysis of those genes present in
wMel, but not in the two sequenced rickettsial genomes (see
Table S3; unpublished data). The acquisition prior to the radiation of
Wolbachia is suggested by two lines of evidence. First, many of the elements are found in the genome of the distantly related
Wolbachia of the nematode
B. malayi (unpublished data). Second, genome analysis reveals that these elements do not have significantly anomalous nucleotide composition or codon usage compared to the rest of the genome. In fact, there are only four regions of the genome with significantly anomalous composition, comprising in total only approximately 17 kbp of DNA (Table 3). The lack of anomalous composition suggests either that any foreign DNA in
wMel was acquired long enough ago to allow it to “ameliorate” and become compositionally similar to endogenous
Wolbachia DNA (
Lawrence and Ochman 1997,
1998) or that any foreign DNA that is present was acquired from organisms with similar composition to endogenous
wMel genes. Owing to their potential effects on genome evolution (insertional mutagenesis, catalyzing genome rearrangements), we propose that the acquisition and maintenance of these repetitive and mobile elements by
wMel have played a key role in shaping the evolution of
Wolbachia.
| Table 3. Regions of Anomalous Nucleotide Composition in the wMel Genome |
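A rough way to flag compositionally unusual regions is to compare the GC fraction of each window with the distribution over all windows. The sketch below uses a simple z-score on GC content; this is a deliberately simplified stand-in and is not the statistic used to define the regions in Table 3. The function names, the window size, and the cutoff are assumptions chosen for illustration.

```python
from statistics import mean, pstdev


def gc_fraction(s):
    """Fraction of G and C bases in s (0.0 for an empty string)."""
    s = s.upper()
    return (s.count("G") + s.count("C")) / len(s) if s else 0.0


def anomalous_windows(seq, window, z_cutoff=2.0):
    """Return (start, gc) for non-overlapping windows whose GC fraction
    lies more than z_cutoff standard deviations from the mean window GC."""
    starts = range(0, len(seq) - window + 1, window)
    gcs = [gc_fraction(seq[i:i + window]) for i in starts]
    if not gcs:
        return []
    mu, sigma = mean(gcs), pstdev(gcs)
    if sigma == 0:
        return []  # every window has identical composition
    return [(i, gc) for i, gc in zip(starts, gcs)
            if abs(gc - mu) / sigma > z_cutoff]


# Toy genome (made up): nine AT-rich windows and one GC-only window.
toy = "ATATATATGC" * 4 + "GCGCGCGCGC" + "ATATATATGC" * 5
print(anomalous_windows(toy, window=10))  # [(40, 1.0)]
```

Note that a region acquired long ago, or from a donor with similar composition, would not be flagged by this kind of test, which is the caveat raised in the text above.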
It is likely that much of the mobile/repetitive DNA was introduced via phage, given that three prophage elements are present; experimental studies have shown active phage in some
Wolbachia (
Masui et al. 2001) and
Wolbachia superinfections occur in many hosts (e.g.,
Jamnongluk et al. 2002), which would allow phage to move between strains. Whatever the mechanism of introduction, the persistence of the repetitive elements in
wMel in the face of apparently strong pressures for streamlining is intriguing. One explanation is that
wMel may be getting a steady infusion of mobile elements from other
Wolbachia strains to counteract the elimination of elements by selection for genome streamlining. This would explain the absence of anomalous nucleotide composition of the elements. However, we believe that a major contributing factor to the presence of all the repetitive/mobile DNA in
wMel is that
wMel and possibly
Wolbachia in general experience generally inefficient natural selection relative to other species. This inefficiency would limit the ability to eliminate repetitive DNA. A general inefficiency of natural selection (especially purifying selection) has been suggested previously for intracellular bacteria, based in part on observations that these bacteria have higher evolutionary rates than free-living bacteria (e.g.,
Moran 1996). We also find a higher evolutionary rate for
wMel than that of the closely related intracellular
Rickettsia, which themselves have higher rates than free-living α-Proteobacteria. Additionally, codon bias in
wMel appears to be driven more by mutation or drift than selection (
Figure S2), as has been reported for
Buchnera species and was suggested to be due to inefficient purifying selection (
Wernegreen and Moran 1999). Such inefficiencies of natural selection are generally due to an increase in the relative contribution of genetic drift and mutation as compared to natural selection (
Eiglmeier et al. 2001;
Lawrence 2001;
Parkhill et al. 2001). Below we discuss different possible explanations for the inefficiency of selection in
wMel, especially in comparison to other intracellular bacteria.
Low rates of recombination, such as occur in centromeres and the human Y chromosome, can lead to inefficient selection because of the linkage among genes. This has been suggested to be occurring in
Buchnera species because these species do not encode homologs of RecA, which is the key protein in homologous recombination in most species (
Shigenobu et al. 2000). The absence of recombination in
Buchnera is supported by the lack of genome rearrangements in their recent evolution (
Tamas et al. 2002). Additionally, there is apparently little or no gene flow into
Buchnera strains. In contrast,
wMel encodes the necessary machinery for recombination, including RecA (
Table S6), and has experienced both extensive intragenomic homologous recombination and introduction of foreign DNA. Therefore, the unusual genome features of
wMel are unlikely to be due to low levels of recombination.
Another possible explanation for inefficient selection is high mutation rates. It has been suggested that the higher evolutionary rates in intracellular bacteria are the result of high mutation rates that are in turn due to the loss of genes for DNA repair processes (e.g.,
Itoh et al. 2002). This is likely not the case in
wMel since its genome encodes proteins corresponding to a broad suite of DNA repair pathways including mismatch repair, nucleotide excision repair, base excision repair, and homologous recombination (
Table S6). The only noteworthy DNA repair gene absent from
wMel and present in the more slowly evolving
Rickettsia is
mfd, which is involved in targeting DNA repair to the transcribed strand of actively transcribing genes in other species (
Selby et al. 1991). However, this absence is unlikely to contribute significantly to the increased evolutionary rate in
wMel, since defects in
mfd do not lead to large increases in mutation rates in other species (
Witkin 1994). The presence of mismatch repair genes (homologs of
mutS and
mutL) in
wMel is particularly relevant since this pathway is one of the key steps in regulating mutation rates in other species. In fact,
wMel is the first bacterial species to be found with two
mutL homologs. Overall, examination of the predicted DNA repair capabilities of bacteria (
Eisen and Hanawalt 1999) suggests that the connection between evolutionary rates in intracellular species and the loss of DNA repair processes is spurious. While many intracellular species have lost DNA repair genes in their recent evolution, different species have lost different genes and some, such as
wMel and
Buchnera spp., have kept the genes that likely regulate mutation rates. In addition, some free-living species without high evolutionary rates have lost some of the same pathways lost in intracellular species, while many free-living species have lost key pathways resulting in high mutation rates (e.g.,
Helicobacter pylori has apparently lost mismatch repair [
Eisen 1997,
Eisen 1998b;
Bjorkholm et al. 2001]). Given that intracellular species tend to have small genomes and have lost genes from every type of biological process, it is not surprising that many of them have lost DNA repair genes as well.
We believe that the most likely explanations for the inefficiency of selection in
wMel involve population-size related factors, such as genetic drift and the occurrence of population bottlenecks. Such factors have also been shown to likely explain the high evolutionary rates in other intracellular species (
Moran 1996;
Moran and Mira 2001;
van Ham et al. 2003).
Wolbachia likely experience frequent population bottlenecks both during transovarial transmission (
Boyle et al. 1993) and during cytoplasmic incompatibility mediated sweeps through host populations. The extent of these bottlenecks may be greater than in other intracellular bacteria, which would explain why
wMel has both more repetitive and mobile DNA than other such species and a higher evolutionary rate than even the related
Rickettsia spp. Additional genome sequences from other
Wolbachia will reveal whether this is a feature of all
Wolbachia or only certain strains.
Mitochondrial Evolution
In our analysis of complete genomes, including that of
wMel, the first non-
Rickettsia member of the Rickettsiales order to have its genome completed, we find support for a grouping of
Wolbachia and
Rickettsia to the exclusion of the mitochondria, but not for placing the mitochondria within the Rickettsiales order (
Table S7;
Table S8). Specifically, phylogenetic trees of a concatenated alignment of 32 proteins show strong support with all methods (see
Table S7) for common branching of: (i) mitochondria, (ii)
Rickettsia with
Wolbachia, (iii) the free-living α-Proteobacteria, and (iv) mitochondria within α-Proteobacteria. Since amino acid content bias was very severe in these datasets, protein LogDet analyses, which can correct for the bias, were also performed. In LogDet analyses of the concatenated protein alignment, both including and excluding highly biased positions, mitochondria usually branched basal to the
Wolbachia–Rickettsia clade, but never specifically with
Rickettsia (see
Table S7). In addition, in phylogenetic studies of individual genes, there was no consistent phylogenetic position of mitochondrial proteins with any particular species or group within the α-Proteobacteria (see
Table S8), although support for a specific branch uniting the two
Rickettsia species with
Wolbachia was quite strong. Eight of the proteins from mitochondrial genomes (YejW, SecY, Rps8, Rps2, Rps10, RpoA, Rpl15, Rpl32) do not even branch within the α-Proteobacteria, although these genes almost certainly were encoded in the ancestral mitochondrial genome (
Lang et al. 1997).
This analysis of mitochondrial and α-Proteobacterial genes reinforces the view that ancient protein phylogenies are inherently prone to error, most likely because current models of phylogenetic inference do not accurately reflect the true evolutionary processes underlying the differences observed in contemporary amino acid sequences (
Penny et al. 2001). These conflicting results regarding the precise position of mitochondria within the α-Proteobacteria can be seen in the high amount of networking in the Neighbor-Net graph of the analyses of the concatenated alignment. An important complication in studies of mitochondrial evolution lies in identifying “α-Proteobacterial” genes for comparison (
Martin 1999). For example, in our analyses, proteins from
Magnetococcus branched with other α-Proteobacterial homologs in only 17 of the 49 proteins studied, and in five cases they assumed a position basal to α-, β-, and γ-Proteobacterial homologs.
Host–Symbiont Gene Transfers
Many genes that were once encoded in mitochondrial genomes have been transferred into the host nuclear genomes. Searching for such genes has been complicated by the fact that many of the transfer events happened early in eukaryotic evolution and that there are frequently extreme amino acid and nucleotide composition biases in mitochondrial genomes (see above). We used the
wMel genome to search for additional possible mitochondrial-derived genes in eukaryotic nuclear genomes. Specifically, we constructed phylogenetic trees for
wMel genes that are not in either
Rickettsia genome. Five new eukaryotic genes of possible mitochondrial origin were identified: three genes involved in de novo nucleotide biosynthesis (
purD,
purM,
pyrD) and two conserved hypothetical proteins (WD1005, WD0724). The α-Proteobacterial origin of these genes suggests that at least some of the genes of the de novo nucleotide synthesis pathway in eukaryotes might have been laterally acquired from bacteria via the mitochondria. The presence of such genes in other Proteobacteria suggests that their absence from
Rickettsia is due to gene loss (
Gray et al. 2001). This finding supports the need for additional α-Proteobacterial genomes to identify mitochondrion-derived genes in eukaryotes.
While organelle to nuclear gene transfers are generally accepted, there is a great deal of controversy over whether other gene transfers have occurred from bacteria into animals. In particular, claims of transfer from bacteria into the human genome (
Lander et al. 2001) were later shown to be false (
Roelofs and Van Haastert 2001;
Salzberg et al. 2001;
Stanhope et al. 2001).
Wolbachia are excellent candidates for such transfer events since they live inside the germ cells, which would allow lateral transfers to the host to be transmitted to subsequent host generations. Consistent with this, a recent study has shown some evidence for the presence of
Wolbachia-like genes in a beetle genome (
Kondo et al. 2002). The symbiosis between
wMel and
D. melanogaster provides an ideal case to search for such transfers since we have the complete genomes of both the host and symbiont. Using BLASTN searches and MUMmer alignments, we did not find any examples of highly similar stretches of DNA shared between the two species. In addition, protein-level searches and phylogenetic trees did not identify any specific relationships between
wMel and
D. melanogaster for any genes. Thus, at least for this host–symbiont association, we do not find any likely cases of recent gene exchange, with genes being maintained in both host and symbiont. In addition, in our phylogenetic analyses, we did not find any examples of
wMel proteins branching specifically with proteins from any invertebrate to the exclusion of other eukaryotes. Therefore, at least for the genes in
wMel, we do not find evidence for transfer of
Wolbachia genes into any invertebrate genome.
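The DNA-level screen described above was done with BLASTN and MUMmer. As a much simpler illustration of the underlying idea, the sketch below looks for exact shared k-mers between two sequences on both strands. It is a toy stand-in, not a reimplementation of either tool, and the function names, sequences, and value of k are hypothetical.

```python
def reverse_complement(seq):
    """Reverse complement of a DNA string; unknown symbols become N."""
    comp = {"A": "T", "C": "G", "G": "C", "T": "A", "N": "N"}
    return "".join(comp.get(base, "N") for base in reversed(seq.upper()))


def shared_kmers(a, b, k):
    """Return (pos_in_a, pos_in_query, strand) for every exact k-mer match.

    For strand "-", pos_in_query is the position within the reverse
    complement of b, not within b itself.
    """
    a, b = a.upper(), b.upper()
    index = {}
    for i in range(len(a) - k + 1):
        index.setdefault(a[i:i + k], []).append(i)
    hits = []
    for strand, query in (("+", b), ("-", reverse_complement(b))):
        for j in range(len(query) - k + 1):
            for i in index.get(query[j:j + k], ()):
                hits.append((i, j, strand))
    return hits


# Toy sequences (made up) sharing the 7-mer GATTACA on the forward strand.
print(shared_kmers("AAAAGATTACACCCC", "GGGATTACAGG", 7))  # [(4, 2, '+')]
```

An empty result from a screen like this only rules out long, nearly identical stretches; diverged transfers would require the protein-level and phylogenetic searches described in the text.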
Metabolism and Transport
wMel is predicted to have very limited capabilities for membrane transport, for substrate utilization, and for the biosynthesis of metabolic intermediates (
Figure S3), similar to what has been seen in other intracellular symbionts and pathogens (
Paulsen et al. 2000). Almost all of the identifiable uptake systems for organic nutrients in
wMel are for amino acids, including predicted transporters for proline, aspartate/glutamate, and alanine. This pattern of transporters, coupled with the presence of pathways for the metabolism of the amino acids cysteine, glutamate, glutamine, proline, serine, and threonine, suggests that
wMel may obtain much of its energy from amino acids. These amino acids could also serve as material for the production of other amino acids. In contrast, carbohydrate metabolism in
wMel appears to be limited. The only pathways that appear to be complete are the tricarboxylic acid cycle, the nonoxidative pentose phosphate pathway, and glycolysis, starting with fructose-1,6-bisphosphate. The limited carbohydrate metabolism is consistent with the presence of only one sugar phosphate transporter.
wMel can also apparently transport a range of inorganic ions, although two of these systems, for potassium uptake and sodium ion/proton exchange, are frameshifted. In the latter case, two other sodium ion/proton exchangers may be able to compensate for this defect.
Many of the predicted metabolic properties of
wMel, such as the focus on amino acid transport and the presence of limited carbohydrate metabolism, are similar to those found in
Rickettsia. A major difference with the
Rickettsia spp. is the absence of the ADP–ATP exchanger protein in
wMel. In
Rickettsia this protein is used to import ATP from the host, thus allowing these species to be direct energy scavengers (
Andersson et al. 1998). This likely explains the presence of glycolysis in
wMel but not
Rickettsia. An inability to obtain ATP from its host also helps explain the presence of pathways for the synthesis of the purines AMP, IMP, XMP, and GMP in
wMel but not
Rickettsia. Other pathways present in
wMel but not
Rickettsia include threonine degradation (described above), riboflavin biosynthesis, pyrimidine metabolism (i.e., from PRPP to UMP), and chelated iron uptake (using a single ABC transporter). The two
Rickettsia species have a relatively large complement of predicted transporters for osmoprotectants, such as proline and glycine betaine, whereas
wMel possesses only two of these systems.
Host–Symbiont Interactions
The mechanisms by which
Wolbachia infect host cells and by which they cause the diverse phenotypic effects on host reproduction and fitness are poorly understood, and the
wMel genome helps identify potential contributing factors. A complete Type IV secretion system, portions of which have been reported in earlier studies, is present. The complete genome sequence shows that in addition to the five
vir genes previously described from
Wolbachia wKueYO (
Masui et al. 2001), an additional four are present in
wMel. Of the nine
wMel
vir ORFs, eight are arranged into two separate operons. Similar to the single operon identified in
wTai and
wKueYO, the
wMel
virB8,
virB9,
virB10,
virB11, and
virD4 CDSs are adjacent to
wspB, forming a 7 kb operon (WD0004–WD0009). The second operon contains
virB3,
virB4, and
virB6 as well as four additional non-
vir CDSs, including three putative membrane-spanning proteins, that form part of a 15.7 kb operon (WD0859–WD0853). Examination of the
Rickettsia conorii genome shows a similar organization. The observed conserved gene order for these genes between these two genomes suggests that the putative membrane-spanning proteins could form a novel and, possibly, integral part of a functioning Type IV secretion system within these bacteria. Moreover, reverse transcription (RT)-PCRs have confirmed that
wspB and WD0853–WD0856 are each expressed as part of the two
vir operons and further indicate that these additional encoded proteins are novel components of the
Wolbachia Type IV secretion system.
In addition to the two major
vir clusters, a paralog of
virB8 (WD0817) is also present in the
wMel genome. This paralog is quite divergent from
virB8 and, as such, does not appear to have resulted from a recent gene duplication event. RT-PCR experiments have failed to show expression of this CDS in
wMel-infected
Drosophila (data not shown). PCR primers were designed to all CDSs of the
wMel Type IV secretion system and used to successfully amplify orthologs from the divergent
Wolbachia strains
wRi and
wAlbB (data not shown). We were able to detect orthologs to all of the
wMel Type IV secretion system components as well as most of the adjacent non-
vir CDSs, suggesting that this system is conserved across a range of A- and B-group
Wolbachia. An increasing body of evidence has highlighted the importance of Type IV secretion systems for the successful infection, invasion, and persistence of intracellular bacteria within their hosts (
Christie 2001;
Sexton and Vogel 2002). It is likely that the Type IV system in
Wolbachia plays a role in the establishment and maintenance of infection and possibly in the generation of reproductive phenotypes.
Genes involved in pathogenicity in bacteria have been found to be frequently associated with regions of anomalous nucleotide composition, possibly owing to transfer from other species or insertion into the genome from plasmids or phage. In the four such regions in wMel (see above), some additional candidates for pathogenicity-related activities are present, including a putative penicillin-binding protein (WD0719), genes predicted to be involved in cell wall synthesis (WD0095–WD0098, including D-alanine-D-alanine ligase, a putative FtsQ, and D-alanyl-D-alanine carboxypeptidase) and a multidrug resistance protein (WD0099). In addition, we have identified a cluster of genes in one of the phage regions that may also have some role in host–symbiont interactions. This cluster (WD0611–WD0621) is embedded within the WO-B phage region of the genome and contains many genes that encode proteins with putative roles in the synthesis and degradation of surface polysaccharides, including a UDP-glucose 6-dehydrogenase (WD0620). Since this cluster appears to be normal in terms of phylogeny relative to other genes in the genome (i.e., the genes in this region have normal wMel nucleotide composition and branch in phylogenetic trees with genes from other α-Proteobacteria), it is not likely to have been acquired from other species. However, it is possible that these genes can be transferred among Wolbachia strains via the phage, which in turn could lead to some variation in host–symbiont interactions between Wolbachia strains.
Of particular interest for host-interaction functions are the large number of genes that encode proteins that contain ankyrin repeats (Table 4). Ankyrin repeats, a tandem motif of around 33 amino acids, are found mainly in eukaryotic proteins, where they are known to mediate protein–protein interactions (
Caturegli et al. 2000). While they have been found in bacteria before, they are usually present in only a few copies per species.
wMel has 23 ankyrin repeat-containing genes, the most currently described for a prokaryote, with
C. burnetii being next with 13. This is particularly striking given
wMel's relatively small genome size. The functions of the ankyrin repeat-containing proteins in
wMel are difficult to predict since most have no sequence similarity outside the ankyrin domains to any proteins of known function. Many lines of evidence suggest that the
wMel ankyrin domain proteins are involved in regulating host cell-cycle or cell division or interacting with the host cytoskeleton: (i) many ankyrin-containing proteins in eukaryotes are thought to be involved in linking membrane proteins to the cytoskeleton (
Hryniewicz-Jankowska et al. 2002); (ii) an ankyrin-repeat protein of
Ehrlichia phagocytophila binds condensed chromatin of host cells and may be involved in host cell-cycle regulation (
Caturegli et al. 2000); (iii) some of the proteins that modify the activity of cell-cycle-regulating proteins in
D. melanogaster contain ankyrin repeats (
Elfring et al. 1997); and (iv) the
Wolbachia strain that infects the wasp
Nasonia vitripennis induces cytoplasmic incompatibility, likely by interacting with these same cell-cycle proteins (
Tram and Sullivan 2002). Of the ankyrin-containing proteins in
wMel, those worth exploring in more detail include the several that are predicted to be surface targeted or secreted and thus could be targeted to the host nucleus. It is also possible that some of the other ankyrin-containing proteins are secreted via the Type IV secretion system in a targeting-signal-independent pathway. We call particular attention to three of the ankyrin-containing proteins (WD0285, WD0636, and WD0637), which are among the very few genes, other than those encoding components of the translation apparatus, that have significantly biased codon usage relative to what is expected based on GC content, suggesting they may be highly expressed.
| Table 4. Ankyrin-Domain Containing Proteins Encoded by the wMel Genome |
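Codon usage bias for a single gene can be summarized with relative synonymous codon usage (RSCU): the observed count of a codon divided by the count expected if all synonymous codons for that amino acid were used equally. The sketch below computes RSCU under the standard codon assignments (shared by the bacterial translation table). It is a simpler statistic than the GC-corrected test referred to above and is offered only as an illustration; the function name and the toy coding sequence are hypothetical.

```python
from collections import Counter
from itertools import product

# Standard genetic code in TCAG order; "*" marks stop codons.
BASES = "TCAG"
AMINO = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
         "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODON_TABLE = {"".join(c): aa
               for c, aa in zip(product(BASES, repeat=3), AMINO)}


def rscu(cds):
    """RSCU for each codon of every amino acid that occurs in cds
    and has more than one synonymous codon."""
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3  # ignore a trailing partial codon
    counts = Counter(cds[i:i + 3] for i in range(0, usable, 3))
    families = {}
    for codon, aa in CODON_TABLE.items():
        if aa != "*":
            families.setdefault(aa, []).append(codon)
    values = {}
    for aa, codons in families.items():
        total = sum(counts[c] for c in codons)
        if total == 0 or len(codons) == 1:
            continue
        for c in codons:
            values[c] = counts[c] * len(codons) / total
    return values


# Toy coding sequence (made up): lysine encoded three times by AAA, once by AAG.
print(rscu("AAAAAAAAAAAG"))
```

Values well above 1 mark preferred codons; whether such a preference reflects selection for expression or simply base composition requires a comparison against a composition-based expectation, as in the analysis cited in the text.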