We have performed a broad comparison of 17 Francisella genomes to infer past evolutionary events and possible ecological adaptations that distinguish the primarily human pathogenic F. tularensis from its less virulent opportunistic neighbors, F. novicida and F. philomiragia. Our analysis provides support for a proposed evolutionary scenario of events that formed F. tularensis. The analysis also offers important clarifications of several uncertainties and ambiguities regarding F. tularensis, more specifically concerning: (i) the phylogenetic origin of F. tularensis subsp. mediasiatica, (ii) the suggested dependence of virulence on the mere occurrence of specific genes, (iii) the occurrence of genetic recombination in F. tularensis, and (iv) the previously suggested lateral mobility of the FPI in F. tularensis. In addition, the results provide evolutionary data indicating that strains of F. novicida should continue to be regarded as a separate species.
Based on our results, we propose the following sequence of events. F. tularensis emerged from a recombining Francisella population that was relatively unrestricted or free-living; an ancestral F. tularensis variant then invaded a novel and more host-restricted niche. This event led to clonal evolution under reduced purifying selection pressure, with ensuing genome degradation and proliferation of insertion sequence (IS) elements. We further propose that this series of events provided important prerequisites for further alterations of genomic architecture, and possibly increased the adaptability of F. tularensis. In support of this scenario, we found apparent differences in evolutionary mode between basal Francisella lineages (including F. philomiragia, F. novicida and an ancestral F. tularensis lineage) and F. tularensis lineages. These differences include frequent genetic exchanges among basal Francisella lineages but not among F. tularensis lineages, strong purifying selection pressure in basal lineages but weak purifying selection in F. tularensis, and increasing frequencies of adenine and thymine nucleotides in F. tularensis but not in F. novicida genomes (Table S3). The pronounced increase in G+C→A+T mutations in F. tularensis supports a link to weak purifying selection, allowing for the fixation of slightly deleterious mutations. In agreement with this interpretation, Balbi et al. recently found inefficient purifying selection to be intimately connected with adenine and thymine enrichment in Shigella.
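The directional bias described above (an excess of G+C→A+T over A+T→G+C changes) can be illustrated with a minimal sketch. It assumes two pre-aligned sequences of equal length, one taken as the inferred ancestral state and one as the derived state; the function name and the toy sequences are hypothetical and are not taken from this study's pipeline.

```python
def count_directional_changes(ancestral: str, derived: str) -> dict:
    """Count substitutions by direction between two aligned sequences.

    Assumption: `ancestral` is an inferred ancestral state and `derived`
    a descendant sequence, aligned column by column.
    """
    if len(ancestral) != len(derived):
        raise ValueError("sequences must be aligned to the same length")

    strong = {"G", "C"}  # G and C bases
    weak = {"A", "T"}    # A and T bases
    counts = {"GC_to_AT": 0, "AT_to_GC": 0, "other": 0}

    for a, d in zip(ancestral.upper(), derived.upper()):
        # Skip identical columns, gaps, and ambiguous bases.
        if a == d or a not in "ACGT" or d not in "ACGT":
            continue
        if a in strong and d in weak:
            counts["GC_to_AT"] += 1
        elif a in weak and d in strong:
            counts["AT_to_GC"] += 1
        else:
            counts["other"] += 1  # G<->C or A<->T, no net change in A+T content
    return counts


# Toy alignment (hypothetical data): two G/C->A/T changes, one A/T->G/C change.
example = count_directional_changes("GCGCATAT", "ACGTATAG")
```

A ratio of `GC_to_AT` to `AT_to_GC` well above one, after correcting for base composition, would be the kind of signal the text attributes to weak purifying selection; the sketch only does the raw counting.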
Differences in genomic architecture were also apparent. While genome erosion appears to have occurred in all F. tularensis genomes, the representatives of basal lineages have maintained genomes tightly packed with genes. Corroborating a previous genomic analysis of F. novicida strain U112, the unfinished genomic sequences of F. novicida strain GA99-3548 and strain GA99-3549, as well as the completed F. philomiragia strain ATCC 25017, were found to contain few IS elements and pseudogenes. The overall gene synteny was extensive among the three available F. novicida genomes (data not shown). In contrast, all F. tularensis genomes are crowded with IS elements and pseudogenes, and display highly rearranged gene orders, each corresponding to a subspecies or a major genetic lineage. The findings in this study thus indicate that F. novicida has remained relatively unchanged over a long period with respect to gene content, presence of IS elements, and gene order. If so, the genomic architecture of the ancestor of F. tularensis must have more closely resembled F. novicida than any current F. tularensis isolate. Genomic data therefore indicate that the deviating evolutionary patterns in F. tularensis represent a derived state.
The greater metabolic competence of F. novicida compared to F. tularensis, and the abundance of IS elements in F. tularensis (but not F. novicida), provide additional indirect support for a change of living habitat. The genomic erosion identified in F. tularensis is consistent with its occupation of a habitat that supplies nutrients, making some metabolic functions superfluous. Host-pathogen or recent symbiotic restrictions appear to have been similarly associated with genome erosion and proliferation of IS elements in several other organisms, e.g. Yersinia pestis, Bordetella pertussis, and bacterial endosymbionts of insects. Generally, IS element expansions in host-restricted bacteria are considered to be consequences of reductions in effective population size and relaxed purifying selection, which provide opportunities for insertions. Supporting this hypothesis in F. tularensis is the bacterium's exceptionally high infectiousness, 10–25 cfu being sufficient to cause disease in humans, a trait consistent with repeated population contractions during infection of hosts.
Assuming that IS elements proliferate as a result of reduced selection pressure, it follows that this is a neutral process that in itself provides no advantage for the bacterium. Neutral, random insertion of IS elements likely provided the necessary raw material for secondary pathoadaptive mutations in F. tularensis. Two genetic loci were found to be multiplied in all F. tularensis genomes by an IS element-mediated process, and both represent functions of central importance to the pathogen. The first locus corresponds to the FPI, a critical virulence determinant recognized for its importance for phagosomal escape. The other locus contains a hypothetical glycosyltransferase gene, which we here demonstrate has been under strong adaptive selection. F. tularensis may therefore provide an example of an organism for which random genetic drift, with consequent fixation of many neutral or slightly deleterious mutations, provided novel evolutionary opportunities. Although our data do not provide definitive proof, we propose that secondary gene multiplications enabled by past random IS element insertions represent examples of adaptively selected traits of the bacterium. In line with arguments recently advanced by Lynch, our data suggest that an accumulation of mutations that were originally neutral or slightly deleterious to the organism in the short term proved to be fruitful in the long term when exploited by natural selection.
As mentioned above, the data presented here also offer possible clarifications of several uncertain aspects and ambiguities regarding F. tularensis.
- One such ambiguity concerns the phylogenetic origin of F. tularensis subsp. mediasiatica. We here identified F. tularensis subsp. mediasiatica strain FSC147 as a monophyletic F. tularensis taxon. This conflicts with observations by Nübel et al., who found (using multilocus sequence typing) that F. tularensis subsp. mediasiatica strain FSC147 (denoted F68 in their study) is not a member of the F. tularensis clade, but instead is associated with environmental Francisella isolates. The finding that the subspecies mediasiatica is “phylogenetically incoherent” was a central conclusion in their work. However, we found that several gene fragment sequences for FSC147 deposited in GenBank by Nübel et al. differ from the genomic sequence of this strain, but coincide with F. novicida sequences. Thus, their conclusion of a polyphyletic origin of the subspecies mediasiatica requires re-appraisal.
- Another uncertainty concerns the suggested dependence of virulence on the mere presence of specific virulence genes. Given the high genetic similarity between the members of subspp. tularensis and mediasiatica, there is an intriguing difference in virulence between the two subspecies. In a recent genome comparison, Rohmer et al. suggested a set of nine genes to be candidate mediators of the high virulence of subsp. tularensis. Our observations of gene content in the various subspecies of F. tularensis, including F. tularensis subsp. mediasiatica, provide little support for the hypothesis that any of these genes explains the higher degree of virulence of the tularensis subspecies. Our analyses indicate instead that these particular gene differences exemplify a superfluous gene set that is common to all F. tularensis lineages and is not yet completely inactivated in subspp. tularensis and mediasiatica (Figure S1). Further, we found evidence for substantially reduced purifying selection in F. tularensis, implying that its evolution has been strongly affected by random genetic drift. These findings do not exclude pathoadaptation of individual lineages. Possibly, gene silencing in subsp. tularensis has promoted virulence, as has been suggested for the host adaptation of Shigella and Salmonella, in which deletion mutants of specific genes have produced phenotypes of increased virulence. In F. novicida, there is a parallel example, since silencing the pepO gene promoted virulence in a mouse model. Moreover, it is possible that virulence alterations resulted from genomic rearrangements during the formation of subspecies, affecting transcriptional networks.
- A third uncertainty regards the suggested occurrence of genetic recombination in F. tularensis. Among lineages within F. tularensis, we found no evidence of past recombination events. Our recombination analyses suggest that the few homoplasies detected in F. tularensis instead likely arose as a consequence of mutational biases in F. tularensis (Table S3). In contrast, our comparative genome sequence data show that recombination events have been common features of the evolution of all environmental lineages, here represented by F. novicida U112, the novicida-like strains GA99-3548 and GA99-3549, and the F. philomiragia strain ATCC 25017. Visual sequence analysis, recombination rates inferred by ClonalFrame, and Hudson's Rm all indicate substantial recombination rates. Since Rm represents a lower bound and ClonalFrame only models recombination “imports”, both methods likely underestimate the true number of recombination events.
Results of test for recombination among 13 F. tularensis genomes.
- A fourth uncertainty concerns the suggested lateral mobility of the FPI in F. tularensis. We found that the FPI is ubiquitous across all investigated genomes, implying that it was incorporated at an early stage into a Francisella ancestor. It is likely that acquisition of the FPI genes (which are now permanently integrated in duplicate copies in the chromosome of all F. tularensis lineages) was an important event for early host adaptation of Francisella. We found no genetic traces of recent extra-chromosomal mobilization of the FPI in F. tularensis; instead, these genes seem to have evolved into a duplicated part of the core genome.
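The remark in the list above that Hudson's Rm is only a lower bound can be made concrete with a small sketch of the four-gamete test, on which Rm is based. This is an illustrative simplification under stated assumptions, not the software used in this study: haplotypes are given as equal-length strings, only biallelic sites are considered, and the count is taken as the largest set of non-overlapping incompatible intervals, found greedily. The function names are hypothetical.

```python
from itertools import combinations


def four_gamete_incompatible(site_a, site_b) -> bool:
    """True if two biallelic sites show all four allele combinations,
    which (under an infinite-sites assumption) implies recombination."""
    return len(set(zip(site_a, site_b))) == 4


def rm_lower_bound(haplotypes) -> int:
    """Greedy count of disjoint incompatible intervals.

    Assumption: this mirrors the idea behind Hudson and Kaplan's Rm;
    intervals are treated as open, so (i, j) and (j, k) do not overlap.
    """
    n_sites = len(haplotypes[0])
    if any(len(h) != n_sites for h in haplotypes):
        raise ValueError("haplotypes must have equal length")

    columns = [[h[i] for h in haplotypes] for i in range(n_sites)]
    biallelic = [i for i, col in enumerate(columns) if len(set(col)) == 2]

    # Every incompatible pair of sites defines an interval that must
    # contain at least one recombination event.
    intervals = [
        (i, j)
        for i, j in combinations(biallelic, 2)
        if four_gamete_incompatible(columns[i], columns[j])
    ]

    # Classic interval scheduling: sort by right end, keep disjoint ones.
    intervals.sort(key=lambda iv: iv[1])
    count, last_end = 0, -1
    for start, end in intervals:
        if start >= last_end:
            count += 1
            last_end = end
    return count


# Toy haplotypes (hypothetical data, not from the analyzed genomes).
clonal = rm_lower_bound(["00", "00", "11", "11"])
recombining = rm_lower_bound(["000", "011", "101", "110"])
```

Because several recombination events inside one interval are counted once, and events that leave no four-gamete signature are not counted at all, the value can only underestimate the true number of events, which is the caveat the text raises.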
Finally, the results provide compelling arguments in favour of continuing to regard strains of F. novicida as belonging to a separate species. In agreement with a previous proposal by Hollis et al., based on DNA-DNA re-association, our ANI analysis indicates that strains belonging to F. novicida meet formal requirements for classification as a F. tularensis subspecies. All pairs of isolates classified as F. novicida-like and F. tularensis demonstrated ANI values well above 95% (Table S1), a limit proposed as the threshold for classification into different bacterial species. According to the method-free species concept recently outlined by Wagner and Achtman, however, species should be regarded as “metapopulation lineages”, where separate designations are warranted if population lineages evolved separately despite a close relatedness. Our comparisons of environmental lineages (F. novicida, F. philomiragia) and F. tularensis show a typical example of such evolutionary separation. In addition to distinct population structures with regard to recombination, we also found substantial differences in overall dN/dS between environmental Francisella and F. tularensis, lending support to smaller effective population sizes in the latter. Other differences between environmental Francisella and F. tularensis include differences in metabolic competence, which is higher among environmental strains, and signs of ongoing genome erosion, which is pronounced among F. tularensis strains but not among the analyzed F. philomiragia and F. novicida.
It is also clear that tularemia caused by F. tularensis is a distinct clinical disease entity with little similarity to the bacteraemia caused by F. novicida. Moreover, tularemia is a classical vector-borne zoonosis while F. novicida is not known to be transmitted among vertebrate species, and F. tularensis is considered a biothreat agent while F. novicida is not. A fuzzy distinction between these quite different organisms may therefore complicate clinical decisions. Based on the evolutionary analyses described in this work, the distinct epidemiological features of the two organisms, and clinical grounds, we propose that the species boundary between F. tularensis and F. novicida should be retained, even though their average nucleotide identities exceed 97%.
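The ANI threshold argument discussed above can be sketched in a few lines. This is a deliberately simplified illustration: published ANI procedures cut one genome into fragments and align them against the other genome with a search tool, whereas this sketch assumes the fragment pairs are already aligned and of equal length. The function names and toy fragments are hypothetical; only the 95% species threshold is taken from the text.

```python
SPECIES_THRESHOLD = 95.0  # percent ANI, the limit cited in the text


def fragment_identity(a: str, b: str) -> float:
    """Fraction of identical bases over aligned, non-gap columns."""
    if len(a) != len(b):
        raise ValueError("aligned fragments must have equal length")
    compared = matches = 0
    for x, y in zip(a.upper(), b.upper()):
        if x == "-" or y == "-":
            continue  # ignore alignment gaps
        compared += 1
        matches += x == y
    return matches / compared if compared else 0.0


def average_nucleotide_identity(aligned_pairs) -> float:
    """Mean per-fragment identity, expressed as a percentage."""
    identities = [fragment_identity(a, b) for a, b in aligned_pairs]
    if not identities:
        raise ValueError("no aligned fragments supplied")
    return 100.0 * sum(identities) / len(identities)


# Toy fragments (hypothetical data): one identical pair, one with a single mismatch.
pairs = [
    ("ACGTACGTAC", "ACGTACGTAC"),
    ("ACGTACGTAC", "ACGTACGTAT"),
]
ani = average_nucleotide_identity(pairs)
same_species_by_ani = ani >= SPECIES_THRESHOLD
```

A value above the threshold only says that the ANI criterion alone would not separate the two genomes; as argued above, recombination patterns, dN/dS, and ecology can still justify keeping separate species designations.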