The genus
Brucella belongs to the class α-Proteobacteria and consists mostly of intracellular bacteria that are pathogenic in a wide range of mammalian hosts [
1]. The diseases caused by the different species and strains of the genus
Brucella are known collectively as brucellosis [
1]. Brucellosis is a contagious zoonotic disease that affects many mammals, from livestock and humans to a wide variety of marine mammals. Each species or strain, however, has a narrow host range [
1].
The
Brucella genus has traditionally been classified into six species:
B. melitensis,
B. suis,
B. abortus,
B. neotomae,
B. ovis, and
B. canis, which are reflective of host preference. In 1985, it was proposed that the six
Brucella species should be grouped as biovars of a single species based on DNA-DNA hybridization studies [
2]. The
Brucella Taxonomic Subcommittee of the International Committee on Systematics of Prokaryotes adopted this proposition. However, the international community of
Brucella researchers never accepted this change, and a return to the pre-1986 taxonomy was advocated and eventually adopted by the
Brucella Taxonomic Subcommittee [
3]. The genus
Brucella has been further expanded with a set of recently discovered species. These include
B. ceti and
B. pinnipedialis, which have been isolated from cetaceans and pinnipeds, respectively [
4].
B. microti has been isolated from the common vole [
5], and
B. inopinata was isolated from a breast implant infection in a woman with clinical signs of brucellosis [
6].
The genomes sequenced from genus
Brucella are also known to be very similar in terms of both base composition and genome size [
1]. All sequenced species have a GC content of approximately 57%, and most genomes consist of approximately 3.3 Mbp divided between two chromosomes (see Table ). No plasmids have been reported for any sequenced member of the
Brucella genus [
http://www.ncbi.nlm.nih.gov/genomes/lproks.cgi].
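The base composition figures quoted above can be reproduced from sequence data with a few lines of code. The following is a minimal sketch only; the function names and the toy sequences are illustrative assumptions and are not taken from the study. It computes the GC fraction over all replicons of a genome pooled together, ignoring ambiguous bases.

```python
def gc_content(sequence: str) -> float:
    """Return the fraction of G and C among the unambiguous bases (A, C, G, T)."""
    seq = sequence.upper()
    gc = seq.count("G") + seq.count("C")
    at = seq.count("A") + seq.count("T")
    total = gc + at
    if total == 0:
        raise ValueError("sequence contains no unambiguous bases")
    return gc / total


def genome_gc_content(replicons: list[str]) -> float:
    """GC fraction over all replicons (e.g. two chromosomes) pooled together."""
    return gc_content("".join(replicons))


# Toy sequences standing in for two chromosomes (hypothetical data).
chromosome_1 = "GCGCGATCGC"
chromosome_2 = "ATGCGC"

print(gc_content(chromosome_1))
print(genome_gc_content([chromosome_1, chromosome_2]))
```

Pooling the replicons before counting weights each chromosome by its length, which is what a genome-wide percentage implies.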
Table 1. 0th order Markov chain model based cluster groups of Brucella genomes
The first
Brucella species to be sequenced was
B. melitensis 16M (biovar 1) [
7] followed closely by
B. suis 1330 (biovar 1) [
8]. As more genomes are being sequenced, taxonomic classification of the
Brucella genus is becoming more difficult, and many different methods have been applied [
9]. The challenges involved in taxonomical classification of
Brucella spp. are largely linked to the fact that marker genes typically used for phylogenetic classification are either missing or too similar to give any meaningful results [
9,
10]. Additionally, marker-gene-based methods such as MLST and 16S rRNA analysis do not directly reflect changes in gene content and may therefore fail to capture the broader differences between species, strains, and biovars [
10,
11]. SNP analysis gives a better overview of changes happening at the genome level, but does not directly reflect changes in gene content [
10]. Hence, taxonomic classification of
Brucella spp. is a challenging task touching on difficult taxonomic and phylogenetic issues in prokaryotic species definition as a whole [
9].
The aim of this study was to examine the strengths and weaknesses of a set of methods for phylogenetic classification based on whole genome comparisons. This was carried out using a number of sequenced genomes from species and strains of the genus Brucella and the closely related genera Agrobacterium and Ochrobactrum. This study was motivated by the genomic homogeneity and the difficult phylogenetic assessment of genus Brucella. Genomic comparisons were performed using a number of different methods that reflect changes at both the proteome level and the base composition level.
The comparison methods reflecting DNA composition used in this study include oligonucleotide-based 0th, 1st, and 2nd order Markov chain genomic signature models (ZOM, FOM, and SOM, respectively) [
12], and codon and amino acid frequency analyses [
13].
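To make the idea of a zero order Markov (ZOM) genomic signature concrete, the sketch below shows one common formulation: the observed frequency of each oligonucleotide is divided by the frequency expected if bases occurred independently, i.e. the product of the single-base frequencies. This is an illustration under stated assumptions and not necessarily the exact procedure of reference 12; the function names, the word length `k`, and the toy sequence are hypothetical.

```python
from collections import Counter
from itertools import product
from math import prod


def kmer_frequencies(seq: str, k: int) -> dict[str, float]:
    """Relative frequencies of overlapping k-mers made only of A, C, G, T."""
    seq = seq.upper()
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    counts = {w: c for w, c in counts.items() if set(w) <= set("ACGT")}
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no unambiguous k-mers found")
    return {w: c / total for w, c in counts.items()}


def zom_signature(seq: str, k: int) -> dict[str, float]:
    """Observed/expected ratio per k-mer under a zero order Markov model.

    The expected frequency of a word is the product of the genome-wide
    frequencies of its bases (assumption: bases are independent).
    """
    base_freq = kmer_frequencies(seq, 1)
    observed = kmer_frequencies(seq, k)
    signature = {}
    for word in map("".join, product("ACGT", repeat=k)):
        expected = prod(base_freq.get(base, 0.0) for base in word)
        if expected > 0:
            signature[word] = observed.get(word, 0.0) / expected
    return signature


# Toy sequence (hypothetical); real input would be a whole genome.
toy = "ACGT" * 50
signature = zom_signature(toy, 2)
print(signature["AC"], signature["AA"])
```

Ratios above 1 mark over-represented words and ratios below 1 mark under-represented ones. Higher order models (FOM, SOM) would replace the independence assumption with expectations built from shorter overlapping words.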
For the proteome based comparisons of the genomes, the Prodigal gene finder [
14] was used to predict open reading frames (ORFs) in all genomes used in the study (see Table ). Whole genome BLAST comparisons were subsequently performed between all proteomes,
i.e., all-against-all gene comparisons between all genomes, according to the guidelines given by Ussery
et al. [
15]. In addition, pan- and core genome analyses [
16,
17] were carried out to map gene exchange in sequenced members of genus
Brucella and closely related genera such as
Agrobacterium and
Ochrobactrum [
18].
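The distinction between pan- and core genome can be sketched in code. The example below assumes that the all-against-all comparisons have already grouped predicted genes into gene families; that grouping step, the family identifiers, and the genome names are hypothetical and are not the study's data. The pan-genome is then the union of all families, and the core genome is the set of families shared by every genome.

```python
def pan_and_core(gene_families: dict[str, set[str]]) -> tuple[set[str], set[str]]:
    """Return (pan-genome, core genome) as sets of gene family identifiers."""
    family_sets = list(gene_families.values())
    if not family_sets:
        raise ValueError("at least one genome is required")
    pan = set().union(*family_sets)
    core = set(family_sets[0]).intersection(*family_sets[1:])
    return pan, core


def cumulative_sizes(gene_families: dict[str, set[str]],
                     order: list[str]) -> list[tuple[str, int, int]]:
    """Pan- and core genome sizes as genomes are added in the given order."""
    pan: set[str] = set()
    core = None
    sizes = []
    for name in order:
        families = gene_families[name]
        pan |= families
        core = set(families) if core is None else core & families
        sizes.append((name, len(pan), len(core)))
    return sizes


# Hypothetical gene family assignments for three genomes.
families = {
    "genome_A": {"fam1", "fam2", "fam3"},
    "genome_B": {"fam1", "fam2", "fam4"},
    "genome_C": {"fam1", "fam3", "fam5"},
}

pan, core = pan_and_core(families)
print(len(pan), len(core))
print(cumulative_sizes(families, ["genome_A", "genome_B", "genome_C"]))
```

As genomes are added, the pan-genome can only grow and the core genome can only shrink, so the cumulative sizes depend on the order in which genomes are added.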
Scholz and co-workers [
18] have carried out a thorough 16S rRNA analysis, and we refer the reader to that article for those results.
Of the methods described above, the Markov chain models and the codon and amino acid frequency-based analyses best reflect base compositional differences and whole genome mutational bias [
12,
19]. The oligonucleotide based methods are sensitive to mutations at the genome level, and therefore share certain similarities with the whole genome SNP analyses conducted by Foster
et al. [
10]. The BLAST comparisons and pan-genomic analyses focus on gene content and gene exchange and may thus be considered complementary to the oligonucleotide frequency-based methods, which mirror base compositional differences. To the best of our knowledge, recent whole genome based gene comparisons of
Brucella species, similar to the work conducted here, have been carried out for only five
Brucella genomes (
B. ovis ATCC25840,
B. suis 1330 (biovar 1),
B. abortus 9-941 (biovar 1),
B. melitensis 16M (biovar 1) and
B. abortus 2308 (biovar 1)) by Tsolis
et al. [
20]. In the present work, however, we perform whole genome comparisons of 32
Brucella genomes (Table ) using a variety of different genomic methods to obtain deeper insight into the obscure evolution of genus
Brucella. In addition to the 32
Brucella genomes, we also include three sequenced genomes from genus
Agrobacterium,
A. radiobacter K84,
A. tumefaciens C58,
A. vitis S4, and two from genus
Ochrobactrum,
O. anthropi ATCC 49188 and
O. intermedium LMG 3301, to examine the relative difference between these closely related microbes [
18].
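As an illustration of how a codon frequency comparison between two genomes might look, the sketch below counts in-frame codons over a set of predicted ORFs and compares two resulting profiles. The Euclidean distance used here is an assumption chosen for simplicity and is not stated to be the measure used in this study; the function names and the toy ORFs are likewise hypothetical.

```python
from collections import Counter
from math import sqrt


def codon_frequencies(orfs: list[str]) -> dict[str, float]:
    """Relative codon frequencies over a set of in-frame ORF sequences."""
    counts: Counter = Counter()
    for orf in orfs:
        orf = orf.upper()
        usable = len(orf) - len(orf) % 3  # drop any trailing partial codon
        for i in range(0, usable, 3):
            codon = orf[i:i + 3]
            if set(codon) <= set("ACGT"):  # skip codons with ambiguous bases
                counts[codon] += 1
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no complete unambiguous codons found")
    return {codon: n / total for codon, n in counts.items()}


def profile_distance(p: dict[str, float], q: dict[str, float]) -> float:
    """Euclidean distance between two frequency profiles (assumed measure)."""
    keys = set(p) | set(q)
    return sqrt(sum((p.get(k, 0.0) - q.get(k, 0.0)) ** 2 for k in keys))


# Toy ORFs standing in for the predicted genes of two genomes (hypothetical).
genome_1_orfs = ["ATGGCGGCGTAA", "ATGAAATAA"]
genome_2_orfs = ["ATGGCCGCCTAA", "ATGAAGTAA"]

profile_1 = codon_frequencies(genome_1_orfs)
profile_2 = codon_frequencies(genome_2_orfs)
print(profile_distance(profile_1, profile_2))
```

Amino acid frequencies could be obtained in the same way by translating each codon before counting.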