Brucellosis is a worldwide disease of humans and livestock that is caused by a number of very closely related classical Brucella species in the alpha-2 subdivision of the Proteobacteria. We report the complete genome sequence of Brucella abortus field isolate 9-941 and compare it to those of Brucella suis 1330 and Brucella melitensis 16 M. The genomes of these Brucella species are strikingly similar, with nearly identical genetic content and gene organization. However, a number of insertion-deletion events and several polymorphic regions encoding putative outer membrane proteins were identified among the genomes. Several fragments previously identified as unique to either B. suis or B. melitensis were present in the B. abortus genome. Even though several fragments were shared between only B. abortus and B. suis, B. abortus shared more fragments and had fewer nucleotide polymorphisms with B. melitensis than with B. suis. The complete genomic sequence of B. abortus provides an important resource for further investigations into determinants of the pathogenicity and virulence phenotypes of these bacteria.
Brucellosis is a bacterial disease of animals that can be transmitted to humans. The primary impact of brucellosis stems from losses due to reproductive failure in food animals and the loss of human productivity. Since brucellosis threatens the food supply and causes undulant fever, a long, debilitating disease in humans, Brucella species are recognized as potential agricultural, civilian, and military bioterrorism agents. Brucellosis in food animals is controlled by vaccination. Human brucellosis is treatable with antibiotics, though the course of antibiotic treatment must be prolonged due to the intracellular nature of Brucella.
Analysis of 16S rRNA sequences places Brucella spp. as members of the alpha-2 Proteobacteria (31). The genus Brucella has six recognized species, all of which exhibit distinct host preferences (25, 26). The high degree of similarity among the brucellae (1, 3, 13, 33) lends support to the proposal that the classical species of Brucella are actually strains of Brucella melitensis (40). However, this view conflicts with the hypothesized evolutionary isolation of these classical species due to their intracellular existence and host preference (29). Common host-pathogen associations among the classical Brucella species are as follows: B. abortus, cattle; B. suis, swine; B. melitensis, goats; B. ovis, sheep; B. canis, dogs; B. neotomae, desert wood rats. Although these host-pathogen associations represent the norm in nature, cross-species infections do occur. Recently, brucellae have also been isolated from marine mammals (11, 34). Brucellae may be more widespread than previously recognized.
Brucellae can be rapidly identified by PCR assays, such as those based on the insertion sequence IS711 (6, 7). Though the sequences of brucellae are very similar, biovars of some of the classical species were differentiated by DNA sequence determination of several outer membrane proteins (OMPs) (8, 12, 41). Strains of B. abortus biovar 1 were distinguishable by analysis of a multilocus variable nucleotide tandem repeat (4).
Pulsed-field gel electrophoresis (PFGE) maps of the classical Brucella spp. genomes are composed of two circular chromosomes of approximately 2.1 and 1.2 Mbp (22, 27, 28), with the exception of B. suis biovar 3, which has a single chromosome of 3.1 Mbp. PFGE studies revealed other differences, including a 640-kb inversion in the small chromosome of B. abortus 544 and a deletion in the small chromosome of B. ovis. The two chromosomes of brucellae differ in important ways (33). The origin of replication of the large chromosome (Chr I) is typical of bacterial chromosomes, while that of the small chromosome (Chr II) is plasmid like. Further, most of the essential genes are located on Chr I. The G+C content of the two chromosomes is nearly identical, consistent with the assertion that the assimilation and stabilization of a plasmid was an ancient event (33) in brucellae.
The genome sequences of B. melitensis and B. suis have been determined (10, 33). Comparative analyses revealed both that the two genomes are extremely similar and that they have many similarities to both bacterial plant and animal pathogens and symbionts (33, 38). The sequence identity for most open reading frames (ORFs) was 99% or higher. Nevertheless, unique fragments were reported to exist between these two genomes (33). Prior to sequencing the B. abortus genome, a large number of short sequences were available in GenBank. Many of these sequences were derived from analyses of plasmids estimated to cover 20% of the genome from a random shotgun library of B. abortus S2308 (36). In this study, we present the completed B. abortus genome sequence and compare it to the genomes of B. melitensis and B. suis. Taken together, the genome sequences of these classical Brucella species provide a firm foundation for further research into the genetic bases for host preference, pathogenesis, virulence, and biotype differences.
B. abortus strain 9-941 was obtained from the National Animal Disease Center culture collection. It was originally isolated from a serologically detected, infected cattle herd in northwestern Wyoming. The isolate was identified as B. abortus biovar 1 by the National Veterinary Services Laboratory based on morphology, bacteriologic characteristics, and phage typing. The isolate is nonmotile, nonhemolytic, A-antigen dominant, catalase positive, oxidase positive, urease positive at 3.5 h, nitrate reduction positive, citrate utilization negative, H2S production positive after 2 days of incubation at 37°C, sensitive to thionin dye (1:25,000), and resistant to basic fuchsin, thionin blue (1:500,000), penicillin, and erythritol.
Total genomic DNA was purified from B. abortus strain 9-941 by a modification of a previously described method (17). Bacteria were harvested from agar plates in saline and killed by the addition of two volumes of methanol. Approximately 10^10 bacteria were pelleted, washed in TE buffer (10 mM Tris, 1 mM EDTA; pH 8.0), and treated with 0.5% Zwittergent 3-14 in TE buffered with citrate at pH 4.0 for 1 h at 50°C. The treated bacteria were washed in TE, lysed in a solution containing 4% sarcosine, 0.5% sodium dodecyl sulfate, 125 mg of proteinase K/ml, 10 mM EDTA, and 20 mM Tris (pH 8.0) for 20 min at 65°C, and the lysate was treated with RNase A. The DNA was precipitated in ethanol, removed by spooling, and resuspended in DNAzol (catalog no. 10503-027; Life Technologies, Grand Island, N.Y.). The DNA was precipitated with ethanol a second time, dissolved in 8 mM sodium hydroxide, and adjusted to pH 7.4 with 10 mM HEPES for storage at 4°C.
A random 2-kb insert library of B. abortus 9-941 was constructed by shearing whole genomic DNA using a nebulizer and compressed nitrogen according to protocols developed by Bruce Roe's laboratory and posted at The University of Oklahoma's Advanced Center for Genome Technology website (http://www.genome.ou.edu). The sheared DNA fragments were separated by gel electrophoresis, and fragments of 2 to 3 kb were excised from the gel and purified. The ends of the purified fragments were polished by the addition of nucleotides and Klenow fragment (New England BioLabs, Beverly, Mass.) and ligated to a SmaI-restricted calf intestinal alkaline phosphatase-treated pUC18 vector for cloning by electroporation into Escherichia coli. The library, which consisted of >90% recombinant clones, was used to construct a culture collection for sequence determination.
Plasmid DNA was extracted using the QIAprep 96 Turbo kit (QIAGEN, Santa Clarita, Calif.), quantitated using PicoGreen (dsDNA quantitation kit; Molecular Probes, Eugene, Oreg.), and labeled (DyeDeoxy Terminator cycle sequencing kit; ABI automated DNA sequencing chemistry guide, ABI, Foster City, Calif.) for sequencing in the presence of dimethyl sulfoxide. The sequence was determined (ABI Prism 3700 DNA analyzer) and assembled using Phred/Phrap/Consed software obtained from the University of Washington Genome Center (http://www.genome.washington.edu/UWGC/) and the MacVector 7.0 DNA analysis package (Accelrys Inc., San Diego, Calif.). Contigs were linked and gaps were filled by predicting linkages based on putative colinearity of sequences with B. suis. Linkages were confirmed by amplifying and sequencing genomic DNA from B. abortus 9-941. The genome sequence was derived from 37,718 plasmids, and coverage was 10-fold. Mean Phrap quality score was 86. The confirmed mean read length was 819 bp.
Artemis (releases 4 and 5; The Sanger Centre [www.sanger.ac.uk/software/ACT/]) (35) was used to identify putative genes by determining which of the ORFs with 50 or more amino acids (aa) encoded homologs in GenBank searches using BLASTP (2). B. abortus ORF annotations were modeled after those of the homologous ORFs in GenBank, especially those of the B. suis genome. B. abortus ORFs that were truncated due to premature stops or had frameshifts compared to homologs from B. suis or other entries in GenBank were designated pseudogenes. Pseudogenes of B. suis were obtained from the GenBank accession numbers for the two chromosomes of B. suis 1330. For B. melitensis, pseudogenes were identified by visual inspection of BLASTP searches of GenBank with Artemis and by cross-comparisons of the three genomes' DNA sequences using Act version 2 (The Sanger Centre).
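The 50-aa length threshold used to select candidate ORFs can be illustrated with a minimal sketch. It is not a description of how Artemis works internally: the function name `find_orfs`, the ATG-to-stop ORF definition, and the restriction to ATG start codons are assumptions made only for illustration, and the subsequent BLASTP homology check is not reproduced.

```python
STOPS = {"TAA", "TAG", "TGA"}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def find_orfs(seq, min_aa=50):
    """Return (strand, start, end, aa_length) for ATG..stop ORFs.

    Hypothetical helper: only ORFs with at least `min_aa` codons before
    the stop are kept. Coordinates are 0-based and end-exclusive, and
    they refer to the strand that was scanned.
    """
    seq = seq.upper()
    orfs = []
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        for frame in range(3):
            start = None
            for i in range(frame, len(s) - 2, 3):
                codon = s[i:i + 3]
                if start is None and codon == "ATG":
                    start = i  # first ATG after the previous stop
                elif start is not None and codon in STOPS:
                    aa_len = (i - start) // 3  # codons before the stop
                    if aa_len >= min_aa:
                        orfs.append((strand, start, i + 3, aa_len))
                    start = None
    return orfs


# Toy sequences: one ORF above the threshold and one below it.
long_orf = "ATG" + "GCT" * 60 + "TAA"
short_orf = "ATG" + "GCT" * 10 + "TAA"
```

In practice, each ORF returned by such a scan would still need a homology search before being called a putative gene, as described above.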
ORFs that differed in length among the Brucella genomes due to frameshifts or premature stops were labeled as differential ORFs. ORFs were not categorized as differential if their lengths differed solely due to selection of alternative start codons during the annotation processes.
ORFs from The Institute for Genomic Research transposon role category database of the comprehensive microbial resource website (http://www.tigr.org/tigr-scripts/CMR2/CMRHomePage.spl) for B. suis were used to identify shared transposon-related sequences among B. melitensis and B. abortus by use of MacVector 7.0.
Putative genes were designated as unique if no homolog was identified by aligning ORFs from the three Brucella genomes with each other (BLASTP version 2.2.6, -e 0.01, -F F) and to GenBank. Products of unique genes and ORFs were denoted as hypothetical proteins.
Single nucleotide polymorphisms (SNPs) were identified using pairwise chromosome alignment data generated with the MUMmer 3.0 run-mummer3 script (24). No minimum separation distance restrictions were placed on neighboring SNPs. SNP totals were used as a measure of genetic distance for the neighbor-joining tree [Ps = (ΣSNP count/1,000)] construction with MEGA2 (23). Mesorhizobium loti (GenBank accession numbers NC_002678, pA:NC_002679, and pB:NC_002682) was used as an outgroup for rooting the tree.
The GenBank accession numbers for B. abortus 9-941 Chr I and Chr II are AE017223 and AE017224, respectively. The genome accession numbers used for B. suis 1330 were AE014291 and AE014292, and for B. melitensis 16 M they were NC_003317 and NC_003318.
The whole genome sequence of a B. abortus biovar 1 field isolate was determined by the shotgun method. The genome is 3.3 Mb and is composed of two circular chromosomes of 2,124,242 (Chr I) and 1,162,780 bp (Chr II). The chromosome sequences of B. abortus 9-941 were assigned the same strand orientation and origin as those of B. suis. The G+C contents of Chr I and Chr II are 57.2 and 57.3%, respectively. This is identical to that found for the two chromosomes of B. suis (33) and is in agreement with that of B. melitensis (10). It is consistent with that determined in early hybridization studies (20, 21). The B. abortus genome contains 3,296 ORFs annotated as genes, 2,158 on Chr I and 1,138 on Chr II. This is similar to the annotated ORF counts for B. suis (3,388) and B. melitensis (3,197). Differences in the number of ORFs found among the three Brucella genomes derived primarily from differences in annotation of short ORFs and from large insertions-deletions (indels). Similarity among the Brucella orthologs was high, with an average amino acid sequence identity of greater than 99%. The B. abortus sequence confirmed PFGE maps (27) with regard to genome size, chromosome number, and presence of an inversion described in Chr II relative to other genomes.
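The summary figures in this paragraph can be cross-checked with simple arithmetic, and G+C content can be computed as sketched below. The function name `gc_percent` and the choice to ignore ambiguous bases are assumptions for illustration; the chromosome sizes and ORF counts are the values reported above.

```python
def gc_percent(seq):
    """Percent G+C among unambiguous bases (A, C, G, T) of a DNA string."""
    seq = seq.upper()
    gc = seq.count("G") + seq.count("C")
    at = seq.count("A") + seq.count("T")
    if gc + at == 0:
        raise ValueError("sequence contains no A, C, G, or T")
    return 100.0 * gc / (gc + at)


# Values reported in the text for B. abortus 9-941.
CHR_I_BP = 2_124_242
CHR_II_BP = 1_162_780
genome_bp = CHR_I_BP + CHR_II_BP   # total genome size in bp
orf_total = 2_158 + 1_138          # annotated ORFs on Chr I + Chr II

# Toy sequence only; the real chromosomes would be read from the
# GenBank records cited in the accession paragraph.
toy_gc = gc_percent("GGCCAT")
```

The two chromosome lengths sum to 3,287,022 bp, which rounds to the stated 3.3 Mb, and the per-chromosome ORF counts sum to the stated 3,296.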
Many of the annotation differences among the genomes are related to small ORFs. The number and annotation of short hypothetical ORFs of 100 aa or less are similar between B. suis and B. abortus. The annotated genome of B. melitensis has fewer short ORFs than B. abortus and B. suis. A total of 551 ORFs of less than 100 aa are annotated in B. abortus, while in B. melitensis there are only 304. The disparity in the number of small ORFs is larger when those less than 50 aa are considered. While 161 of the 551 short ORFs in B. abortus are less than 50 aa, only 11 of the 304 short ORFs in B. melitensis are. Functional assays such as microarrays and proteomic studies will be necessary to identify genes and their products.
Sequences larger than 100 bp that were previously reported to be unique in either B. suis or B. melitensis (see Table 1 in reference 33) were aligned to the B. abortus genome to determine their presence or absence (Tables 1 and 2). Many of these sequences had been shown to be related to mobile genetic elements, while others had not. Table 1 lists loci of fragments related to phages, transposable elements, or plasmids. The loci of the remaining fragments are in Table 2. Many of the fragments that were unique between B. suis and B. melitensis were found in B. abortus. The genome of B. abortus shared more fragments with B. suis and B. melitensis than B. suis and B. melitensis did with each other. B. abortus shared more fragments with B. melitensis than with B. suis. This agrees with other analyses that showed B. abortus and B. melitensis being more closely related than B. abortus and B. suis (22, 27, 28).
Two fragments shared by B. suis and B. melitensis were not found in B. abortus. A 2,774-bp fragment encoding a probable surface protein and two partial ORFs with homology to the insertion sequences IS711 and ISBm1 is missing from B. abortus. The probable surface protein is annotated in B. suis as a cell wall surface protein (BRA0553) and in B. melitensis as a hemagglutinin (BMEII0717). Though they are highly similar, they differ slightly in length. The second fragment is a 25-kb sequence that had previously been identified as missing in B. abortus 544, a biovar 1 strain and the type species (42). This sequence was shown to encode a number of ORFs that may be involved in polysaccharide synthesis and was predicted to potentially affect phenotypes of brucellae, such as host preference (42).
The loci BruAb1_0072 and BruAb2_0168 (Fig. 1A and C) have sequences specific to B. abortus (Fig. 1B and D) relative to B. suis and B. melitensis and contain sequences that are repeated. There are eight copies of a 250-bp sequence in the 2-kb region and two copies of a 500-bp sequence in the 4-kb region. These sequences are not homologous.
The 2-kb region is in Chr I and encodes a similar-size putative OMP in B. abortus and B. suis, BruAb1_0072 (756 aa) and BR0072 (740 aa), respectively. Though there is an ORF of over 4,000 bp in B. abortus containing a number of possible start codons, the start codon selected for BruAb1_0072 is near a putative ribosome binding site that is more than 1,700 bp from the start of the 4,000-bp ORF (Fig. 1A). The selected start codon produces a product that is similar in size to BR0072 (Fig. 1A). In B. melitensis, there are two ORFs, BMEI1873 (366 aa) and BMEI1872 (506 aa); however, due to a frameshift in B. melitensis relative to B. suis, they appear to be pseudogenes (Fig. 1A). In the 2-kb region in B. abortus, there are eight highly similar copies of a 250-bp sequence that occur as direct tandem repeats (Fig. 1B, graph 1). The region containing the eight repeats is only found in B. abortus. In B. suis, there is a single copy of a sequence similar to the 250-bp repeat in B. abortus (Fig. 1B, graph 2). In B. melitensis, there are three direct copies of a sequence similar to the repeated 250-bp sequence in B. abortus (Fig. 1B, graph 3). The repeated sequence in B. melitensis is more similar to that in B. suis than to that in B. abortus (Fig. 1B, graphs 2 to 4).
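Counting highly similar direct tandem copies, such as the 250-bp repeats described here, can be sketched as follows. This is only an illustration and not the method used to produce Fig. 1B: the function name `count_tandem_copies`, the fixed unit length, the ungapped comparison, and the 10% mismatch tolerance are all assumptions.

```python
def hamming(a, b):
    """Number of mismatched positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))


def count_tandem_copies(seq, start, unit_len, max_mismatch_frac=0.1):
    """Count consecutive direct copies of the unit that begins at `start`.

    Hypothetical helper: the unit is seq[start:start + unit_len] (0-based),
    and each following window of the same length counts as a copy while
    its fraction of mismatches stays at or below `max_mismatch_frac`.
    Insertions and deletions between copies are not handled.
    """
    unit = seq[start:start + unit_len]
    if len(unit) < unit_len:
        return 0
    copies = 1
    pos = start + unit_len
    while pos + unit_len <= len(seq):
        window = seq[pos:pos + unit_len]
        if hamming(unit, window) > max_mismatch_frac * unit_len:
            break
        copies += 1
        pos += unit_len
    return copies


# Toy data: a 10-bp unit repeated eight times, flanked by unrelated sequence.
unit = "ACGTTGCAAC"
toy = "GG" + unit * 8 + "TTTTTTTTTTTT"
# A second toy with one imperfect copy (single mismatch in the last base).
toy_imperfect = unit + "ACGTTGCAAT" + unit + "GGGGGGGGGG"
```

A dot-plot or alignment-based approach would be more appropriate for real repeats that differ by indels; the sketch only conveys the idea of counting adjacent near-identical units.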
The 4-kb region is on Chr II and encodes an autotransporter in B. abortus and B. suis, BruAb2_0168 (1,983 aa) and BRA0173 (1,113 aa), respectively (Fig. 1C). There were several possible start codons in the 6,062-bp ORF in B. abortus with homology to BRA0173. The start codon selected generating BruAb2_0168 was near the beginning of the large ORF and near a putative ribosome binding site. Experimental studies will be necessary to establish if this is an authentic start codon. In B. melitensis, a frameshift relative to the other genomes results in two relatively short ORFs, BMEII1069 (488 aa) and BMEII1070 (114 aa) (Fig. 1C). As neither of these ORFs encodes the domains of autotransporters (19), they appear to be pseudogenes. The B. abortus sequence has two direct 500-bp repeats separated by approximately 2,750 bp of sequence; the 2,750-bp sequence was B. abortus specific (Fig. 1D, graphs 1 to 3). There is a single copy of the B. abortus 500-bp repeat in B. suis (Fig. 1D, graph 2), while in B. melitensis there is only a partial copy (180 bp) (Fig. 1D, graph 3). The amino termini of BruAb2_0168 and BMEII1070 are more similar to each other than either is to the amino terminus of BRA0173 (Fig. 1D, graphs 2 to 4).
The two regions encoded OMPs that were previously suggested as potentially affecting the pathogenicity or host preference of the brucellae (30).
Sequences surrounding and at the site of the large inversion previously described in Chr II of B. abortus 544 relative to the other Brucella (27) were analyzed in B. abortus. The inversion disrupted B. abortus homologs of ORFs BRA1003 and BRA0235 of B. suis, resulting in four pseudogenes (BruAb2_0230, BruAb2_0231, BruAb2_0943, and BruAb2_0944). The ORFs BRA1003 and BRA0235 encode a putative GAF/GGDEF prokaryotic signaling domain protein and a hypothetical protein, respectively. The sites disrupted in the large inversion relative to B. melitensis are between ORFs BMEII0292 and BMEII0293 and within BMEII1009, a homolog of BRA1003. In B. abortus, a short distance downstream of the large inversion there is an indel of 838 bp. This affected a locus with homology to B. suis ORFs BRA1004 and BRA1005 and resulted in the pseudogene BruAb2_0945 in B. abortus. The finding of a single copy of the sequence 5′-CCA-GCA-CCG-CCT-GC-3′ (bp 949172 to bp 949185) in B. abortus and two copies in both B. suis and B. melitensis separated by 810 bp is consistent with the indel in B. abortus arising from either homologous recombination or slipped-strand mispairing during replication. The inversion site and the 838-bp indel were described recently in B. abortus 2308 (37). The large inversion in the small chromosome is not found in all biovars of B. abortus (27). Though it was found in biovars 1, 2, 3, and 4 by PFGE, we detected it by PCR (15) only in biovars found in the United States, biovars 1, 2, and 4 (1).
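Locating copies of the 14-bp sequence discussed above, and measuring the spacing between them, can be sketched with a plain exact-match search. The function names and the toy sequence are assumptions for illustration; only the motif itself and the 810-bp spacing are taken from the text, and only the strand as given is searched.

```python
# 5'-CCA-GCA-CCG-CCT-GC-3' from the text, written without separators.
MOTIF = "CCAGCACCGCCTGC"


def find_motif(seq, motif=MOTIF):
    """Return 1-based start positions of every exact match of `motif`."""
    seq, motif = seq.upper(), motif.upper()
    positions = []
    i = seq.find(motif)
    while i != -1:
        positions.append(i + 1)       # convert to 1-based coordinates
        i = seq.find(motif, i + 1)    # allow overlapping matches
    return positions


def gaps_between_copies(positions, motif_len=len(MOTIF)):
    """Bases between the end of one copy and the start of the next."""
    return [b - (a + motif_len) for a, b in zip(positions, positions[1:])]


# Toy sequence mimicking the two-copy arrangement (copies 810 bp apart).
toy = "AAAA" + MOTIF + "T" * 810 + MOTIF + "GG"
hits = find_motif(toy)
gaps = gaps_between_copies(hits)
```

A single hit in one genome and two hits flanking an intervening segment in another is the pattern that the text interprets as consistent with recombination or slipped-strand mispairing between the copies.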
Two small inversions were found in B. abortus. An inversion of 2,185 bp that was unique to B. abortus is near a 780-bp indel unique to B. suis (Table 2). This inversion occurs in a homolog of the B. suis proline dipeptidase BR1062, creating pseudogenes BruAb1_1065 and BruAb1_1067. A second smaller inversion of 2,150 bp was found in both B. abortus and B. melitensis, disrupting B. suis homologs BRA0485 and BRA0487 in these genomes. These ORFs encode a putative protein and a glycosyl transferase family 25 protein, respectively. The glycosyl transferase family 25 protein could possibly affect lipopolysaccharide structure.
Two regions in the Brucella genomes encoding homologs of OMPs predicted to be virulence associated in Brucella (30, 33) were found to have greater sequence variation than that calculated for the genomes as a whole. One of the regions encoded a putative bacterial immunoglobulin-like protein with a group 1 domain (PFAM protein family PF02369) common to bacterial surface proteins invasins and adhesins. The sequence variation affected the sizes of the homologs among the three genomes and shifted the ORF in the carboxy end in B. abortus. The OMP in B. suis, BR2009, is 500 aa, while the OMPs in B. melitensis and B. abortus, BMEI0063 and BruAb1_1984, respectively, are less than 400 aa. The sequence variation resulted in the truncation of the amino terminus of BMEI0063 relative to BR2009 and the carboxy terminus of BruAb1_1984 relative to BR2009. BR2009 and BMEI0063 have proline-rich regions in their carboxy ends. In the proline-rich stretches, 21 of 25 aa are proline in B. suis and 20 of 25 aa are proline in B. melitensis. Due to a frameshift and sequence differences in the carboxy end of BruAb1_1984 from B. abortus, there is a leucine- and histidine-rich region rather than a proline one. Only a few proline- and leucine-rich regions were found by BLASTP, and these are in disparate proteins. These adhesins may affect pathogenicity and host preference (30, 33).
The second variable region encodes an autotransporter in B. suis, BR2013, and putative pseudogenes in B. melitensis and B. abortus. This region had two in-frame stop codons in B. abortus relative to B. suis and is annotated as a pseudogene (BruAb1_1988). In B. melitensis, the homolog of BR2013 (BMEI0058) appeared to be a pseudogene also, due to an in-frame stop codon. Among the three genomes, only the B. suis locus encoded all the functional domains of an autotransporter, which may represent a unique virulence factor of B. suis.
All ORFs that varied in size among the three genomes were compiled (Tables 3 and 4). If the differences in sizes of ORFs resulted solely from selection of an alternative start codon in the annotation process, the ORFs were not labeled as being variable. The genome of B. suis was used as the reference for determining which ORFs were variable, because the annotation of its ORFs was more similar to the annotation of protein homologs identified by BLASTP searches than was the annotation of B. melitensis. There were almost as many variable ORFs on the large chromosomes as there were on the smaller chromosomes, even though the large chromosome has approximately twice the number of ORFs as the small chromosome. As the larger chromosome has many of the genes that encode core metabolic functions of the bacterium (33), mutations in the large chromosome may be selected against or lethal. Furthermore, the small chromosome, which appears to be a stabilized megaplasmid, may have more genes that were acquired horizontally. These genes may not be essential and, thus, might not be under positive selective pressure.
Few of the variable ORFs appear on a list assembled by Letesson and colleagues (9) of 184 genes that were identified in large-scale random screens as affecting the pathogenicity and virulence of at least one Brucella classical species. Two genes that were only variable in B. melitensis are on the list, homologs of B. suis ORF BRA1146 (fliF; M-S ring) and BR1401 (bicA; macrolide efflux). Seven B. abortus genes were on the list: homologs of B. suis BRA0156 (flgI; P-ring), BR0161 (glnL; nitrogen regulatory IIA), BRA0804 (nikA; Ni2+ uptake), BRA0443 (glpK; glycerol kinase), BRA0599 (pyrB; pyrimidine synthesis), BR1084 (caiB domain; CAIB/BAIF family), and BR0181 (cysI; sulfite reductase). Three variable ORFs in both B. abortus and B. melitensis were on this list, homologs of B. suis BR1401 (bicA; macrolide efflux), BRA1132 (flhA; flagellum-related putative export protein), and BRA0311 (hypothetical protein).
Several Chr I ORFs of B. abortus had no homologs in GenBank. These ORFs are referred to here as unique ORFs and are associated with regions containing homologs of phage or insertion sequences. Unique BruAb1_1085 is near a homolog of a site-specific integrase, phage family protein. Unique BruAb1_1088 flanks a resolvase family protein. Unique BruAb1_1833 is within Tn2020 (18). Though B. suis has a homolog of this ORF, it was not annotated. Most of the unique ORFs occur in a 20-kb phage-associated fragment shared only by B. melitensis (BMEI1674-BMEI1703) and B. abortus (BruAb1_0274-BruAb1_0242). BruAb1_0272 and BMEI1774 were annotated at homologous loci but on opposing strands. BruAb1_0272 was annotated on the opposing strand because it appeared to occur in an operon. BruAb1_0246 and BruAb1_0263 in B. abortus have no protein homologs in B. melitensis. Though a number of ORFs had homology to phages, the function of the encoded peptides was rarely known. Thus, the contribution of the phage- and plasmid-related regions to brucella metabolism and the infectious cycle is unknown.
Two genomic fragments of 7,738 and 2,653 bp on Chr II, identified by Paulsen and colleagues (33) as B. suis specific relative to B. melitensis, correlated with the ability of B. suis but not B. melitensis to oxidize ornithine, citrulline, arginine, and lysine. Both fragments were aligned with the genome of B. abortus. Although the 2,653-bp fragment is present in B. abortus, the 7,738-bp fragment is not. Like B. melitensis, B. abortus does not oxidize these compounds (1). This supports the view that the 7,738-bp fragment plays a vital role in these reactions.
B. abortus Chr I has two urease clusters, as described for B. suis (33). B. abortus ORFs BruAb1_0267-BruAb1_0273 and BruAb1_1356-BruAb1_1363 are homologs of B. suis urease cluster 1 ORFs BR0267-BR0273 and urease cluster 2 ORFs BR1356-BR1362, respectively. While there were no pseudogenes in cluster 1 of B. abortus, ureE and ureA are pseudogenes in cluster 1 in B. suis and B. melitensis. While there are no pseudogenes in cluster 2 of B. suis and B. melitensis, ureE is a pseudogene and ureD has a 6-bp insert in the urease cluster 2 in B. abortus. The sequence differences among the clusters correlate with differences in the rate of urea hydrolysis among the three bacteria. Urea is hydrolyzed in 1 to 2 h by B. abortus, compared to 0 to 30 min by B. suis (1). The rate of hydrolysis of urea in B. melitensis is variable, suggesting that at least one other locus influences hydrolysis of urea. B. suis infects the urinary tract, while B. abortus and B. melitensis do not. The ability of B. suis to quickly hydrolyze urea may aid in its infection of, and excretion from, the urinary tract and subsequent spread in swine herds.
The number and orientation relative to ORFs of the small, palindromic Bru-RS1 and Bru-RS2 elements were determined. Twenty-two whole or partial Bru-RS1 elements (14) were identified on Chr I and 18 were identified on Chr II. Short ORFs composed largely of Bru-RS1 or Bru-RS2 elements were not annotated in B. abortus as ORFs but are in B. suis. The Bru-RS elements were not clustered, and their orientation relative to ORFs appeared to be random. A copy of the Bru-RS1 element in B. abortus was identified in one of the homologs of proline racemase (BruAb1_0363) and in a probable transcription regulator (BruAb1_1398). There were nine whole or partial Bru-RS2 elements on Chr I and five on Chr II. None of these elements occurs within genes of known function. The elements, which are just over 100 bp, could affect gene expression.
There were only minor differences in the distribution and presence of transposase-related ORFs among the genomes, and those differences were associated with IS711, also known as IS6501 (16, 32). It was known from previous studies (6, 16, 32) that sequences of IS711 elements are not identical and, though the genomes have copies at the same loci, they have at least one copy at a unique locus also. The unique insertion locus of IS711 in B. suis is on Chr I, whereas the unique insertion loci of IS711 in B. melitensis and B. abortus are on Chr II (Tables 1 and 2). As described above, one of the common insertion site copies of IS711 is truncated along with ORF C of an ISBm1 copy in B. abortus relative to B. suis (BRA0551-BRA0559) and B. melitensis.
An analysis of the Brucella sequences suggests that the only mobile transposable element in Brucella is IS711. The B. abortus 9-941 genome has the same number of copies of IS711 as that found in B. abortus 544 (7). There is an additional copy in B. abortus 2308 (7), a biovar 1 strain that is commonly used as a vaccine challenge strain. The rough vaccine strain B. abortus RB51 has one more copy than its parental strain, S2308 (7, 39). Other brucellae have many more copies of IS711 than the sequenced genomes. In Southern blot analyses, B. ovis was estimated to have at least 30 copies (17), and Southern blot analyses have shown that the marine isolates have even more copies than that (5). This insertion sequence has not been identified in other bacterial genera and is most closely related to IS427 from the phylogenetically related bacterium Agrobacterium tumefaciens (32).
The sequence of B. abortus 9-941 was analyzed for loci detected by suppressive subtractive hybridization (SSH) in B. melitensis 16 M but not B. abortus S2308 (37). Homologs of B. melitensis 16 M loci BMEI0888, BMEI0943, BMEI0971, BMEI1055, BMEI1331, and BMEI919 missing in B. abortus S2308 were also missing in B. abortus 9-941. Several homologs not identified in B. abortus S2308 in initial SSH studies were identified in the sequence of B. abortus 9-941, namely BMEI0943, BMEI0971, BMEI1055, BMEII25, BMEI1331, BMEI1380, and BME1919.
SNPs were identified for the shared sequences of the genomes of B. abortus, B. suis, and B. melitensis (Table 5) by using the MUMmer whole genome comparison tool. This analysis identified 7,208 SNP mutations between B. abortus and B. suis genomic sequences, 6,342 SNP mutations between B. abortus and B. melitensis, and 7,844 SNP mutations between B. suis and B. melitensis. The mean for the genomes was 1 SNP for approximately 463 nucleotides.
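The pairwise SNP totals can be converted to the distance measure given in the methods, Ps = SNP count/1,000, as in the following sketch. The variable names are illustrative, and the use of the B. abortus genome length as the denominator for SNP spacing is an assumption; the length of shared sequence actually aligned is not stated here, so the spacing obtained is close to, but not exactly, the reported value of approximately 463 nucleotides.

```python
# Pairwise SNP totals reported in the text.
SNP_COUNTS = {
    frozenset(("B. abortus", "B. suis")): 7208,
    frozenset(("B. abortus", "B. melitensis")): 6342,
    frozenset(("B. suis", "B. melitensis")): 7844,
}


def ps_distance(snp_count):
    """Distance measure from the methods: Ps = SNP count / 1,000."""
    return snp_count / 1000


distances = {pair: ps_distance(n) for pair, n in SNP_COUNTS.items()}

# The pair with the smallest distance is the most closely related pair.
closest_pair = min(distances, key=distances.get)

# Assumption: B. abortus 9-941 genome length (Chr I + Chr II) stands in
# for the length of sequence shared among the three genomes.
ABORTUS_GENOME_BP = 2_124_242 + 1_162_780
mean_snps = sum(SNP_COUNTS.values()) / len(SNP_COUNTS)
mean_spacing_bp = ABORTUS_GENOME_BP / mean_snps
```

A distance matrix of this form, extended with the outgroup, is the kind of input a neighbor-joining program takes; the tree construction itself is not reproduced in this sketch.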
A rooted neighbor-joining tree showing the evolutionary relationships between the sequenced brucellae genomes was constructed using SNP data with MEGA2. Mesorhizobium loti data were included for use as an outgroup (Fig. 2). By this method, B. abortus was most closely related to B. melitensis, and B. suis was more closely related to B. abortus than to B. melitensis. These results, which were obtained from whole genomic DNA, are consistent with those found from PFGE studies of whole genomic DNA (27). When specific loci are used, the clustering of brucellae is dependent on the loci. For example, clustering of the three brucellae was dependent on whether results from 10 enzymes or 16 enzymes were used to construct a dendrogram using results from multilocus enzyme electrophoresis studies (13). In these studies, B. melitensis and B. abortus clustered when the dendrogram was constructed from 10 loci but not when it was constructed from 16 loci.
In summary, the genomes of B. suis, B. melitensis, and B. abortus are very similar in sequence, organization, and structure. Few fragments are unique among the genomes. B. melitensis and B. abortus share more sequences than either does with B. suis. A comparison of the three genome sequences of Brucella gives us a foundation to further our understanding of the Brucella genus and provides the groundwork to investigate the contribution of various pathways to the relative pathogenicity and virulence of these bacteria. The genome sequences allow construction of general brucella microarrays to observe the dance between microbe and host in understanding the course of brucellosis infection.
We thank Nancy Koster, Elizabeth Schmerr, Lindsey Engelby, Kai Tanaka, Richard Thielen, Stan Strum, Donnie Brooks, Aileen Duit, Dani Umbaugh, Daryl Pringle, Scott Farris, Chad Rienke, Rick Hornsby, Xiaowu Gai, and TIGR for technical assistance. We thank Darla Ewalt for biotyping B. abortus 9-941.
Mention of trade names or commercial products herein is solely for the purpose of providing specific information and does not imply recommendation or endorsement by the U.S. Department of Agriculture.