Burkholderia multivorans is a betaproteobacterium and member of the Burkholderia cepacia complex, a group of related bacteria that cause similar diseases in humans and possess wide metabolic and phenotypic diversity. Currently, the only B. multivorans genome sequence available is that of ATCC 17616 (accession number NC_010084), an environmental isolate. B. multivorans is known to infect the lungs of immunocompromised people, such as those with cystic fibrosis (CF) (3), a hyperinflammatory state, and chronic granulomatous disease (CGD) (1), a primary phagocyte immunodeficiency. The underlying genomic differences and similarities between the environmental, CF-related, and CGD-related isolates are not known.
We report the genome sequences for 2 CGD isolates, CGD1 (Biodefense and Emerging Infections Research Resources Repository [BEI] NR-20533) and CGD2 (BEI NR-20534), and 2 CF isolates, CF1 (ATCC BAA-247) and CF2 (BEI NR-20532). CGD1 and CF2 were isolated from sputum, and CGD2 was isolated from blood, from patients at the National Institutes of Health Clinical Center between 1992 and 2007. CF1, the B. multivorans type strain, was isolated from sputum from a CF patient in Belgium (6). Isolates were identified as B. multivorans at the species level by biochemical and molecular methods. Genomic DNA was prepared with the Qiagen blood and tissue kit according to the manufacturer's instructions.
Genome sequences were determined using a combination of Roche 454 GS-FLX Titanium 8-kb mate-pair libraries (~12× coverage) and 100-bp Illumina fragment reads (50× coverage). CGD1 and CGD2 had less 454 Titanium data but had ~10× Sanger coverage using 3-kb and 15-kb mate pairs. Hybrid assemblies using all available sequencing data were generated with Celera Assembler 6.1 (4). The assemblies of both CGD isolates were improved by closing gaps with directed PCRs and additional Sanger sequencing. The genomes were assembled into 12 (CGD1), 8 (CGD2), 286 (CF1), and 7 (CF2) scaffolds. A total of 92% of the intrascaffold gaps corresponded to tandem repeats, with estimated gap sizes of only a few bases. Interscaffold gaps resulted from repetitive elements, such as transposable or phage elements, in the genomes. The two genomes with Sanger coverage (CGD1 and CGD2) had far fewer contigs (~40) than those assembled purely from next-generation sequencing data (~500), mostly because the longer Sanger reads were able to span short tandem repeats. In CGD1 and CGD2, chromosomes 1, 2, and 3 were composed of 18, 5, and 2 contigs and 21, 8, and 2 contigs, respectively. In CF1 and CF2, chromosomes 1, 2, and 3 were composed of 360, 180, and 50 contigs and 295, 140, and 58 contigs, respectively. Except for CF1, all chromosomes were composed of 4 or fewer scaffolds. The CF1 assembly was more fragmented than the other assemblies, likely as a consequence of a poorer mate-pair library.
Each genome had 3 chromosomes, with total lengths of 6.6, 6.5, 6.3, and 6.5 Mb for CGD1, CGD2, CF1, and CF2, respectively, and ~67% G+C content. CGD1 possessed a novel ~93-kb virulence plasmid, pBMULCGD1, with ~60% G+C content. Genomes were annotated using JCVI's annotation pipeline (www.jcvi.org) and submitted to GenBank. Overall, strains had similar numbers of predicted open reading frames (ORFs), with 6,564, 6,651, 6,319, and 6,657 ORFs in CGD1, CGD2, CF1, and CF2, respectively.
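For readers who want to compare the four assemblies side by side, the per-strain figures reported above can be tabulated in a few lines of Python. This is only an illustrative sketch: the dictionary layout and the names `REPORTED` and `summarize` are assumptions made here, and the derived values (summed chromosomal contigs, ORFs per Mb) are simple arithmetic on the reported numbers, not results stated in the text. The contig sums cover the three chromosomes only and exclude any plasmid contigs.

```python
# Figures transcribed from the text; everything derived from them is illustrative.
REPORTED = {
    # strain: contigs on chromosomes 1-3, scaffold count, total length (Mb), predicted ORFs
    "CGD1": {"contigs": (18, 5, 2), "scaffolds": 12, "length_mb": 6.6, "orfs": 6564},
    "CGD2": {"contigs": (21, 8, 2), "scaffolds": 8, "length_mb": 6.5, "orfs": 6651},
    "CF1": {"contigs": (360, 180, 50), "scaffolds": 286, "length_mb": 6.3, "orfs": 6319},
    "CF2": {"contigs": (295, 140, 58), "scaffolds": 7, "length_mb": 6.5, "orfs": 6657},
}


def summarize(reported):
    """Return summed chromosomal contigs and ORF density (ORFs per Mb) per strain."""
    summary = {}
    for strain, values in reported.items():
        summary[strain] = {
            "chromosomal_contigs": sum(values["contigs"]),
            "orfs_per_mb": values["orfs"] / values["length_mb"],
        }
    return summary


summary = summarize(REPORTED)
for strain, row in summary.items():
    print(
        f"{strain}: {row['chromosomal_contigs']} chromosomal contigs, "
        f"{row['orfs_per_mb']:.0f} ORFs/Mb"
    )
```

Because the genome lengths are reported to one decimal place, the ORF densities are approximate and should be read only as a rough check that the four strains are comparable.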
No obvious, large genetic variations were present to account for distinct disease associations, suggesting that more-subtle differences are at play. A detailed analysis to identify and characterize these strain-specific characteristics is forthcoming.
Nucleotide sequence accession numbers.
The B. multivorans CGD1 whole-genome shotgun project has been deposited at DDBJ/EMBL/GenBank under the accession number ACFB00000000. The version described in this paper is the first version, ACFB01000000. The B. multivorans CGD2 whole-genome shotgun project has been deposited at DDBJ/EMBL/GenBank under the accession number ACFC00000000. The version described in this paper is the first version, ACFC01000000. The B. multivorans CF1/BAA-247 whole-genome shotgun project has been deposited at DDBJ/EMBL/GenBank under the accession number ALIW00000000. The version described in this paper is the first version, ALIW01000000. The B. multivorans CF2 whole-genome shotgun project has been deposited at DDBJ/EMBL/GenBank under the accession number ALIX00000000. The version described in this paper is the first version, ALIX01000000.
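The accessions above follow the whole-genome shotgun pattern of a four-letter project prefix, a two-digit assembly version, and six digits that are all zero for the master record. As a minimal sketch of that relationship, the helper below derives the versioned master accession from a project accession. The function name `versioned_master` and the regular expression are assumptions introduced here for illustration, and the sketch handles only the four-letter prefix form used in this paper.

```python
import re

# Four-letter prefix, two-digit version, six-digit contig index (all zeros for a master record).
WGS_PATTERN = re.compile(r"^([A-Z]{4})(\d{2})(\d{6})$")


def versioned_master(project_accession, version=1):
    """Return the master accession for a given assembly version, e.g. version 1 -> 'XXXX01000000'."""
    match = WGS_PATTERN.match(project_accession)
    if match is None:
        raise ValueError(f"not a 4-letter WGS accession: {project_accession!r}")
    if not 1 <= version <= 99:
        raise ValueError("version must be between 1 and 99")
    prefix = match.group(1)
    return f"{prefix}{version:02d}000000"


# Project accessions taken from the text.
for project in ("ACFB00000000", "ACFC00000000", "ALIW00000000", "ALIX00000000"):
    print(project, "->", versioned_master(project))
```

Applied to the four project accessions, this reproduces the first-version accessions listed above.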