Bacterial infections of the lungs of cystic fibrosis (CF) patients cause major complications in the treatment of this common genetic disease. Burkholderia cenocepacia infection is particularly problematic since this organism has high levels of antibiotic resistance, making it difficult to eradicate; the resulting chronic infections are associated with severe declines in lung function and increased mortality rates. B. cenocepacia strain J2315 was isolated from a CF patient and is a member of the epidemic ET12 lineage that originated in Canada or the United Kingdom and spread to Europe. The 8.06-Mb genome of this highly transmissible pathogen comprises three circular chromosomes and a plasmid and encodes a broad array of functions typical of this metabolically versatile genus, as well as numerous virulence and drug resistance functions. Although B. cenocepacia strains can be isolated from soil and can be pathogenic to both plants and man, J2315 is representative of a lineage of B. cenocepacia rarely isolated from the environment and which spreads between CF patients. Comparative analysis revealed that ca. 21% of the genome is unique in comparison to other strains of B. cenocepacia, highlighting the genomic plasticity of this species. Pseudogenes in virulence determinants suggest that the pathogenic response of J2315 may have been recently selected to promote persistence in the CF lung. The J2315 genome contains evidence that its unique and highly adapted genetic content has played a significant role in its success as an epidemic CF pathogen.
Burkholderia cenocepacia is the most clinically important member of the B. cepacia complex (BCC) group of opportunistic pathogens that cause lung infections in cystic fibrosis (CF) patients (83, 140). The BCC (originally described as Pseudomonas cepacia) emerged as significant CF pathogens in the early 1980s when a minority of infected patients exhibited rapid clinical deterioration due to necrotizing pneumonia and sepsis, resulting in early death (54). This fatal decline in clinical condition became known as “cepacia syndrome” and has not been observed with any other CF pathogen. The key determinants associated with this syndrome are not clear; clonal isolates can be recovered from patients with or without cepacia syndrome, suggesting that both bacterial and host factors play important roles in determining clinical prognosis (44, 57). During the 1990s a highly transmissible epidemic B. cenocepacia lineage emerged that was readily spread between individuals with CF (44); multilocus enzyme electrophoresis designated it as electrophoretic type 12 (ET12) (56). Subsequent studies showed that this B. cenocepacia strain was widespread across Canadian (80, 126), United Kingdom (44), and European (140) CF communities and was suggested to have spread through patient-to-patient contacts, including those at CF summer camps (44).
At least 17 species comprise the BCC (79, 141, 142), a diverse collection of genetically distinct but phenotypically similar strains that includes bioremediation and biocontrol strains, as well as plant, animal, and human pathogens (83). Although strains from each of the BCC species have been isolated from CF infection (83), epidemics are largely attributed to B. cenocepacia (25, 27) and Burkholderia multivorans (120) strains. Phylogenetic analysis of the recA gene subdivides B. cenocepacia into four distinct subgroups (141), with subgroup IIIA containing the ET12 epidemic strain, associated with “cepacia syndrome” (54), clinical deterioration, increased mortality (44, 57), and the ability to superinfect over existing B. multivorans lung infection (84). Virulence markers, such as the B. cepacia epidemic strain marker and the B. cenocepacia island (CCI), are more frequently associated with B. cenocepacia IIIA strains than other B. cenocepacia recA subgroups (10). In addition, ET12 isolates possess the cable pilus (83, 137). Unlike other B. cenocepacia subgroups, which can frequently be recovered from the natural environment (31, 70), very few environmental B. cenocepacia IIIA strains have been described (8), suggesting a definite shift from a soil saprophyte to host-associated pathogen lifestyle.
The genome of B. cenocepacia strain J2315, a multidrug-resistant CF patient isolate belonging to the ET12 lineage (44, 100), has been sequenced. The genomic analysis of B. cenocepacia J2315 provides insights into the success of this strain and how the ET12 lineage appears to have recently adapted to its clinical niche in human infection.
B. cenocepacia strain J2315 (CF5610) was isolated in 1989 from the sputum of a CF patient in Edinburgh, who was the United Kingdom index case of the highly transmissible ET12 lineage (44). J2315 is resistant to the aminoglycosides amikacin and tobramycin, the macrolide azithromycin, the β-lactams imipenem and piperacillin, and cotrimoxazole (trimethoprim-sulfamethoxazole) and also exhibits intermediate resistance to the fluoroquinolone ciprofloxacin. Strain J2315 has been deposited as LMG 16656 in the BCCM/LMG Bacteria Collection.
Additional B. cenocepacia strains used in the present study were K56-2*, BC7*, LMG 13307 (BCC0162), CEP0791 (BCC0077), LMG 13320 (BCC0179), FC0504 (BCC0313), LMG 18827* (BCC0016), BCC1261, and CEP0826 (BCC0222). (An asterisk [*] indicates strains that are part of the published BCC strains panel.) All strains had previously been sequence typed (9).
Strain J2315 was grown, and DNA was extracted exactly as described previously (82). Sequence data were obtained from 215,165 end sequences (giving approximately 11.9-fold coverage) derived from m13mp18 and pUC18 genomic shotgun libraries (with insert sizes of 1 to 6 kb) using BigDye terminator chemistry on ABI 3700 automated sequencers. A total of 4,300 end sequences from a large insert bacterial artificial chromosome library (with insert sizes of 10 to 20 kb) were used as a scaffold. All identified repeats were bridged by read-pairs or end-sequenced PCR products.
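The relationship between the read count and the fold coverage quoted above can be checked with simple arithmetic. The following is a minimal sketch, assuming the 8.06-Mb genome size given in the abstract; the mean read length is not reported in the text and is derived here only as an implied value, and the function name is hypothetical.

```python
# Values taken from the text: 215,165 end sequences, ~11.9-fold coverage,
# and an 8.06-Mb genome. The mean read length is NOT stated in the paper;
# it is back-calculated here as an implied figure only.
N_READS = 215_165
FOLD_COVERAGE = 11.9
GENOME_BP = 8_060_000


def implied_mean_read_length(n_reads: int, fold_coverage: float, genome_bp: int) -> float:
    """Solve coverage = n_reads * read_length / genome_bp for read_length."""
    return fold_coverage * genome_bp / n_reads


read_len = implied_mean_read_length(N_READS, FOLD_COVERAGE, GENOME_BP)
print(f"implied mean usable read length: {read_len:.0f} bp")
```

An implied length of a few hundred base pairs is in the range expected for capillary (Sanger) reads of that period.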
The sequence was annotated by using Artemis software (112). Initial coding sequence (CDS) predictions were performed by using Orpheus (40), Glimmer2 (34), and EasyGene (69) software. These predictions were amalgamated, and codon usage, positional base preference methods, and comparisons to the nonredundant protein databases using BLAST (4) and FASTA (106) software were used to refine the predictions. The entire DNA sequence was also compared in all six reading frames against UniProt, using BLASTX (4) to identify any possible CDSs previously missed. Protein motifs were identified by using Pfam (12) and Prosite (37), transmembrane domains were identified with TMHMM (65), and signal sequences were identified with SignalP version 2.0 (98). rRNAs were identified by using BLASTN (4) alignment to defined rRNAs from the EMBL nucleotide database; tRNAs were identified by using tRNAscan-SE (75); stable RNAs were identified by using Rfam (46).
Comparison of genome sequences was facilitated by using Artemis Comparison Tool (22), which enabled visualization of BLASTN and TBLASTX comparisons (4). Orthologous proteins were identified as reciprocal best matches by using FASTA (106) with manual curation. Pseudogenes had one or more mutations that would prevent correct translation; all inactivating mutations were checked against original sequencing data.
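The reciprocal-best-match criterion described above can be sketched as follows. This is an illustration only, not the authors' pipeline: it assumes that similarity search results (for example, from FASTA) have already been parsed into (query, subject, score) tuples, the protein identifiers in the usage example are hypothetical, and the manual curation step is not modeled.

```python
from typing import Dict, Iterable, Set, Tuple

# (query_id, subject_id, score); a higher score means a better match.
Hit = Tuple[str, str, float]


def best_hits(hits: Iterable[Hit]) -> Dict[str, str]:
    """Return the top-scoring subject for each query (first seen wins ties)."""
    best: Dict[str, Tuple[str, float]] = {}
    for query, subject, score in hits:
        if query not in best or score > best[query][1]:
            best[query] = (subject, score)
    return {query: subject for query, (subject, _) in best.items()}


def reciprocal_best_matches(a_vs_b: Iterable[Hit], b_vs_a: Iterable[Hit]) -> Set[Tuple[str, str]]:
    """Pairs (a, b) where b is a's best hit and a is b's best hit."""
    forward = best_hits(a_vs_b)
    reverse = best_hits(b_vs_a)
    return {(a, b) for a, b in forward.items() if reverse.get(b) == a}


# Hypothetical identifiers, for illustration only.
a_vs_b = [("A1", "B1", 900.0), ("A1", "B2", 300.0), ("A2", "B2", 450.0)]
b_vs_a = [("B1", "A1", 880.0), ("B2", "A1", 310.0), ("B2", "A2", 200.0)]
pairs = reciprocal_best_matches(a_vs_b, b_vs_a)
print(sorted(pairs))
```

In this toy example A2's best match is B2, but B2's best match is A1, so the pair is not reciprocal and is excluded.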
The J2315 genome was compared to B. vietnamiensis strain G4 (accession numbers CP000614, CP000615, and CP000616) (97), B. contaminans strain 383 (CP000151, CP000152, and CP000153) (127, 141), B. ambifaria strain AMMD (CP000440, CP000441, and CP000442) (26), B. cenocepacia strains AU1054 (CP000378, CP000379, and CP000380) (http://genome.jgi-psf.org/finished_microbes/burca/burca.home.html) and HI2424 (CP000458, CP000459, and CP000460) (70), B. pseudomallei strain K96243 (BX571965 and BX571966) (51), B. mallei strain ATCC 23344 (CP000010 and CP000011) (99), B. thailandensis strain E264 (CP000086 and CP000085) (148), B. xenovorans strain LB400 (CP000270, CP000271, and CP000272) (23), and Ralstonia solanacearum strain GMI1000 (118).
PCR amplification was performed by using Platinum Pfx DNA polymerase (Invitrogen) according to the supplied protocols, with the optional addition of 1/10 enhancer solution. Amplification consisted of 94°C for 10 min, followed by 40 cycles of 94°C for 30 s, a suitable annealing temperature for 30 s, and 68°C for 1 min per kb. A final extension of 10 min at 68°C was used. The primers, along with the annealing temperature(s) used, were as follows: BCAL3125 (AATCGGAACAGGTTGCACTC and AAACTGGAATGCGAAGATGC), 60°C; BCAL3223 (ACCGATGTCTTCCTGTTTGG and AGCGGATGGTTCTTGATGAC), 60 to 68°C; BCAL3517 (GCACGTTGATTGTTTCTTTGC and AATCGGGATCGACCTTGAC), 63 to 68°C; BCAM0856 (TCGAAATACTTGTGCGCTTG and ACAGGAAGTGGTAGCCGATG), 68°C; and BCAM2228 (GAACCTGACGGTGCTGAAC and GTAGACGGACAGGTCGAAGC), 68°C.
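The cycling conditions above fix every step except the extension time, which scales with amplicon length at 1 min per kb. A minimal sketch of the resulting program duration follows; the amplicon length in the example is hypothetical (product sizes are not given in the text), and instrument ramp times are ignored.

```python
def cycling_minutes(amplicon_kb: float, cycles: int = 40) -> float:
    """Total hold time (min) for the program described in the text.

    Steps: 94°C for 10 min; `cycles` x (94°C 30 s, annealing 30 s,
    68°C at 1 min per kb); final extension at 68°C for 10 min.
    Ramp times between temperatures are not included.
    """
    initial_denaturation = 10.0
    denaturation = 0.5
    annealing = 0.5
    extension = 1.0 * amplicon_kb  # 1 min per kb of expected product
    final_extension = 10.0
    per_cycle = denaturation + annealing + extension
    return initial_denaturation + cycles * per_cycle + final_extension


# Hypothetical 1.5-kb product, for illustration only.
print(cycling_minutes(1.5))
```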
The sequence and annotation of the B. cenocepacia strain J2315 genome have been deposited in the EMBL database under accession numbers AM747720, AM747721, AM747722, and AM747723.
The complete genome of B. cenocepacia strain J2315 consists of three circular chromosomes of 3,870,082, 3,217,062, and 875,977 bp and a plasmid of 92,661 bp (Fig. 1). These four replicons encode 3,537, 2,849, 776, and 99 predicted CDSs, respectively (for a summary of the features of the replicons, see Table 1), of which 126 are pseudogenes or partial genes. Identification of essential genes on chromosomes 2 and 3 led to designation of these components of the genome as chromosomes rather than megaplasmids.
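The replicon figures quoted above can be tallied to give genome-wide totals and a rough CDS density per replicon. The sketch below uses only the sizes and CDS counts stated in the text; the density and pseudogene fraction are derived values and are not reported by the authors.

```python
# (size in bp, predicted CDSs) for each replicon, as stated in the text.
REPLICONS = {
    "chromosome 1": (3_870_082, 3_537),
    "chromosome 2": (3_217_062, 2_849),
    "chromosome 3": (875_977, 776),
    "plasmid": (92_661, 99),
}
PSEUDOGENES_OR_PARTIAL = 126

total_bp = sum(size for size, _ in REPLICONS.values())
total_cds = sum(cds for _, cds in REPLICONS.values())

for name, (size, cds) in REPLICONS.items():
    # CDSs per kb is a derived summary statistic, not a value from the paper.
    print(f"{name}: {cds / (size / 1000):.2f} CDSs per kb")

print(f"total: {total_bp} bp, {total_cds} CDSs")
print(f"pseudogenes or partial genes: {100 * PSEUDOGENES_OR_PARTIAL / total_cds:.1f}% of CDSs")
```

The total of 8,055,782 bp agrees with the 8.06-Mb genome size given in the abstract.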
Inter-replicon comparisons revealed very little extended similarity except in the regions of the rRNA clusters (data not shown). Intrareplicon comparison revealed that chromosome 1 contains a 57-kb perfect duplication (BCAL0969 to BCAL1026 and BCAL2901 to BCAL2846). The duplicated regions are at different locations on the chromosome, and each contains 57 CDSs, leading to an increase in the gene dosage of CDSs that encode a diverse collection of functions, including molybdopterin biosynthesis proteins, RNase E, ribosomal protein, fatty acid/phospholipid synthesis proteins, sigma-E factor and regulon, GTP-binding protein LepA, signal peptidase I, RNase III, DNA repair protein RecO, elongation factor P, and CDP-diacylglycerol-glycerol-3-phosphate 3-phosphatidyltransferase.
Analysis of the predicted functions of the CDSs on the chromosomes revealed distinct partitioning of functions, as has been previously seen in other Burkholderiaceae (51): chromosome 1 contains a higher proportion of CDSs involved in core functions (cell division, central metabolism, and other “housekeeping” functions), whereas chromosomes 2 and 3 contain a greater proportion of CDSs encoding accessory functions, such as protective responses and horizontal gene transfer, and a greater proportion of CDSs with unknown functions (Fig. 2).
The J2315 genome was compared to five other complete BCC genomes in the public databases: Burkholderia vietnamiensis strain G4, an aromatic hydrocarbon degrading environmental isolate (97); Burkholderia contaminans strain 383, a soil isolate (127, 141); Burkholderia ambifaria strain AMMD, a plant-associated biocontrol strain (26); and B. cenocepacia IIIB strains AU1054 (http://genome.jgi-psf.org/finished_microbes/burca/burca.home.html) and HI2424 (70), isolated from a CF patient and soil, respectively. Both of these B. cenocepacia strains are representatives of the B. cenocepacia PHDC clonal lineage (25) that is widely distributed in the United States (71). In addition, the genomes of four other Burkholderia species were compared: Burkholderia pseudomallei (51) and Burkholderia mallei (99), biothreat agents that cause melioidosis and glanders, respectively; Burkholderia thailandensis (148), a soil saprophyte related to B. pseudomallei and B. mallei; and Burkholderia xenovorans (23), a nonpathogenic soil isolate that degrades polychlorinated biphenyl (PCB) compounds. Ralstonia solanacearum (118), a plant pathogenic member of the Burkholderiaceae, was also included as an outlier.
The number of orthologous CDSs identified by best reciprocal FASTA in the genome of J2315 correlated with taxonomic relatedness (Fig. 3) (141); the largest number of orthologs were identified in the BCC (78 to 63% of total CDSs), followed by other Burkholderia species (56 to 50%), and Ralstonia solanacearum (37%). The distribution of orthologs on the different replicons is similar to that seen in other Burkholderia species, where the level of orthology is greatest on the largest chromosome, with the secondary chromosomes being progressively more divergent (51). The relative diversity of the chromosomes was also evident in pairwise alignments that illustrate regions of similarity and the overall genome structure. A comparison of concatenated genomes of B. cenocepacia J2315, B. pseudomallei K96243, and B. xenovorans LB400 shows that, of all the chromosomes, chromosome 1 displays the greatest level of conservation, both in the number and the order of matches (see Fig. S1 in the supplemental material). The largest chromosomes of these three Burkholderia species contain colinear and inverted blocks of similarity, suggesting that these replicons have undergone several rearrangement events since they diverged from a common ancestor. The second chromosome exhibits lower levels of overall conservation. Although there are matches between the third chromosomes of B. cenocepacia and B. xenovorans (the genome of B. pseudomallei only contains two replicons), there is no detectable conservation of gene order.
Pairwise genome comparisons of the B. cenocepacia strain J2315 genome to the other B. cenocepacia strains identified regions of difference (RODs) comprising ca. 21% of the DNA in J2315. These include genomic islands (Table 2 and Fig. 1) that are likely to have arisen from recent horizontal gene transfer (9.3% of the chromosomal DNA) and encompass mobile genetic elements (MGEs). In addition to the genomic islands, the J2315 genome contains 79 insertion sequence (IS) elements (Table 1 and see Table S1 in the supplemental material). Genomic islands were defined as regions that displayed anomalies in %G+C content or dinucleotide frequency signature (which is indicative of very recent lateral transfer) and/or contained CDSs with similarities to genes associated with MGEs such as bacteriophages, transposons, and plasmids. Boundaries of genomic islands were mapped by using comparative genomic analysis. Other RODs in the J2315 genome (11.7% of the chromosomal DNA) include indel regions that represent lineage-specific DNA insertions or deletions and allelic variants that have divergent sequences at the same locus (Fig. 1). For the purpose of our analysis, we have not included RODs that do not include at least one complete CDS. Although these other RODs were identified as being unique in comparison to B. cenocepacia strains AU1054 and HI2424, many of these regions are ancestral regions that have been deleted in the other B. cenocepacia genomes relative to J2315 and the other BCC. For example, if B. contaminans strain 383 is included in the comparison, the unique component of the J2315 genome falls to 7.0% (see Table S2 in the supplemental material).
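One of the criteria named above, an anomaly in %G+C content, can be illustrated with a sliding-window scan. This is a minimal sketch and not the authors' method: the window size, step, and deviation threshold are hypothetical parameters, the toy sequence is invented, and the dinucleotide signature and MGE-similarity criteria are not modeled.

```python
from typing import List, Tuple


def gc_fraction(seq: str) -> float:
    """Fraction of G and C bases in a sequence (0.0 for an empty string)."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return (seq.count("G") + seq.count("C")) / len(seq)


def anomalous_windows(
    seq: str, window: int = 1000, step: int = 500, max_dev: float = 0.08
) -> List[Tuple[int, int, float]]:
    """Return (start, end, gc) for windows whose G+C fraction differs from
    the whole-sequence G+C fraction by more than max_dev.

    window, step, and max_dev are illustrative defaults, not values from
    the paper.
    """
    overall = gc_fraction(seq)
    flagged = []
    for start in range(0, max(len(seq) - window, 0) + 1, step):
        chunk = seq[start:start + window]
        gc = gc_fraction(chunk)
        if abs(gc - overall) > max_dev:
            flagged.append((start, start + len(chunk), gc))
    return flagged


# Toy sequence: GC-rich background with a short AT-rich insert.
toy = "GC" * 50 + "AT" * 10 + "GC" * 50
flagged = anomalous_windows(toy, window=20, step=10, max_dev=0.3)
print(flagged)
```

Overlapping flagged windows would normally be merged into a single candidate region, and, as the text notes, island boundaries were then mapped by comparative genomic analysis rather than by composition alone.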
The ET12 lineage of B. cenocepacia emerged recently and proved very successful at spreading between patients and causing disease (83). A possible contributory factor in its rise could be the horizontal transfer of DNA. Fourteen regions in the J2315 genome were identified as putative genomic islands (Table 2), all fourteen of which are absent from the genomes of B. cenocepacia AU1054 and HI2424.
The CCI, identified in the J2315 genome as genomic island 11 (BcenGI11; Table 2), has been shown to play a role in infection (10, 84). When originally described, comparative analysis of other BCC genomes was not available to allow the boundaries of the island to be accurately predicted. The current analysis defines the CCI as a 44-kb region, 12 kb and six CDSs larger than its original description (10). New functions attributed to the island include arsenic resistance, antibiotic resistance, ion and sulfate family transporter, and stress response, in addition to the fatty acid metabolism, amino acid transport and utilization, and various regulators that include an N-acyl-homoserine lactone-dependent quorum-sensing system originally described (10).
The genome also provides evidence that the pan genome of B. cenocepacia encompasses elements that may circulate in the wider bacterial population, as several of the genomic islands are similar to elements identified in other Burkholderia species. The J2315 genome contains at least five prophages, one of which, BcenGI1, exhibits extended mosaic similarity to the K96243 prophage in the B. pseudomallei K96243 genome (BP_GI1; see Fig. S2A in the supplemental material) (51).
Comparative genomic analysis identified related genomic islands that integrate at orthologous sites in different species. BcenGI2 is a 16.4-kb genomic island that contains CDSs with similarity to plasmid conjugal transfer proteins, suggesting that it may be an integrated conjugative element (21). BcenGI2 is integrated at a tRNAAla gene and has some similarity to an island (BP_GI11) integrated at the orthologous site in the B. pseudomallei K96243 genome (see Fig. S2B in the supplemental material). This locus may be a hot spot for the traffic of related islands in Burkholderia.
The recent ecology of the ET12 lineage of B. cenocepacia is that of a human-associated pathogen, and as such the genome of strain J2315 appears to be well equipped with functions associated with virulence in the CF lung (for a summary, see Table 3). There is also evidence in the genome of the wider host associations of this species, underlying its environmental origins. Orthologous matches to putative J2315 virulence factors were found in the other Burkholderia genomes investigated, which included environmental bacteria. For example, ~80% of the J2315 virulence functions have orthologous matches to other B. cenocepacia strains, ~74% have matches to B. contaminans strain 383, and ~68% have matches to B. pseudomallei. Many of these functions therefore represent Burkholderia-wide functions, which may promote survival in challenging and complex environments such as the soil and rhizosphere but may also have utility in the CF lung. In addition, comparative analysis has highlighted virulence determinants in variable regions of the J2315 genome (Table 3 and see Table S2 in the supplemental material), suggesting that this strain may have supplemented its core virulence determinants with accessory virulence functions to enhance its disease-causing ability.
Exoenzymes produced by B. cenocepacia play an important role in modulating host cell interactions. Two secreted zinc metalloproteases, ZmpA and ZmpB (Table 3), have been found to have proteolytic activity against a range of host molecules and have been implicated in the virulence of B. cenocepacia (29, 62, 63). Phospholipases are widely distributed in bacteria and have been shown to mediate various cellular functions, including membrane maintenance, cellular turnover, and inflammatory response. Production of phospholipase C is linked with CF isolates from patients of poor clinical status (68). The J2315 genome encodes five homologs of Pseudomonas aeruginosa phospholipase C proteins (Table 3). The redundancy of these phospholipid-degrading enzymes suggests that there is some functional specificity. To this end, studies of the Plc-1 and Plc-2 phospholipases C from B. pseudomallei (orthologues of which are present in the J2315 genome; BCAL1046 and BCAM2429, respectively) have demonstrated that although they both hydrolyze phospholipid phosphatidylcholine and sphingomyelin, they exhibit marked differences in their cytotoxicity (64).
In addition to the five phospholipase C homologs, the J2315 genome contains a putative phosphatidylinositol-specific phospholipase C (PI-PLC; BCAM1969). Analysis of the taxonomic distribution of proteins containing the PI-PLC domain PF00388 reveals a very limited occurrence; among the gram-negative bacteria, Burkholderia is the only genus within which these proteins have been found, and among the gram-positive bacteria, they have been found in an actinobacterium and some firmicutes, including pathogenic bacilli, Staphylococcus aureus, and Listeria monocytogenes (for the species distribution, see http://pfam.sanger.ac.uk/family?acc=PF00388). In this latter group of pathogens, PI-PLCs have been shown to have a role in virulence (146, 149).
The J2315 genome also contains putative secreted proteins similar to virulence factors produced by bacterial plant pathogens associated with the degradation of plant tissue. Two CDSs encode polygalacturonases, exoproteins that degrade pectin, a major component of plant cell walls; BCAM2783 and BCAS0196 are 69.1 and 31.9% identical to polygalacturonases from R. solanacearum (43) and Agrobacterium vitis (50), respectively. In addition to these degradative enzymes, the genome also contains a locus (BCAM0153 to BCAM0156) encoding additional pectin degradation components.
Further evidence of the plant associations of B. cenocepacia is found in the secretion systems. The J2315 genome encodes a type IV secretion system (T4SS) associated with disease in plants; the plasmid-encoded T4SS (Table 3) secretes plant cytotoxic proteins responsible for plant tissue watersoaking (ptw) on onions (35). In addition, there is a cluster on chromosome 2 (Table 3) that is similar to the ptw cluster and other T4SSs. The function of this cluster (vir) is unclear: mutants do not affect expression of the ptw phenotype. The organization of the vir cluster is similar to that of the virB clusters of Brucella abortus and Agrobacterium tumefaciens, although it lacks homologues of virB5 and virB7. The cluster contains two additional CDSs between the virB6 (BCAM0328) and virB8 (BCAM0331) homologues, similar to traF (BCAM0329) and traI (BCAM0330) components of the pSB102 plasmid conjugation system. These tra system components are functionally equivalent to virB5 and virB7. It is unlikely that this cluster is functional in J2315, since the virD4 homologue (BCAM0335) contains a frameshift mutation.
The type III secretion system (T3SS; Table 3) is associated with pathogenesis: mutants in the T3SS demonstrate reduced virulence in a murine model of infection (132) and in a Caenorhabditis elegans killing assay (86). T3SSs have been shown to be important in the intracellular survival of several pathogens; however, there is no evidence that this is the case in B. cenocepacia: internalization and survival of J2315 T3SS mutants in macrophages was the same as for the wild type (67).
There are four type V secretion (T5S) proteins in the genome (Table 3). Two of these autotransporters contain pertactin domains (Pfam PF03212; BCAL3353 and BCAM0183), and the other two contain hemagglutinin repeat domains (Pfam PF05594; BCAM2169 and BCAS0321).
The extensive array of secretion systems has further been enhanced by the identification of a type VI secretion (T6S) system on chromosome 2 (Table 3). Three T6S system clusters have been identified in P. aeruginosa (33), one of which (HSI-I) has been demonstrated to be essential in the chronic rat lung infection model (108). HSI-I exports Hcp1, a hexameric protein that has been detected in pulmonary secretions of CF patients, and Hcp1-specific antibodies have been detected in their sera (93).
In contrast to the other pathogenic Burkholderia species sequenced, B. cenocepacia J2315 does not exhibit a large redundancy of secretion systems. For example, B. mallei and B. pseudomallei possess four and six T6SSs, respectively (119), and two and three T3SSs, respectively (51, 99).
Lipopolysaccharide (LPS) produced by B. cenocepacia has an important role in both disease and resistance to antimicrobial peptides. The LPS of the ET12 strain C1359 has been demonstrated to be endotoxic and to stimulate tumor necrosis factor production in greater quantities than that of P. aeruginosa (121). Three clusters in the J2315 genome are associated with the production of the core (BCAL2402 to BCAL2408), O antigen (BCAL3110 to BCAL3125) (101), and lipid A modification (BCAL1929 to BCAL1935) of LPS. However, strain J2315 has lost its ability to make complete LPS O antigen, due to an IS insertion in the glycosyltransferase BCAL3125 (101).
The structure and composition of B. cenocepacia LPS contributes to the intrinsic resistance to aminoglycoside (30, 91) and polymyxin. The presence of 4-amino-4-deoxy-l-arabinose moieties within the inner core region has been shown to reduce the binding of cationic antibiotics such as polymyxin B to B. cenocepacia LPS (74, 122). A locus on chromosome 1 (BCAL1929 to BCAL1935) is similar to a cluster of six CDSs in E. coli and Salmonella that have been shown to direct the transfer of 4-amino-4-deoxy-l-arabinose to lipid A in polymyxin-resistant mutants (134). Interestingly, in pathogens such as Salmonella, polymyxin B acts as an antagonist of LPS pathological activity; however, in B. cenocepacia this antibiotic enhances its activity (122). A recent study showed that this cluster is essential for the viability of B. cenocepacia and that a reduction in viability was accompanied by changes in cell morphology (102).
A cluster associated with the production of cepacian (B. cenocepacia-specific EPS) has been identified (Table 3) (92). J2315 does not produce cepacian; a CDS (BCAM0856) in the cluster contains a frameshift mutation (11-bp deletion). Although J2315 is described phenotypically as nonmucoid (49), the genome contains several other loci that encode putative EPS-related functions (Table 3), suggesting that J2315 may have the ability to produce capsular material under some environmental conditions. One of these clusters (BCAL3217 to BCAL3246) (105) is similar to the capsular polysaccharide cluster of B. pseudomallei K96243 (51), containing two regions of similarity (BCAL3217 to BCAL3223 and BCAL3240 to BCAL3246) separated by a block of divergent sequence (BCAL3227 to BCAL3239). This cluster is probably not expressed in J2315 since it contains an IS element that disrupts a putative capsule polysaccharide biosynthesis/export protein (BCAL3223).
The adherence of pathogenic bacteria to host cells is often associated with pili, fimbriae, and adhesins. These surface-expressed structures can modulate interactions with host cells and other bacterial cells and can target cells to a site of infection. Members of the BCC possess an array of different types of appendage pili: electron microscopy studies identified five morphologically distinct classes of appendage pili in BCC strains (42).
The cable pilus is associated with the ET12 B. cenocepacia lineage (130) and modulates binding to host molecules such as cytokeratin 13 (116) and mucins (115) that are abundant in the CF lung. The cable pilus gene cluster is located on chromosome 2 and consists of seven CDSs (Table 3), four of which encode structural and processing components (cblDCAB) and three of which encode regulatory components (cblRTS). The pili are arranged as large peritrichous individual fibers 2 to 4 μm in length and are associated with a 22-kDa adhesion protein (AdhA) (137). Both the cbl cluster and adhA are in RODs, as is one of the three chaperone-usher-type fimbria clusters contained within the J2315 genome (Table 3). The first of these clusters encodes a fimbrial protein (BCAL1677) similar to the type I fimbrial protein FimA from E. coli (60). The other two clusters do not contain homologs of characterized fimbrial proteins, although both clusters contain exported proteins (BCAL1826; BCAL2634a and BCAL2635) which may be fimbrial components.
Type IV pili have been shown to modulate a variety of processes, including adhesion, twitching motility, and biofilm initiation and development (20). In B. pseudomallei a type IV pilin deletion mutant (pilA mutant) was attenuated in mouse and nematode models of virulence (36). Type IV pili have also been shown to play a role in the adherence of B. pseudomallei to eukaryotic cells. Microcolony formation, a key process in cell adherence, is reduced in pilA mutants (15). In P. aeruginosa, type IV pili have been shown to bind to human epithelial cells (53), as well as induce apoptosis (55). The observation that these pili also bind DNA (143) suggests that they may play an important role in the formation of biofilms in the CF lung, where DNA is an abundant matrix molecule. There are several loci in the J2315 genome that encode components of a type IVa pilus, as well as two clusters that encode Flp-type pili (Table 3) (58).
The J2315 genome contains eight BuHA family proteins (Table 3), five of which are unique to J2315 and are present in RODs (see Table S2 in the supplemental material). This family of autotransporting membrane proteins contains a C-terminal YadA domain, together with HIM and Hep_Hag domains, a domain architecture that is shared with hemagglutinins and invasins that mediate bacterial interactions with host cells or extracellular matrix proteins. In B. mallei, BuHA proteins expressed in vivo during experimental equine glanders infection were found to be immunodominant (131). The distribution of these proteins is widespread in gram-negative bacteria; however, the genomes of Burkholderia species, especially the pathogenic members of the genus, contain greater numbers of members of this family.
Iron is vital for life; however, much of the iron in the human body is complexed by compounds such as ferritin. In order for B. cenocepacia to survive in the host, iron must be scavenged via the production and uptake of siderophores. Biosynthesis clusters for ornibactin, salicylic acid (SA), and pyochelin siderophores are present in the J2315 genome (Table 3). B. cenocepacia strains produce the iron-chelating siderophores ornibactin, pyochelin, and SA in a strain-dependent manner (145). Ornibactin has been shown to be the most important of these in CF lung pathogenesis and consists of a mixture of modified tetrapeptides with three different side groups (128, 145). The ornibactin biosynthetic cluster is located on chromosome 1 (1), whereas the SA and pyochelin clusters are situated on chromosome 2 (111). The ability of J2315 to produce pyochelin is compromised since the pyochelin biosynthesis gene pchF (BCAM2230) contains a frameshift mutation. However, the genes encoding the transport and utilization of pyochelin in J2315 are intact and therefore probably functional.
Flagella have been shown to play an important role in the pathogenesis of B. cenocepacia, contributing to the invasion of lung epithelial cells (133) and modulating the immune response via the Toll-like receptor 5 (138). Five gene clusters on chromosome 1 together encode the components of a complete flagellum system (Table 3). Two duplicated components of this system are encoded on the other replicons: flagellar basal body protein FlgE2 (BCAM0987; paralog of BCAL0567) and flagellar hook-associated protein FliD2 (BCAS0104; paralog of BCAL0113). In P. aeruginosa two distinct flagellar hook-associated proteins have been identified (6) and shown to be antigenically distinct. In addition to being structural components of the flagella, flagellar cap proteins also bind mucin (7), an important initial event in the colonization of the CF lung. The additional copy of an antigenically distinct fliD therefore provides J2315 with variants, which it may use to evade the host immune system during the initial stage of infection.
Intracellular survival of BCC bacteria within macrophages may contribute to bacterial persistence within the lung and airways of patients with CF and to sustained tissue inflammation (17, 87, 113). Resistance to oxidative stress is often associated with the ability of bacterial pathogens to survive within macrophages. Two of the most potent mechanisms utilized by activated macrophages to kill bacteria involve the production of reactive oxygen and reactive nitrogen oxide species. The detoxification of nitric oxide in Salmonella enterica serovar Typhimurium involves the flavohemoglobin HmpA (129), and the J2315 genome contains a homologue of hmpA (Table 3). The detoxification of superoxide requires conversion of superoxide to hydrogen peroxide by superoxide dismutases, encoded by sod genes, followed by destruction of the hydrogen peroxide by catalases, encoded by kat genes. The J2315 genome contains homologues of sodB (BCAL2757), sodC (BCAL2643), katA (BCAM2107), and katB (BCAL3299), as well as an additional catalase (BCAM0931) and a manganese-containing catalase (BCAS0635). There are also five NRAMP (natural resistance-associated macrophage protein; Table 3) family proteins in the genome. These divalent transition metal transporters are involved in iron metabolism and play a role in bacterial response to reactive oxygen species (59, 144).
Strains of BCC exhibit high levels of antibiotic resistance, so much so that some BCC strains can use penicillin G as a sole carbon source (14). The drug resistances of strains infecting CF patients are often considered markers of mortality and in this way are considered virulence factors. In the BCC, resistance to multiple antibiotics is produced by multiple mechanisms that include alterations in cell permeability, the production of modifying or degradatory enzymes, and antibiotic target alteration. Other mechanisms of resistance may also be related to diminished antibiotic access (16), including drug efflux (150). J2315 is resistant to the aminoglycosides amikacin and tobramycin, the macrolide azithromycin, the β-lactams imipenem and piperacillin, and cotrimoxazole (trimethoprim-sulfamethoxazole). The strain also exhibits intermediate resistance to the fluoroquinolone ciprofloxacin.
Resistance to the β-lactam antibiotics appears to be caused by synergistic mechanisms, including the induction of chromosomal β-lactamases (109, 135) and decreased drug access (5, 104). There are at least four β-lactamases encoded in the J2315 genome: two class A, one class C, and one class D (Table 4). In addition, there are several β-lactamase family proteins containing β-lactamase Pfam domains (PF00144) that may have antimicrobial resistance functions.
Efflux systems can modulate broad-spectrum antibiotic resistance, as well as resistance to specific antimicrobial compounds. Multiple transport systems belonging to six families associated with drug resistance were identified in the J2315 genome: MFS (major facilitator superfamily), ABC (ATP binding cassette) family, RND (resistance nodulation division) family, MATE (multidrug and toxic compound extrusion) family, SMR (small multidrug resistance) family, and fusaric acid resistance family proteins (Table 4). Some of these families have many members in the J2315 genome; however, it is unclear from in silico analysis alone whether or not they play a role in antibiotic resistance. For example, 16 CDSs were identified in the J2315 genome that encode efflux pumps belonging to the RND family. Two of these CDSs belong to systems that have been shown to be associated with drug resistance in B. cenocepacia: BCAM2550 (ceoB) is a component of a system that confers chloramphenicol, trimethoprim, and ciprofloxacin resistance (19, 95), and BCAS0765 is associated with resistance to fluoroquinolones, tetraphenylphosphonium, and streptomycin, as well as to ethidium bromide (47). In addition, the genome contains orthologues of RND efflux proteins that have been shown to mediate resistance to antibiotics (2, 77, 78, 90, 94), metals (39, 66, 103), and other antimicrobial compounds (48) in other organisms.
Comparisons with other sequenced strains show that the J2315 genome contains strain-specific CDSs (Table 4) that may contribute to its elevated drug resistance, for example, a putative fusaric acid efflux system, RND family efflux systems, an aminoglycoside 3′-phosphotransferase, and a multiple antibiotic resistance protein.
Considering the pathogenic pedigree of J2315, it was surprising that several virulence determinants that have been shown to be important for B. cenocepacia pathogenicity were pseudogenes in J2315 (Table 5). To discover how widely distributed these mutations were, we screened five of the virulence factor pseudogene loci in B. cenocepacia strains. Multilocus sequence typing was used to select strains, as it provided additional resolution for distinguishing strains within, and related to, the ET12 lineage (Table 6) (9). Four strains belonging to the same sequence type (ST) as J2315 (ST28) were screened, along with two other closely related strains (BCC0016 and K56-2) that are single and double locus variants of ST28 (ST29 and ST30, respectively; http://pubmlst.org/bcc).
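The notion of single and double locus variants can be illustrated with a minimal sketch that counts how many MLST loci differ between two allelic profiles. The seven locus names are those we understand the BCC MLST scheme to use, and the allele numbers and profile names are invented placeholders, not the actual PubMLST profiles of ST28, ST29, or ST30.

```python
# Illustrative sketch only: allele numbers below are hypothetical placeholders,
# not the real PubMLST allelic profiles of any B. cenocepacia sequence type.
LOCI = ("atpD", "gltB", "gyrB", "recA", "lepA", "phaC", "trpB")


def differing_loci(profile_a, profile_b):
    """Return the loci at which two allelic profiles carry different alleles."""
    return [locus for locus in LOCI if profile_a[locus] != profile_b[locus]]


def variant_class(profile_a, profile_b):
    """Classify the relationship between two profiles by number of differing loci."""
    n = len(differing_loci(profile_a, profile_b))
    names = {0: "identical", 1: "single locus variant", 2: "double locus variant"}
    return names.get(n, f"{n}-locus variant")


# Hypothetical reference profile and two hypothetical relatives.
reference = dict(zip(LOCI, (1, 1, 1, 1, 1, 1, 1)))
relative_one = dict(reference, gyrB=2)          # differs at one locus
relative_two = dict(reference, gyrB=2, trpB=3)  # differs at two loci

print(variant_class(reference, relative_one))
print(variant_class(reference, relative_two))
```

Under this definition, strains that share an ST have identical profiles, whereas close relatives such as single or double locus variants differ at only one or two of the typed loci.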
Screening of the virulence pseudogenes revealed the likely relative timescales of acquisition of these mutations. For example, pseudogenes that disrupt pyochelin biosynthesis and cepacian capsule functions were identified in all of the ET12 strains tested (Table 6), suggesting that they occurred in an ancestral strain, whereas the O-antigen cluster, T2SS, and uncharacterized EPS cluster pseudogenes were intermittently distributed, indicating that they are recent mutational events.
The observation of independent mutations in the ET12 strain K56-2 uncharacterized EPS CDS suggests ongoing selection for the loss of this potential virulence function in the CF lung. There is further evidence for pathoadaptation involving exopolysaccharide structures in the PHDC lineage of B. cenocepacia (i.e., strains AU1054 and HI2424). In the B. cenocepacia IIIB strains there are divergent clusters at orthologous loci for the LPS O antigen and EPS. In AU1054, both of these clusters are disrupted by IS element insertions, whereas the HI2424 clusters remain intact.
The modification of core functions via point mutation has also contributed to drug resistance in J2315. Trimethoprim interferes with the action of bacterial dihydrofolate reductase (DfrA), inhibiting synthesis of the essential tetrahydrofolic acid. Members of the ET12 lineage exhibit different sensitivities to trimethoprim (100). To investigate the evolution of trimethoprim resistance in the ET12 lineage, we sequenced dfrA from members of this clonal group that have different trimethoprim MICs (K56-2 and BCC0179, MIC < 2 mg/liter; J2315, BCC0016, and BC7, MIC > 32 mg/liter). J2315, BCC0016, and BC7 all contain a single nonsynonymous nucleotide substitution at codon 99 (CTC to ATC), resulting in a leucine-for-isoleucine substitution. In an experiment with E. coli mutator strains exposed to antibiotics, resistance to trimethoprim was shown to be the result of a single point mutation in DfrA, Ile94 to Leu, for which the equivalent residue in B. cenocepacia is exactly Ile99/Leu99 (89).
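The two codons named above can be compared against the standard genetic code with a minimal sketch such as the following. The lookup table is deliberately restricted to those two codons, the function name is illustrative, and the sketch takes no position on which allele is ancestral.

```python
# Illustrative sketch only: a two-entry subset of the standard genetic code,
# covering just the codons discussed in the text.
STANDARD_CODE_SUBSET = {"CTC": "Leu", "ATC": "Ile"}


def compare_codons(first, second):
    """Report the encoded amino acids and the nucleotide positions that differ."""
    first, second = first.upper(), second.upper()
    amino_acids = (STANDARD_CODE_SUBSET[first], STANDARD_CODE_SUBSET[second])
    changed_positions = [
        i + 1 for i, (x, y) in enumerate(zip(first, second)) if x != y
    ]
    return {
        "amino_acids": amino_acids,
        "changed_positions": changed_positions,  # 1-based positions in the codon
        "nonsynonymous": amino_acids[0] != amino_acids[1],
    }


result = compare_codons("CTC", "ATC")
print(result)
```

A single change at the first codon position is enough to exchange leucine and isoleucine, which is consistent with resistance arising from one point mutation.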
B. cenocepacia is a versatile environmental organism that has emerged as an important pathogen of CF patients. Using the J2315 genome we have been able to investigate the genomic basis for the success of this CF pathogen and examine the evolutionary mechanisms that may lead to its emergence and ongoing spread.
Comparative analysis of Burkholderia genomes reveals that horizontal gene transfer has contributed to the genomic plasticity of this versatile group of organisms. The exchange of MGEs and movement of genomic islands facilitates the spread of genes between genetically diverse bacteria, a process which could be advantageous to the bacterium in its existing environment or allow adaptation to new niches, such as the CF lung. The J2315 genome contains 14 genomic islands that are absent from the other B. cenocepacia strains. Some of the islands share similarity with islands in other Burkholderia spp., suggesting that the B. cenocepacia pan-genome extends well beyond the species itself. The acquisition of genomic islands appears to have been seminal in the evolution of the ET12 lineage, introducing functions that promote survival and pathogenesis in the CF lung. One such island is the CCI (BcenGI11). This island plays a role in infection, is ubiquitous in the ET12 lineage, and is more common in B. cenocepacia IIIA strains than in IIIB strains (10, 84). The contribution of the other genomic islands to the virulence and survival of J2315 in the CF lung remains to be resolved, since many of the functions encoded in the genomic islands are associated with enhancing the metabolic repertoire of the bacterium or are unknown.
Evidence of the pathogenic specialization of the ET12 lineage can be found in the other RODs. These regions do not appear to have the properties of MGEs and as such represent more stable components of the J2315 genome, although some may have arisen by horizontal gene transfer in the more distant past. Contained within this unique component of the J2315 genome are the cable pilus locus and the 22-kDa adhesin protein AdhA. These proteins bind cytokeratin 13 (116), a cytoplasmic protein that may become surface exposed during the course of chronic infection in CF (114), and also mucins (115), which are produced in abundance in the CF lung due to poor clearance. The cable pilus/AdhA complex is also associated with the ability of B. cenocepacia to bind to CF lung explant tissue (114) and to bind and invade epithelial cells (117). Intriguingly, the CDSs encoding these components are at separate loci on chromosome 2, and orthologs are absent from the other BCC strains examined. This suggests that the pilus and the 22-kDa adhesin may have independent origins, but their concurrence in J2315 has resulted in functional synergy. Other virulence functions found within the J2315-specific RODs include surface polysaccharide biosynthesis, BuHA family putative adhesins, chaperone-usher type fimbriae, and a phospholipase C.
In recent years B. cenocepacia strains have acquired additional resistances to antibiotics commonly used in the treatment of CF patients. In particular, strains from within the ET12 lineage have different sensitivities to ciprofloxacin, tobramycin, tetracycline, and trimethoprim (100). In comparison to other members of the ET12 lineage, J2315 has developed enhanced resistance to a number of antibiotics (100). Indeed, we found that the J2315 genome contains drug resistance determinants in genomic islands and RODs, highlighting the role that horizontal gene transfer has played in the evolution of drug resistance in even the most intrinsically resistant of organisms. The genome also provides evidence for the evolution of drug resistance through point mutation, revealing a nonsynonymous base change in the dihydrofolate reductase gene that generates trimethoprim resistance.
Although the success of J2315 may be in part due to the acquisition of new functions, gene loss via mutation appears to have also played an important role. J2315 is a formidable pathogen of the CF lung; once infected with ET12, the life expectancy of a patient shortens dramatically (57). It is therefore surprising that the J2315 genome contains pseudogenes, formed via both IS disruption and frameshift mutations, in important B. cenocepacia virulence functions, such as O antigen and capsule (Table 6). Many of the putative virulence determinants identified in J2315 are shared with other BCC strains. These functions may therefore have important roles for the survival of BCC in its natural reservoir rather than in an opportunistic pathogen niche. In the case of J2315, the emergence and patient-to-patient spread of ET12 may mean that many of the functions required for survival in the environment are no longer required and have become superfluous or even disadvantageous. The level of pseudogenes and partial genes in the J2315 genome (1.7% of CDSs) is similar to the level found in most other bacterial genomes (72), suggesting that there is not an elevated level of mutation in this strain.
The screening of the virulence pseudogenes in other ET12 strains showed that some of the J2315 mutations may have occurred early on in the evolution of the ET12 lineage, whereas others represent recent strain-specific mutations. Some of these mutations may therefore represent formative pathoadaptive mutations that contributed to the initial success and emergence of the ET12 lineage, whereas others may be indicative of the ongoing selection pressures in the CF lung.
All of the ET12 strains that we screened contained the same frameshift mutation in the pyochelin siderophore biosynthesis gene pchF. Interestingly, the siderophores produced by CF isolates of B. cenocepacia exhibit strain variation, with SA and ornibactins being the most prevalent, followed by pyochelin (32). In a study that investigated pyochelin production in CF patients from Toronto and Cleveland (125), pyochelin-negative strains were isolated from patients with moderate or mild infections, whereas pyochelin-positive strains were more frequently isolated from patients with severe pulmonary disease who went on to suffer high mortality. It is possible that pyochelin production may play an important role in the progress of B. cenocepacia disease in CF patients. Switching off pyochelin production in ET12 strains may promote persistence in the CF lung and thus the spread of the members of this lineage between patients.
The long-term maintenance of infection in the CF lung may result in the streamlining of a pathogen's virulence and drug resistance functions, since functions required for the initiation of acute infections may be selected against during chronic infections. Evidence for recent pathoadaptive mutations came from the observation of an independent mutation in a glycosyltransferase of an uncharacterized EPS cluster, in K56-2, another member of the ET12 lineage. Further evidence for pathoadaptation involving surface carbohydrates came from the in silico comparison of the B. cenocepacia strains; in the CF epidemic strain there are mutations in the LPS O-antigen cluster and an EPS cluster, whereas the related environmental strain's clusters remain intact. A reduction in glycosylated surface molecules may provide some advantage, such as reducing immunorecognition in the lung, thus promoting the maintenance of a long-term infection. In a study of the EPS production in a collection of 506 B. cenocepacia strains isolated from CF patients in the Vancouver area over a 26-year period, more than half were nonmucoid (151). The study also revealed evidence of phenotype switching in sequential isolates from individual patients, with the conversion from mucoid to nonmucoid being the most prevalent switch. The authors hypothesized that the loss of EPS may reflect adaptation from persistence in the CF lung to increased disease severity.
Evidence for pathoadaptation can also be found in P. aeruginosa, the major pathogen of the CF lung, where the loss of acute virulence determinants has been observed in CF isolates, suggesting that these products are dispensable for long-term maintenance of P. aeruginosa in vivo (76, 147). A recent study by Smith et al. investigated genetic adaptation of P. aeruginosa in CF infections (123). Genomic sequencing of strains isolated from a CF patient 8 years apart, as well as from additional chronic infections, identified virulence factor genes as the most prevalent class of genes mutated during the course of infections. Significantly, one of the P. aeruginosa virulence functions that acquired deleterious mutations was the O antigen (123), a function also lost in J2315 (101). One additional virulence mutational adaptation is also shared: mexZ, a negative regulator of the mexXY component of the MexXY-OprM multidrug-efflux pump, is orthologous to a J2315 pseudogene (BCAL1672). In P. aeruginosa, upregulation of this multidrug-efflux pump is associated with resistance to aminoglycoside antibiotics that are routinely used to treat infection in CF patients (124).
Although there are parallels in the potential pathoadaptations of P. aeruginosa and B. cenocepacia, there are also intriguing differences. The high frequency of nonmucoid B. cenocepacia isolated from CF patients (151) is in marked contrast to P. aeruginosa, where isolates from CF patients are more frequently mucoid than nonmucoid (45). In P. aeruginosa the production of the EPS alginate is linked to increased morbidity and mortality (45), whereas strains of B. cenocepacia that are considered to be more virulent, such as those in the ET12 lineage, have been shown not to produce EPS (11, 151). These somewhat paradoxical observations point toward subtle differences in the roles that the different EPSs play in the mechanism of pathogenicity and host-cell interaction in these two CF pathogens.
The genome sequence of J2315 has afforded a tantalizing glimpse of components of the genome that may promote growth in the CF lung and provided clues to the potency and spread of ET12 in recent decades. Evidence from comparative genomics suggests that loss of functions through mutation and gain of functions via horizontal gene transfer appear to promote growth and persistence in the CF lung and contribute to the success of J2315. Much remains to be learned, however, as the pathology of B. cenocepacia infections and the physiology of the CF lung are both complex. The complete genome sequence will therefore be a valuable resource for future investigation into disease caused by B. cenocepacia.
We thank the Sanger Institute's Pathogen Production Group for shotgun and finishing sequencing and the Informatics Group. We are grateful to the Joint Genome Institute for making the HI2424 and AU1054 sequences available before scientific publication and to Tom Coeyne, Dominic Campopiano, and Alan Brown for useful comments regarding the manuscript. M.T.G.H. thanks Alan Smyth for useful discussions.
This study was supported by the Wellcome Trust through its Beowulf Genomics initiative.
Published ahead of print on 17 October 2008.
†Supplemental material for this article may be found at http://jb.asm.org/.