The entire 127,923-bp sequence of the toxin-encoding plasmid pBtoxis from Bacillus thuringiensis subsp. israelensis is presented and analyzed. In addition to the four known Cry and two known Cyt toxins, a third Cyt-type sequence was found with an additional C-terminal domain previously unseen in such proteins. Many plasmid-encoded genes could be involved in several functions other than toxin production. The most striking of these are several genes potentially affecting host sporulation and germination and a set of genes for the production and export of a peptide antibiotic.
Isolates of Bacillus thuringiensis are the biological control agents most widely used to eradicate insect pests of crops or vectors of human disease. For the latter application, Bacillus thuringiensis subsp. israelensis is the bioinsecticide of choice in programs worldwide to control mosquitoes and blackfly vectors (29). The insect pathogenicity of this bacterium depends on the presence of the pBtoxis megaplasmid (13) that encodes all six of the previously described toxins in this isolate (Cry4Aa, Cry4Ba, Cry10Aa, Cry11Aa, Cyt1Aa, and Cyt2Ba) (7, 18). In addition, the plasmid carries several insertion sequences and encodes two further proteins (P19 and P20) with roles in promoting crystal formation and enhancing cell viability, probably by acting as chaperones (12, 27, 50). The pBtoxis plasmid has been partially mapped (6, 7), but the nucleotide sequence is limited to toxin genes and their flanking regions. Since the toxicity of the B. thuringiensis subsp. israelensis crystal is greater than that of any combination of the known toxins derived from it (9), it seems that other toxins or virulence factors may play a role in the activity of wild-type crystals. One possible source of such additional factors is the approximately 80% of the pBtoxis sequence that has not previously been analyzed. In order to understand fully this highly important virulence plasmid, we have therefore determined its entire nucleotide sequence as presented here.
The pBtoxis plasmid was prepared from B. thuringiensis subsp. israelensis strain 4Q2-72 (also known as 4Q5) and purified on a CsCl-ethidium bromide density gradient as previously described (6).
Plasmid DNA was sonicated and size fractionated on agarose gels. Two libraries were generated in pUC18 using insert sizes of 1.4 to 2 and 2 to 4 kb. Each clone was sequenced once from each end using ABI Big-Dye terminator chemistry on ABI3700 capillary sequencing machines. The final sequence was generated from 1,467 sequencing reads, giving 6.4-fold total coverage. All repeats were bridged by clone end read pairs or end-sequenced PCR products to confirm the assembly.
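As a rough consistency check on these figures, fold coverage can be taken as total sequenced bases divided by the length of the finished sequence. The sketch below assumes this simple definition (the text does not state how coverage was calculated) and backs out the mean usable read length implied by 1,467 reads and 6.4-fold coverage. The function names are illustrative only.

```python
PLASMID_LENGTH_BP = 127_923   # finished pBtoxis sequence length (from the text)
NUM_READS = 1_467             # reads in the final assembly (from the text)
REPORTED_COVERAGE = 6.4       # fold coverage reported in the text


def fold_coverage(num_reads, mean_read_len, target_len):
    """Coverage under the assumed definition: total read bases / target length."""
    return num_reads * mean_read_len / target_len


def implied_mean_read_length(num_reads, coverage, target_len):
    """Invert the same definition to get the mean read length it implies."""
    return coverage * target_len / num_reads


mean_len = implied_mean_read_length(NUM_READS, REPORTED_COVERAGE, PLASMID_LENGTH_BP)
print(f"implied mean read length: {mean_len:.0f} bp")
```

Under that assumption, the reported numbers imply a mean usable read length of a little under 560 bp.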
The finished sequence was annotated using Artemis software (41). Potential coding sequences were identified by codon usage (34) and positional base preference methods and compared to the nonredundant protein databases using BLAST (3) and FASTA (38) software. The entire DNA sequence was also compared in all six reading frames against the nonredundant protein databases, using BLASTX to identify any possible coding sequences previously missed. Exploration of the functions of Bacillus subtilis homologues was facilitated by the Subtilist database (33). Protein motifs were identified using InterPRO (5), transmembrane domains were identified with TMHMM (23), and signal sequences were identified with SignalP version 2.0 (35).
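The six-reading-frame comparison can be pictured with a minimal sketch of the translation step alone. BLASTX performs this translation internally before searching the protein databases; the code below is not that program, only an illustration of what "all six reading frames" means. It uses the standard genetic code, and the helper names are hypothetical.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, amino acids listed in TCAG codon order.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of an unambiguous DNA string."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def translate(seq):
    """Translate from position 0; codons with ambiguous bases become 'X'."""
    seq = seq.upper()
    return "".join(
        CODON_TABLE.get(seq[i:i + 3], "X") for i in range(0, len(seq) - 2, 3)
    )


def six_frame_translations(seq):
    """Map frame labels (+1..+3, -1..-3) to their conceptual translations."""
    frames = {}
    for strand, strand_seq in (("+", seq), ("-", reverse_complement(seq))):
        for offset in range(3):
            frames[f"{strand}{offset + 1}"] = translate(strand_seq[offset:])
    return frames
```

Each of the six translations would then be compared against the protein databases, so that a coding sequence missed by codon usage or positional base preference can still be detected through its similarity to a known protein.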
Oligonucleotide primers (2.5D, CAGCTTCTTTCGAACATAAGAAGTC, and 2.5R, GATCTCGAAGTATTCTTATATCTGC) were designed from part of the pBt007 sequence in order to produce a 613-bp amplicon in PCRs under the following conditions: 95°C for 180 s, 48 to 54°C for 90 s, and 72°C for 120 s for 1 cycle and 94°C for 45 s, 48 to 54°C for 50 s, and 72°C for 90 s for 29 cycles. DNAs from vegetative cells from a variety of B. thuringiensis strains were added to the PCR mixtures as template DNA, and the resulting products were analyzed by agarose gel electrophoresis. Most standard B. thuringiensis strains were kindly supplied by D. R. Zeigler (Bacillus Genetic Stock Center, Columbus, Ohio).
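The relationship between the two primers and the expected product size can be sketched as follows. The primer sequences are those given above; the template is a hypothetical stand-in constructed so that the product is 613 bp, since the pBt007 sequence itself is not reproduced here. The function handles only exact matches with the forward primer on the given strand, which is a simplification of real primer annealing.

```python
FORWARD_PRIMER = "CAGCTTCTTTCGAACATAAGAAGTC"   # 2.5D, from the text
REVERSE_PRIMER = "GATCTCGAAGTATTCTTATATCTGC"   # 2.5R, from the text
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of an unambiguous DNA string."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def amplicon_length(template, forward, reverse):
    """Product length for exact primer matches on a linear template, else None."""
    template = template.upper()
    start = template.find(forward)
    if start == -1:
        return None
    # The reverse primer anneals to the opposite strand, so on the given
    # strand its binding site appears as the reverse complement.
    rev_site = reverse_complement(reverse)
    rev_start = template.find(rev_site, start + len(forward))
    if rev_start == -1:
        return None
    return rev_start + len(rev_site) - start


# Hypothetical template: filler sized so that the product spans 613 bp.
filler = "A" * (613 - len(FORWARD_PRIMER) - len(REVERSE_PRIMER))
toy_template = "GG" + FORWARD_PRIMER + filler + reverse_complement(REVERSE_PRIMER) + "CC"
print(amplicon_length(toy_template, FORWARD_PRIMER, REVERSE_PRIMER))
```

A return value of None corresponds to the case in which one or both primer sites are absent or altered in the template.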
The full-length 127,923-bp pBtoxis sequence and annotation (Fig. 1) have been deposited in the EMBL database under accession number AL731825.
In silico restriction analyses of the complete 127,923-bp pBtoxis sequence agree with the previously published map (7), except that all of the predicted restriction fragments are slightly smaller than previously estimated, consistent with the slightly smaller overall size of the plasmid (128 kb compared to the 137 kb proposed). The placement of genes on the restriction map agreed with those detected in the sequence, with the exceptions of cyt2Ba, cry4Ba, and cry10Aa, which are in the same positions but inverted in order and orientation. pBtoxis properties are summarized in Table 1, and predicted genes are described in Table 2.
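An in silico digest of a circular molecule amounts to locating every recognition site and measuring the distances between consecutive cuts, including the fragment that wraps around the arbitrary start of the sequence. The following is a minimal sketch of that idea, not the procedure actually used here. The GAATTC site in the usage example (the EcoRI recognition sequence) is chosen purely for illustration, and placing the cut at the first base of each site is a simplifying assumption.

```python
def circular_fragment_sizes(seq, site):
    """Fragment sizes from cutting a circular sequence at every occurrence of `site`.

    Simplification: the cut is placed at the first base of each site. Sites that
    span the arbitrary start/end junction of the circle are also found.
    """
    seq, site = seq.upper(), site.upper()
    n = len(seq)
    wrapped = seq + seq[: len(site) - 1]   # lets a site span the junction
    cuts = []
    pos = wrapped.find(site)
    while pos != -1 and pos < n:
        cuts.append(pos)
        pos = wrapped.find(site, pos + 1)
    if not cuts:
        return [n]   # uncut circle
    sizes = [b - a for a, b in zip(cuts, cuts[1:])]
    sizes.append(n - cuts[-1] + cuts[0])   # fragment that wraps around
    return sorted(sizes, reverse=True)


# Toy 300-bp circle with two sites, 100 bp apart.
toy_plasmid = "GAATTC" + "A" * 94 + "GAATTC" + "A" * 194
print(circular_fragment_sizes(toy_plasmid, "GAATTC"))
```

Because the fragment sizes always sum to the length of the circle, a map built on an overestimated plasmid size would be expected to overestimate individual fragments as well.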
The pBtoxis coding sequence (CDS) pBt054 is a previously uncharacterized CDS that encodes a protein of approximately 60 kDa, which is related at its N terminus to the known Cyt toxins of B. thuringiensis. Comparison of this region of the CDS to known Cyt proteins indicates that it could represent a new subdivision of this family, Cyt1Ca according to the conventional B. thuringiensis toxin nomenclature (10; http://www.biols.susx.ac.uk/Home/Neil_Crickmore/Bt/index.html), although confirmation of this provisional name awaits further experimental evidence of its properties. The pBtoxis CDS is, however, unusual in another way. Whereas previously recognized members of the Cyt family are proteins of approximately 26 to 28 kDa, pBt054 represents a fusion between the Cyt1Ca-like region at the N terminus and an extra domain at the C terminus. The last 280 amino acids (aa) of this C-terminal domain appear to be tandem beta-trefoil modules like those found in other bacterial toxins, such as ricin, Clostridium botulinum neurotoxin, and the mosquito larvicidal Mtx1 toxin from Bacillus sphaericus (46). This superfamily of motifs is implicated as likely carbohydrate binding moieties, so one possible function for the C-terminal region of pBt054 could be recognition of carbohydrate groups on toxin receptors.
In addition to the complete toxin CDSs, pBtoxis also contains short sequences encoding fragments of toxins. pBt025 and pBt026 encode two segments with homologies to the center region of a Cry28Aa-like toxin, while pBt053 appears to encode a sequence with homology to the extreme C terminus of a Cry26Aa-like protein (49). In addition, the amino acid sequence encoded by pBt055 is similar at its C terminus to proteins encoded upstream of toxin genes (e.g., a hypothetical 29.1-kDa protein in the cry2Aa 5′ region of Bacillus thuringiensis subsp. kurstaki), while its N terminus is similar to that of Cry11Bb. These apparent cry toxin gene remnants suggest that during the evolution of pBtoxis, its ancestors have been host to other toxins now lost. This suggests that toxin composition is a dynamic factor and may help to explain the great diversity in toxin composition observed in B. thuringiensis isolates. The fact that these remnants are located close to CDSs with possible roles in transposition (pBt052 is similar to IS240-A; pBt027 and pBt028 are similar to IS231W sequences) implies that transposition is the most likely mechanism for this effect, and this is consistent with previous observations that B. thuringiensis toxin genes may be flanked by transposase sequences (26). In total, over 23% of the genes on pBtoxis show similarity to transposon-related genes, indicating that a considerable amount of DNA exchange has occurred in the evolutionary history of pBtoxis. As previously reported (11, 47), the cry10Aa gene (pBt047) is similar to the 5′ end of other cry genes and encodes an ~78-kDa protein that would appear to be truncated compared to related Cry proteins. This gene is followed by a second CDS (pBt048) with similarity to the 3′ ends of other cry genes. The intervening 67 bp contains at least two stop codons in each of the three reading frames and causes disruption of what may once have been a single CDS to produce two CDSs. 
However, protein derived from cry10Aa (pBt047) has been identified in B. thuringiensis subsp. israelensis inclusions (16), indicating that this CDS is not a pseudogene remnant.
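The observation that a short intervening region carries stop codons in every forward reading frame can be checked with a simple count per frame. The sketch below assumes only the three standard stop codons. The 67-bp sequence itself is not reproduced in the text, so the usage example runs on a made-up string.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def stop_codons_per_frame(seq):
    """Count stop codons in each of the three forward reading frames."""
    seq = seq.upper()
    counts = {}
    for frame in range(3):
        codons = (seq[i:i + 3] for i in range(frame, len(seq) - 2, 3))
        counts[frame + 1] = sum(codon in STOP_CODONS for codon in codons)
    return counts


# Made-up example sequence, not taken from pBtoxis.
print(stop_codons_per_frame("ATGTAACTGAC"))
```

A region in which every frame has a nonzero count cannot be read through in any forward frame without suppression of a stop codon.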
In addition to Cry and Cyt toxins, B. thuringiensis strains, like the closely related Bacillus cereus, are known to produce other potential virulence factors, including phosphatidyl inositol-specific phospholipase C, that may contribute to the role of the spore in overall toxicity (42). The expression of the genes encoding these factors is activated by the PlcR regulator protein that binds to the palindromic sequence TATGNAN4TNCATA (2). It appears that the pBtoxis plasmid encodes a separate, extrachromosomal copy of a phosphatidyl inositol-specific phospholipase C (pBt087), although the presence of an in-frame TGA stop codon indicates that this either is a pseudogene or is expressed by translational read-through. Inspection of the upstream control region for this gene also reveals no PlcR binding site. The PlcR binding palindrome does, however, occur within pBtoxis between two divergent groups of CDSs which appear to be part of the peptide antibiotic production and export system (pBt130-134 and pBt136-140; see below). The significance of this is unclear.
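A search for the PlcR binding palindrome TATGNAN4TNCATA can be expressed as a regular expression in which each N matches any base. The sketch below is one way to write such a scan and is not necessarily how the site was located in this study. Because the degenerate consensus is identical to its own reverse complement, scanning a single strand is sufficient.

```python
import re

# TATGNAN4TNCATA with N = any base; the lookahead reports overlapping sites too.
PLCR_BOX = re.compile(r"(?=(TATG[ACGT]A[ACGT]{4}T[ACGT]CATA))")


def find_plcr_boxes(seq):
    """Return (0-based start, matched site) for each match to the consensus."""
    return [(m.start(), m.group(1)) for m in PLCR_BOX.finditer(seq.upper())]


# Made-up example: one consensus match embedded in flanking sequence.
example_site = "TATGCAAAAATGCATA"
print(find_plcr_boxes("GGG" + example_site + "CCC"))
```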
Analysis of the plasmid revealed many other genes that may have significant effects on several aspects of the phenotype of the host organism, the most striking of which are potentially involved in sporulation and germination.
The apparently cotranscribed genes pBt084, pBt085, and pBt086 are similar to several operons encoding germination complex genes. pBt086 is similar to the A integral membrane component (e.g., gerAA), pBt085 is similar to the B integral membrane component (e.g., gerAB), and pBt084 is similar to the C lipoprotein component (e.g., gerAC). These components form membrane-associated complexes that allow the spore to respond to different germination signals (32). The putative plasmid-encoded complex composed of pBt084, pBt085, and pBt086 might enhance the response of the host to known germinants or allow it to recognize a novel germination signal. The Bacillus anthracis toxin-encoding pXO1 plasmid also encodes a set of germination proteins, GerXB, -A, and -C, and these have been shown to be important for the virulence of the host (19). Of the three, only the pXO1 gerXA gene shows significant similarity to the pBtoxis gene pBt086. Intriguingly, the remnants of a second germination complex are also present in the form of two interrupted and truncated pseudogenes, pBt060 and pBt063, representing the A and B components of such a complex. This suggests that, as with the toxin genes themselves, the plasmid may have carried a different repertoire of germination genes in the past.
pBt031 shows significant similarity to many cell wall hydrolases, both phage and chromosomally encoded, and appears to contain a direct-repeat peptidoglycan binding domain at its C terminus. One homologue of this protein, CwlM, is sporulation specific in B. subtilis, raising the possibility that pBt031 might also be involved in sporulation. The product of pBt145 is a homologue of CotN, a secreted protein that has been shown to be incorporated into, and potentially be involved in the production of, the B. subtilis spore (43, 45).
pBt094 and pBt148 encode homologues of the B. subtilis transition state regulatory protein AbrB, which is known to be involved in the regulation of postexponential expression and the early events leading to sporulation (15). The B. subtilis genome also includes a second abrB-like gene, abh (24), suggesting that the putative redundancy or complementarity of these chromosomal regulators may be supplemented by additional plasmid-borne genes in B. thuringiensis. Divergently transcribed from the plasmid-borne abrB-like gene is pBt095, a homologue of the ynzD gene of B. subtilis whose product has been identified as an aspartyl phosphatase which has direct effects on sporulation efficiency (39).
Taken together, the presence of these genes indicates that pBtoxis may exert a considerable influence on the sporulation and germination processes of its B. thuringiensis host, and this possibility is under experimental analysis.
In addition to the putative sporulation-regulatory proteins described above, pBtoxis encodes a number of other potential transcriptional regulators. pBt108 is a predicted sigma factor that shows homology to sigma E, which is known to be associated with the transcription of cry4Aa (51, 52), cry4Ba (53), cry11Aa (12), and both cyt genes (8, 18) in B. thuringiensis subsp. israelensis. This sigma factor is involved in transcription within the early mother cell (approximately 3 h into sporulation)—the time at which crystal formation is also occurring within the mother cell. pBt091 and pBt149 are members of the ArsR family and show similarity to the pXO1 genes pXO1-109 (PagR), a regulator of transcription of the B. anthracis protective antigen (22), and pXO1-138. pBt102 contains a GntR family regulator fused to an aminotransferase domain and has homologues in B. subtilis (YdfD) and many other bacterial genomes. Other genes that are predicted to encode regulators include pBt014, a member of the PbsX/Xre family of regulators with some similarity to B. subtilis SinR, a global regulator of post-exponential-phase response genes (17, 28); pBt158, a member of the MerR family; and the genes pBt157 and pBt011, which both contain predicted helix-turn-helix domains but have no significant database similarities.
Aside from the genes for these transcriptional regulator proteins, pBtoxis contains two genes, pBt093 and pBt147, with similarity to the bacterial RNA-binding protein Hfq, a regulator of mRNA poly(A) tails (20), and a gene, pBt092, which encodes a member of the bacterial histone-like protein family, HU.
One of the more surprising determinants carried by the plasmid is pBt136-140, a set of genes that appear to be involved in the production and export of a peptide antibiotic. Several of these are similar in order and orientation to those in an operon from Enterococcus faecalis responsible for the production and secretion of the ribosomally synthesized circular peptide antibiotic AS-48 (31). AS-48 is apparently produced by the circularization of a propeptide produced by the removal of a 35-aa signal sequence (30); pBt136 encodes a protein similar in length and sequence to the processed propeptide of AS-48. The next two genes, pBt137 and pBt138, encode predicted integral membrane proteins similar to AS-48B and AS-48C, which have been suggested to be involved in the maturation and secretion of the antibiotic. pBt139 encodes an ABC transporter ATP-binding protein with some similarity to AS48-D, and pBt140 is predicted to encode an integral membrane protein which is presumably part of the same system. Interestingly, there is no homologue of AS-48C1, which is the only gene shown to be indispensable for immunity to AS-48. No other potential immunity proteins could be identified.
Divergently transcribed from these genes are pBt133 to pBt130, encoding the components of an ABC transport system: an exported solute binding protein (pBt133), an ATP-binding protein (pBt132), a permease protein (pBt131), and a predicted integral membrane protein (pBt130). These resemble many predicted components of ABC transporters from microbial genomes, with little evidence of their specific functions. However, the first three components do show weak similarity to BacG, BacH, and BacI, encoded by genes downstream of the bacteriocin 21 production and secretion genes from E. faecalis plasmid pPD1 (which are nearly identical to the AS-48 genes described above). These bac genes are necessary for full bacteriocin 21 expression, and the pBt genes, therefore, may also be involved in the production or secretion of the putative peptide antibiotic.
The genes pBt096 to pBt101 encode a series of proteins with diverse database similarities and protein motifs. All seem to be involved in some way in amino acid metabolism. The first gene in the cluster, pBt101, encodes a protein with weak similarities to diverse kinase proteins; the second, pBt100, is weakly similar to (although considerably smaller than) a number of alanyl-tRNA synthetases and contains a class II tRNA synthetase PFAM domain. Although it is highly unlikely to be a tRNA synthetase, it could potentially encode some form of amino acid transferase or ligase activity. The third gene encodes a small hydrophobic protein with no database matches, while the fourth, pBt098, is a member of the pyridoxal phosphate-dependent enzymes and has similarities to many O-acetyl serine lyases (cysteine synthases); again, this is probably not a cysteine synthase but may be involved in amino acid modification. The product of pBt097 is predicted to be an aminotransferase with similarities to many characterized and predicted aminotransferases in the database, including the Escherichia coli MalY protein, a bifunctional protein with cysteine lyase activity, and several aspartate aminotransferases. The last gene in the cluster, pBt096, encodes a predicted integral membrane protein with similarity to many predicted transporters. Divergently transcribed from these genes is the predicted regulator pBt102, which contains an aminotransferase domain and might be involved in the regulation of these genes.
Two possibilities could be suggested for the functions of these proteins. They may enable the uptake from the environment of an amino acid or an amino acid homologue and its utilization as an energy or carbon source, or they may be responsible for the production and export of an amino acid or amino acid homologue. It is known that amino acids can act as germination signals in bacilli (44), and it is possible, therefore, that these genes are involved in producing a novel sporulation signal. Although this is speculation, it does fit well with the presence of other predicted sporulation and germination determinants on this plasmid.
Analysis of the GC skew of the plasmid (25) indicated a potential origin of replication near base 1 of the sequence (Fig. 2). Although no replication proteins could be identified through database comparisons, the CDS to the right of this region (pBt001) showed >78% amino acid identity with pXO1-49, which is located close to a similar putative replication origin of pXO1, which we predict by GC skew analysis may be between bases 60955 and 62192 (36). pXO1-49 is shorter than pBt001, due to the predicted use of a later start codon; however, the upstream start codon equivalent to that predicted for pBt001 is present in pXO1. It is therefore possible that this protein, which has no other similarities in the database, may be involved in plasmid replication. Also close to this putative origin, on the opposite side, is pBt156, which shows weak similarities to FtsZ/tubulin-like proteins from Pyrococcus (EMBL number AB031743; 21% identity in 394 aa) and to pXO1-45 (21% identity in 444 aa), which is similarly located in pXO1. Proteins of the FtsZ family are known to be involved in cell division (21), forming a ring structure at the dividing septum, and it is therefore possible that pBt156 may play some role in plasmid partition. Previous studies have suggested that the pXO1 origin might lie between bp 86249 and 97209 (36, 40); the large majority of this region shows no similarities with pBtoxis, except in the first CDS, pXO1-72, a conserved hypothetical CDS which shows partial matches to pBt035, pBt067, and pBt127.
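GC skew is commonly computed as (G - C)/(G + C) in windows along the sequence, with a putative origin sought where the skew changes sign. The sketch below uses non-overlapping windows and takes the minimum of the cumulative skew as the candidate position. Both choices are assumptions made for illustration, since the window size and exact procedure of the analysis are not given in the text. For a circular molecule the result also depends on where the sequence is linearized.

```python
def gc_skew_windows(seq, window):
    """(G - C) / (G + C) for consecutive non-overlapping windows."""
    seq = seq.upper()
    skews = []
    for start in range(0, len(seq) - window + 1, window):
        chunk = seq[start:start + window]
        g, c = chunk.count("G"), chunk.count("C")
        skews.append((g - c) / (g + c) if g + c else 0.0)
    return skews


def cumulative_skew_minimum(seq, window):
    """Coordinate (end of window) at which the cumulative GC skew is lowest."""
    total, best_total, best_pos = 0.0, float("inf"), 0
    for i, skew in enumerate(gc_skew_windows(seq, window)):
        total += skew
        if total < best_total:
            best_total, best_pos = total, (i + 1) * window
    return best_pos


# Toy sequence whose skew flips from negative to positive at position 50.
toy_sequence = "C" * 50 + "G" * 50
print(cumulative_skew_minimum(toy_sequence, window=10))
```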
Possible similarities between pBtoxis and other B. thuringiensis plasmid sequences in the database were analyzed by BLAST comparisons. The only significant match was between pBtoxis pBt010 and an unannotated CDS of unknown function in pTX14-3 from B. thuringiensis subsp. israelensis (4) (44% identity in 84 aa). No other database matches for these sequences exist, so the physiological functions, if any, of the sequences cannot presently be judged. No significant matches were found between pBtoxis and pBMB9741, pBMB2062, pTX14-1, pHD2, or the miniplasmid submitted under accession number S49203, and similarity with plasmid pGI2 was limited to a transposase gene.
Overall, 29 of 125 predicted pBtoxis proteins show detectable similarity to predicted proteins from pXO1 (Table 2) (36). Excluding the transposon- or insertion sequence-related proteins, only 17 of the predicted pBtoxis proteins are similar to predicted proteins from pXO1. This corresponds to the results of a previous study looking at conservation of pXO1 genes in a variety of Bacillus species (37): between 1 and 53 pXO1 genes were found to be present in different B. thuringiensis strains by hybridization and PCR experiments.
Most isolates of B. thuringiensis, like B. thuringiensis subsp. israelensis, encode their insecticidal toxins on extrachromosomal elements. Since pBt007 was found to be conserved between pBtoxis and pXO1-16 (96% identity in 569 aa), its distribution in other B. thuringiensis strains was also investigated by PCR, as described in Materials and Methods. As expected, no amplicons were produced from the primers when the negative control B. thuringiensis subsp. israelensis strain 4Q7, a strain cured of pBtoxis, was used. PCR also produced no product from the following B. thuringiensis isolates: Bacillus thuringiensis subsp. dakota [Oats43(4R1)], Bacillus thuringiensis subsp. kyushuensis [HD541(4U1)], Bacillus thuringiensis subsp. morrisoni [HD12(4K1)], Bacillus thuringiensis subsp. tenebrionis, and Bacillus thuringiensis subsp. tohokuensis [78-FS-29-17(4V1)]. This may reflect the absence of homologous sequences in these strains, or it could be the result of an alteration in nucleotide sequence in the regions corresponding to one or both of the test primers. However, the existence of pBt007-homologous sequences was revealed by the production of ~600-bp amplicons (results not shown) in the following B. thuringiensis isolates: Bacillus thuringiensis subsp. aegypti (from commercial Agerin powder), Bacillus thuringiensis subsp. aizawai [HD133(J3)], Bacillus thuringiensis subsp. galleriae (HD155), Bacillus thuringiensis subsp. indiana [HD521(4S2)], B. thuringiensis subsp. israelensis [IPS70(4Q3)], B. thuringiensis subsp. israelensis [HD500(4Q2)], B. thuringiensis subsp. israelensis [HD567(4Q1)], Bacillus thuringiensis subsp. jegathesan, Bacillus thuringiensis subsp. japonensis [T23001(4AT1)], Bacillus thuringiensis subsp. kenyae [HD136(4F1)], Bacillus thuringiensis subsp. kumamotoensis [HD867(4W1)], B. thuringiensis subsp. kurstaki [HD1(4D1)], B. thuringiensis subsp. kurstaki [HD73(4D4)], Bacillus thuringiensis subsp. medellin, Bacillus thuringiensis subsp. thuringiensis [HD2(4A3)], Bacillus thuringiensis subsp. tochingiensis [HD868(4Y1)], Bacillus thuringiensis subsp. tolworthy [HD125(4L1)], and Bacillus thuringiensis subsp. wuhanensis [HD525(4T1)]. This indicates that the pBt007/pXO1-16-like sequence is widespread in B. thuringiensis isolates, and we speculate that it is likely to be associated with the virulence plasmids in all of these strains. In addition, an amplicon of the same size was also produced from the house fly-toxic Bacillus cereus subsp. moritai (originally named Bacillus moritai), perhaps indicating that this isolate should again be reclassified as B. thuringiensis subsp. moritai.
This project was supported by funding from the Wellcome Trust through its support of the Sanger Institute Pathogen Sequencing Unit, the Royal Society (C.B.); a grant (97-00081) from the United States-Israel Binational Science Foundation (BSF), Jerusalem, Israel (A.Z.); and a postdoctoral fellowship (E.B.-D.) from the Israel Ministry of Science.