We have taken advantage of sequence reads collected at HGSC during the bovine genome sequencing project to construct the complete DH10B genomic sequence by using a new desktop sequence assembler, SMGA. The sequence informs our understanding of both the biology of DH10B and the intended and unintended changes that can arise during extensive strain construction using classic genetic methods.
Although the alleles targeted during the various construction steps are largely as expected, the deoR gene is a significant exception. Mutation of deoR was thought to be responsible for the enhanced transformation efficiency of DH10B (17, 19), but its sequence is unambiguously wild type. Mutations of deoR were originally isolated by selecting for mutants that grew rapidly on inosine but not uridine, due to the constitutive activation of the deoCABD operon (31). Using a similar selection scheme, Hanahan observed that the fast-growing strains also had higher transformation efficiencies of large plasmids and assumed that this was also caused by mutation of deoR (19a). The wild-type deoR locus indicates either that the two phenotypes (fast growth on inosine and high transformation efficiency) are completely separable or that another undefined locus (or loci) is responsible. In favor of the latter possibility, the same selection scheme was used to independently isolate the highly transformable DH5 strain (19a), which has now also been shown to contain a wild-type deoR gene (Invitrogen Corp., unpublished results). Interestingly, the multiple-deletion strain MDS42 has transformation properties similar to those of DH10B (33), raising the possibility that the two strains share a common subset of mutated genes that accounts for the phenotype. Even when pseudogenes and phage elements are excluded, there are still 52 mutated genes in common to be investigated. A systematic investigation of this set and its effect on transformation efficiency is now possible.
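The candidate set described above amounts to a set intersection followed by an exclusion. The following is a minimal sketch of that bookkeeping only, not the procedure used in this study. The function name `shared_candidates` and all gene identifiers are hypothetical placeholders, not real annotations.

```python
def shared_candidates(mutated_a, mutated_b, excluded):
    """Return genes mutated in both strains, minus excluded categories."""
    return (set(mutated_a) & set(mutated_b)) - set(excluded)


# Hypothetical placeholder identifiers; a real analysis would load the
# mutated-gene lists from the two strain annotations.
dh10b_mutated = {"geneA", "geneB", "geneC", "pseudo1", "phageX"}
mds42_mutated = {"geneB", "geneC", "geneD", "pseudo1", "phageX"}
pseudogenes_and_phage = {"pseudo1", "phageX"}

candidates = shared_candidates(dh10b_mutated, mds42_mutated, pseudogenes_and_phage)
print(sorted(candidates))
```

Applied to real annotations, each gene in the resulting set could then be tested individually for its effect on transformation efficiency.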
The 13.5-fold-higher mutation rate in DH10B than in MG1655 is entirely due to increased IS transposition. This is consistent with previous findings showing a high incidence of IS transposition into eukaryotic BAC library clones (25). In those studies, IS10 was the most frequently observed element in the BAC clones, while IS150 transposition predominated in our study. It is important to note that cycA does not contain a preferred IS10 target site (18, 25), so that IS10 transpositions into cycA are expected to be rare. No target site specificity has yet been reported for IS150 or other IS3 family members, but this is under investigation in a separate project. IS150 transposase levels and/or activity could be elevated in DH10B. IS150 transposase production is regulated by a highly efficient programmed translational frameshifting mechanism (42), although precise details of the frameshifting mechanism are only now emerging. No connections with the DH10B genotype are evident.
Conserved IS5 elements were likely involved in creating the large tandem duplication that doubles the gene dosage of 106 genes. Such duplications are quite common in E. coli and Salmonella enterica serovar Typhimurium, probably due to RecA-dependent unequal-sister-strand exchanges between repeated sequences (36), although they are lost at high frequency unless they confer some selective advantage under the given growth conditions. While recA1 in DH10B would allow fixation of the duplication even in the absence of a selective advantage, the three construction steps following introduction of recA1 (Fig. ) employed a wild-type RecA-expressing plasmid that was subsequently cured from the strain. This implies either that the duplication arose very late in the construction and was fixed by curing of the recA plasmid or that it confers a selective advantage for growth on complex media (e.g., Luria-Bertani or tryptone broth) that are generally used for culturing DH10B. One candidate operon for positive selection is gltLKJI, which encodes the glutamate-aspartate ABC family transporter (28). Cells growing in complex media consume available amino acids in a sequential fashion, with serine and aspartate being used during exponential growth and others such as glutamate used in the transition to stationary phase (4, 34). Doubling the expression of gltLKJI could enhance the uptake of these amino acids, providing a growth advantage.
The range of nutrients that DH10B can utilize is limited by the deletion of numerous metabolic pathways. Nevertheless, DH10B requires only leucine as a supplement for growth on minimal media with a suitable carbon source. The consistently lower growth rates observed with DH10B cultures compared to MG1655 are likely a consequence of elevated basal (p)ppGpp levels caused by the spoT1 allele, as seen in different backgrounds (14, 26). DH10B does, however, exhibit extensive growth lags during nutrient downshifts (data not shown), consistent with the “relaxed” phenotype imparted by relA1.
Classical genetic strain construction has been an invaluable tool in elucidating much basic molecular biology. DH10B can be considered an extreme case in the extent of the manipulations and the resulting changes to the genome. While most of the changes do not impact the strain itself, IS transposition into cloned fragments propagated in this host sends a strong cautionary signal regarding uncharacterized genomes. Recent advances in sequencing technology and strain construction are now allowing such issues to be eliminated (33).