The polymorphisms of 48 potential VNTR loci that we identified previously (27) were investigated and characterized simultaneously in this study. Among M. tuberculosis isolates, variation in the repeat units was found at the majority of the loci (39 loci), including VNTR at 9 novel loci. Most VNTR studied here were located in noncoding or intergenic regions of the M. tuberculosis chromosome. Intriguingly, almost all of the coding tandem repeats were also variable and are therefore likely to encode polymorphic proteins. These polymorphic proteins may play a role in the adaptive mechanisms of M. tuberculosis, for instance, in protecting the bacterium from the defensive barriers of the human host. Examples of VNTR found in possible surface protein coding regions of M. tuberculosis were recently reported (26). The polymorphic proteins produced by these VNTR may be potential sources of antigenic variation that allow the bacteria to evade the immune response.
Short mutations in the different copies of the VNTR sequence, consisting of a few substitutions, insertions, and deletions (indels), could be seen in many VNTR loci of M. tuberculosis. In contrast, long indels in the VNTR sequence were rarely found. To our knowledge, only one VNTR locus (VNTR0580 or MIRU4) has so far been reported to have such a long mutation, which was confirmed in this study (27). This study revealed an additional locus (VNTR4052) with a 54-nucleotide deletion in the VNTR and the nearby flanking sequence. Sequence analysis revealed that the deletion at VNTR4052 was most likely caused by homologous recombination between the perfect 18-bp direct repeats lying exactly at the boundaries of the deleted sequence (Fig. ). The proportion of isolates carrying the deletion allele of VNTR4052 was relatively low and had no correlation with IS6110 patterns, suggesting that the deletion in this region may not have resulted from a single mutational event. Although VNTR4052 resided in a possible coding unit (Rv3611), this internal deletion did not affect the translational frame of the remaining coding sequence. An internal deletion was also observed in alleles of VNTR0580 in this study, possibly associated with the notable pentaguanine (G5) direct repeats lying at the border of the deleted sequence (Fig. ). However, the underlying cause of this deletion could not be explained by the general recombination process.
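The direct-repeat explanation for the VNTR4052 deletion can be illustrated with a minimal sketch. It assumes that recombination between two identical repeat copies removes one copy plus the intervening sequence; the toy sequence, the 18-bp repeat, and the 36-bp spacer are hypothetical and chosen only so that the deleted length equals 54 nucleotides. They are not the actual VNTR4052 sequence.

```python
def delete_between_direct_repeats(seq: str, repeat: str) -> str:
    """Remove one repeat copy plus the spacer between the first two copies.

    Models the outcome of recombination between two identical direct
    repeats: the product keeps a single copy of the repeat.
    """
    first = seq.find(repeat)
    if first == -1:
        raise ValueError("repeat not found")
    second = seq.find(repeat, first + len(repeat))
    if second == -1:
        raise ValueError("a second, non-overlapping repeat copy is required")
    # Keep everything through the first copy, then resume after the second.
    return seq[: first + len(repeat)] + seq[second + len(repeat):]


def preserves_reading_frame(deleted_length: int) -> bool:
    """A deletion keeps the downstream frame only if it removes whole codons."""
    return deleted_length % 3 == 0


# Hypothetical toy sequence (not the real locus).
REPEAT = "ACGTTGCAAGCTTGACCA"   # 18 bp
SPACER = "GGC" * 12             # 36 bp
wild_type = "AAA" + REPEAT + SPACER + REPEAT + "TTT"
deleted = delete_between_direct_repeats(wild_type, REPEAT)
deleted_length = len(wild_type) - len(deleted)
```

Under these assumptions the deletion removes 18 + 36 = 54 nucleotides, a multiple of three, which is consistent with the observation that the translational frame of the remaining coding sequence is unaffected.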
The deletion in VNTR4052 also suggests that homologous recombination between different repeat units of the VNTR sequence can occur and contribute to the allelic variation of VNTR. Recombination can either reduce or increase the copy number. The latter case occurs if a double crossover between two DNA strands takes place after the chromosome replication fork has passed the site but before the two daughter chromosomes have been partitioned into the two daughter cells. Alternatively, the addition and deletion of VNTR copies could be generated by DNA polymerase slippage, as suggested by a previous study (32). This mechanism is typically characterized by the stepwise gain or loss of complete repeat units in the variable alleles. In this study, many alleles of VNTR loci had only the incomplete copy and lacked a complete repeat unit. Isolates without a complete repeat unit should not be able to regain variability, and therefore the VNTR at these loci are effectively lost.
The degree of polymorphism differed among the tandem repeat loci. The degree of polymorphism may be inversely related to the selective pressure acting on each locus. However, there appeared to be no difference between the degrees of polymorphism of VNTR residing inside and outside coding sequences. In the absence of selective constraints, one may assume that the diversity of a particular VNTR is correlated with its evolutionary rate. Therefore, a VNTR locus with a high level of polymorphism would be assumed to have a faster evolutionary rate than a locus with a lower level. If so, the rate seemed not to be related to the length of the repeat unit or the position in the genome.
Many VNTR loci in this study had moderate or high allelic diversity (PIC ≥ 0.4). These VNTR loci are useful for the differentiation of M. tuberculosis strains. It was shown in this study that the 10-VNTR set had a resolution comparable to that of standard IS6110 RFLP typing when tested with the panel of M. tuberculosis isolates containing high copy numbers (more than five) of IS6110; in addition, this VNTR set was remarkably effective at differentiating isolates having only one IS6110 copy. These findings are similar to those of other studies using different VNTR typing sets and M. tuberculosis isolates from various geographical origins (18), which also suggests that VNTR typing is suitable for use in the global epidemiological study of tuberculosis.
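As an illustration of how a per-locus diversity value such as the PIC threshold above can be obtained from copy-number data, the following is a minimal sketch. This section does not restate the formula used, so two common definitions are shown as assumptions: the simple allelic diversity 1 − Σp_i², and the polymorphism information content with its additional cross term, 1 − Σp_i² − Σ_{i<j} 2p_i²p_j². The function names and the example data are hypothetical.

```python
from collections import Counter
from itertools import combinations


def allele_frequencies(copy_numbers):
    """Relative frequency of each allele (repeat copy number) at one locus."""
    total = len(copy_numbers)
    if total == 0:
        raise ValueError("no isolates supplied")
    return [count / total for count in Counter(copy_numbers).values()]


def allelic_diversity(copy_numbers):
    """Simple diversity index: 1 - sum of squared allele frequencies."""
    return 1.0 - sum(p * p for p in allele_frequencies(copy_numbers))


def pic(copy_numbers):
    """Polymorphism information content, including the pairwise cross term."""
    freqs = allele_frequencies(copy_numbers)
    homozygosity = sum(p * p for p in freqs)
    cross = sum(2 * (p * p) * (q * q) for p, q in combinations(freqs, 2))
    return 1.0 - homozygosity - cross


# Hypothetical copy numbers at one locus in four isolates.
example_locus = [2, 2, 3, 3]
```

With two equally frequent alleles, the simple index is 0.5 and the PIC is 0.375, so the two definitions can fall on different sides of a cutoff such as 0.4; which one applies should be checked against the methods of the study.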
Molecular genotyping based on VNTR-PCR analysis has several advantages over standard IS6110 RFLP and other typing methods (16). Also, an individual VNTR locus can be examined independently, giving investigators the flexibility to modify and improve their own typing format. We found that the diversity of many VNTR was biased toward particular IS6110 RFLP types; for instance, VNTR3232 gave the maximal diversity for Beijing strains, whereas this locus showed very low or no discriminating value for the remaining RFLP types. This strain-dependent property of VNTR can be beneficial to the investigation of an outbreak caused by a particular IS6110 type of M. tuberculosis.
VNTR-based analysis can be used to infer the phylogenetic relationships of M. tuberculosis strains. A dendrogram constructed from the 10-VNTR set agrees with the hypothesis that M. tuberculosis may have evolved from two separate lineages, the high- and low-IS6110-copy-number isolates. In this study, the high-IS6110-copy-number group was mostly composed of the isolates having >10 IS6110 copies (Fig. ). All Beijing isolates were included in this high-IS6110-copy-number group, which conforms to the notion that the Beijing strains constitute a homogeneous M. tuberculosis family. At the same time, the isolates in the low-copy group comprised those possessing one to four copies of IS6110 as well as those having five to eight IS6110 copies. The latter were classified in the high-copy lineage in a previous analysis (18).
In this study, many VNTR loci exhibited significant copy number correlations. These correlations may be caused by several factors. We observed that correlations could occur between loci with different degrees of allelic diversity. It is probable that correlations between VNTR loci with low allelic diversity represent false correlations arising from the combined effect of the small number of allele combinations between the loci and their biased allele frequencies. In contrast, significant relationships between VNTR loci with relatively high allelic diversity are unlikely to arise artifactually, because of the large number of allele combinations between the loci, and should represent actual correlations. This may happen because the bacterial populations in the study were of multiple clonal origins and M. tuberculosis possesses few mechanisms for horizontal genetic exchange. If the number of copies of the repeats varies gradually, the correlation of lengths between some sites may simply reflect the fact that the ancestors originally had VNTR of different lengths at those sites. Alternatively, the correlated VNTR may be subject to the same biological constraint, such as binding to the same proteins, and therefore tend to evolve in the same way. Similar associations between different VNTR loci were also previously recognized in M. tuberculosis.
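A copy-number correlation between two loci can be quantified in several ways. The sketch below uses the Spearman rank correlation with average ranks for ties. This choice is an assumption made for illustration, since the statistic used is not named in this section, and the function names are hypothetical.

```python
from math import sqrt


def average_ranks(values):
    """1-based ranks, with tied values sharing the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman(copies_a, copies_b):
    """Spearman correlation of copy numbers at two loci across isolates."""
    if len(copies_a) != len(copies_b) or len(copies_a) < 2:
        raise ValueError("need two equal-length lists with at least 2 isolates")
    ra, rb = average_ranks(copies_a), average_ranks(copies_b)
    mean_a, mean_b = sum(ra) / len(ra), sum(rb) / len(rb)
    cov = sum((a - mean_a) * (b - mean_b) for a, b in zip(ra, rb))
    var_a = sum((a - mean_a) ** 2 for a in ra)
    var_b = sum((b - mean_b) ** 2 for b in rb)
    if var_a == 0 or var_b == 0:
        raise ValueError("correlation is undefined for a monomorphic locus")
    return cov / sqrt(var_a * var_b)
```

With only two alleles per locus and skewed allele frequencies, most isolates share the same allele pair, so a coefficient computed this way can be large even when the association is driven by very few isolates. This matches the caution expressed above about loci with low allelic diversity.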
Correlations of VNTR could limit the number of possible patterns among M. tuberculosis isolates. As a consequence, these relationships would reduce to some extent the potential discriminating power of the VNTR typing system composed of the correlating VNTR loci.
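The effect of correlated loci on discriminating power can be sketched with the Hunter-Gaston discriminatory index, D = 1 − Σ n_j(n_j − 1) / [N(N − 1)], where n_j is the number of isolates sharing pattern j and N is the total number of isolates. Using this index here is an assumption made for illustration, and the three loci and their copy numbers are hypothetical.

```python
from collections import Counter


def hunter_gaston(profiles):
    """Probability that two isolates drawn without replacement differ in type."""
    n = len(profiles)
    if n < 2:
        raise ValueError("need at least two isolates")
    counts = Counter(profiles)
    same_type_pairs = sum(c * (c - 1) for c in counts.values())
    return 1.0 - same_type_pairs / (n * (n - 1))


# Hypothetical copy numbers in five isolates.
locus_a = [2, 2, 3, 3, 4]
locus_b = [5, 5, 7, 7, 1]   # fully determined by locus_a (perfectly correlated)
locus_c = [1, 2, 1, 2, 1]   # varies independently of locus_a

d_a = hunter_gaston(list(zip(locus_a)))
d_ab = hunter_gaston(list(zip(locus_a, locus_b)))
d_ac = hunter_gaston(list(zip(locus_a, locus_c)))
```

In this toy data set, adding the perfectly correlated locus leaves the index unchanged, whereas adding the independent locus separates all five isolates.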
The present VNTR typing systems could not define all unique isolates and still need to be complemented by other typing methods, such as IS6110 RFLP. The power of a VNTR typing system can be improved by adding more VNTR loci. It was shown in this study that our 10-VNTR set was powerful for distinguishing M. tuberculosis isolates. This VNTR set has already been developed in our laboratory for application in the molecular epidemiological study of M. tuberculosis. We adopted multiplex PCR with three separate sets of primers to target the 10 VNTR loci, in conjunction with analysis by agarose gel electrophoresis. Because it does not require automated gel analysis equipment, our system is inexpensive and can be used by most researchers. We are now applying this 10-VNTR typing format to examine different M. tuberculosis populations in Thailand.