There was minimal discordance (0.8%) between sequencing by hybridization and cycle sequencing for determination of the HIV-1 RT amino acid sequences across all codons. However, the results of the methods were significantly more discordant for resistance-associated codons (3.9%) than for other codons (0.6%). Discordances were most often seen for codon 67 (45% of isolates), but even if the data for codon 67 were excluded from the analysis, the discordance rate would still be significantly higher for resistance-associated codons (1.7%) than for other codons. Similarly, for highly polymorphic RT codons that are not known to be associated with resistance, the discordance rate (2.3%) was significantly higher than the overall rate. These observations suggest that increased heterogeneity in the nucleotide sequence at specific sites in HIV-1 RT is associated with more discordance between the two methods when the population PCR product is used.
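The grouping of discordance rates by codon class can be illustrated with a minimal sketch. It assumes that a paired call counts as ambiguous when either method reports a mixture (marked here with "?"), as discordant when both calls are unambiguous but differ, and as concordant otherwise. The codon set, the "?" convention, the function names, and the example calls are hypothetical and are not data from this study.

```python
from collections import Counter

# Hypothetical set of resistance-associated RT codons, for illustration only.
RESISTANCE_CODONS = {41, 67, 70, 215}


def classify(hyb_call, cycle_call):
    """Classify one paired amino acid call.

    Assumed convention: '?' marks a call that a method could not resolve
    to a single amino acid.
    """
    if hyb_call == "?" or cycle_call == "?":
        return "ambiguous"
    return "concordant" if hyb_call == cycle_call else "discordant"


def discordance_rates(paired_calls):
    """Return the discordance rate per codon group.

    paired_calls: iterable of (codon, hybridization_call, cycle_call).
    """
    counts = {"resistance": Counter(), "other": Counter()}
    for codon, hyb, cyc in paired_calls:
        group = "resistance" if codon in RESISTANCE_CODONS else "other"
        counts[group][classify(hyb, cyc)] += 1
    return {
        group: (c["discordant"] / sum(c.values()) if c else 0.0)
        for group, c in counts.items()
    }


# Made-up paired calls: (codon, hybridization call, cycle sequencing call).
calls = [
    (41, "M", "M"),
    (67, "D", "N"),
    (70, "K", "K"),
    (215, "T", "?"),
    (35, "V", "V"),
    (36, "E", "E"),
    (60, "V", "I"),
    (83, "R", "R"),
    (90, "V", "V"),
]
rates = discordance_rates(calls)
print(rates)
```

Keeping ambiguous calls as a third category, rather than folding them into discordance, is a design choice made here so that concordance, discordance, and ambiguity sum to the total number of paired calls.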
The low level of amino acid discordance overall in our study is in agreement with the low level of nucleotide discordance reported in two earlier studies of clade B HIV-1 isolates. One study found a nucleotide discordance rate of 0.8% for the protease region of 114 clinical specimens from protease inhibitor-naive individuals (11). Another study of 12 zidovudine-experienced individuals (including some with a history of receipt of other nucleoside RT inhibitors with or without the protease inhibitor indinavir) found a nucleotide discordance rate of 0.8% for 29 clinical specimens (6). A relatively poor concordance in the nucleotide sequence at RT codon 67 was suggested in the latter study (6) and a recent report (Harris et al., Abstr. 3rd Int. Workshop on HIV Drug Resistance and Treatment Strategies). We extend these observations by studying patients with greater prior experience with antiretroviral agents, by using a different platform for cycle sequencing, by comparing the amino acid sequences deduced by each method at resistance- and non-resistance-associated codons, and by exploring the causes of discrepancies in deduced amino acids.
One source of discrepancy may be that different strengths of hybridization, based on nucleotide composition, affect hybridization sequencing and not cycle sequencing. In particular, if mixtures are present (as is often seen in population sequencing of HIV-1), hybridization sequencing may preferentially detect a C or a G residue rather than an A or a T residue because of stronger base pairing. This does not explain the frequent discordances for codon 67 since a mixture was not detected by cycle sequencing for those cases. Furthermore, stronger hybridization between C and G in mixtures cannot explain other discordances, such as those for codon 66, with cycle sequencing indicating AAG and hybridization sequencing indicating AAA. Another possible explanation for discrepancy is that AT-rich regions may be more problematic for hybridization sequencing due to the relatively lower strength of base pairing there. Although the immediate region of codon 67 is relatively AT rich, several other regions in HIV-1 RT are equally AT rich but do not display as high a rate of discordance (data not shown).
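The dependence of hybridization strength on base composition can be illustrated with the Wallace rule, a common rule of thumb for short oligonucleotides in which the melting temperature is approximated as 2°C per A or T plus 4°C per G or C. This is a swapped-in approximation used only for illustration; it is not the model used by the probe array or its base-calling software, and the two 15-nucleotide sequences below are hypothetical.

```python
def wallace_tm(oligo):
    """Approximate melting temperature (degrees C) of a short oligonucleotide.

    Wallace rule of thumb: 2 degrees per A/T plus 4 degrees per G/C.
    """
    oligo = oligo.upper()
    at = oligo.count("A") + oligo.count("T")
    gc = oligo.count("G") + oligo.count("C")
    return 2 * at + 4 * gc


# Hypothetical 15-mers that differ only at the central base (A versus G).
probe_with_a = "AAATTTAAATTTAAA"
probe_with_g = "AAATTTGAATTTAAA"

print(wallace_tm(probe_with_a))  # 30
print(wallace_tm(probe_with_g))  # 32
```

Under this approximation, a single A-to-G difference raises the estimated duplex stability, which is the direction of bias described above for mixtures.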
A more complex source of discrepancy was characterized here: the influence of distinct neighboring nucleotide polymorphisms (at times, several nucleotides away) on the position of interest. Discrepancies in codon 67 (with GAC being called by hybridization sequencing instead of AAC or AAT) were often associated with a distinct nucleotide sequence polymorphism in codon 66 (AAG instead of AAA). Similar associations were made between polymorphisms in codon 99 and discordance in codon 100, codon 70 and discordance in codon 69, and codon 45 and discordance in codon 47. The sequences associated with discrepancies tended to be relatively uncommon and therefore may not have been adequately represented by the interrogating oligonucleotide probes used for hybridization sequencing of population PCR products. These observations suggest that the heterogeneity of sequences at specific sites in HIV-1 may cause significant miscalls in hybridization sequencing. Moreover, the heterogeneity associated with miscalls is likely to be in codons adjacent to the one of interest. Identification of the positions that contribute to miscalls, as in this study, will facilitate improvements in probe arrays.
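The effect of a neighboring polymorphism on probe matching can be sketched with a toy model. It assumes a hypothetical probe set that interrogates codon 67 while holding codon 66 fixed as AAA, and it scores each probe simply by counting mismatches against the target. Actual array probes are presumably longer and their behavior depends on hybridization energetics, so this shows only why an uncommon neighbor leaves no perfectly matched probe; it does not reproduce the base-calling algorithm.

```python
def mismatches(probe, target):
    """Count positions at which two equal-length sequences differ."""
    if len(probe) != len(target):
        raise ValueError("probe and target must have the same length")
    return sum(p != t for p, t in zip(probe, target))


# Hypothetical probes spanning codons 66-67, all assuming codon 66 = AAA.
# Keys are the codon 67 variant that each probe is meant to detect.
PROBES = {"GAC": "AAAGAC", "AAC": "AAAAAC", "AAT": "AAAAAT"}


def score(target):
    """Return the mismatch count of every probe against the target."""
    return {codon67: mismatches(probe, target) for codon67, probe in PROBES.items()}


# Target with the common codon 66 (AAA) and codon 67 = AAC.
common = score("AAAAAC")
# Target with the polymorphic codon 66 (AAG) and codon 67 = AAC.
variant = score("AAGAAC")

print(common)   # {'GAC': 1, 'AAC': 0, 'AAT': 1}
print(variant)  # {'GAC': 2, 'AAC': 1, 'AAT': 2}
```

With the common neighbor, one probe matches perfectly; with the polymorphic neighbor, every probe carries at least one mismatch, so the call at codon 67 rests on a comparison among imperfect duplexes.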
Genetic mixtures at the site of interest also contribute to the discrepancies noted between hybridization and cycle sequencing. Mixtures at the base of interest were the major cause of ambiguities in cycle sequencing. However, if the presence of mixtures at the site of interest was the only factor that resulted in discordance or ambiguity with hybridization sequencing, we would expect that ambiguous amino acid determination by one method should correlate with ambiguous amino acid determination by the other method. Of the 202 amino acids with ambiguity, 196 were ambiguous by only one method and only 6 were ambiguous by both methods. This suggests that factors other than mixtures in the codon of interest played important roles in hybridization sequencing ambiguities or discordances. These factors include heterogeneity in adjacent codons genetically linked to a homogeneous codon of interest.
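The overlap in ambiguity between the two methods can be expressed as a simple proportion from the counts given above. The variable names are illustrative only.

```python
# Counts reported in the text above.
total_ambiguous = 202   # amino acids ambiguous by at least one method
one_method_only = 196   # ambiguous by only one method
both_methods = 6        # ambiguous by both methods

# The two categories should partition the total.
assert one_method_only + both_methods == total_ambiguous

fraction_both = both_methods / total_ambiguous
fraction_one = one_method_only / total_ambiguous

print(f"ambiguous by both methods: {fraction_both:.1%}")    # 3.0%
print(f"ambiguous by only one method: {fraction_one:.1%}")  # 97.0%
```

If mixtures at the codon of interest were the only source of ambiguity, the fraction shared by both methods would be expected to be far higher than about 3%.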
To evaluate further the role of genetic mixtures in cases of discordance and ambiguity, we examined the performances of the two sequencing methods using genetic clones. In cases in which cloned PCR products were sequenced, there was excellent agreement between cycle sequencing and hybridization sequencing. The rate of discordance was lower for clones than for population sequencing for resistance-associated codons as well as other codons. Even when the methods gave discordant or ambiguous results with population PCR products, the clone sequences were highly concordant by both methods and generally agreed with the cycle sequencing results for population PCR products. Interestingly, nucleotide sequence polymorphisms which were associated with hybridization sequencing miscalls by population sequencing (such as for codons 66 and 67 and codon 106) did not cause hybridization sequencing miscalls in the sequencing of the clones. This provides further support for the conclusion that minority variants with heterogeneity either at the position of interest or at nearby positions which affect hybridization constituted a major cause of discordance in population sequencing. Indeed, it suggests that hybridization sequencing may be most accurate when used to sequence genetically homogeneous, cloned templates of HIV-1. This may minimize the need to continuously update the probe array to account for ongoing genetic changes in HIV-1, although some regions that are problematic even in clones (Table , RT codon 35) will still warrant improvements in probes. Because of the more automated nature of hybridization sequencing, the use of cloned templates for hybridization sequencing is more feasible than the use of multiple cloned templates for cycle sequencing.
Genetic mixtures were the major cause of ambiguous amino acid determinations by cycle sequencing, which occurred at a rate significantly higher than that for hybridization sequencing. This was notable for RT codon 215 (the site of a zidovudine resistance mutation more pivotal and common than that at RT codon 67), at which 10% of isolates had an ambiguous amino acid sequence as determined by cycle sequencing. This led to a low concordance rate (85%) at that position, despite a low discordance rate (4%). One likely explanation for fewer ambiguities in amino acids by hybridization sequencing in our study relates to the software for nucleotide assignment used in hybridization sequencing, which calls only the dominant base at each position. However, an evaluation of the ability to detect mixtures of wild-type and mutant amino acids at RT codons 41, 184, and 215 showed that two laboratories that used sequencing by hybridization performed at least as well as 21 other laboratories that used cycle sequencing (15).
Our methods identified resistance mutations in virus isolates from PBMCs, but virus isolation is neither recommended nor preferred over direct testing of plasma HIV-1 RNA. HIV-1 resistance testing is now available and should be performed on plasma HIV-1 RNA. Nevertheless, the conclusions of this study, which began before sequencing from plasma HIV-1 RNA was standardized, are relevant to HIV-1 genotyping from plasma HIV-1 RNA. Genotypic data derived by the sequencing of PCR products amplified either from the HIV-1 RNA circulating in patient plasma or from the DNA of cultured cells infected with virus isolates have yielded identical results for subjects with plasma HIV-1 RNA that rebounded during therapy with NNRTIs and nucleoside RT inhibitors (L. Bacheler, O. Weislow, S. Snyder, G. Hanna, and R. D'Aquila, Conf. Rec. 12th Int. Conf. AIDS, abstr. 41213, p. 784, 1998; V. Johnson, M. Saag, J. Decker, J.-P. Sommadossi, M. Myers, S. Cort, D. Hall, J. Griffin, J. Lifson, and G. Shaw, Proc. Third Int. HIV Drug Resistance Workshop, abstr. 70, 1994). The dominant replication-competent virus present in vivo is identifiable either in plasma HIV-1 RNA or in virus isolates in PBMCs. Furthermore, in chronically infected patients, the sequences from both sources of the HIV-1 nucleic acid display considerable genetic heterogeneity. Therefore, our conclusions on the effects of heterogeneous population PCR products on the performance of the two methods of sequencing would remain valid for either plasma virus- or culture isolate-derived sequences. Our results suggest the hypothesis that there will be less discordance between the methods when sequencing templates have less virus genetic heterogeneity.
An important research use of nucleotide sequencing of HIV-1 RT is identification of novel changes in amino acids associated with the failure of new therapies. Among the non-resistance-associated codons, our data on discordant cases suggest that hybridization sequencing may systematically miscall amino acids at certain codons (albeit infrequently). For instance, hybridization sequencing of the nevirapine-containing arm showed that the HIV-1 isolates from 4 of 41 subjects (9.8%) developed 47Asn during therapy (data not shown). This amino acid was not seen in any baseline isolate and is not known to be a naturally occurring polymorphism. This result, in isolation, suggests that RT Ile47Asn may be selected by nevirapine therapy. Two of the four isolates with 47Asn were sequenced by both methods as part of the present study. This amino acid call was not confirmed by cycle sequencing of either of these two isolates. This suggests that if hybridization sequencing is to be used to identify new drug-selected substitutions that occur at low frequencies, the potentially new mutations should be confirmed by another method.
An increasingly common and more clinically important use of HIV-1 genotyping is determination of the presence of known resistance-associated mutations and use of this information to choose drugs, particularly for salvage regimens (3). Hybridization sequencing had excellent concordance with cycle sequencing at most codons studied here (including RT codons 41 and 181, in which 100% concordance was seen, despite the presence of resistance mutations in >15% of isolates). The performance of hybridization sequencing and each particular method of cycle sequencing of population PCR products should be validated for each known resistance mutation of clinical interest by using an adequate number of samples of heterogeneous virus populations from treatment-experienced individuals. Indeed, improved probe arrays can be developed to correct defined problems, such as those identified here, as has already been done for the HIV-1 protease and RT chips, which have now replaced the version used in the present study. Because of its ease of use and with ongoing improvement in probe arrays, sequencing by hybridization will likely remain an important methodology for clinical laboratories.