The performances of two methods of nucleotide sequencing were compared for the detection of drug resistance mutations in human immunodeficiency virus type 1 reverse transcriptase (RT) in viruses isolated from highly RT inhibitor-experienced individuals. Of 11,677 amino acids deduced from population PCR products by both cycle sequencing and sequencing by hybridization to high-density arrays of oligonucleotide probes, 97.4% were concordant by both methods, 0.8% were discordant, and 1.7% had an ambiguous determination by at least one method. A higher rate of discordance (3.9%) was observed among RT inhibitor resistance-associated codons. In 45% of the isolates, RT codon 67 was deduced as the wild-type Asp by hybridization sequencing but as the zidovudine resistance-associated Asn by cycle sequencing. In other resistance-associated codon discordances, cycle sequencing also more commonly called a known resistance-associated amino acid than hybridization sequencing did. The nucleotide sequence in the vicinity of several codons with discordant calls influenced population-based hybridization sequencing. For isolates evaluated by additional sequencing of molecular clones of PCR products by both methods, the discordance between methods was less frequent (0.4% of all 5,994 amino acids and 0 of 494 drug resistance-associated codons). At positions which were discordant or ambiguous in the population sequences, the results of sequencing of clones by both methods were usually in agreement with the population cycle sequencing result. In summary, most RT codons were highly concordant by both methods of population-based sequencing, with discordances due in large part to genetic mixtures within or adjacent to discordant codons.
The success of antiretroviral therapy of human immunodeficiency virus type 1 (HIV-1) infection has been limited by the emergence of mutations in the HIV-1 genome that confer decreased susceptibility to antiretroviral drugs. Resistance to any of the currently available antiretroviral agents (reverse transcriptase [RT] inhibitors or protease inhibitors) may develop if viral replication persists during therapy (1, 8). Detection of drug resistance by nucleotide sequencing or phenotypic susceptibility testing may be clinically useful in predicting which antiretroviral regimens can better suppress virus replication (3, 9). Two different methods of nucleotide sequencing are in wide use. Automated cycle sequencing has been the standard method, and current variations include the use of dye-labeled primers or terminators in the sequencing reaction, followed by electrophoresis and base calling in an automated sequencer. More recently, a method based on sequencing by hybridization to miniaturized high-density arrays of oligonucleotide probes followed by automated base calling and mutation detection has become available (4, 12).
The comparative performances of the two methods in detecting naturally occurring polymorphisms and resistance-associated mutations in HIV-1 are of clinical interest. Although hybridization sequencing may be reliable when used to determine eukaryotic gene sequences, its use with HIV-1 is more complicated for at least two reasons. First, the nucleotide sequences of HIV-1 isolates obtained from different infected individuals often vary markedly and unpredictably. The amount of interpatient variability in the sequences of HIV-1 genes far exceeds that seen in the sequences of most eukaryotic genes, creating a challenge in the design of adequate sets of oligonucleotide probes used in hybridization sequencing. Second, in contrast to eukaryotic genes, which usually display two alleles per genetic locus, HIV-1 often exists as a highly heterogeneous mixture of genetic variants within a single chronically infected individual. This may be a source of potential difficulties in using hybridization sequencing to analyze the PCR products of an HIV-1 population. Cycle sequencing of the PCR products of a heterogeneous population cannot reliably detect minorities that constitute <20% of the total (15). Prior studies that have compared hybridization sequencing to cycle sequencing found a high rate of concordance in nucleotide base calls for HIV-1 clade B isolates (6, 11), although analysis of amino acid assignments at specific codons associated with drug resistance was limited due to the relatively few specimens with multiple drug resistance mutations. Current hybridization sequencing technology is optimized for clade B HIV-1 isolates, and problems have been noted in its performance with HIV-1 isolates from other clades and clade B isolates with insertion mutations (18; J. L. Harris, F. Lu, A. Bae, K. Borroto-Esoda, J. L. Stand, J. T. Houng, R. M. Lloyd, and B. J. McCreedy, Abstr. 3rd Int. Workshop on HIV Drug Resistance and Treatment Strategies, abstr. 96, p. 64, 1999).
We hypothesized that the genetic diversity of HIV-1 isolates obtained from chronically infected, previously treated individuals may result in significantly lower rates of concordance between the two sequencing methods at sites associated with sequence heterogeneity. We compared the RT amino acid assignments of the two methods using PCR products from viruses isolated from highly RT inhibitor-experienced, clade B HIV-1-infected individuals, particularly at RT sites expected to display considerable heterogeneity (codons associated with natural polymorphisms or drug resistance). Most of these specimens contained multiple mutations that confer resistance to nucleoside and nonnucleoside RT inhibitors (NNRTIs) (7). For a subset of specimens, we also generated and sequenced multiple molecular clones of PCR products by both methods to evaluate the role of genetic heterogeneity in discordant amino acid assignment.
Virus isolates were obtained from HIV-1-infected subjects participating in ACTG (AIDS Clinical Trials Group) protocol 241, a 48-week, randomized, double-blind, placebo-controlled study that compared zidovudine and didanosine in combination to zidovudine, didanosine, and nevirapine in combination (2). All subjects had baseline CD4+-T-cell counts of ≤350 cells/mm³ and ≥6 months (median, 25 months) of prior nucleoside analogue RT inhibitor therapy (zidovudine, didanosine, or zalcitabine). Prior to enrollment, 97% had received zidovudine, 48% had received didanosine, and 31% had received zalcitabine. The 49 virus isolates studied and described in this report were obtained from a subset of 22 participants in the ACTG protocol 241 virology substudy (7, 10).
Virus isolation was performed at the time of entry into the study (baseline isolates), 8 weeks into the study (early therapy isolates), and the end of the study (late therapy isolates, primarily from week 48). All subjects were receiving treatment (zidovudine and didanosine, with or without nevirapine) when on-therapy isolates were obtained, and all were taking nucleosides (zidovudine, didanosine, and/or zalcitabine) when baseline isolates were obtained. Plasma HIV-1 RNA levels were measured by quantitative reverse transcription-PCR (Roche Molecular Systems, Alameda, Calif.), with a lower limit of detection of 200 copies/ml, and were available for the same time of virus isolation for 47 of 49 isolates. One isolate originated from a subject with concurrent plasma HIV-1 RNA levels of <200 copies/ml. All other isolates were obtained when plasma HIV-1 RNA was detectable, and most (94%) were obtained when the plasma HIV-1 RNA level was >1,000 copies/ml.
Virus isolates were obtained by the ACTG quantitative microculture protocol (Division of AIDS and National Institutes of Health [http://www.niaid.nih.gov/daids/vir_manual]). Briefly, the subjects' peripheral blood mononuclear cells (PBMCs) were incubated at six serial fivefold dilutions with uninfected, phytohemagglutinin (Difco, Detroit, Mich.)-stimulated donor PBMCs. On day 14, cell-free supernatant from each well was assayed for HIV-1 p24 antigen (HIV-1 p24 enzyme-linked immunosorbent assay; NEN Life Science Products, Boston, Mass.). Supernatants from p24-positive wells with the least diluted subject PBMCs were combined to form the initial virus isolate stock. The initial virus stock (1 ml) was cultured with 10⁷ phytohemagglutinin-stimulated uninfected donor PBMCs. On day 7, infected cells were harvested and lysed (Puregene; Gentra Systems Inc., Minneapolis, Minn.).
PCRs were performed with XL rTth DNA polymerase (Perkin-Elmer, Foster City, Calif.) at 1 U/100 μl of reaction mixture with a hot start (94°C for 1 min) and two temperature steps in each thermal cycle. For population sequencing, PCR with 5′ primer PRT-T3 (5′-CAGACCAGAGCCAACAGCCCCA-3′; coordinates 2142 to 2163 on isolate NL4-3) and 3′ primer 3CA4171 (5′-TCCUTUGUGUGCUGGUACCCAUGC-3′; coordinates 4149 to 4172) was performed by using 30 cycles of amplification (denaturation at 94°C for 15 s, followed by annealing and extension at 65°C for 1 min and 30 sec), followed by a final incubation at 72°C for 10 min to yield a 2.0-kb product. For samples undergoing additional clonal analysis (see below), PCR was performed with primers that facilitate molecular cloning, 5′ primer 5CA2611 (5′-UAAACAAUGGCCAUTGACAGAAGA-3′; coordinates 2612 to 2635) and 3′ primer 3CA4171, under similar conditions to yield a 1.6-kb product. Primers were used at 1 μM, magnesium acetate was used at 1.2 mM, and deoxynucleoside triphosphates were used at 0.2 mM (the magnesium acetate and deoxynucleoside triphosphates were from Perkin-Elmer). PCR was performed in a GeneAmp PCR System 9600 thermocycler (Perkin-Elmer). The PCR products were purified with the QIAquick kit (Qiagen, Chatsworth, Calif.). Multiple negative, reagent-only control reactions were performed in each amplification, and sequencing was done only if these were negative.
The RT region including codons 1 to 242 was sequenced by fluoresceinated primer extension (ThermoSequenase Fluorescent Labeled Primer Cycle Sequencing Kit; Amersham Life Science, Buckinghamshire, England), followed by gel electrophoresis on an automated sequencer (ALF and ALF Manager 2.6 base-calling software; Pharmacia, Uppsala, Sweden). Sequencing primers were as follows: sense primers 5FP11 (5′-TTGGGCCTGAAAATCCATACAAT-3′; coordinates 2698 to 2720 on isolate NL4-3), 5F127 (5′-ATACTGCATTTACCATACCTAG-3′; coordinates 2929 to 2950), and 5F215 (5′-TTAGAAATAGGGCAGCATAG-3′; coordinates 3126 to 3145) and antisense primers 3FP12 (5′-CTGCGGGATGTGGTATTCCTAA-3′; coordinates 2823 to 2844), 3F150 (5′-TGGAAGCACATTGTACTGATA-3′; coordinates 2979 to 2999), and 3F262 (5′-TCCCACTAACTTCTGTATGTC-3′; coordinates 3315 to 3335). If <300 nucleotides were automatically base called, sequencing was repeated. Ambiguity codes (International Union of Pure and Applied Chemistry [IUPAC]) were assigned by the software to indicate mixtures at any position with a secondary peak that was at least 20% of the primary peak height. Sequences were aligned and compared to the sequences of reference HIV-1 isolates (isolates NL4-3 and HxB2) with Sequencher 3.1 software (GeneCodes, Ann Arbor, Mich.). An additional review for mixtures was performed for any nucleotide position that differed from a reference sequence and for all resistance mutation-associated codons. Sequences were compared to each other and to the reference sequences to exclude cross-contamination.
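The 20% secondary-peak rule for assigning IUPAC ambiguity codes can be illustrated with a short sketch. This is not the ALF Manager base-calling software; the function name `call_base` and the dictionary-of-peak-heights input format are hypothetical and are used only to make the rule concrete.

```python
# IUPAC nucleotide codes keyed by the set of bases present at a position.
IUPAC = {
    frozenset("A"): "A", frozenset("C"): "C",
    frozenset("G"): "G", frozenset("T"): "T",
    frozenset("AG"): "R", frozenset("CT"): "Y",
    frozenset("CG"): "S", frozenset("AT"): "W",
    frozenset("GT"): "K", frozenset("AC"): "M",
    frozenset("CGT"): "B", frozenset("AGT"): "D",
    frozenset("ACT"): "H", frozenset("ACG"): "V",
    frozenset("ACGT"): "N",
}


def call_base(peaks, threshold=0.20):
    """Return the IUPAC code for one sequence position.

    peaks: dict mapping 'A', 'C', 'G', 'T' to peak heights
           (a hypothetical input format, assumed for illustration).
    A base is reported when its peak is at least `threshold` times
    the height of the primary (tallest) peak.
    """
    primary = max(peaks.values())
    if primary <= 0:
        return "N"  # no signal at all: fully ambiguous
    present = frozenset(
        base for base, height in peaks.items()
        if height >= threshold * primary
    )
    return IUPAC[present]


# A secondary G peak at 25% of the primary A peak is reported as a mixture (R).
print(call_base({"A": 1000, "C": 0, "G": 250, "T": 0}))
# A secondary G peak at 15% is below the cutoff, so only A is called.
print(call_base({"A": 1000, "C": 0, "G": 150, "T": 0}))
```

Under this rule a minority variant contributing a peak below the cutoff is simply not reported, which is consistent with the limited sensitivity of population cycle sequencing for minor species noted above.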
A 1.2-kb DNA fragment of HIV-1 pol was amplified by PCR from cell culture lysates by using the conditions and primers described previously (6). Multiple negative, reagent-only control reactions were performed in each amplification, and sequencing was done only if these were negative.
The RT region from codons 1 to 242 was sequenced by a method of hybridization sequencing with high-density oligonucleotide arrays (Affymetrix GeneChip HIV PRT440 nucleic acid sequencing) to determine the nucleotide sequence of a PCR product (11). PCR products were first transcribed with T3 or T7 RNA polymerase (Promega, Madison, Wis.) in the presence of fluorescein-labeled UTP (Boehringer Mannheim, Indianapolis, Ind.). The fluorescein-labeled RNA fragments were randomly sheared by alkaline hydrolysis and were then hybridized to the PRT440 sense and antisense chips (Affymetrix, Santa Clara, Calif.). The chips were scanned with a confocal laser microscope, and the nucleotide sequences were generated by integrating the sense and antisense chip data with GeneChip 2.0 Rule Algorithm software (Affymetrix). Ambiguities were assigned an IUPAC code. Nucleotide sequences were exported and translated to amino acid sequences with Sequencher 3.1 software (GeneCodes). Prior to analysis, all sequences were evaluated to exclude cross-contamination with either a laboratory strain or an isolate from another subject (5).
PCR products amplified with primers 5CA2611 and 3CA4171 were used for cloning. These primers contained dUMP instead of dTMP, which allowed HIV-1 sequence-specific uracil deglycosylase-mediated cloning into vector pRTdel (13). Vector plasmid was mixed with isolate-derived PCR product, and the mixture was incubated with 1 U of uracil deglycosylase (GIBCO-BRL, Gaithersburg, Md.) in an annealing buffer (20 mM Tris-HCl [pH 8.4], 50 mM KCl, 1.5 mM MgCl₂) at 37°C for 20 min and then at 65°C for 10 min. This generated overlapping, complementary, single-stranded ends that subsequently annealed. Escherichia coli DH10β (GIBCO BRL) was transformed with the mixture, and colonies with an insert of the appropriate size in a restriction enzyme digest were selected and grown. The plasmid DNA was then isolated (QIAwell System; Qiagen) for sequencing by both methods.
The RT region from codons 1 to 242 was analyzed in 49 virus isolates from 22 subjects. For each RT codon position, the number of evaluable isolates ranged from 45 to 49 because some positions in six isolates did not have readable sequence, despite repeated assays. Clonal analysis was done with three isolates, with the RT region from codons 1 to 242 sequenced by both methods for a total of 25 molecular clones.
Amino acid sequences were imported into Excel (version 7.0; Microsoft, Redmond, Wash.). All differences between sequences determined by both methods were noted. Additional analyses were performed with codons known to be associated with antiretroviral drug resistance and non-resistance-associated codons that displayed significant amino acid polymorphism. Major nucleoside RT inhibitor resistance-associated changes were defined to include Met41Leu, Asp67Asn, Lys70Arg, Leu210Trp, Thr215Tyr or Phe, and Lys219Gln or Glu for zidovudine and Lys65Arg, Thr69Asp or Asn, Leu74Val, and Met184Val for didanosine and zalcitabine (14). Major changes associated with resistance to currently available NNRTIs (including nevirapine, delavirdine, and efavirenz) were Ala98Gly, Leu100Ile, Lys101Glu, Lys103Asn or Arg, Val106Ala or Ile, Val108Ile, Val179Asp, Tyr181Cys, Tyr188Leu or His, and Gly190Ala or Ser (14). None of the specimens had resistance mutations associated with the codon 151 complex (16, 17) by either sequencing method or with insertions in the vicinity of codon 69 (19) by cycle sequencing, so these positions were not included in this analysis. Highly polymorphic codons were arbitrarily defined as those which had an amino acid other than the dominant one in >10% of samples by either of the two sequencing methods. Differences in assignment of amino acids by each of the two sequencing methods at each position were determined both for discordances (different unique amino acids assigned by each method) and for ambiguities (an IUPAC code for a sequence that does not encode a unique amino acid). Statistical significance was assessed by chi-square testing, and all reported P values are two-tailed.
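The definitions of concordance, discordance, and ambiguity given above can be expressed as a minimal sketch. The analysis in this study was done in Excel; the code below is only an illustration, and the function names `amino_acids` and `classify` are hypothetical. It assumes the standard genetic code and treats a codon as ambiguous when its IUPAC codes expand to more than one amino acid.

```python
from itertools import product

# Standard genetic code, bases ordered T, C, A, G at each codon position.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

# Bases represented by each IUPAC nucleotide code.
IUPAC_BASES = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def amino_acids(codon):
    """Set of amino acids encoded by a codon that may contain IUPAC codes."""
    options = (IUPAC_BASES[base] for base in codon.upper())
    return {CODON_TABLE["".join(triplet)] for triplet in product(*options)}


def classify(codon_cycle, codon_hyb):
    """Classify one codon position from the two methods' nucleotide calls."""
    aa_cycle = amino_acids(codon_cycle)
    aa_hyb = amino_acids(codon_hyb)
    if len(aa_cycle) > 1 or len(aa_hyb) > 1:
        return "ambiguous"      # no unique amino acid by one or both methods
    if aa_cycle == aa_hyb:
        return "concordant"     # same unique amino acid by both methods
    return "discordant"         # different unique amino acids


print(classify("AAC", "GAC"))   # Asn versus Asp
print(classify("AAA", "AAG"))   # both Lys, although the nucleotides differ
print(classify("RAC", "GAC"))   # R = A or G, so Asn or Asp by one method
```

Because the comparison is made at the amino acid level, a nucleotide difference or an IUPAC code that does not change the encoded amino acid (for example, GGR for Gly) still counts as concordant.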
Nucleotide sequences have been submitted to GenBank under the accession numbers AF156033-6, AF156052-6, AF156074-5, AF156077-85, AF156087, AF166026-7, AF166029-31, AF166038-41, AF166059-66, AF166068-9, AF166078-81, and AF198038-42.
Of 11,677 amino acids deduced by both sequencing methods with isolate-derived population PCR products, 11,376 (97.4%) were concordant (Table 1). Ninety-nine (0.8%) were discordant, and 202 (1.7%) had an ambiguous determination by one or both methods. The number of discordant amino acid determinations per isolate ranged from zero to seven (2.9%). In contrast, for the molecular clones sequenced by both methods, 5,901 of 5,994 (98.4%) evaluable amino acids were concordant. Twenty-one (0.4%) were discordant, a rate significantly lower than that for isolate-derived population sequencing (P < 0.001).
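The comparison of discordance rates between population and clonal sequencing can be reproduced approximately from the counts reported above. The sketch below is an illustration under stated assumptions: it uses a 2 × 2 table of discordant versus all other calls and a Pearson chi-square test without continuity correction, neither of which is specified in detail in Materials and Methods. The function name `chi_square_2x2` is hypothetical.

```python
import math


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for [[a, b], [c, d]].

    Returns (statistic, two-tailed P value) for 1 degree of freedom.
    """
    n = a + b + c + d
    stat = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # For 1 degree of freedom, P(X > stat) = erfc(sqrt(stat / 2)).
    p_value = math.erfc(math.sqrt(stat / 2))
    return stat, p_value


# Counts taken from the text: discordant calls out of all evaluable calls.
pop_discordant, pop_total = 99, 11677
clone_discordant, clone_total = 21, 5994

stat, p_value = chi_square_2x2(
    pop_discordant, pop_total - pop_discordant,
    clone_discordant, clone_total - clone_discordant,
)
print(f"chi-square = {stat:.2f}, P = {p_value:.2g}")
```

With these counts the statistic exceeds 10.83, the critical value for P = 0.001 with 1 degree of freedom, which agrees with the reported P < 0.001.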
Among 20 codons associated with RT inhibitor resistance listed in Materials and Methods, 972 amino acids were deduced by both methods and 902 (92.8%) were concordant. Thirty-eight (3.9%) were discordant and 32 (3.3%) had an ambiguous determination by one or both methods, rates significantly higher than those for non-resistance-associated codons (P < 0.001) (Table 1). The 38 discordant amino acids in resistance-associated codons represented a substantial portion (38%) of the discordance seen at all sites. The proportion of ambiguous amino acids in the 20 resistance-associated codons was higher by cycle sequencing (2.8%) than by hybridization sequencing (0.8%) of isolate-derived population PCR products (P < 0.01) (Table 1). In contrast, sequencing of molecular clones by both methods showed no discordance for 494 amino acids at resistance-associated codons, significantly less than that by isolate-derived population sequencing (P < 0.001).
The majority of the discordances (22 of 38) for resistance-associated codons occurred at codon 67 (Table 2). Among the 49 isolates in this study, 22 (45%) had discordances at codon 67, 23 (47%) had concordance, and 4 (8%) had an ambiguous amino acid by one or both methods. In each case of discordance, hybridization sequencing called the codon wild-type Asp (GAC), whereas cycle sequencing called it the zidovudine resistance-associated Asn (usually AAC and less commonly AAT). By hybridization sequencing, only 2 isolates (4%) had 67Asn, whereas by cycle sequencing, 25 isolates (51%) had this mutation.
The nucleotide sequence in the vicinity of codon 67, particularly in codon 66, influenced the results of hybridization sequencing of the isolate-derived population PCR product. By use of the cycle sequencing result as the reference standard, 43 isolates with unambiguous nucleotide sequence at codons 66 and 67 by both methods were compared (Table 3). All isolates had 66Lys encoded by either AAG or AAA. Interestingly, among the 20 isolates with 66Lys encoded by AAG, all had the zidovudine resistance-associated 67Asn, usually encoded by AAC. Hybridization sequencing miscalled all of these specimens at codon 67 as GAC, which codes for wild-type Asp. Most of these were also miscalled at codon 66, usually as AAA, which also codes for 66Lys. In contrast, among 23 isolates with 66Lys encoded by AAA, as determined by cycle sequencing, 19 had wild-type 67Asp (encoded by GAC or GAT) and 4 had the zidovudine resistance-associated 67Asn (encoded by AAC or AAT). The hybridization sequencing results agreed with the cycle sequencing results for all isolates in this context except the two with 67Asn encoded by AAT (which were miscalled GAC [Asp]). Thus, population-based hybridization sequencing detected the zidovudine resistance-associated 67Asn only when it was encoded by AAC in the context of 66Lys encoded by AAA, a small minority of all specimens with the 67Asn mutation.
These limitations were not present when molecular clones were tested by hybridization sequencing. All clones were concordant by the two methods at codons 66 and 67 (Table 3). Both isolates in which sequencing of clones by both methods showed only 67Asn had a population hybridization sequence that showed 67Asp.
Sixteen discordant calls occurred at eight resistance-associated codon positions other than codon 67 in isolate-derived population sequences, with the proportion of discordant calls ranging as high as 6% for codons 100 and 106, which are associated with NNRTI resistance (Table 2). No discordance was seen for the remaining 11 resistance-associated codons. Nevertheless, even when data for codon 67 were excluded, the discordances (1.7%) and ambiguities (3.0%) for the remaining resistance-associated codons were more common than those for non-resistance-associated codons (P < 0.01). These cases of discordance are shown in detail in Table 4. Eight of them occurred in codons associated with nucleoside RT inhibitor resistance. In six of these cases, a known resistance-associated amino acid was deduced by cycle sequencing and either the wild-type or a non-resistance-associated amino acid was deduced by hybridization sequencing. Eight other discordances occurred in codons associated with NNRTI resistance. In four of these cases, a known resistance-associated amino acid was deduced by cycle sequencing and either the wild-type or a non-resistance-associated amino acid was deduced by hybridization sequencing.
The nucleotide sequence in the vicinity also influenced the result of sequencing by hybridization in several of these cases of discordance, similar to the situation with codon 67. For instance, in the only two isolates with the NNRTI resistance-associated ATA (Ile) at codon 106 determined by cycle sequencing, AGA was called by hybridization sequencing. This encodes the amino acid Arg, which is unusual at that codon position. Nine clones from one of these isolates (from subject E in Table 5) sequenced by both methods showed only ATA (Ile), in agreement with the population PCR product cycle sequencing results. A second example involved RT codons 99 and 100. In the only three isolates with GGA (Gly) at codon 99 as determined by cycle sequencing, hybridization sequencing miscalled codon 99 as GGG (also Gly) and also miscalled codon 100 as ATA (Ile, associated with NNRTI resistance at this position) instead of the wild-type TTA (Leu) as determined by cycle sequencing. A third example involved codons 69 and 70. In the only two isolates with AGG (Arg) at codon 70, hybridization sequencing miscalled codon 69 as the wild-type ACT (Thr) instead of the nucleoside resistance-associated AAT (Asn) as determined by cycle sequencing.
Nineteen non-resistance-associated RT codon positions defined here as polymorphic (sites where >10% of isolates had nondominant amino acids as determined by one or both methods) were examined next: codons 35, 39, 43, 44, 60, 83, 104, 118, 122, 123, 135, 162, 178, 196, 200, 202, 207, 211, and 214. Overall, 919 amino acids were deduced by both methods at these positions, and 856 (93.1%) were concordant. Twenty-one (2.3%) were discordant and 42 (4.6%) had an ambiguous determination by one or both methods, all of which were higher than the rates for nonpolymorphic, non-resistance-associated codons (P < 0.001) (Table 1). The rate of ambiguous amino acid determinations among the polymorphic codons was higher by cycle sequencing (3.0%) than by hybridization sequencing (1.6%), but this did not reach statistical significance (P = 0.06).
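The working definition of a polymorphic codon (a nondominant amino acid in >10% of samples by either method) can be sketched as follows. This is an illustration only; the function name `is_polymorphic` and the input layout are hypothetical, and how ambiguous calls were counted in the original analysis is not stated, so the sketch simply treats every call as given.

```python
from collections import Counter


def is_polymorphic(calls_by_method, cutoff=0.10):
    """Apply the >10% nondominant amino acid rule at one codon position.

    calls_by_method: dict mapping a method name to the list of amino acid
    calls at this codon across isolates (hypothetical input layout).
    Returns True if, by at least one method, more than `cutoff` of the
    calls differ from that method's dominant amino acid.
    """
    for calls in calls_by_method.values():
        if not calls:
            continue
        dominant_count = Counter(calls).most_common(1)[0][1]
        nondominant_fraction = (len(calls) - dominant_count) / len(calls)
        if nondominant_fraction > cutoff:
            return True
    return False


# Invented example: 49 isolates, 5 with a nondominant residue by one method.
example = {
    "cycle": ["V"] * 44 + ["I"] * 5,          # 5/49 = 10.2%, above the cutoff
    "hybridization": ["V"] * 47 + ["I"] * 2,  # 2/49 = 4.1%, below the cutoff
}
print(is_polymorphic(example))
```

The example data are invented to show the threshold and do not correspond to any codon in this study.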
Sixty-one of the 99 discordances seen in this study occurred at the 222 codon positions not associated with drug resistance. The number of discordances ranged from 0 to 8% per codon. For 11 codon positions, there was a systematic low-level discordance (4 to 6%) between the two methods, with hybridization sequencing calling the same nondominant amino acid in the majority of cases. As was seen with several resistance-associated codons, discordance at some of these codons was associated with the specific nucleotide sequence in adjacent codons. For example, in codon 47, cases of discordance resulted from cycle sequencing calling the codon ATT or ATC (both wild-type Ile) and hybridization sequencing calling it AAT or AAC (both variant Asn). A nucleotide sequence polymorphism (GGA rather than GGG, both of which code for Gly) at codon 45 was associated with the variant 47Asn as determined by hybridization sequencing. GGA at codon 45 allows the formation of a longer run of A residues upstream of codon 47, and this sequence is often erroneously reported to extend further into codon 47 by hybridization sequencing (one isolate had GGA AAA ATC [Gly-Lys-Ile] at codons 45 to 47 by cycle sequencing and GGA AAA AAC [Gly-Lys-Asn] by hybridization sequencing, and the other isolate had GGR AAA ATT [Gly-Lys-Ile] by cycle sequencing and GGA AAA AAT [Gly-Lys-Asn] by hybridization sequencing). In clonal analyses of three other specimens, only one clone showed a discordance in the amino acid at codon 47, with hybridization sequencing showing 47Asn and cycle sequencing showing 47Ile. This single discordant clone originated from an isolate with heterogeneity in clones in codon 45 by cycle sequencing (three clones with GGA AAA ATT for codons 45 to 47 and six clones with GGG AAA ATT, all of which code for Gly-Lys-Ile). 
As suggested from the findings of isolate-derived population PCR product sequencing, the single case of discordance was for a clone with the longer stretch of A residues because codon 45 was GGA, with hybridization sequencing miscalling the sequence GGA AAA AAT (which codes for Gly-Lys-Asn) instead of the correct GGA AAA ATT (which codes for Gly-Lys-Ile).
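The homopolymer effect described for codons 45 to 47 can be made concrete by counting the longest run of A residues in the sequences quoted above. The helper `longest_run` is a hypothetical illustration and is not part of either sequencing platform's software.

```python
def longest_run(seq, base="A"):
    """Length of the longest uninterrupted run of `base` in `seq`.

    Spaces between codons are ignored so that sequences can be written
    codon by codon, as in the text.
    """
    best = current = 0
    for ch in seq.replace(" ", "").upper():
        if ch == base:
            current += 1
            best = max(best, current)
        else:
            current = 0
    return best


# Codons 45 to 47 as reported in the text.
print(longest_run("GGG AAA ATC"))  # codon 45 GGG, cycle sequencing
print(longest_run("GGA AAA ATC"))  # codon 45 GGA, cycle sequencing
print(longest_run("GGA AAA AAC"))  # codon 45 GGA, hybridization miscall
```

With GGG at codon 45 the run is four A residues, with GGA it is five, and the hybridization miscall extends it to six by replacing the T at the second position of codon 47, which changes Ile to Asn.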
All codons that showed discordance or ambiguity in the analyses of isolate-derived population sequences of three isolates with available clones were examined in detail. For codons with discordant amino acid determinations by the two methods of population sequencing, the amino acid call for the majority of the molecular clone sequences (whether determined by hybridization sequencing or cycle sequencing) was usually in agreement with the cycle sequencing result in isolate-derived population sequencing (Table 5). In cases of ambiguous amino acid calls for the population sequences generated by either method, clonal sequencing by both methods usually agreed and resolved the ambiguity. However, the sequences obtained by clonal analyses usually agreed with the population sequences generated by cycle sequencing and often conflicted with the population sequences determined by hybridization sequencing (data not shown).
There was minimal discordance (0.8%) between sequencing by hybridization and cycle sequencing for determination of the HIV-1 RT amino acid sequences across all codons. However, the results of the methods were significantly more discordant for resistance-associated codons (3.9%) than for other codons (0.6%). Discordances were most often seen for codon 67 (45% of isolates), but even if the data for codon 67 were excluded from the analysis, the discordance rate would still be significantly higher for resistance-associated codons (1.7%) than for other codons. Similarly, for highly polymorphic RT codons that are not known to be associated with resistance, the discordance rate (2.3%) was significantly higher than the overall rate. These observations suggest that increased heterogeneity in the nucleotide sequence at specific sites in HIV-1 RT is associated with more discordance between the two methods when the population PCR product is used.
The low level of amino acid discordance overall in our study is in agreement with the low level of nucleotide discordance reported in two earlier studies of clade B HIV-1 isolates. One study found a nucleotide discordance rate of 0.8% for the protease region of 114 clinical specimens from protease inhibitor-naive individuals (11). Another study of 12 zidovudine-experienced individuals (including some with a history of receipt of other nucleoside RT inhibitors with or without the protease inhibitor indinavir) found a nucleotide discordance rate of 0.8% for 29 clinical specimens (6). A relatively poor concordance in the nucleotide sequence at RT codon 67 was suggested in the latter study (6) and a recent report (Harris et al., Abstr. 3rd Int. Workshop on HIV Drug Resistance and Treatment Strategies). We extend these observations by studying patients with greater prior experience with antiretroviral agents, by using a different platform for cycle sequencing, by comparing the amino acid sequences deduced by each method at resistance- and non-resistance-associated codons, and by exploring the causes of discrepancies in deduced amino acids.
One source of discrepancy may be that different strengths of hybridization, based on nucleotide composition, affect hybridization sequencing and not cycle sequencing. In particular, if mixtures are present (as is often seen in population sequencing of HIV-1), hybridization sequencing may preferentially detect a C or a G residue rather than an A or a T residue because of stronger base pairing. This does not explain the frequent discordances for codon 67 since a mixture was not detected by cycle sequencing for those cases. Furthermore, stronger hybridization between C and G in mixtures cannot explain other discordances, such as those for codon 66, with cycle sequencing indicating AAG and hybridization sequencing indicating AAA. Another possible explanation for discrepancy is that AT-rich regions may be more problematic for hybridization sequencing due to the relatively lower strength of base pairing there. Although the immediate region of codon 67 is relatively AT rich, several other regions in HIV-1 RT are equally AT rich but do not display as high a rate of discordance (data not shown).
A more complex source of discrepancy was characterized here: the influence of distinct neighboring nucleotide polymorphisms (at times, several nucleotides away) on the position of interest. Discrepancies in codon 67 (with GAC being called by hybridization sequencing instead of AAC or AAT) were often associated with a distinct nucleotide sequence polymorphism in codon 66 (AAG instead of AAA). Similar associations were made between polymorphisms in codon 99 and discordance in codon 100, codon 70 and discordance in codon 69, and codon 45 and discordance in codon 47. The sequences associated with discrepancies tended to be relatively uncommon and therefore may not have been adequately represented by the interrogating oligonucleotide probes used for hybridization sequencing of population PCR products. These observations suggest that the heterogeneity of sequences at specific sites in HIV-1 may cause significant miscalls in hybridization sequencing. Moreover, the heterogeneity associated with miscalls is likely to be in codons adjacent to the one of interest. Identification of the positions that contribute to miscalls, as in this study, will facilitate improvements in probe arrays.
Genetic mixtures at the site of interest also contribute to the discrepancies noted between hybridization and cycle sequencing. Mixtures at the base of interest were the major cause of ambiguities in cycle sequencing. However, if the presence of mixtures at the site of interest was the only factor that resulted in discordance or ambiguity with hybridization sequencing, we would expect that ambiguous amino acid determination by one method should correlate with ambiguous amino acid determination by the other method. Of the 202 amino acids with ambiguity, 196 were ambiguous by only one method and only 6 were ambiguous by both methods. This suggests that factors other than mixtures in the codon of interest played important roles in hybridization sequencing ambiguities or discordances. These factors include heterogeneity in adjacent codons genetically linked to a homogeneous codon of interest.
In order to evaluate further the role of genetic mixtures in cases of discordance and ambiguity, we examined the performances of the two sequencing methods using genetic clones. In cases in which cloned PCR products were sequenced, there was excellent agreement between cycle sequencing and hybridization sequencing. The rate of discordance was lower for clones than for population sequencing for resistance-associated codons as well as other codons. Even when the methods gave discordant or ambiguous results with population PCR products, the clone sequences were highly concordant by both methods and generally agreed with the cycle sequencing results for population PCR products. Interestingly, nucleotide sequence polymorphisms which were associated with hybridization sequencing miscalls by population sequencing (such as for codons 66 and 67 and codon 106) did not cause hybridization sequencing miscalls in the sequencing of the clones. This provides further support that minority variants with heterogeneity either at the position of interest or at nearby positions which affect hybridization constituted a major cause of discordance in population sequencing. Indeed, it suggests that hybridization sequencing may be most accurate when used to sequence genetically homogeneous, cloned templates of HIV-1. This may minimize the need to continuously update the probe array to account for ongoing genetic changes in HIV-1, although some regions problematic even in clones (Table 5, RT codon 35) will still warrant improvements in probes. Because of the more automated nature of hybridization sequencing, the use of cloned templates for hybridization sequencing is more feasible than the use of multiple cloned templates for cycle sequencing.
Genetic mixtures were the major cause of ambiguous amino acid determinations by cycle sequencing, which occurred at a significantly higher rate than with hybridization sequencing. This was notable for RT codon 215 (the site of a zidovudine resistance mutation more pivotal and common than that at RT codon 67), at which 10% of isolates had an ambiguous amino acid sequence as determined by cycle sequencing. This led to a low concordance rate (85%) at that position, despite a low discordance rate (4%). One likely explanation for fewer ambiguities in amino acids by hybridization sequencing in our study relates to the software for nucleotide assignment used in hybridization sequencing, which calls only the dominant base at each position. However, an evaluation of the ability to detect mixtures of wild-type and mutant amino acids at RT codons 41, 184, and 215 showed that two laboratories that used sequencing by hybridization performed at least as well as 21 other laboratories that used cycle sequencing (15).
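The three outcome categories used in this comparison (concordant, discordant, and ambiguous by at least one method) can be illustrated with a minimal sketch. The representation of each call as a set of one-letter amino acid codes, the function name `classify_call`, and the example calls are all assumptions made for illustration; they are not the study's data or software.

```python
from collections import Counter


def classify_call(cycle_call, hyb_call):
    """Classify one codon from the two methods' calls.

    Each call is a set of one-letter amino acid codes; a set with more
    than one member stands for an ambiguous (mixed) determination.
    """
    # Ambiguity by at least one method takes precedence over the comparison.
    if len(cycle_call) != 1 or len(hyb_call) != 1:
        return "ambiguous"
    return "concordant" if cycle_call == hyb_call else "discordant"


# Hypothetical calls at RT codon 215 for four isolates (not study data),
# given as (cycle sequencing call, hybridization sequencing call).
calls = [
    ({"T"}, {"T"}),       # both call wild-type Thr
    ({"Y"}, {"Y"}),       # both call mutant Tyr
    ({"T", "Y"}, {"Y"}),  # cycle sequencing reports a mixture
    ({"Y"}, {"T"}),       # unambiguous but different calls
]

summary = Counter(classify_call(c, h) for c, h in calls)
print(dict(summary))
```

Under this scheme, a method that reports mixtures lowers the concordance rate at a codon without raising the discordance rate, which is the pattern described above for codon 215.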
Our methods identified resistance mutations in virus isolates from PBMCs, but virus isolation is neither recommended nor preferred over direct testing of plasma HIV-1 RNA. HIV-1 resistance testing is now available and should be performed from plasma HIV-1 RNA. Nevertheless, the conclusions of this study, which began before sequencing from plasma HIV-1 RNA was standardized, are relevant to HIV-1 genotyping from plasma HIV-1 RNA. Genotypic data derived by the sequencing of PCR products amplified from either the HIV-1 RNA circulating in patient plasma or from virus isolate-infected cultured cell DNA have yielded identical results for subjects with plasma HIV-1 RNA that rebounded during therapy with NNRTIs and nucleoside RT inhibitors (L. Bacheler, O. Weislow, S. Snyder, G. Hanna, and R. D'Aquila, Conf. Rec. 12th Int. Conf. AIDS, abstr. 41213, p. 784, 1998; V. Johnson, M. Saag, J. Decker, J.-P. Sommadossi, M. Myers, S. Cort, D. Hall, J. Griffin, J. Lifson, and G. Shaw, Proc. Third Int. HIV Drug Resistance Workshop, abstr. 70, 1994). The dominant replication-competent virus present in vivo is identifiable either in plasma HIV-1 RNA or in virus isolates in PBMCs. Furthermore, in chronically infected patients, the sequences from both sources of the HIV-1 nucleic acid display considerable genetic heterogeneity. Therefore, our conclusions on the effects of heterogeneous population PCR products on the performance of the two methods of sequencing would remain valid for either plasma virus- or culture isolate-derived sequences. Our results suggest the hypothesis that there will be less discordance between the methods when sequencing templates have less virus genetic heterogeneity.
An important research use of nucleotide sequencing of HIV-1 RT is identification of novel changes in amino acids associated with the failure of new therapies. Among the non-resistance-associated codons, our data on discordant cases suggest that hybridization sequencing may systematically miscall amino acids at certain codons (albeit infrequently). For instance, hybridization sequencing of the nevirapine-containing arm showed that the HIV-1 isolates from 4 of 41 subjects (9.8%) developed 47Asn during therapy (data not shown). This amino acid was not seen in any baseline isolate and is not known to be a naturally occurring polymorphism. This result, in isolation, suggests that RT Ile47Asn may be selected by nevirapine therapy. Two of the four isolates with 47Asn were sequenced by both methods as part of the present study. This amino acid call was not confirmed by cycle sequencing of either of these two isolates. This suggests that if hybridization sequencing is to be used to identify new drug-selected substitutions that occur at low frequencies, the potentially new mutations should be confirmed by another method.
An increasingly common and more clinically important use of HIV-1 genotyping is determination of the presence of known resistance-associated mutations and use of this information to choose drugs, particularly for salvage regimens (3, 9). Hybridization sequencing had excellent concordance with cycle sequencing at most codons studied here (including RT codons 41 and 181, in which 100% concordance was seen, despite the presence of resistance mutations in >15% of isolates). The performance of hybridization sequencing and each particular method of cycle sequencing of population PCR products should be validated for each known resistance mutation of clinical interest by using an adequate number of samples of heterogeneous virus populations from treatment-experienced individuals. Indeed, improved probe arrays can be developed to correct defined problems, such as those identified here, as has already been done for the HIV-1 protease and RT chips, which have now replaced the version used in the present study. Because of its ease of use and with ongoing improvement in probe arrays, sequencing by hybridization will likely remain an important methodology for clinical laboratories.
This work was supported in part by the Adult AIDS Clinical Trials Group of the National Institute of Allergy and Infectious Diseases and by Public Health Service grants AI29193, AI01696, AI07387, AI27659, AI40876, AI32775, AI27767, AI32770, RR-00051, AI42567, AI27670, AI38858, AI36214, and AI29164 and contract 96VC001; Core Laboratory Research Facilities of the University of Alabama at Birmingham School of Medicine; University of Alabama at Birmingham Center for AIDS Research; Birmingham Veterans Affairs Medical Center; and Research Center for AIDS and HIV Infection of the San Diego Veterans Affairs Medical Center. J. Martinez-Picado was supported by a postdoctoral fellowship from the Spanish Ministry of Education.
We thank Minoo Bakhtiari, Caroline Ignacio, Jennifer Koel, Clinton Nail, Anu Savara, David Shugarts, and Russell Young for expert technical assistance; Andrew Leigh Brown and Huldrych Gunthard for help with the compilation of sequences and advice; and Joan Kaplan for constructive review of the manuscript.