J Virol. 2003 April; 77(8): 4836–4847.
PMCID: PMC152121

Mutation Patterns and Structural Correlates in Human Immunodeficiency Virus Type 1 Protease following Different Protease Inhibitor Treatments


Although many human immunodeficiency virus type 1 (HIV-1)-infected persons are treated with multiple protease inhibitors in combination or in succession, mutation patterns of protease isolates from these persons have not been characterized. We collected and analyzed 2,244 subtype B HIV-1 isolates from 1,919 persons with different protease inhibitor experiences: 1,004 isolates from untreated persons, 637 isolates from persons who received one protease inhibitor, and 603 isolates from persons receiving two or more protease inhibitors. The median number of protease mutations per isolate increased from 4 in untreated persons to 12 in persons who had received four or more protease inhibitors. Mutations at 45 of the 99 amino acid positions in the protease (including 22 not previously associated with drug resistance) were significantly associated with protease inhibitor treatment. Mutations at 17 of the remaining 54 positions were polymorphic but not associated with drug treatment. Pairs and clusters of correlated (covarying) mutations were significantly more likely to occur in treated than in untreated persons: 115 versus 23 pairs and 30 versus 2 clusters, respectively. Of the 115 statistically significant pairs of covarying residues in the treated isolates, 59 were within 8 Å of each other, many more than would be expected by chance. In summary, nearly one-half of HIV-1 protease positions are under selective drug pressure, including many residues not previously associated with drug resistance. Structural factors appear to be responsible for the high frequency of covariation among many of the protease residues. The presence of mutational clusters provides insight into the complex mutational patterns required for HIV-1 protease inhibitor resistance.

Drug resistance is a major obstacle to the effective treatment of human immunodeficiency virus type 1 (HIV-1) infection. Although 16 antiretroviral drugs have been approved for the treatment of HIV-1, cross-resistance within each of the three antiretroviral drug classes—nucleoside reverse transcriptase (RT) inhibitors, nonnucleoside RT inhibitors, and protease inhibitors—often leads to the development of multidrug resistance. HIV-1-specific protease inhibitors pose a high genetic barrier to drug resistance because multiple protease mutations are usually required for the development of resistance to these inhibitors (4, 13, 19). Nonetheless, resistance to multiple protease inhibitors occurs commonly, attesting to the conformational flexibility of the HIV-1 protease enzyme (5, 10, 13, 26).

Most of the published sequence data on protease inhibitor-associated mutations are based on isolates obtained from persons treated for no more than 1 year with a single inhibitor (4, 17, 19-21). Few published data are available from persons with carefully characterized treatment histories who have received more than one inhibitor (12), and the genetic mechanisms by which HIV-1 protease develops resistance to multiple inhibitors have not been explored. Understanding the genetic basis of multidrug resistance, however, is critical to designing new non-cross-resistant protease inhibitors that are active against current drug-resistant HIV-1 isolates.

To characterize the patterns of mutations in protease isolates from heavily treated persons, we collected and analyzed a large number of protease sequences of HIV-1 isolates obtained from persons with a range of protease inhibitor experiences. Our analysis allows us to extend previous observations of the mutational flexibility of HIV-1 protease and to identify interactions among protease mutations. We used published structural data to explore possible underlying causes for these interactions.


Virus isolates and sequences.

We analyzed HIV-1 subtype B protease sequences from persons with well-characterized antiretroviral treatment histories. These sequences were taken from previously published studies (appearing in the 15 April 2002 release of the Stanford University HIV RT and Protease Sequence Database) (25) and from sequencing performed at the Stanford University Hospital Diagnostic Virology Laboratory between 1 July 1997 and 31 December 2001. The isolates were subtyped by comparing them to reference sequences of known subtype (8, 15).

If multiple isolates were obtained from the same person during the course of protease inhibitor treatment, we included only the most recent isolate. We included two isolates from the same person only if a pre-protease inhibitor treatment isolate was also available. Only sequences that encompassed positions 10 to 90 were included in our analysis (96% included the complete protease, positions 1 to 99). All isolates were sequenced by dideoxynucleotide sequencing rather than by hybridization assays.


Mutations were defined as differences from the HIV-1 protease consensus B sequence (15). Of 2,244 sequences meeting the study criteria, 89% (1,990) were determined by direct PCR (population-based) sequencing and 11% (254) were determined by sequencing multiple clones of an isolate. About 1% of nucleotide positions in the sequences determined by direct PCR sequencing contained nucleotide mixtures (defined as the presence of a second electrophoretic peak of at least 20 to 30% of the primary peak). Positions with mixtures were scored as mutations in our analysis of mutation prevalence. However, because it is not possible to determine if these mutations were present in the same genome as other mutations in the sequence, mutations present as mixtures were excluded from our covariation analysis.

For the 254 isolates for which multiple clones were sequenced, we restricted our analysis to the clone that occurred with the highest frequency. This restriction caused us to exclude 128 mutations that were present in 30% or more of the clones from an individual (but that were not present in the clone with the highest prevalence) and to include 15 mutations that, although present in the most prevalent clone, existed in <30% of the total. This restriction was necessary to prevent the inclusion of mutations from different genomes in our covariation analysis. It did not significantly change the results of our analysis of mutation prevalence.

Statistical analysis. (i) Mutation prevalence.

We performed chi-square tests of independence to determine if there was an association between drug treatment and a mutation at each protease position. The chi-square statistic was based on a 2-by-2 contingency table containing the numbers of isolates from treated and untreated persons and the numbers of isolates with and without mutations.
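The per-position test described above can be sketched as follows. This is a minimal illustration and not the authors' code: it assumes a Pearson chi-square statistic without continuity correction (the text does not say whether a correction was applied), and the function name `chi_square_2x2` and the example counts are hypothetical.

```python
import math

def chi_square_2x2(a, b, c, d):
    """Pearson chi-square test of independence for the 2-by-2 table
    [[a, b], [c, d]], without continuity correction.

    Rows: treated / untreated isolates.
    Columns: isolates with / without a mutation at the position.
    Returns (statistic, p_value) for 1 degree of freedom.
    """
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    if 0 in (row1, row2, col1, col2):
        raise ValueError("every row and column total must be nonzero")
    stat = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
    # With 1 degree of freedom the upper-tail probability is erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value

# Hypothetical counts for a single position (not taken from the study):
# 300 of 1,240 treated isolates mutated, 20 of 1,004 untreated isolates mutated.
stat, p = chi_square_2x2(300, 940, 20, 984)
print(f"chi-square = {stat:.1f}, P = {p:.3g}")
```

Running one such test per protease position yields the list of P values that is then passed to the multiple-testing correction.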

To investigate whether there was a linear relationship between the number of protease inhibitors received and the prevalence of a mutation, we performed a logistic regression analysis in which the number of drugs was the independent variable and the presence or absence of mutation was the dependent variable. Persons were categorized in one of four groups according to the extent of treatment: one, two, three, and four or more protease inhibitors. Untreated persons were not included in this analysis.

For the chi-square and logistic regression analyses, we used the method of Benjamini and Hochberg to identify results that were statistically significant in the presence of multiple-hypothesis testing (1). This method was developed for the problem of multiple-hypothesis testing when multiple significant findings are not unexpected. As opposed to the Bonferroni correction, which divides the significance cutoff by the number of hypotheses tested (n), the Benjamini-Hochberg method ranks the hypotheses by their P values in ascending order, and each hypothesis of rank r is compared with a significance cutoff, now called a false-discovery rate (FDR), divided by (n/r). In this study, FDRs of 0.01 and 0.05 were used to determine statistical significance.
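As a sketch of the procedure in its standard form (reference 1), the P value of rank r among n hypotheses is compared with FDR × r/n, and every hypothesis up to the largest rank that passes is declared significant. The function name `benjamini_hochberg` and the example P values are illustrative assumptions, not the study's data.

```python
def benjamini_hochberg(p_values, fdr=0.05):
    """Benjamini-Hochberg step-up procedure.

    Returns a list of booleans, in the same order as p_values,
    marking the hypotheses declared significant at the given FDR.
    """
    n = len(p_values)
    # Indices of the hypotheses, ordered from smallest to largest P value.
    order = sorted(range(n), key=lambda i: p_values[i])
    # Largest 1-based rank r whose P value satisfies p_(r) <= fdr * r / n.
    max_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= fdr * rank / n:
            max_rank = rank
    significant = [False] * n
    for idx in order[:max_rank]:
        significant[idx] = True
    return significant

# Hypothetical P values for ten positions.
example = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205, 0.212, 0.216]
print(benjamini_hochberg(example, fdr=0.05))
```

Because the procedure is step-up, a P value that misses its own cutoff is still declared significant when a larger-ranked P value passes; a Bonferroni correction has no such behavior.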

(ii) Mutation covariation.

We investigated covariation between positions by calculating the binomial (phi) correlation coefficient for the simultaneous presence of mutations at two positions in the same isolate. The correlation coefficients were computed separately for the subsets of protease inhibitor-treated and untreated individuals. Statistically significant correlations were those with P values of ≤0.05 using a Bonferroni correction for 2,080 (i.e., the binomial coefficient $\binom{65}{2}$) pairwise comparisons. We further investigated the relationships among positions by performing a principal-components analysis (PCA) of positions found in the analysis described above to be mutated in treated persons. The matrix of binomial correlation coefficients was used as a measure of similarity between positions.
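The phi coefficient for one pair of positions can be sketched as below, assuming each isolate is coded 1 if it carries a mutation at the position and 0 otherwise. The function name `phi_coefficient`, the constant name, and the toy vectors are hypothetical; the per-comparison threshold simply applies the Bonferroni division described in the text.

```python
import math

# Bonferroni-adjusted per-comparison threshold for 2,080 pairwise tests.
BONFERRONI_ALPHA = 0.05 / 2080

def phi_coefficient(x, y):
    """Phi correlation between two equal-length 0/1 sequences, where each
    element records whether one isolate is mutated at a given position."""
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    n11 = sum(1 for a, b in zip(x, y) if a and b)        # mutated at both
    n10 = sum(1 for a, b in zip(x, y) if a and not b)    # first position only
    n01 = sum(1 for a, b in zip(x, y) if not a and b)    # second position only
    n00 = len(x) - n11 - n10 - n01                        # mutated at neither
    denom = math.sqrt((n11 + n10) * (n01 + n00) * (n11 + n01) * (n10 + n00))
    if denom == 0:
        # Undefined when either position is mutated in all or none of the isolates.
        return float("nan")
    return (n11 * n00 - n10 * n01) / denom

# Toy data for six isolates (not from the study).
pos_a = [1, 1, 1, 0, 0, 0]
pos_b = [1, 1, 0, 0, 0, 0]
print(round(phi_coefficient(pos_a, pos_b), 3))
```

For a 2-by-2 table, the chi-square statistic equals the number of isolates times phi squared, so a P value for each pair can be obtained from the same chi-square distribution with 1 degree of freedom.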

Mutational clusters were defined as clusters of three or more positions in which each member of the cluster was significantly correlated with the presence of each of the other members of the cluster (referred to as cliques in graph theory). Mutational clusters were identified by an exhaustive search technique that evaluated all possible clusters that could be formed from the statistically significant pairs of covarying residues.
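A cluster search of this kind can be sketched as a clique enumeration over the graph whose edges are the significantly correlated pairs. The sketch below uses the Bron-Kerbosch recursion (without pivoting) as a stand-in for the exhaustive search; the text does not describe the search in detail, and reporting only maximal cliques is an assumption made here. The function name and the example pairs are hypothetical.

```python
def maximal_cliques(pairs, min_size=3):
    """Enumerate maximal cliques with at least min_size members in the
    undirected graph whose edges are the given (position, position) pairs."""
    neighbors = {}
    for a, b in pairs:
        neighbors.setdefault(a, set()).add(b)
        neighbors.setdefault(b, set()).add(a)

    found = []

    def expand(clique, candidates, excluded):
        # Bron-Kerbosch recursion without pivoting: a clique is maximal
        # when no vertex remains that could extend it.
        if not candidates and not excluded:
            if len(clique) >= min_size:
                found.append(sorted(clique))
            return
        for v in sorted(candidates):
            expand(clique | {v}, candidates & neighbors[v], excluded & neighbors[v])
            candidates = candidates - {v}
            excluded = excluded | {v}

    expand(set(), set(neighbors), set())
    return sorted(found)

# Hypothetical significant pairs: positions 10, 63, 71, and 90 are all
# mutually correlated, whereas 30 and 88 form only a pair.
pairs = [(10, 71), (10, 90), (71, 90), (30, 88), (10, 63), (63, 71), (63, 90)]
print(maximal_cliques(pairs))
```

With 115 significant pairs the graph is small, so the exponential worst case of clique enumeration is not a practical concern.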

(iii) Structural analysis.

We used two published X-ray crystallographic structures (1hsg [3] and 1hhp [27]) and one molecular-dynamics simulation (23) of wild-type HIV-1 protease to examine the interresidue distances between positions with statistically significant frequencies of covariation. One X-ray crystallographic structure (1hsg) was of HIV-1 protease bound to indinavir, and one (1hhp) was of an unliganded enzyme. The molecular-dynamics simulation, based on the 1hhp structure, showed the flaps of the protease curled inward. The distance between two residues was considered to be the shortest interatomic distance between any atoms in the two residues. Interresidue distances were calculated between all positions in the protease dimer in each of the three structures. Residues within 8 Å of each other in at least one structure were considered to be neighboring pairs in the folded enzyme. This distance was chosen as a conservative maximum distance at which two residues may interact.
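The distance criterion can be sketched as follows, assuming each residue is available as a list of atomic (x, y, z) coordinates in Å and each structure as a mapping from position to such a list. The function names and the toy coordinates are hypothetical; parsing of the structure files is omitted.

```python
import math
from itertools import product

def min_interatomic_distance(residue_a, residue_b):
    """Shortest distance (in angstroms) between any atom of residue_a and
    any atom of residue_b; each residue is a list of (x, y, z) tuples."""
    return min(math.dist(p, q) for p, q in product(residue_a, residue_b))

def neighbors_in_any_structure(structures, pos_i, pos_j, cutoff=8.0):
    """True if the two positions lie within `cutoff` angstroms of each other
    in at least one structure (each a dict: position -> atom coordinates)."""
    return any(
        min_interatomic_distance(s[pos_i], s[pos_j]) <= cutoff
        for s in structures
    )

# Toy single-atom residues in two hypothetical structures.
structure_1 = {54: [(0.0, 0.0, 0.0)], 82: [(8.4, 0.0, 0.0)]}
structure_2 = {54: [(0.0, 0.0, 0.0)], 82: [(5.4, 0.0, 0.0)]}
print(neighbors_in_any_structure([structure_1, structure_2], 54, 82))
```

For the dimer, residues on the two chains would need distinct keys (for example, a chain label paired with the position) so that contacts across the dimer interface are also evaluated.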

When covariation could not be explained by the proximity of the two residues, we investigated the possibility that covariation resulted from the presence of one or more linking residues, a phenomenon called chained covariation (A. S. Lapedes, B. G. Giraud, L. C. Liu, and G. D. Stormo, presented at the AMS/SIAM Conference on Statistics in Molecular Biology, Seattle, Wash., 1997). To identify chained covariation, we performed a Markov chain analysis of the statistically significant pairs of covarying residues. This analysis finds the shortest chain between a residue pair, where the chain consists entirely of correlated residues within 8 Å of one another. We then counted the number of covarying pairs that could be explained by a chain of one, two, three, or more linking residues. To determine whether such chains were statistically significant, we performed a stepwise permutation analysis in which we randomly generated pairs of residues and determined whether these residues could also be linked by a chain consisting entirely of correlated, neighboring residues. Repeated permutations provided the expected number and distribution of chains of one, two, three, or more linking residues in a molecule having the size and topology of HIV-1 protease.
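The shortest-chain step can be illustrated with a breadth-first search over a graph whose edges join residues that are both significantly correlated and within 8 Å of each other. This is a stand-in for the Markov chain analysis named in the text, whose details are not given; the function name and the example edges are hypothetical.

```python
from collections import deque

def linking_residues(edges, start, goal):
    """Return the intermediate residues on a shortest chain from start to
    goal, or None if no chain exists. `edges` lists residue pairs that are
    both significantly correlated and structural neighbors."""
    graph = {}
    for a, b in edges:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set()).add(a)

    parent = {start: None}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if node == goal:
            # Walk back through the parents to recover the chain.
            path = []
            while node is not None:
                path.append(node)
                node = parent[node]
            path.reverse()
            return path[1:-1]  # drop the two endpoints
        for nxt in sorted(graph.get(node, ())):
            if nxt not in parent:
                parent[nxt] = node
                queue.append(nxt)
    return None

# Hypothetical edges: residues 10 and 40 are linked through 50 (one linking
# residue) and also through 20 and 30 (two linking residues).
edges = [(10, 20), (20, 30), (30, 40), (10, 50), (50, 40)]
print(linking_residues(edges, 10, 40))
```

The permutation analysis would repeat the same search for randomly drawn residue pairs and compare the resulting distribution of chain lengths with the observed one.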

Nucleotide sequence accession numbers.

The nucleotide sequences, mutations, drug treatment histories, and GenBank accession numbers can be downloaded as a PDF file. Of the 599 previously unpublished isolates sequenced at the Stanford University Hospital between 1 July 1997 and 31 December 2001, 383 had already been submitted to GenBank for a study of HIV subtypes in northern California (8); 216 new sequences were submitted to GenBank with this report (AF544406 to AF544621).


Protease inhibitor treatments.

Sequences of 2,244 protease isolates from 1,919 persons met the study selection criteria. Two isolates, one before and one after a protease inhibitor was received, were included from each of 325 persons. The sequences of 1,645 isolates from 1,344 individuals were published previously; the sequences of 599 isolates from 575 individuals have not been published previously. Table 1 groups the isolates in the study according to the protease inhibitor treatments of the persons from whom isolates were obtained. Indinavir, saquinavir, and nelfinavir were each received by >500 persons. Ritonavir was received by 456 persons, ~60% of whom were receiving ritonavir at a low dose as part of a dual protease inhibitor combination. One hundred fifteen persons received amprenavir, which was approved in 1999, and eight persons received lopinavir, which was approved in 2001.

TABLE 1.
HIV-1 isolates and protease inhibitor exposure

Protease mutations and their association with treatment.

The median number of protease mutations per isolate increased in proportion to the number of protease inhibitors received, from 4 mutations per isolate in untreated persons to 12 mutations per isolate in persons receiving four or more inhibitors (Fig. 1). Table 2 shows the mutation frequencies of the 99 protease positions according to the number of protease inhibitors received. Based on our chi-square analysis, mutations at 45 positions were found to be treatment associated in that mutation frequencies were significantly associated with treatment with at least one protease inhibitor. An additional 17 positions had non-treatment-related polymorphisms; these positions had mutations, but the mutation frequencies were not statistically associated with protease inhibitor treatment. The remaining 37 positions had mutation frequencies of <0.5%, even in isolates exposed to treatment, and were considered invariant.

FIG. 1.
Histograms of mutation frequency according to the number of protease inhibitors (PIs) received. The median number of mutations (differences from the consensus B sequence) increased from 4 in untreated persons to 12 in persons receiving ≥4 inhibitors. ...

TABLE 2.
Mutation frequencies at protease positions 1 to 99 according to the number of protease inhibitors received

The 45 treatment-associated positions included 23 positions previously associated with drug resistance (10, 20, 24, 30, 32, 33, 36, 46, 47, 48, 50, 53, 54, 60, 63, 71, 73, 77, 82, 84, 88, 90, 93) and 22 positions which had not previously been associated with drug resistance (11, 13, 22, 23, 34, 35, 43, 45, 55, 58, 62, 66, 72, 74, 75, 76, 79, 83, 85, 89, 92, 95). Thirteen of the 22 newly described treatment-associated positions (positions 11, 22, 23, 45, 58, 66, 74, 75, 76, 79, 83, 85, 95) showed little or no variation (mutation frequencies of <0.5%) in untreated persons, as shown in Table 2, column 0. These 13 positions played a significant role in HIV-1 protease variation, with mutations occurring in 92 of 637 (14.4%) persons receiving a single inhibitor and 162 of 603 (26.9%) persons receiving two or more inhibitors. These mutations usually occurred in isolates with one or more primary protease inhibitor resistance mutations (219 of 254 [85.8%]).

Our logistic regression analysis revealed that mutations at 24 positions had statistically significant positive linear relationships between the number of protease inhibitors received and the presence of a mutation (Table 2). The positions with the strongest linear relationships were positions 10, 20, 46, 53, 54, 63, 71, 73, 82, 84, and 90. There was a statistically significant negative linear relationship between the number of inhibitors and the presence of a mutation at position 30.

Locations of protease mutations within the enzyme's three-dimensional structure.

The invariant HIV-1 protease positions include the active-site positions (positions 25 to 27); other positions in or near the substrate cleft (positions 28 to 29, 31, and 80 to 81); most of the N- and C-terminal domains, which together with the active site make up the dimer interface; and other positions that appear to be associated with maintaining the enzyme's conformation and flexibility (e.g., 10 conserved glycines, including 3 in the flexible tips of the enzyme flap at positions 49, 51, and 52). The polymorphic positions are found almost entirely in surface loops.

The 23 known drug resistance positions include six substrate cleft residues (positions 30, 32, 48, 50, 82, and 84); four flap tip residues (positions 46, 47, 53, and 54); position 90, which although not in the substrate cleft decreases susceptibility to multiple protease inhibitors when mutated; three additional residues which are generally mutated only in treated persons (positions 24, 73, and 88); and nine polymorphic residues (positions 10, 20, 33, 36, 60, 63, 71, 77, and 93). The 22 new drug resistance positions include one substrate cleft residue (position 23), three flap residues (positions 43, 45, and 55), one terminal-domain residue (position 95), and 17 residues in the enzyme core. The substrate cleft residues at positions 48 and 50 are also in the protease flap tips.

Correlations between protease mutations.

To identify patterns of drug resistance mutations, we calculated the pairwise binary (phi) correlation coefficients among the 45 treatment-associated and 17 polymorphic protease residues. This analysis was performed separately for the 1,004 isolates from untreated persons and for the 1,240 isolates from treated persons to detect associations that were independent of the treatment status of the individuals from whom the sequenced isolates were obtained. Among the untreated isolates, 23 of the 2,080 possible pairwise correlations were statistically significant, including 19 positive (phi = 0.14 to 0.31) and 4 negative (phi = −0.14 to −0.21) correlations. Among the treated isolates, 115 of the possible 2,080 correlations were statistically significant, including 99 positive (phi = 0.13 to 0.63) and 16 negative (phi = −0.13 to −0.34) correlations.

Table 3 shows the most strongly correlated pairs of positions among the 115 statistically significant correlations in isolates from treated persons. The three most strongly correlated pairs of positions among the treated isolates were 54 and 82 (phi = 0.63), 32 and 47 (phi = 0.51), and 73 and 90 (phi = 0.47). Mutations at two pairs of primary resistance positions had significant positive correlations: positions 84 and 90 and positions 48 and 82. Mutations at positions 82 and 90, although both common, were not significantly correlated with each other. Position 30 was negatively correlated with each of the other primary resistance positions. The positions with the greatest number of positive correlations were positions 10 (16 correlations), 46 (13 correlations), 71 (12 correlations), 90 (10 correlations), 20 (10 correlations), 73 (10 correlations), 82 (9 correlations), 63 (7 correlations), 84 (6 correlations), and 54 (6 correlations).

TABLE 3.
Most strongly correlated pairs of positions among 115 statistically significant correlations in isolates from treated persons

Correlations usually involved the most common mutation at each of the two correlated positions (Table 3). For example, the strong positive correlation between positions 54 and 82 (phi = 0.63) is in large part due to the strong correlation between I54V, the most common substitution at position 54, and V82A, the most common substitution at position 82 (for I54V and V82A, phi = 0.55). Other combinations of substitutions for these two positions were less commonly observed: I54T and V82A (phi = 0.21) and I54V and V82T (phi = 0.15). In some cases, covariation was dominated very strongly by particular combinations of substitutions. For example, the positive correlation between positions 30 and 88 (phi = 0.40) was represented entirely by D30N and N88D (phi = 0.52) rather than by D30N and N88S (phi = −0.05), and the correlation between positions 48 and 54 (phi = 0.29) was represented largely by G48V and I54T (phi = 0.44) rather than G48V and I54V (phi = 0.19).

We can use our measurements of comutation frequencies to construct a graphical model that summarizes the relationships among positions in HIV-1 protease. In this model, we attempt to place positions with high degrees of comutation close together and positions with low or negative degrees of comutation far apart. These relationships are modeled as consistently as possible within the framework of a two-dimensional plot. One computational technique that generates such graphical models is called PCA. We performed PCA on the 45 positions that were associated with protease inhibitor treatment and used the matrix of correlation coefficients as a measure of similarity between positions. The results of our PCA are shown in Fig. 2. The figure shows that positions 30 and 88 cluster together and are separate from most other positions. It also shows a clustering of positions 54 and 82 and their separation from positions 73, 84, 90, and 93.

FIG. 2.
PCA of the 45 positions associated with protease inhibitor treatment. The graph is a two-dimensional projection of the distances among the 45 positions, where the similarity between any two positions is measured by their binary (phi) correlation coefficient ...

Correlated mutations and protease residue contacts.

Among the 115 correlated residue pairs, 59 (51%) contained residues that were within 8 Å of each other—many more than the 5.5 pairs predicted when 115 pairs were selected at random. Most of the 59 pairs were close in each of the three structures we examined (liganded, unliganded, and open flap), but four were close only in the open-flap structure from a molecular-dynamics simulation. For example, residues 54 and 82 were separated by 5.4 Å in the open-flap structure but by 8.4 and 8.6 Å in the liganded and unliganded structures, respectively. One of the residue pairs could be explained only by contact between residues on different chains of the protease dimer (residue pair 48 and 82).

Fifty-six (49%) of the 115 correlated pairs were separated by >8 Å. Our Markov chain analysis showed that of these 56 pairs of residues, 16 could be linked by one residue, 21 by two residues, 13 by three residues, and 1 by five residues. However, our permutation analysis, which was designed to determine whether such chains were statistically significant, showed that this amount of chained covariation would be expected by chance in a molecule with 56 correlated, neighboring residues having the size and topology of HIV-1 protease. Therefore, compared with randomly selected residue pairs, the covarying residues we observed were significantly more likely to be within 8 Å of one another but not significantly more likely to be linked by chained covariation.

Figure 3 shows the strongest positive correlations superimposed on the structure of the protease. Most of the strong correlations are in a plane that is adjacent to the substrate cleft and include residues 10, 24, 30, 46, 54, 82, 84, and 90.

FIG. 3.
The 50 most highly correlated residues in isolates from treated persons are shown superimposed on the locations of these residues within the folded enzyme. The blue lines represent positively correlated residues (n = 44; phi > 0.2); the ...

Clusters of correlated residues.

Pairs of correlated residues can be further grouped into clusters in which all possible pairs within the cluster are mutually correlated. Among the 23 highly correlated pairs found in isolates from untreated persons, there were two mutational clusters, one of three residues and one of four residues. Among the 115 correlated pairs found in isolates from treated persons, there were 30 mutational clusters, ranging in size from three to six residues (Table 4).

TABLE 4.
Clusters of correlated protease positions

Twenty of the 30 clusters in treated isolates contained one or more primary protease mutations, including L90M (12 clusters), V82ATF (6 clusters), and D30N (6 clusters). The substrate cleft mutation I84V was in four of the L90M clusters. The substrate cleft mutation G48V was in one of the V82ATF clusters. Flap tip positions were included in 4 of the 12 clusters containing L90M (position 46, 4 clusters), and each of the 6 clusters containing V82ATF (position 46, 4 clusters; position 53, 1 cluster; position 54, 4 clusters).

Six representative clusters from Table 4 are shown in Fig. 4. These six clusters occurred in 17% of isolates from all treated persons and 29% of isolates from persons receiving two or more protease inhibitors. Published in vitro susceptibility results for isolates containing each of these six patterns of mutations (and no additional known resistance mutations) reveal that each pattern is associated with reduced susceptibility to each of the protease inhibitors: amprenavir, 2- to 5-fold; indinavir, 10- to 15-fold; lopinavir, 2- to 20-fold; nelfinavir, 10- to 30-fold; saquinavir, 3- to 30-fold; and ritonavir, 3- to 100-fold (25).

FIG. 4.
Six representative clusters from Table 4. Each position in a cluster demonstrates statistically significant mutational covariation with each of the other positions within a cluster. (A) Positions 10, 63, 71, 90, and 93; (B) positions 10, ...


Mutation prevalence.

Of the 99 amino acids in HIV-1 protease, we found that 45 exhibit treatment-associated mutations, 17 have non-treatment-related polymorphisms, and 37 rarely if ever vary. Only subtype B isolates were included in this analysis because few sequences of non-B isolates from persons receiving antiretroviral therapy are available. However, although each subtype is characterized by different polymorphisms, the 37 invariant positions in subtype B isolates are also highly conserved in non-B protease isolates (<1% mutation frequency) (8).

The large number of isolates analyzed in this study and the fact that a large proportion were from patients who received multiple protease inhibitors allowed us to identify 22 new treatment-associated positions. Mutations at eight of these new positions were observed to develop in a previous longitudinal study of protease isolates from 178 treated persons, but the associations of these mutations with treatment in that study were in most cases not statistically significant (24). The newly identified mutations occur primarily in combination with previously reported drug resistance mutations, suggesting that they act as accessory mutations to increase the level of resistance to multiple protease inhibitors or to compensate for losses in fitness. Most of the new mutations involve the replacement of one hydrophobic residue with another, possibly resulting in the repacking of hydrophobic regions in the core domain of the monomer.

Of the newly identified sites of mutation, residue 23—located at the base of the P1 pocket, where it is flanked by V82 and I84—is the position most likely to have a direct impact on inhibitor binding. The mutation L23I likely tightens or reshapes the P1 pocket and may compensate for the increase in size of the pocket that occurs with either V82A or I84V. Alternatively, L23I may directly interfere with inhibitor binding, as it is near the active site (30). Site-directed mutagenesis experiments in which L23I is placed in a wild-type enzyme or in an enzyme containing other mutations (e.g., V82A or I84V) are needed to clarify the effect of this mutation on protease function and protease inhibitor resistance.

Our analysis of the association between mutation prevalence and drug therapy has several limitations. First, the lack of available data on the isolates used in this study made it impossible to demonstrate a direct association between mutations and reduced in vitro susceptibility. Second, we did not control for the duration of HIV-1 infection or protease inhibitor treatment. However, despite these limitations, our analyses do establish a conservative lower limit to the extent of HIV-1 protease mutability and generate hypotheses about specific mutations. These hypotheses can be confirmed by demonstrating the longitudinal development of the mutations with treatment or the effects of the mutations on in vitro drug susceptibility.

Of the 22 newly described treatment-associated positions, the 13 that are conserved in untreated persons are of more interest than the 9 polymorphic positions. Indeed, we cannot exclude the possibility that the increased prevalence of mutations at the nine polymorphic positions reflects the increased variability of virus populations in persons infected for a longer period of time, a population that is likely to include more treated than untreated persons.

A recent computational study evaluated the variability of protease residues in HIV-1, other primate lentiviruses, and feline immunodeficiency virus, as well as the theoretical free-energy contribution of each residue to the binding of HIV-1 substrates and inhibitors (29). Our analyses complement this effort by quantifying the variability of this enzyme in isolates that have evolved in the presence of one or more protease inhibitors. However, positions reported to be invariant may develop mutations within virus populations under different selection pressures. Mutations other than those described in this paper have been reported during in vitro passage experiments. Mutation at the invariant residue 91 (T91S) has been reported during in vitro passage with lopinavir (2). The substrate cleft mutations R8QK and A28S have been reported after passage with the experimental inhibitors A-77003 and TMC-126, respectively (11, 31).

Mutation covariation.

The presence of positions within a molecule that covary, or mutate in a correlated manner, suggests that mutations at one position may require a compensatory mutation at a second position for optimal function (7, 16; Lapedes et al., AMS/SIAM Conference on Statistics in Molecular Biology). Covariation analysis has been used to help predict unsolved protein structures and to better understand the functions of proteins with known structures. Previous analyses of covariation have used alignments of sequences in a protein family rather than an alignment of variants of a single protein. However, the high mutation rate and mutation tolerance of HIV-1 have made it possible for us and others (14) to identify statistically significant covariation within a single HIV-1 subtype.

One of the major challenges of covariation analysis is to differentiate covariation resulting from the functional dependency between two positions from the shared inheritance of both mutations from a founder virus. In this study, covariation almost certainly reflects functionality rather than evolutionary relatedness. The fact that mutational correlations were so much more common among treated isolates is consistent with the repeated selection of the correlated mutations in many different isolates during selective drug pressure rather than with the inheritance of the correlated mutations from a small number of ancestral isolates.

Although biochemical and biophysical experiments are required to demonstrate the mechanism for the correlation between pairs of residues, our analyses provide preliminary hypotheses that help prioritize which residues to study. For example, more than one-half of the 115 pairs of significantly correlated positions were within 8 Å of each other. This proportion exceeds that expected by chance, suggesting that in many cases covariation results from a direct interaction between the correlated mutations.

The correlations between amino acids that were not close to one another in the three-dimensional protease structure are more difficult to explain. For example, mutations at position 46 were highly correlated with mutations at many distant positions (Fig. 3 and 4). Although we found that many nonneighboring but highly correlated residues could be linked through a chain of covarying residues, our statistical analysis suggested that the compact shape of the protease and the large number of correlated residues could cause this to occur by chance. An alternative explanation for correlation between distant residues comes from the work of others who have shown that the wild-type protease enzyme may be partially down-regulated and that mutations at certain residues, such as M46I and L63P, increase catalytic activity and may be selected in enzymes with other mutations that decrease catalytic activity, regardless of the locations of these other mutations (9, 22).

The negative correlation between D30N and the other primary protease inhibitor resistance mutations may reflect the fact that D30N decreases protease fitness without contributing resistance to any protease inhibitor other than nelfinavir (6, 12, 18). Alternatively, enzymes containing D30N together with other primary mutations appear to have decreased activity (28).

The frequent occurrence of mutational clusters, as well as other common patterns of mutations, suggests that mutations can interact as part of higher-order networks. These mutation patterns are tangible evidence for the high genetic barrier to resistance to the protease inhibitors. However, these patterns are complex and frequently overlapping, suggesting that there are few, if any, absolute dependencies between drug resistance mutations. Determining the biochemical and biophysical properties of enzymes with these patterns of mutations will be important for designing new protease inhibitors that are less likely to trigger resistance or are effective against already drug-resistant isolates.


M.J.G. and R.W.S. were supported in part by NIH/NIAID (AI-46148-02). R.K. was supported in part by a Stanford University BioX Interdisciplinary Project Grant. R.W.S. and C.A.S. were supported in part by NIH/NIGMS P01 GM66524-01.


1. Benjamini, Y., and Y. Hochberg. 1995. Controlling the false discovery rate: a practical and powerful approach to multiple testing. J. R. Stat. Soc. Ser. B 57:289-300.
2. Carrillo, A., K. D. Stewart, H. L. Sham, D. W. Norbeck, W. E. Kohlbrenner, J. M. Leonard, D. J. Kempf, and A. Molla. 1998. In vitro selection and characterization of human immunodeficiency virus type 1 variants with increased resistance to ABT-378, a novel protease inhibitor. J. Virol. 72:7532-7541.
3. Chen, Z., Y. Li, E. Chen, D. L. Hall, P. L. Darke, C. Culberson, J. A. Shafer, and L. C. Kuo. 1994. Crystal structure at 1.9-Å resolution of human immunodeficiency virus (HIV) II protease complexed with L-735,524, an orally bioavailable inhibitor of the HIV proteases. J. Biol. Chem. 269:26344-26348.
4. Condra, J. H., D. J. Holder, W. A. Schleif, O. M. Blahy, R. M. Danovich, L. J. Gabryelski, D. J. Graham, D. Laird, J. C. Quintero, A. Rhodes, H. L. Robbins, E. Roth, M. Shivaprakash, T. Yang, J. A. Chodakewitz, P. J. Deutsch, R. Y. Leavitt, F. E. Massari, J. W. Mellors, K. E. Squires, R. T. Steigbigel, H. Teppler, and E. A. Emini. 1996. Genetic correlates of in vivo viral resistance to indinavir, a human immunodeficiency virus type 1 protease inhibitor. J. Virol. 70:8270-8276.
5. Condra, J. H., W. A. Schleif, O. M. Blahy, L. J. Gabryelski, D. J. Graham, J. C. Quintero, A. Rhodes, H. L. Robbins, E. Roth, and M. Shivaprakash. 1995. In vivo emergence of HIV-1 variants resistant to multiple protease inhibitors. Nature 374:569-571.
6. Devereux, H. L., V. C. Emery, M. A. Johnson, and C. Loveday. 2001. Replicative fitness in vivo of HIV-1 variants with multiple drug resistance-associated mutations. J. Med. Virol. 65:218-224.
7. Gobel, U., C. Sander, R. Schneider, and A. Valencia. 1994. Correlated mutations and residue contacts in proteins. Proteins 18:309-317.
8. Gonzales, M. J., R. N. Machekano, and R. W. Shafer. 2001. Human immunodeficiency virus type 1 reverse-transcriptase and protease subtypes: classification, amino acid mutation patterns, and prevalence in a northern California clinic-based population. J. Infect. Dis. 184:998-1006.
9. Gulnik, S. V., L. I. Suvorov, B. Liu, B. Yu, B. Anderson, H. Mitsuya, and J. W. Erickson. 1995. Kinetic characterization and cross-resistance patterns of HIV-1 protease mutants selected under drug pressure. Biochemistry 34:9282-9287.
10. Hertogs, K., S. Bloor, S. D. Kemp, C. Van den Eynde, T. M. Alcorn, R. Pauwels, M. Van Houtte, S. Staszewski, V. Miller, and B. A. Larder. 2000. Phenotypic and genotypic analysis of clinical HIV-1 isolates reveals extensive protease inhibitor cross-resistance: a survey of over 6000 samples. AIDS 14:1203-1210.
11. Ho, D. D., T. Toyoshima, H. Mo, D. J. Kempf, D. Norbeck, C. M. Chen, N. E. Wideburg, S. K. Burt, J. W. Erickson, and M. K. Singh. 1994. Characterization of human immunodeficiency virus type 1 variants with increased resistance to a C2-symmetric protease inhibitor. J. Virol. 68:2016-2020.
12. Kantor, R., W. J. Fessel, A. R. Zolopa, D. Israelski, N. Shulman, J. G. Montoya, M. Harbour, J. M. Schapiro, and R. W. Shafer. 2002. Evolution of primary protease inhibitor resistance mutations during protease inhibitor salvage therapy. Antimicrob. Agents Chemother. 46:1086-1092.
13. Kempf, D. J., J. D. Isaacson, M. S. King, S. C. Brun, Y. Xu, K. Real, B. M. Bernstein, A. J. Japour, E. Sun, and R. A. Rode. 2001. Identification of genotypic changes in human immunodeficiency virus protease that correlate with reduced susceptibility to the protease inhibitor lopinavir among viral isolates from protease inhibitor-experienced patients. J. Virol. 75:7462-7469.
14. Korber, B. T., R. M. Farber, D. H. Wolpert, and A. S. Lapedes. 1993. Covariation of mutations in the V3 loop of human immunodeficiency virus type 1 envelope protein: an information theoretic analysis. Proc. Natl. Acad. Sci. USA 90:7176-7180.
15. Kuiken, C. L., B. Foley, B. H. Hahn, P. Marx, F. E. McCutchan, J. Mellors, J. I. Mullins, S. Wolanski, and B. Korber. 1999. Human retroviruses and AIDS: a compilation and analysis of nucleic and amino acid sequences. Los Alamos National Laboratory, Los Alamos, N.Mex.
16. Larson, S. M., A. A. Di Nardo, and A. R. Davidson. 2000. Analysis of covariation in an SH3 domain sequence alignment: applications in tertiary contact prediction and the design of compensating hydrophobic core substitutions. J. Mol. Biol. 303:433-446.
17. Maguire, M., D. Shortino, A. Klein, W. Harris, V. Manohitharajah, M. Tisdale, R. Elston, J. Yeo, S. Randall, F. Xu, H. Parker, J. May, and W. Snowden. 2002. Emergence of resistance to protease inhibitor amprenavir in human immunodeficiency virus type 1-infected patients: selection of four alternative viral protease genotypes and influence of viral susceptibility to coadministered reverse transcriptase nucleoside inhibitors. Antimicrob. Agents Chemother. 46:731-738.
18. Martinez-Picado, J., A. V. Savara, L. Sutton, and R. T. D'Aquila. 1999. Replicative fitness of protease inhibitor-resistant mutants of human immunodeficiency virus type 1. J. Virol. 73:3744-3752.
19. Molla, A., M. Korneyeva, Q. Gao, S. Vasavanonda, P. J. Schipper, H. M. Mo, M. Markowitz, T. Chernyavskiy, P. Niu, N. Lyons, A. Hsu, G. R. Granneman, D. D. Ho, C. A. Boucher, J. M. Leonard, D. W. Norbeck, and D. J. Kempf. 1996. Ordered accumulation of mutations in HIV protease confers resistance to ritonavir. Nat. Med. 2:760-766.
20. Patick, A. K., M. Duran, Y. Cao, D. Shugarts, M. R. Keller, E. Mazabel, M. Knowles, S. Chapman, D. R. Kuritzkes, and M. Markowitz. 1998. Genotypic and phenotypic characterization of human immunodeficiency virus type 1 variants isolated from patients treated with the protease inhibitor nelfinavir. Antimicrob. Agents Chemother. 42:2637-2644.
21. Schapiro, J. M., M. A. Winters, F. Stewart, B. Efron, J. Norris, M. J. Kozal, and T. C. Merigan. 1996. The effect of high-dose saquinavir on viral load and CD4+ T-cell counts in HIV-infected patients. Ann. Intern. Med. 124:1039-1050.
22. Schock, H. B., V. M. Garsky, and L. C. Kuo. 1996. Mutational anatomy of an HIV-1 protease variant conferring cross-resistance to protease inhibitors in clinical trials. Compensatory modulations of binding and activity. J. Biol. Chem. 271:31957-31963.
23. Scott, W. R., and C. A. Schiffer. 2000. Curling of flap tips in HIV-1 protease as a mechanism for substrate entry and tolerance of drug resistance. Struct. Fold. Des. 8:1259-1265.
24. Shafer, R. W., P. Hsu, A. K. Patick, C. Craig, and V. Brendel. 1999. Identification of biased amino acid substitution patterns in human immunodeficiency virus type 1 isolates from patients treated with protease inhibitors. J. Virol. 73:6197-6202.
25. Shafer, R. W., D. Stevenson, and B. Chan. 1999. Human immunodeficiency virus reverse transcriptase and protease sequence database. Nucleic Acids Res. 27:348-352.
26. Shafer, R. W., M. A. Winters, S. Palmer, and T. C. Merigan. 1998. Multiple concurrent reverse transcriptase and protease mutations and multidrug resistance of HIV-1 isolates from heavily treated patients. Ann. Intern. Med. 128:906-911.
27. Spinelli, S., Q. Z. Liu, P. M. Alzari, P. H. Hirel, and R. J. Poljak. 1991. The three-dimensional structure of the aspartyl protease from the HIV-1 isolate BRU. Biochimie 73:1391-1396.
28. Sugiura, W., Z. Matsuda, Y. Yokomaku, K. Hertogs, B. Larder, T. Oishi, A. Okano, T. Shiino, M. Tatsumi, M. Matsuda, H. Abumi, N. Takata, S. Shirahata, K. Yamada, H. Yoshikura, and Y. Nagai. 2002. Interference between D30N and L90M in selection and development of protease inhibitor-resistant human immunodeficiency virus type 1. Antimicrob. Agents Chemother. 46:708-715.
29. Wang, W., and P. A. Kollman. 2001. Computational study of protein specificity: the molecular basis of HIV-1 protease drug resistance. Proc. Natl. Acad. Sci. USA 98:14937-14942.
30. Wlodawer, A., and J. W. Erickson. 1993. Structure-based inhibitors of HIV-1 protease. Annu. Rev. Biochem. 62:543-585.
31. Yoshimura, K., R. Kato, M. F. Kavlick, A. Nguyen, V. Maroun, K. Maeda, K. A. Hussain, A. K. Ghosh, S. V. Gulnik, J. W. Erickson, and H. Mitsuya. 2002. A potent human immunodeficiency virus type 1 protease inhibitor, UIC-94003 (TMC-126), and selection of a novel (A28S) mutation in the protease active site. J. Virol. 76:1349-1358.

Articles from Journal of Virology are provided here courtesy of American Society for Microbiology (ASM)