Differential response patterns to optimal antiviral therapy, peginterferon alpha plus ribavirin, are well documented in patients with chronic hepatitis C virus (HCV) infection. Among the many factors that may affect therapeutic efficiency, HCV quasispecies (QS) characteristics have been a major focus of previous studies, which have yielded conflicting results. To obtain a comprehensive understanding of the role of HCV QS in antiviral therapy, we performed the largest-ever HCV QS analysis in 153 patients infected with HCV genotype 1 strains. A total of 4,314 viral clones spanning hypervariable region 1 were produced from these patients during the first 12 weeks of therapy, followed by detailed genetic analyses. Our data showed an exponential distribution pattern of intra-patient QS diversity in this study population, in which most patients (63%) had small QS diversity with genetic distance (d) less than 0.2. The group of patients with genetic distance located in the decay region (d>0.53) had a significantly higher early virologic response (EVR) rate (89.5%), which contributed substantially to the overall association between EVR and increased baseline QS diversity. In addition, EVR was linked to a clustered evolutionary pattern in terms of QS dynamic changes.
EVR is associated with elevated HCV QS diversity and complexity, especially in patients with significantly higher HCV genetic heterogeneity.
Hepatitis C virus (HCV) infection is a major public health concern worldwide. Over 2.7 million Americans are chronically infected with HCV, which results in an estimated 10,000 deaths each year and is a leading indication for liver transplantation (1). Currently, optimal antiviral therapy of chronic hepatitis C with peginterferon alpha plus ribavirin cures up to 80% of patients infected with HCV genotypes 2 and 3. However, the same treatment regimen is effective in only about 50% of patients infected with HCV genotype 1 (2). It is thus important to be able to identify factor(s), either host or viral, which affect results of therapy as such information may be valuable in improving current antiviral strategy.
In this setting, HCV quasispecies (QS) characteristics have been a major focus of study in patients undergoing antiviral therapy. However, previous studies have generated conflicting data with regard to the role of HCV QS in determining therapeutic efficiency (see a recent review in ref. 3). Such results are to some extent not surprising, since the response to antiviral therapy is a complex phenotype that is affected by multiple factors from both the virus and the host. These factors certainly complicate the interpretation of data from HCV QS studies, especially when the study population is small. In addition, the techniques used to assess HCV QS diversity may be another source of discrepancy. The effect of mutations on the gel mobility of a given DNA molecule is sometimes unpredictable (4). Thus, data from gel-based assays are not always consistent with the results from cloning/sequencing, which is thought to be the gold standard technique for assessing viral diversity. In the current study, we performed a detailed QS analysis in 153 patients undergoing combination antiviral therapy (peginterferon alfa-2a plus ribavirin). Compared to many previous studies, the current project has several unique features: the largest study population, an exclusive focus on HCV genotype 1, and the application of large-scale cloning and sequencing techniques. These characteristics allow a thorough dissection of the potential effect of HCV QS during antiviral therapy.
This was an ancillary study of a large clinical trial that aimed to compare the therapeutic efficiency of peginterferon alpha-2a and alpha-2b in treatment-naïve patients with chronic HCV infection (5). Of 380 patients enrolled in the trial, 189 patients were treated with peginterferon alpha-2a and are the subjects of the present study. Patient recruitment was restricted to HCV genotype 1 (5). Serum samples were collected at multiple time points during the early phase of antiviral therapy, including baseline (w00), week 4 (w04), week 8 (w08) and week 12 (w12). De-identified specimens were shipped to Saint Louis University (SLU) and Johns Hopkins University (JHU) and stored at −80°C until use. For each patient, molecular cloning was planned for two serum samples: one at baseline and the other at the latest time point during the early phase of antiviral therapy (at or before week 12) with an HCV viral load of more than 1000 copies per milliliter, approximately equal to 1111 IU/ml when the HCV RNA level is quantified with the Roche Amplicor HCV Monitor, v2.0 (lower limit of quantification, 600 IU/mL).
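The sample-selection rule can be sketched in a few lines. This is only an illustration, not the study's actual tooling: the function name `select_samples`, the dictionary layout, and the decision to treat week 12 itself as eligible (following the "≤ week 12" wording used in the Results) are assumptions.

```python
# Hypothetical sketch of the sample-selection rule described in the text.
# Time-point labels follow the paper; everything else is illustrative.
TIME_POINTS = ["w00", "w04", "w08", "w12"]
MIN_COPIES_PER_ML = 1000  # threshold stated in the text


def select_samples(viral_load):
    """Return the time points to clone for one patient.

    viral_load maps a time-point label to HCV RNA in copies/mL.
    The baseline is always taken when present; the second sample is the
    latest on-treatment time point whose load exceeds the threshold.
    """
    selected = []
    if "w00" in viral_load:
        selected.append("w00")
    # Walk from the latest on-treatment time point back to the earliest.
    for tp in reversed(TIME_POINTS[1:]):
        if viral_load.get(tp, 0) > MIN_COPIES_PER_ML:
            selected.append(tp)
            break
    return selected
```

A patient whose load falls below the threshold at every on-treatment time point contributes only a baseline sample under this rule.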
A 442-bp fragment covering HCV HVR1 was amplified by reverse transcription-PCR (RT-PCR), followed by gel purification and TA cloning. About 15 independent clones for each sample were sequenced. Detailed experimental procedures are provided in the Supporting Information.
Raw sequences were edited with the programs ClustalW (6) and BioEdit (7) in which HCV H77 strain (AF009606) served as the reference sequence. After the removal of primer sequences, the target domain for genetic analyses is 399 bp in length. Nucleotide positions containing insertions or deletions within this domain were removed for the present analysis and will be analyzed separately for their potential influence on antiviral therapy. HCV QS nature was characterized by measuring both genetic complexity and genetic diversity. The definitions and measurement of these genetic parameters were outlined in the Supporting Information.
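As a rough illustration of what a genetic diversity value summarizes, the sketch below computes the mean pairwise distance among the aligned clones of one sample. It assumes the simplest possible measure, the uncorrected p-distance (proportion of differing sites); the study's own definitions are in the Supporting Information, which is not reproduced here, and may use a substitution model instead. The function names are hypothetical, and dS and dN, which require codon-aware methods, are not covered.

```python
from itertools import combinations


def p_distance(seq_a, seq_b):
    """Proportion of aligned sites at which two sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    diffs = sum(1 for a, b in zip(seq_a, seq_b) if a != b)
    return diffs / len(seq_a)


def mean_pairwise_distance(clones):
    """Average p-distance over all unordered pairs of clones."""
    pairs = list(combinations(clones, 2))
    if not pairs:
        return 0.0  # a single clone carries no diversity information
    return sum(p_distance(a, b) for a, b in pairs) / len(pairs)
```

With about 15 clones per sample, each value of this kind averages over roughly 105 pairwise comparisons.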
Phylogenetic analysis was used to verify HCV genotypes and/or subtypes and to detect potential sequence clustering corresponding to response patterns. We constructed two phylogenetic trees, one with all 4,314 clones (big tree) and the other with only 153 clones, each representing the dominant HCV QS variant at baseline from one patient (small tree). Both trees were computed using the program MEGA 4 with the neighbor-joining approach and the maximum composite likelihood model. We also assumed rate variation among sites with an empirical value of the gamma parameter (α = 0.5). Forty-five reference HCV sequences, representing different HCV genotypes and subtypes, were included.
Values of genetic parameters from comparative analyses were tested for statistical significance using the two-tailed Student's t test. Categorical data from cross analyses were tested for statistical significance using the χ2 test with Yates' correction or Fisher's exact test. With regard to HCV QS diversity at baseline, we explored its potential distribution pattern in this study population (n=153). In doing so, a one-sample Kolmogorov-Smirnov test was first used to test common distribution patterns. Next, a newly developed procedure was applied to see if the data fit a power-law or power-law-like distribution, such as an exponential distribution, in which the given quantities are tightly clustered around their average values with much reduced probability far from the mean in a one-way direction (8). In this setting, the term "decay region" was used to denote the low-boundary phase of the distributions, and the value at the proposed low-bound point (χmin) on the curve was calculated (8). All statistical analyses were done with SPSS (version 13.0) except for the HCV QS distribution analysis, which was run in MATLAB (http://www.mathworks.com).
A total of 3,909 distinct HCV QS sequences generated in this study have been deposited in GenBank under accession numbers FJ688411 through FJ692319. The entire data set, including all 4,314 sequences, is available upon request.
We had a 100% success rate for the amplification of the 442-bp target using the protocol described above. This high success rate mainly resulted from our efforts to optimize the PCR primers, including their positions and composition. In some samples with low viral loads near 1000 copies per milliliter (n=15, 5.7%), we found that it was necessary to increase the input amount of serum RNA for reverse transcription (RT). This was achieved by using 280 μl of serum, instead of the regular 140 μl, for RNA extraction, followed by elution into the same volume of Tris buffer (60 μl). Most experiments resulted in a single visible band of the expected size on agarose gel. In the molecular cloning experiments, the positive rate of recombinant clones was approximately 95%.
For each patient, HCV QS profiles were generated at two time points, the baseline and the latest time point during the early phase of antiviral therapy (≤ week 12) with an HCV viral load of more than 1000 copies per milliliter. Based on this standard, we identified 110 patients with two time points and 43 patients with only one time point (baseline), which resulted in 263 serum samples to be studied. A total of 4,314 clones were generated and sequenced from these samples, an average of 16.4 clones per sample. Among the 153 patients, 104 (68%) achieved early virological response (EVR), defined as a decrease of more than 2 log in HCV RNA level at week 12 compared to the baseline (9). Potential differences between the EVR and non-EVR groups were examined in terms of viral factors, including baseline viral load, HCV subtypes, QS diversity, complexity and dynamics.
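The EVR definition reduces to a comparison on the log10 scale. A minimal sketch follows, assuming both measurements are positive and expressed in the same units; the function names are illustrative, and the handling of undetectable week-12 RNA is deliberately left out because the text does not specify it.

```python
import math


def log10_drop(baseline_rna, week12_rna):
    """Decrease in HCV RNA between baseline and week 12, in log10 units."""
    if baseline_rna <= 0 or week12_rna <= 0:
        # Undetectable values need a convention the text does not give.
        raise ValueError("RNA levels must be positive")
    return math.log10(baseline_rna) - math.log10(week12_rna)


def is_evr(baseline_rna, week12_rna):
    """EVR as defined in the text: more than a 2 log decrease at week 12."""
    return log10_drop(baseline_rna, week12_rna) > 2.0


# Sample bookkeeping from the text: 110 patients with two samples, 43 with one.
total_samples = 110 * 2 + 43  # 263
```

For example, a fall from 1,000,000 to 5,000 copies per milliliter is a drop of about 2.3 log and counts as EVR, whereas a fall to 20,000 (about 1.7 log) does not.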
The phylogenetic tree constructed with all 4,314 clones (big tree) displayed clustering by individual patient, indicating the lack of HCV co-infection with different subtypes or strains. This observation also argues against contamination during the experimental procedures. We verified the HCV genotype/subtype for all patients through phylogenetic analysis of the 153 viral sequences representing the dominant QS variant from each patient. Thus, 115 patients were infected with HCV genotype 1a and 38 with HCV genotype 1b (Figure 1). HCV genotype 1a isolates further formed three clusters, named subgroups 1, 2, and 3 (Figure 1). Among 109 patients with HCV subtypes based on the Inno-LiPA HCV II Line Probe assay, 17 appeared to have been mistyped based on our phylogenetic analysis (Figure 1). Thus, the Inno-LiPA HCV II Line Probe assay seems to have an HCV subtyping error rate of 15.6%, consistent with a previous report (10). More interestingly, the Inno-LiPA HCV II Line Probe assay mistyped 16 of 17 HCV 1a patients as HCV genotype 1b, suggesting the existence of an intrinsic bias toward HCV genotype 1b.
There was no significant difference between HCV subtypes with regard to early response patterns, EVR vs. non-EVR, 1a, 76.9% vs. 23.1% and 1b, 71.4% vs. 28.6% (Figure 1). In the phylogenetic tree, HCV genotype 1a strains were further clustered into three subgroups, supported by bootstrap test (Figure 1). Again, no statistical significance was detected with respect to the relationship between HCV 1a subgroups and early response patterns in terms of current treatment regimen (Figure 1).
We also investigated the potential relationship and interactions between pretreatment HCV RNA levels and early response patterns, HCV genotypes and subgroups. As shown in Table 1, pretreatment HCV viral load was not associated with early virological response patterns (p=0.137), HCV genotypes (p=0.489) or HCV 1a subgroups (p=0.171).
We separated our amplified region into two domains, HVR1 (81 bp) and non-HVR1 (318 bp), to avoid possible masking of statistical significance due to apparently unequal nucleotide substitution rates between these two domains. Next, we performed the analyses at two levels, pretreatment genetic diversity and its early dynamic changes during antiviral therapy.
With regard to pretreatment genetic diversity, subjects with EVR had overall higher values than the non-EVR group in most of the parameters measured. For HVR1, the values were d 0.253 vs. 0.1723, dS 0.0849 vs. 0.0810 and dN 0.1451 vs. 0.1033; for non-HVR1, d 0.0333 vs. 0.0265, dS 0.0528 vs. 0.0528 and dN 0.0064 vs. 0.0051. However, only HVR1 dN reached statistical significance (p=0.039) (Figure 2).
The criteria for sample selection identified 110 patients with serum samples to be cloned and sequenced at two time points, one at the baseline and the other at the latest time point during the early phase of antiviral therapy (≤ week 12) with a minimum HCV viral load more than 1000 copies per milliliter. Again, these patients were separated into two groups, EVR (n = 66) and Non-EVR (n = 44). We conducted two kinds of group-based analyses, the average change and net change of genetic diversity between two time points. Average change simply compares population-based genetic diversity without the consideration of sequence differences between two populations, which is reflected by the net change. In other words, the net change estimates the extent of “clustered evolution” in terms of its phylogenetic representation (11).
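The distinction between the two measures can be made concrete with a small sketch. It assumes uncorrected p-distances and takes the "net" quantity to be the net between-group mean distance as defined in MEGA, that is, the between-group mean minus the average of the two within-group means; whether the study used exactly this formula, and with which substitution model, is not stated in this section. The sign convention for the average change (baseline minus later, so that a positive value means a decrease) and all function names are likewise assumptions.

```python
from itertools import combinations, product


def p_distance(a, b):
    """Proportion of aligned sites at which two sequences differ."""
    return sum(1 for x, y in zip(a, b) if x != y) / len(a)


def mean_within(clones):
    """Mean pairwise distance inside one viral population."""
    pairs = list(combinations(clones, 2))
    return sum(p_distance(a, b) for a, b in pairs) / len(pairs) if pairs else 0.0


def mean_between(clones_a, clones_b):
    """Mean distance over all pairs drawn one from each population."""
    pairs = list(product(clones_a, clones_b))
    return sum(p_distance(a, b) for a, b in pairs) / len(pairs)


def average_change(baseline, later):
    """Change in within-population diversity; positive means a decrease."""
    return mean_within(baseline) - mean_within(later)


def net_change(baseline, later):
    """Net divergence between the two time points (MEGA-style definition)."""
    within = (mean_within(baseline) + mean_within(later)) / 2.0
    return mean_between(baseline, later) - within
```

Two populations can have identical within-sample diversity (average change of zero) and still be far apart in sequence space, which is what the net measure is meant to capture.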
The average change in genetic diversity was calculated for both the HVR1 (81 bp) and the non-HVR1 region (318 bp). The net change was determined for HVR1 only. Both the EVR and non-EVR groups displayed a trend toward decreasing genetic diversity over time, shown as positive values for all genetic parameters with one exception: the average change of non-HVR1 dN indicated an increase in the EVR group (Figure 3). Because the absolute values of the non-HVR1 dN change are very small in both the EVR and non-EVR groups (0.0013 vs. 0.0018), such an increase may not be biologically significant. The EVR group had a more prominent decrease of genetic diversity over time than the non-EVR group (Figure 3). However, the difference between the two groups was not statistically significant (Figure 3).
The net change of genetic diversity showed a pattern similar to that of the average change. The EVR group was associated with a more apparent decrease of net genetic diversity. Again, the difference between the EVR and non-EVR groups was not statistically significant (Figure 3).
Genetic complexity was estimated by measuring average Shannon entropy in HVR1 domain. Like genetic diversity, EVR group had a higher pretreatment genetic complexity than Non-EVR group at either nucleotide or amino acid level (Figure 4). The latter reached statistical significance (p = 0.0499). The average change of genetic complexity was also higher in EVR group than that in Non-EVR group although this difference was not statistically supported (Figure 4).
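One common way to compute such a value is to take the Shannon entropy of each alignment column and average over all columns. The sketch below does this with natural logarithms and no normalization; both choices, along with the function names, are assumptions, since the exact definition used in the study is given in the Supporting Information rather than here. The same function applies to nucleotide or amino acid alignments.

```python
import math
from collections import Counter


def site_entropy(column):
    """Shannon entropy (in nats) of the residues observed at one site."""
    n = len(column)
    counts = Counter(column)
    return -sum((c / n) * math.log(c / n) for c in counts.values())


def average_shannon_entropy(clones):
    """Mean per-site entropy across an alignment of equal-length clones."""
    if not clones:
        raise ValueError("at least one clone is required")
    length = len(clones[0])
    if any(len(seq) != length for seq in clones):
        raise ValueError("clones must be aligned to the same length")
    # zip(*clones) yields one tuple of residues per alignment column.
    return sum(site_entropy(col) for col in zip(*clones)) / length
```

A sample of identical clones scores zero, and the value rises as more sites become polymorphic or as variants at a site become more evenly represented.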
The current project, the largest QS study to date to focus on HCV genotype 1, investigated possible distribution patterns of QS diversity. Using HVR1 genetic distance (d), we first plotted its histogram in this study population. The distribution patterns were subsequently estimated by the one-sample Kolmogorov-Smirnov test. The data supported an exponential distribution (p=0.132) and excluded normal (p<0.001), uniform (p<0.001) and Poisson (p<0.001) distributions (Figure 5). We further calculated the lower bound (χmin) using described formulas under the hypothesis of either a continuous power-law or an exponential distribution (8). The power-law distribution was not favored because of its short tail: only 15 patients were located in the power-law region (χmin = 0.58), which is too few to be meaningful. The χmin for the exponential distribution was 0.53, placing the genetic distances of 19 patients in the decay region (Figure 5).
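For readers who want to see what a one-sample Kolmogorov-Smirnov test against an exponential distribution involves, a minimal pure-Python sketch follows. It assumes the exponential scale is estimated from the sample mean and uses the asymptotic Kolmogorov series for the p-value; this is not necessarily what SPSS does internally, and p-values obtained with an estimated parameter tend to be conservative. Function names are illustrative.

```python
import math


def ks_exponential(values):
    """KS statistic D of the sample against an exponential distribution
    whose mean is estimated from the sample itself."""
    xs = sorted(values)
    n = len(xs)
    mean = sum(xs) / n
    d = 0.0
    for i, x in enumerate(xs, start=1):
        cdf = 1.0 - math.exp(-x / mean)
        # Compare the model CDF with the empirical step function on both sides.
        d = max(d, i / n - cdf, cdf - (i - 1) / n)
    return d


def kolmogorov_pvalue(d, n, terms=100):
    """Asymptotic two-sided p-value for a KS statistic d with n observations."""
    lam = math.sqrt(n) * d
    series = sum((-1) ** (k - 1) * math.exp(-2.0 * k * k * lam * lam)
                 for k in range(1, terms + 1))
    return min(1.0, max(0.0, 2.0 * series))
```

A large p-value, as reported for the exponential fit, means only that the test did not reject that distribution; it does not by itself establish it.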
We have evaluated HCV QS heterogeneity at baseline and its dynamic changes during the early therapeutic period. In this type of study, sampling bias is often a concern due to the lack of normalization of the input HCV RNA amount used for RT-PCR (12). Thus, viral QS heterogeneity may be to some extent dependent on viral titers. However, under our experimental procedure and study protocol, we consider the potential for sampling bias to be minimal, and its effect on our observations and conclusions unimportant. First, we previously failed to detect a statistical relationship between QS diversity and viral titers (13). Second, for a given viral region used to measure QS diversity, such as HVR1 in this study, QS diversity is maintained by sequencing an adequate number of clones, usually >10, and by the use of fixed PCR primers (14, 15). Finally, we have focused on comparative HCV QS analyses between the early virologic response (EVR) and non-EVR groups. There is no statistical difference between the two groups in average HCV viral load, which may further reduce any sampling bias that exists. An additional limitation, inherent in the PEAK study, which followed patients only through the first 12 weeks of therapy, is the inability to correlate HCV QS heterogeneity with sustained virologic response (SVR). Rather, we have chosen to correlate it with EVR. Only 11 patients (7.2%) had a rapid virologic response (RVR), defined as undetectable viral RNA at week 4, making this group difficult to analyze. Nonetheless, we did not find any viral factors analyzed in this study that were specifically associated with RVR (data not shown). Thus, we focused on EVR rather than RVR. EVR is also a response pattern that appears to reflect the intrinsic sensitivity of HCV to combination therapy. It has been reported that EVR is a very good predictor of SVR (16).
Data interpretation from our study may have general applicability in terms of HCV antiviral therapy.
HCV genotype is a well-documented independent factor affecting the efficiency of antiviral therapy (2). The large size of this study population allows us to examine whether such an observation can be extended to the HCV subtype level. Statistically, we have demonstrated that pretreatment HCV viral loads, HCV subtypes and HCV subgroups within HCV genotype 1a are not determinants of early response patterns. Previous studies suggest that a low baseline HCV viral load is an independent predictor of SVR (reviewed in ref. 17). In our study, the EVR group even had a higher average viral load, although the differences between groups were not statistically significant (Table 1). The discordance may be attributed to multiple factors, such as different therapeutic regimens, patient selection and different stages of disease progression (18, 19). Alternatively, it may simply suggest that pretreatment HCV viral load is not an independent predictor of therapeutic efficiency.
An important finding came from our QS analysis. Because of known differences in mutation frequency between the HVR1 and non-HVR1 domains, our analysis focused on HVR1. EVR was associated with increased baseline QS diversity, shown as higher values of genetic diversity and complexity. In particular, HVR1 dN reached statistical significance (p=0.039). The dN reflects the strength of evolutionary selection, which is frequently interpreted as immune pressure (20). Since HVR1 contains putative B and T cell epitopes (21, 22), our observation suggests a role for pretreatment immune status in determining early virological response patterns. Compared with previous studies, our conclusion rests on greater statistical power given the large number of patients studied. Additionally, two aspects of HCV QS dynamics have been assessed: genetic diversity/complexity (average change) and sequence diversification (net change). In the former, both the EVR and non-EVR groups display a general trend of reduced genetic diversity and complexity after the start of antiviral therapy. Such a trend seems more apparent in the EVR group. Given the higher pretreatment genetic diversity and complexity in the EVR group, this trend may merely reflect the elimination of drug-sensitive HCV quasispecies variants. However, this explanation cannot be applied to the sequence diversification (net change), in which a similar trend was observed. Thus, consistent with previous reports, HCV QS diversification is more likely associated with early response patterns (11).
Taking advantage of this largest-ever HCV QS study, we explored potential distribution patterns of intra-patient HCV QS diversity at the population level. In contrast to viral load, which displayed a typical normal (Gaussian) distribution among the 153 patients we studied (data not shown), QS diversity showed a best fit to an exponential distribution (Figure 5) (23). This finding has several important implications. First, the exponential distribution of QS diversity may explain long-standing controversies with regard to the role of QS diversity in HCV antiviral therapy (11, 24–40). While the overall EVR rate in this study population is about 68%, the group of patients with HVR1 genetic distance beyond the low bound (χmin=0.53) has a significantly higher EVR rate (89.5%, p=0.032) (Figure 5). In fact, when the patient group with large QS diversity (d>0.53) was excluded, the association between dN values and EVR lost statistical significance (EVR, 0.106 versus non-EVR, 0.092, p=0.194). Thus, assuming a potential role of QS diversity in antiviral therapy, we can conclude that such a role may become dominant only in patients with large QS diversity (d>0.53). This conclusion also held in a similar analysis using dN values (data not shown). Since the QS diversity in most patients, as measured by genetic distance (d), is less than 0.53 (Figure 5), intrinsic uncertainty accompanies any HCV genetic study that explores the role of QS diversity in antiviral therapy, especially when the study population is small. For example, in our previous study, only one of the 29 patients studied was confirmed to have a genetic distance larger than 0.53 (36).
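The comparison of EVR rates can be illustrated with a 2x2 chi-square test written with the standard library only. The counts below are inferred rather than taken from the paper: 17 of 19 responders in the high-diversity group follows from the reported 89.5% of 19 patients, and the remaining cells follow from 104 EVR patients among 153. The function name is hypothetical, and the text does not say which test produced the reported p-value.

```python
import math


def chi_square_2x2(a, b, c, d, yates=False):
    """Pearson chi-square for the table [[a, b], [c, d]] with 1 degree of
    freedom. Returns (statistic, p_value)."""
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    observed = [a, b, c, d]
    expected = [row1 * col1 / n, row1 * col2 / n,
                row2 * col1 / n, row2 * col2 / n]
    chi2 = 0.0
    for o, e in zip(observed, expected):
        diff = abs(o - e)
        if yates:
            diff = max(0.0, diff - 0.5)  # continuity correction
        chi2 += diff * diff / e
    # For 1 degree of freedom the survival function is erfc(sqrt(x / 2)).
    return chi2, math.erfc(math.sqrt(chi2 / 2.0))


# Inferred counts (assumptions, see the paragraph above).
high_evr, high_non = 17, 2                         # d > 0.53
rest_evr, rest_non = 104 - 17, (153 - 104) - 2     # d <= 0.53
chi2, p = chi_square_2x2(high_evr, high_non, rest_evr, rest_non)
chi2_yates, p_yates = chi_square_2x2(high_evr, high_non, rest_evr, rest_non,
                                     yates=True)
```

On these inferred counts the uncorrected statistic is about 4.6, giving a p-value near the reported 0.032, while the continuity-corrected version is less significant; with only two non-responders in the high-diversity group, the result is sensitive to the choice of test.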
Second, since the introduction of QS theory into virology, most viral genetic studies simply use this term to describe viral genome heterogeneity. Its original definition is largely ignored (41, 42). According to this theory, all viral variants in infected individuals form a network and act as a unit in response to internal or external interference, such as antiviral therapy. In this study, we have found that the EVR is associated with high QS diversity. Such an observation is not easily understood with classical population biology because high QS diversity indicates an increased possibility to contain drug-resistant viral variants. In this setting, QS theory may provide a plausible explanation by treating the entire viral population as an acting unit presumably in a status of the self-organized criticality (SOC), which is a prevailing hypothesis to explain many complex phenomena in nature, including exponential distribution (43). Consequently, high QS diversity may imply a critical status in which HCV reaches its maximum capability to maintain the QS network. Such a status, as assumed by SOC hypothesis, would be extremely sensitive to any external stimulus, such as antiviral therapy, resulting in the collapse of the entire QS network (virus extinction). Thus our data provide indirect evidence for the support of QS theory in virology and the SOC hypothesis should be an important addition to this theory.
Third, the formation of QS diversity results from the complicated interaction between virus and host. The exponential distribution of QS diversity suggests the lack of an ideal genetic distance that may favor HCV infection. In other words, the QS nature is not necessarily the only prerequisite for HCV to establish or maintain its persistent infection. Indeed, most RNA viruses, such as dengue and West Nile virus, share the QS nature (44, 45). However, only a few of them result in persistent infection in humans.
Finally, our data showed large variability of HCV QS diversity among infected patients. Given the considerable association between EVR and high HCV QS diversity, the modulation of HCV QS diversity may represent a novel strategy for antiviral therapy. The underlying mechanism responsible for the formation of high QS diversity in HCV patients therefore warrants further investigation.
This work was supported by Roche, USA and NIH grants R01 DK80711 (to XF) and R21 AI076834 (to AMD).
We thank Aaron Clauset (Santa Fe Institute) for his assistance on statistical analysis.