J Virol. 2011 April; 85(7): 3649–3663.
Published online 2011 January 19. doi:  10.1128/JVI.02197-10
PMCID: PMC3067842

Coevolution of the Hepatitis C Virus Polyprotein Sites in Patients on Combined Pegylated Interferon and Ribavirin Therapy


Genotype-specific sensitivity of the hepatitis C virus (HCV) to interferon-ribavirin (IFN-RBV) combination therapy and reduced HCV response to IFN-RBV as infection progresses from acute to chronic infection suggest that HCV genetic factors and intrahost HCV evolution play important roles in therapy outcomes. HCV polyprotein sequences (n = 40) from 10 patients with unsustainable response (UR) (breakthrough and relapse) and 10 patients with no response (NR) following therapy were identified through the Virahep-C study. Bayesian networks (BNs) were constructed to relate interrelationships among HCV polymorphic sites to UR/NR outcomes. All models showed an extensive interdependence of HCV sites and strong connections (P ≤ 0.003) to therapy response. Although all HCV proteins contributed to the networks, the topological properties of sites differed among proteins. E2 and NS5A together contributed ~40% of all sites and ~62% of all links to the polyprotein BN. The NS5A BN and E2 BN predicted UR/NR outcomes with 85% and 97.5% accuracy, respectively, in 10-fold cross-validation experiments. The NS5A model constructed using physicochemical properties of only five sites was shown to predict the UR/NR outcomes with 83.3% accuracy for 6 UR and 12 NR cases of the HALT-C study. Thus, HCV adaptation to IFN-RBV is a complex trait encoded in the interrelationships among many sites along the entire HCV polyprotein. E2 and NS5A generate broad epistatic connectivity across the HCV polyprotein and essentially shape intrahost HCV evolution toward the IFN-RBV resistance. Both proteins can be used to accurately predict the outcomes of IFN-RBV therapy.

Hepatitis C virus (HCV) is the major etiologic agent of blood-borne non-A, non-B hepatitis (25). Chronic HCV infection is an established risk factor for the development of liver diseases, such as fibrosis, cirrhosis, and hepatocellular carcinoma (33, 124, 125). Approximately 70% to 80% of HCV-infected patients fail to clear the virus and progress to chronicity (89a). At present, there are no preventive vaccines against HCV. The current, accepted therapeutic approach to treating chronic hepatitis C infection involves a 24- or 48-week course of pegylated alpha interferon (IFN-α) combined with ribavirin (RBV) (i.e. IFN-RBV therapy) (48, 52). Because only 50% to 70% of chronically infected patients develop a sustained virologic response (SVR) to this treatment (48, 52, 55, 80) and because patient intolerance to such therapy is common (61, 68, 120), the development and application of other therapeutic approaches using antiviral compounds that act against HCV more efficaciously and yet generate lower rates of adverse effects are major clinical management and public health objectives. Therapeutic failure presents in two forms: (i) complete resistance to treatment (no response [NR]) and (ii) unsustainable response (UR), which is characterized by an increase in HCV load observed during therapy after an initial period of decline in viral load (breakthrough) or observed after cessation of therapy (relapse) (52).

Several factors are known to affect therapy outcome in HCV-infected patients, most notably the infecting HCV genotype. There are six major HCV genotypes, 1 to 6 (108, 109). Patients infected with genotype 2 are the most responsive, with SVR being achievable in 70% to 80% of cases (52, 80). In contrast, only 50% to 60% of genotype 1-infected patients achieve SVR (48, 55, 80, 90). Genotype 1 is the most prevalent genotype worldwide (78). The dependence of IFN-RBV response rates on HCV genotype (48, 52, 55, 80) implies that the composition of the HCV genome plays a role in influencing therapy outcome.

The mechanism of IFN action against HCV is not fully known. It was shown that treatment with IFN activates the host's innate antiviral immune responses by inducing IFN-stimulated genes (47, 59, 64, 84). Several HCV genomic regions have been found to be associated with resistance to IFN treatment (74). Since responses to IFN differ among HCV strains, associations between IFN therapy outcome and HCV genomic variability in regions such as hypervariable region 1 (HVR1) of E2 (87, 118) and the V3 domain of NS5A (34, 79) have been frequently investigated. A correlation was reported between NR and the high complexity of HVR1 variants before treatment (87, 118), but it was not confirmed in a subsequent study (75). A high level of V3 heterogeneity was associated with IFN sensitivity (34, 99, 119). Specific mutations in the core protein have also been suggested to determine the early response to IFN-RBV therapy (36).

Both E2 and NS5A proteins have been implicated in binding to the IFN-inducible, double-stranded, RNA-activated protein kinase R (PKR), which is involved in the IFN-induced antiviral response (49). A 12-amino-acid (aa) region located between positions 659 and 670 in E2 known as the PKR-α subunit of eukaryotic initiation factor 2 (PKR-eIF2α) phosphorylation homology domain (PePHD) was shown to bind PKR in vitro (113). The PePHD sequence has similarity to the autophosphorylation sites of PKR and the phosphorylation site in eIF2α. This similarity is greater for HCV genotype 1 than genotype 2 or 3. However, the association between PePHD sequence and therapy outcomes has not been consistently shown (1, 98). A PKR-binding domain is located in the C-terminal region of NS5A (49). A variable 40-aa region of this domain, termed the interferon sensitivity determining region (ISDR), was reported to play a key role in the IFN therapy response (37, 38). Analysis of HCV 1b sequences showed an association between the number of ISDR mutations and the response to the IFN therapy (92). However, studies of HCV genotype 2b and 3a did not find such a relation between SVR and NS5A variability (8, 89). Additionally, no binding between PKR and the genotype 3a NS5A from the IFN-resistant HCV strains was observed in vitro (20).

RBV, a guanosine nucleotide analog, is inefficacious against HCV when used alone but when combined with IFN therapy dramatically improves viral clearance and decreases relapse rates (42). The mechanism by which RBV improves treatment responses is not well understood. Several mechanisms of its therapeutic action have been proposed, including inosine monophosphate dehydrogenase inhibition (133), viral inhibition (77), facilitation of Th1 immunoresponses (111), mutagenesis (27, 76), inhibition of 5′ cap formation on mRNAs (53), and upregulation of genes involved in IFN signaling (44, 132). However, none of these mechanisms has been convincingly shown to be responsible for its efficacy when combined with IFN (42). Nonetheless, RBV was recently shown to improve early responses to IFN (43), thus supporting its role in enhancing IFN signaling (44, 132) and emphasizing the leading role of IFN in combination therapy.

Host factors have also been found to affect both the natural course of HCV infection and the outcome of treatment (116). For example, common-source HCV infections frequently lead to differential outcomes among incident cases, with some patients resolving the infection and some developing chronic hepatitis C (122, 123), and patients chronically infected with the same genotype can respond differently to IFN-RBV treatment despite carrying similar HCV viral loads (55, 103). In addition to genotype, demographic factors such as ethnicity and gender have been associated with therapy outcomes (48, 80, 103). Several studies have reported a role for host genetic polymorphisms, e.g., in the IL28B locus, in defining the rate of spontaneous clearance (114) and IFN-RBV SVR (50, 110).

Many host selection pressures, including innate and adaptive immune responses, shape HCV evolution, and their effects should be reflected in HCV genetic composition and epistatic connectivity among genomic sites. Indeed, polymorphic sites within the HCV genome have been shown to be organized as a network of coordinated substitutions (17), with the topology of the network being different for HCV strains that are resistant or sensitive to treatment (7). Although these networks indicate a strong association of many HCV sites with outcomes of therapy, they do not provide quantitative measures for viral genomic parameters related to IFN treatment.

In this paper, we report modeling of quantitative associations between a global epistatic connectivity among the HCV polymorphic amino acid sites and UR/NR outcomes of the IFN-RBV therapy. While NR represents complete resistance to IFN-RBV, UR reflects incomplete suppression of HCV or the intrahost HCV evolution toward IFN-RBV resistance (93). Both UR and NR are associated with HCV persistence despite treatment (52). With HCV available for analysis at the start and end of therapy, these outcomes provide an important setting for analyzing genetic changes in the HCV genome associated with resistance.


Sequence data.

Analyses were conducted using the HCV 1a full-length polyprotein consensus sequences from 20 patients (10 UR and 10 NR cases) identified through the Virahep-C study (18, 26). Sequences in the Virahep-C study were sampled from patients before (n = 20) and at the end of treatment (n = 20) with pegylated IFN-α2a and RBV. Analyses included all sites from the entire HCV polyprotein except for the most C-terminal 56 aa from the NS5B protein. This sequence data set served as a training set for developing models for prediction of therapy outcomes. For some analyses, a total of 298 HCV 1a full-length consensus polyprotein sequences from GenBank were used. In addition, full-length NS5A protein consensus sequences from 18 treatment-naïve patients (6 UR and 12 NR) identified through the HALT-C trial (131) were used as a test data set to validate the NS5A predictive models constructed from the Virahep-C data. A full listing of the GenBank accession numbers of all sequences used in this study can be found in the supplemental material.

An alignment of the HCV viral sequences from all three data sets was generated using the Clustal W program (115) implemented in BioEdit v7.0.5.3 (58). HCV H77 (GenBank accession no. AF009606) was used as the reference sequence. In addition, alignments of consensus sequences for individual gene products were generated using the Virahep-C data. Each amino acid site was numbered according to its position in the HCV polyprotein. For modeling, each sequence was associated with the IFN-RBV therapy outcome, UR or NR. Together, the sequences and assigned therapy outcome attributes constituted the entire set of viral features representing each HCV variant. These viral features of the Virahep-C data were used for modeling dependencies among sites in relation to treatment response.

Conditional independence analysis.

Pairwise conditional independencies (CI) among HCV viral features (amino acid sites and therapy outcome) were examined using full-length polyprotein consensus sequences from the Virahep-C study (18, 26). Testing for CI was performed in the form of undirected independence graphs (71), which present the CI among a collection of variables. Nodes in the graph represent the HCV polyprotein sites and therapy outcome, while links between nodes represent dependencies among the features.

The CI testing was used to validate dependency among the polyprotein sites in relation to the therapy outcome. Only polymorphic sites were considered for finding CI from the data. The identified dependency between two features was shown in the graph as a link. This type of statistical analysis assumes the null hypothesis of independence between any two given features. Relative strengths assigned to links in the graph were based on the marginal dependencies between observed associations. Marginal dependence for each link connecting variables A and B was quantified through a P value. For each set, C, of conditioning variables, a P value for {A, B} was computed, which expresses the probability that A and B are conditionally independent given C. The marginal P value is the value corresponding to the empty conditioning set, C = {}. The marginal dependence between A and B is defined as 1 minus the marginal P value associated with {A, B}, where a marginal dependence of 0 means that A and B are completely independent and 1 means that they are completely dependent. The CI among the features was measured at several different levels of significance (thresholds between 0.05 and 5 × 10⁻⁶). Undirected independence graphs and statistical computations of CI were conducted as implemented in the commercially available software package Hugin Researcher (v6.8).
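As an illustration of the marginal dependence measure described above (1 minus the marginal P value for a pair of features), the following is a minimal sketch rather than the procedure implemented in Hugin Researcher. It swaps in a permutation test on mutual information to obtain the P value; the function names, the number of permutations, and the toy columns are all assumptions made for the example.

```python
import math
import random
from collections import Counter


def mutual_information(xs, ys):
    """Mutual information (in nats) between two equal-length categorical columns."""
    n = len(xs)
    px, py, pxy = Counter(xs), Counter(ys), Counter(zip(xs, ys))
    return sum(
        (c / n) * math.log((c / n) / ((px[x] / n) * (py[y] / n)))
        for (x, y), c in pxy.items()
    )


def marginal_dependence(xs, ys, n_perm=2000, seed=0):
    """Return (p_value, 1 - p_value) under the null hypothesis of independence.

    The P value is estimated by shuffling ys and counting how often the
    shuffled mutual information reaches the observed value.
    """
    rng = random.Random(seed)
    observed = mutual_information(xs, ys)
    shuffled = list(ys)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        # Small tolerance guards against floating-point noise around ties.
        if mutual_information(xs, shuffled) >= observed - 1e-12:
            hits += 1
    p_value = (hits + 1) / (n_perm + 1)
    return p_value, 1.0 - p_value


# Toy columns (made up, not study data): residues at three sites in 12 sequences.
site_a = list("AAAAAATTTTTT")
site_b = list("VVVVVVIIIIII")  # perfectly coupled to site_a
site_c = list("LMLMLMLMLMLM")  # carries no information about site_a

p_ab, dep_ab = marginal_dependence(site_a, site_b)
p_ac, dep_ac = marginal_dependence(site_a, site_c)
```

In this toy case the coupled pair receives a marginal dependence close to 1 and the unrelated pair a value close to 0, which is the scale on which link strengths in the undirected graph are reported.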

Bayesian network (BN).

Relationships among amino acid sites of the HCV polyprotein and therapy outcome were examined using probabilistic graphical models in the form of a Bayesian network (BN) (63), where nodes in the graph represent variables (here, amino acid sites and therapy outcome) and links between the nodes represent relationships. Unlike the undirected independence graphs, BNs provide a more complex notion of the relationships. This includes the notion of the conditional probability and directionality of the relationship. Links connecting two variables (nodes in the graph) are represented as arcs, which may project toward the node (incoming links) or from the node (outgoing links), thus specifying the direction of influences among variables. Relationships between variables in a BN may be interpreted as causal (22). The conditional probability distributions are represented in the conditional probability tables (CPTs) of the variables (features) in the network. CPTs of the BNs in this study represent amino acid probability distributions at each site and probabilities associated with therapy outcome. Inference of the network structure and parameter estimations (i.e., CPTs) of all BNs constructed for this study was performed through Bayesian artificial intelligence learning algorithms.

BNs were inferred from the Virahep-C data using the HCV polyprotein sequences and associated therapy outcomes. The objectives of analysis with BN were to examine the complexity of the probabilistic interrelationship and measure the importance (or strength) of links among the HCV amino acid sites and therapy outcome (variables). Measurements of the importance of links were used to identify the most influential amino acid sites in the polyprotein BN. The importance of a variable can be estimated using the number and strength of links associated with the corresponding node in the BN. The amino acid sites that most strongly influence the probabilities of the treatment outcome were of a particular interest.

The greedy thick thinning (GTT) method (31) was used to infer the BN structure for the task of examining complexity of interrelationships among the variables. The number of incoming links to any given node was constrained between 3 and 10. Parameter estimation of the CPTs was performed using the K2 priors (28) of each variable in the network. Complexity of the probabilistic interrelationship among amino acid sites and therapy outcomes was also examined by individual protein regions. BNs were constructed for each individual protein using the same methods as described above for structure learning and network parameterization (GTT and the K2 priors, respectively). BNs were constructed using the GeNIe software (

It is important to note that with the increase in the number of variables, the number of possible networks grows superexponentially and computation of the probabilities of all links becomes NP-hard (24). Therefore, a heuristic search method was adopted to compute the strengths of the links in order to derive measures of the importance of relationships among amino acid sites and therapy outcome. The maximum spanning tree (MST) algorithm was used to infer the BN structure from the data.

The strength of the probabilistic relationships (or force of the influences) among variables (amino acid sites and therapy outcome) was inferred by computing the Kullback-Leibler (KL) divergence (69) between the joint probability distribution with and without the link. The greater the KL divergence between these two distributions, the greater the strength of the link, hence, the importance of the relationship it represents. The global importance of an amino acid site was calculated as the sum of strength of incoming and outgoing links associated with the node representing this site in the network. The overall strength of links for individual protein regions and relevance to the therapy outcome was calculated by summing the strength of incoming (incoming strength) and outgoing links (outgoing strength) associated with each region.
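The link-strength calculation can be illustrated with a minimal sketch. It covers only the simplest case, an arc whose child has no other parents, where the KL divergence between the joint distribution with the arc, P(A, B), and without it, P(A)P(B), reduces to the mutual information of the two variables. This is not the BayesiaLab implementation; the function names, node names, and toy columns are assumptions made for the example.

```python
import math
from collections import Counter, defaultdict


def link_strength(parent, child):
    """KL divergence (nats) between P(A, B) and P(A)P(B) for one arc A -> B."""
    n = len(parent)
    p_a, p_b = Counter(parent), Counter(child)
    p_ab = Counter(zip(parent, child))
    return sum(
        (c / n) * math.log((c / n) / ((p_a[a] / n) * (p_b[b] / n)))
        for (a, b), c in p_ab.items()
    )


def global_importance(arcs, columns):
    """Sum the strengths of incoming and outgoing arcs for every node.

    arcs    -- list of (parent_name, child_name) pairs
    columns -- dict mapping node name to its list of observed states
    """
    incoming = defaultdict(float)
    outgoing = defaultdict(float)
    for parent, child in arcs:
        strength = link_strength(columns[parent], columns[child])
        outgoing[parent] += strength
        incoming[child] += strength
    nodes = set(incoming) | set(outgoing)
    return {node: incoming[node] + outgoing[node] for node in nodes}


# Toy data (made up, not study data): two sites and the outcome in 8 sequences.
columns = {
    "site_x": list("AAAATTTT"),
    "site_y": list("PPSSPPSS"),
    "outcome": ["UR"] * 4 + ["NR"] * 4,
}
arcs = [("site_x", "outcome"), ("site_y", "site_x")]
importance = global_importance(arcs, columns)
```

Here site_x fully determines the outcome, so its arc has strength ln 2, while site_y is independent of site_x and contributes nothing. Per-protein incoming and outgoing strengths follow by summing the same quantities over the sites of each region.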

The relative significance of the contribution that each amino acid site independently provided to the knowledge of therapy outcome was determined using a naïve BN (28) approach. The BN structure was inferred from the Virahep-C data using the MST algorithm. This approach identifies associations between the therapy outcome and amino acid sites, with sites considered to be independent from each other. Mutual information was used to measure the contribution of each site to the knowledge of therapy outcome (29). All algorithms based on heuristic methods used here to infer the BN structures as well as computation of link strength and relevance of variables were carried out as implemented in the Professional Edition of BayesiaLab software (Bayesia SAS, Laval, France). The Pearson correlation coefficient was calculated using SAS (version 9.2; SAS Institute Inc., Cary, NC).

Bayesian network classifier (BNC).

A BNC was constructed for each of the E2 and NS5A proteins. Both BNCs can infer the probabilities of the UR/NR responses to IFN-RBV treatment directly from amino acid sequence. The E2 BNC and NS5A BNC were inferred from the Virahep-C data as follows: (i) the network was initialized as a naïve BN (28), where the therapy outcome was directly linked to all amino acid sites; (ii) conditional probabilities for amino acid sites were computed. The K2 learning algorithm (28) was used to infer BN structure. The maximum number of incoming links associated with each node (feature) in the BN was constrained to 4. Parameter estimation of CPTs of each feature in the BN was empirically derived from the data.

The NS5A BNC based on the selected amino acid sites was constructed using the hybrid decision table-naïve Bayes method (DTNB) (56). The DTNB is a BN where CPTs are represented by a decision table. This method has been shown to perform well when applied with feature selection (56). The DTNB model splits the features into two groups: one group assigns class probabilities based on naïve Bayes, and the other assigns class probabilities based on a decision table. The resulting probability estimates are then combined to estimate the probability of the outcome class association.

Physicochemical properties of HCV variants.

Each amino acid can be represented as a set of physicochemical properties. Using these properties, the HCV polyprotein consensus sequences from the Virahep-C data set were converted into the respective physicochemical vectors, which were subsequently used to identify their association with therapy outcome. Analyses were conducted for the HCV polyprotein and individual gene products. Conserved positions were not considered for the physicochemical representation of HCV variants. Position numbering of polymorphic sites was maintained according to the HCV polyprotein. Sequence alignments comprised of polymorphic sites were transformed into N × 5 dimensional numerical vectors, where N is the sequence length and 5 represents the number of physicochemical values assigned to each amino acid site in the sequence. The five physicochemical factors used in this study have been previously described (6). Each vector was then associated with the known therapy outcome (18, 26).

Physicochemical mapping of the data was conducted using a projection pursuit-based technique in the form of a two-dimensional linear projection (LP) (32). The method was used to search for a combination of the physicochemical vectors (projections) that most accurately separates HCV variants into two classes: UR and NR therapy outcomes. The LP mapping can be tested on new data without having to reconstruct the original mapping (32).

Feature selection was used to identify amino acid sites and their properties most relevant to the therapy outcome-based clustering of the HCV variants. A minimal subset of site-specific properties (features) from the NS5A protein was derived, using a heuristic method (73), to search for “interesting” projections that were most associated with the therapy outcome. Projections were evaluated during the global and local searching that was performed using the k-nearest neighbor method (k = 10) and tested by 10-fold cross validation (10-fold CV) for classification correctness. Correctness estimation was based on the average probability of a projection to be assigned to the correct therapy outcome class. During the global and local searches, 5 × 10⁶ and 3 × 10⁶ projections, respectively, were evaluated.

Feature selection (FS).

FS was applied to alignments of the full-length consensus polyprotein sequences and individual gene products of the Virahep-C data to determine which amino acid sites were most associated with therapy outcome. FS reduces dimensionality of the data and improves the prediction performance of BNCs. The usefulness of each amino acid site for the prediction of the therapy outcome was evaluated using FS techniques for ranking or selecting an optimal subset of features. Feature ranking was conducted using divide-and-conquer approaches (decision trees) and information-based metrics. Correlation was used as the filtering metric to search for optimal subsets of features. Given that FS techniques have biases known to affect the variable selection optimization method (30, 54), several FS methods were applied.

Three FS techniques based on information theory were used: information gain (101), Gini gain (16), and gain ratio (101). These methods rank the relevance of the features (amino acid sites) based on a score that each feature receives in relation to the therapy outcomes, UR and NR. The top 25 ranked amino acid sites relevant to the UR/NR outcome were selected and used for comparison between the techniques. Features that by themselves are not useful for prediction (those with a low score) may, however, become useful when combined with other features and, hence, be relevant to the prediction (54). Therefore, the feature subset selection method based on correlation (CFS) (57) was applied to the Virahep-C data. Unlike the ranking methods, the CFS identifies a subset of features (amino acid sites) based on their degree of correlation to the class variable (therapy outcome) and low intercorrelation between features. This method was used to search for a minimal subset of complementary amino acid sites to improve the BNC accuracy.
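The three ranking scores can be sketched with their standard textbook definitions. The following is a minimal illustration, not the software used in the study; the function names, the site labels, and the toy alignment columns are assumptions made for the example.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy in bits."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def gini(labels):
    """Gini impurity."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def split_by(feature, labels):
    """Group the class labels by the residue observed at the site."""
    groups = {}
    for value, label in zip(feature, labels):
        groups.setdefault(value, []).append(label)
    return list(groups.values())


def gain(feature, labels, impurity):
    """Impurity reduction: information gain (entropy) or Gini gain (gini)."""
    n = len(labels)
    remainder = sum(len(g) / n * impurity(g) for g in split_by(feature, labels))
    return impurity(labels) - remainder


def gain_ratio(feature, labels):
    """Information gain normalized by the entropy of the site itself."""
    split_info = entropy(feature)
    return gain(feature, labels, entropy) / split_info if split_info > 0 else 0.0


def rank_sites(columns, labels, score, top=25):
    """Return the site names with the highest scores, best first."""
    scored = {site: score(col, labels) for site, col in columns.items()}
    return sorted(scored, key=scored.get, reverse=True)[:top]


# Toy alignment columns (made up, not study data) for 4 UR and 4 NR sequences.
labels = ["UR"] * 4 + ["NR"] * 4
columns = {
    "site_1": list("AAAATTTT"),  # separates UR from NR perfectly
    "site_2": list("AATTAATT"),  # uninformative
    "site_3": list("AAATTTTT"),  # partially informative
}
by_info_gain = rank_sites(columns, labels, lambda c, l: gain(c, l, entropy))
by_gini_gain = rank_sites(columns, labels, lambda c, l: gain(c, l, gini))
by_gain_ratio = rank_sites(columns, labels, gain_ratio)
```

All three scores evaluate each site on its own, which is why a subset method such as CFS is needed to capture sites that are only informative in combination.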

Evaluation and validation of the therapy outcome predictors.

The E2 and NS5A BNC were evaluated by 10-fold CV. Briefly, the HCV variants represented by all polymorphic sites or selected amino acid sites from E2 or NS5A were randomly divided into 10 parts of equal size. Each part was held out strictly as a testing data set to evaluate the prediction accuracy of the BNC trained with the remaining nine parts of the data. This process was executed until the BNC was evaluated with all 10 parts. The 10 accuracy estimates were then averaged to estimate the overall accuracy of the BNC.
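The 10-fold CV procedure can be sketched as follows. This is a minimal illustration under stated assumptions: the classifier is a trivial single-site lookup standing in for a BNC, the toy sequences are made up, and the function names are hypothetical. The data set is assumed to contain at least as many sequences as folds.

```python
import random
from collections import Counter, defaultdict


def k_fold_indices(n_items, k=10, seed=0):
    """Randomly divide item indices into k parts of (near-)equal size."""
    order = list(range(n_items))
    random.Random(seed).shuffle(order)
    return [order[i::k] for i in range(k)]


def cross_validate(features, labels, train, predict, k=10, seed=0):
    """Hold out each part once, train on the rest, and average the accuracies."""
    accuracies = []
    for held_out in k_fold_indices(len(labels), k, seed):
        held = set(held_out)
        train_idx = [i for i in range(len(labels)) if i not in held]
        model = train([features[i] for i in train_idx],
                      [labels[i] for i in train_idx])
        correct = sum(predict(model, features[i]) == labels[i] for i in held_out)
        accuracies.append(correct / len(held_out))
    return sum(accuracies) / len(accuracies)


def train_lookup(seqs, outcomes):
    """Placeholder classifier: majority outcome per residue at the first site."""
    votes = defaultdict(Counter)
    for seq, outcome in zip(seqs, outcomes):
        votes[seq[0]][outcome] += 1
    return {res: counts.most_common(1)[0][0] for res, counts in votes.items()}


def predict_lookup(model, seq):
    # Unseen residues fall back to "NR"; an arbitrary choice for the sketch.
    return model.get(seq[0], "NR")


# Toy data (made up, not study data): 40 two-residue "sequences".
sequences = ["AK"] * 20 + ["TK"] * 20
outcomes = ["UR"] * 20 + ["NR"] * 20
folds = k_fold_indices(len(sequences))
mean_accuracy = cross_validate(sequences, outcomes, train_lookup, predict_lookup)
```

Because each sequence is held out exactly once, the averaged accuracy uses every sequence for testing while never testing on data the model was trained with.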

Also, BNCs trained with data sets—where the E2 and NS5A protein sequences were randomly assigned with UR/NR outcome—were evaluated for prediction accuracy. The results were then compared to the accuracy obtained from the BNCs trained with the correct outcome assignment in order to account for any random statistical correlations present in the Virahep-C data.

Two measures of accuracy were used for classification performance: overall percent classification correctness and precision. The overall percent correctness was measured as [(no. of correctly classified instances/total no. of instances) × 100]. Precision was determined in the following manner (where TP is the number of true positives, TN is the number of true negatives, FP is the number of false positives, and FN is the number of false negatives): precision = TP/(TP + FP) for the positive class and, correspondingly, TN/(TN + FN) for the negative class.
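As a worked illustration of these two measures, the sketch below computes the overall percent correctness and the standard precision, TP/(TP + FP), from paired lists of actual and predicted outcomes. Treating UR as the positive class is an assumption made for the example, as are the function names and the toy predictions.

```python
def confusion_counts(actual, predicted, positive="UR"):
    """Return (TP, TN, FP, FN), treating `positive` as the positive class."""
    tp = tn = fp = fn = 0
    for a, p in zip(actual, predicted):
        if p == positive:
            if a == positive:
                tp += 1
            else:
                fp += 1
        else:
            if a == positive:
                fn += 1
            else:
                tn += 1
    return tp, tn, fp, fn


def percent_correct(tp, tn, fp, fn):
    """Overall percent classification correctness."""
    return 100.0 * (tp + tn) / (tp + tn + fp + fn)


def precision(tp, fp):
    """Fraction of positive calls that are correct; NaN if no positive calls."""
    return tp / (tp + fp) if (tp + fp) else float("nan")


# Toy predictions (made up, not study data).
actual = ["UR", "UR", "UR", "NR", "NR", "NR", "NR", "NR"]
predicted = ["UR", "UR", "NR", "UR", "NR", "NR", "NR", "NR"]

tp, tn, fp, fn = confusion_counts(actual, predicted)
overall = percent_correct(tp, tn, fp, fn)
ur_precision = precision(tp, fp)
```

With these toy lists, 6 of 8 instances are classified correctly and 2 of the 3 UR calls are true UR cases.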

The validation of the NS5A predictive models was conducted using the consensus sequences of the NS5A protein from the HALT-C study (131), which were not part of any of the analyses described herein. Estimation of the NS5A BNC and NS5A-LP accuracy of prediction of treatment outcome for HCV NS5A variants from the HALT-C study was based on the overall percent classification correctness.


Complex interdependence between polymorphic sites and therapy outcome.

CI analysis of interdependencies between HCV polymorphic amino acid sites and their potential linkage to the UR/NR outcome of IFN-RBV therapy was conducted using 40 HCV full-genome sequences obtained before and at the end of therapy from 10 UR and 10 NR patients from the Virahep-C study (18). The HCV sequences from before and after therapy were used to account for HCV evolution during treatment. A total of 551 polymorphic sites were found in the HCV polyprotein consensus sequences from these patients. CI tests were performed to measure the degree of dependency among the polymorphic amino acid sites and the UR/NR outcome of IFN-RBV therapy. Results of the CI test were visually displayed as the undirected independence graph (71) (Fig. 1), in which the conditional dependencies among amino acid sites and UR/NR outcome (shown as nodes in the graph) are represented as undirected links or edges. The undirected graph displayed numerous links representing a dense and complex network of dependencies (P < 4 × 10⁻⁴) between amino acid polymorphic sites across the entire HCV polyprotein and therapy outcome. A large number of links among sites within and between individual proteins remained present up to a threshold value of 2 × 10⁻⁵. E2 protein sites formed the strongest dependencies. For example, site 612 of E2 is strongly connected to site 233 in E1 (P = 3 × 10⁻⁸), and site 642 of E2 to site 1756 in NS4B (P = 2 × 10⁻⁸). Also, sites 482 and 612 are strongly connected to site 642 (P = 2 × 10⁻⁹). It is important to note that links connecting amino acid sites to therapy outcome were among the strongest (P ≤ 7 × 10⁻⁵). As shown in Fig. 1, therapy outcome was strongly linked (P ≤ 0.003) to amino acid sites from the E1 (site 242), E2 (sites 397, 434, 524, and 655), P7 (site 790), NS3 (site 1090), NS5A (sites 2280, 2283, 2320, 2366, 2411, 2413, and 2414), and NS5B (sites 2530, 2633, 2730, and 2747) regions. The strongest dependencies were found with sites from P7 (site 790; P = 2 × 10⁻⁴), NS5A (sites 2280 and 2283; P = 3 × 10⁻⁴ and P = 7 × 10⁻⁵, respectively), and NS5B (site 2633; P = 9 × 10⁻⁴). These data suggest strong coordination of substitutions at sites along the entire HCV polyprotein and association between polymorphic sites and therapy outcome.

FIG. 1.
Undirected independence graphs showing relative strengths of the dependencies (links in the graph) found among HCV polyprotein sites (nodes in the graph) and UR/NR outcomes following IFN-RBV therapy from 40 sequences obtained from 10 UR and 10 NR patients ...

Contribution of different proteins to therapy outcome.

To infer a more insightful representation of the relationships among polymorphic sites and therapy outcome, a Bayesian network (BN) approach (63) was used. The complexity of relationships among HCV polymorphic sites and UR/NR outcome was evaluated by inferring BNs from the full-length HCV polyprotein consensus sequences. The properties of the network are listed in Table 1. In concordance with the undirected interdependence graph findings, interrelationships among all polymorphic sites were found to be highly complex. Figure 2 shows the structure of the polyprotein BN containing 551 polymorphic amino acid sites and their association to therapeutic outcome. Although all sites are interdependent, the number of links broadly varies from 1 to 30 among sites. Sites contributing to this polyprotein BN are listed in Table S1 in the supplemental material.

FIG. 2.
BN (P = 3) of inferred relationships among the full-length HCV polyprotein sites and IFN-RBV therapy outcome. Polyprotein sites and outcome are represented as nodes in the graph. Relationships among features are represented as arcs. Features whose ...

TABLE 1.
Properties of the HCV polyprotein BN

As shown in Table 1, HCV proteins do not contribute equally to the network topology. The E2 and NS5A regions of the HCV polyprotein are the two major contributors of sites to the HCV polyprotein BN (21% and 18% of amino acid sites, respectively). The E1, E2, and NS5A regions are also major contributors of links to the network (25.3%, 42%, and 26.8% of all links, respectively). The majority of links are between proteins, with only 17.5% of all links being within individual protein regions. Among all E2 links, 18.9% are among E2 sites, whereas all other proteins contain only 1.4% to 10.5% of intraprotein links. Owing to the large number of polymorphic sites contributing to the network, the E2 and NS5A proteins are extensively connected to each other and to all other proteins. As shown in Fig. 3, ~20% of all E2 sites have direct links to NS5A, and ~35% of all NS5A sites have direct links to E2 in the polyprotein BN, indicating a significant coordination of substitutions between these two proteins.

FIG. 3.
Distribution of links among polymorphic sites of the HCV 1a NS5A or E2 proteins with other viral proteins in the HCV polyprotein BN. E2 and NS5A interrelationships are compared between the HCV polyprotein BNs inferred from GenBank data and Virahep-C data. ...

Despite generating many connections (n = 554) and contributing many sites (n = 118) to the polyprotein BN (Table 1), E2 does not have direct links to therapy outcome. Only six sites form such direct connections, with two sites (at positions 864 and 934) being from NS2, a single site (at position 1841) from NS4B, two sites (at positions 2280 and 2283) from NS5A, and a single site (at position 2633) from NS5B.

The core protein contributes only 11 sites (1.8% of all sites) but 136 links (10.3% of all links) to the polyprotein BN, with each site being connected to ~13 other sites, which is ~3 to 8 times more than the individual sites from any other HCV protein (Table 1). The E1 sites contain 4.5 connections on average, while sites of all other proteins are linked on average to 1.5 to 2.7 other sites. The essential difference is in the directionality of links among proteins. Two proteins, core and E1, located at the N terminus of the HCV polyprotein, have 92% and 69.7% of their links directed outward, respectively, suggesting their important causal role in defining states of many polyprotein sites connected to these two proteins. All other proteins have almost equal measures of incoming (in-degree) and outgoing (out-degree) links.
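Per-protein link statistics of this kind (out-degree, in-degree, percentage of links directed outward, and the share of intraprotein links) can be tallied from a list of directed arcs once each site is mapped to its protein. The sketch below is illustrative only. The region boundaries are the commonly cited H77 polyprotein coordinates, which are not stated in the text and should be verified against the reference annotation; the arcs are made-up positions, not links from the study.

```python
from collections import Counter

# Assumed H77 polyprotein boundaries (not taken from the text; verify before use).
H77_REGIONS = [
    ("core", 1, 191), ("E1", 192, 383), ("E2", 384, 746), ("P7", 747, 809),
    ("NS2", 810, 1026), ("NS3", 1027, 1657), ("NS4A", 1658, 1711),
    ("NS4B", 1712, 1972), ("NS5A", 1973, 2420), ("NS5B", 2421, 3011),
]


def protein_of(site):
    """Map a polyprotein position to the protein region that contains it."""
    for name, start, end in H77_REGIONS:
        if start <= site <= end:
            return name
    raise ValueError(f"site {site} is outside the polyprotein")


def degree_summary(arcs):
    """Summarize directed arcs (source_site, target_site) per protein."""
    out_deg, in_deg = Counter(), Counter()
    intra = 0
    for src, dst in arcs:
        p_src, p_dst = protein_of(src), protein_of(dst)
        out_deg[p_src] += 1
        in_deg[p_dst] += 1
        if p_src == p_dst:
            intra += 1
    summary = {}
    for protein in set(out_deg) | set(in_deg):
        total = out_deg[protein] + in_deg[protein]
        summary[protein] = {
            "out": out_deg[protein],
            "in": in_deg[protein],
            "pct_out": 100.0 * out_deg[protein] / total,
        }
    return summary, intra / len(arcs)


# Toy arcs (made-up positions, not results from the study).
arcs = [(10, 400), (10, 2000), (50, 200), (200, 500), (500, 600), (2100, 500)]
summary, intra_fraction = degree_summary(arcs)
```

In this toy network all core links point outward, whereas E2 mostly receives links, mirroring the kind of asymmetry the text describes for the N-terminal proteins.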

Many essential properties of the polyprotein BN constructed using the 40 Virahep-C sequences, except for linkage to therapy, were observed with another BN constructed using HCV genotype 1a full-length genome sequences obtained from GenBank (n = 298). As shown in Fig. 3 and 4, the GenBank BN and Virahep-C BN have similar distributions of links, and interrelationships among individual proteins are highly correlated (r = 0.99, P < 0.0001), indicating that the overall coordination among substitutions in the HCV genotype 1a data set has been adequately represented by the Virahep-C sequences used in this study. However, variations in the number of polymorphic sites are observed between the GenBank BN (n = 1,296) and polyprotein BN (n = 551). Despite the greater number of polymorphic sites in the GenBank sequences, the Virahep-C sequences contain 25 unique polymorphic sites distributed among all but core proteins: at positions 230, 349, and 381 in E1; 385, 582, 631, and 742 in E2; 768 in P7; 826 and 926 in NS2; 1385, 1461, 1520, 1528, 1565, and 1592 in NS3; 1681 in NS4A; 1805, 1820, and 1846 in NS4B; 2003, 2049, and 2343 in NS5A; and 2500 and 2548 in NS5B. These findings, in conjunction with the observation of the 1.7-fold increase in the number of links between sites in E1 and E2 and the 2-fold increase between sites in E2 and NS5A in the Virahep-C BN compared to the GenBank BN (Fig. 3), suggest treatment-specific variations in coordination of substitutions at the genomic sites in the UR/NR HCV strains.

FIG. 4.
Relative strength and direction of links associated with individual HCV proteins in the Virahep-C BN (A) and GenBank BN (B). The total strength of all outgoing links (blue bars), incoming links (red bars), and the global strength (green bars) are shown ...

Protein sites relevant to therapy outcome.

Observation of a significant interconnection and coordination among HCV proteins suggests that all proteins contribute to determining the UR/NR outcome. To analyze these contributions in more detail, BNs were constructed for the individual gene products. Extensive dependencies between sites and association to therapy outcome were found in all individual polyprotein regions, albeit to different degrees. The E2 and NS5A regions were found to form a denser set of links than other regions of the polyprotein (Table 2).

Properties of the BNs for individual protein regions

Although many polymorphic sites were found to be interlinked in the model shown in Fig. 1, indicating a significant coordination of heterogeneity along the HCV polyprotein, there are a large number of sites with very few links, suggesting their marginal contribution to the polyprotein BN. To evaluate which proteins and amino acid sites were most associated with the outcome, we conducted feature selection experiments. By using a naïve Bayesian network with feature selection, the E2 and NS5A polyprotein regions were found to contribute the greatest number of sites relevant to outcome (27.5% and 26.3%, respectively) (Fig. 5). Similar results were observed with four filtering methods for feature selection (Fig. 6). Each of the feature selection techniques extracted a certain number of the most relevant sites. A greater proportion of sites were selected from E2 and NS5A as relevant to the outcome (Fig. 6). The NS5A region consistently contributes a large number of relevant sites with all four feature selection techniques. Depending on the technique, 14.3% to 32% and 24.0% to 44% of amino acid sites were, respectively, selected from E2 and NS5A as contributing to the outcome. All of the techniques used selected significantly overlapping sets of the relevant amino acid sites from all proteins, albeit with variations in ranking among the selected sites (see Table S2 in the supplemental material). A set of sites selected using one of the techniques is shown in Table 3.

FIG. 5.
BN with selection of relevant sites linked to the UR/NR outcomes. Site selections were based on the BN choice of relevant features for outcome prediction. A total of 80 HCV polyprotein sites are shown. Nodes are color coded by region (inset).
FIG. 6.
Contribution of the UR/NR-relevant sites from individual HCV proteins identified using four filtering methods.
Correlation-based feature selection (CFS) of HCV sites relevant to UR/NR outcomes

Relationships between variables in a BN may be interpreted as being causal (22), which can be applied to detect relevance of a variable to define a target feature, in this case, therapy outcome. Analysis of the strength of influence measured as the Kullback-Leibler divergence (69) between the joint probability distribution with and without the arc shows that sites from the E2 and NS5A proteins have the strongest overall influences on outcome (Fig. 7). Additionally, analysis of the contribution of individual sites to the UR/NR outcome was conducted using the ratio of the mutual information (MI) calculated for each site and the outcome over the maximal MI, which was calculated for site 2283 in the NS5A protein (MI = 0.3951, P = 0.0001). Using this ratio as a measure of the relative significance of each site for determining outcome, 25 sites were identified in six proteins with values for this ratio being >0.5 (Fig. 8). Among these sites were one site at position 242 in E1, eight sites at positions 391, 394, 397, 400, 401, 434, 528, and 655 in E2, two sites at positions 753 and 790 in P7, one site at position 941 in NS2, nine sites at positions 2153, 2198, 2280, 2288, 2320, 2339, 2375, 2376, and 2413 in NS5A, and four sites at positions 2633, 2730, 2747, and 2755 in NS5B. The important observation from this analysis is that E2 and NS5A together contain ~70% of these highly relevant sites. Another interesting observation is that hypervariable region 1 (HVR1) contributes five of eight relevant sites in E2, thus suggesting that HVR1 heterogeneity is associated with HCV evolution toward the IFN-RBV resistance.
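
The relative-significance ratio described above (MI of each site with the outcome divided by the largest MI observed) can be sketched as follows. The site names, residues, and outcome labels are hypothetical toy data, and the function names are illustrative only. MI is computed here in bits; the ratio itself does not depend on the logarithm base.

```python
import math
from collections import Counter

def mutual_information(xs, ys):
    """Mutual information (bits) between two equal-length discrete sequences."""
    n = len(xs)
    count_x = Counter(xs)
    count_y = Counter(ys)
    count_xy = Counter(zip(xs, ys))
    mi = 0.0
    for (x, y), c in count_xy.items():
        # p(x,y) * log2( p(x,y) / (p(x) p(y)) ), written with raw counts.
        mi += (c / n) * math.log2(c * n / (count_x[x] * count_y[y]))
    return mi

def relative_significance(site_columns, outcome):
    """Return {site: MI(site, outcome) / max MI over all sites}."""
    mi = {site: mutual_information(col, outcome)
          for site, col in site_columns.items()}
    max_mi = max(mi.values())
    if max_mi == 0:
        return {site: 0.0 for site in mi}
    return {site: value / max_mi for site, value in mi.items()}

# Hypothetical alignment columns for four sequences and their outcomes.
outcome = ["UR", "UR", "NR", "NR"]
site_columns = {
    "site_a": ["A", "A", "V", "V"],  # tracks the outcome perfectly
    "site_b": ["A", "V", "A", "V"],  # independent of the outcome
    "site_c": ["A", "A", "A", "V"],  # partially informative
}
ratios = relative_significance(site_columns, outcome)
selected = sorted(site for site, r in ratios.items() if r > 0.5)
```

With these toy columns only `site_a` exceeds the 0.5 threshold used in the text.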

FIG. 7.
Total strength of association between sites of individual HCV proteins and the UR/NR outcome.
FIG. 8.
Relative significance of association of the HCV polyprotein sites to the UR/NR outcome. Only sites with relative significance of >0.5 are shown. Color code: black, E1; red, E2; green, P7; yellow, NS2; blue, NS5A; and cyan, NS5B. Relative significance ...

Association of protein physicochemical properties with IFN-RBV resistance.

The observation of coordinated substitutions in all HCV proteins suggests extensive interrelationships among phenotypic traits encoded by these proteins and an important role of these interrelationships in defining HCV evolution toward IFN-RBV resistance. Although not clearly determined, these phenotypic traits can be further analyzed using amino acid physicochemical properties as a quantitative approximation to phenotype. The factors affecting sequence variation and diversity should also be reflected in the physicochemical properties of the HCV polyprotein. Herein, the physicochemical space dispersion of the HCV variants from the UR/NR Virahep-C cases (18) was examined using a linear projection technique (32). The analysis was conducted using polymorphic sites of the HCV polyprotein or individual gene products (see Table S1 in the supplemental material). The polymorphic sites from each protein were converted into vectors of amino acid physicochemical properties (6). For each protein, these vectors were used to generate a multidimensional physicochemical space and project this space into the optimized linear two-dimensional (2D) spaces. The probabilistic mapping of NR and UR outcomes in these 2D physicochemical spaces is shown in Fig. 9.
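
The two steps described above, encoding polymorphic sites as physicochemical vectors and projecting them linearly into 2D, can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the property values in `TOY_PROPERTIES` are invented and are not the published scales (6), and the feature anchors are simply spaced evenly on a circle rather than optimized as in the projection technique used in the study (32).

```python
import math

# Invented (polarity-like, volume-like) values for four residues.
# These are NOT the published physicochemical scales.
TOY_PROPERTIES = {
    "A": (0.1, 0.2),
    "V": (0.1, 0.6),
    "S": (0.7, 0.2),
    "T": (0.7, 0.4),
}

def encode(sequence, sites):
    """Concatenate the property values of the residues at the 0-based sites."""
    vector = []
    for site in sites:
        vector.extend(TOY_PROPERTIES[sequence[site]])
    return vector

def circular_anchors(n_features):
    """Place one 2D anchor per feature, evenly spaced on the unit circle."""
    return [
        (math.cos(2 * math.pi * i / n_features),
         math.sin(2 * math.pi * i / n_features))
        for i in range(n_features)
    ]

def project(vector, anchors):
    """Linear projection: the 2D point is the feature-weighted sum of anchors."""
    x = sum(v * ax for v, (ax, _) in zip(vector, anchors))
    y = sum(v * ay for v, (_, ay) in zip(vector, anchors))
    return (x, y)

# Two hypothetical sequences, encoded at sites 0 and 2.
vec_1 = encode("AVST", [0, 2])
vec_2 = encode("VVTT", [0, 2])
anchors = circular_anchors(len(vec_1))
point_1 = project(vec_1, anchors)
point_2 = project(vec_2, anchors)
```

An optimized projection would instead choose the anchor positions so that the UR and NR classes separate as well as possible in the plane; the fixed circular layout here only shows the mechanics of the mapping.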

FIG. 9.
Physicochemical projection of HCV polyprotein and individual proteins. Shown are the optimized 2D linear projections. Variation in shade of colors reflects probability estimates for UR (red) and NR (blue) outcomes, with darker shades corresponding to ...

This analysis showed that the probability of outcomes mapped in the optimized 2D physicochemical spaces of the polyprotein, E2, and NS5A was distributed in the least convoluted way, providing almost equal representations of UR and NR (Fig. 9). These observations suggest that the physicochemical properties of all HCV proteins are related to outcome, albeit to various degrees.

Strong association of E2 and NS5A with IFN-RBV resistance.

The results shown above strongly suggest that the IFN-RBV resistance is encoded in many regions of the HCV polyprotein, with E2 and NS5A being strongly linked to this resistance. To further investigate the strength of association of the IFN-RBV resistance with variation in the E2 and NS5A primary structure, BN classifiers (BNCs) were developed using polymorphic sites from these two proteins. The accuracy of performance of the models was evaluated using the 10-fold CV protocol. The results of the evaluation are shown in Fig. 10. The E2 and NS5A BNCs constructed using all polymorphic sites were found to be 82.5% and 90% accurate in the prediction of outcomes in the 10-fold CV, respectively. BNCs constructed using 15 sites selected from E2 and 9 sites selected from NS5A (Table 3) improved accuracies to 85% and 97.5%, respectively, while the randomized data sets produced BNCs showing accuracies of only 35% to 47.5% (Fig. 10). Thus, although the networks of sites from both proteins have a strong association with the IFN-RBV resistance, the NS5A BNCs significantly outperformed the E2 BNCs in the CV experiments.
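
The evaluation protocol described above, 10-fold CV accuracy compared with the accuracy obtained after randomizing the outcome labels, can be sketched as follows. The classifier is a plain categorical naïve Bayes model used as a stand-in for the BN classifiers of the study, and the 40 toy sequences and all function names are hypothetical.

```python
import math
import random
from collections import Counter, defaultdict

def train_naive_bayes(rows, labels):
    """Collect the counts needed by a categorical naive Bayes classifier."""
    class_counts = Counter(labels)
    value_counts = defaultdict(Counter)  # (feature index, label) -> value counts
    values_seen = defaultdict(set)       # feature index -> observed values
    for row, label in zip(rows, labels):
        for j, value in enumerate(row):
            value_counts[(j, label)][value] += 1
            values_seen[j].add(value)
    return class_counts, value_counts, values_seen

def predict(model, row):
    """Return the label with the highest add-one-smoothed log posterior."""
    class_counts, value_counts, values_seen = model
    total = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label, count in sorted(class_counts.items()):
        score = math.log(count / total)
        for j, value in enumerate(row):
            n_values = len(values_seen[j] | {value})
            score += math.log((value_counts[(j, label)][value] + 1)
                              / (count + n_values))
        if score > best_score:
            best_label, best_score = label, score
    return best_label

def cross_validate(rows, labels, k=10, seed=0):
    """k-fold cross-validation accuracy over all held-out predictions."""
    order = list(range(len(rows)))
    random.Random(seed).shuffle(order)
    folds = [order[i::k] for i in range(k)]
    correct = 0
    for fold in folds:
        held_out = set(fold)
        train = [i for i in order if i not in held_out]
        model = train_naive_bayes([rows[i] for i in train],
                                  [labels[i] for i in train])
        correct += sum(predict(model, rows[i]) == labels[i] for i in fold)
    return correct / len(rows)

def permuted_label_accuracy(rows, labels, k=10, seed=0):
    """Same protocol after shuffling the labels (randomized baseline)."""
    shuffled = list(labels)
    random.Random(seed).shuffle(shuffled)
    return cross_validate(rows, shuffled, k=k, seed=seed)

# Hypothetical data: 40 sequences, one informative site and one noise site.
rows = [("A" if i < 20 else "V", "x" if i % 2 == 0 else "y") for i in range(40)]
labels = ["UR" if i < 20 else "NR" for i in range(40)]
cv_accuracy = cross_validate(rows, labels)
baseline_accuracy = permuted_label_accuracy(rows, labels)
```

Because the first toy site tracks the outcome perfectly, the unpermuted accuracy is 1.0, whereas the permuted-label run estimates what the same protocol yields when the association with outcome is destroyed.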

FIG. 10.
10-fold CV performance of the E2 BNC and NS5A BNC constructed using all polymorphic sites (black bar) and selected relevant sites (white bar). Results for BNCs with randomized labels are shown using patterned bars (black for all and white for selected ...

Prediction of UR/NR outcomes using NS5A.

A high accuracy of the BNC models described above suggests a strong association of coordinated substitutions in NS5A with evolution toward the IFN-RBV resistance. However, since these models were generated using only 40 sequences from 20 patients, it is critical to demonstrate that the interrelationships identified for these patients are representative of those for other patients. For this purpose, two predictive models were constructed using the same Virahep-C data set and tested using the HCV NS5A sequences from baseline specimens obtained from patients in the HALT-C study (131). Because no additional data were available for E2 from patients with NR and UR outcomes investigated in a single study, only the NS5A models were validated.

One model was constructed using physicochemical properties of five NS5A sites selected using a heuristic method (73). The secondary structure for sites at position 2153 (projection X167 in Fig. 11) and 2413 (X492), the electrostatic charge for site at position 2198 (X195), the polarity for site at position 2280 (X281), and the molecular volume or size for site at position 2320 (X328) were selected as the most relevant features for outcome in the Virahep-C data set. The LP model mapping UR and NR outcomes into the 2D space generated using linear projection from the 5D physicochemical space is shown in Fig. 11. Another model was constructed as a hybrid between the decision table and a naïve Bayes (DTBN)-based machine-learning technique (56) using 12 NS5A sites: nine shown in Table 3 and three additional sites, at positions 2153, 2198, and 2413, used in the linear projection approach.

FIG. 11.
Projection of five selected physicochemical features of five NS5A sites from the HALT-C sequence data set onto the physicochemical space-based model derived from the Virahep-C sequence data set. Lines originating from the center of the graph are projections ...

After a 10-fold CV, both Virahep-C models were tested on the HALT-C data set with 6 NS5A sequences obtained from UR and 12 from NR patients. The hybrid DTBN model showed an overall accuracy of 72.2% and the linear projection model showed an overall accuracy of 83.3% of outcome prediction for the HALT-C patients (Table 4). This finding suggests that, although many sites along the entire HCV polyprotein are relevant to development of the IFN-RBV resistance, the small number of features from the NS5A protein alone may be sufficient for the prediction of therapy outcomes.
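
As a quick arithmetic check, the overall accuracies reported above can be translated into counts of correctly predicted cases among the 18 HALT-C sequences (6 UR and 12 NR). The helper name `implied_correct` is hypothetical. The split of correct predictions between UR and NR cases cannot be recovered from the overall accuracy and is therefore not attempted here.

```python
def implied_correct(accuracy_percent, n_cases):
    """List the counts of correct predictions that round to the given accuracy."""
    return [
        k for k in range(n_cases + 1)
        if round(100 * k / n_cases, 1) == accuracy_percent
    ]

n_halt_c = 6 + 12  # UR + NR sequences in the validation set
linear_projection_correct = implied_correct(83.3, n_halt_c)
dtbn_correct = implied_correct(72.2, n_halt_c)
```

Under this reading, 83.3% corresponds to 15 of 18 cases and 72.2% to 13 of 18 cases, so the two models differ by two correctly classified patients.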

Validation of the NS5A Virahep-C models using the HALT-C NS5A sequences


Two important features of HCV infection, persistence following primary infection and resistance to IFN-based therapy, have been related to the extensive HCV genetic variability (39, 41). Although HCV has developed a very efficient capacity to escape from adaptive (15, 35, 104, 128) and innate immune responses (12, 13, 85, 126), ~20% to 30% of all HCV infections are cleared by the host (23) and 50% to 70% of chronic infections can be successfully treated with IFN-RBV (48, 52, 55, 80). The variation in response to therapy among HCV strains remains poorly understood. However, differential sensitivity of HCV genotypes to IFN therapy (52, 80) suggests that viral genetic factors play an important role in determining therapy outcomes. Despite a low degree of response to treatment during chronic infection, 80% to 98% of patients with acute HCV infection can achieve complete virological response to IFN therapy (51, 62), suggesting that HCV acquires a significant degree of IFN resistance during chronic infection. Taken together, these observations indicate a strong connection between the intrahost HCV evolution and success of the IFN-RBV therapy.

In the current study, an integrative approach was implemented for the evolutionary analysis of the HCV genome. This approach was based on modeling interrelationships between polymorphic sites along the entire HCV polyprotein and relating the modeled coordination among amino acid substitutions to the UR/NR outcomes of therapy. Models constructed here showed an extensive interdependence of all polymorphic sites within the HCV polyprotein, suggesting a significant coevolution among individual HCV proteins. The data indicate that all HCV proteins contain sites coordinating their polymorphism with sites in all other proteins (Fig. 2). A similar observation has been recently made using a correlation network analysis of the HCV genotype 1a full-genome sequences from untreated patients (17) and patients on therapy (7). Among all connections identified using the polyprotein BN in this study, only 17.5% were among sites within individual proteins. It is interesting to note that E2 shows the most extensive coordination among its sites, with all other proteins having ~2 to 13 times fewer connections among intraprotein sites than E2. With 82.5% of all connections in the network being among proteins, HCV evolution is evidently defined by coadaptation among many phenotypic traits encoded by different HCV proteins.

Although all HCV proteins contribute to the network, the topological properties of sites differed among proteins. The core protein contributes fewer sites (n = 11) per its size than any other HCV protein. However, each core site forms ~2 to 4 times more links in the network than any site from other proteins (Table 1). This protein has 12.4 times more outgoing than incoming links in the polyprotein BN, while the ratio between outgoing and incoming links for all other proteins varies from 0.8 to 2.2 (Table 1). Another important feature of core connectivity in the polyprotein BN is that 98.6% of all core links are with other proteins. The presence of only two intraprotein links (polyprotein positions 90→110 and 47→29) makes the core protein the least intraconnected protein, indicating a minimal direct coordination among core polymorphic sites. Thus, the contribution of core to the network topology differs considerably from those of all other proteins, suggesting that this protein has a unique role in coordinating substitutions and defining heterogeneity at many sites of the HCV polyprotein.

This observation is in agreement with the multitude of functions performed by the core protein and emphasizes its important role in HCV infection. In addition to forming the nucleocapsid (105), this protein was shown to interfere with many cellular signaling pathways involved in apoptosis (134), transcription (60, 130), and transformation (21, 65, 102, 129). The core protein is also involved in lipid metabolism (10, 96). It inhibits the microsomal triglyceride transfer protein, binds to apolipoprotein AII, and induces accumulation of cytoplasmic lipid droplets (2). Core and NS5A are key factors for assembly of infectious particles. Both colocalize on the surface of lipid droplets, a proposed site for HCV particle assembly (4). With lipid droplets playing a crucial role in the assembly and release of infectious HCV particles (83), interactions involving domain 2 of core and domain 3 of NS5A (5, 14, 81, 82) are essential for virion production and, therefore, have a strong impact on infectivity and viral fitness. Mutation at position 147 in domain 2 of the core protein was found to affect adherence of core to lipid droplets and virus production (107). Our data show that this site has direct links in the polyprotein BN to sites in E1, E2, NS2, and domain 3 of NS5A. Another site in domain 2 of core, at position 161, is linked to P7 in addition to these four proteins. All of these proteins play a role in the membrane-associated viral replication (86). These observations suggest coordination of heterogeneity across the HCV polyprotein related to viral production and the important role played by the core protein in this coordination.

Two proteins, E2 and NS5A, together contribute ~40% of all sites and ~62% of all links to the polyprotein BN and, therefore, essentially define the state of this entire network. In combination with E1, these three proteins contribute ~50% of all sites and ~77% of all links to the polyprotein BN. It is interesting that E2 and NS5A also mutually coordinate their heterogeneity (Fig. 3). Although coordination between sites from any two HCV proteins is a common feature of the polyprotein BN, this coordination is most extensive between sites of E2 and NS5A, owing to the large number of sites contributed by these two proteins to the network. Thus, the states of many sites in one of these two proteins reflect the states of many sites in the other protein, suggesting a high degree of coevolution between these two proteins. Additionally, it was observed that sites from E2 formed the strongest links with many other sites in the polyprotein as determined by CI testing (Fig. 1), among which were links between sites 482 and 642 in E2 (P = 2 × 10^−9), 612 in E2 and 233 in E1 (P = 3 × 10^−8), and 642 in E2 and 1756 in NS4B (P = 2 × 10^−8). Taking into consideration that site 482 is from the CD81-binding region (45, 127) and site 612 from one of two E2 regions proposed to be involved in the viral fusion process (72, 91, 95), we speculate that the tight coordination between sites 482 and 642 as well as that between sites 612 and 233 is associated with viral entry.

Another important observation made in this study is that all HCV proteins have association with the UR/NR outcome of IFN-RBV therapy. Taking into consideration the aforementioned extensive linkage among polymorphic sites from different proteins, this observation, although not surprising, reveals that the HCV response to immunomodulatory therapy is a very complex trait involving numerous viral functions that require coordination. All networks constructed for individual proteins included the UR/NR outcome as a variable (Table 2). However, this observation cannot be unequivocally interpreted in terms of equal contribution of each protein to the IFN-RBV response. Nevertheless, it suggests that the genome-wide coordination among sites is important for this response, with some proteins possibly playing accessory roles and reflecting the IFN-RBV-related changes in other proteins that are mainly responsible for resistance. The analysis conducted here revealed that sites substantially associated with the outcome are scattered along the entire HCV polyprotein. Among the sites with relative significance of >0.5 (Fig. 8) are sites in E1 (n = 1), E2 (n = 8), P7 (n = 2), NS2 (n = 1), NS5A (n = 9), and NS5B (n = 4). Two proteins, E2 and NS5A, together account for 68% of these 25 sites, suggesting their strong connection to IFN-RBV resistance. E2, NS5A, and P7 have, respectively, 6.8%, 9.0%, and 11.7% of their polymorphic sites being highly relevant to the therapy outcome, while all other proteins have only 1.5% to 3.1% of these sites.

One surprising finding was that five among the eight E2 sites most relevant to therapy outcome are located in HVR1 of the E2 protein (aa 384 to 410), emphasizing a strong connection of HVR1 heterogeneity to IFN-RBV resistance. Association of HVR1 sites with outcomes of therapy can also be found in the correlation networks (7). However, the significance of these observations is not apparent. Analysis of HVR1 connectivity in the polyprotein BN showed that polymorphic HVR1 sites have a total of 140 links to all HCV proteins, with each HVR1 site being connected to three to nine sites in the HCV polyprotein. Such an extensive interdependence of HVR1 sites with many sites across the entire HCV polyprotein (Fig. 3), in conjunction with the earlier similar observations using network analysis (17), suggests that the HVR1 substitutions are not random and that HVR1 evolution is substantially coordinated with all HCV proteins. Coordination of HVR1 heterogeneity is especially noticeable with E1, E2, and NS5A, which share, respectively, 15%, 26.4%, and 14% of all HVR1 links in the polyprotein BN, while any other HCV protein shares 3.6% to 9.3% of HVR1 links.

HVR1 contains antigenic epitopes (66, 67, 112, 121) with HCV neutralizing activity (40). Rapid HVR1 evolution is associated with immune escape (70). However, the conservation of the HVR1 physicochemical properties and conformation (94) argues that this region is significantly functionally constrained despite its extensive heterogeneity. The observation that compensatory mutations in the ectodomain of E2 (46) and the I347L mutation in E1 compensate for HCV fusion impairment (9) in HCV mutants whose HVR1 have been excised suggests potential functional relationships of this region with other parts of the HCV genome. HVR1 was shown to be involved in the SR-B1-facilitated entry of HCV pseudoparticles in cell culture (11). It was suggested that HVR1 plays an important role in HCV entry by modulating receptor recognition and affects lipoprotein composition and infectivity of viral particles (9). HVR1 heterogeneity was also associated with the development of resistance to therapy (74, 87, 117, 118). We hypothesize that complex functional relationships of HVR1 are reflected in coordinated evolution with other HCV proteins and that HVR1 mirrors the evolution of the entire HCV genome, including evolution toward the IFN-RBV resistance.

There are many sites from different HCV proteins strongly linked to the IFN-RBV resistance (Fig. 1 and 8). However, consideration of individual sites allows only for the identification of connections to the therapy outcome in the form of a trend and does not have a strong predictive power. Correlation of the IFN-RBV therapy outcomes has been reported with site polymorphisms in the core (36), E2 (87, 106), and NS5A (88, 106) proteins. Although these observations revealed numerous associations between the HCV genetic polymorphism and evolution toward IFN-RBV resistance, these associations were never explored in terms of their interrelationships and formulated into an integrative model capable of revealing accurate quantitative connections between HCV genetic changes and therapy outcomes.

The current report presents several probabilistic models connecting the UR/NR outcome to coordinated changes at polymorphic sites across the entire HCV polyprotein as well as from individual HCV proteins. Analysis of individual sites without consideration of their relationships seems inefficient in detecting a reliable connection to the outcomes. Only 3 among 25 sites having the highest value of mutual information with the outcome (Fig. 8) were found to be directly linked to the outcome in the polyprotein BN (Fig. 2). The same 3 sites, 2280, 2283, and 2633, are among the 14 most relevant sites extracted from the HCV polyprotein using correlation-based feature selection (Table 3) and among 18 sites that have the strongest connections to outcome in the undirected dependence graph (Fig. 1). All computational techniques used in this study ranked the contribution of various sites differently. For example, only 12 sites were shared by 18 sites shown in Fig. 1 and 25 sites shown in Fig. 8. Although sites 2280 and 2283 from NS5A and site 2633 from NS5B were frequently identified as most relevant to the IFN-RBV response, analysis of states at these sites is not sufficient for an accurate prediction of the therapy outcome (data not shown). Such a prediction requires the use of a combination of sites selected for their collective contribution to the outcome.

For that purpose, we conducted a series of experiments for selection of site sets most relevant to the therapy outcome from the entire HCV polyprotein and individual proteins (Table 3). Two proteins, E2 and NS5A, were explored in detail. As mentioned earlier, both proteins have many polymorphic sites and contributed many links to the polyprotein BN. These two proteins consistently made substantial contributions of the most relevant sites identified using different feature selection techniques (Fig. 5 and 6). Probabilistic mapping of UR and NR outcomes in 2D physicochemical space showed an equally representative distribution of the outcome probabilities for E2, NS5A, and the polyprotein (Fig. 9). All these findings strongly suggest that these two proteins have a strong connection to therapy response and can be used for the accurate prediction of therapeutic outcomes. However, as can be seen in Fig. 10, the 10-fold CV experiments showed that the NS5A BN outperforms the E2 BN whether constructed using complete sets of polymorphic sites (90% versus 82.5% accuracy) or feature-selected sites (97.5% versus 85% accuracy). These results, taken together with the observation that NS5A contains two of six sites directly connected to the therapy outcome in the polyprotein BN while E2 has no direct links to the outcome, suggest that NS5A has a very strong relevance to evolution toward the IFN-RBV resistance.

Two sites, at positions 2376 and 2414 in NS5A, have experimentally been associated with the development of resistance to RBV (97). It is important to note that these two sites were consistently selected as being relevant to the therapy outcome (Table 3 and Fig. 8), indicating that the NS5A BN as well as polyprotein BN constructed using all or feature-selected sites includes links that reflect contribution of RBV to therapy. Site 2414 located in domain 3 of NS5A is linked to site 161 in domain 2 of core in the polyprotein BN. As mentioned earlier, both domains are involved in protein-protein interactions between these two proteins, association with lipid droplets, and assembly and release of viral particles (81, 83). There seems to be a linkage between coevolution of the core and NS5A proteins and RBV resistance, and this resistance is associated with interaction between these two proteins. The final validation of the two predictive NS5A Virahep-C models using the HALT-C data strongly confirms a robust connection between coordination among the NS5A sites and IFN-RBV resistance. Additionally, it shows that a small number of features from NS5A alone may be sufficient for the prediction of therapy outcomes (Fig. 11). This finding suggests that analysis of very few sites from a small HCV genomic region, such as NS5A, may be used for monitoring sensitivity to the IFN-RBV therapy.

A general interconnectivity among HCV proteins was comparable for the 40 Virahep-C sequences and the 298 HCV genotype 1a full-genome sequences obtained from GenBank (Fig. 3 and 4), indicating that the modeled coordination among substitutions is essentially similar for all HCV variants from treated and treatment-naïve patients. This observation additionally suggests that the development of resistance during immunomodulatory therapy is generally shaped by selection pressures similar to the HCV evolution in untreated patients. However, there are some important differences between the polyprotein BNs generated using sequences from treated and treatment-naïve patients. The GenBank sequences from untreated patients contain more polymorphic sites (n = 1,296) than the Virahep-C sequences (n = 551). Despite this fact, the Virahep-C sequences contain 25 polymorphic sites that are conserved in the GenBank sequences. These sites are distributed within E1 (n = 3), E2 (n = 4), P7 (n = 1), NS2 (n = 2), NS3 (n = 6), NS4A (n = 1), NS4B (n = 3), NS5A (n = 3), and NS5B (n = 2). Among them, sites at positions 230 in E1, 768 in P7, and 1461 and 1592 in NS3 are the most relevant to the IFN-RBV response (Table 3). Furthermore, the two BNs had topological differences in the number of interprotein links, most notably the 1.7- and 2-fold proportional increase in the number of links between E1 and E2 and between E2 and NS5A in the Virahep-C BN compared to those in the GenBank BN (Fig. 3). These observations suggest that despite the similarity of these two networks, there are distinct differences in coordination among substitutions in HCV from treated and treatment-naïve patients.

IFN is a major component of innate immunity (19, 100). Several HCV proteins are involved in modulation of the host IFN response (12, 13, 85, 126). RBV used as a component of combined therapy seems to facilitate early response to IFN (43) rather than playing a strong independent role. Resistance to IFN is not clearly linked to any specific mutation within the HCV genome. As shown in this study, HCV adaptation to IFN is a complex trait encoded in the interrelationships among many sites along the entire HCV polyprotein. The extensive coevolution among HCV amino acid sites leads to a significant integration among the HCV IFN-response-related phenotypic traits. Each HCV protein contributes to the IFN resistance, albeit to a different degree. With E2 and NS5A contributing many polymorphic sites to the network and generating a broad epistatic connectivity to sites in other HCV proteins, intrahost HCV evolution toward the IFN resistance is essentially defined and, therefore, can be accurately predicted using a carefully selected combination of sites from these two proteins.

Treatment with IFN does not exert an unusual selection pressure on HCV, unlike treatment using direct-acting antiviral compounds, but rather generates an unusually strong selection pressure of the innate immune system. Thus, HCV strains capable of resisting or evolving toward resistance to immunomodulatory therapy are most efficient in overcoming the host immune system. With the entire HCV genome being responsible for the response to IFN, there is no single IFN resistance mutation. Once established, the wide-ranging epistatic connectivity among sites involved in the IFN response may not be rapidly reverted even with reduction of the selection pressure in the absence of treatment, thus locking the HCV genome into the state of resistance to IFN. Without being eliminated by IFN-RBV therapy, these variants can continue to circulate among human hosts. In contrast, IFN-RBV-sensitive strains are being removed from circulation. This consideration implies that the current widespread adoption of IFN-based therapy, although extremely beneficial for individual patients with SVR, may affect the composition of the circulating HCV population and enlarge the reservoir of IFN-resistant HCV, a potentially alarming public health issue that warrants further investigation.



We are grateful to Chong-Gee Teo for critical review and discussion of findings in this paper as well as to two anonymous reviewers for important comments.

This work was supported by CDC intramural funding.

This information has not been formally disseminated by the Centers for Disease Control and Prevention/Agency for Toxic Substances and Disease Registry. It does not represent and should not be construed to represent any agency determination or policy.


Published ahead of print on 19 January 2011.

§Supplemental material for this article may be found at

