The E2 glycoprotein has been reported to contain primarily high mannose glycans [15]. In order to determine specific glycan structures on E2, the intact native protein was first analyzed by MALDI-TOF MS. The amino acid sequence of the glycoprotein E2 is shown in , with the numbering starting from Ala1 to Lys333 (which corresponds to the numbering Ala383 to Lys715 within the entire HCV polyprotein of reference strain H, GenBank accession number AF009606) [38]. The 11 consensus glycosylation sites on E2, highlighted in , have been shown to be primarily occupied with high mannose glycans [15]. As shown in , the MALDI-TOF mass spectrum of the intact glycosylated protein contains broad ions which correspond in mass to singly, doubly and triply charged molecular ions. E2 has a theoretical molecular weight of 36.5 kDa, but the observed mass of the singly charged ion in the MALDI-TOF mass spectrum was approximately 50 kDa, indicating a large amount of N-linked glycosylation with extensive heterogeneity, which hampers the ability to determine the exact molecular weight of the intact protein.
MALDI-TOF mass spectrum (mass range 10,000–100,000 Da) of HCV E2 showing singly, doubly, and triply charged molecular ions. The broadness of the peaks is due to the extensive glycosylation of E2.
As alternatives to MALDI-TOF MS, methods such as LC-MS can be applied to the study of glycoproteins [23], providing detailed information at high sensitivity. Glycopeptides can be identified using LC-MS of the enzymatic digestion mixture in the MS mode, based on characteristic ions arising from in-source decay, which can be monitored by generating extracted ion chromatograms of these ions, for example, m/z 204.1 (protonated HexNAc) and m/z 366.1 (protonated HexHexNAc). A search of the MS/MS data can be performed for these characteristic fragment ions [43]. Another approach for the detection of the glycopeptides in LC-MS analyses is through parent ion detection (PID) [40]. We performed an initial experiment in which the tryptic and chymotryptic digests were analyzed on the Q-Tof using PID. Identification of glycosylated peptides was accomplished using the parent ion detection mode for specific sugar oxonium ions (Hex+, m/z 163.1, and HexNAc+, m/z 204.1). A mass spectrum at low collision energy (5 V) followed by one at high collision energy (30 V) was acquired for each peptide. The presence of specific sugar fragment ions in the spectrum at high collision energy was diagnostic for the presence of glycopeptides (data not shown).
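The marker-ion screening described above can be illustrated with a minimal sketch. It assumes that a fragment spectrum is available as a plain list of m/z values; the function names, the 0.1 tolerance and the example spectrum are hypothetical and are not taken from the instrument software used in this work. The oxonium ion values are the standard monoisotopic m/z values of the singly protonated sugars.

```python
# Monoisotopic m/z of singly protonated sugar oxonium ions (standard values).
OXONIUM_IONS = {
    "Hex": 163.0601,
    "HexNAc": 204.0867,
    "HexHexNAc": 366.1395,
}


def find_oxonium_ions(fragment_mz, tol=0.1):
    """Return the names of the marker ions present in a list of fragment m/z values."""
    found = []
    for name, marker in OXONIUM_IONS.items():
        if any(abs(mz - marker) <= tol for mz in fragment_mz):
            found.append(name)
    return found


def is_glycopeptide_candidate(fragment_mz, tol=0.1):
    """Flag a spectrum as a glycopeptide candidate if any marker ion is present."""
    return len(find_oxonium_ions(fragment_mz, tol)) > 0


# Hypothetical high collision energy spectrum (illustrative values only).
spectrum = [163.1, 204.1, 366.1, 830.42]
print(find_oxonium_ions(spectrum))
```

In this sketch a spectrum acquired at high collision energy would be flagged when at least one marker ion is found; a stricter rule, such as requiring two markers, could be substituted.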
The identities of the glycans attached at each N-glycosylation site were determined from the LC-MS analyses of the tryptic and chymotryptic digests of reduced and alkylated E2 protein. From the LC-MS/MS analysis of the tryptic digest, 4 of the 11 consensus N-linked sites were unambiguously determined. The primary sequence of E2 contains only a few potential trypsin cleavage sites; therefore, longer proteolytic fragments containing multiple glycosylation sites were formed. Consequently, these peptides were not observed in the LC-MS analysis of the tryptic digest. In order to elucidate the glycosylation pattern at the remaining sites, chymotrypsin was used to generate shorter proteolytic fragments, which would contain a single glycosylation position within each peptide. Glycopeptides with the same backbone structure containing different glycan moieties (site microheterogeneity) show a specific pattern in ESI-MS [26]. Those glycopeptide ions forming a charge envelope are separated by an absolute mass difference of 162 amu (hexose) in the case of high mannose/hybrid type oligosaccharides. In addition, collision activated dissociation (CAD) spectra of glycopeptides contain carbohydrate marker ions, such as at m/z 163 (Hex+), 204 (HexNAc+), or 366 (HexHexNAc+), which simplify their identification if these masses are extracted from the MS/MS total ion current. Different batches of E2 glycoprotein were used in these experiments and slight differences were noticed in the results, especially in the relative abundance of some glycoforms compared to others, as well as in the number of attached mannose residues. However, similar glycosylation patterns of the protein were observed in all experiments.
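The 162 amu spacing between glycoforms of one peptide can be searched for directly in a list of deconvoluted masses. The following is a minimal sketch under stated assumptions: the function name, the tolerance and the input list are hypothetical, and only the mass 2905.26 Da (the Man6 glycoform of peptide 156–168 reported in this work) comes from the text. The remaining masses are generated by adding or subtracting hexose units for illustration.

```python
HEXOSE = 162.0528  # monoisotopic residue mass of a hexose (Da)


def hexose_ladders(masses, tol=0.1):
    """Group deconvoluted masses into series whose neighbours differ by one hexose."""
    remaining = sorted(masses)
    ladders = []
    while remaining:
        ladder = [remaining.pop(0)]
        extended = True
        while extended:
            extended = False
            for m in remaining:
                if abs(m - ladder[-1] - HEXOSE) <= tol:
                    ladder.append(m)
                    remaining.remove(m)
                    extended = True
                    break  # restart the scan from the new ladder end
        if len(ladder) > 1:
            ladders.append(ladder)
    return ladders


# Illustrative input: a five-member hexose series around 2905.26 Da
# plus one unrelated mass (3000.0) that should not join the series.
masses = [round(2905.26 + k * HEXOSE, 2) for k in range(-1, 4)] + [3000.0]
ladders = hexose_ladders(masses)
print(ladders)
```

Masses that do not belong to any series are left out of the result, which mirrors the manual step of separating glycoform signals from unrelated peaks.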
The N-linked glycosylation sites in E2 glycoprotein contain a broad variety of high mannose glycans, ranging from the minimal core structure (Man3) to, at most, 9 hexose residues attached to the trimannosyl chitobiose moiety (Hex3Man9). It should be noted that the relative abundances correspond to the abundance ratios observed in the raw data, and that differences in sensitivity and ionization efficiency were not considered. The relative abundance of the glycoforms contained at one N-linked site was determined from the deconvoluted mass spectrum over the chromatographic range containing the charge envelopes of the ions for the corresponding glycopeptides. The peptides determined for each glycosylation site and the corresponding glycan populations are presented in and .
Glycosylation sites and amino acid sequences of high mannose glycopeptides observed in LC-MS/MS analysis of a tryptic digest of reduced and alkylated E2.
Glycosylation sites and amino acid sequences of high mannose glycopeptides observed in LC-MS/MS analysis of a chymotryptic digest of reduced and alkylated E2.
To corroborate our interpretation, the experimental glycopeptide masses were submitted to GlycoMod, a software tool used for determination of glycosylation compositions from mass spectrometric data [33]. This program compares the experimental mass of a glycopeptide to a list of precompiled masses of possible monosaccharide compositions, taking into account the possible peptides containing the N-X-S/T/C motif. For example, the observed m/z of 1011.90 (corresponding to a quadruply charged ion) was assigned to the tryptic peptide 233–246, corresponding to the mass of 4043.66 Da, containing a carboxymethylated cysteine residue, with an attached high mannose glycan of the composition Hex2Man9. Multiple monosaccharide compositions, however, were proposed as possible matches within a mass tolerance of 0.1 Da, in addition to the assigned Hex2Man9 structure, which has been previously proposed by Duvet et al. [3]. In order to verify the glycan structure, the MS/MS data of the [M+4H]4+ ion of m/z 1011.90 were acquired. The fragment ions observed by MS/MS confirmed the assigned glycan composition Hex2Man9 and the identity of the tryptic peptide 233–246 containing the glycosylation site N241 (data not shown). A number of hexose residues larger than 9 indicates incomplete processing of the precursor glycan. As mass spectrometry using conventional CAD cannot distinguish between the isobaric monosaccharides mannose and glucose, the structures containing more than 9 mannose residues are indicated as Hex1-3Man9 ( and ). In the chymotryptic digest, the same site N241 was observed carrying at most Man9, and in both digests the most abundant ion was the one corresponding to the Man6 structure ( and ). As explained above, the differences in the number of mannose residues are caused by the slight differences between the E2 glycoprotein batches that were used. The tryptic glycopeptide 65–73, containing the glycosylation site N66 (N448), indicates the presence of high mannose species, with a maximum of nine mannose residues attached to the trimannosyl chitobiose core (Hex3Man9) and with Man6 as the most abundant glycan. In the chymotryptic digest, the peptide 66–74 containing the N66 site of glycosylation showed a maximum of Man9 attached to the site, with Man6 also being the most abundant ion.
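The composition matching described above can be approximated by a brute-force search. The sketch below is not GlycoMod's actual algorithm: it simply enumerates Hex, HexNAc and Fuc counts and keeps those whose summed residue mass matches the difference between the glycopeptide and peptide masses within a tolerance. The function name, the search limits and the 1500.00 Da peptide are hypothetical; the residue masses are standard monoisotopic values.

```python
from itertools import product

# Monoisotopic residue masses (Da) of the monosaccharide classes considered.
RESIDUE_MASS = {"Hex": 162.0528, "HexNAc": 203.0794, "Fuc": 146.0579}


def candidate_compositions(glycopeptide_mass, peptide_mass, tol=0.1,
                           max_hex=12, max_hexnac=6, max_fuc=2):
    """Return (Hex, HexNAc, Fuc) counts whose mass explains the glycan portion."""
    glycan_mass = glycopeptide_mass - peptide_mass
    hits = []
    for hexose, hexnac, fuc in product(range(max_hex + 1),
                                       range(max_hexnac + 1),
                                       range(max_fuc + 1)):
        mass = (hexose * RESIDUE_MASS["Hex"]
                + hexnac * RESIDUE_MASS["HexNAc"]
                + fuc * RESIDUE_MASS["Fuc"])
        if abs(mass - glycan_mass) <= tol:
            hits.append((hexose, hexnac, fuc))
    return hits


# Hypothetical input: a 1500.00 Da peptide carrying Man6GlcNAc2 (Hex6HexNAc2).
peptide = 1500.00
observed = peptide + 6 * RESIDUE_MASS["Hex"] + 2 * RESIDUE_MASS["HexNAc"]
print(candidate_compositions(observed, peptide))
```

Because more than one composition can fall inside the tolerance window when the search space is widened, a match of this kind is only a candidate, which is why the assignments here were confirmed by MS/MS.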
The MS and MS/MS spectra of the chymotryptic peptide 156–168 corresponding to the glycosylation site N158 (N540) are presented in Figure 3. The high mannose species observed in the deconvoluted mass spectrum (Figure 3A) were assigned to glycopeptide 156–168 carrying five to nine mannose units attached to the GlcNAc2 moiety (see ). The notation ManX indicates a high mannose glycan, with X the number of mannose residues attached to the chitobiose moiety (GlcNAc-GlcNAc). Among these, Man6 is the most abundant, followed by Man5 and Man7. The identity of the glycopeptide containing the Man6 glycan, corresponding to the deconvoluted mass 2905.26 Da, was determined from the CAD data of the doubly charged precursor of m/z 1453.78 (). The precursor ion represents the base peak in the MS/MS spectrum, and the observed fragment ions originate from the successive neutral loss of hexoses from the non-reducing end of the oligosaccharide. The glycan was fragmented down to one GlcNAc moiety attached to the peptide backbone, which was observed as both doubly (m/z 866.28) and singly charged fragment ions (m/z 1731.56) in the spectrum. Another fragmentation pathway of the precursor led to formation of singly protonated sugar ions, which are characteristic of high mannose glycans. A single backbone fragment ion observed at m/z 830.42 (y7), which resulted from the cleavage of the amide bond between Arg161 and Pro162, confirmed the identity of the peptide as 156–168.
Figure 3 (A) Deconvoluted mass spectrum over the mass range 2000–4000, showing the glycopeptides with the amino acid sequence 156–168 and high mannose glycans with composition from Man5 to Man9. The masses indicated by an empty diamond could not (more ...)
The microheterogeneity of glycosylation site N174 (N556) observed in glycopeptide 173–178 is consistent with a high mannose type glycan (Man3-9) and is presented in . The masses in the deconvoluted spectrum were assigned to glycopeptide 173–178 containing glycans with a length ranging from Man3 to Man9. The Man5 and Man6 are the most abundant populations, followed by Man4, Man7, Man8 and Man9, which are almost equally represented. The tandem mass spectrum of the doubly charged ion of m/z 1017.85, corresponding to glycopeptide 173–178 which contains a Man6 glycan is presented in . This precursor ion also fragmented to the intact peptide backbone (m/z 859.66) with a GlcNAc moiety attached at site N174. Interestingly, the protonated peptide ion of m/z 656.66 resulted from the complete loss of the oligosaccharide, by glycosidic bond cleavage between the sugar and the Asn side chain. Unlike the other MS/MS data, a large number of specific backbone cleavages were observed (b3, b4, b5 and y4) and the peptide backbone could be almost completely assigned based on this spectrum (). In addition, backbone fragments still containing one or two GlcNAc units were abundant. This fragmentation pattern and the data discussed above suggest that the presence of backbone fragmentation may be dependent on the amino acid sequence and length of the peptide containing the sugar, on the position of the glycosylation site within the peptide backbone, or on a combination of these factors.
Figure 4 ESI-MS of chymotryptic peptide 173–178: (A) Deconvoluted mass spectrum over the mass range 1500–3000, showing the glycopeptides with the amino acid sequence 173–178 containing high mannose glycans with composition from Man3 to (more ...)
For the tryptic peptide 181–206 containing the glycosylation site N194 (N576), a maximum of Man9 was observed, with Man8 being the most abundant ion. In the chymotryptic digest, the peptide 191–204 containing the same site of glycosylation was seen to have the same number of mannose residues attached to this N-linked site (data not shown).
The deconvoluted mass spectrum of the region of the chromatogram where glycoforms of the peptide 251–264 (N645) appeared is presented in . These data indicate the presence of high mannose glycans with a minimum Man5 and maximum Man9 composition. Man6 is the most abundant among these species. The tandem mass spectrum of the triply charged ion of m/z 1135.77 () is consistent with the proposed glycan composition (Man9) for the peptide 251–264. The major fragmentation pathway arises from successive cleavage of the monosaccharides from the non-reducing end of the glycan. The composition of the glycan was deduced from the mass difference between these fragments. The doubly charged molecular peptide ion of m/z 770.83 is less abundant, while the ion of m/z 872.37, assigned to the peptide 251–264 with a single GlcNAc moiety attached at N263, represents the most abundant ion (). This N-linked site showed slight differences in the number of attached mannoses, where up to Hex3Man9 were identified for N263 in the tryptic peptide 258-275, compared to Man9 that were observed in the chymotryptic peptide 251-264. This high number of hexoses suggests incomplete processing of the precursor glycan. As discussed above, these results are due to the differences in the protein batches that were used ( and ).
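The m/z values quoted for peptide 251–264 can be cross-checked with simple charge-state arithmetic. The sketch below assumes protonated ions and standard monoisotopic residue masses; the helper names are hypothetical. Starting from the doubly charged peptide ion at m/z 770.83, it predicts the m/z of the peptide carrying one GlcNAc (2+) and of the Man9GlcNAc2 glycopeptide (3+).

```python
PROTON = 1.00728   # mass of a proton (Da)
HEX = 162.0528     # hexose residue (Da)
HEXNAC = 203.0794  # N-acetylhexosamine residue (Da)


def neutral_mass(mz, z):
    """Neutral mass of an [M+zH]z+ ion observed at the given m/z."""
    return z * (mz - PROTON)


def mz_from_mass(mass, z):
    """m/z of the [M+zH]z+ ion of a species with the given neutral mass."""
    return mass / z + PROTON


# Peptide 251-264, from the doubly charged ion reported at m/z 770.83.
peptide = neutral_mass(770.83, 2)

# Peptide with a single GlcNAc attached, expected as a doubly charged ion.
peptide_glcnac_mz = mz_from_mass(peptide + HEXNAC, 2)

# Peptide with Man9GlcNAc2 attached, expected as a triply charged ion.
man9_mz = mz_from_mass(peptide + 9 * HEX + 2 * HEXNAC, 3)

print(round(peptide_glcnac_mz, 2), round(man9_mz, 2))
```

Under these assumptions the predicted values agree with the reported ions at m/z 872.37 and m/z 1135.77 to within 0.01.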
Figure 5 (A) Deconvoluted mass spectrum over the mass range 2500–4000, showing the glycopeptides with the amino acid sequence 251–264 containing high mannose glycans with composition from Man5 to Man9. (B) MS/MS of the precursor ion m/z 1135.77 (more ...)
The deconvoluted mass spectrum of the region of the chromatogram where glycopeptides 48–58 and 22–36, and their glycoforms containing the N-linked sites N48 and N35, respectively, elute is shown in . For the site N35, a maximum of Man9 was observed, with Man4, Man5 and Man6 being the most abundant species. The deconvoluted masses were assigned to each peptide containing different glycan structures. The mass spectrum was deconvoluted over the entire range containing the charge envelopes of multiply charged ions for the corresponding glycopeptides. The identity of these species was confirmed by MS/MS of the corresponding multiply charged ions (data not shown). Small amounts of the species Man1 and Man2 were observed in the spectrum, indicating that, to some extent, the glycopeptides underwent in-source decomposition. For the peptide 48–58, complex type glycans were exclusively observed at site N48 (N430). In the deconvoluted mass spectrum (), ions for these glycoforms with masses at m/z 2198.90, 2401.98 and 2605.08 were observed, and they are separated by a mass interval of 203 amu. This mass corresponds to the sugar N-acetylglucosamine, therefore indicating the presence of zero, one or two terminal GlcNAc residues. This was assigned as N-acetylglucosamine because this monosaccharide is expected to elongate the Man3 core structure in complex N-linked glycans. The identity of the peptide observed with this sugar moiety and the composition of the glycan were determined from the MS/MS of the doubly charged precursor of m/z 1201.95 (). Singly charged fragment ions separated by 146 amu clearly indicate the presence of fucose (Fuc) attached at the first GlcNAc residue of the Man3GlcNAc2 core. The singly charged ion of m/z 1510.68 corresponds in mass to amino acids 48–58 plus a GlcNAcFuc moiety. The ion of m/z 1364.68 corresponds in mass to residues 48–58 with a single GlcNAc. Two series of fragment ions were observed in the MS/MS spectrum (): one series is composed of doubly charged fragment ions that result from the successive neutral loss of monosaccharides from the non-reducing end. The other series of ions is formed by charge reduction of the precursor ion. These data are consistent with the glycan structure GlcNAcFuc – Man3GlcNAc2 at the site N48. The deconvoluted masses of m/z 2198.90 and 2605.08 correspond in mass to residues 48–58 plus the complex type glycans Fuc – Man3GlcNAc2 and GlcNAc2Fuc – Man3GlcNAc2, respectively (see and ). Interestingly, these glycopeptides could result from anomalous chymotrypsin activity, which cleaved after glycine residues in both cases instead of residues specific for this enzyme. However, in the absence of backbone fragmentation, the observed mass of the peptide by itself cannot rule out the possibility that cleavage at other residues occurred; the peptide QHKFNSSGCP, bearing the site N66 (N448) and with the cysteine alkylated, has a mass (1160.50) similar to peptide 48–58 (1160.55) and thus could also represent a plausible candidate. Cleavage after proline, however, is very uncommon for all serine proteases, irrespective of their specificity, and this, together with the observed mass error, argues against (but does not disprove) assignment of the peptide as QHKFNSSGCP. There is also the possibility that a peptide modification or protein mutation occurred, as E2 is a viral recombinant protein, and that, instead of cleavage after glycine, this ion may arise from an unidentified amino acid sequence that contains multiple mutations/modifications. No evidence, however, for the presence of mutations was observed in the rest of our data. Therefore, the most likely origin of this ion is a peptide formed by a non-specific cleavage occurring after glycine, as chymotrypsin is less specific than trypsin. Because complex type N-glycans have a well defined sugar composition, the peptide containing this type of glycan can be identified from the MS/MS data (i.e. the fragment containing a single GlcNAc moiety). The backbone fragment with the amino acid sequence 48–58 was the only peptide that matched the mass of these glycopeptide fragment ions. Usually chymotrypsin cleaves the protein backbone after aromatic amino acids at higher rates than after other amino acids, but under different circumstances (glycan type, conformation of the protein) it may also cleave after other amino acids [44]. These data show that the presence of the glycan moieties attached to E2 can alter the specificity of chymotrypsin. The formation of glycopeptides from non-specific proteolytic cleavages complicates the assignment of the N-linked sites and of the corresponding glycan structures using MS data alone.
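The assignment of the three complex-type glycoforms of peptide 48–58 can be reproduced with a short forward calculation. The sketch below assumes standard monoisotopic residue masses and takes the peptide mass (1160.55) and the observed deconvoluted masses from the text; the function name and the tolerance used for comparison are hypothetical.

```python
HEX = 162.0528     # hexose residue (Da)
HEXNAC = 203.0794  # N-acetylhexosamine residue (Da)
FUC = 146.0579     # fucose (deoxyhexose) residue (Da)

PEPTIDE_48_58 = 1160.55  # peptide mass quoted in the text (Da)
OBSERVED = [2198.90, 2401.98, 2605.08]  # deconvoluted masses quoted in the text


def glycopeptide_mass(peptide, n_hex, n_hexnac, n_fuc):
    """Neutral mass of a peptide plus the residue masses of its glycan."""
    return peptide + n_hex * HEX + n_hexnac * HEXNAC + n_fuc * FUC


# Fucosylated Man3GlcNAc2 core carrying zero, one or two additional GlcNAc residues.
calculated = [glycopeptide_mass(PEPTIDE_48_58, 3, 2 + extra, 1) for extra in range(3)]

for extra, (calc, obs) in enumerate(zip(calculated, OBSERVED)):
    print(f"{extra} extra GlcNAc: calculated {calc:.2f}, observed {obs:.2f}")
```

With these inputs the calculated masses fall within about 0.03 Da of the observed values, consistent with zero, one or two GlcNAc residues on a core-fucosylated Man3GlcNAc2 structure.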
Figure 6 (A) Deconvoluted mass spectrum over the mass range 2000–4000, illustrating the microheterogeneity of the glycopeptides 48–58 and 22–36. The consensus sequences for N-glycosylation are underlined. For the clarity of the figure, (more ...)
Glycosylation sites and amino acid sequences of complex-type glycopeptides observed in LC-MS/MS analysis of a chymotryptic digest of reduced and alkylated E2.
In addition to N48, complex type glycans were determined for the site N41 (N423), observed in peptide 39–45 (). The sugar composition of the complex type glycans was determined from the MS/MS data of the glycopeptides observed as doubly charged ions of m/z 926.38 (1849.78 Da), assigned as GlcNAc – Man3GlcNAc2, and m/z 999.39 (1995.83 Da), assigned as GlcNAcFuc – Man3GlcNAc2 (data not shown). High mannose type oligosaccharides ranging from Man3 to Man6 were also observed at the site N41 (peptide 39–45, ), and the corresponding glycopeptides eluted slightly later than those carrying the complex type glycans. Interestingly, for both sites containing complex type glycans, the Man3GlcNAc-Fuc species is the most abundant ion. Judging from the sugar composition, these are not mature complex type glycans. This is the first time that complex type glycans have been positively identified on the E2 glycoprotein. Although it has not been reported that these complex glycans transit through the medial Golgi, it is possible that they are not retained long enough to develop into mature structures, and are thereby immediately translocated into the ER compartment.
, and summarize the glycosylation sites identified after tryptic and chymotryptic digestions, indicating the position of glycosylation and the observed glycopeptides, as well as the glycopeptide neutral masses. Using GlycoMod software, the E2 glycopeptides resulting from both the chymotryptic and the tryptic digests were identified with a mass accuracy of 0.1 Da. Multiple monosaccharide compositions were proposed as possible matches in some instances, and the correct glycan structure was determined from MS/MS analyses. The identification, however, was unreliable when nonspecific chymotryptic cleavages occurred, so that manual interpretation of the MS/MS data was mandatory in order to ascertain the correct amino acid sequence of the glycopeptides and the composition of the attached glycans. As presented in , short glycopeptides (e.g. 32–38 or 173–178) were observed eluting from the C18 column. From our experiments it does not appear that short glycopeptides were lost during the trapping process; however, this possibility cannot be absolutely excluded.
The envelope proteins play a major role in the life cycle of a virus. Envelope proteins are known to be involved in viral entry into the cell by binding to a receptor present on the host cell and inducing fusion between the viral envelope and the membrane of the host cell [16]. The E2 protein is heavily glycosylated, with 11 potential sites of glycosylation. Nine of these 11 sites are highly conserved, suggesting that the glycosylation may play an essential role in some biological functions or in the conformation of the glycoprotein [15]. In the early secretory pathway, the glycans play a role in protein folding and in certain sorting events [17]. It is known that during glycosylation of a protein, a precursor oligosaccharide composed of GlcNAc, Man and glucose (Glc), with the composition Glc3Man9GlcNAc2, is transferred to nascent proteins in the ER in a co-translational event [17]. The diversity of these N-linked oligosaccharide structures on mature glycoproteins arises from major modification of this precursor structure, which occurs posttranslationally. While still in the ER, the glucose residues are quickly removed from the oligosaccharides of most glycoproteins. This process continues in the Golgi apparatus, and, thus, glucose is not observed on the mature glycoprotein [3]. From our data, peptides with levels of mannosylation higher than Man9 were detected, indicating incomplete processing. Incomplete processing of glycans might be a function of protein processing while still in the ER compartment, with a high level of mannosylation being present in HCV in its natural setting. Incomplete processing in which the hexose residues in excess of the expected Man9 maximum are glucoses could also be explained by the folding of the protein, which might alter the accessibility of these glycans to processing by glucosidases and mannosidases. The hypothesis that steric hindrance interferes with glycan processing is consistent with our hypothesis that the glycans play a significant role in stabilization of protein tertiary structure. In the low energy CID experiments used for the structural characterization of E2 glycopeptides, one cannot differentiate between isobaric mannose and glucose structures; consequently, the glycans observed as having a number of hexoses higher than Man9 are depicted as Hex1-3Man9.
Mass spectrometry has proven to have a tremendous ability to identify the type of glycans as well as the sites of glycosylation on a protein, and thus to characterize a glycoprotein accurately. For example, we have used a combination of MALDI-TOF and LC followed by nanoLC/MS/MS in order to characterize the glycan structures attached to the human immunodeficiency virus (HIV) gp120 glycoprotein, and high mannose glycans and hybrid glycans were found attached to the protein [32]. Although LC-MS analysis of released glycans may provide a detailed picture of the structure of the glycans derived from a protein or any complex protein mixture, information on the original attachment sites of the glycans and the underlying proteins is lost. This critical information can be obtained either by LC-MS analysis of the peptides remaining after glycan release, based on Asn to Asp conversion, which cannot connect specific glycans to specific sites, or by the direct analysis of glycopeptides, which provides the connection between glycan type and location.
In the present paper the MS/MS spectra of high mannose oligosaccharides were readily differentiated from those of hybrid/complex type and provided immediate information about the glycans on E2. Large glycopeptides with a high sugar to peptide ratio are expected to be highly sensitive to in-source fragmentation by glycosidic bond cleavage and therefore need special analysis conditions. Furthermore, efficient formation of multiply charged ions of glycopeptides is crucial for their detection within the mass range of the instrument as well as for high resolution of the isotopic pattern. The major collision induced dissociation fragmentation pathway of peptides containing high mannose glycans can be characterized by the successive loss of the sugar moieties from the non-reducing end of the glycan, thereby generating a series of ions containing the peptide backbone and the remaining sugars attached at the reducing end [26]. In addition to the losses from the non-reducing end, tandem mass spectra of complex type glycans contain at least one additional series of ions that result from the initial loss of the α(1–6) linked fucose from the first GlcNAc residue of the trimannosyl chitobiose core. Although they are not usually observed in the CAD spectra of protonated glycopeptides, ions due to backbone cleavages along the amino acid chain, commonly with complete or partial loss of the glycan chain, have also been observed in the MS/MS spectra. These data allow for the identification of the peptide which contains the glycans, as well as the precise location of the glycan. The observation of amino acid backbone cleavages may depend on several factors: (i) the amino acid sequence; (ii) the number of amino acid residues contained in a glycopeptide; (iii) the position of the sugar chain within the glycopeptide; or (iv) a combination of these factors.
To successfully investigate the glycans and their role in the structure and function of the E2 protein, a structure of the glycoprotein or, at a minimum, a working model of the glycoprotein is required. To date there is no crystal structure available for the E2 protein. Therefore, the homology model of E2 was based on the envelope glycoprotein E of tick-borne encephalitis virus (TBEV). Because of that, it would be reasonable to assume that its physical relationship to the viral membrane would also be similar. Previous studies showed that the Flavivirus envelope glycoprotein E from TBEV shows functional similarity to E2 [36], and these proteins are similar from the point of view of the parameters in these fold recognition structures [35]. Moreover, the organization of E2 into multiple antigenic domains has similarities to the large envelope glycoprotein E of TBEV and also to the envelope protein E1 from Semliki Forest virus, an alphavirus [46] having similar structural and functional properties [47]. Thus, in order to map the location of the glycans on a model structure, the TBEV structure (1SVB) was selected as a good candidate for a homology model of E2 [36]. These viruses undergo structural rearrangements in low pH environments, which hypothetically lead to the exposure of initially buried hydrophobic residues, and this process is believed to be part of an endocytosis entry pathway [47]. Therefore, a published homology model for the E2 protein based on the TBEV structure was employed as a working model [35].
The location of the glycans on E2 was thus mapped to the homology model and is presented in . The two complex-type glycans that were newly identified by mass spectrometry are represented in green. The N41 (N423) site, previously described as buried in this model [35] rather than surface exposed, is located in a region rich in β-sheet structural elements. By comparison to N41, N48 (N430) is mainly surface exposed and located on the opposite side of the molecule, being nearly parallel to the location of N41. Based on a hydrophobicity plot, Yagnik et al. predicted that the region between amino acids 35–55 (418–438) (thus encompassing the two complex type glycans) is hydrophobic and mainly surrounded by β-sheet structures [35]. The distance between the two glycosylation sites in this model is 19 Å. This region of the protein, 35–55, has not previously been characterized as having a specific biological function, but considering the high β-sheet content, one might speculate that it is involved in the folding process of the E2 protein, possibly implicating the complex type glycans located in its vicinity, which may have an impact on the 3-D structure and folding of the protein. It has been previously reported that modification of the complex type glycans by sialylation could affect the potential biological activity of the virus, possibly by reducing the infectivity of the primate lentivirus [48]. Furthermore, in the case of gp120 it was reported that the removal of the fucose associated with sialylated glycans changed the protein conformation, most likely by exposing epitopes that were previously buried [49]. It has been demonstrated that the N-linked glycans of the HIV envelope glycoprotein limit its immunogenicity and restrict binding of certain antibodies to their epitopes on the virion surface [50].
Figure 7 HCV E2 homology model based on the Tick Borne Encephalitis Virus envelope glycoprotein E (TBEV) structure (PDB id: 1SVB): (A) Cartoon representation of E2 glycoprotein showing the location of the glycans attached to the protein. The nine sites of glycosylation (more ...)
Viral envelope proteins usually contain N-linked glycans, which can play a major role in their folding, entry functions or in modulating the immune response [50]. Previous studies indicated that mutation of some glycosylation sites in the HCV envelope glycoproteins can reduce or abolish the infectivity of HCV pseudoparticles (HCVpp) without affecting incorporation of the glycoproteins into the particles [16], possibly by changes in the local conformation of the antibody recognition sites. Moreover, E2 N-linked glycans at positions N41 (N423) and N66 (N448) were reported as high mannose and have been shown to be essential for the entry functions of HCV envelope glycoproteins [16]. Interestingly, N41 (N423) is one of the glycan sites that was identified here as containing both complex type glycans and high mannose type glycans, indicating the microheterogeneity of this specific site. Based on these observations, one might predict that the presence of this glycan at the N41 (N423) site is essential for proper folding of the protein as well as for antibody recognition on E2. In addition, a recent study showed that the loss of glycosylation at the N41 (N423) site leads to noncovalent heterodimer formation as well as CD81 binding, indicating that the removal of a large sugar moiety leads to better exposure of the CD81 binding site [52]. Furthermore, the authors of that study observed a reduction in HCVpp infectivity upon glycan removal. Glycosylations at sites N35 (N417), N94 (N476), N150 (N532) and N263 (N645) have also been shown to modulate HCVpp entry. Furthermore, N174 (N556) and N241 (N623) were indicated to have a direct effect on protein folding [16]. The presence of a large, polar oligosaccharide is indeed known to affect protein folding by orienting polypeptide segments toward the surfaces of the protein domains [17]. Moreover, to establish the natural properties of the virus, lectin-binding assays were performed by Sato et al. in order to characterize the glycan moiety on the surface of HCV particles recovered from sera of infected patients [34]. This study suggested that the envelope glycoproteins E1 and E2 of HCV might contain complex type glycans, and their results also indicated that the N-linked glycans are present on the surface of native virions of HCV [34]. These authors also postulated that the selectivity of HCV and hepatitis B virus (HBV) in binding different lectins is related to the nature of the carbohydrate structures on the virion surface. Using different types of lectins and based on their binding efficiency, it was concluded that these viruses would contain complex type sugar chains and that the sugar moieties present on HCV virions are very similar to those for HBV. However, no further evidence for the presence and location of complex type glycans on E2 was presented. Although no possible biological function of these complex type glycans was reported, we hypothesize that there may be a correlation between the types of N-linked glycans and the structure and function of the E2 protein, but this hypothesis remains to be investigated in a future study.