Crystal structures of the AAV-6 capsid at 3 Å reveal a subunit fold homologous to other parvoviruses with greatest differences in two external loops. The electrostatic potential suggests that receptor-attachment is mediated by four residues: Arg576, Lys493, Lys459 and Lys531, defining a positively charged region curving up from the valley between adjacent spikes. It overlaps only partially with the receptor-binding site of AAV-2, and the residues endowing the electrostatic character are not homologous. Mutational substitution of each residue decreases heparin affinity, particularly Lys531 and Lys459. Neither is conserved among heparin-binding serotypes, indicating that diverse modes of receptor attachment have been selected in different serotypes. Surface topology and charge are also distinct at the shoulder of the spike, where linear epitopes for AAV-2’s neutralizing monoclonal antibody A20 come together. Evolutionarily, selection of changed side-chain charge may have offered a conservative means to evade immune neutralization while preserving other essential functionality.
Adeno-associated viruses (AAVs) are 4.7-kb single-stranded DNA viruses that depend on helper viruses such as adenovirus for replication (Berns, 1996). There are twelve distinct human serotypes with AAV-6 as a minor variant of AAV-1 (Gao et al., 2004; Rutledge, Halbert, and Russell, 1998; Schmidt et al., 2008). AAVs are small viruses that are widespread in the human population, but non-pathogenic. Many of the serotypes have the potential to be developed as transducing vectors for gene therapy (Halbert, Allen, and Miller, 2001; Rutledge, Halbert, and Russell, 1998) and also as vaccine vectors for genetic immunization (Carter, 2005; Dudek and Knipe, 2006; Manning et al., 1997; Xin et al., 2001). In constructs designed for such therapies, the recombinant AAV (rAAV) is a non-replicative form in which the replication and capsid genes are replaced with the foreign genes to be delivered (Carter, 2006a; Carter, 2006b). To date, vectors in trials have been mostly derived from AAV-2. However, challenges remain with AAV-2 vectors, including varying transduction efficiencies in different cell types, and inefficient in vivo transduction following prior exposure to AAV (Gao et al., 2005; Halbert et al., 1997; Halbert et al., 1998; Louboutin, Wang, and Wilson, 2005; Su et al., 2008; Wang et al., 2005; Wu, Asokan, and Samulski, 2006). Through improved structural biology of AAV (Xie et al., 2002), it may be possible to engineer the surface of the capsid for improved vector transduction, perhaps by taking advantage of the diversity among naturally occurring AAV serotypes and their distinctive characteristics.
Vectors based on other serotypes could prove especially useful for transducing cells that are resistant to AAV-2 infection (Halbert, Allen, and Miller, 2001). The primary receptor for AAV-2 on the cell surface is heparan sulfate proteoglycan (HSPG) (Summerford and Samulski, 1998), as it is for AAV-3 (~87% identity) albeit with reduced binding affinity (Rabinowitz et al., 2002). AAV-1 (which is ~83% identical to AAV-2), like AAV-4 & AAV-5 (~58% identical to AAV-2), uses sialic acid-mediated cell attachment and has low affinity for heparin (Wu et al., 2006a). AAV-6, like AAV-1, uses sialic acid in attachment (Wu et al., 2006b). However, unlike AAV-1, from which the capsid differs at a handful of sites, AAV-6 also binds heparin (Wu et al., 2006a).
Diverse serotypes are also being investigated as possible vectors, due to their potential to evade host immune responses directed against AAV-2 (Kuck, Kern, and Kleinschmidt, 2007). Infection by wild type AAV results in the production of neutralizing antibodies. 50–80% of adults have neutralizing antibodies to AAV, predominantly against the AAV-2 serotype (Blacklow, Hoggan, and Rowe, 1968; Parks et al., 1970). Neutralizing antibodies to AAV-2 are the most prevalent in all regions of the world, followed by antibodies to AAV-1 (Calcedo et al., 2009). The presence of high-titer neutralizing antibodies is expected to decrease transduction rates severely upon (repeated) vector treatments for some modes of administration. Thus, additional transduction was not measurable after re-administration of AAV-2 vectors in animal models (Halbert et al., 1997). Russell’s group has shown that vectors derived from AAV-3 and AAV-6 differ from AAV-2 vectors not only in host cell tropism, but also in serological reactivity, eliciting distinct humoral responses (Rutledge, Halbert, and Russell, 1998). Thus, the different AAV serotypes are being investigated both for their potential to transduce otherwise refractory cells, and to deliver genes in patients presenting a neutralizing immune response to some of the serotypes.
Many of the relevant viral-host properties are mediated at the level of the capsid. Structural studies were initiated both to better understand the basic biology of AAV serotypes and as a resource for the future development of vectors based on the various serotypes. Here, we report the crystallographic structure of serotype AAV-6 and its implications for receptor binding, tested by site-directed mutagenesis.
Structures were determined at 3 Å resolution from both rhombohedral and orthorhombic crystal forms. The rhombohedral form with its high quality diffraction data (Rmerge = 0.08; Table 1) and packing-constrained translation function yielded the better results on independent phase extensions from 8 to 3 Å. The rhombohedral structure provided an improved starting point for refinement of the orthorhombic phases, which, after refinement using 60-fold non-crystallographic symmetry (NCS), yielded an excellent map (Figures 1 & S1). The structures differ by 0.8 Å RMSD (0.5 Å Cα), which is less than the cross-validated maximum likelihood estimated coordinate errors of 1.0 Å (Table 1). The structures have similar R/Rfree of 0.273/0.286 (rhombohedral) and 0.251/0.286 (orthorhombic) when refined at 3.0 Å. Full statistics are provided in Table 1. Unless otherwise stated, figures have been prepared using the rhombohedral structure.
Interpretable density was present for all but 19 of the 534 amino acids present in each VP3 subunit (Figure 1). With differential splicing, AAV expresses three variants of the capsid protein, VP1, VP2 and VP3, which were confirmed by gel chromatography to be present in the expected 1:1:8 ratio (Johnson, 1984; Xie et al., 2004; Xie et al., 2008). Resolved in the crystal structure are the parts of the capsid protein that are common to VP1, 2 & 3 and present in all 60 capsid subunits. Likewise, the genomic DNA does not adhere to the icosahedral symmetry and is not seen in most virus structures. Even though it is VP3 that is resolved, we follow the conventional VP1 numbering with the model starting at residue 221.
The largest differences (2.8 Å backbone rms) between our two crystal forms occur at residues 586–590 on the outer surface. The backbone density for the rhombohedral form is the least ordered of either structure, suggestive of multiple conformations. There is some discontinuous density suggesting that the conformation clearly seen in the orthorhombic form (and other serotypes) is present, but stronger discontinuous density follows the alternative path modeled as the predominant conformation for the rhombohedral form. Only one of the 20 NCS-related subunits forms an inter-particle contact in this region, so crystal packing is not the cause of the disorder – it might reflect inherent flexibility.
Subsequent to our determination, a structure was reported for an AAV-6 virus like particle (VLP), lacking genomic DNA and VP1 (Ng et al., 2010). The fold and overall structure are similar: the Cα RMS difference between the empty VLP particles and our infectious virions is 0.7 & 0.5 Å for the rhombohedral and orthorhombic forms respectively. A recent cryo-EM study of AAV-1 (a minor variant of AAV-6) at 10 Å resolution reported capsid changes, mostly on the inner surface, dependent on the DNA content of particles (Gerlach, Kleinschmidt, and Bottcher, 2011). However, comparison at 3 Å resolution of our full AAV-6 particles and the empty VLPs shows no evidence of changes near the DNA of a magnitude detectable at 10 Å. The largest differences between the infectious virion structures and the VLP are at selected points on the outer surface. Where disorder was noted (previous paragraph) for residues 586–590 in the rhombohedral structure, it differs from the VLP by 2.9 Å (Cα RMSD), but the orthorhombic and VLP structures are similar (0.4 Å Cα RMSD). The largest differences are at 453–456, the tip of another external loop, where density is understandably weaker than average, but supports similar rhombohedral and orthorhombic conformations (0.7 Å Cα RMSD) over a different conformation reported for the VLP (2.9 Å Cα RMSD). Overall, however, the Cα differences of 0.5–0.7 Å between infectious virions and VLP are less than expected from the overall error estimates. That said, all-atom RMSDs (whole subunit, side chains included) between our capsids and the VLP are substantially larger (3.0 Å) than the difference between our two structures (0.8 Å). Flipping of pseudo-symmetrical side chains contributes to this, but larger contributions come from a small number of side chains pointing in different directions, particularly at the sites (above) where the backbone differs.
Differences in side chain conformation also reflect technical challenges that are common at the 3 Å resolution available for the three structures.
Precautions taken during the structure determination, together with multiple cross-validations, support the structures reported here: (1) Diffraction data are of high quality for large complexes (Rmerge = 0.08 & 0.11; Table 1); (2) Bias from molecular replacement was avoided through model-independent phase extension from 8 to 3 Å using only the icosahedral symmetry; (3) Rhombohedral form test and working reflection sets were selected to maintain their independence in spite of the icosahedral symmetry (Fabiola, Korostelev, and Chapman, 2006), supporting full cross-validation and checks against over-fitting; (4) The structures of the two crystal forms of infectious particles were initially determined independently to check that refinements converged to the same structure.
Of greatest functional interest are regions where the sequence differs between the AAV serotypes. They are predominantly surface-exposed, but the backbone and much of the side-chain structure were at least partially ordered and fully traceable, even in the initial 3 Å rhombohedral map, due to the beneficial impact of 20-fold non-crystallographic averaging. Full density was available for the four surface residues characteristic of AAV-6 (see below) when the rhombohedral structure was used as the starting point for improved phase refinement for the orthorhombic form. To avoid bias, these four residues, together with 580–595, were omitted from the phasing model (which had not been refined against the orthorhombic data). Lack of perceptible difference between omit and non-omit maps demonstrates that the 60-fold NCS of the orthorhombic form is sufficient to yield a reliable and unbiased map. Orthorhombic density, consistent with both structures, offers a local real-space cross-validation for the residues of greatest interest: Lys459, Lys493, Lys531 and Arg576 (Figures 1 & S1).
Each AAV-6 subunit shares the jelly-roll β-barrel subunit fold of other parvoviruses (Figure 1b) (Tsao et al., 1991) and many other viral capsids (Chapman and Liljas, 2003). The structures of 4 other AAVs are available for comparison: AAV-2 (Xie et al., 2002; PDB id 1lp3), AAV-4 (Govindasamy et al., 2006; PDB id 2g8g), AAV-8 (Nam et al., 2007; PDB id 2qa0) and AAV-3B (Lerch, Xie, and Chapman, 2010; PDB id 3kic).
At the level of backbone, the structures are similar (Figure 1b), but there are local regions where the serotypes differ. These were highlighted through alignments of the structures starting with conserved secondary structures, then iterating joint sequence-structure alignment (Krissinel and Henrick, 2004). There is good overall agreement between AAV-6 and AAV-2, -3B & -8 (0.7 < Cα RMSD < 0.9 Å). For AAV-4, the RMSD is 1.6 Å. With AAV-4, the greatest differences are in the nine variable regions (VR I-IX) identified in comparisons of AAV-2 & -4 (Table S1) (Agbandje-McKenna and Chapman, 2006; Govindasamy et al., 2006).
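The Cα RMSD values quoted in these comparisons can be illustrated with a minimal sketch. It assumes the two structures have already been superimposed and that equivalent residues have been paired; it does not reproduce the joint sequence-structure alignment of Krissinel and Henrick (2004). The function name `ca_rmsd` and the toy coordinates are hypothetical.

```python
import math

def ca_rmsd(coords_a, coords_b):
    """Root-mean-square deviation between paired C-alpha positions.

    Assumes both lists hold (x, y, z) tuples in angstroms, are already
    superimposed, and are ordered so that index i in each list refers
    to equivalent residues.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("need two non-empty coordinate lists of equal length")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))

# Toy example (not real capsid coordinates): three atoms, each moved 1 A along z.
reference = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0)]
displaced = [(0.0, 0.0, 1.0), (3.8, 0.0, 1.0), (7.6, 0.0, 1.0)]
toy_rmsd = ca_rmsd(reference, displaced)
print(f"toy C-alpha RMSD = {toy_rmsd:.2f} A")
```

Because every atom contributes its squared displacement, a few residues that move by several ångströms (as in VR-I) can dominate a local RMSD while barely changing the whole-subunit value.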
When AAV-6 is compared to AAV-2, -3B & -8, two of the nine regions show substantial differences. The first is at the tip of the loop between strands B and C (βBC, also known as VR-I). We can now see that VR-I is the site of greatest diversity in the serotype structures (Table S1), each being distinct. VR-I of AAV-6 contains residues 262–269, corresponding to part of the epitope in AAV-2 (262–268) of monoclonal antibody A20 (see next section). With the insertion of Thr265 in AAV-6 relative to AAV-2, the local RMS difference of 2.7 Å within VR-I is large and significant when compared to the overall coordinate error of 1.0 Å.
The second region where AAV-6 differs from its closest relatives is within loop βGH, the ~220 residue segment running between β strands G & H (Xie et al., 2002). Loop βGH contains 16 small β strands, labeled βGH1 through 16, and can be divided into three sub-loops (3a, 3b and 4). These sub-loops form many of the distinctive features on the viral surface (Chapman and Agbandje-McKenna, 2006; Xie et al., 2002), exposing several of the variable regions. It is sub-loop 3a that is most distinctive for AAV-6, in the GH2/3 β-ribbon and turn that constitute VR-IV (Govindasamy et al., 2006). AAV-6 has its own turn configuration. Loop 3a is one of the loops from a pair of neighboring subunits that come together to form the three-fold proximal spikes. In AAV-2, VR-IV forms a highly structured β-ribbon of 22 residues with 10 inter-strand hydrogen bonds and a tight 2-residue turn. In the other serotypes (3B, 6, 8 and 4), the ribbon extends out only half as far with the strands connected by a meandering turn that is different in each serotype. The average atomic B-factor for these 22 residues is lowest in AAV-2 at 27 Å2, modestly higher in AAV-4 & -8 (41 & 45 Å2 respectively), and raised 30 Å2 above average for both forms of AAV-6 & -3B, an indication of some disorder. The need to conform to different crystal packing contacts could be contributing to higher B-factors. For example, in the rhombohedral form, 6 of the 20 NCS-related subunits form (distinct) contacts that could result in local structural diversity which would be reflected in high B-factors for an NCS averaged structure. However, the likely role of packing contacts should not be over-stated as the rhombohedral and orthorhombic forms have very different packing interactions yet fundamentally similar structures and B-factor distributions. With the exception of AAV-4, which has a distinct structure, sequences in this 20-residue region are very similar among the serotypes. 
The diversity of structure in a region of conserved sequence suggests a loop built for flexibility or adaptability to different molecular interactions. VR-IV figures prominently in the 3-fold proximal spikes. With the addition of the AAV-6 structure, we see that AAV-2 is the outlier with a long GH2/3 β-ribbon forming a particularly pointed spike. Of the blunt-spike serotypes, two groups are emerging: AAV-4’s loop is folded over towards the three-fold axis, while the loops of AAV-3B, -6 & -8 point away leading to the most distinctive differences in the surface topology (Figure 2).
In the other seven variable regions (Govindasamy et al., 2006), AAV-6 differs significantly only from AAV-4. AAV-4 now emerges as the outlier with more subtle differences between the other serotypes. In fact, within most of the variable regions, the differences between AAV-6 and AAV-4 exceed twice those between AAV-6 and other serotypes. The structures have been determined with coordinate precisions of ~0.9 Å, so the expected error on distance measurements is ~1.2 Å. In VR-I and VR-IV (highlighted in the previous paragraph), the RMS Cα differences among non-AAV-4 serotypes are 1.6 to 3.4 Å, i.e. statistically significant. For the other seven variable regions, the variation is 0.4 to 1.7 Å (Cα RMSD), i.e. commensurate with the error. It is only when AAV-6 is compared to AAV-4 that the differences in these seven regions are significant (Cα RMSDs of 1.7 to 4.7 Å). In summary, with the addition of AAV-6, it becomes clear that the human AAV serotypes differ in backbone primarily at VR-I and VR-IV, but that AAV-4 is an outlier differing in backbone at seven other locations.
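The step from coordinate precision to distance error can be sketched as simple error propagation. This is an illustration under the assumption that the errors of the two structures are independent and combine in quadrature; with a precision of 0.9 Å per structure it gives about 1.3 Å, close to the ~1.2 Å quoted above. The function names are hypothetical.

```python
import math

def distance_error(sigma_a, sigma_b):
    """Expected error on the separation of two independently determined
    positions, assuming uncorrelated errors that add in quadrature."""
    return math.sqrt(sigma_a ** 2 + sigma_b ** 2)

coordinate_precision = 0.9  # angstroms, per structure (value from the text)
expected_error = distance_error(coordinate_precision, coordinate_precision)

def in_error_units(difference):
    """Express an observed C-alpha RMS difference as a multiple of the
    expected distance error."""
    return difference / expected_error

print(f"expected distance error = {expected_error:.2f} A")
# Range ends quoted in the text for the different comparisons.
for label, value in [("VR-I/VR-IV, non-AAV-4, low", 1.6),
                     ("VR-I/VR-IV, non-AAV-4, high", 3.4),
                     ("other regions, non-AAV-4, low", 0.4),
                     ("other regions, non-AAV-4, high", 1.7),
                     ("other regions, vs AAV-4, high", 4.7)]:
    print(f"{label}: {value} A = {in_error_units(value):.1f} error units")
```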
Several monoclonal antibodies (mAbs) reacting with AAV capsids have been described (Hermens et al., 1999; Weger et al., 1997; Wistuba et al., 1995; Wobus et al., 2000). Antibody A20 binds both AAV-2 and AAV-3. Antibodies C37B & C24 bind to AAV-2 capsids selectively, while antibody D3 shows broad reactivity with serotypes 1–6, 8 and 9. Five antibodies highly specific for assembled AAV-1/6, AAV-4 or AAV-5 were described recently (Kuck, Kern, and Kleinschmidt, 2007).
Of all these antibodies, mAb A20 is the best characterized, recognizing a conformational epitope specific to assembled particles of AAV-2 (Wobus et al., 2000). With the AAV-6 structure, we can examine features that distinguish it from AAV-2 & AAV-3 that are ~85% sequence identical, but bound by mAb A20. Several sites have previously been implicated as antigenic in AAV-2 (Figure 3; Table S2). Residue Ala266, herein designated “site 1”, was identified by scanning insertional mutagenesis (Wu et al., 2000). Site 2 (Gln263, Ser264, Ser384, Gln385 & Val708) was identified by individual site-directed mutations, as was site 3 (Glu548) (Lochrie et al., 2006). By peptide scanning, Wobus et al. (2000) identified three regions presumably containing antigenic residues: Arg566 to Gln575 that we designate as site 4, His271 to Gly280 as site 5 and Phe370 to Leu378 as site 6. Sites 1, 2 and 5 come close together in the 3D structures (Figure 3). Here, there are structural differences between A20-binding serotypes (AAV-2 & AAV-3) and non-binding AAV-6. Site 1 is surface-accessible in AAV-2 & -3, but buried in AAV-6. Within the otherwise sequence-conserved site 2, there is an insertion (Thr265) in AAV-6 and decreased accessibility of the loop compared to AAV-2 or AAV-3. Site 5 is mostly buried underneath site 2 and likely impacts antibody-binding only indirectly. Sites 4–6 have similar surface shape in all serotypes, but the electrostatics of sites 3 & 4 are distinctive. The surface potentials were calculated using a Poisson-Boltzmann continuum approach and reveal surprising overall diversity considering the ~85% sequence identity between AAV-6, -2 & -3B (Figure 3). A glutamate in AAV-2 renders site 3 negative in contrast to neutral or slightly positive potential in AAV-3B and -6 respectively (Figure 3). An additional basic residue renders site 4 positive in AAV-6 in contrast to the negative potential in both AAV-2 & -3B.
In summary, A20 binding determinants should include surface features conserved in AAV-2 and AAV-3B but different in non-binding AAV-6. The surface shape differs where sites 1 & 2 come together, and the electrostatic charge at site 4 also correlates with A20-binding. These sites are close enough to fall within a typical ~30 Å antibody footprint and lie ~30 Å from AAV-2’s HS attachment site, so it is plausible that A20 could inhibit AAV-2 cell entry by steric conflict with a receptor (Figure 1c). High diversity in electrostatics over much of the surface (considering the conserved sequence and structure) might be the result of the ease with which side chain charge could be modulated in immune escape variants without impacting viral assembly or other essential interactions.
AAV-6 has been shown to bind to heparin (Halbert, Allen, and Miller, 2001), an analog of the HSPG primary cellular receptor for several human AAVs (Summerford and Samulski, 1998). Like AAV-2, AAV-6 can be purified using heparin sepharose columns, but the affinity is weaker (Halbert, Allen, and Miller, 2001; Rabinowitz et al., 2002). Recently, the binding of heparin to the most positively charged region of the AAV-2 surface has been visualized directly through cryo-electron microscopy at 8 Å resolution (O’Donnell, Taylor, and Chapman, 2009). The binding site had been predicted from the electrostatic potential, calculated from the AAV-2 crystal structure (Xie et al., 2002) on the expectation that negatively charged HSPG would be bound at a positively charged region of the viral surface. Intriguingly, the AAV-2 arginines Arg585 and Arg588 that form the core of the binding site, and have been implicated genetically in receptor-binding (Wu et al., 2000), are unique to AAV-2 and not conserved in other heparin-binding serotypes. For insights into heparin-binding differences, the electrostatic potentials of AAV-6 and AAV-2 assembled capsids were compared following calculation by the Poisson-Boltzmann equation (Figure 4).
In AAV-6, relative to AAV-2, the most positively charged region is moved down the side of the spike towards the valley between neighboring spikes. An additional region of positive charge is found around the spike and closer to its tip (Figure 4). Several sequence differences are responsible for the changed electrostatics. In AAV-2, positive charge is concentrated around Arg585 & Arg588, where the heparin is most tightly bound in the cryo-EM structure (O’Donnell, Taylor, and Chapman, 2009). Arg585 & Arg588 are unique to AAV-2 among the 12 characterized serotypes. Absent these two arginines, the corresponding exact location in AAV-6 is less charged, but the surroundings remain predominantly positively charged (Figure 4) with seven basic residues contacting AAV-2’s heparin when it is superimposed on the AAV-6 structure (Table S3). Four of the seven residues are not conserved between AAV-6 and AAV-2, and might compensate for the absence of AAV-2’s Arg585 & Arg588. Close to the floor of the valley between spikes, positive charge is increased with the substitution of Lys531 & Arg576 for AAV-2’s Glu530 & Gln575. Positive charge is also gained near the tip of the spike on the side facing away from the 3-fold axis (Figure 4). This is due to the substitutions of AAV-6’s Lys459 for AAV-2’s Ser458 and Lys493 for Ser492. Alone or in combination, these AAV-6 residues could compensate for the absence of AAV-2’s Arg585 & Arg588 though not in exactly the same location.
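The effect of these four substitutions on net charge can be tallied with a minimal sketch. It uses idealized formal side-chain charges at neutral pH (Lys/Arg +1, Asp/Glu −1, all others 0), which is an assumption made here for illustration and is far cruder than the Poisson-Boltzmann calculation used for Figure 4. All names in the code are hypothetical.

```python
# Idealized formal side-chain charges at neutral pH (assumption for this
# sketch; histidine and all other residue types are treated as neutral).
FORMAL_CHARGE = {"Lys": 1, "Arg": 1, "Asp": -1, "Glu": -1}

# (AAV-2 residue, AAV-6 residue) pairs named in the text.
SUBSTITUTIONS = [
    ("Glu530", "Lys531"),
    ("Gln575", "Arg576"),
    ("Ser458", "Lys459"),
    ("Ser492", "Lys493"),
]

def formal_charge(residue):
    """Formal charge from the three-letter code at the start of e.g. 'Lys531'."""
    return FORMAL_CHARGE.get(residue[:3], 0)

def charge_change(aav2_residue, aav6_residue):
    """Change in formal charge on going from the AAV-2 to the AAV-6 residue."""
    return formal_charge(aav6_residue) - formal_charge(aav2_residue)

changes = {aav6: charge_change(aav2, aav6) for aav2, aav6 in SUBSTITUTIONS}
net_per_subunit = sum(changes.values())
net_per_capsid = 60 * net_per_subunit  # 60 subunits per icosahedral capsid

for site, delta in changes.items():
    print(f"{site}: {delta:+d}")
print(f"net change per subunit: {net_per_subunit:+d}")
print(f"net change per capsid:  {net_per_capsid:+d}")
```

The Glu530 to Lys531 change counts double in this tally because it replaces a negative side chain with a positive one, which is in line with Lys531 being among the more influential sites in the mutagenesis.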
Some of these basic residues are unique to AAV-6 and its close relative, AAV-1. Lys459 and Lys493, which form the positive patch close to the spike tip, are not conserved in any of the other serotypes. Lys493 is near the distal end of loop 3b which is sandwiched between loop 3a and 4 from a neighboring subunit. The Lys459 that is ~6 Å from Lys493 comes from loop 3a of the neighboring subunit. Lys493 is ~14 Å from the site corresponding to AAV-2’s Arg585 & Arg588, and the valley that is more positively charged in AAV-6, with Arg576 and Lys531, is ~13 Å in the opposite direction. Lys531, Arg576, Lys459 and Lys493 form a spiral of positive potential rising from the valley towards the top of the spike (Figure 4). Although AAV-2, -3 & -6 all have positively charged regions on the side of the spike, AAV-6’s overlaps only partly with those of its close relatives, and it is composed of distinctly different basic amino acids (Lerch, Xie, and Chapman, 2010; Xie et al., 2002). Sequence variability suggests that other serotypes will be at least as different in this region.
In spite of the differences in how the positively charged regions are constituted, it is plausible that the three serotypes bind HSPG in analogous ways. This can be illustrated by superimposing the experimentally-determined heparin location from AAV-2 onto the surface of AAV-6 (Figure 4) (O’Donnell, Taylor, and Chapman, 2009). In AAV-2, 23 residues make direct contact with the heparin density (contoured at 4.7 error units), with 13 more having through-solvent access (O’Donnell, Taylor, and Chapman, 2009). In AAV-6, 28 residues have direct contact with the superimposed AAV-2 heparin density, while 10 have through-solvent access (Table S3). Only a modest adjustment of AAV-2’s heparin down into the valley would be required to optimize interactions with a different cast of residues in AAV-6. Heparanoid polymers could still wind around the virus between symmetry-related binding sites, as proposed for AAV-2 (O’Donnell, Taylor, and Chapman, 2009). Indeed, the trail of positive charge spiraling up from the valley floor towards Lys493 and Lys459 high on the spike, suggests that the virus makes multiple interactions with cell surface carbohydrate.
AAV-6 and AAV-1 differ at only six amino acids in their capsid sequences. However, the serotypes differ in receptor-binding, with AAV-1 reported to lack heparin affinity (Wu et al., 2006a). Qualitative single-step elutions used in the earlier affinity chromatography may have exaggerated the effect (see below), but differences in heparin affinity are measurable, for which only a subset of the six sites can be responsible. Phe129 is in the VP1-unique region of unknown structure, present in only 13% of subunits. Asp418 and His642 are on the inner surface of the capsid, far-removed from receptor-binding. Lys531, Leu584 and Val598 are surface-exposed and near the proposed receptor-binding site (Table S3). Neither Leu584 nor Val598 is conserved and the residue types at these locations are not covariant with heparin-binding. AAV-6’s Lys531 is unique among all 12 serotypes, with a glutamate at the corresponding location in AAV-1, -2, -3B, -7, -8, -9, -10, and glycine, serine or alanine in the remaining serotypes. Site-specific mutations at the six sites implicated only Lys531 as impacting heparin binding (Wu et al., 2006a).
The combination of structure and prior mutagenesis suggested to us that AAV-6 receptor attachment is through a set of positively charged residues that are not homologous to the heparin-binding motif of AAV-2, but are in neighboring locations and serve an analogous function. This hypothesis was tested with substitution mutations at each position in the putative binding site where the sequence differs between AAV-6 and -2. Mutants K531E, R576Q, K493S and K459S all show reduced heparin-binding affinity (Figure 5). All of these mutants are infectious in HeLa cells, so their phenotypes appear to result from direct impact on binding interactions and not collateral damage.
K531E is one of two weakly-binding mutants, consistent with its earlier designation as an important determinant (Wu et al., 2006a). Through the use of lower ionic strength (56 mM) buffers, weak heparin-binding can be measured with elution at 140 ± 10 mM NaCl. This is in contrast to an earlier designation as non-binding from measurements in 123 mM buffers. To our knowledge, mutations at the other sites have not previously been characterized and they all have weakened binding, but to varying degrees. K459S has a similarly large impact upon binding as K531E, while R576Q and K493S have more modest impact. It is important to note that there is not a single amino acid that alone is critical, but there are several contributors to binding of which K531 is among the more influential.
Thus, heparin-attachment in AAV-6 results from interactions with several positively charged amino acids forming a binding site that overlaps with, but differs somewhat from, AAV-2’s. The site in AAV-6 is more similar to that of AAV-2 than is that of AAV-3B (Lerch, Xie, and Chapman, 2010). In AAV-3B, it was proposed that either the absence of AAV-2’s Glu499 (which neutralizes the conserved Arg447), or the presence of Arg594, unique to AAV-3B, compensated for the absence of AAV-2’s Arg585 and Arg588. In AAV-6, the compensating residues are closer to the binding site in AAV-2, lying on the side/base of the spike with direct or solvent-mediated access to the heparin as it is seen in AAV-2 (O’Donnell, Taylor, and Chapman, 2009). The one exception is Arg447 (AAV-3B/6) which is on the side of the spike facing away from the 3-fold, and out of contact with AAV-2’s heparin density. In AAV-2 its charge is neutralized by a salt-bridge partner (Glu499) unique to AAV-2, a serotype that binds heparin strongly. Thus, positive electrostatic potential near Arg447 appears less important.
In conclusion, heparin binding in AAV-6 involves a variation on the mode visualized in AAV-2 (O’Donnell, Taylor, and Chapman, 2009). Intriguingly, each serotype appears to be marshalling its own unique cast of basic amino acids for heparin-binding. This suggests that there are multiple ways that AAV can achieve adequate cell attachment. Evolutionary diversity may be driven by the selective pressure to mutate surface residues in response to immune surveillance. Tolerance of diversity at these loci may have allowed highly related serotypes to accumulate more sequence variability at this exposed functional site than would otherwise be expected. The diversity is also encouraging for gene therapy – there may be considerable freedom to engineer desirable traits into the capsid without ablating cell attachment.
Wild type AAV-6 was produced from an infectious plasmid clone (Rutledge, Halbert, and Russell, 1998) in human HeLa cells using a modification of high yield methods originally developed for AAV-2 (Xie et al., 2004). Details of the production, purification, crystallization and data collection/processing were reported previously (Xie et al., 2008). In summary, several data sets were collected from two pre-frozen crystals at the F1 beamline of the Cornell High Energy Synchrotron Source (CHESS), of which the best two are used here. One was processed with a primitive rhombohedral R3 unit-cell with parameters a = b = 258.4 Å, c = 613.0 Å in the hexagonal setting. The other crystal was processed in a primitive orthorhombic unit-cell with parameters a = 354.8, b = 363.9, c = 371.9 Å. The program suite HKL2000 was used for data processing (Otwinowski and Minor, 1997).
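As a consistency check on the two unit cells, the volume available per virus particle can be computed from the cell parameters. The sketch below assumes the standard geometry of an R-centered hexagonal cell (three lattice points, hence three times the primitive rhombohedral volume) and takes the particle counts from the structure-determination description: one particle per primitive rhombohedral cell and four per orthorhombic cell. The function names are hypothetical.

```python
import math

def hexagonal_cell_volume(a, c):
    """Volume of a hexagonal-setting cell (a = b, gamma = 120 degrees), in A^3."""
    return a * a * c * math.sin(math.radians(120.0))

def orthorhombic_cell_volume(a, b, c):
    """Volume of an orthorhombic cell (all angles 90 degrees), in A^3."""
    return a * b * c

# The hexagonal setting of a rhombohedral lattice holds 3 lattice points,
# so with 1 particle per primitive cell it contains 3 particles.
rhombohedral_per_particle = hexagonal_cell_volume(258.4, 613.0) / 3

# The orthorhombic cell holds 4 particles.
orthorhombic_per_particle = orthorhombic_cell_volume(354.8, 363.9, 371.9) / 4

ratio = orthorhombic_per_particle / rhombohedral_per_particle
print(f"rhombohedral: {rhombohedral_per_particle:.3e} A^3 per particle")
print(f"orthorhombic: {orthorhombic_per_particle:.3e} A^3 per particle")
print(f"ratio: {ratio:.3f}")
```

Both forms give roughly 1.2 × 10^7 Å^3 per particle, differing by under 2%, which is what one would expect if the same particle is packed with similar efficiency in the two lattices.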
Crystal structures of AAV-6 were determined for two crystal forms. Due to variability in the diffraction from individual crystals, structures were determined from the datasets of individual crystals without merging. The best datasets were 30–40% complete, not uncommon in virus crystallography, where the datasets are ~100-fold larger than for typical proteins, and data collection is limited by radiation damage even at cryogenic temperatures. Structure determination continued with these partial datasets, encouraged by precedents where viral symmetry had compensated for even sparser data (Badger et al., 1988). It was not immediately clear which 3 Å data set would yield the better structure. The rhombohedral data were of higher quality (Rmerge = 0.08; Table 1), the particle position was predetermined by the space group, and they ultimately yielded a higher quality phase extension.
Virus orientations were determined for both crystals through the self-rotation function, locked according to the icosahedral symmetry, using the program GLRF (Tong and Rossmann, 1997). There is only one particle in the primitive rhombohedral cell, so it can be positioned arbitrarily on the crystallographic 3-fold axis. There are 4 particles in the orthorhombic cell (1 per asymmetric unit). The virus position was determined by R-factor search over the asymmetric unit, starting at 12 Å resolution using the program Phenix (Adams et al., 2010). The position was refined by rigid-body refinement at 7.5–3.0 Å resolution using the program CNS (Brünger et al., 1998).
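The particle positions described here account for the 20-fold and 60-fold non-crystallographic symmetry used in averaging. As a minimal sketch, assuming that any icosahedral symmetry coinciding with crystallographic symmetry at the particle site no longer counts as non-crystallographic, the NCS multiplicity is 60 divided by the order of that site symmetry. The function name is hypothetical.

```python
ICOSAHEDRAL_ORDER = 60  # rotational symmetry operations of an icosahedral capsid

def ncs_fold(site_symmetry_order):
    """Non-crystallographic symmetry multiplicity for an icosahedral particle
    whose center sits on a crystallographic site of the given rotational order."""
    if site_symmetry_order < 1 or ICOSAHEDRAL_ORDER % site_symmetry_order != 0:
        raise ValueError("site symmetry order must divide 60")
    return ICOSAHEDRAL_ORDER // site_symmetry_order

# Rhombohedral form: particle on a crystallographic 3-fold axis.
rhombohedral_ncs = ncs_fold(3)
# Orthorhombic form: one whole particle per asymmetric unit (general position).
orthorhombic_ncs = ncs_fold(1)

print(f"rhombohedral NCS: {rhombohedral_ncs}-fold")
print(f"orthorhombic NCS: {orthorhombic_ncs}-fold")
```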
Phases were determined for the rhombohedral form at low (8 Å) resolution from the atomic model of AAV-2, then extended by 20-fold non-crystallographic symmetry averaging using RAVE and the CCP4 program suite (Collaborative Computational Project Number 4, 1994). The initial model was built using the program O (Jones et al., 1991), followed by alternate cycles of refinement in CNS and model re-building using updated (2Fo − Fρav, ϕρav) averaged maps, where the subscript ρav indicates (model-free) calculation by back-transform from the symmetry-averaged map. The final round of model building was performed using COOT (Emsley and Cowtan, 2004) with refinement completed in Phenix (Adams et al., 2010).
To monitor refinement of the rhombohedral form, a non-biased test set of 985 reflections was set aside using Rfree2005 (Fabiola, Korostelev, and Chapman, 2006). This method eliminates bias from non-crystallographic symmetry by selecting the test reflections as thin shells in resolution. Beyond the usual precautions, it then omits, from both the test and working sets, the reflections in neighboring resolution shells that would allow cross-talk through the interference function. From the 83,983 scaled reflections between 40 and 3.0 Å, 985 test reflections were selected, and 15,150 neighbors were omitted, leaving a working set of 68,833 reflections. The reported Rfree values of many other virus structures have not been calculated with the precautions needed to eliminate bias, and are little different from Rwork. The higher values of Rfree here are more in line with the usual properties of the statistic.
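The thin-shell idea described above can be illustrated with a minimal sketch. It is not the Rfree2005 implementation: the shell centers, shell half-width and guard-band width are hypothetical parameters chosen for illustration, resolution is expressed as s = 1/d, and the R-factor assumes the calculated amplitudes are already on the scale of the observed ones.

```python
def partition_reflections(s_values, shell_centers, half_width, guard_width):
    """Split reflections into test, omitted and working sets.

    s_values      : reciprocal resolution (1/d, in 1/Angstrom) per reflection
    shell_centers : centers of the thin test shells (hypothetical choice)
    half_width    : reflections this close to a center become test reflections
    guard_width   : extra band beyond a test shell that is dropped entirely
    Returns three lists of reflection indices.
    """
    test, omitted, work = [], [], []
    for i, s in enumerate(s_values):
        dist = min(abs(s - c) for c in shell_centers)
        if dist <= half_width:
            test.append(i)       # inside a thin test shell
        elif dist <= half_width + guard_width:
            omitted.append(i)    # neighbor shell: in neither set
        else:
            work.append(i)       # used for refinement
    return test, omitted, work


def r_factor(f_obs, f_calc):
    """Conventional R = sum| |Fo| - |Fc| | / sum|Fo| (amplitudes pre-scaled)."""
    numerator = sum(abs(abs(fo) - abs(fc)) for fo, fc in zip(f_obs, f_calc))
    return numerator / sum(abs(fo) for fo in f_obs)


# Toy example with one test shell centered at s = 0.20 (d = 5 Angstrom).
s_demo = [0.10, 0.199, 0.20, 0.201, 0.205, 0.215, 0.30]
test_idx, omitted_idx, work_idx = partition_reflections(
    s_demo, shell_centers=[0.20], half_width=0.002, guard_width=0.006
)
```

Rfree is then the R-factor evaluated over the test indices only, and Rwork the same statistic over the working indices; the omitted neighbors contribute to neither.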
Even though the resolution was modest, the 20-fold NCS available for the rhombohedral form led to reliable electron density into which the protein sequence could be fit readily (Figure 1). Tests were performed to check that averaging with 20-fold NCS at 30% completeness was sufficient to remove bias towards the initial AAV-2 phasing model. In separate repetitions of the phase refinement and extension, three regions were omitted from the phasing model: AAV-2 VP1 450–463, 465–476 and 545–558, where differences between AAV-2 and AAV-6 were expected. The averaged omit maps showed good density that was essentially unchanged from the conventionally calculated non-omit averaged maps. Furthermore, clear density was recovered for a lysine at the site of the S459K sequence difference between AAV-2 and AAV-6.
Phases for the orthorhombic form were determined from both the rhombohedral AAV-6 and AAV-2 atomic models. Upon iterative NCS phase refinement to 3.5 Å resolution, both starting points yielded similar statistics, and the latter was extended to 3.0 Å using 60-fold NCS averaging in RAVE and the CCP4 program suite. Because of concern about the potential for model bias, NCS phase refinements for the orthorhombic form were compared where the initial phasing model was either a complete rhombohedral model or one from which residues 459, 493, 531, 576 and 580–595 were omitted. Following NCS phase refinement, there were no perceptible differences anywhere in the omit and non-omit maps, confirming the power of the 60-fold NCS to yield unbiased maps, which were of somewhat higher quality than the 20-fold averaged rhombohedral maps.
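The 20-fold and 60-fold NCS redundancies quoted for the two crystal forms follow from how much of the 60-fold icosahedral rotational symmetry coincides with crystallographic symmetry. A minimal sketch of that bookkeeping is given below; the function name is hypothetical and the site-symmetry orders (3 for the particle on the rhombohedral 3-fold axis, 1 for the particle in a general position of the orthorhombic cell) are read from the description of the two cells.

```python
ICOSAHEDRAL_ORDER = 60  # rotational symmetry operations of an icosahedral capsid


def ncs_fold(site_symmetry_order: int) -> int:
    """NCS redundancy remaining once crystallographic symmetry is accounted for.

    site_symmetry_order is the order of the crystallographic rotation axis
    passing through the particle center (1 for a general position).
    """
    if ICOSAHEDRAL_ORDER % site_symmetry_order != 0:
        raise ValueError("site symmetry must be a subgroup order of 60")
    return ICOSAHEDRAL_ORDER // site_symmetry_order


rhombohedral_ncs = ncs_fold(3)  # particle sits on a crystallographic 3-fold
orthorhombic_ncs = ncs_fold(1)  # one whole particle per asymmetric unit
```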
The orthorhombic model was fit to the NCS-averaged omit map manually and with real-space torsion angle simulated annealing refinement using a new implementation of RSRef as an extension to CNS 1.2 (Brünger et al., 1998; Chapman, 1995). This was followed by reciprocal space refinement at 3.0 Å against a maximum likelihood target using Phenix 1.7 (Adams et al., 2010).
The program APBS was used to calculate the electrostatic potentials (Baker et al., 2001). First, the potentials were estimated over a coarse grid for assembled capsids, and then refined on a fine grid near the icosahedral 3-fold axis.
Mutants were designed to characterize regions implicated by the structure in heparin binding. The starting point was the pAAV6 infectious clone, a gift from David Russell, comprising AAV-6 rep and cap flanked by AAV-6 inverted terminal repeats (ITRs). For mutagenesis, to avoid complications from the ITRs, cap was excised (pAAV6 nt 1889–4614) using endogenous flanking SacI sites, and inserted into the pET-41a(+) cloning vector (Novagen), using a homologous SacI site within the multiple cloning site (MCS) to yield the plasmid pET41a/pAAV6-SacI. The following mutations were introduced into this plasmid by mutagenic PCR using the QuikChange II XL Site-Directed Mutagenesis Kit (Stratagene) and the primers listed in Table S4: K531E, R576Q, K459S, and K493S. Mutant plasmids were digested with SacI, and the cap region was extracted and ligated back into the SacI-digested parental pAAV6 plasmid to yield infectious mutant clones. Following amplification in E. coli, infectious mutant viruses were produced by HeLa transfection in the presence of adenovirus helper, and then purified by at least one cesium gradient ultracentrifugation (Xie et al., 2008). Constructs were confirmed by sequencing of the cap region of the plasmid, and of the purified virion genomes.
Heparin binding affinity was measured using 1 mL HiTrap Heparin HP columns (GE Healthcare) that were pre-washed with 10 mL of 5 M NaCl followed by 10 mL of loading buffer. Virus samples of 0.05 to 0.25 μg were diluted in 5 mL of loading buffer (25 mM Na Hepes, 31.25 mM NaCl, pH 7.2–7.4). To better resolve weakly binding virus particles, buffers used for the affinity chromatography were at lower ionic strength (56 mM) than in earlier work (123 mM) (Wu et al., 2006a). After sample loading, columns were washed with 25 mL of loading buffer. Bound virus was eluted with a NaCl gradient consisting of 5 mL steps in increments of 100 mM from 100 mM to 1 M. Fractions, flow-through (FT) and wash (W) were assayed for virus using PCR as follows. Fraction aliquots of 2–8 μL were made up to 10 μL by mixing with DNA extraction buffer (2 μL proteinase K and 1 μL Tween-20 in 1 mL TE buffer [10 mM Tris-HCl, 1 mM EDTA, pH 7.5]). Capsid digestion proceeded at 56 °C for 1.5 hours and was quenched for 1 hour at 90 °C to denature proteinase K. PCR was performed using the DreamTaq Green Master Mix (Fermentas) (Mitchell et al., 2006). Amplified fragments were separated on 1.5% agarose gels, which were scanned using the FluorChem 5500 (Alpha Innotech Inc.). The centroid elution concentrations were obtained by non-linear least-squares fitting of Gaussian functions (SigmaPlot) to the integrated relative band densities for the PCR products of the heparin column fractions. Results are the average of 3 or 4 elution profiles for each mutant.
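The centroid estimate described above can be sketched as a Gaussian least-squares fit over the ten NaCl steps. This is only an illustration: the fit in the study was performed in SigmaPlot, whereas the sketch below substitutes a plain grid search over the center and width (with the amplitude solved in closed form), the search ranges are assumptions, and the band densities are synthetic values generated for the example, not measured data.

```python
import math


def fit_gaussian(conc_mM, density, mu_step=1.0, sigmas=None):
    """Least-squares fit of amp * exp(-(x - mu)^2 / (2 sigma^2)).

    The two non-linear parameters (mu, sigma) are scanned on a grid; for
    each trial pair the best amplitude is obtained in closed form.
    Returns (amplitude, mu, sigma, sum_of_squared_residuals).
    """
    if sigmas is None:
        sigmas = [10.0 * k for k in range(2, 41)]  # 20 to 400 mM (assumed range)
    lo, hi = min(conc_mM), max(conc_mM)
    n_mu = int(round((hi - lo) / mu_step))
    best = None
    for i in range(n_mu + 1):
        mu = lo + i * mu_step
        for sigma in sigmas:
            basis = [math.exp(-0.5 * ((x - mu) / sigma) ** 2) for x in conc_mM]
            amp = (sum(b * y for b, y in zip(basis, density))
                   / sum(b * b for b in basis))
            sse = sum((y - amp * b) ** 2 for b, y in zip(basis, density))
            if best is None or sse < best[3]:
                best = (amp, mu, sigma, sse)
    return best


# Step gradient from the protocol: 100 mM to 1 M NaCl in 100 mM increments.
steps_mM = [100.0 * k for k in range(1, 11)]

# Synthetic relative band densities (illustrative only, not measured data):
# a noise-free Gaussian centered at 430 mM with a width of 80 mM.
synthetic = [math.exp(-0.5 * ((x - 430.0) / 80.0) ** 2) for x in steps_mM]

amp, centroid_mM, width_mM, sse = fit_gaussian(steps_mM, synthetic)
```

With real elution profiles, the fitted center would be taken as the centroid elution concentration and averaged over the replicate profiles for each mutant.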
The authors would like to thank Heather M. Ongley, Thayumanasamy Somasundaram, Weishu Bu and the staff at CHESS who helped with data collection. Thanks are also due to Andrew Trzynka and Omar Davulcu for technical support. CHESS is supported by the NSF and NIH/NIGMS via NSF award DMR-0225180, and the MacCHESS resource is supported by NIH/NCRR award RR-01646. This research is supported by the National Institutes of Health R01-GM66875 (M.S.C.) and the American Heart Association 10-post-2600203 (T.F.L.).