Crystal Structure of AAV-6
Structures were determined at 3 Å resolution from both rhombohedral and orthorhombic crystal forms. The rhombohedral form, with its high-quality diffraction data (Rmerge = 0.08; ) and packing-constrained translation function, yielded the better results on independent phase extensions from 8 to 3 Å. The rhombohedral structure provided an improved starting point for refinement of the orthorhombic phases, which, after refinement using 60-fold non-crystallographic symmetry (NCS), yielded an excellent map ( & S1). The structures differ by 0.8 Å RMSD (0.5 Å Cα), which is less than the cross-validated maximum likelihood estimated coordinate errors of 1.0 Å (). The structures have similar R/Rfree of 0.273/0.286 (rhombohedral) and 0.251/0.286 (orthorhombic) when refined at 3.0 Å. Full statistics are provided in . Unless otherwise stated, figures have been prepared using the rhombohedral structure.
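The RMSD values quoted above are root-mean-square distances between equivalent atoms. The following is a minimal sketch of that calculation, assuming the two coordinate sets have already been superimposed and paired atom by atom; the function name `rmsd` and the toy coordinates are hypothetical and are not taken from either crystal form.

```python
import math

def rmsd(coords_a, coords_b):
    """RMSD between two equal-length lists of (x, y, z) coordinates.

    Assumes the two sets are already superimposed and listed in the same
    atom order; no fitting is performed here.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))

# Hypothetical toy coordinates in Å (three atoms per "structure").
form_a = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0)]
form_b = [(0.0, 0.5, 0.0), (3.8, 0.0, 0.5), (7.6, 0.5, 0.0)]
print(round(rmsd(form_a, form_b), 3))
```

Restricting the input lists to Cα atoms gives a Cα RMSD; passing all atoms gives the all-atom value.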
Data collection and structure refinement statistics. Values in parentheses are for the 3.4-3.2 Å resolution shell.
Figure 1 Introduction to the AAV-6 structure. (a) Representative experimental electron density for the orthorhombic crystal form of AAV-6 at 3.2 Å is shown near one of the surface residues of interest, Arg576. Molecular replacement phases, calculated from (more ...)
Interpretable density was present for all but 19 of the 534 amino acids in each VP3 subunit (). Through differential splicing, AAV expresses three variants of the capsid protein, VP1, VP2 and VP3, which were confirmed by gel chromatography to be present in the expected 1:1:8 ratio (Johnson, 1984; Xie et al., 2004; Xie et al., 2008). Resolved in the crystal structure are the parts of the capsid protein common to VP1, 2 & 3 and present in all 60 capsid subunits. Likewise, the genomic DNA does not adhere to the icosahedral symmetry and is not seen in most virus structures. Even though it is VP3 that is resolved, we follow the conventional VP1 numbering, with the model starting at residue 221.
The largest differences (2.8 Å backbone rms) between our two crystal forms occur at residues 586–590 on the outer surface. The backbone density for the rhombohedral form is the least ordered of either structure, suggestive of multiple conformations. There is some discontinuous density suggesting that the conformation clearly seen in the orthorhombic form (and other serotypes) is present, but stronger discontinuous density follows the alternative path modeled as the predominant conformation for the rhombohedral form. Only one of the 20 NCS-related subunits forms an inter-particle contact in this region, so crystal packing is not the cause of the disorder – it might reflect inherent flexibility.
Subsequent to our determination, a structure was reported for an AAV-6 virus-like particle (VLP), lacking genomic DNA and VP1 (Ng et al., 2010). The fold and overall structure are similar: the Cα RMS difference between the empty VLPs and our infectious virions is 0.7 & 0.5 Å for the rhombohedral and orthorhombic forms, respectively. A recent cryo-EM study of AAV-1 (a minor variant of AAV-6) at 10 Å resolution reported capsid changes, mostly on the inner surface, dependent on the DNA content of particles (Gerlach, Kleinschmidt, and Bottcher, 2011). However, comparison at 3 Å resolution of our full AAV-6 particles and the empty VLPs shows no evidence of changes near the DNA of a magnitude detectable at 10 Å. The largest differences between the infectious virion structures and the VLP are at selected points on the outer surface. Where disorder was noted (previous paragraph) for residues 586–590 in the rhombohedral structure, it differs from the VLP by 2.9 Å (Cα RMSD), but the orthorhombic and VLP structures are similar (0.4 Å Cα RMSD). The largest differences are at 453–456, the tip of another external loop, where density is understandably weaker than average, but supports similar rhombohedral and orthorhombic conformations (0.7 Å Cα RMSD) over a different conformation reported for the VLP (2.9 Å Cα RMSD). Overall, however, the Cα differences of 0.5–0.7 Å between infectious virions and VLP are less than expected from the overall error estimates. That said, all-atom RMSDs (whole subunit, side chains included) between our capsids and the VLP are substantially larger (3.0 Å) than the difference between our two structures (0.8 Å). Flipping of pseudo-symmetrical side chains contributes to this, but larger contributions come from a small number of side chains pointing in different directions, particularly at the sites (above) where the backbone differs. Differences in side chain conformation also reflect technical challenges that are common at the 3 Å resolution available for the three structures.
Precautions taken during the structure determination, together with multiple cross-validations, support the structures reported here: (1) Diffraction data are of high quality for large complexes (Rmerge = 0.08 & 0.11; ); (2) Bias from molecular replacement was avoided through model-independent phase extension from 8 to 3 Å using only the icosahedral symmetry; (3) Rhombohedral form test and working reflection sets were selected to maintain their independence in spite of the icosahedral symmetry (Fabiola, Korostelev, and Chapman, 2006), supporting full cross-validation and checks against over-fitting; (4) The structures of the two crystal forms of infectious particles were initially determined independently to check that refinements converged to the same structure.
Of greatest functional interest are regions where the sequence differs between the AAV serotypes. They are predominantly surface-exposed, but the backbone and much of the side-chain structure were at least partially ordered and fully traceable, even in the initial 3 Å rhombohedral map, due to the beneficial impact of 20-fold non-crystallographic averaging. Full density was available for the four surface residues characteristic of AAV-6 (see below) when the rhombohedral structure was used as the starting point for improved phase refinement for the orthorhombic form. To avoid bias, these four residues, together with 580–595, were omitted from the phasing model (which had not been refined against the orthorhombic data). Lack of perceptible difference between omit and non-omit maps demonstrates that the 60-fold NCS of the orthorhombic form is sufficient to yield a reliable and unbiased map. Orthorhombic density, consistent with both structures, offers a local real-space cross-validation for the residues of greatest interest: Lys459
( & S1
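The benefit of 20-fold and 60-fold averaging can be put in rough numerical terms. As a hedged rule of thumb (an assumption introduced here, not a calculation from the source), if the errors in the N symmetry-related copies of the density were independent and of equal size σ, averaging would reduce the random error as follows:

```latex
\sigma_{\mathrm{avg}} \approx \frac{\sigma}{\sqrt{N}}, \qquad
\sqrt{20} \approx 4.5 \ \text{(rhombohedral, 20-fold)}, \qquad
\sqrt{60} \approx 7.7 \ \text{(orthorhombic, 60-fold)}
```

Real maps depart from this idealization because errors among NCS copies are not fully independent, so these factors are best read as upper bounds on the improvement.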
Comparison of AAV-6 to other serotypes
Each AAV-6 subunit shares the jelly-roll β-barrel subunit fold of other parvoviruses () (Tsao et al., 1991) and many other viral capsids (Chapman and Liljas, 2003). The structures of four other AAVs are available for comparison: AAV-2 (Xie et al., 2002; PDB id 1lp3), AAV-4 (Govindasamy et al., 2006; PDB id 2g8g), AAV-8 (Nam et al., 2007; PDB id 2qa0) and AAV-3B (Lerch, Xie, and Chapman, 2010; PDB id 3kic).
At the level of backbone, the structures are similar (), but there are local regions where the serotypes differ. These were highlighted through alignments of the structures starting with conserved secondary structures, then iterating joint sequence-structure alignment (Krissinel and Henrick, 2004). There is good overall agreement between AAV-6 and AAV-2, -3B & -8 (0.7 < Cα RMSD < 0.9 Å). For AAV-4, the RMSD is 1.6 Å. With AAV-4, the greatest differences are in the nine variable regions (VR I-IX) identified in comparisons of AAV-2 & -4 (Table S1) (Agbandje-McKenna and Chapman, 2006; Govindasamy et al., 2006).
When AAV-6 is compared to AAV-2, -3B & -8, two of the nine regions show substantial differences. The first is at the tip of the loop between strands B and C (βBC, also known as VR-I). We can now see that VR-I is the site of greatest diversity in the serotype structures (Table S1), each being distinct. VR-I of AAV-6 contains residues 262–269, corresponding to part of the epitope in AAV-2 (262–268) of monoclonal antibody A20 (see next section). With the insertion of Thr265 in AAV-6 relative to AAV-2, the local RMS difference of 2.7 Å within VR-I is large and significant when compared to the overall coordinate error of 1.0 Å.
The second region where AAV-6 differs from its closest relatives is within loop βGH, the ~220 residue segment running between β strands G & H (Xie et al., 2002). Loop βGH contains 16 small β strands, labeled βGH1 through 16, and can be divided into three sub-loops (3a, 3b and 4). These sub-loops form many of the distinctive features on the viral surface (Chapman and Agbandje-McKenna, 2006; Xie et al., 2002), exposing several of the variable regions. It is sub-loop 3a that is most distinctive for AAV-6, in the GH2/3 β-ribbon and turn that constitute VR-IV (Govindasamy et al., 2006). AAV-6 has its own turn configuration. Loop 3a is one of the loops from a pair of neighboring subunits that come together to form the three-fold proximal spikes. In AAV-2, VR-IV forms a highly structured β-ribbon of 22 residues with 10 inter-strand hydrogen bonds and a tight 2-residue turn. In the other serotypes (3B, 6, 8 and 4), the ribbon extends out only half as far with the strands connected by a meandering turn that is different in each serotype. The average atomic B-factor for these 22 residues is lowest in AAV-2 at 27 Å², modestly higher in AAV-4 & -8 (41 & 45 Å², respectively), and raised 30 Å²
above average for both forms of AAV-6 & -3B, an indication of some disorder. The need to conform to different crystal packing contacts could be contributing to higher B-factors. For example, in the rhombohedral form, 6 of the 20 NCS-related subunits form (distinct) contacts that could result in local structural diversity which would be reflected in high B-factors for an NCS averaged structure. However, the likely role of packing contacts should not be over-stated as the rhombohedral and orthorhombic forms have very different packing interactions yet fundamentally similar structures and B-factor distributions. With the exception of AAV-4, which has a distinct structure, sequences in this 20-residue region are very similar among the serotypes. The diversity of structure in a region of conserved sequence suggests a loop built for flexibility or adaptability to different molecular interactions. VR-IV figures prominently in the 3-fold proximal spikes. With the addition of the AAV-6 structure, we see that AAV-2 is the outlier with a long GH2/3 β-ribbon forming a particularly pointed spike. Of the blunt-spike serotypes, two groups are emerging: AAV-4’s loop is folded over towards the three-fold axis, while the loops of AAV-3B, -6 & -8 point away leading to the most distinctive differences in the surface topology ().
Figure 2 Surface topology. Panel (a): The molecular surface of AAV-6 is shown, with one subunit highlighted in yellow. Panels (b) through (f) show the boxed areas of AAV-6, -2, -3B, -4 and -8 with variable region I (VR-I) highlighted in cyan, and VR-IV highlighted (more ...)
In the other seven variable regions (Govindasamy et al., 2006), AAV-6 differs significantly only from AAV-4. AAV-4 now emerges as the outlier, with more subtle differences between the other serotypes. In fact, within most of the variable regions, the differences between AAV-6 and AAV-4 exceed twice those between AAV-6 and other serotypes. The structures have been determined with coordinate precisions of ~0.9 Å, so the expected error on distance measurements is ~1.2 Å. In VR-I and VR-IV (highlighted in the previous paragraphs), the RMS Cα differences among non-AAV-4 serotypes are 1.6 to 3.4 Å, i.e. statistically significant. For the other seven variable regions, the variation is 0.4 to 1.7 Å (Cα RMSD), commensurate with the error. It is only when AAV-6 is compared to AAV-4 that the differences in these seven regions are significant (Cα RMSDs of 1.7 to 4.7 Å). In summary, with the addition of AAV-6, it becomes clear that the human AAV serotypes differ in backbone primarily at VR-I and VR-IV, but that AAV-4 is an outlier differing in backbone at seven other locations.
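The step from coordinate precision to distance error can be made explicit. As a sketch, assuming the positional errors of the two structures being compared are independent and of equal magnitude σ (an assumption introduced here; the source does not give its derivation), standard error propagation gives:

```latex
\sigma_{d} \approx \sqrt{\sigma_{1}^{2} + \sigma_{2}^{2}} = \sqrt{2}\,\sigma
\approx 1.4 \times 0.9\ \text{\AA} \approx 1.2\text{--}1.3\ \text{\AA}
```

This is in line with the ~1.2 Å quoted, given that the precision estimate itself is approximate. Differences well above this value (1.6 to 3.4 Å in VR-I and VR-IV) stand out from the noise, whereas values of 0.4 to 1.7 Å largely do not.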
Several monoclonal antibodies (mAbs) reacting with AAV capsids have been described (Hermens et al., 1999; Weger et al., 1997; Wistuba et al., 1995; Wobus et al., 2000). Antibody A20 binds both AAV-2 and AAV-3. Antibodies C37B & C24 bind to AAV-2 capsids selectively, while antibody D3 shows broad reactivity with serotypes 1–6, 8 and 9. Five antibodies highly specific for assembled AAV-1/6, AAV-4 or AAV-5 were described recently (Kuck, Kern, and Kleinschmidt, 2007).
Of all these antibodies, mAb A20 is the best characterized, recognizing a conformational epitope specific to assembled particles of AAV-2 (Wobus et al., 2000). With the AAV-6 structure, we can examine features that distinguish it from AAV-2 & AAV-3, which are ~85% identical to it in sequence but, unlike AAV-6, are bound by mAb A20. Several sites have previously been implicated as antigenic in AAV-2 (; Table S2). Residue Ala266, herein designated “site 1”, was identified by scanning insertional mutagenesis (Wu et al., 2000). Site 2 (Gln263) was identified by individual site-directed mutations, as was site 3 (Glu548) (Lochrie et al., 2006). By peptide scanning, Wobus et al. (2000) identified three regions presumably containing antigenic residues: Arg566 that we designate as site 4, His271 as site 5 and Phe370
as site 6. Sites 1, 2 and 5 come close together in the 3D structures (). Here, there are structural differences between A20-binding serotypes (AAV-2 & AAV-3) and non-binding AAV-6. Site 1 is surface-accessible in AAV-2 & -3, but buried in AAV-6. Within the otherwise sequence-conserved site 2, there is an insertion (Thr265
) in AAV-6 and decreased accessibility of the loop compared to AAV-2 or AAV-3. Site 5 is mostly buried underneath site 2 and likely impacts antibody-binding only indirectly. Sites 4–6 have similar surface shape in all serotypes, but the electrostatics of sites 3 & 4 are distinctive. The surface potentials were calculated using a Poisson-Boltzmann continuum approach and reveal surprising overall diversity considering the ~85% sequence identity between AAV-6, -2 & -3B (). A glutamate in AAV-2 renders site 3 negative in contrast to neutral or slightly positive potential in AAV-3B and -6 respectively (). An additional basic residue renders site 4 positive in AAV-6 in contrast to the negative potential in both AAV-2 & -3B.
Figure 3 Comparison of the structure and surface properties near the putative binding site of neutralizing antibody mAb A20. The surface structures of AAV-2 and AAV-3B (which are recognized by mAb A20) are compared to AAV-6 (which is not). Panel (a) shows the (more ...)
In summary, A20 binding determinants should include surface features conserved in AAV-2 and AAV-3B but different in non-binding AAV-6. The surface shape differs where sites 1 & 2 come together, and the electrostatic charge at site 4 also correlates with A20-binding. These sites are close enough to fall within a typical ~30 Å antibody footprint and lie ~30 Å from AAV-2’s HS attachment site, so it is plausible that A20 could inhibit AAV-2 cell entry by steric conflict with a receptor (). High diversity in electrostatics over much of the surface (considering the conserved sequence and structure) might be the result of the ease with which side chain charge could be modulated in immune escape variants without impacting viral assembly or other essential interactions.
Cell receptor binding
AAV-6 has been shown to bind to heparin (Halbert, Allen, and Miller, 2001), an analog of the HSPG primary cellular receptor for several human AAVs (Summerford and Samulski, 1998). Like AAV-2, AAV-6 can be purified using heparin sepharose columns, but the affinity is weaker (Halbert, Allen, and Miller, 2001; Rabinowitz et al., 2002). Recently, the binding of heparin to the most positively charged region of the AAV-2 surface has been visualized directly through cryo-electron microscopy at 8 Å resolution (O’Donnell, Taylor, and Chapman, 2009). The binding site had been predicted from the electrostatic potential, calculated from the AAV-2 crystal structure (Xie et al., 2002
) on the expectation that negatively charged HSPG would be bound at a positively charged region of the viral surface. Intriguingly, the AAV-2 arginines Arg585
that form the core of the binding site, and have been implicated genetically in receptor-binding (Wu et al., 2000
), are unique to AAV-2 and not conserved in other heparin-binding serotypes. For insights into heparin-binding differences, the electrostatic potentials of AAV-6 and AAV-2 assembled capsids were compared following calculation by the Poisson-Boltzmann equation ().
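For reference, a Poisson-Boltzmann continuum calculation of this kind solves an equation of the following general form. This is the standard textbook form for a 1:1 salt, shown only as an illustration; the source does not state which variant (nonlinear or linearized), software, dielectric constants or ionic strength were used, and all symbols below are introduced here.

```latex
\nabla \cdot \left[ \varepsilon(\mathbf{r}) \, \nabla \phi(\mathbf{r}) \right]
  = -\rho_{f}(\mathbf{r})
  + 2 e \, c_{\infty} \, \lambda(\mathbf{r}) \,
    \sinh\!\left( \frac{e \, \phi(\mathbf{r})}{k_{B} T} \right)
```

Here φ is the electrostatic potential, ε(r) the position-dependent permittivity (low inside the protein, high in solvent), ρ_f the fixed charge density of the capsid atoms, c_∞ the bulk salt concentration, and λ(r) a mask equal to 1 where mobile ions can reach and 0 elsewhere. Positive φ at the solvent-accessible surface marks regions expected to attract negatively charged HSPG.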
Figure 4 Surface charge near the heparin-binding sites on the spikes surrounding each 3-fold axis. AAV-6 (a, left) is compared to AAV-2 (b, right). The solvent accessible surfaces are colored according to the surface electrostatic potential, ranging from −50 (more ...)
In AAV-6, relative to AAV-2, the most positively charged region is moved down the side of the spike towards the valley between neighboring spikes. An additional region of positive charge is found around the spike and closer to its tip (). Several sequence differences are responsible for the changed electrostatics. In AAV-2, positive charge is concentrated around Arg585
, where the heparin is most tightly bound in the cryo
-EM structure (O’Donnell, Taylor, and Chapman, 2009
are unique to AAV-2 among the 12 characterized serotypes. Absent these two arginines, the corresponding exact location in AAV-6 is less charged, but the surroundings remain predominantly positively charged () with seven basic residues contacting AAV-2’s heparin when it is superimposed on the AAV-6 structure (Table S3
). Four of the seven residues are not conserved between AAV-6 and AAV-2, and might compensate for the absence of AAV-2’s Arg585
. Close to the floor of the valley between spikes, positive charge is increased with the substitution of Lys531
for AAV-2’s Glu530
. Positive charge is also gained near the tip of the spike on the side facing away from the 3-fold axis (). This is due to the substitutions of AAV-6’s Lys459
for AAV-2’s Ser458
. Alone or in combination, these AAV-6 residues could compensate for the absence of AAV-2’s Arg585
though not in exactly the same location.
Some of these basic residues are unique to AAV-6 and its close relative, AAV-1. Lys459
, which form the positive patch close to the spike tip, are not conserved in any of the other serotypes. Lys493
is near the distal end of loop 3b which is sandwiched between loop 3a and 4 from a neighboring subunit. The Lys459
that is ~6 Å from Lys493
comes from loop 3a of the neighboring subunit. Lys493
is ~14 Å from the site corresponding to AAV-2’s Arg585
and the valley that is more positively charged in AAV-6, with Arg576
is ~13 Å in the opposite direction. Lys531
form a spiral of positive potential rising from the valley towards the top of the spike (). Although AAV-2, -3 & -6 all have positively charged regions on the side of the spike, AAV-6’s overlaps only partly with those of its close relatives, and it is composed of distinctly different basic amino acids (Lerch, Xie, and Chapman, 2010; Xie et al., 2002). Sequence variability suggests that other serotypes will be at least as different in this region.
In spite of the differences in how the positively charged regions are constituted, it is plausible that the three serotypes bind HSPG in analogous ways. This can be illustrated by superimposing the experimentally-determined heparin location from AAV-2 onto the surface of AAV-6 () (O’Donnell, Taylor, and Chapman, 2009). In AAV-2, 23 residues make direct contact with the heparin density (contoured at 4.7 error units), with 13 more having through-solvent access (O’Donnell, Taylor, and Chapman, 2009). In AAV-6, 28 residues have direct contact with the superimposed AAV-2 heparin density, while 10 have through-solvent access (Table S3). Only a modest adjustment of AAV-2’s heparin down into the valley would be required to optimize interactions with a different cast of residues in AAV-6. Heparanoid polymers could still wind around the virus between symmetry-related binding sites, as proposed for AAV-2 (O’Donnell, Taylor, and Chapman, 2009). Indeed, the trail of positive charge spiraling up from the valley floor towards Lys493 high on the spike, suggests that the virus makes multiple interactions with cell surface carbohydrate.
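The residue counts above come from asking which residues touch a ligand density superimposed on the capsid. One simple way to approximate such a tally is a distance cutoff, sketched below. This is not the authors' procedure, which used a contoured cryo-EM density; the function name `contacting_residues`, the 4.0 Å cutoff, the residue identifiers and all coordinates are hypothetical.

```python
import math

def contacting_residues(residue_atoms, ligand_points, cutoff=4.0):
    """Return sorted ids of residues with any atom within `cutoff` Å of any ligand point.

    residue_atoms: dict mapping a residue id to a list of (x, y, z) atom positions.
    ligand_points: list of (x, y, z) points sampling the superimposed ligand.
    The 4.0 Å default is an illustrative assumption, not a value from the source.
    """
    hits = set()
    for res_id, atoms in residue_atoms.items():
        if any(math.dist(atom, point) <= cutoff
               for atom in atoms for point in ligand_points):
            hits.add(res_id)
    return sorted(hits)

# Hypothetical toy model: three residues and two ligand sample points (Å).
residue_atoms = {
    1: [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0)],
    2: [(10.0, 0.0, 0.0)],
    3: [(30.0, 0.0, 0.0)],
}
ligand_points = [(3.0, 0.0, 0.0), (12.0, 1.0, 0.0)]
print(contacting_residues(residue_atoms, ligand_points))
```

In this toy case residues 1 and 2 lie within the cutoff and residue 3 does not. A second, larger cutoff could stand in for the "through-solvent access" category, though the appropriate distances would need to be chosen to match the density contour actually used.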
AAV-6 and AAV-1 differ at only six amino acids in their capsid sequences. However, the serotypes differ in receptor-binding, with AAV-1 reported to lack heparin affinity (Wu et al., 2006a). Qualitative single-step elutions used in the earlier affinity chromatography may have exaggerated the effect (see below), but differences in heparin affinity are measurable, and only a subset of the six sites can be responsible for them. Phe129
is in the VP1-unique region of unknown structure, present in only 13% of subunits. Asp418
are on the inner surface of the capsid, far-removed from receptor-binding. Lys531
are surface-exposed and near the proposed receptor-binding site (Table S3
). Neither Leu584
are conserved and the residue types at these locations are not covariant with heparin-binding. AAV-6’s Lys531
is unique among all 12 serotypes, with a glutamate at the corresponding location in AAV-1, -2, -3B, -7, -8, -9, -10, and glycine, serine or alanine in the remaining serotypes. Site-specific mutations at the six sites implicated only Lys531
as impacting heparin binding (Wu et al., 2006a).
The combination of structure and prior mutagenesis suggested to us that AAV-6 receptor attachment is through a set of positively charged residues that are not homologous to the heparin-binding motif of AAV-2, but are in neighboring locations and serve an analogous function. This hypothesis was tested with substitution mutations at each position in the putative binding site where the sequence differs between AAV-6 and -2. Mutants K531E, R576Q, K493S and K459S all show reduced heparin-binding affinity (). All of these mutants are infectious in HeLa cells, so their phenotypes appear to result from direct impact on binding interactions and not collateral damage.
Figure 5 Heparin elution profiles for AAV-6 mutants. A) Representative PCR product gels for fractions from a heparin affinity column run with wild-type and 4 mutants, and eluted at the salt concentrations indicated. pAAV6 DNA was used to verify the location of (more ...)
K531E is one of two weakly-binding mutants, consistent with its earlier designation as an important determinant (Wu et al., 2006a). Through the use of lower ionic strength (56 mM) buffers, weak heparin-binding can be measured with elution at 140 ± 10 mM NaCl. This is in contrast to an earlier designation as non-binding from measurements in 123 mM buffers. To our knowledge, mutations at the other sites have not previously been characterized, and they all weaken binding, but to varying degrees. K459S has an impact on binding as large as that of K531E, while R576Q and K493S have a more modest impact. It is important to note that no single amino acid is critical on its own; rather, there are several contributors to binding, of which K531 is among the more influential.
Thus, heparin attachment in AAV-6 results from interactions with several positively charged amino acids forming a binding site that overlaps with, but differs somewhat from, AAV-2’s. The site in AAV-6 is more similar to that of AAV-2 than is the site in AAV-3B (Lerch, Xie, and Chapman, 2010). In AAV-3B, it was proposed that either the absence of AAV-2’s Glu499 (which neutralizes the conserved Arg447), or the presence of Arg594, unique to AAV-3B, compensated for the absence of AAV-2’s Arg585. In AAV-6, the compensating residues are closer to the binding site in AAV-2, lying on the side/base of the spike with direct or solvent-mediated access to the heparin as it is seen in AAV-2 (O’Donnell, Taylor, and Chapman, 2009). The one exception is Arg447 (AAV-3B/6), which is on the side of the spike facing away from the 3-fold, and out of contact with AAV-2’s heparin density. In AAV-2 its charge is neutralized by a salt-bridge partner (Glu499) unique to AAV-2, a serotype that binds heparin strongly. Thus, positive electrostatic potential near Arg447 appears less important.
In conclusion, heparin binding in AAV-6 involves a variation on the mode visualized in AAV-2 (O’Donnell, Taylor, and Chapman, 2009). Intriguingly, each serotype appears to be marshalling its own unique cast of basic amino acids for heparin-binding. This suggests that there are multiple ways that AAV can achieve adequate cell attachment. Evolutionary diversity may be driven by the selective pressure to mutate surface residues in response to immune surveillance. Tolerance of diversity at these loci may have allowed highly related serotypes to accumulate more sequence variability at this exposed functional site than would otherwise be expected. The diversity is also encouraging for gene therapy – there may be considerable freedom to engineer desirable traits into the capsid without ablating cell attachment.