Adeno-associated viruses (AAV) are widely spread throughout the human population, yet no pathology has been associated with infection. This fact, together with the availability of simple molecular techniques to alter the packaged viral genome, has made AAV a serious contender in the search for an ideal gene therapy delivery vehicle. However, our understanding of the intriguing features of this virus is far from exhausted and it is likely that the mechanisms underlying the viral lifestyle will reveal possible novel strategies that can be employed in future clinical approaches. One such aspect is the unique approach AAV has evolved in order to establish latency. In the absence of a cellular milieu that will support productive viral replication, wild-type AAV can integrate its genome site specifically into a locus on human chromosome 19 (termed AAVS1), where it resides without apparent effects on the host cell until cellular conditions are changed by outside influences, such as adenovirus super-infection, which will lead to the rescue of the viral genome and productive replication. This article will introduce the biology of AAV, the unique viral strategy of targeted genome integration and address relevant questions within the context of attempts to establish therapeutic approaches that will utilize targeted gene addition to the human genome.
Adeno-associated viruses (AAVs) belong to the family of Parvoviridae, which are widespread throughout the biosphere. Based on this wide distribution, two subfamilies have been created, the Parvovirinae, infecting vertebrates, and the Densovirinae, infecting arthropods . AAV, a member of the Parvovirinae, was identified as a contaminant of adenovirus stocks nearly half a century ago [2,3]. Yet, very little is known about its relationship with the human host. The surprising absence of information about the natural infection might, in part, be explained by the absence of any discernable pathology associated with this virus in vivo . However, it has been documented that approximately 80% of the population has detectable levels of anti-AAV antibodies against serotypes 1–3 and 5 [5–10]. These antibodies can be detected at the age of 10 years, and will persist into adulthood. The absence of a correlation of viral infection and disease has made it difficult to establish a viral lifecycle in vivo. However, studies in tissue culture systems have introduced the possibility that AAV might have evolved an optimal relationship with its human host. In tissue culture, it has been demonstrated that AAV on its own is an inherently replication-defective virus. Only when the cell is challenged by adverse effects (i.e., super- or co-infection by a pathogenic virus, such as herpes-, adeno- and papilloma-viruses) [2,11–13], will AAV replicate to significant levels that surpass most of the replication efficiencies of other DNA viruses [14–16]. Consequently, when an AAV-infected cell is challenged by a pathogenic helper virus, it will probably die as a result of AAV replication. A potential protective effect can thus be extrapolated when papilloma virus acts as a helper for AAV replication. Consistent with this hypothesis are the results of epidemiological studies that have found 85% of healthy women to test positive for anti-AAV antibodies. 
By contrast, only 14% of cervical cancer patients were found to be seropositive for AAV antibodies . In addition, AAV antibody titers were significantly increased in healthy women when compared with a cervical cancer cohort . Conversely, human papilloma virus-infected individuals appear less likely to develop cancer in the presence of AAV . Although compelling, it must be noted that these observations have shed little light on the viral lifestyle in the human host, and no underlying possible mechanisms have been directly tested to date.
The natural propensity of AAV to persist in human cells and its nonpathogenic character have sparked an enormous interest in the development of AAV as a gene therapy vector, leading to more than 20 approved clinical trials currently ongoing. Recombinant AAV (rAAV) vectors typically consist of the gene of interest and the elements that regulate its expression, flanked by the inverted terminal repeats (ITRs) of AAV, resulting in a vector genome of approximately 4.5 kb, which is unable to replicate or integrate site specifically as it is missing the viral proteins necessary to direct these events. The ITRs are the only viral elements that are present in the vector and constitute the signals necessary for AAV DNA replication and packaging, which can take place when the required helper virus genes and AAV rep and cap genes are provided in trans. AAV production schemes are mostly based on dual or triple transfections of 293T cells with vector plasmid and helper and/or rep-cap plasmids. Approximately 48 h post-transfection, the cells are lysed and recombinant virions can be purified from the lysate. Other strategies used for the production of AAV vectors include the use of producer cell lines, the combination of viral features into adenovirus–AAV hybrids, the use of herpesvirus systems and production in insect cells using baculoviruses. Purification protocols are based on any of the following techniques: ultracentrifugation through cesium chloride density gradients, the use of nonionic iodixanol gradients and various forms of column chromatography.
Early work in animals, including nonhuman primates, demonstrated that AAV vectors have a tropism for certain postmitotic cell types present in the CNS, liver, and skeletal and cardiac muscle, that long-term expression is predominantly achieved through vectors stabilized in a nonintegrated episomal form, and that AAV vectors elicit little or no immune response. Where a response does occur, it remains mild and highly transient, with limited activation of inflammatory signals. It has been proposed that Toll-like receptor-dependent pathways are involved in the induction of adaptive immunity to AAV.
The positive outcome of preclinical studies in animal models of monogenic diseases led to the initiation of clinical trials using AAV vectors to treat disorders such as cystic fibrosis and hemophilia B. The first clinical study of AAV Factor IX treatment in hemophilia B patients showed that intramuscular injection of up to 2 × 10¹² vector genomes/kg was safe, but was insufficient to reach therapeutic levels of more than 1% [29,30]. The second trial, in which AAV Factor IX was administered by means of hepatic artery infusion, showed that therapeutic levels (10–12% at 4 weeks postinfusion) could be reached at a dose of 2 × 10¹² vector genomes/kg, but decreased steadily with only baseline levels registered at 10 weeks postinfusion. The decline in Factor IX was accompanied by a rise in liver transaminases, which could be observed 4 weeks after treatment and slowly disappeared without medical intervention. The investigators could demonstrate that the observed phenomenon was due to a cellular immune response directed at specific epitopes in the AAV capsid.
This observation highlights one of the limitations of AAV-based gene therapy, which is the fact that most patients have previously been exposed to the AAV2 serotype, commonly used in gene therapy trials . This problem can potentially be overcome by a temporary immunosuppressive treatment or the use of alternative AAV serotypes [34,35]. Crosspackaging vectors that use the AAV2 genome in the AAV8 capsid are currently being tested for the treatment of patients with severe hemophilia B. The effect of coadministration of immunosuppressants and AAV2 vector is presently being tested in a separate hemophilia B trial.
Despite the occurrence of an immune response in treated patients, it is important to note that the higher dose of vector did initially produce therapeutic levels of Factor IX, and continuous efforts in basic AAV research are feeding into the development and production of better vectors that have overcome some of the initial limitations. For example, the AAV vector packaging capacity can be expanded by the use of dual vectors, which consist of cis-acting vectors in which regulatory elements are separated from the therapeutic gene or, alternatively, trans-splicing vectors, which encode different exons that can be reconstituted through splicing or overlapping viruses that exploit homologous recombination. Simultaneous infection of the different vectors will lead to expression of the transgene . Self-complementary vectors, which through their particular design circumvent the requirement for dsDNA conversion, have a smaller packaging capacity but offer efficient gene expression with a rapid onset of transduction .
The fields of AAV vector development and AAV biology have always been intertwined, and simple observations made for wild-type (wt) AAV2 can potentially lead to improved vector design. Side-by-side comparison of the infectious, physical and genome-containing titers of wt AAV2 and rAAV2 revealed that the wt virus has a near-perfect physical-to-infectious particle ratio, whereas for the recombinant virus only one out of 50–100 particles is infectious. Insights gained from this kind of study could result in improved vector design and lead to significantly lower viral vector doses in clinical studies, thereby diminishing the immune response to the vector.
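The dose argument in the preceding paragraph can be made concrete with a short calculation. The Python sketch below is illustrative only: the function name and the target number of infectious units are hypothetical and not taken from the studies cited. It compares the physical particles needed to deliver the same number of infectious units at a near 1:1 ratio and at the 1-in-50 to 1-in-100 ratios reported for rAAV2.

```python
def particles_required(infectious_units, particles_per_infectious_unit):
    """Physical particles needed to deliver a given number of infectious units."""
    return infectious_units * particles_per_infectious_unit

# Hypothetical target, chosen only to make the arithmetic concrete.
target_infectious_units = 10**9

wt_like = particles_required(target_infectious_units, 1)      # near 1:1 ratio (wt AAV2)
rec_best = particles_required(target_infectious_units, 50)    # 1 in 50 infectious
rec_worst = particles_required(target_infectious_units, 100)  # 1 in 100 infectious

# Fold excess of physical particles for the recombinant virus.
fold_low = rec_best // wt_like
fold_high = rec_worst // wt_like
print(f"fold excess of physical particles: {fold_low}-{fold_high}")
```

Under these assumptions, a vector that matched the wt ratio would need 50- to 100-fold fewer physical particles for the same infectious dose. Whether the immune response scales in the same way is not something this arithmetic can show.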
The fact that AAV-mediated gene therapy can be successful when the immune system is evaded has been demonstrated by recent clinical studies, which were aimed at treating patients with a genetic form of blindness by injection of the therapeutic vector into a relatively immuno-privileged site (i.e., the subretinal space). Some patients with Leber's congenital amaurosis carry mutations in the gene encoding retinal pigment epithelium-specific 65-kD protein (RPE65), which leads to early-onset retinal dystrophy causing blindness in early adulthood . Three independent Phase I dose-escalation studies show that subretinal injection of AAV2 vectors encoding RPE65 cDNA is safe and, in some cases, leads to sustained improvement in objective and subjective measurements of vision [40–43]. The strongest improvements were seen in children, with an 8-year-old child reaching the same level of light sensitivity as age-matched controls with normal vision .
These exciting results warrant the continuous search for new and improved AAV vectors. The isolation of an impressive array of capsid variants from animals has brought us a new range of AAV vectors with specific transduction characteristics, and the availability of a number of capsid crystal structures will ultimately lead to the design of vectors with exclusive tropism [35,45]. It is likely that the advances brought by the dynamic – and bidirectional – relationship between the fields of basic biology and vector development will result in the development of future gene therapies.
The AAVs are parvoviruses of the Dependovirus genus that all share the dependence on unrelated DNA viruses (i.e., adenovirus, herpesvirus, papilloma virus and vaccinia virus) for completion of the productive part of their lifecycle [2,11–13]. They are small nonenveloped ssDNA viruses with icosahedral capsid symmetry . The first serotype isolated was AAV2 and, since then, eight more have been characterized [47–53]. Recent discoveries of AAV isolates derived by PCR from nonhuman primate and human tissues have extended the list of capsid variants to over 120 [54–56].
The AAVs are genetically very simple, with two major open reading frames, encoding for the nonstructural Rep proteins and structural Cap proteins, respectively (Figure 1) . Furthermore, Sonntag and coworkers have discovered an additional open reading frame within the VP2 coding sequence . The authors further demonstrated that this protein, which initiates at a nonconventional start codon, targets newly synthesized capsid proteins to the nucleolus where it is also involved in capsid assembly, hence the designation assembly-activating protein (AAP). There are three viral promoters that are known by their relative position in the viral genome: p5, p19 and p40 (Figure 1) [59,60]. Although the transcription profile differs slightly between different serotypes, in AAV2, the p5 and p19 promoters regulate the production of two overlapping mRNAs of different lengths, which both contain an intron that can be spliced out. Unspliced mRNAs encode Rep78 and Rep52, whereas Rep68 and Rep40 are encoded by spliced messages . All Rep isoforms have the central AAA+ domain in common, which has ATPase and DNA helicase activities [62–64]. Rep68 and Rep78 also have a DNA origin interaction domain, present at the N-terminus, which has site- and strand-specific nicking activity [65–67]. Rep52 and Rep78 share a putative zinc-finger domain, the role of which is not entirely clear but it is thought to interact with diverse cellular factors [68,69]. The small Rep proteins, Rep40 and Rep52, are required for efficient packaging of the AAV genome into AAV capsids [70–72]. The large Reps, Rep68 and Rep78, are essential for AAV DNA replication, as well as site-specific integration of AAV DNA into the human genome [70,73].
The right open reading frame is regulated by the p40 promoter, which produces two mRNAs that, through the use of an alternative splice acceptor site (VP1) and unusual ACG start codon (VP2), lead to the production of the three capsid proteins, VP1, VP2 and VP3, respectively . In total, 60 copies of these proteins, present in a 1:1:10 ratio, build the 20-nm icosahedral capsid .
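The stoichiometry stated above can be checked with simple arithmetic. The following Python sketch (the function name is illustrative) divides the 60 subunits according to the 1:1:10 ratio. It treats the ratio as exact, which is an idealization of what is an average over a population of capsids.

```python
from fractions import Fraction

def capsid_copy_numbers(total_subunits=60, ratio=(1, 1, 10)):
    """Split the capsid subunits among VP1, VP2 and VP3 according to `ratio`."""
    parts = sum(ratio)
    names = ("VP1", "VP2", "VP3")
    return {name: Fraction(total_subunits * r, parts)
            for name, r in zip(names, ratio)}

copies = capsid_copy_numbers()
for name, count in copies.items():
    print(name, count)
```

With the default arguments this gives 5 copies each of VP1 and VP2 and 50 copies of VP3 per capsid.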
The 4.7-kb genome is flanked by two ITRs, which are imperfect palindromes that fold back on themselves to form hairpin-like secondary structures. The 145-nt ITRs contain all of the cis-acting signals needed to support DNA replication, packaging and integration. Within the ITRs, a Rep binding site (RBS) allows for specific recruitment of the large Rep proteins (i.e., Rep68 and Rep78) to the origin of replication [77–79]. A Rep-specific endonuclease site (terminal resolution site [TRS]) is separated from the RBS by a 13-nucleotide (nt) spacer [66,80,81]. Together, RBS and TRS can act as a minimal origin for Rep-mediated DNA replication (vide infra) (Figure 1).
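The arrangement described above, a TRS and an RBS separated by a fixed spacer, lends itself to a simple sequence scan. The Python sketch below is a minimal illustration under stated assumptions. The motif strings and the test sequence are synthetic placeholders rather than authoritative AAV sequences, the TRS is assumed to lie upstream of the RBS on the scanned strand, and only exact matches on one strand are considered.

```python
def find_minimal_origins(seq, trs_motif, rbs_motif, spacer=13):
    """Return (trs_start, rbs_start) index pairs where `trs_motif` is followed,
    after exactly `spacer` nucleotides, by `rbs_motif` on the given strand."""
    seq = seq.upper()
    hits = []
    start = seq.find(trs_motif)
    while start != -1:
        rbs_start = start + len(trs_motif) + spacer
        if seq.startswith(rbs_motif, rbs_start):
            hits.append((start, rbs_start))
        start = seq.find(trs_motif, start + 1)
    return hits

# Placeholder motifs and a synthetic sequence, for illustration only.
TRS_PLACEHOLDER = "GTTGG"
RBS_PLACEHOLDER = "GCTCGCTCGCTC"
toy_seq = "AAAA" + TRS_PLACEHOLDER + "ACGTACGTACGTA" + RBS_PLACEHOLDER + "TTTT"

hits = find_minimal_origins(toy_seq, TRS_PLACEHOLDER, RBS_PLACEHOLDER)
print(hits)
```

A real search would substitute the published motif sequences, scan both strands and tolerate mismatches. The exact-match version is kept here because it makes the spacer constraint easy to see.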
Although many findings have been presented, it must be noted that some uncertainty remains with regard to viral entry and cellular trafficking. In part, the ambiguities arise from the fact that much of what we have learned is derived from observations with recombinant viruses of a variety of serotypes. It is likely that, while some aspects will be conserved among serotypes, some key steps (e.g., receptor binding and, consequently, downstream effects) will differ between the serotypes used. In addition, AAV has been shown to be a somewhat promiscuous virus, and it would not be surprising if, depending on tissues and serotypes, a range of alternative cellular trafficking routes could be exploited by these viruses. To further complicate this discussion, the majority of studies presented to date have been conducted with tissue culture-adapted viruses. As is the case for many viruses, this adaptation in AAV has resulted in changes to the virus capsid. Supporting the significance of this potential caveat is the observation that, in contrast to human isolates, the laboratory strain of AAV2 binds heparin.
In the interest of brevity, we will only highlight key findings that relate to binding and trafficking of the virus, as a more in-depth description of the underlying pathways and viral components has recently been reviewed by Parrish . To date, a number of attachment molecules have been identified. Among those are heparan sulfate proteoglycans, which are thought to act as attachment moieties for AAV2 on the cell surface . The AAV2 capsid binding partner has been mapped to amino acid residues 585 and 588 [86,87], which, unsurprisingly, are absent from human isolates . For AAV4 and AAV5, the equivalent attachment function has been determined to be provided by sialic acid molecules [88,89]. In addition, AAV2 has been shown to bind to FGF receptor-1 and the presence of this molecule on cell surfaces has been correlated to increased transduction efficiencies by rAAV2 . To date, PDGF receptor has been identified to be required for rAAV5 uptake . Interestingly, AAV5, AAV4, bovine AAV, as well as possibly AAV8 and AAV9, have been shown to be able to pass through barrier endothelia and epithelia using a pathway that is different from the normal route of infection [92,93]. Recently, it has been demonstrated that the membrane glycoprotein gp96 is the likely carrier of the chitotriose that is required for transcytosis of bovine AAV, but not the infection by this virus .
Following cell surface attachment and receptor binding, AAV enters trafficking pathways that are still under discussion, directing the particle genomes to the nucleus [95,96]. The characterization of the specific pathways for cellular trafficking has been of particular interest to the gene-transfer field, which, understandably, focuses on AAV infection in the absence of helper virus coinfection and, thus, possibly represents the pathway involved in the establishment of latent infection of wt AAV. In the presence of helper virus, hence during productive replication of AAV, it is reasonable to assume that the virus follows the trafficking pathway of the helper virus, which in the case of adenovirus would predict the viral escape from early endosomes . By contrast, in the absence of helper virus infection, AAV has been detected in late endosomes [98–100]. The fate of the virus downstream of the endosome remains somewhat elusive, although uncoating of the viral genome has been proposed to represent a rate-limiting step [97,98,100]. Recently, it has been further proposed that AAV2 enters the nucleus, where it can be sequestered into nucleoli and remain as an infective particle. The authors go on to hypothesize that recruitment from the nucleoli into the nucleoplasm then facilitates uncoating of the viral genome in order to enable transcription of the recombinant genome  or, presumably, the integration or replication of wt AAV.
DNA replication of AAV highlights some of the intriguing aspects of this nonpathogenic human virus. In this context, it must be appreciated that although AAV is among the most efficiently replicating DNA viruses (up to 10⁵–10⁶ replicated copies per cell), and has a seroprevalence of 80% among the human population, AAV has not yet been associated with any disease. This apparent contradiction would predict a tight regulation of the conditions and the components involved in productive replication of the virus. AAV has evolved to overcome this hurdle by tightly linking its replication to that of a (usually pathogenic) helper virus. With this strategy of helper-virus dependency, AAV adds a layer of replication control, which, in turn, could make the extensive replication of the virus potentially beneficial to the host by ensuring the destruction of cells that are affected by pathogenic viruses. It has yet to be determined whether this evolutionary marvel of human virology remains unique or whether possible interactions of coinfecting viruses might provide a regulatory potential that is more generally applicable to the panoply of mechanisms underlying host–virus interactions.
In a simplified scheme of AAV DNA replication, after uncoating of the virus the single-stranded genome is found in distinct areas within the nucleus [101,102]. In the presence of helper-virus components, the genome is extended in order to generate a template for transcription [103–108]. The viral Rep proteins are then expressed in a manner that involves numerous control mechanisms (e.g., the activation of the rep promoters p5 and p19 with the help of adenovirus factors) [109–114]. Rep then initiates replication on the origin, which is present within the ITRs of the virus . The primer for this initiation is provided by the 3′-ITR, which can fold onto itself and, thus, create the heteroduplex structure required for the initiation. Following second-strand synthesis, the initiating ITR needs to be resolved. This is accomplished by the binding of Rep to the ITR RBS and the subsequent Rep-induced introduction of a nick at the TRS. This nick then provides the free 3′-hydroxyl group necessary for replication through the ITR. All of the steps involved in DNA replication have been reconstituted in cell-free assays using recombinant Rep proteins that were over-expressed and purified either from vaccinia virus or Escherichia coli expression systems [2,115–119]. Together, these studies have supported a model of unidirectional, strand-displacement replication. It remains to be determined how this model will incorporate the structural information that has recently accumulated [64,120] and that, in the case of Rep68, suggests a bidirectional double-octameric ring conformation on helicase substrates .
In the case of adenovirus coinfection, the cellular replication machinery with polymerase δ supports AAV replication, together with factors provided by the helper virus. Helper functions necessary for AAV replication can be provided by a number of rather different DNA viruses, of which the best studied are adenoviruses and herpes viruses (reviewed in [118,122]). In particular, the contributions by adenovirus to AAV replication have been well established, and the helper components have been identified as E1A, E1B, E2A, E4orf6 and virus-associated RNA. The E1A gene product coregulates the expression of the AAV p5 promoter, which controls the expression of the large Rep proteins, Rep68 and Rep78 [123,124]. The YY1 element, discovered in these studies, is responsible for this regulatory step, and has since been found to be involved in the regulation of a number of cellular promoters. After relief of Rep repression, significant amounts of the replication initiator protein Rep are synthesized. In vitro studies have further suggested that Rep-mediated replication suffers from a lack of processivity if not complemented with ssDNA binding protein from the E2A gene [117,126]. The E4orf6 gene has been shown to be essential for replication on several levels. First, it was demonstrated that the E4orf6 gene product can overcome the rate-limiting step of second-strand synthesis of the single-strand virus genome. Subsequently, it has been shown that, together with E1B55k, E4orf6 counteracts, through ubiquitination and degradation, the function of the cellular Mre11 DNA repair complex, which is thought to pose a barrier to AAV replication. E1B55k and E4orf6 also mediate polyubiquitination of the small Rep proteins (Rep52 and Rep40) [128,129], although the role of this modification in AAV replication has not yet been elucidated. The function of the virus-associated RNA in AAV DNA replication remains somewhat elusive. A possible contribution is the recently observed inhibition of PKR and, consequently, of the phosphorylation of the translation factor eIF2α.
There are two possible products of AAV DNA replication, a single-stranded displacement product and a double-stranded replication product. It is tempting to hypothesize that these replication products represent intermediates for two distinct subsequent steps. A plausible scenario would be that after reaching a currently undetermined threshold in replication, the single-stranded template could be utilized for further replication rounds, while the double-stranded molecule could serve as a template for genome packaging. However, to date, these steps remain poorly characterized.
Adeno-associated virus is intriguing in that it appears to have built a near-perfect relationship with its host, as the virus is only detrimental to the host cell when the cell has already suffered an insult caused by the helper virus. In the absence of helper factors, the virus can integrate its genome into a specific locus on human chromosome 19 without causing any apparent adverse effects. The question arises as to whether the coevolution of virus and host also applies to the phenomenon of site-specific integration. Does this particular genetic environment, combined with specific host factors, provide something unique and potentially beneficial to the virus and, vice versa, does the host cell profit from this viral integration? These questions remain largely unanswered, as most observations concerning the lifecycle of AAVs were made in tissue culture and very few data have been generated that could contribute to our knowledge of the in vivo lifecycle of the virus. However, attempts have been made to identify a reservoir of latent AAV. The virus has been described to be present in a small percentage of hematopoietic cells, in the human genital tract [133–136] and in a surprisingly large percentage of muscle biopsies (17%). However, these studies did not present any data on integration of AAV in chromosome 19. In a more recent study, the presence of AAV was tested in tonsil-adenoid samples, as well as other tissues, including heart, lung, muscle, spleen and liver. A total of seven out of 101 tonsil-adenoid samples tested positive, whereas only two out of 74 other tissue samples tested positive (spleen and lung). Using sensitive, unbiased PCR techniques, the investigators were able to show that the viral genome predominantly exists as a circular double-stranded episome and only one sample (tonsil specimen) was shown to contain AAV integrated into chromosome 1, arguing that site-specific integration may not be part of the in vivo lifecycle of the virus.
Interestingly, the samples in which AAV was found, whether tonsils/adenoids or lungs, could represent the site of primary productive infection, as the vast majority of samples were collected from children between the ages of 2 and 14 years. Furthermore, the mechanism by which AAV establishes latency is rather precise and elaborate, which calls the hypothesis that AAV site-specific integration is a ‘tissue culture artefact’ into question.
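The detection frequencies reported for the tissue survey above are easier to compare as percentages. The short Python sketch below simply converts the counts given in the text. The function and variable names are illustrative, and no statistical test is implied.

```python
def positive_rate(positives, total):
    """Percentage of samples that tested positive."""
    return 100.0 * positives / total

# Counts as stated in the text: (AAV-positive samples, samples tested).
samples = {
    "tonsil-adenoid": (7, 101),
    "other tissues": (2, 74),
}

rates = {tissue: positive_rate(pos, n) for tissue, (pos, n) in samples.items()}
for tissue, rate in rates.items():
    pos, n = samples[tissue]
    print(f"{tissue}: {pos}/{n} = {rate:.1f}%")
```

This works out to roughly 6.9% of tonsil-adenoid samples and 2.7% of the other tissue samples.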
The discussion of our knowledge about integration by AAV is complicated by the fact that some basic aspects of this phenomenon have, to a large extent, eluded investigation. One of the frequently discussed yet never comprehensively addressed topics is the frequency with which wt AAV integrates into the human genome in a relevant cell. Although not statistically analyzed, our studies indicate that in human embryonic stem (hES) cells no more than 5–10% of randomly picked and unselected clones can be expected to carry AAV DNA within their target locus. This discussion becomes further complicated when the expected nonspecific integration events are taken into account. In this case, technical hurdles, as well as the choice of cells under investigation, have made a conclusive answer to this apparently trivial question elusive. For example, from the AAV vector field, we have learned that recombinant viral genomes have the propensity to attach to, and thus integrate into, preformed double-strand breaks. Consequently, the analyses of wt virus integration events must take into account this Rep-independent pathway of genome integration, in particular when cell lines such as HeLa are used, which are genotypically mosaic cultures known for frequent chromosome breaks. Wt or recombinant viruses will integrate into these loci, and such integrants can easily be misinterpreted as Rep-mediated off-target events. However, the availability of a molecular model for integration now provides hallmarks for Rep-mediated events that can be used to distinguish these distinct pathways. In the subsequent section, we outline the current knowledge of the Rep-mediated mechanism for AAV site-specific integration.
Integration of AAV was first discovered in Detroit 6 cells, which were infected with wt AAV (serotype 2), then cultured for more than 30 passages before being cloned and challenged with adenovirus infection . In 30% of the clones, AAV could be rescued after adenovirus infection, indicating that AAV must have been present in a latent form . One of those clones, termed 7374, was used to confirm integration of AAV. Kotin et al. mapped and partially sequenced one viral–viral, as well as two viral–cellular junctions, and determined that several copies of AAV, present in a tail-to-tail conformation, were integrated into a nonrepetitive DNA sequence . In a subsequent study, the same investigators used the cellular sequence flanking the AAV DNA, which they had isolated from 7374 cells, to generate a probe for Southern blot analysis, and screened a panel of independently generated latently infected human cell lines. The results showed that a particular cellular sequence had been disrupted in 78% of the wt AAV-infected cell lines, indicating that integration had occurred in a site-specific manner, into a locus that was then sequenced and termed AAVS1 [144,145].
Experiments performed in tissue culture also determined that the minimal requirements for site-specific integration of AAV are the presence of the large Rep proteins (Rep68 or Rep78), the viral RBS sequence and a cellular sequence, present in AAVS1, consisting of a TRS and RBS signal. Cotransfection of 293 cells with an ITR-containing plasmid and a plasmid expressing the large Reps demonstrated that Rep78/68 are the only viral proteins directing the site-specific integration event. The only cis-acting viral sequence required for site-specific integration appears to be the RBS since, in transfection experiments, plasmids containing the RBS [73,146] or plasmids carrying the p5 promoter, which also contains an RBS motif, are sufficient to direct ITR-containing donor DNA constructs to AAVS1. Viral TRS mutations leading to reduced nicking activity do not affect integration efficiency, showing that this viral sequence is not a major player in AAV-mediated site-specific integration. Finally, wt AAV infection of cells that stably carry an Epstein–Barr virus-based shuttle vector containing the AAVS1 preintegration site determined that the cellular sequence per se, and not higher order chromatin structure, directs this unique integration event. These experiments further showed that a 33-nt sequence containing both TRS and RBS is necessary and sufficient for AAV-mediated integration to occur in AAVS1 [149,150].
The preintegration site is located on the long arm of chromosome 19, at position 19q13.42 [151,152] and has several interesting features, including a high number of short repetitive sequences, a unique minisatellite sequence, the presence of CpG islands and putative cis-acting DNA elements, as well as a DNaseI hypersensitive site [131,145,153]. Most intriguingly, the first 500 bp of AAVS1 contain TRS and RBS sequences very similar to those found in the AAV genome , which were shown to support Rep-mediated replication of the AAVS1 sequence .
As predicted from its characteristics, a gene was identified within AAVS1, and the translation initiation start codon is located only 17 nt downstream from the RBS . This 27-kb gene encodes the protein phosphatase I regulatory inhibitor subunit 12C (PPP1R12C), also termed Mbs85 or myosin-binding subunit 85. Transcription of the 22 coding exons results in a unique 3-kb mRNA, which is ubiquitously expressed in adult human and mouse tissues. Data on Mbs85 expression during development are not yet available .
The myosin-binding subunit proteins in general are regulatory subunits of myosin light-chain phosphatase, and Mbs85 (85 kDa) is no exception, as determined by biochemical assays. Characteristic features of these proteins are the presence of a PP1-binding motif and ankyrin repeat motif at the N-terminal end of the protein, a phosphorylation inhibitory motif in the central domain and an α-helical domain with leucine zippers at the C-terminus. Mbs proteins bind through their N-terminal end to the catalytic subunit of myosin light-chain phosphatase, as well as to phosphorylated myosin light chains, by which they regulate dephosphorylation of phosphorylated myosin light chains, leading to actin–myosin disassembly. Phosphorylation of the phosphorylation inhibitory motif causes a conformational change in the Mbs proteins, leading to myosin light-chain phosphatase inactivation, shutdown of phosphorylated myosin light chain dephosphorylation and actin–myosin assembly. Given its potentially prominent role in the regulation of actin–myosin (dis)assembly and its ubiquitous expression pattern, it can be assumed that the presence of Mbs85 is essential for the development and/or adult physiology of mammals. Data from heterozygous and homozygous Mbs85-knockout mice will be necessary to prove or disprove this assumption.
The MBS85 gene is closely linked to three other genes, which also play a role in the regulation of actin interactions, namely TNNI3, TNNT1 and EPS8L1 (Figure 1) [131,158]. TNNI3 and TNNT1 code for the inhibitory subunit of the cardiac muscle troponin complex and the tropomyosin-binding subunit of the skeletal muscle troponin complex, respectively [159,160]. The primary function of the troponin complex is to control the interaction between the actin and tropomyosin filaments during muscle contraction and relaxation. EPS8L1 encodes for a protein that is thought to be involved in signal transduction leading to actin cytoskeleton remodeling [161,162].
In summary, the human target site substitutes a very gene-dense region of chromosome 19, which contains genes whose disruption by AAV-mediated integration could have detrimental effects on the cell.
Since the discovery of site-specific integration of AAV in the latently infected cell line 7374, numerous latently infected cell lines have been generated, and attempts have been made to uncover the molecular organization of the integrated provirus. The presence of site-specifically integrated AAV is usually determined by Southern blot analysis using an MBS85-specific and virus-specific probe. Disruption of the MBS85 gene and comigration of virus-positive and disrupted MBS85-positive band(s) indicate site-specific integration of AAV. Extensive rearrangements of the MBS85 gene have been observed; however, they are always confined to one allele. The integration event is often further analyzed by the identification of the viral–cellular junctions, and an extensive analysis of the junction sequences published to date showed that the vast majority of integration events are scattered throughout the first exon and intron of the MBS85 gene [143,152,158,163–168]. Most of the junctions are found in close proximity to the TRS/RBS signal. This observation may not be surprising, in light of the fact that most of the PCR primers used specifically bind to this particular region in the MBS85 gene. When unbiased approaches (e.g., linker-mediated PCR) were used to identify the viral–cellular sequences, it was revealed that integrated AAV can be found at a significant distance from the TRS/RBS sequences [140,169]. Careful analysis of the junction sequences also revealed that many of those show a short homology (1–5 nt) between AAV and MBS85 [164,168,170], and that the viral ITR and p5 promoter are often present at the recombination junction. In addition, the ITR sequences present at the junction are most often partially deleted.
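The short homology at a viral–cellular junction can be made concrete with a small sketch. It assumes that a junction read has already been aligned so that it begins in viral sequence and ends in cellular sequence; the function names and the toy sequences are hypothetical placeholders and are not taken from any published junction. Bases that can be assigned to both parents are counted as microhomology, and bases that can be assigned to neither are counted as unassigned.

```python
def common_prefix_len(a: str, b: str) -> int:
    """Number of leading positions at which a and b agree."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n


def classify_junction(read: str, viral_ref: str, cellular_ref: str) -> dict:
    """Classify a junction read (hypothetical helper, not a published method).

    Assumes viral_ref is aligned to the start of the read and
    cellular_ref is aligned to the end of the read.
    """
    # How far the read follows the viral parent from its 5' end.
    viral_match = common_prefix_len(read, viral_ref)
    # How far the read follows the cellular parent from its 3' end.
    cellular_match = common_prefix_len(read[::-1], cellular_ref[::-1])
    # Positive overlap: bases shared by both parents (microhomology).
    # Negative overlap: bases explained by neither parent.
    overlap = viral_match + cellular_match - len(read)
    return {
        "microhomology_nt": max(overlap, 0),
        "unassigned_nt": max(-overlap, 0),
    }


# Made-up sequences for illustration only.
viral_parent = "TTGACCAGTCAGGGTTT"
cellular_parent = "AAACCCGCATGGCATCG"
junction_read = "TTGACCAGTCATGGCATCG"
print(classify_junction(junction_read, viral_parent, cellular_parent))
```

Real junction analysis would also have to tolerate sequencing errors and the partially deleted ITRs, which this exact-match sketch does not attempt.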
Despite the knowledge that AAV integration causes MBS85 rearrangements and that the point of integration is not exact, but rather scattered throughout the first exon and intron, very little is known about the molecular organization of integrants, as the complete characterization of a full AAV genome has yet to be published. In fact, most of our knowledge about AAV site-specific integration is based on the characterization of one of the two viral–cellular junctions, mostly containing the right ITR.
In an attempt to gain a better insight into the molecular organization of this unique integration event, we used newly generated, latently infected human cell lines and identified the sequences of the junctions with the left as well as the right ITR. In agreement with previously published data, the ITRs are partially deleted, and the breakpoints lie close to the viral RBS motif. Other hallmarks are the microhomology between the viral and cellular sequences, the occasional presence of unknown sequences between AAV- and MBS85-specific nucleotides and the close proximity of the right junction to the cellular TRS motif. Interestingly, all left ITR-containing junctions were found more than 9 kb downstream from the TRS/RBS signal. The second surprising feature is the 5′–3′ direction of the MBS85 sequences adjacent to the left and right ITRs. Namely, all MBS85 sequences present upstream, as well as downstream of the integrated provirus, are oriented in the 5′–3′ transcriptional direction of the gene.
Altogether, these observations prompted us to test the hypothesis that AAV integrates by duplicating the upstream MBS85 sequences, while leaving the downstream sequences virtually unaltered (Figure 2). Southern blot analyses and PCR-based assays confirmed that the proposed duplication-based integration event had, indeed, taken place, both with wt AAV, as well as with rAAV, where Rep had been provided in trans. These intriguing new molecular characteristics of AAV site-specific integration gave us a clearer view of a possible mechanism of site-specific integration, and allowed us to extend the existing model.
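One way to read the duplication-based structure is in coordinates: if the left junction lies downstream of the right junction and both flanks keep the 5′–3′ orientation of the gene, the stretch between the two junctions must be present on both sides of the provirus. The sketch below only illustrates that arithmetic. The function name, the coordinate window and the numbers in the example are assumptions chosen for illustration (a left junction 9 kb downstream of a right junction, and an approximate provirus length), not measured values.

```python
def duplication_model(gene_length: int, left_junction: int,
                      right_junction: int, provirus_length: int) -> dict:
    """Describe an integrated allele under the duplication hypothesis.

    Coordinates run 5'->3' along the unmodified gene window.
    left_junction:  last gene position retained upstream of the provirus.
    right_junction: first gene position retained downstream of the provirus.
    All names are hypothetical; this is a bookkeeping sketch only.
    """
    if not 0 <= right_junction <= left_junction <= gene_length:
        raise ValueError("expected 0 <= right_junction <= left_junction <= gene_length")
    duplicated_bp = left_junction - right_junction
    return {
        # Gene sequence found 5' of the provirus (includes the duplicated part).
        "upstream_flank": (0, left_junction),
        # Gene sequence found 3' of the provirus, left essentially unaltered.
        "downstream_flank": (right_junction, gene_length),
        # Region present on both sides of the provirus.
        "duplicated_region": (right_junction, left_junction),
        "duplicated_bp": duplicated_bp,
        "allele_length": gene_length + provirus_length + duplicated_bp,
    }


# Illustrative numbers only: right junction at window position 100,
# left junction 9 kb further downstream, provirus of roughly 4.7 kb.
model = duplication_model(gene_length=20_000, left_junction=9_100,
                          right_junction=100, provirus_length=4_700)
print(model["duplicated_bp"], model["allele_length"])
```

Under this reading, every position downstream of the right junction is still present in its original order after the provirus, which is why the downstream sequences can remain functionally intact.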
It was previously proposed that, prior to integration, a circular form of AAV donor DNA is tethered to the cellular RBS through simultaneous binding of the viral Rep78/68 proteins to viral and cellular RBS, after which these proteins introduce a strand-specific nick at the cellular TRS. The Rep78/68–RBS complex formation, as well as the Rep-dependent nicking of the AAVS1 TRS, have been demonstrated in cell-free assays [154,155,171]. The next step involves DNA replication starting from the free 3′-OH terminus created by the site-specific nick, and is dependent on the cellular replication machinery [119,150,155,172]. Given the observation that the junction with the left ITR is most often located at a significant distance from the nicking site, it appears that the extent of DNA replication is larger than previously contemplated. The next step in this model involves the replication machinery switching templates onto AAV, generating the junction with the left ITR. In order to account for the previously ‘unknown’ DNA stretches found at one of the junctions, it is hypothesized that after replication of the AAV template, the replication fork switches back to replicating the MBS85 sequences adjacent to the left junction, thereby generating a short sequence that, after integration is complete, can be found at the opposite (right) junction, as was observed in some of our latently infected cell lines. As for wt AAV integration, replication of AAV does not usually cease after one round, but continues, similar to rolling-circle replication, generating head-to-tail AAV concatemers. In order to generate the overall MBS85 duplication we observed, the displaced single strand, which is covalently bound to Rep by its 5′-end, forms a junction with the 3′-end of the newly replicated strand; a step that is potentially mediated by Rep's proposed ligation activity.
The structure of the left and right junctions suggests that the first recombination event involves the left side of the viral genome, while the right junction occurs at a later stage, thereby defining the 5′–3′ orientation of the integrant. It is possible that the viral p5, or an exogenous promoter in the case of rAAV, plays an important role in the formation of the initial recombination complex. Since the TRS–RBS motifs are located within the 5′-UTR of the MBS85 gene, it is likely that one or more factors of the cellular transcriptional machinery are involved in the positioning of the viral donor molecule within this complex. The integration is finalized through the introduction of a second nick in the template strand, generating a free 3′-OH terminus, from which the DNA polymerase can fill in the newly generated AAV–AAVS1 sequences. Alternatively, the template strand is completed during the consecutive round of cell division. In Figure 3, we propose a simplified model for the mechanism of site-specific integration.
Although only a few aspects of this integration model have formally been proven, it forms the ideal platform to test some, if not all, of the molecular aspects of the integration mechanism. In addition, the proposed duplication-based integration implies that AAV potentially integrates without a functional disruption of the altered MBS85 allele. The molecular structure of the integrant is such that, in some cases, the downstream promoter remains intact, allowing for biallelic expression of the MBS85 gene (vide infra and Figure 2). Alternatively, the observed duplication leaves the possibility that the viral and duplicated cellular DNA can be spliced out, which in turn could restore normal MBS85 expression. We are currently investigating whether MBS85 expression, postintegration, is unchanged in all latently infected cell lines, and whether expression from the duplicated allele could lead to unstable or aberrant MBS85 mRNA products. The question remains as to whether AAV and its mode of integration have evolved to specifically maintain two functional copies of MBS85, or if this integration mechanism is most suitable for AAV as the initiation of site-specific integration and viral replication are not that dissimilar.
The ability of AAV2 to integrate site specifically into the human genome and, in doing so, to employ a mechanism that potentially secures normal expression of the target gene warrants further studies that address the safety and feasibility of Rep-mediated targeted transgene insertion for gene therapy purposes. Several years ago, an AAV integration site was found in the African green monkey genome. This site has 98% homology to the human site, and contains TRS and RBS motifs very similar to those present in the human AAVS1. The investigators were able to demonstrate site-specific integration of AAV into the simian site. However, the discovery of a mouse ortholog of AAVS1 brought us closer to a feasible animal model, which permits functional characterization of AAV site-specific integration, an important step towards establishing if integration in MBS85 is safe. Although transgenic mice carrying randomly integrated human AAVS1 have been created and used to prove that site-specific integration of AAV can occur in vivo, they do not represent a model to study the downstream effects of integration in this particular chromosomal context [177–179].
The mouse ortholog of the target site is present on chromosome 7 in a region that is syntenic to the AAVS1-containing region on human chromosome 19, and both share the same overall genomic organization. The Mbs85 gene spans 20 kb, and the deduced protein sequence is 86% identical to its human counterpart. The TRS/RBS motifs are located 25 nt upstream of the translation initiation codon and have been shown to undergo Rep-mediated strand-specific nicking, and support replication in cell-free assays. Based on coinfection with rAAV (GFP and neomycin resistance genes) and wt AAV (providing REP) viruses, it was shown that AAV can integrate site specifically into the mouse ortholog. Careful analysis of the viral–cellular junction sequences revealed that the junction with the left ITR was located 8.5 kb downstream from the TRS, whereas the junction with the right ITR was present close to the TRS/RBS motif, essentially sharing the same overall molecular organization as the integrants characterized in human cells. Southern blot, as well as PCR assays, demonstrated that one copy of rAAV had integrated through the same duplication-inducing mechanism, which was observed in wt AAV-infected human cell lines. There was no evidence for integration of wt AAV. Since Mbs85-targeted integration of AAV had been achieved in mouse ES cells, these cells could then be used to functionally characterize this particular mode of integration. Mouse ES cells carrying site-specifically integrated rAAV performed as well as their unmodified counterparts in a series of stringent in vitro differentiation assays, providing evidence that the integration event did not have any discernable effects. Moreover, the cells maintained their ability to fully participate in mouse development when injected into blastocysts in vivo.
In addition, as recombination with the right ITR had occurred 262 nt upstream of the TRS, probably owing to nicking at a cryptic TRS (the existence of which had been demonstrated in a cell-free nicking assay), the activity of the promoter downstream of the right junction remained unaltered and, thereby, secured normal Mbs85 expression levels. During the course of these studies, it became clear that GFP expression, encoded by the site-specifically integrated rAAV vector, remained strong throughout differentiation. Thus, the integration properties of AAV appear to have evolved to minimally interfere with the host genome, and allow for integration into a site that appears to be highly suitable for efficient transgene expression.
Gene transfer to cells with proliferative potential requires integration of the transgene into the host genome in order to achieve lifelong expression. To date, gene therapy studies have mainly employed γ-retroviral vectors, which establish persistence through integration in a largely random fashion. Inherent to this approach is that the chromosomal context and, thus, the expression of a transgene will vary between vector-transduced cells. In addition, while in differentiated cells the potential for insertional mutagenesis might be negligible, in cells with high proliferation potential (e.g., stem cells) this aspect becomes important. The mutagenesis potential has been documented by the emergence of leukemia, as a result of retrovirally mediated gene therapy of X-linked severe combined immunodeficiency (SCID-X1) in an otherwise highly successful clinical trial [181–183]. This observation underlines the potential value of new gene transfer approaches, which have a significantly reduced risk of insertional mutagenesis.
Before describing various strategies for the addition of transgenes, it is worthwhile noting that AAV has the ability to significantly increase the frequency of gene targeting through homologous recombination, with frequencies as high as 1%, allowing for high-fidelity nonmutagenic DNA repair in host cells (reviewed in ). Although the underlying reasons for this targeting efficiency, which is orders of magnitude higher than comparable in vitro approaches using plasmid transfections, remain largely unknown, this strategy has successfully been applied in a number of disease models in vitro and in vivo [185–187]. In addition, a proof-of-principle study in induced pluripotent stem cells has been presented .
Additional recently developed strategies for site-specific genome modification include the use of zinc finger nucleases, homing nucleases and bacteriophage integrases. Zinc finger nucleases consist of a zinc finger DNA binding domain fused to a nonspecific nuclease domain, and are designed to introduce double-strand breaks at a specific target locus [189,190]. When provided together with the desired DNA donor, they have been demonstrated to significantly increase homologous recombination-based gene targeting efficiencies. Zinc finger nucleases are typically engineered to contain two subunits designed to bind ‘half-sites’ (9 bp) that are separated by a spacer sequence. Each subunit is connected to a FokI endonuclease domain by a short linker, and is positioned in a tail-to-tail configuration. Upon dimerization, the endonuclease is activated and introduces a double-strand break in the spacer sequence. Although zinc finger nucleases can theoretically be designed to target any chromosomal location, experience has shown that it can be challenging to develop effective nucleases that lack cytotoxicity. Off-target cleavage due to nonspecific binding of the zinc finger domain or nonspecific cleavage by the FokI domain can lead to apoptosis and cell death. Human loci that have been targeted by zinc finger nucleases include CCR5, IL2Rγ [195,196] and AAVS1. Interestingly, expression of the transgene, targeted to AAVS1 by zinc finger nucleases, was found to be robust. This was in accordance with what was observed with AAV-mediated transgene addition to AAVS1.
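The half-site geometry of a zinc finger nuclease pair can be illustrated with a short sequence scan. This is a minimal sketch under stated assumptions: one subunit is taken to read its 9-bp half-site on the top strand and the other on the bottom strand, only exact matches are reported, and the permitted spacer lengths of 5 to 7 bp are an assumed default rather than a value given in the text. The function names and the example sequences are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Reverse complement of an uppercase DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def find_zfn_sites(sequence: str, left_half: str, right_half: str,
                   spacer_lengths=(5, 6, 7)) -> list:
    """Find exact composite sites: left half-site, spacer, right half-site.

    left_half is written 5'->3' on the top strand and right_half 5'->3'
    on the bottom strand. spacer_lengths is an assumed default.
    Returns (start, spacer_length, spacer_sequence) tuples.
    """
    if len(left_half) != 9 or len(right_half) != 9:
        raise ValueError("half-sites are expected to be 9 bp long")
    sequence = sequence.upper()
    left = left_half.upper()
    # The bottom-strand half-site as it appears when reading the top strand.
    right_on_top = reverse_complement(right_half.upper())
    hits = []
    for spacer in spacer_lengths:
        total = 9 + spacer + 9
        for start in range(len(sequence) - total + 1):
            window = sequence[start:start + total]
            if window.startswith(left) and window.endswith(right_on_top):
                # The FokI dimer would be expected to cut within this spacer.
                hits.append((start, spacer, window[9:9 + spacer]))
    return sorted(hits)


# Made-up half-sites and target, for illustration only.
target = "GG" + "ACCGGTTAC" + "TTTAAA" + "ATTGGCCAA" + "CC"
print(find_zfn_sites(target, "ACCGGTTAC", "TTGGCCAAT"))
```

Relaxing the exact-match test to allow a few mismatches per half-site would turn the same scan into a crude search for candidate off-target sites, which is the concern raised above regarding cytotoxicity.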
Homing endonucleases or meganucleases are proteins harboring DNA binding, as well as endonuclease activities. They recognize long DNA sequences (12–45 bp) and can be engineered to recognize specific sequences in the mammalian genome. Although the engineering challenges and genotoxicity may be similar to those observed for zinc finger nucleases, successful attempts have been made for the introduction of double-strand breaks in the human XPC and RAG1 genes [200–202].
Bacteriophage integrases have also been exploited for site-specific genome modification. ΦC31 integrase, the best-characterized bacteriophage integrase used for gene therapy studies, is a serine recombinase that catalyzes the unidirectional reaction between the attB bacterial attachment site (34 bp) and the attP phage attachment site (39 bp). This system can be used to target a transgene-containing plasmid, which is designed to carry the attB sequence, to pseudo attP sites present in the mammalian genome. Despite the absence of serious adverse effects in animal models treated with this technology, the specificity of integration is limited in that insertion occurs in several locations in the genome. Efforts have been made to increase the efficiency and specificity of these integrases by the introduction of mutations. An additional challenge is that the integrase occasionally carries out aberrant events leading to intrachromosomal deletions and interchromosomal rearrangements.
To our knowledge, AAV genome integration is the only example of targeted genome addition that has evolved to occur in eukaryotes. A variety of viruses establish latency by integrating their genome into the host genome. The integration event generally occurs in a nonspecific manner, precluding the prediction of functional consequences from resulting disruptions of affected host genes. By contrast, AAV integrates site specifically, thereby offering the possibility to study potential adverse effects on the host prior to its use in gene and cell therapy trials. In addition, the discovery of a mouse ortholog of the human integration site enables studies that address the safety aspects of Rep-mediated targeted integration.
We have begun to understand several aspects of the unique mechanism of integration of AAVs. However, for the development of AAV-mediated targeted gene addition in therapeutically relevant approaches, many questions remain to be addressed, including the identification of safe delivery strategies for the Rep protein and the efficiency of successful integration events. In our hands, temporary p5-controlled Rep expression yielded an acceptable frequency of MBS85-targeted hES cell clones in the absence of visible cytotoxicity. However, further experiments will need to establish the threshold of protein levels that will affect cellular physiology, either through genotoxic Rep effects (such as AAVS1 amplification) or through the potent general transcriptional repression that has been attributed to Rep in overexpression experiments.
Human ES cells and induced pluripotent stem cells are two types of stem cells that hold the promise for the development of a new generation of cell-based therapies. Induced pluripotent stem cells in particular can be derived from tissues of patients suffering from inherited or chronic degenerative diseases, offering the possibility for researchers to use these cells to study the molecular basis of the disease process, screen new drugs, or even correct genetic defects in cells that could then be used for transplantation. However, the success of this new line of research and the development of cell replacement therapies will rely on safe and efficient genetic modification of these cells. The ability of AAV to mediate targeted transgene integration to a specific site in the human genome that possibly allows for optimal and regulated transgene expression makes AAV, in our view, the ultimate tool to genetically manipulate these cells. For example, the generation of hES cells carrying an AAVS1-targeted marker gene aided in the identification of hES cell-derived cardiac progenitor cells that were able to functionally integrate into a mouse heart.
In addition, the development of new targeting vectors with increased tropism for hematopoietic stem cells could lead to the development of safer gene therapy treatment options for diseases, including SCID.
Adeno-associated virus has evolved a near-perfect relationship with its host, and it may be only one of many such human viruses that are yet to be discovered. The molecular strategies employed by this virus enable the exclusive replication of AAV in cells that have potentially lost their utility to the host (through coinfection with adenovirus or herpes viruses) or in cells that might represent significant risks to the host (in the case of papilloma virus-infected cells). In this article, we highlight key aspects of the molecular mechanisms that support this unique viral lifestyle with particular emphasis on the establishment of latency by site-specific genome integration. Although the in vivo relevance of this somewhat sophisticated virus–host interaction has not yet been established, we propose that the underlying mechanisms invite themselves as an ideal platform for the development of approaches that include the addition of genes into defined chromosomal regions, both for basic research purposes and for future clinical treatment regimens.
Financial & competing interests disclosure: The authors have no relevant affiliations or financial involvement with any organization or entity with a financial interest in or financial conflict with the subject matter or materials discussed in the manuscript. This includes employment, consultancies, honoraria, stock ownership or options, expert testimony, grants or patents received or pending, or royalties.
No writing assistance was utilized in the production of this manuscript.
Papers of special note have been highlighted as:
of considerable interest