Whole-genome sequencing of human adenovirus type 11 (HAdV-11) strain QS, isolated in China, was conducted, and its sequence was compared with the sequences of strains within the species of HAdVs. The HAdV-11 QS genome contains 34,755 nucleotides. Similar to the other HAdV subgenus B sequences, the HAdV-11 QS genome coded 37 functional proteins and could be divided into four early, two intermediate, and five late transcription regions. The amino acid sequences of the fiber and the hypervariable regions (HVRs) within the hexon gene of HAdV-11 QS were identical to the corresponding sequences of the HAdV-11a strain; further analyses that compared those amino acid sequences with the amino acid sequences of the HAdV species subgenus B:2 strains revealed that the highest degree of homology (>99.2%) existed between HAdV-11 QS and the prototypical HAdV-14 strain, except for a few coding sequences of HVRs within the hexon gene, DNA polymerase, pVI, and pre-terminal protein. This indicates that HAdV-11 strain QS, isolated in China, is a recombinant adenovirus of HAdV-14, and the recombination analyses also confirmed this finding. It is difficult to clarify the time and manner of the recombination, and further investigations are required to determine whether the emergence of recombination between HAdV-11a and HAdV-14 might increase virulence, thereby posing a new global challenge with regard to acute respiratory diseases in the near future.
Adenoviruses (AdVs) have thus far been classified into five genera, namely, Atadenovirus, Aviadenovirus, Mastadenovirus, and Siadenovirus (2) and an unclassified genus within Adenoviridae. Human AdVs (HAdVs) belonging to the Mastadenovirus genus are composed of seven subgenera (subgenera A to G), including 52 serotypes, on the basis of their hemagglutination, oncogenicity in rodents, restriction fragment analyses, tropism, genome homology (3, 13), and serum neutralization ability. Generally, AdV infections are subclinical; however, AdVs are associated with a broad spectrum of illnesses, including conjunctivitis, febrile upper respiratory tract illness, pneumonia, and gastrointestinal diseases (34). HAdV type 11 (HAdV-11), which belongs to subgenus B:2, can further be divided into at least two genomic types, designated HAdV-11p and HAdV-11a, on the basis of the similarities in the fragment comigration patterns during restriction genome typing (20). HAdV-11p is generally the causative agent of kidney and urinary tract infections, while HAdV-11a is associated with respiratory tract infections. HAdV-14, another component of subgenus B:2, has been recognized as a pathogen that causes acute respiratory illnesses (24, 34), and these subgenus B:2 viruses have generally been recognized to be mild or rare pathogens. Until recently, HAdV-11a has been found in China, Spain, Turkey, and Latin America (7, 14, 20), while diseases associated with HAdV-14 have also been reported from many other regions, such as Taiwan (6) and the United States (24). However, inconsistent with the findings of previous reports, AdVs appear to have been exhibiting a tendency to cause more severe clinical outcomes, including death (21, 24).
In 2006, an outbreak of HAdV-associated diseases occurred in Qishan County of Shaanxi Province, China. One of the patients died of acute respiratory disease-induced multiple-organ failure during the outbreak. On the basis of the epidemiological data and the results of multiple laboratory assays (36), HAdV-11a was identified as the pathogen responsible for that outbreak. Interestingly, we found that there was the possibility of intraspecies recombination between HAdV-11a and HAdV-14 on the basis of whole-hexon-gene sequence analysis (36). Our proposition differed from the findings in previous reports that HAdV-11 is very closely related to serotypes 34 and 35 (20, 31). In order to gain a better understanding of this phenomenon, we determined and analyzed the whole-genome sequence of HAdV-11 strain QS isolated from that outbreak.
The viral strain designated HAdV-11 QS in this study was isolated from hydrothorax fluid samples from the patient who died during the AdV outbreak in Qishan County, Shaanxi Province, China, in 2006. The virus was cultured and propagated in a HEp-2 cell line, which was maintained in minimal essential medium supplemented with 2% fetal bovine serum. Viral DNA was extracted from the infected cultures by using a QIAamp DNA mini kit (Qiagen, Valencia, CA), according to the manufacturer's instructions. The DNA was solubilized with an elution buffer and was stored at −70°C until further use.
Appropriate primers were designed according to the HAdV sequences available from GenBank. For genome sequencing, a standard PCR with 61 primer pairs was conducted to amplify the overlapping fragments encompassing the entire genome of HAdV-11 QS, except its 5′ and 3′ termini. For the terminal genome sequences, a method similar to that which uses a 5′/3′ rapid amplification of cDNA ends kit was followed. First, the covalent junction between the purified DNA template and the terminal protein (TP) was broken by the addition of 0.4 N NaOH. The mixture was then incubated at room temperature for 1 h and further purified by chromatography. Subsequently, in the presence of T4 DNA ligase (Promega), the blunt-ended DNA templates without TP were linked by using a DNA double-strand linker containing two synthesized single strands (5′-CGGTCGTGAGTGCTTATAG-3′ and 5′-PCTATAAGCACTCACCGTC-3′). Lastly, heminested PCR was used for amplification of the terminal DNA sequences to obtain the sequencing templates. In addition to the common primer (5′-CGGTCGTGAGTGCTTATAG-3′), external primers 5′-CAGTCCACGGAACTCAAAT-3′ and 5′-CTGCACCATTCCCAGTA-3′ and internal primers 5′-TCTTCTCGCTGGCACTCA-3′ and 5′-AGCCATGGCTTACCAGAC-3′ were used to amplify the 5′- and 3′-terminal ends, respectively.
The amplified fragments were purified by agarose gel electrophoresis by using a QIAquick gel extraction kit (Qiagen, KK, Japan). The sequencing reactions were performed bidirectionally with the appropriate primers and cycle sequencing kits (ABI Prism BigDye Terminator, version 3.1; PE Applied Biosystems) and were then resolved with a model 3100 genetic analyzer (Applied Biosystems).
The DNA sequences were assembled with the Sequencher program (version 4.0.5; Gene Codes Corporation). We divided the whole AdV genome into 1-kb-long nonoverlapping segments by the method of Lauer et al. (19) and systematically queried the sequence against the sequences in the nonredundant NCBI database using the BLASTX program. We searched the database using the default parameters with the BLOSUM62 matrix and gap penalties of 11 (existence) and 1 (extension). The DNA and protein sequence alignments were created by using BioEdit sequence alignment editor software (version 5.0.9; Tom Hall, North Carolina State University) and an online gene-wise program (http://www.ebi.ac.uk/Wise2/advanced.html). Phylogenetic tree analyses and constructions were conducted by using the Mega program (version 4.0; Sudhir Kumar, Arizona State University). The splice position and functional genome were predicted by using an online splice program (http://www.fruitfly.org/seq_tools/splice.html) and the online GENSCAN program (http://genes.mit.edu/GENSCAN.html). Recombination analyses among HAdV strains 11p, 14, 34, 35, and QS were performed with SimPlot software (Johns Hopkins University School of Medicine, Baltimore, MD).
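The segmentation step that precedes the BLASTX queries can be illustrated with a minimal Python sketch. It is not the procedure of Lauer et al. (19) itself; the function name `split_into_segments` is hypothetical, and the input below is a placeholder string of the reported genome length rather than real sequence data.

```python
def split_into_segments(sequence, segment_length=1000):
    """Split a sequence into consecutive nonoverlapping segments.

    Returns (start, end, subsequence) tuples with 1-based inclusive
    coordinates. The final segment may be shorter than segment_length.
    """
    segments = []
    for offset in range(0, len(sequence), segment_length):
        chunk = sequence[offset:offset + segment_length]
        segments.append((offset + 1, offset + len(chunk), chunk))
    return segments


# Placeholder with the reported genome length (34,755 nt); not real data.
placeholder_genome = "N" * 34755
segments = split_into_segments(placeholder_genome)
print(len(segments), segments[-1][:2])  # 35 (34001, 34755)
```

Under this assumption a 34,755-nt genome yields 34 full 1-kb segments and one final segment of 755 nt, each of which would be submitted as a separate query.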
The nucleotide sequence of the whole genome of HAdV strain QS, which was determined in this study, has been deposited in the GenBank nucleotide sequence database under accession number FJ643676.
The whole genome of HAdV strain QS comprised 34,755 nucleotides (nt), and the plus strand had a base composition of 26.1% A, 24.4% C, 24.4% G, and 25.1% T. Similar to the sequences of other HAdV subgenus B strains, the HAdV-11 QS genome also coded 37 functional proteins, which could be divided into four early, two intermediate, and five late transcription products. The detailed genome organization, putative splice sites, and polyadenylation signals are described in Table 1.
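A base composition of this kind can be computed from the plus-strand sequence with a few lines of Python. The sketch below is illustrative only: `base_composition` is a hypothetical helper, ambiguous bases are assumed to be excluded from the total, and the input is a toy string rather than the QS genome.

```python
from collections import Counter


def base_composition(sequence):
    """Return the percentage of A, C, G, and T among unambiguous bases."""
    counts = Counter(sequence.upper())
    total = sum(counts[base] for base in "ACGT")
    if total == 0:
        raise ValueError("sequence contains no A, C, G, or T")
    return {base: round(100.0 * counts[base] / total, 1) for base in "ACGT"}


# Toy sequence for illustration; not the HAdV-11 QS genome.
toy = "AACCGGTTAT"
composition = base_composition(toy)
print(composition)  # {'A': 30.0, 'C': 20.0, 'G': 20.0, 'T': 30.0}
```

Applied to the deposited sequence (accession number FJ643676), the same calculation would be expected to reproduce the reported percentages, which correspond to a G+C content of 48.8%.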
The inverted terminal repeats (ITRs) were 137 bp in length and existed at the two ends of the HAdV-11 strain QS genome, and their function was probably associated with the initiation of DNA replication by a strand displacement mechanism (4). The extreme terminal sequence of the ITRs (CATCATCAAT) was relatively conserved and was the same motif as that found in subgenera B (HAdV-11p, HAdV-14, HAdV-34, and HAdV-35), C (HAdV-1 and HAdV-5), D (HAdV-17, HAdV-26, and HAdV-46), E (HAdV-4), and F (HAdV-40). Between bases 9 and 18 of HAdV-11 QS there was another conserved motif (ATAATATACC) which was present in almost all HAdVs and which was reported to be directly involved in interactions with a complex of pre-TP (pTP) and DNA polymerase within the origin of DNA replication (32).
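Because the two ITRs are inverted copies of each other, their length can be estimated from a single strand by pairing bases inward from the two ends until the first position that is no longer complementary. The following Python sketch shows this idea on a toy sequence; the function name `itr_length` is hypothetical, and the approach assumes perfect complementarity with no mismatches inside the repeat.

```python
def itr_length(genome):
    """Length of the terminal inverted repeat of a linear genome.

    Counts how many bases, read inward from the 5' end, are complementary
    to the bases read inward from the 3' end of the same strand.
    """
    genome = genome.upper()
    pairs = {"A": "T", "C": "G", "G": "C", "T": "A"}
    length = 0
    while (length < len(genome) // 2
           and pairs.get(genome[length]) == genome[-1 - length]):
        length += 1
    return length


# Toy genome: the conserved terminal motif, a spacer, and the motif's
# reverse complement at the other end. Not real sequence data.
toy = "CATCATCAAT" + "GAGAGAGA" + "ATTGATGATG"
print(itr_length(toy))  # 10
```

A genome whose ITRs contain mismatches would need a tolerant comparison (for example, a local alignment of the two ends), which this sketch does not attempt.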
It is recognized that the ends of the HAdV genome, including the E1 and E4 regions, are transcribed first, followed by transcription of the delayed early units (IX, IVa2, and E2 late) and the major late transcriptional units. The early transcription units of the HAdV genomes contained five components (E1A, E1B, E2, E3, and E4) (4). Compared with the sequence of HAdV-11p (GenBank accession number AY163756), the core transcription promoters of the early transcription units (E1A, E1B, E2, E3, and E4) of HAdV-11 QS were predicted to be the TATTTATA (nt 495 to 502), TATATA (nt 1562 to 1567), TATATTAT (nt 26844c to 26837c, where c indicates the complementary strand), TATAAAAA (nt 26841 to 26847), and TATATATA (nt 34454c to 34447c) motifs, respectively. On the basis of the differences in the polyadenylation signals, the E1 transcription unit could further be differentiated into two parts, namely, E1A and E1B. The proteins encoded by E1A probably have an important role in transcriptional activation and stimulating the host cell into the S phase, while those of E1B might inhibit the apoptosis mediated by p53 (4). Similarly, the E2 region could also be divided into E2A and E2B, which are probably crucial for AdV DNA replication (4) and which are located in the complementary strand. In all, three nonstructural proteins were encoded in the E2 region: DNA-binding protein (DBP), pTP, and DNA polymerase. In addition to the TATA box, there was an SP1 transcription factor-binding site (GGGCGG; nt 23505c to 23500c) that encoded the DBP. The DBP amino acid sequence alignments among HAdV subgenera A to G showed that the DBP of HAdV-11 strain QS contained four conserved regions (CR1 to CR4) and two zinc-binding sites, one of which was HGCNDYEGKLKCLH and the other of which was four discontinuous cysteines.
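The coordinate convention used for these promoter motifs, in which positions on the complementary strand are given in plus-strand numbering with a trailing "c" and in descending order, can be made concrete with a short Python sketch. It is a plain exact-match search, not the promoter prediction used in this study; `find_motif` and the toy sequence are illustrative assumptions.

```python
import re

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(sequence):
    """Reverse complement of an unambiguous DNA sequence."""
    return sequence.upper().translate(_COMPLEMENT)[::-1]


def find_motif(genome, motif):
    """List exact occurrences of motif on both strands.

    Coordinates are 1-based plus-strand positions. Plus-strand hits are
    (start, end, "+") with start < end; complementary-strand hits are
    (start, end, "c") with start > end, as in "nt 26844c to 26837c".
    """
    genome = genome.upper()
    motif = motif.upper()
    hits = []
    # A lookahead pattern lets overlapping occurrences be reported.
    for match in re.finditer("(?=" + re.escape(motif) + ")", genome):
        hits.append((match.start() + 1, match.start() + len(motif), "+"))
    rc = reverse_complement(motif)
    for match in re.finditer("(?=" + re.escape(rc) + ")", genome):
        hits.append((match.start() + len(motif), match.start() + 1, "c"))
    return hits


# Toy sequence carrying TATTTATA on the plus strand (nt 3 to 10) and on
# the complementary strand (nt 21c to 14c). Not real sequence data.
toy = "GGTATTTATACCCTATAAATAGG"
print(find_motif(toy, "TATTTATA"))  # [(3, 10, '+'), (21, 14, 'c')]
```

A motif that equals its own reverse complement, such as TATATA, is reported on both strands at the same location, so hits of that kind need to be interpreted together with the orientation of the neighboring gene.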
The E3 open reading frames (ORFs) of HAdV-11 QS encoded a total of eight predicted proteins (12.2, 14.3, 18.5, 20.3, 20.2, 10.3, 15.2, and 15.3 kDa) that were probably involved in evasion from the host immune response but that were not indispensable for viral growth and propagation (27). The E3 region is therefore a potential insertion site for gene therapy. In addition, a 9-kDa protein that was present only in subgenus B:1 (31) was not found within the E3 region of HAdV-11 QS, and on the basis of that finding, HAdV-11 QS could further be confirmed to belong to subgenus B:2. The E4 region encoded five predicted proteins (14.2, 14.3, 13.6, 14.2, and 34.6 kDa) that probably possess a range of functions, such as the modulation of DNA replication, mRNA transcription, transport, translation, and cell apoptosis (4). Each of the splice acceptor and donor sites in the early transcription genes was determined by using Splice Site Finder software (available online at http://www.fruitfly.org/seq_tools/splice.html) and is described in detail in Table 1.
Although IX and IVa2 were apparently included in the E1B and E2B gene regions, they could be classified as intermediate genes according to the different transcription stages to which they belong (23, 26, 31). The IX region (nt 3488 to 3907) encoded a 14.2-kDa protein (pIX), and the IVa2 region (nt 5594c to 3970c) encoded a 50.9-kDa protein (pIVa2). Inconsistent with the findings for the early genes, there was no predicted TATA box for pIX and pIVa2 in HAdV-11 strain QS. In addition to the common role of pIX and pIVa2 in enhancing the activity of the major late promoter (MLP), pIX functioned as one of the minor capsid proteins and pIVa2 mediated serotype-specific binding with the genome packaging signal (4, 35). The amino acid sequence alignments of pIVa2 of HAdV-2, HAdV-3, HAdV-5, and HAdV-7 with pIVa2 of HAdV-11 QS revealed that pIVa2 of HAdV-11 QS had maximum similarity with pIVa2 of HAdV-3 (93.5%), followed by that with pIVa2 of HAdV-7 (92.6%), HAdV-2 (79.9%), and HAdV-5 (79.9%). This finding therefore suggests that the IVa2 region is also a potential target site for the reconstruction of chimeric AdV vectors.
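Similarity values such as those reported for pIVa2 are derived from pairwise comparisons of aligned sequences. The Python sketch below shows one common way to compute percent identity from two aligned strings. It rests on assumptions not stated in this study: alignment columns containing a gap in either sequence are skipped, and the toy peptides are invented for illustration.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity over alignment columns where neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    identical = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap or b == gap:
            continue  # assumption: gapped columns are not scored
        compared += 1
        if a == b:
            identical += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * identical / compared


# Invented toy alignment: six ungapped columns, five of them identical.
identity = percent_identity("MKT-LLA", "MKTALIA")
print(round(identity, 1))  # 83.3
```

Other conventions (for example, counting gaps as mismatches or dividing by the length of the shorter sequence) give somewhat different percentages, so values from different tools are comparable only when the same convention is used.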
The late genes of HAdV-11 strain QS shared a common transcription promoter, namely, MLP (nt 5894 to 5900) (4), but utilized multiple poly(A) signals, owing to which five subunits (subunits L1 to L5) could be recognized. In addition, there were some other conserved control elements in HAdV-11 QS, such as an inverted CAAT box (nt 5849c to 5845c), an upstream stimulatory factor-binding site (nt 5865 to 5873), an initiator (INR) element (nt 5923 to 5934), the DE1 region (nt 6010 to 6020), and the DE2 region (nt 6025 to 6042). By examining the alignments of the MLPs of HAdV-3, HAdV-5, and HAdV-14 and the MLP of HAdV-11 QS, we observed that there was only one variation among these important control elements, i.e., a variation in the sixth base of the INR element, which indicated the importance of MLP in transcription regulation and control. Following the MLP, the tripartite leader (TPL) could be found in all late RNA species from L1 to L5. Three regions of strain QS, i.e., the first (nt 5925 to 5965), second (nt 6984 to 7056), and third (nt 9499 to 9585) leader sequences, constituted the TPL.
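The three leader coordinates given above determine the length of the spliced TPL. As a minimal Python sketch, the helper below (a hypothetical `join_segments` function) concatenates 1-based inclusive plus-strand segments; the leader lengths are computed directly from the reported coordinates, while the sequence used to exercise the helper is a toy string.

```python
def join_segments(genome, segments):
    """Concatenate 1-based inclusive (start, end) plus-strand segments."""
    return "".join(genome[start - 1:end] for start, end in segments)


# Leader coordinates as reported for strain QS.
TPL_LEADERS = [(5925, 5965), (6984, 7056), (9499, 9585)]
leader_lengths = [end - start + 1 for start, end in TPL_LEADERS]
print(leader_lengths, sum(leader_lengths))  # [41, 73, 87] 201

# Toy demonstration of the joining step; not real sequence data.
print(join_segments("ACGTACGTAC", [(1, 2), (5, 6), (9, 10)]))  # ACACAC
```

With these coordinates the three leaders are 41, 73, and 87 nt long, giving a spliced TPL of 201 nt.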
The late genes of HAdV-11 strain QS encoded a total of 14 predicted proteins, and these probably play an important role in ensuring the structural integrity of the virion (4). The L1 region had two predicted proteins, i.e., a 43.9-kDa protein and pIIIa, which shared a common poly(A) signal at nt 13607. The former was the analog of the L1 protein with a molecular weight of 52,000/55,000 (L1-52/55K protein), which is involved in virion assembly and in the DNA encapsidation process (10, 11). pIIIa, which extended from the exterior to the interior surface of the capsid, presumably performs a rivet-like function to stabilize the interfaces between the two facets adjacent to the capsid (28).
Within the L2 region, four predicted ORFs encoded pIII (62.5 kDa), pVII (21.3 kDa), pV (40.1 kDa), and pX (5.2 kDa); and their common poly(A) signal was situated at nt 17317. pIII, also termed the penton base protein, was one of the major capsid proteins and measured 557 residues in length; it was presumed to mediate virus internalization by interacting with the αvβ3 and αvβ5 integrins through the RGD motif or with the α4β1 integrin through the LDV motif (18, 33). Multiple pIII amino acid sequence alignments among subgenera A to G showed that HAdV-11 QS had a penton base, i.e., a fiber-interacting domain (ESRLSNLLGIRKK) (5), and had the closest similarity with subgroup B, particularly serotype 14 (99.4%), followed by subgroups E (81.9%), D (75.6%), G (71.9%), A (71.6%), F (71.2%), and C (67.6%), in that order. The L2-encoded proteins (pVII, pV, and pX) together constituted the virion core and were in contact with the viral DNA through arginine-rich regions (4). The putative proteins pVII, pV, and pX were 192, 351, and 47 residues in length, respectively. pVII probably performs a histone-like function to condense the viral DNA (4); pV functions as a bridge between pVI and pIII to stabilize the virion (4); pX, also known as mu, is associated with AdV DNA condensation and charge neutralization (16).
Three L3-encoded proteins, namely, pVI (26.6 kDa), hexon (106.9 kDa), and the 23K protease (23.7 kDa), shared a common poly(A) signal at nt 21766. pVI was 246 residues in length, and it putatively transports hexon molecules into the nucleus and participates in the disruption of the endosomal membrane during infection with the virus (4, 15). The hexon, also known as pII, was 946 residues in length and is the major structural component of the capsid. In all, seven hypervariable regions (HVRs) of the hexon were regarded as type-specific epitopes of the AdV (29). On the basis of this finding, we confirmed that HAdV-11 QS belonged to HAdV-11a because it shared maximum homology with another strain of HAdV-11a (GenBank accession number AY972815) (36). The 23K protein analog of HAdV-11 QS was 209 residues in length and is presumably required for the cleavage of the viral protein precursors during virus maturation and assembly (4).
Four predicted proteins (91, 21.6, 14.6, and 25 kDa) identified in the L4 region used a common putative poly(A) signal at nt 27465. The 91-kDa protein (the analog of L4-100K) might play a role in the hexon trimer assembly and may be necessary for the efficient initiation of late viral protein synthesis (4, 12). Compared to the other B subgenera, the L4 region in this B subgenus contained two partially overlapping ORFs which encoded two proteins (21.6 and 14.6 kDa) whose functions remained to be clarified. The predicted pVIII (25 kDa) was 227 residues in length and interacted with three other proteins (pIIIa, pVI, and pIX) to stabilize the virion capsid (4).
The L5 region (nt 30775 to 31755) encoded a 35.3-kDa homolog (325 residues in length) of the fiber protein whose poly(A) signal was located at nt 31755. The fiber could structurally be divided into three parts, including an N-terminal tail, a central shaft with repeated motifs, and a C-terminal globular knob (4, 9). Generally, the knob is responsible for binding to the cellular receptors, such as the coxsackievirus-AdV receptor. However, it has been reported that the HAdV-11 knob may also interact with the membrane cofactor protein, also known as CD46 (30), which enables HAdV-11 to be an alternative vector for gene therapy in place of HAdV-2 or HAdV-5. The N-terminal end of the fiber of HAdV-11 strain QS contains a hydrophobic motif (FNPVYPY) which has been reported to mediate the interaction between the penton base and the fiber through hydrogen bonds and salt bridges (37). Comparison of the amino acid sequence of the fiber with the amino acid sequences of the fibers of other B subgenera revealed that the variation was mainly present in the regions of the shaft and the knob. Further phylogenetic analyses suggested that the amino acid sequence of the subgenus B fiber could be classified into a large cluster and that HAdV-11 QS had a closer relationship with HAdV-14 (99%), HAdV-11p (92.3%), and HAdV-7 (90.7%). Comparisons of the nucleotide sequences of the fiber region also yielded a similar result. It was notable that the fiber of HAdV-11 QS showed maximum homology with that of another available HAdV-11a strain (GenBank accession number L08232) but exhibited greater dissimilarities with the prototype of HAdV-11 than with the prototype of HAdV-14 (Fig. 1A).
Although simian adenovirus (SAdV) is a different species, the fiber of SAdV type 21 (SAdV-21) showed 55.4% amino acid sequence homology with that of HAdV-11 QS, which conspicuously exceeded the homology with other HAdV subgenera (homologies of subgenera A, C, D, E, G, and F, 15.2%, 16.6%, 24.1%, 22.3%, 21.3%, and 19.5%, respectively). At the same time, comparisons of the corresponding nucleotide sequences also revealed similar relationships among the various subgenera and SAdV-21, such that they could be classified into three subgroups: one subgroup consisting of the whole B subgenera and SAdV-21; one subgroup consisting of subgenera F and G; and one subgroup consisting of subgenera A, C, D, and E (Fig. 1A).
Whole-genome phylogenetic analyses among HAdV subgenera A to G revealed that HAdV-11 strain QS could be clustered into a subgroup along with HAdV-11, HAdV-14, HAdV-34, and HAdV-35; and there was maximum homology between HAdV-11 QS and HAdV-14 (Fig. 1B).
Recombination not only is a well-known feature of AdV genetics but also is an important driving force for virus evolution. Generally, AdV recombination occurs much more frequently within a species than among species (22, 34) and has been suggested to occur in a few local regions, such as the fiber gene and nonstructural regions (22, 31). AdVs have been reported to be continuously detectable in the tonsils or other adenoid tissues after acute infection (1, 8, 34), which enables the formation of new recombinants during coinfection with different AdV serotypes (22). In this study, phylogenetic analyses of the coding regions of the nonstructural proteins (L4-100K, DBP, and DNA polymerase), major capsid proteins (fiber, hexon, and penton base protein), minor capsid proteins (pIIIa and pVI), core proteins (pTP and pVII), and even the complete genome of the prototype strains of subgenera A to G revealed that, except for the hexon and fiber genes, HAdV-11 strain QS had the highest degree of homology with the prototype HAdV-14 strain (strain de Wit, GenBank accession number AY803294) (Fig. 1). The corresponding proteins also showed a similar trend, including even up to 100% similarity at local amino acid sequences, such as L1-52/55K, pVII, pV, and pIVa2 (Table 2); however, at the hexon gene, there was an apparent difference between HAdV-11 QS and HAdV-14. With regard to the fiber gene, although HAdV-11 QS shared maximum homology with HAdV-11a (GenBank accession number L08232), comparisons among the prototypes showed that HAdV-11 QS had a closer phylogenetic relationship with the prototype HAdV-14 strain than with the prototype HAdV-11 strain (Fig. 1A). This finding is different from the finding presented previously that HAdV-11 is very closely related to serotypes 34 and 35, which was determined on the basis of the differences in restriction fragment migration patterns (20).
This phenomenon indicates that recombination probably occurred between HAdV-11 QS (HAdV-11a) and HAdV-14. One reasonable explanation was that on the one hand, the recombinant HAdV-11 QS acquired the elements essential for virus replication and proliferation from HAdV-14 by means of recombination; on the other hand, it may have retained the key neutralizing antigen epitope (hexon) to escape the immune attack against HAdV-14. HAdV-14 has been reported to cause respiratory illnesses more frequently than HAdV-11a (6, 24). On the basis of that finding, the corresponding rate of seropositivity for HAdV-14 was suggested to be higher than that for HAdV-11a in the whole population. As a result, HAdV-11 QS not only could possess the virulence of HAdV-14 but also could avoid the neutralizing antibody against HAdV-14, which exists much more widely in the population than that against HAdV-11a.
The statement presented above could also be partly supported by the following clinical evidence. HAdV-11 is a recognized pathogen in urinary tract infections, while HAdV-14 is associated with pharyngoconjunctival fever and acute respiratory disease (34). The recombinant (HAdV-11 QS) showed the ability to cause respiratory tract diseases and not kidney and urinary tract infections, although HAdV-11a also shared a high degree of homology with the prototype HAdV-11 strain (Table 2). In addition, it is worth noting that HAdV-14, which also belongs to subgenus B:2, caused several outbreaks with severe clinical consequences (21, 24). At the same time, HAdV-11a, which has rarely caused severe respiratory system diseases, has been found all around the world in recent years, such as in Turkey, China, and Latin America (7, 14, 36). This coincidence also indicates the probability of recombination from the point of view of an epidemic.
In order to further clarify the possible recombination events, the SimPlot program was used for whole-genome sequence analyses of HAdV-11 QS and the other prototypical subgenus B strains (HAdV-11p, HAdV-14, HAdV-34, and HAdV-35). Similar to the analyses presented above, recombination between HAdV-11 strain QS and HAdV-14 was suggested to exist, except in the hexon gene region (Fig. 2), which further confirmed the recombination between HAdV-11 QS and HAdV-14.
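The principle behind a similarity plot of this kind can be sketched in Python as a sliding-window identity scan along a pairwise alignment of the query and one reference genome. This is a simplified illustration and not SimPlot itself: SimPlot can apply nucleotide distance models, whereas the sketch uses raw percent identity, and the window size, step size, and function name `window_identity` are assumptions, since the settings used in this study are not stated.

```python
def window_identity(query, reference, window=1000, step=200, gap="-"):
    """Percent identity in sliding windows along a pairwise alignment.

    Returns (window_start, identity) tuples, where window_start is the
    1-based alignment column at which each window begins. Columns with a
    gap in either sequence are skipped (an assumption of this sketch).
    """
    if len(query) != len(reference):
        raise ValueError("aligned sequences must have equal length")
    points = []
    last_start = max(len(query) - window, 0)
    for start in range(0, last_start + 1, step):
        columns = [
            (a, b)
            for a, b in zip(query[start:start + window],
                            reference[start:start + window])
            if a != gap and b != gap
        ]
        if not columns:
            continue
        identical = sum(1 for a, b in columns if a == b)
        points.append((start + 1, 100.0 * identical / len(columns)))
    return points


# Toy alignment: identical first half, fully divergent second half.
toy_query = "AAAAAAAACCCCCCCC"
toy_reference = "AAAAAAAAGGGGGGGG"
profile = window_identity(toy_query, toy_reference, window=8, step=4)
print(profile)  # [(1, 100.0), (5, 50.0), (9, 0.0)]
```

Running the scan once per reference strain and plotting the profiles together shows where the most similar reference changes along the genome, which is the pattern interpreted here as a recombination signal.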
In addition, a close genetic relationship has been reported to exist between the HAdVs and SAdVs (17, 25). When we selected SAdV-21 for comparison, we found that HAdV-11 QS also exhibited a high degree of amino acid sequence homology (average, >90%) with SAdV-21, except in the fiber region (Table 2), which enabled differentiation between the AdV species and interaction with the receptor on the target cells. This probably indicates a close phylogenetic relationship between HAdV-11 QS and SAdV-21.
In conclusion, this is the first report on recombination between HAdV-11 and HAdV-14, but it is difficult to clarify the exact time and manner of recombination. Whether the emergence of recombination between HAdV-11a and HAdV-14 might increase virulence, thereby posing a new global challenge with regard to acute respiratory diseases in the near future, warrants further investigation. Sentinel virological surveillance for HAdV has thus far not been established in China; both epidemiological and virological surveillance of this uninvestigated respiratory disease pathogen should be strengthened in order to cope with the emerging biosafety incidents caused by the various recombinants. Based on the genomic analyses, HAdV-11 strain QS has a HAdV-14 chassis with a partial HAdV-11 hexon, so it may be a novel adenovirus. This raises the possibility that it should be renamed HAdV-B55 (M. P. Walsh, J. Seto, M. S. Jones, J. Chodosh, W. Xu, and D. Seto, unpublished data).
We thank the Shaanxi CDC microbiology laboratory for specimen collection.
This work was supported by grant 2007AA02Z463 from the Ministry of Science and Technology of the People's Republic of China and grants 2009ZX1004-201 and 2009ZX1004-202 from the National Infectious Disease Surveillance Program of the Ministry of Science and Technology of the People's Republic of China.
We have no conflicts of interest to report.
Published ahead of print on 12 August 2009.