Human adenoviruses (HAdVs) are important not only as human pathogens but also as scientific models for seminal studies and discoveries in molecular and cell biology. RNA splicing is one notable process first discovered with HAdV. Recently, the application of genomics and bioinformatics to viral genomes, in particular HAdVs, has revealed insights into DNA virus evolution and the genesis of adenoviral pathogens. HAdVs are currently important as models for the applications of genomics, especially in understanding virus and pathogen evolution and in developing rapid methods for asymptomatic pathogen surveillance, detection, and diagnosis; all of these require unambiguous primary sequence data. The original HAdV-D17 sequence and annotation reflect the state of the technology at the time; both are lacking in accuracy and completeness. This hinders the use of this important pathogen genome as a reference. To aid researchers studying adenovirus biology, a corrected DNA sequence is provided, along with a more complete annotation.
HAdV-D17 was obtained from the American Type Culture Collection (ATCC; Manassas, VA) as VR-1094. It was isolated from conjunctival scrapings in 1955 and designated Ch. 22 (1). This adenovirus was grouped into species HAdV-D and confirmed by Hierholzer et al. (2). Growth in A549 cells and DNA production were outsourced to Virapur, LLC (San Diego, CA), as described earlier (3). The genome was sequenced using the Sanger method with the sequencing ladders resolved with an ABI 3130x genetic analyzer. This provided data at an average of 8-fold redundancy with both strands sequenced. Unreliable data were resequenced, using PCR amplification. Annotation of the genome and a comparative analysis with other HAdVs provided additional quality control.
The genome is a linear double-stranded DNA molecule comprising 35,139 bp, with a GC content of 56.7%; this is consistent with the species discriminatory content of 57% for species HAdV-D (5). Thirty-nine genes are annotated, including the original eight. Sequencing errors are typical of the Sanger sequencing method, particularly if the protocol was manual and radioisotope based. The prior GenBank entry for HAdV-D17 is 39 nucleotides shorter than the sequence newly reported here. A sequence alignment analysis revealed 160 mismatches between the two sequences as well as 59 deletions and 20 insertions in the original HAdV-D17, which accounts for the net difference of 39 nucleotides.
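The two figures above, GC content and the net length difference implied by the alignment, can be reproduced with a few lines of code. The following is a minimal sketch only, not the pipeline used for this genome: the function names `gc_content` and `net_length_difference` are hypothetical, the toy sequence is not HAdV-D17 data, and ambiguous bases are assumed to be excluded from the denominator.

```python
def gc_content(seq: str) -> float:
    """Return the percentage of G and C among unambiguous A/C/G/T bases."""
    seq = seq.upper()
    counted = sum(seq.count(base) for base in "ACGT")
    if counted == 0:
        raise ValueError("sequence contains no A/C/G/T bases")
    return 100.0 * (seq.count("G") + seq.count("C")) / counted


def net_length_difference(missing_in_original: int, extra_in_original: int) -> int:
    """Number of bases by which the original entry is shorter than the new one."""
    return missing_in_original - extra_in_original


# Toy sequence for illustration only (assumption: not real viral data).
toy = "GGCCATGCAT"
toy_gc = gc_content(toy)

# Counts taken from the text: 59 deletions and 20 insertions in the original entry.
net = net_length_difference(59, 20)
```

Applied to the full sequence retrieved from GenBank, `gc_content` would be expected to give a value close to the reported 56.7%, and `net` matches the stated 39-nucleotide difference between the two entries.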
Given the corrected nucleotide sequence and expanded gene annotation data, researchers studying adenoviral evolution, particularly of ocular pathogens, may now use HAdV-D17 as a reference genome. Current bioinformatics tools include recombination analysis and phylogenomics, which call for both genome nucleotide sequences and gene coding sequences. Species HAdV-D adenoviruses are implicated in current disease outbreaks and serve as progenitors to new types, particularly through genome recombination (4). Thus, the correct genome data for HAdV-D17 are important as a reference tool. Understanding species HAdV-D prototypes is critical, as newly identified isolates appear to be predominantly species HAdV-D and are important as respiratory pathogens (unpublished data) (4), as reported recently, as well as ocular (6) and gastrointestinal (2a) pathogens, as noted in the past.
Nucleotide sequence accession number.
Reference HAdV-D17 genome data are accessible in GenBank under accession number HQ910407.