genus was tentatively assigned within the Betaherpesvirinae, but PCR analysis of conserved genes in seven distinct elephant endotheliotropic herpesvirus (EEHV) species from both Asian and African elephant hosts (2) has led us to propose (4) that they should instead be considered a new Deltaherpesvirinae subfamily of mammalian herpesviruses. EEHVs cannot yet be propagated in cell culture, and the only source of viral genomic DNA available has been from necropsy tissue or viremic blood samples. Both walking procedures (5) and partial phage lambda libraries (6) have been used to characterize up to 85 kb of the central conserved core region of several strains, including the partially chimeric variant EEHV1B, but numerous additional novel genes were expected. Therefore, we used a next-generation approach to determine the complete sequence of EEHV1A from a 12-year-old hemorrhagic disease case named “Kimba.”
Twenty-two gigabases of raw sequence reads from two paired-end Illumina runs were assembled de novo, and the assembly was confirmed and corrected by extensive targeted local PCR sequencing. The sequenced necropsy DNA sample was a mixture of EEHV DNA and host Elephas maximus DNA, neither of which had a close reference sequence available. However, a subset of expected viral reads with a copy number between 25- and 40-fold relative to a single-copy host gene was selected for assembly by the program Velvet, yielding 417 contigs, including one of 83.7 kb. A draft final assembly was generated by (i) joining the six largest contigs, totaling 162 kb, by PCR amplification across adjacent ends, (ii) manual elimination of all contigs with features resembling host cell repetitive DNA, and (iii) successful amplification of a 15.7-kb PCR product that physically joined across the left and right outside ends. The eleven contigs remaining after the manual elimination step additively matched exactly to the PCR product from the amplification step; all were confirmed to be viral by PCR amplification from a set of six standard strains of EEHV-positive Elephas DNA, but not from three EEHV-negative Elephas DNA samples.
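The copy-number selection described above can be illustrated with a minimal Python sketch. It is not the pipeline used for the Kimba assembly: the text does not state how read depth was measured or whether the 25- and 40-fold bounds were inclusive, so the function name `select_by_relative_depth`, the inclusive bounds, and all depth values below are assumptions made purely for illustration.

```python
from typing import Dict, List


def select_by_relative_depth(
    depths: Dict[str, float],
    host_depth: float,
    low: float = 25.0,
    high: float = 40.0,
) -> List[str]:
    """Return names of sequences whose depth is between `low` and `high`
    times the depth of a single-copy host gene (bounds inclusive, by assumption)."""
    if host_depth <= 0:
        raise ValueError("host_depth must be positive")
    selected = []
    for name, depth in depths.items():
        ratio = depth / host_depth  # copy number relative to the host gene
        if low <= ratio <= high:
            selected.append(name)
    return selected


# Hypothetical depths; none of these numbers come from the study.
example_depths = {"seq_a": 300.0, "seq_b": 12.0, "seq_c": 390.0, "seq_d": 900.0}
candidates = select_by_relative_depth(example_depths, host_depth=10.0)
print(candidates)
```

With these made-up values, `seq_b` behaves like single-copy host DNA and `seq_d` like a high-copy repeat, so only the two sequences in the 25- to 40-fold window would be passed on for assembly.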
The complete 177,136-bp EEHV1A (Kimba) genome consists of 115 identified likely open reading frames (ORFs), including 14 for which we were able to assign likely splicing patterns. The protein-coding genes can be categorized into 37 conserved core genes common to all herpesviruses (all >50% diverged from their nearest orthologs), plus 15 genes in common with either both betaherpesviruses and gammaherpesviruses or just betaherpesviruses, three other core genes that are missing in human cytomegalovirus (HCMV) (OBP, RRB, and TK), and 60 novel ORFs not found in any other herpesviruses. The last group includes captured cellular genes vGCNT1, vFUT9, three versions of vOX2, five other immunoglobulin family proteins, and a large cluster of tandemly repeated highly diverged 7× transmembrane (TM) proteins, some of which resemble viral G-protein-coupled receptors (vGPCRs) and chemokine receptors. Furthermore, a 40-kb core segment of the Proboscivirus genome is inverted relative to the organization in Betaherpesvirinae. Phylogenetic trees of the core proteins are consistent with the designation of the probosciviruses as a new subfamily branching between the Betaherpesvirinae and Gammaherpesvirinae.
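As a quick consistency check on the gene categories above, the following short Python sketch tallies the four groups and confirms that they account for all 115 ORFs. The counts are taken directly from the text; the dictionary labels and variable names are illustrative only.

```python
# ORF counts per category, as stated in the text.
orf_categories = {
    "conserved core genes common to all herpesviruses": 37,
    "shared with beta- and gammaherpesviruses, or betaherpesviruses only": 15,
    "core genes missing in HCMV (OBP, RRB, TK)": 3,
    "novel ORFs not found in other herpesviruses": 60,
}

total_orfs = sum(orf_categories.values())
novel_fraction = orf_categories["novel ORFs not found in other herpesviruses"] / total_orfs

print(f"total ORFs: {total_orfs}")
print(f"novel fraction: {novel_fraction:.1%}")
```

The four categories sum to 115, and the novel ORFs make up just over half of the total.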
Nucleotide sequence accession number.
The genome sequence of EEHV1A (Kimba) has been deposited in GenBank under the accession no. KC618527.