Herpesviruses or herpesviral sequences have been identified in various bat species. Here, we report the isolation, cell tropism, and complete genome sequence of a novel betaherpesvirus from the bat Miniopterus schreibersii (MsHV). In primary cell culture, MsHV causes cytopathic effects (CPE) and reaches peak virus production 2 weeks after infection. MsHV was found to infect and replicate less efficiently in a feline kidney cell line, CRFK, and failed to replicate in 13 other cell lines tested. Sequencing of the MsHV genome using the 454 system, with a 224-fold coverage, revealed a genome size of 222,870 bp. The genome was extensively analyzed in comparison to those of related viruses. Of the 190 predicted open reading frames (ORFs), 40 were identified as herpesvirus core genes. Among 93 proteins with identifiable homologues in tree shrew herpesvirus (THV), human cytomegalovirus (HCMV), or rat cytomegalovirus (RCMV), most had highest sequence identities with THV counterparts. However, the MsHV genome organization is colinear with that of RCMV rather than that of THV. The following unique features were discovered in the MsHV genome. One predicted protein, B125, is similar to human herpesvirus 6 (HHV-6) U94, a homologue of the parvovirus Rep protein. For the unique ORFs, 7 are predicted to encode major histocompatibility complex (MHC)-related proteins, 2 to encode MHC class I homologues, and 3 to encode MHC class II homologues; 4 encode the homologues of C-type lectin- or natural killer cell lectin-like receptors; and the products of a unique gene family, the b149 family, of 16 members, have no significant sequence identity with known proteins but exhibit immunoglobulin-like beta-sandwich domains revealed by three-dimensional (3D) structural prediction. To our knowledge, MsHV is the first virus genome known to encode MHC class II homologues.
Cumulative data indicate that bats are important reservoirs of many emerging viruses (11), including some highly infectious and pathogenic viruses, such as henipaviruses (16, 58), severe acute respiratory syndrome (SARS) coronavirus (43), and filoviruses (41). The discovery of SARS-like coronavirus sequences in Chinese horseshoe bats intensified efforts to isolate novel viruses from different bats species, but to date virus isolation attempts have been largely unsuccessful, with a lack of permissive cell lines considered one of the contributing factors. We previously developed a method to generate primary cell lines from different bat organs (19). During the preparation of primary cells from Schreiber's long-fingered bat, Miniopterus schreibersii, we observed spontaneous cytopathic effects (CPE) in lymph node cells after two passages, which was subsequently identified as being caused by a betaherpesvirus.
Herpesviruses are widely disseminated in nature, infecting mammals, birds, reptiles, amphibians, fish, and oysters. The family Herpesviridae is divided into three subfamilies: Alphaherpesvirinae, Betaherpesvirinae, and Gammaherpesvirinae. Phylogenetic analysis of the core genes yielded a tree with clear separation of the three subfamilies (51). Betaherpesvirinae is divided into four genera, Cytomegalovirus (CMV) (primate CMV), Muromegalovirus (murid CMV), Proboscivirus (elephantid herpesvirus), and Roseolovirus (human herpesviruses 6 and 7 [HHV-6 and -7]), plus unassigned members, which include Tupaiid herpesvirus (tree shrew herpesvirus [THV]), Caviid herpesvirus 2 (guinea pig herpesvirus [GPCMV]), and Suid herpesvirus 2. Betaherpesviruses differ from alpha- and gammaherpesviruses in their restricted host range and long infection cycle (62).
Betaherpesvirus genome sizes range from 143 kbp for HHV-6 (31) to 241 kbp for chimpanzee CMV (CCMV) (21). The prototype strain of human CMV (HCMV) is estimated to encode 164 to 167 genes (21), of which about 70 are conserved in all betaherpesvirus genomes, including 40 core genes for all herpesviruses. The conserved genes are arranged largely colinearly in direction and position in betaherpesvirus genomes. It is known that betaherpesvirus genomes encode some proteins with homology to cellular counterparts, such as chemokine receptor, G protein-coupled receptor proteins, and major histocompatibility complex (MHC) class I molecules.
Alpha-, beta- and gammaherpesviruses have all been discovered in bats. Using PCR, Wibbelt et al. (94), Molnar et al. (56), and Watanabe et al. (93) discovered gamma- and betaherpesviruses in bats. Fifteen alphaherpesviruses were isolated from bats in Cambodia and Madagascar (65), and a betaherpesvirus (bat betaherpesvirus 2 [BHV-2]) was obtained from the spleen primary cell culture of Miniopterus fuliginosus (92).
In this study, we report the isolation, cell infection, and complete genome sequence of a novel bat betaherpesvirus from M. schreibersii primary cell culture (MsHV). Three groups of immune-related genes were found in the MsHV genome, including MHC class I and class II, C-type lectin, and a unique gene family of 16 members.
M. schreibersii bat lymph node (MsLn) and kidney primary cells (MsKi) were prepared as previously described (19). The cells were cultured in Dulbecco's modified Eagle medium (DMEM)-F12-Hams (Sigma) supplemented with 10% fetal calf serum (FCS) (HyClone) and Antibiotic-Antimycotic (Gibco) at 37°C with 5% CO₂. To purify MsHV, MsKi cells in a 96-well tissue culture plate (4 × 10⁴/well) were infected with 10-fold serial dilutions of MsHV and incubated for 15 days to allow CPE to develop. Supernatant was collected from wells with CPE at the dilution where less than 25% of wells showed CPE, and one of them was used for the second round of purification. A single clone with typical CPE was picked to propagate in MsKi cells grown in four 150-cm² flasks. The infected culture medium was subsequently clarified at 2,000 × g for 5 min and then ultracentrifuged at 55,000 rpm for 1 h in a Beckman SW 55Ti rotor. The resulting viral pellet was resuspended in phosphate-buffered saline (PBS), and aliquots were stored at −80°C. Virus was titrated using endpoint dilutions as described above.
Cells in Eagle minimal essential medium (EMEM) or DMEM with 10% fetal calf serum (FCS) were seeded in 35-mm plates and incubated overnight prior to infection with 1,000 IU (infectious units) in 0.5 ml total volume. After 2 h of incubation, 2 ml fresh medium with 5% FCS was added. At indicated days postinfection (dpi), cells were washed four times in PBS, collected by centrifugation, and stored at −20°C. Total DNA was extracted with Qiagen DNeasy blood and tissue kit. The same volume of total DNA for each cell species was used to detect MsHV genomic DNA by real-time PCR with the primers MsHV-F (5′-TCACC AGGAT AGGGC GAGAC A-3′) and MsHV-R (5′-CTTTT CAAAT TCCAG CTTCA CAGG-3′) using the following parameters: 95°C for 2 min and 40 cycles of 95°C for 15 s, 60°C for 30 s, 84°C for 30 s for signal collection, followed by melting-curve analysis from 60°C to 95°C.
The viral pellet was diluted with TE (10 mM Tris-Cl, 1 mM EDTA, pH 8.0) and treated with 0.2% SDS and 100 μg/ml proteinase K at 56°C for 2 h. DNA was extracted using phenol-chloroform and precipitated with 0.2 M NaCl and an equal volume of isopropanol. The DNA was separated on a 0.4% agarose gel, and the large and distinct band corresponding to viral genomic DNA was sliced out and recovered by electroelution into a dialysis tube (71), followed by filtering through a prewet 0.45-μm filter, phenol-chloroform extraction, and isopropanol precipitation. The DNA sequence of the purified MsHV viral genomic DNA was determined using the Roche 454 GS-FLX platform (454 Life Sciences, Branford, CT) with sample preparation as described in their Titanium series manuals, Rapid Library Preparation and emPCR Lib-L SV.
De novo assemblies of the 454 sequencing output were performed independently with the Roche GS De Novo Assembler (Newbler) and CLC Genomics Workbench v4.8 (CLC Inc., Aarhus, Denmark) software programs, resulting in contigs ranging from 2 kb to 75 kb. Contigs generated from both next-generation-sequencing de novo algorithms were assembled using the Seqman module of the software program Lasergene DNASTAR 9 Core Suite (DNAStar, Madison, WI), resulting in a single large contiguous sequence of about 222.8 kb. To search for raw reads that had not been assembled at the ends of the large contig, 200 bp from each end was used as a template to find similarity with reads from the 454 raw reads library using the “scan for similarities” function in the program Clone Manager Professional 9 (Sci-Ed Software, Cary, NC). This manual walking was repeated until no more sequences aligned. The final contig was then used as a reference to map back to the 454 raw reads using CLC Genomics Workbench, and the resulting consensus sequence was taken as the MsHV genome. PCR was used to confirm the sequence of regions with coverage lower than 50 reads and/or high GC content using primers designed according to the flanking sequences (L1-f, 5′-GTGGT TAAGT GTGGG TGTGT CTGG-3′; L1-r, 5′-GGAAG ACGGC GACAA CAGGC-3′; L2-f, 5′-GATTC GGTGG AATCA ATGAC CCTG-3′; L2-r, 5′-TTTCC TAACC AGTTA GCCCA ACGC-3′; L3-f, 5′-TTACA GCGTT AGGCG AATCA CACG-3′; L3-r, 5′-GTCAT CCTGC GCTTG GTGTA TCC-3′; L4-f, 5′-GTCTG CGACG AAGGC TCTCG C-3′; L4-r, 5′-TGTGG TTCTT GCAGG CAGCG-3′). 360 GC Enhancer buffer (ABI) was used for regions with high GC content according to the manufacturer's instructions.
Open reading frames (ORFs) were initially predicted with the software programs FgenesV (Softberry) and GeneMarkS (5). ORFs that contained canonical start and stop codons were BLASTed against the local protein database of betaherpesviruses, including THV (2), human CMV (HCMV) (24), murine CMV (MCMV) (64), rat CMV (RCMV) (89), GPCMV (38, 72), rhesus CMV (RhCMV) (33), chimpanzee CMV (CCMV) (21), HHV-6A (31), HHV-6B (25), and HHV-7 (53). Hits with an E-value of less than 1e−3 were used to annotate the MsHV genome, while ORFs with low-probability hits to betaherpesviruses were BLASTed against the GenBank nonredundant protein database; hits with E-values of less than 1e−3 were included in the annotation to obtain a draft annotated genome. The common strategy for herpesvirus gene prediction, the presence of an ORF with a minimum length of 300 bp and less than 60% overlap with adjacent ORFs (64), was applied to other portions of the MsHV genome. The gaps between annotated ORFs were inspected again for ORFs with a minimum length of 150 bp. Analysis of the repeats in the genome was performed using the software program Tandem Repeats Finder (4). Protein sequences were subjected to transmembrane and signal peptide prediction with the program Phobius (37). MsHV proteins were aligned pairwise with the homologues of THV, HCMV, and RCMV with the program ClustalW, and the pairwise identities are presented in Table 1. Multiple sequence alignment was conducted with ClustalW, and phylogenetic trees were constructed with the software program MEGA5 (82). Protein structure prediction was performed with Phyre2 (http://www.sbg.bio.ic.ac.uk/~phyre2/html/page.cgi?id=index) (39).
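The length and overlap criteria for unique gene prediction can be illustrated with a minimal sketch. The `Orf` type, the function names, and the example coordinates are hypothetical, and the sketch assumes that overlap is measured as a fraction of the candidate ORF's own length, which the text does not specify.

```python
from typing import List, NamedTuple


class Orf(NamedTuple):
    """Hypothetical ORF record; coordinates are 1-based and inclusive."""
    name: str
    start: int
    end: int

    @property
    def length(self) -> int:
        return self.end - self.start + 1


def overlap_fraction(candidate: Orf, other: Orf) -> float:
    """Fraction of the candidate covered by `other` (assumed denominator)."""
    shared = min(candidate.end, other.end) - max(candidate.start, other.start) + 1
    return max(shared, 0) / candidate.length


def passes_filter(candidate: Orf, annotated: List[Orf],
                  min_len: int = 300, max_overlap: float = 0.60) -> bool:
    """Keep ORFs of at least min_len bp with < max_overlap overlap."""
    if candidate.length < min_len:
        return False
    return all(overlap_fraction(candidate, o) < max_overlap for o in annotated)


# Illustrative coordinates only; these are not MsHV annotations.
annotated = [Orf("A", 1, 900)]
kept = passes_filter(Orf("x", 800, 1199), annotated)       # 400 bp, ~25% overlap
nested = passes_filter(Orf("y", 500, 899), annotated)      # fully inside A
too_short = passes_filter(Orf("z", 1000, 1200), annotated)  # 201 bp
```

Lowering `min_len` to 150 would correspond to the second pass over the gaps between annotated ORFs.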
All the predicted ORFs are listed in Table 1. The nomenclature used for MsHV ORFs is based on a similar strategy used for other betaherpesviruses (2, 64, 72, 89), where ORFs are numbered in the order of appearance in the genome from left (5′) to right (3′); ORFs with homologues in THV, HCMV, or RCMV were named with uppercase prefixes (e.g., B23). ORFs with no obvious homologues in those viruses were designated using lowercase prefixes (e.g., b9). Decimal points were introduced when necessary, but the decimal suffixes do not necessarily indicate a relationship between these genes, e.g., B24 and b24.1 are not related, while B28 and B28.1 to B28.4 are all members of one gene family.
The MsHV genome sequence has been submitted to the GenBank database with the accession number JQ805139. Other herpesviruses used in this study are bat BHV-2 (accession no. AB517983), THV (NC_002794.1), HCMV Merlin (NC_006273.2), RCMV (Maastricht strain; NC_002512.2), GPCMV (ATCC-P5 strain; AB592928.1), MCMV (Smith strain; NC_004065.1), RhCMV (strain 68-1; AY186194.1), CCMV (AF480884.1), cynomolgus macaque CMV (CyCMV) (JN227533.1), HHV-6A (NC_001664.2), HHV-6B (NC_000898.1), HHV-7 (strain RK; AF037218.1), herpes simplex virus 1 (HSV) (NC_001806.1), and Epstein-Barr virus (EBV) (NC_007605.1).
During establishment of primary cell cultures from different organs from the microbat M. schreibersii, CPE was observed in confluent cultures of primary lymph node cells. Initial PCR with family-specific degenerate primers revealed a betaherpesvirus. Since the original primary cell cultures were made from an organ pool from different individuals, we purified the virus with two rounds of endpoint dilution in MsKi primary cells.
MsKi cell monolayers were infected with MsHV at a multiplicity of infection (MOI) of 0.1 and incubated for up to 20 days. On 7 to 8 dpi, CPE started to form as some cells swelled, rounded up, and then detached from the culture surface; the viral CPE spread to neighboring cells until most of the cells in the monolayer died (Fig. 1A). Real-time PCR revealed that the amount of viral DNA detected increased with time (Fig. 1B).
To evaluate the species specificity of MsHV infection, cells from different species, including monkey (CV-1, Vero, and LLC-MK2), human (Hep2, MRC-5, HeLa, and U373), and seven other species (MDBK, MDCK, BHK, RK13, CRFK, PaKi, and B6WT3), were infected, and viral replication was monitored with real-time PCR at 0, 2, 4, and 7 dpi. As shown in Fig. 2, the amount of viral DNA decreased after infection in all the cell lines, as indicated by increasing threshold cycle (CT) values compared to the CT at 0 dpi, with the only exception being the feline kidney cell line, CRFK. In the CRFK cells, the CT value decreased by 4 cycles at 4 and 7 dpi, compared to 0 dpi, and by 6 to 10 cycles compared to results for other cells. PaKi, a bat kidney primary cell from the fruit bat Pteropus alecto (19), is nonpermissive for MsHV.
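As a rough guide to reading the CT shifts, the sketch below converts a change in CT into an approximate fold change in template DNA. It assumes ideal amplification (a doubling per cycle), which the text does not state; the function name and the CT values are hypothetical.

```python
def fold_change_from_ct(ct_start: float, ct_end: float,
                        efficiency: float = 1.0) -> float:
    """Approximate fold change in template between two time points.

    efficiency = 1.0 means perfect doubling per cycle (an assumption).
    A lower CT at the later time point means more template.
    """
    return (1.0 + efficiency) ** (ct_start - ct_end)


# Hypothetical CT values: a 4-cycle decrease, as described for CRFK cells.
crfk_like = fold_change_from_ct(24.0, 20.0)
# Hypothetical CT values: a 2-cycle increase, as in a nonpermissive line.
nonpermissive_like = fold_change_from_ct(20.0, 22.0)
```

Under this assumption, a 4-cycle decrease corresponds to roughly a 16-fold increase in viral DNA, and a 2-cycle increase to a 4-fold loss.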
Next-generation 454 sequencing was used to sequence the genome of MsHV. We obtained 135,774 raw reads, with an average length of 428 bp. Two different NGS software programs were used for de novo assembly (Newbler and CLC Genomics Workbench) to obtain the full-length viral genome, but neither resulted in a single full-length contig. However, when the contigs from the different assembly approaches were pooled, a draft single large contig was obtained. This contig was then used as a reference sequence to map back to the raw reads, resulting in a consensus sequence of 222,870 bp, which we regard as the MsHV genome. The MsHV genome is close in size to the primate (14, 21, 24, 33) and murid (64, 89) CMV genomes and about 27 kbp longer than that of THV (2). The 222,870-bp genome incorporated 114,047 reads (84.0%). The read coverage ranged from 5- to 440-fold, with an average of 224-fold (Fig. 3A). The regions with less than 50-fold coverage were manually confirmed before being included in the final contig. Four regions, labeled L1, L2, L3, and L4 in Fig. 3A, had the lowest coverage, 5-, 34-, 11- and 14-fold, respectively. The sequence assembly across these regions was further confirmed by PCR with flanking primers, and the L1 region could be amplified only with an appropriate amount of GC enhancer buffer (Fig. 3B).
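The read statistics can be cross-checked with simple arithmetic. The sketch below assumes that the mapped reads have the same 428-bp average length as the raw reads, which is an approximation; the reported 224-fold average presumably comes from the actual mapping. The function name is hypothetical.

```python
def mean_coverage(n_reads: int, mean_read_len: float, genome_len: int) -> float:
    """Expected average depth = total sequenced bases / genome length."""
    return n_reads * mean_read_len / genome_len


RAW_READS = 135_774
MAPPED_READS = 114_047
MEAN_READ_LEN = 428      # bp, average over raw reads
GENOME_LEN = 222_870     # bp

mapped_percent = 100 * MAPPED_READS / RAW_READS
approx_depth = mean_coverage(MAPPED_READS, MEAN_READ_LEN, GENOME_LEN)

print(f"mapped reads: {mapped_percent:.1f}%")
print(f"approximate mean depth: {approx_depth:.0f}-fold")
```

The estimate lands close to the reported average, with the small gap plausibly due to differences between raw and mapped read lengths.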
The overall G+C content of the MsHV genome is 51.9%, lower than those of THV (66.5%), HCMV (57.5%), RCMV (61%), and MCMV (58.7%). As in THV (2), HCMV (24), and RhCMV (33), the termini of the genome have the highest G+C content (Fig. 3A). In contrast, there is a region of ~18 kbp, between L3 and L4, that has a significantly lower G+C content of 39.5% and a significantly lower sequence coverage than other regions of the genome. This was also true for L2, with a G+C content of 37.9%. L1, however, has a very high G+C content of 72% in 450 bp and up to 88% in 146 bp, which explains the difficulty in amplification of this region.
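Regional G+C profiles of this kind can be computed with a sliding window. The following is a minimal sketch; the function names, window size, and toy sequence are illustrative and are not the parameters used for Fig. 3A.

```python
from typing import List, Tuple


def gc_content(seq: str) -> float:
    """Fraction of G and C among unambiguous bases (A, C, G, T)."""
    seq = seq.upper()
    acgt = sum(seq.count(base) for base in "ACGT")
    if acgt == 0:
        return 0.0
    return (seq.count("G") + seq.count("C")) / acgt


def gc_windows(seq: str, window: int, step: int) -> List[Tuple[int, float]]:
    """(0-based start, G+C fraction) for each full window along seq."""
    last_start = max(len(seq) - window + 1, 1)
    return [(start, gc_content(seq[start:start + window]))
            for start in range(0, last_start, step)]


# Toy sequence: a GC-rich stretch followed by an AT-rich stretch.
profile = gc_windows("GGGGAAAA", window=4, step=4)
```

Applied to a genome sequence, windows with extreme values would flag regions such as L1 (GC rich) or the ~18-kbp low-G+C region.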
The MsHV genome was analyzed to identify repetitive sequences. No repeats were found at either terminus, but nine regions (RP1 to RP9) were found to contain tandem repeat sequences (Table 2). RP1, located at nucleotides 431 to 734, contains 4 repeat units, 3 of which overlap each other, while RP6, RP7, and RP8 have 2 overlapping units. RP5 has a single unit of 205 bp repeated 2.1 times. Of note is RP4, in the low-coverage area of L1, which includes two physically close units of 13 bp and 21 bp, repeated 4.8 and 6.8 times, respectively. RP4 is located between the genes B57 and B69, the conserved position of the betaherpesvirus origin of lytic replication (OriLyt). Repetitive and GC-rich sequences were found in the THV (2), HCMV (49), RCMV (88), and MCMV (64) genomes at the same location and were also difficult to sequence, as was CyCMV, by both next-generation and Sanger sequencing (48). RP1 to RP9 are annotated in Fig. 4.
Initial ORF analysis found 658 ORFs of 50 amino acid residues or greater. However, when analyzed by protein BLAST search against a protein data set of fully sequenced and well-annotated betaherpesviruses (a common approach used for herpesvirus gene annotation), 190 ORFs were predicted with high confidence to be true coding ORFs, and they are depicted in the genome map (Fig. 4). MsHV proteins were individually analyzed in comparison with those of THV, HCMV, and RCMV. Of the 190 ORFs, 93 encode proteins that have homologues in THV, HCMV, or RCMV, with 60 having the highest similarity with homologues of THV, 20 with HCMV, and 15 with RCMV (Table 1). All the core genes of herpesviruses were found in the MsHV genome; they are conserved in both position and orientation and are indicated by black arrows in Fig. 4, while other genes conserved in betaherpesviruses are indicated by gray arrows. In general, the core genes share higher identities, from 30 to 60%, than the noncore genes, from 20 to 50%, with other betaherpesvirus counterparts (P < 0.001).
Like other betaherpesviruses (52, 59, 77, 78), MsHV encodes the homologues of the six genes of the DNA replication machinery: UL44 (DNA polymerase processivity subunit), UL54 (DNA polymerase [DPOL]), UL57 (single-stranded DNA binding protein), and the heterotrimeric helicase-primase consisting of UL70 (primase), UL102 (helicase-primase complex-associated factor), and UL105 (helicase). B44, B54, B57, and B105 are among the genes with highest identities among THV, HCMV and RCMV, ranging from 48% to 60% (Table 1), while B70 and B102, encoding the primase subunit and complex-associated factor, share about 45% and 33% identities, respectively. B98 and B114 encode the homologues of HCMV alkaline nuclease UL98 (74) and uracil DNA glycosylase UL114, both of which influence viral DNA replication (18). The large subunit ribonucleotide reductase homologue (B45) and the dUTPase homologue (B72) share 26% to 31% identity with their counterparts, which for HCMV are enzymatically inactive and dispensable for cell culture (12, 32) but in MCMV function as a cell death suppressor (10). HCMV UL84 (98) and UL112-113 (60) contribute to the initiation of lytic DNA replication, and the latter produces four products by alternative splicing. B112 and B113 were found to have less than 30% identity with UL112 and UL113.
HCMV UL36 and UL37 (17), UL69 (97), UL122 and UL123 (IE1 and IE2) (59), TRS1/IRS1 (80), and US3 (17) are reported to be immediate-early proteins that play a key role in initiation of viral replication via the regulation of gene expression. UL36 (viral inhibitor of caspase-8 activation [vICA]) and UL37 exon 1 (UL37x1) (viral mitochondrion-localized inhibitor of apoptosis [vMIA]) are two important proteins that inhibit cell death from apoptosis; the former prevents cleavage by binding to the prodomain of procaspase-8 (75), while the latter inhibits apoptosis by binding Bax and sequestering it at the mitochondrial membrane (1). All betaherpesviruses characterized to date carry homologues of UL36, but only primate CMVs retain a complete form of UL37 (UL37x1); MCMV, RCMV, and other betaherpesviruses, including MsHV, lack a homologue of UL37x1. Like THV, MsHV encodes the homologues of UL69 and UL122 but not UL123 or US3. TRS1/IRS1 belong to the US22 family; MsHV has two loci, one at each end of the genome, encoding multiple US22 family members, with those at the right end sharing more similarity with TRS1/IRS1 than those at the left end. A UL82 (pp71) homologue was also found, which is the recently recognized viral transcriptional activator that activates viral immediate-early gene (IE1) expression in HCMV-infected cells through degradation of Daxx (70).
The structure of the HCMV virion has been divided into three components: nucleocapsid, tegument, and envelope. The nucleocapsid is composed of at least five proteins: UL86 (major capsid protein), UL85 (minor capsid protein), UL48A (the smallest capsid protein), UL46 (minor capsid binding protein), and UL80 (assembly protein) (8). As core genes, UL86 and UL85 are conserved with about 60% identity among MsHV, THV, HCMV, and RCMV, but B46 and B80 share less than 40% identity with UL46 and UL80. The homologue of UL48A, which is less than 100 amino acids (aa) in size but comprises 12.6% of virion mass (87), was also identified in MsHV, THV, and RCMV at the congruent position in the genome (2, 89). UL104, the portal protein for viral DNA encapsidation, reportedly interacting with the large terminase subunit UL56 and together with UL89, is involved in the resistance of HCMV to benzimidazole ribonucleosides (23).
An amorphous tegument is located between the virion capsid and envelope and comprises 50% of virion total mass (87). More than 27 HCMV proteins were reported to be tegument proteins, and of these the most abundant are the product of UL83 (pp65, lower matrix protein), with 15% of virion mass, and 9% (each) for UL82 (pp71, upper matrix protein and virion transactivator), UL32 (pp150, large matrix phosphoprotein), and UL48 (largest tegument protein). Although UL83 is the most abundant protein of the virion, it shares very low identity among HCMV, RCMV, and MCMV and was reported to be dispensable for growth in tissue culture (73). Neither MsHV nor THV carries the homologue of UL83. It will be interesting to determine which protein is instead the predominant protein in the virion of both viruses. In addition to their structural functions in virion morphogenesis, some tegument proteins have been shown to regulate viral gene expression or modify host cell responses to HCMV infection (44, 47, 79).
On the HCMV viral envelope, glycoproteins form three complexes: gCI, gCII and gCIII, or gB (glycoprotein B; UL55), gM-gN (UL100 and UL73), and gH-gL-gO (UL75, UL115, and UL74, respectively) (8). A study of the HCMV proteome using mass spectrometry revealed that the abundances of these glycoproteins relative to the total virion protein content are ordered as follows (from greatest to least): UL100 (9.2%), UL55 (1.4%), UL75/UL115 (0.6%/0.5%), UL73 (0.1%), and UL74 (<0.1%). Interestingly, the identities between the counterparts of MsHV and HCMV follow the same order: 52.8% (B100), 46.9% (B55), 34.2%/36.8% (B75/B115), 34% (B73), and 28.3% (B74).
Some betaherpesviruses encode multiple MHC class I homologues, such as UL18 and UL142 of HCMV (24) and CCMV (21), m144 of MCMV (64), R144 of RCMV (89), and gp147, gp148, and gp149 of GPCMV (72), and most recently, a rodent gammaherpesvirus, rodent herpesvirus Peru (RHVP), was reported to encode three ORFs, each of which was similar to a different region of the rat MHC class I protein (46). Although the scores are very low when BLASTed in GenBank, MsHV b39.5 is most similar to chimpanzee MHC class I chain-related protein, b39.4 to the MHC class I chain-related protein of crab-eating macaque, b39 and b39.2 to the MHC class II antigens of bovine and crab-eating macaque, respectively, and b39.1 to the hereditary hemochromatosis protein (HFE), which is a nonclassical MHC-related protein that associates with the transferrin receptor and regulates iron metabolism (68). Consistent with known MHC class I molecules, b39, b39.1, b39.2, b39.4, and b39.5 have a signal peptide at the N terminus, a transmembrane domain, and a cytoplasmic tail at the C terminus. Another MsHV protein, b149.22, contains an Ig-like domain profile according to a Prosite search but produces no significant hits in GenBank BLAST search. Hence, it most likely represents a new protein with no homologue in the known betaherpesvirus. MsHV b149.22 was predicted to be a type II transmembrane protein. The functions of these betaherpesvirus proteins were not fully characterized. For example, HCMV UL18 and UL142 and MCMV m144 have been reported to modulate natural killer cells (20, 63, 96), while RCMVΔr144 shows restricted replication in salivary glands and spleen of neonatal rats (3, 40). These proteins have no significant homology to any mammalian proteins (55) and are dispensable for virus replication in fibroblasts (7, 9, 28).
In addition to MHC class I homologues, betaherpesviruses also encode other genes which interfere with the MHC class I antigen-processing pathway, such as the HCMV U2 and U3 family members (45), MCMV m04, m06, and m152 (91), and RhCMV rh178 (67). Here we find that another protein in MsHV, b40, has 25% amino acid sequence identity with HHV-7 U21, which was recently reported to downregulate the MHC class I complex on the cell surface (50).
Near the 3′ end of the MsHV genome is a cluster of four ORFs: b156, b161, b162, and b163. Three of them, b156, b161, and b163, contain C-type lectin family domains, while b162 does not. Instead, b162 is more related to the natural killer cell lectin-like receptors than to C-type lectin proteins. Pairwise comparison showed that b156 and b163 share an identity of 39% whereas b162 and b163 have a sequence identity of 29% (Table 2). A transmembrane domain was predicted at the N-terminal end for all of these C-type lectin homologues, which is consistent with the type II transmembrane structure of C-type lectin proteins (100). The MsHV C-type lectin family shows a 24% sequence identity with its homologue in the English isolate of RCMV, which carries a spliced form of C-type lectin, and its interruption results in no change in virus replication in cell culture (90).
RCMV r127 (89) and HHV-6 U94 (83) were reported to be the only herpesvirus homologues of the parvovirus NS1 or Rep protein. Remarkably, situated at the congruent position in the MsHV genome, B125 has 22% and 27% identities with r127 and U94, respectively. The RCMV r127 protein is dispensable for virus replication both in vitro and in vivo (86); however, HHV-6 U94 was shown to inhibit the replication of not only HHV-6A and HHV-6B but also HHV-7 and HCMV, suggesting a role in the regulation of replication and latency of human betaherpesviruses (13). The conservation in direction and position of the parvovirus Rep homologues in the genomes of MsHV, RCMV, and HHV-6 indicates a likely common origin.
There is an ~18-kbp region which has significantly lower G+C content than other parts of the MsHV genome. Interestingly, toward the left-hand end of this region, there is a unique gene family of 16 tandemly arranged members (Fig. 4). The proteins encoded by this gene family vary in length from 219 to 330 amino acid residues, except for b139.2, which is only 112 amino acid residues long with a truncated N terminus. Most of them have a potential signal peptide at the N terminus and a transmembrane domain at the C terminus. A sequence alignment revealed more than 10 conserved cysteine residues (Fig. 5A). An initial sequence BLAST search in GenBank resulted in no significant hits; however, 9 of the 16 proteins were predicted to have one or two immunoglobulin-like beta-sandwich domains or other folds related to proteins functioning in the immune system. As shown in Fig. 5A, the predicted domains cover most of the aligned sequence except for the two ends. Phylogenetic analysis indicates that the b149 family forms 4 clades and vicinal members tend to cluster together (Fig. 5B). The sequence identities range from 33% to 48% among the b149.2 clade, 19% to 48% among the b149.1 clade, and 16% to 28% among the b149.13 clade.
The US22 family is a large gene family present in all betaherpesviruses. Different betaherpesviruses carry different numbers of US22 family members, but the genomic arrangement is similar, with two clusters of tandemly repeated homologues at each end of the genome and some isolated homologues located in between. The exception is the THV genome, where the two clusters of US22 family tandem repeats are both located at the left end (2). The MsHV genome encodes 18 members of the US22 family. Although most of the members have highest sequence identities with those of THV, the distribution of the MsHV US22 family members more resembles that of GPCMV (38) (Fig. 6). B154, which is positioned in a direction opposite to that of its neighboring genes, is an exception. This gene organization has not been observed in any other betaherpesviruses.
MsHV B33 and B78 belong to the G protein-coupled receptor (GCR) family, which is characterized by seven transmembrane domains. While HCMV UL33 is genetically diverse and UL78 is highly conserved among HCMV clinical isolates, both are dispensable for replication in fibroblasts (22, 54). MsHV b132 is another protein with a GCR signature but with only one transmembrane domain instead of seven, and it is not homologous to any known proteins.
Several predicted MsHV proteins have no homologue in GenBank. One of these, b43.1, is predicted to span the membrane seven times. Two others, b158 and b160, have 28% identity with each other.
The DPOL and gB sequences of MsHV were compared with those of two other betaherpesviruses from bats, bat BHV-2 (92) and bat BHV-1 (94). As predicted, DPOL and gB of bat BHV-2, which was isolated from Miniopterus fuliginosus, have high sequence identities to their MsHV counterparts, 86% and 91%, respectively. Bat BHV-1 was detected in two relatively distant species, Myotis nattereri and Pipistrellus pipistrellus. It was not included in the phylogenetic analysis because of the limited sequence available: only a 58-aa region of DPOL, with 51% identity to its counterparts in MsHV and bat BHV-2. Phylogenetic trees were constructed based on protein sequences of DPOL, gB, major capsid protein, and terminase, respectively (Fig. 7). The topologies of the four trees are very consistent with the taxonomy of betaherpesviruses, since the trees generally divide into 4 genera, cytomegalovirus, muromegalovirus, roseolovirus, and proboscivirus. As expected, MsHV and bat BHV-2 are closely related to THV according to these phylogenetic trees. Interestingly, GPCMV from guinea pig (38) together with Apodemus flavicollis cytomegalovirus 3 (AflaCMV3) from yellow-necked mouse (27), both within the Rodentia, were clustered closer to MsHV and THV than the muromegaloviruses. They form a branch distinct from but related to cytomegalovirus and muromegalovirus. Similarly, porcine cytomegalovirus (PCMV) consistently groups with the roseoloviruses rather than cytomegalovirus (30, 95).
Here we report the isolation and characterization of a new betaherpesvirus from the bat M. schreibersii. Like previously described betaherpesviruses, MsHV has a low rate of replication, reaching a peak 2 weeks after infection, which is comparable to that of HCMV (61). MsHV has a restricted species tropism in vitro and is able to infect only one cell line of nonbat origin from a total of 14 cell lines representing 10 different animal species. The genome of MsHV was sequenced using 454 next-generation sequencing and is 222,870 bp in length.
MsHV was initially identified in lymph node primary cells and then subjected to two rounds of purification and propagation with multiple passages in kidney primary cells. The number of passages was less than 10. Genetic stability of betaherpesviruses varies during in vitro passaging. HCMV and GPCMV have been shown to undergo genetic alterations, including gene loss (24, 38), while MCMV was reported to be stable (15). The genetic stability of MsHV is unknown. Given that our approach to identifying MsHV genes is based on finding homologues among the genes of other betaherpesviruses and on the common criteria used for unique gene prediction, we acknowledge that the data presented in this paper may not be a complete representation of MsHV genome content and that some of the predicted ORFs, especially those with splicing, may need revision in the future when transcription data are available to improve the annotation.
Many herpesviruses have terminal repeat (TR) sequences, and according to the presence and location, herpesvirus genomes can be divided into six groups (61). The pattern of TR sequences in betaherpesviruses varies significantly. HCMV and CCMV have large TR at both termini and the junction of UL and US (21); RCMV (89) and MCMV (64) have 504-bp and 31-bp TR, respectively, and GPCMV has TR at the left terminus only (72). A number of herpesviruses have no TR, including THV, RhCMV, CyCMV (48), and the gammaherpesvirus RHVP (46). MsHV also has no TR. We cannot exclude the possibility that only one copy of the TR unit was incorporated into the assembled genome, but taking the 428-bp average length of the sequencing reads into consideration, we think it is unlikely. MsHV, RCMV, MCMV, GPCMV, RhCMV, and CyCMV share a similar genome organization, which is colinear with the prototype isomer of the HCMV genome, while the THV genome is colinear with another isomer of the HCMV genome because its US22 family locus reversely translocates from one end to the other (Fig. 6). So far, of all the betaherpesvirus genomes sequenced, only HCMV and CCMV have a large TR and form four genomic isomers; none of the others, including RhCMV and CyCMV from primates in the Cercopithecidae, share these genomic features. The emergence of large TR and genomic isomers may have occurred very recently in betaherpesvirus evolutionary history, but this hypothesis needs to be validated with more genome sequences from different species.
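As an illustration of what "no TR" means at the level of the assembled sequence, the sketch below looks for the longest stretch that is present identically at both ends of a linear genome string. It is a hypothetical helper, not the procedure used in this study, and it only detects exact direct repeats in the final assembly. It cannot address the caveat raised above, that an assembler may have collapsed two TR copies into one, which would require read-level evidence.

```python
def terminal_direct_repeat_length(genome: str, min_len: int = 10) -> int:
    """Length of the longest exact direct repeat shared by both termini.

    Returns 0 if no repeat of at least min_len bases is found.
    Assumption: the two copies do not overlap, so the repeat unit is at
    most half the genome length.
    """
    n = len(genome)
    # Try the longest candidate first so the first hit is the longest repeat.
    for k in range(n // 2, min_len - 1, -1):
        if genome[:k] == genome[n - k:]:
            return k
    return 0


# Toy, made-up sequences (not MsHV data): a 10-base unit at both ends.
unit = "ACGTACGTTT"
toy_with_tr = unit + "GGGCCCAAATTTGGG" + unit
toy_without_tr = "ACGTTGCAAGGCTTACGGAT"

tr_len = terminal_direct_repeat_length(toy_with_tr, min_len=5)
no_tr_len = terminal_direct_repeat_length(toy_without_tr, min_len=5)
```

The scan compares one prefix and suffix pair per candidate length, which is adequate for a single genome of a few hundred kilobases. A real analysis would also need to tolerate mismatches and check inverted repeats.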
One of the most significant findings in this study is that MsHV encodes multiple families of gene homologues implicated in immunity. One family encodes MHC class I homologues. Virus-encoded MHC class I homologues have previously been identified in some betaherpesviruses (21, 64, 72, 89) and in the gammaherpesvirus RHVP (46, 99). HCMV evades T-cell recognition by downregulating expression of MHC class I molecules on the surface of infected cells, an effect mediated by the products of the virus-encoded genes US2, US3, US6, and US11 (85). Although MsHV does not carry any homologues of US2, US3, US6, or US11, MsHV b40 is a homologue of HHV-7 U21, which downregulates classical and nonclassical MHC class I complexes from the cell surface (50). The downregulation of endogenous MHC class I would expose infected cells to attack by natural killer (NK) cells.
MsHV also encodes two MHC class II homologues and one homologue of the HFE protein which is closely related to MHC class II. To our knowledge, virus-encoded MHC class II homologues have not previously been reported. However, HCMV US2 and US3 not only inhibit the MHC class I antigen presentation pathway, which allows infected cells to evade recognition by CD8+ T lymphocytes, but also inhibit presentation of exogenous protein antigens to CD4+ T lymphocytes, which is part of the MHC class II antigen presentation pathway (84). Other viruses were also reported to target the MHC class II processing pathway for immune evasion (34). Epstein-Barr virus, a gammaherpesvirus, downregulates MHC class II expression during its reactivation (42). Future functional studies of the MsHV MHC class II homologues are required to understand the roles they may play in potentially novel virus immune evasion pathways.
The term C-type lectin was originally introduced to distinguish between Ca2+-dependent and Ca2+-independent carbohydrate-binding lectins, but the C-type lectin superfamily now includes a large number of proteins that are homologous to the C type but do not bind carbohydrates (100). Recent studies have identified C-type lectins as an important family of pattern recognition receptors (PRRs) that are involved in the induction of specific gene expression profiles in response to specific pathogens (29). The human NK receptor complex includes a group of type II transmembrane proteins resembling C-type lectins, for which there is accumulating evidence to support crucial roles in the innate immune system (36, 69). MsHV b156, b161, b162, and b163 are related to C-type lectin or natural killer cell lectin-like receptors. The English strain of RCMV was reported to express a C-type lectin protein with high similarity to host C-type lectins (90), but a genome-wide comparative analysis between this virus and MsHV was not possible because of the limited sequence available for RCMV. C-type lectin homologues have also been found in the genomes of other viruses, including gammaherpesviruses (46, 57), poxviruses (6, 26, 81), and African swine fever virus (35). While some of them are dispensable or play some role in virus infection, the EBV C-type lectin homologue gp42 was reported to bind to the MHC class II receptor HLA-DR1 (57).
A gene family of 16 members, the b149 family, was identified at the 3′ end of the MsHV genome, before the US22 family. The b149 family gene products have no significant homologues in GenBank, but three-dimensional (3D) structure prediction analysis indicated that they all contain immunoglobulin-like beta-sandwich folds. Interestingly, in the MCMV genome, the m145 family, a unique gene family of 12 ORFs, was predicted to have MHC class I-like folds with a confidence of 70% to 90% (76). Remarkably, in both the b149 and m145 families, the members located in the center of the gene cluster are more conserved with each other (66). As reviewed by Revilleza et al. (66), the m145 family plays an important role in disrupting NK-cell recognition. For example, m138, m145, m152, and m155 downregulate different ligands of NKG2D, and m157 binds both the inhibitory NK receptor Ly49I and the activating receptor Ly49H. Interestingly, the NK-cell surface molecules NKG2D, Ly49I, and Ly49H all belong to the C-type lectin family. The prediction confidence for the b149 family is low, ranging from less than 10% to 35%, which could be due to the fact that there is little information available in public databases for bat immune molecules. If their function in immune evasion can be demonstrated experimentally, these molecules may represent an important new class of immune modulators yet to be fully characterized in other mammalian viruses.
We thank Jackie Pallister and Chris Cowled for critical readings of the manuscript, Chris Cowled, Michelle Baker, and Justin Ng for useful discussions about MHC and other immune-relevant molecules, and Hume Field, Craig Smith, and Carol De Jong for providing bats in the original primary cell line work. The technical help from Kaylene Selleck and the AAHL sequencing group is highly appreciated.
This study is supported in part by an NHMRC Australia-China Postdoctoral Exchange Fellowship (to H.Z.) and an OCE Science Leader Award from the CSIRO Office of the Chief Executive (to L.-F.W.).
Published ahead of print 23 May 2012