EHV-1 is associated predominantly with outbreaks of abortigenic disease and more rarely with outbreaks of neurological disease. For many years, it has been suspected that distinct strains of EHV-1 may be responsible for outbreaks of neurological disease. However, previous molecular epidemiology studies, although demonstrating genetic heterogeneity (e.g., via RFLP analyses), had failed to identify genetic markers associated with neuropathogenic disease outbreaks. The starting point for this study was the determination of all nucleotide differences between two strains of EHV-1, Ab4 (high frequency of abortion, neuropathogenic) and V592 (low frequency of abortion, nonneuropathogenic), which have been shown by experimental infection studies with ponies to exhibit markedly different pathogenicities. Following determination of the complete genomic sequence of V592 and comparison with the previously determined sequence of Ab4 (56), we sought to test the following hypotheses. (i) Positions of sequence variation between Ab4 and V592 are indicative of regions of sequence variability among EHV-1 field isolates and will therefore provide markers enabling discrimination between EHV-1 strains. (ii) Ab4 is representative of a distinct group of genetically closely related EHV-1 strains capable of causing neurological disease. (iii) One or more specific sequence markers that vary between Ab4 and V592 are indicative of strains capable of causing neurological disease.
Overall, there was relatively low variation between the Ab4 and V592 sequences, with approximately one variable position per 1,000 bp. The majority of changes were point mutations, involving single nucleotide substitutions. There were also several small deletions or insertions, involving 1 or 2 nucleotides, together with larger deletions and insertions involving a contraction or expansion of short direct repeat elements. Analysis of the larger regions of variable repeat length was of limited value for typing of strains, due to their inherent instability, in agreement with previous studies (2
On the assumption that the pathogenic differences between Ab4 and V592 are the result of sequence changes in one or more viral proteins, we evaluated SNPs for several ORFs, found to vary in coding sequence between Ab4 and V592, for a panel of EHV-1 isolates. A region of ORF68 was found to be the most variable, with 19 distinct sequences observed, and was used as the primary marker for grouping isolates into six major strain groups (Table ; Fig. ). A frameshift in ORF68 had previously been identified (34), due to the presence of either 7 or 8 G residues in a homopolymeric tract, with Ab4 (G8) expressing a longer form of the protein (418 amino acids) than other isolates (303 amino acids). In this study, additional variants were identified with either G6 or G9 in this region. Isolates encoding G6, G8, or G9 were all assigned to ORF68 group 1 and were closely related in ORF68 sequence (and other SNPs [see below]) to group 2 isolates. The majority of field isolates were found to encode G7 and are therefore predicted to express the 303-amino-acid form of the polypeptide. ORF68 is homologous to Us2 of HSV-1 and related genes in other alphaherpesviruses (56). The region of homology between ORF68 and Us2 lies upstream of the frameshift region, suggesting that all of the forms expressed by EHV-1 isolates are likely to retain important functional domains.
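The effect of tract length on reading frame is simple modular arithmetic, which the following minimal Python sketch illustrates. The flanking sequences are hypothetical placeholders (not ORF68 sequence), the helper names are invented for illustration, and G7 is taken as the reference length only because the text reports it as the most common form among field isolates.

```python
import re

def longest_g_run(seq):
    """Return the length of the longest uninterrupted run of G in seq."""
    runs = re.findall(r"G+", seq.upper())
    return max((len(run) for run in runs), default=0)

def frame_offset(run_length, reference_length=7):
    """Reading-frame shift (0, 1, or 2) downstream of the tract,
    relative to a tract of reference_length residues."""
    return (run_length - reference_length) % 3

# Hypothetical flanks, for illustration only (not real ORF68 sequence).
LEFT, RIGHT = "ATCACT", "ACCTGA"

for n in (6, 7, 8, 9):
    seq = LEFT + "G" * n + RIGHT
    run = longest_g_run(seq)
    print(f"G{run}: frame offset {frame_offset(run)} relative to G7")
```

Under these assumptions, G8 is shifted by one position relative to G7, whereas G6 and G9 differ from each other by a whole codon and therefore share a downstream frame. The sketch says nothing about the polypeptides actually expressed, which depend on the real downstream sequence.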
Analysis of SNPs for several other ORFs (Table ; Fig. ) supported the groupings assigned according to ORF68 sequence variation. Groups 1 and 2 appear to be closely related, with characteristic SNPs for ORF15, -37, -45, and -52, which were not observed with other strain groups. Group 1 is distinguished from group 2 solely on the basis of the number of Gs in a homopolymeric run, likely to be a relatively unstable marker. Groups 3, 4, and 5 shared SNPs for most of the ORFs analyzed, but SNPs distinguishing between these groups were also observed. Group 6 has unusual SNPs in several ORFs and, on the basis of the ORFs analyzed, is the most highly diverged from the other groups. Since the ORFs analyzed were selected on the basis of differences between Ab4 (group 1) and V592 (group 6), it is uncertain whether the high proportion of SNPs specific to group 6 reflects relatively early divergence of group 6 from the other groups or is due to sampling bias. These findings support the first hypothesis presented above and suggest that genetically distinct strains of EHV-1 are in circulation.
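The grouping logic described above can be sketched in a few lines of Python. This is an illustrative outline only: the isolate names and allele labels are hypothetical, the function names are invented here, and the sketch is not the typing procedure used in the study. It groups isolates by their combined alleles at a set of loci and checks whether a further marker takes a single allele within each group (i.e., cosegregates with the grouping).

```python
from collections import defaultdict

# Hypothetical isolate names and allele calls, for illustration only.
PROFILES = {
    "isolate_1": {"ORF15": "x", "ORF37": "x", "ORF45": "x"},
    "isolate_2": {"ORF15": "x", "ORF37": "x", "ORF45": "x"},
    "isolate_3": {"ORF15": "y", "ORF37": "y", "ORF45": "x"},
    "isolate_4": {"ORF15": "y", "ORF37": "y", "ORF45": "y"},
}

def group_by_profile(profiles, loci):
    """Group isolate names by their tuple of alleles at the given loci."""
    groups = defaultdict(list)
    for name in sorted(profiles):
        key = tuple(profiles[name][locus] for locus in loci)
        groups[key].append(name)
    return dict(groups)

def cosegregates(profiles, marker, loci):
    """True if every multilocus group carries a single allele of marker."""
    for members in group_by_profile(profiles, loci).values():
        if len({profiles[m][marker] for m in members}) > 1:
            return False
    return True

groups = group_by_profile(PROFILES, ("ORF15", "ORF37"))
for key, members in groups.items():
    print(key, members)
print(cosegregates(PROFILES, "ORF45", ("ORF15", "ORF37")))
```

In this toy data set the hypothetical ORF45 marker is split within one group, so the final check reports that it does not cosegregate; a marker that supports the grouping would return True.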
Studies of other herpesviruses have demonstrated differing degrees of sequence diversity. Clinical isolates of human cytomegalovirus (HCMV) show significant sequence diversity by RFLP analysis and sequencing of selected ORFs (10). Furthermore, it is difficult to assign HCMV isolates into strain groups, due to a relatively high frequency of recombination, indicated by variation of gB via intragenic recombination and relative rarity of genetic linkages between nine separate variable loci (20). Similarly, a study of sequence variation between clinical isolates for genes within the Us region of HSV-1 (gG, gI, and gE) concluded that homologous recombination (intra- and intergenic) was a common event (40). In contrast, varicella-zoster virus (VZV) is relatively genetically stable and has been estimated to have a sequence diversity approximately 10 times lower than that of HSV and 40 times lower than that of HCMV (36). Analysis of VZV strains indicates significant geographical/ethnic restriction of the circulation of strain groups and shows that recombination between different strain groups is a relatively rare event (28). These differences in the various herpesviruses are likely to reflect differences in their biological properties, in particular relating to their modes of transmission and frequency of reactivation.
The low level of sequence divergence (0.1%) between the Ab4 and V592 genomes and the generally good agreement between the ORF68 grouping and SNPs in other ORFs suggest that EHV-1 genomes are relatively stable, in agreement with previous assessments of EHV-1 sequence diversity (4). The current study has uncovered no evidence for recombination in the isolates and SNPs so far, although more data would be required to reach a definitive conclusion about the potential significance of recombination in the virus' life cycle. The relatively low degree of sequence diversity observed for EHV-1 is similar to that of VZV. EHV-1 is also similar to VZV in showing geographical restriction of certain strain groups. Thus, EHV-1 group 5 isolates were isolated predominantly from outbreaks in North America and group 3 isolates predominantly from outbreaks in Europe. Notably, there was no obvious association between any particular ORF68 strain group and outbreaks of neurological disease, with five of the six major groups including both neuropathogenic and nonneuropathogenic isolates. It is clear that Ab4 is not representative of a group of related strains capable of causing neurological disease, thereby disproving the second hypothesis presented above.
In contrast to all of the other markers tested, an SNP in ORF30 (G/A2254) did not cosegregate according to the ORF68 strain grouping (Table and unpublished data). Both forms were present in five of the six major strain groups (the exception being group 6, which comprises only three members, all of which encoded A2254). The ORF30 SNP (G/A2254), corresponding to amino acid variation D/N752 of the DNA polymerase (GAC-AAC is the only single nucleotide substitution that will result in a D-to-N change), instead showed a very strong association with neuropathogenic/nonneuropathogenic isolates, respectively. This result is in agreement with the third hypothesis presented above and demonstrates that one of the Ab4 specific alleles (ORF30 G2254) shows strong predictive value as a marker of isolates with a propensity for causing neurological disease. It was also noted that three of the seven isolates from neurological outbreaks that encoded ORF30 A2254 encoded an additional nucleotide change, A/G2279, resulting in the coding change D/G760. The single outbreak involving an isolate encoding G2279 that did not include neurological disease occurred on premises with only two pregnant mares (of which one aborted), where there were no other horses in contact with the infected horses. There was no statistically significant association of the G2279 SNP with neuropathogenicity (P = 0.066), although considering this allele occurred infrequently (4/131 isolates), analysis of additional isolates encoding this allele is required to determine whether this may be an uncommon ORF30 variant associated with increased pathogenic potential.
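The correspondence between the nucleotide and amino acid coordinates quoted above, and the statement that GAC-to-AAC is the only single substitution converting this D codon to N, can be checked with a short Python sketch. It assumes the standard genetic code and 1-based numbering from the first nucleotide of the ORF; the function names are invented for illustration.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO)
}

def codon_position(nt):
    """Map a 1-based ORF nucleotide position to
    (codon number, position within the codon), both 1-based."""
    return (nt - 1) // 3 + 1, (nt - 1) % 3 + 1

def single_substitutions(codon, target_aa):
    """List codons one nucleotide away from codon that encode target_aa."""
    hits = []
    for i, base in product(range(3), BASES):
        if base != codon[i]:
            mutant = codon[:i] + base + codon[i + 1:]
            if CODON_TABLE[mutant] == target_aa:
                hits.append(mutant)
    return hits

print(codon_position(2254))               # (752, 1)
print(codon_position(2279))               # (760, 2)
print(single_substitutions("GAC", "N"))   # ['AAC']
```

Nucleotide 2254 is the first position of codon 752 and nucleotide 2279 is the second position of codon 760, consistent with the D/N752 and D/G760 coding changes described above.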
It is important to consider whether differences in the techniques used to generate DNA from the two different outbreak types (neuropathogenic and nonneuropathogenic) may have affected the results, in particular, the potential selection of ORF30 sequence variants in tissue culture. In this study, the majority of DNA samples were analyzed from isolates which had been prepared by limited passage (typically two to five passages) in tissue culture, with only 8/131 samples (all from abortigenic outbreaks) prepared directly from tissue samples without cell culture. There was not, therefore, a consistent difference in the techniques used to generate the panels of neuropathogenic and nonneuropathogenic samples within each laboratory submitting the samples (although the choices of cells used for virus isolation differed between the various laboratories). Where ORF30 sequences were compared between DNA samples prepared directly from tissues and samples prepared from tissue culture isolates (two neuropathogenic outbreaks), the sequences were found to be identical. Combined with the observation that where samples from more than one horse in a given outbreak were analyzed (seven neuropathogenic and five nonneuropathogenic outbreaks), all samples yielded the same ORF30 sequence, we are confident that the ORF30 D/N752 sequence variation is not an artifact of tissue culture selection.
Two tissue culture attenuated vaccine strains were characterized for their DNA polymerase sequence. One of these strains, the component of the vaccine Rhinoquin (47), was found to encode the DNA polymerase D752 sequence typical of neuropathogenic strains. Although found to be attenuated in safety and efficacy studies (45), Rhinoquin was associated with paralytic disease when used as a vaccine in the United States and as a consequence was withdrawn from the market (27). It is difficult to verify whether the cases of neurological disease were directly related to administration of Rhinoquin or coincidental to the vaccine trial, but the cases showed a strong temporal link, having occurred 8 to 11 days after administration of the vaccine. It can be speculated that either Rhinoquin reverted the attenuating mutations and was then able to express neuropathogenicity or the Rhinoquin DNA polymerase sequence was introduced into wild-type field strains via recombination, resulting in neurological disease. A second tissue culture attenuated strain used as the progenitor of live vaccines (RacH [30]) also encoded DNA polymerase D752. In contrast to Rhinoquin, vaccines derived from RacH have not been associated with neurological disease, and such vaccines are in current use in the United States (Rhinomune; Pfizer Animal Health [6]) and in Europe (Prevaccinol; Intervet [30]). It may be significant, therefore, that RacH encodes a second amino acid coding change, Y/S753, that was not identified in any of the field isolates. In this regard, it should be noted that the position corresponding to Y/S753 is conserved in other herpesviruses as a hydrophobic residue (Fig. ); hence, the S753 mutation of RacH is a nonconservative change.
The strong association of DNA polymerase D/N752 with neurological versus nonneurological disease outbreaks suggests that the viral genotype is a major contributory factor in determining whether a disease outbreak will include animals presenting neurological signs. This does not mean that infection with neuropathogenic strains inevitably results in clinical signs of neurological damage, since the clinical outcome of infection for individual animals within an outbreak will be modulated by other factors, which may include age, host immunity, challenge dose, and hormonal status. Consequently, the proportions of animals that ultimately show neurological signs when infected with a neuropathogenic strain vary for each outbreak. For example, during a large outbreak of neurological EHV-1 disease in the United Kingdom, the neurological attack rate (number of cases presenting neurological signs/number of horses confirmed to be infected) was 20% (33). In four recent, large outbreaks of EHV-1 neurological disease in horses in the United States, for which complete records were kept (G. Allen, unpublished data), the mean neurological attack rate was 33% (range of 22% to 50%) and the mean neurological case fatality rate was 40% (range of 20% to 50%). All five outbreaks described above involved EHV-1 strains carrying the neuropathogenic-associated form of the DNA polymerase (D752).
Notably, the D/N752 variable position of EHV-1 DNA polymerase corresponds to a highly conserved position in other herpesviruses. Alignment of herpesvirus DNA polymerase sequences from diverse host species indicated that the equivalent position was conserved as an acidic residue (almost invariably D) in other herpesviruses (Fig. ). This suggests that the ancestral EHV-1 virus probably encoded D752 and that variants expressing N752 arose subsequently, possibly due to a selective advantage. The observation that the D/N752 SNP is unlinked to any of the other SNPs tested suggests that either the D/N752 mutation has arisen relatively recently and independently in five of the six major strain groups or this position is relatively unstable. Outbreaks of EHV-1 neurological disease are relatively rare compared with outbreaks of nonneurological disease, which suggests that the majority of EHV-1 strains in circulation encode N752. It may further be proposed that D752 variants arise spontaneously but may have a selective disadvantage for long-term maintenance in the population. Comparison between herpesvirus DNA polymerase sequences and those of other related polymerases (from organisms as diverse as bacteriophages and mammals) has revealed a number of conserved domains (23). The variable D/N752 residue lies between the conserved domains designated II and VI, which comprise regions essential for DNA polymerase activity. Mutations proximal to this position (mostly within domains II and VI), which influence antiviral drug sensitivity for herpes simplex virus, varicella-zoster virus, and human cytomegalovirus, have been noted previously (11). Thus, although the variable D/N752 residue is not within one of the domains known to be critical for function, it does lie within the core, catalytic region of the enzyme and may therefore influence enzymatic activity.
Considering the strong association of D/N752 with neuropathogenic potential, it is reasonable to speculate that the D/N752 coding change affects functional properties of the DNA polymerase that play a direct role in the etiology of EHV-1 neurological disease. Replication studies with various cell types in tissue culture (fibroblast and endothelial) have shown no consistent differences between V592 and Ab4 (55; R. Chiam, K. Smith, and N. Davis-Poynter, unpublished data). It is possible, however, that alterations of DNA polymerase activity affecting replication may be apparent only in specific cell types or in the context of tissues in vivo. Other possible (albeit speculative) modes of action may include variation in the efficiency with which different forms of the DNA polymerase stimulate immune responses (e.g., cytotoxic T lymphocyte) against EHV-1-infected target cells, thereby affecting the efficiency of clearance of the infection. Alternatively, the neuropathogenic form of EHV-1 DNA polymerase may promote damage to blood vessels supplying the CNS, either directly via effects upon infected endothelial cells or indirectly via immunopathology. In order to test the hypothesis that the D/N752 substitution of the DNA polymerase affects neuropathogenicity, recombinant EHV-1 with the substitution at residue 752 engineered on a syngeneic background has been generated (L. Goodman and N. Osterrieder, unpublished data) and will be used in future in vivo studies to determine the influence of DNA polymerase sequence variation upon EHV-1 disease outcome.
The strong association of a naturally occurring SNP of a single gene with pathogenic potential, observed for EHV-1 DNA polymerase in this study, is to our knowledge unprecedented for a herpesvirus. The implication is that mutation of a single nucleotide has a significant effect upon pathogenicity, although this hypothesis has not yet been tested. This raises an important question concerning the origin of EHV-1 strains giving rise to neurological disease outbreaks, namely, whether they usually originate from nonneuropathogenic strains, via spontaneous mutation of the DNA polymerase (from N752 to D), or whether they are usually the result of reactivation from animals previously exposed to neuropathogenic strains. Currently, we have no data regarding the relative frequency of latent carriage of EHV-1 strains encoding D/N752. DNA polymerase activity may conceivably influence the establishment of, or reactivation from, latency. Thus, establishment of latency may be promoted by abortive replication, if DNA polymerase activity is insufficient to promote full late gene expression prior to the genome entering a quiescent state; conversely, during the early stages of reactivation, DNA polymerase activity may be critical in triggering full lytic cycle gene expression. If EHV-1 DNA polymerase sequence variation does have an impact upon latency/reactivation, this may provide a source of selective pressure acting upon the DNA polymerase D/N752 sequence. In the absence of a reliable system for evaluating the efficiency of EHV-1 latency establishment and reactivation, it will be difficult to test the abovementioned hypothesis.
In conclusion, determination of the complete genomic sequence of a nonneuropathogenic strain of EHV-1 (V592) and comparison with the previously published genomic sequence of a neuropathogenic strain (Ab4) has led to two important findings. First, a method for subtyping EHV-1 isolates recovered from the field via multilocus typing, which will be useful in tracing the transmission of virus strains between outbreaks, has been developed. Second, an SNP of ORF30 (G/A2254), resulting in an amino acid coding change of the DNA polymerase (D/N752), shows strong predictive value as a marker of isolates with a propensity to cause neurological disease. If a causal link between DNA polymerase sequence variation and neuropathogenic potential is established, determination of the molecular mechanisms underlying the link will assist future diagnostic, prophylactic, and therapeutic strategies designed to reduce the incidence and severity of neurological EHV-1 disease. These findings may also have implications for molecular epidemiological studies of other herpesviruses. In the case of EHV-1, variation of a single gene was found to be associated with differences in pathogenic potential, whereas none of the other SNPs tested showed significant association with disease. If single gene polymorphisms similarly have a significant association with disease manifestation for other herpesviruses, such strain-related differences may not have been apparent from previous analyses of a limited subset of variable gene sequences. The availability of methods for whole-genome SNP analysis will facilitate more-comprehensive studies and may reveal previously unrecognized associations between herpesvirus strain variation and disease.