In March 2005, the Centers for Disease Control and Prevention (CDC) investigated a large hemorrhagic fever (HF) outbreak in Uige Province in northern Angola, West Africa. In total, 15 initial specimens were sent to CDC, Atlanta, Ga., for testing for viruses associated with viral HFs known to be present in West Africa, including ebolavirus. Marburgvirus was also included despite the fact that the origins of all earlier outbreaks were linked directly to East Africa. Surprisingly, marburgvirus was confirmed (12 of 15 specimens) as the cause of the outbreak. The outbreak likely began in October 2004 and ended in July 2005, and it included 252 cases and 227 (90%) fatalities (report from the Ministry of Health, Republic of Angola, 2005), making it the largest Marburg HF outbreak on record. A real-time quantitative reverse transcription-PCR assay utilized and adapted during the outbreak proved to be highly sensitive and sufficiently robust for field use. Partial marburgvirus RNA sequence analysis revealed up to 21% nucleotide divergence among the previously characterized East African strains, with the most distinct being Ravn from Kenya (1987). The Angolan strain was less different (~7%) from the main group of East African marburgviruses than one might expect given the large geographic separation. To more precisely analyze the virus genetic differences between outbreaks and among viruses within the Angola outbreak itself, a total of 16 complete virus genomes were determined, including those of the virus isolates Ravn (Kenya, 1987) and 05DRC, 07DRC, and 09DRC (Democratic Republic of Congo, 1998) and the reference Angolan virus isolate (Ang1379v). In addition, complete genome sequences were obtained from RNAs extracted from 10 clinical specimens reflecting various stages of the disease and locations within the Angolan outbreak. 
While the marburgviruses exhibit high overall genetic diversity (up to 22%), only 6.8% nucleotide difference was found between the West African Angolan viruses and the majority of East African viruses, suggesting that the virus reservoir species in these regions are not substantially distinct. Remarkably few nucleotide differences were found among the Angolan clinical specimens (0 to 0.07%), consistent with an outbreak scenario in which a single (or rare) introduction of virus from the reservoir species into the human population was followed by person-to-person transmission with little accumulation of mutations. This is in contrast to the 1998 to 2000 marburgvirus outbreak, where evidence of several virus genetic lineages (with up to 21% divergence) and multiple virus introductions into the human population was found.
Marburgvirus was first identified in 1967, when infected monkeys, imported from the Lake Kyoga region of Uganda, transmitted the virus to laboratory workers and scientists at facilities in Germany and the former Yugoslavia (56). Marburgvirus and its close relative ebolavirus comprise the family Filoviridae and are known to cause severe hemorrhagic fever (HF) in humans and nonhuman primates (38). Marburgvirus is a single species, consisting of viruses differing from one another by up to 21% at the nucleotide level. In contrast, four distinct species of ebolavirus (Zaire, Sudan, Reston, and Ivory Coast) have been defined, which differ genetically from one another by approximately 37 to 41% (15). Marburgvirus is a single-strand, negative-sense RNA virus whose seven gene products are, in order, nucleoprotein (NP), VP35, VP40, glycoprotein (GP), VP30, VP24, and the polymerase (L) (14). Ebola and Marburg HF outbreaks are generally thought to involve the relatively rare introduction of the virus into the human population followed by waves of human-to-human transmission (usually through close contact with infected individuals or their body fluids), with little if any evolution of the virus during the course of the outbreak (42, 48). Symptoms of marburgvirus infection include general malaise, acute fever, abdominal cramping, bleeding disorders, and shock (29). Case fatality rates in previous large Marburg HF outbreaks have ranged from 23 to 83% (5, 30).
Historically, sources of marburgvirus have been confined to East Africa. They have been centered almost exclusively within 500 miles of Lake Victoria, with the exception of a single case in Zimbabwe in 1975, when a traveler became infected and, seeking medical treatment, subsequently transmitted the virus to a health care worker in South Africa. This previous close association of marburgvirus with East Africa contrasts with the observed distribution of ebolavirus, which has caused human HF outbreaks throughout tropical Africa, ranging from Cote d'Ivoire to Uganda. The largest Marburg HF outbreak previously on record occurred from 1998 to 2000 in Durba, Democratic Republic of Congo (DRC) (57), infecting 154 people and with a case fatality rate of approximately 83% (5). The majority of these patients were involved directly or indirectly with illegal gold mining activity in the region. The Durba outbreak was somewhat unusual compared to previous marburgvirus outbreaks in that it continued for almost 2 years, during which time multiple distinct genetic lineages were found to be circulating (37), likely indicating several independent introductions of virus into the human population from the unknown natural reservoir.
The recent emergence of marburgvirus in West Africa is described in this report. Based upon clinical presentations that meet the viral HF case definition (10), the outbreak likely started in October of 2004 in Uige Province in northern Angola yet continued undiagnosed until March 2005, when transmission of HF to health care workers alerted the community to the possibility of Marburg or Ebola HF (9, 58). The outbreak persisted until July 2005 (32), raising the possibility that multiple genetic lineages could have been introduced during this long epidemic period. In this study, we sequenced the entire marburgvirus genomes obtained from multiple laboratory-confirmed cases throughout the Angola outbreak and compared these sequences to those obtained at different times during the 1998 marburgvirus outbreak in Durba, DRC, as well as the complete genome of Ravn marburgvirus isolated in 1987. Collectively, these 16 newly derived complete genome sequences, along with those previously determined for the Popp, Musoke, and Ozolin strains, afforded us the opportunity to carry out the most comprehensive survey to date of the sequence spectrum circulating among marburgviruses. We determined that the newly emergent Angola marburgvirus is closely related to the majority of the previous East African isolates.
Blood and serum specimens from suspected viral HF (VHF) patients were received unfrozen at CDC in International Air Transport Association-approved containers. Samples 200501377 to -88, 200501411, and 200501412 were opened under biosafety level 4 conditions, and virus isolation using Vero E6 cells was performed as previously described (42). The Angola marburgvirus reference sample is clinical specimen number 200501379, or “1379c,” while the virus isolated from 1379c is designated “1379v”. The samples were inactivated with gamma-cell irradiation or chaotrope for RNA extraction and then tested for marburgvirus antigen, specific immunoglobulin M (IgM), IgG, and nucleic acid by using modifications of existing techniques for Ebola hemorrhagic fever (22, 23, 48, 49). The modifications for the IgM and IgG enzyme-linked immunosorbent assays (ELISAs) consisted of (i) a replacement of the pooled monoclonal antibody mix with one directed to marburgvirus (Musoke strain) NP and (ii) a replacement of the Vero cell lysates with ones in which Vero cells were infected with marburgvirus (Musoke strain). The modifications for the nucleic acid detection were to use a marburgvirus-specific quantitative reverse transcription-PCR (Q-RT-PCR) assay designed to detect all known species of marburgvirus (described below). Immunohistochemical (IHC) staining was performed on skin biopsies fixed in 10% formalin as previously described (60) and adapted for detection of marburgvirus. Specimens 2005000126 to -998, processed in the field lab, were not gamma irradiated but were treated additionally with 0.5% Triton X-100 and heated at 56°C for 30 min to greatly reduce infectivity.
RNA was extracted using an Applied Biosystems 6100 nucleic acid preparation station according to a modified protocol (49) using noncellular lysis buffer. Briefly, 50 μl of sample was mixed with 300 μl of 2× noncellular lysis buffer under biosafety containment conditions. Following >15 min of incubation, plastic ware containing samples in lysis buffer was externally decontaminated with 3% Lysol, and RNAs were transported out of biosafety containment and extracted according to the manufacturer's instructions, with the modification of adding two extra 1× noncellular lysis buffer washes at the beginning of the extraction process.
Briefly, a Q-RT-PCR assay was designed to detect a region in VP40. Q-RT-PCR assays were carried out by first converting the RNA to cDNA by random priming, using the High Capacity cDNA synthesis kit (Applied Biosystems) according to the manufacturer's instructions. Ten microliters of the cDNA synthesis reaction mixture was then used to program probe-based real-time PCR assays, using Universal Master Mix (Applied Biosystems) in 50-μl total reaction volumes. The marburgvirus broadly reactive Q-PCR assay used the following forward and reverse primers and probe, respectively: 5′-GGACCACTGCTGGCCATATC-3′, 5′-GTCGGCAGGAGGIGAAATCC-3′, and 6-carboxyfluorescein-5′-CTCTGGGACTTTTTCIACCCTCAGTTGATGA-3′. The quencher, Black Hole, was internally placed on the T indicated by boldface within the probe oligonucleotide. Amplifications were carried out for 40 cycles by denaturing for 20 seconds at 94°C followed by 30 seconds at 60°C. Samples exhibiting cycle threshold (CT) values of less than 35 were considered positive. The VP40 gene sequence of the Angola-05 marburgvirus isolate was determined and allowed development of an Angola-specific Q-RT-PCR assay. The slightly modified primers and probe were 5′-GGTCCACTGCTGGCCATATC-3′, 5′-GTCGGCAGGAAGCGAAATCC-3′, and 6-carboxyfluorescein-5′-TTCTGGGACTTTTTCGACTCTCAGTTGATGA-3′, where the quencher was again internally placed on the T indicated by boldface. The modified Q-RT-PCR assay provided two- to fourfold (1 to 2 CT units) better sensitivity in samples assayed side by side with the original assay (data not shown).
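The correspondence between CT units and fold sensitivity quoted above follows from the roughly twofold increase in product per PCR cycle. The short Python sketch below illustrates that arithmetic together with the CT < 35 positivity rule. The function names and the assumption of perfect amplification efficiency are illustrative only and are not part of the assay as described.

```python
def fold_difference(delta_ct: float, efficiency: float = 1.0) -> float:
    """Fold difference in detectable template implied by a CT shift.

    efficiency=1.0 assumes perfect doubling per cycle, which is an
    idealization; real assays usually fall somewhat short of it.
    """
    return (1.0 + efficiency) ** delta_ct


def is_positive(ct: float, cutoff: float = 35.0) -> bool:
    """Apply the rule from the text: CT values below 35 are positive."""
    return ct < cutoff


if __name__ == "__main__":
    for delta in (1, 2):
        print(f"delta CT = {delta}: about {fold_difference(delta):.0f}-fold")
    # Hypothetical CT readings, for illustration only.
    print(is_positive(28.4), is_positive(36.1))
```

Under the perfect-doubling assumption, a shift of 1 CT unit corresponds to a twofold difference and 2 CT units to a fourfold difference, which matches the range reported for the Angola-specific assay.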
Ravn isolate (1987), strain M/Kenya/Kitum Cave (20), was passaged once in SW13 cells, followed by four passages in Vero E6 cells. 05DRC, 07DRC, and 09DRC viruses (Durba, DRC, 1998) were passaged three times each in Vero E6 cells.
The reference sequence, Ang1379c, was determined from RNA isolated directly from clinical material and also from the virus isolate (Ang1379v) after one passage on Vero E6 cells. Multiple RT-PCR fragments were generated using the Superscript III one-step high-fidelity RT-PCR kit (Invitrogen) according to the manufacturer's instructions. The fragments, amplified using the following combinations of primers (listed 5′ to 3′) and designed based upon a consensus sequence derived from an alignment of Musoke, Popp and Ozolin marburgviruses (GenBank accession numbers DQ217792, NC001608, and AY358025, respectively), were as follows (the oligonucleotide names reflect the approximate nucleotide position of each primer): 21F (GACACACAAAAACAAGAGATGATG) and 810R (GAAGTCCTGAGAATCTAGTTTG), 132F (GTACAAAACCCACTGCCCCTCATG) and 3489R (AACCGTAGTCCCCTTACTAC), and 5363F (AGAIGGIATGATGAAGAAAAG) and 7781R (CAGGTCCIAGCACITTGCATGTTCC), followed by nested amplification using 5420F (TGAGAAITTICCTTTGAATGGCTTC) and 7671R (TCCAAGGATTTIGCAGTTTG), 7567F (TGGCCCTGGAATIGAAGGACTC) and 10571R (AGCATATGAACAATAGATC), and 10401F (TGTTCCAGAGTGGCAACAAAC) and 14933R (CTCAGAIAAATTAATCCACTTTAC), followed by heminested amplification using 10401F and 14882R (TGTGGCACCAATTAGCCTTTTCCC) and 14763F (TGGACGATGATTTATCTGAGTC) and 19113R (GGACACACTAAAAAGATGAAG). The region between nucleotides 3489 and 5363 was amplified by using primers VP35F1 (GCTTACTTAAATGAGCATGG) and GP170R (TGCCCACTCAGTGTAAATCC).
The full-length genome sequences from clinical specimens from this outbreak (samples 1380, 1381, 1386, 1411, 0126, 0181, 0214, 0215, 0754, and 0998) and isolates 05DRC and 07DRC were similarly determined with primer modifications by amplifying six genome fragments, designated A to F, as follows: fragment A was amplified with 63F (TGACATTGAGACTTGTCAGTC) and 4997R (GCTTGATTTCCTTCACGC) followed by nested amplification using 375F (CAAGAATGCAGATGCAACC) and 4641R (GTTTGCACCGTGGTCAG); fragment B was amplified using primers 3781F (GTTCCTCCAGTGATAAGAGTC) and 8478R (GATTGAAGTAAGGCAAGTTGTTAA) followed by nested amplification using 4155F (CTAACAAAAGGGGTCTAACCTA) and 8304R (CTTAACTATGTCTTCAAGCCTA); fragment C was amplified using primers 7567F and 10571R followed by nested amplification (sample 1386 only) using 7621F (TTTITGITGIAGGTTGAGGCG) and 10521R (TTGATCCTTCAAAGCAACTCC); fragment D was amplified using primers 10000F (GTACCTCTAAGGAAAACCATGAAG) and 15460R (GTTGATATAATTGCACGTGTGG) followed by nested amplification using primers 10287F (GACCTCAATTCCACAGCA) and 15279R (GTCTGCAGTGCCTTGAGT); fragment E was amplified using 14763F and 19113R followed by heminested amplification using 14817F (CTAATTTCTTGAGGGCATATTC) and 19113R; and fragment F was amplified in a heminested reaction using 3F (CACACAAAAACAAGAGATGATG) and 925R (CTCAACGCCTGCTTGAAAG) followed by heminested amplification using 3F with 801R. The RT-PCR fragments for the Ravn and 09DRC virus isolates required considerable primer modifications. Nucleotides ~3 to 6000 were generated in four fragments using primers 3F with 721R (ACCTGTCACCAAATTTACTCC), 617F (CACTGGCTTACTACAGGCCA) with 1128R (TGTTCATGTCGCCTTTGTAG), 818F (ATTGCATCCACTTGTGCG) with 2117R (AGACAGGATTTTGGTTATTCC), and 1581F (CACTAAGGCAAACTCAGGAC) with GP170R. The remainder of the genome was generated using primers described above, namely, 7567F with 10571R, 10401F with 14882R, and 14763F with 19113R. 
If necessary, RT-PCR products of the correct predicted sizes were gel purified using either the Qiaquick gel extraction kit (QIAGEN, Valencia, CA) or the NucleoTrap gel extraction kit (BD Biosciences Clontech, Palo Alto, CA). The purified DNA was sequenced by primer walking of both strands, using ABI BigDye 3.1 dye chemistry and ABI 3730XL automated DNA sequencers (Applied Biosystems, Foster City, CA). Sequencing primers representing the most conserved regions were designed encompassing the entire genome fragment. Sequencing primers were synthesized by Integrated DNA Technologies (Coralville, IA). Chromatogram data were assembled using Seqmerge (Accelrys Inc, San Diego, CA), Phred/Phrap base calling and assembly software, and Consed for sequence editing (University of Washington). The terminal ~20 nucleotides were not experimentally determined but were based on the termini of the previously published Popp, Musoke, and Ozolin sequences.
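Because the oligonucleotide names reflect the approximate nucleotide position of each primer, the expected size of an RT-PCR product can be estimated from the primer names alone. The Python sketch below shows one way to do that. It assumes that the number in each name marks the outer end of the amplicon, which the text does not state explicitly, so the results are approximate. Primers without a numeric position, such as VP35F1 and GP170R, are deliberately rejected.

```python
import re


def primer_position(name: str) -> int:
    """Extract the approximate genome position from a name such as '7567F'."""
    match = re.fullmatch(r"(\d+)[FR]", name)
    if match is None:
        raise ValueError(f"no numeric position in primer name: {name!r}")
    return int(match.group(1))


def approx_amplicon_length(forward: str, reverse: str) -> int:
    """Approximate product length implied by a forward/reverse primer pair."""
    start = primer_position(forward)
    end = primer_position(reverse)
    if end <= start:
        raise ValueError("reverse primer must lie downstream of forward primer")
    return end - start + 1


if __name__ == "__main__":
    # Primer pairs taken from the text; the computed sizes are estimates only.
    for fwd, rev in [("21F", "810R"), ("7567F", "10571R"), ("14763F", "19113R")]:
        print(fwd, rev, approx_amplicon_length(fwd, rev))
```

An estimate of this kind is what a gel band would be compared against when deciding whether a product is of the correct predicted size.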
Genomic analyses and alignments of filovirus sequences were performed using the Wisconsin package of GCG version 10.3 (Accelrys Inc., San Diego, CA). Phylogenetic analysis was performed using Paup 4.0b10 (Sinauer Associates Inc.).
The GenBank Accession numbers for the newly determined full-length genome sequences are as follows: Ravn, DQ447649; 07DRC, DQ447650; 05DRC, DQ447651; 09DRC, DQ447652; Ang1379c, DQ447653; Ang1381, DQ447654; Ang1386, DQ447655; Ang0126, DQ447656; Ang0214, DQ447657; Ang0215, DQ447658; Ang0754, DQ447659; and Ang0998, DQ447660.
In early 2005, reports were received by the CDC and World Health Organization (WHO) of a large HF outbreak in Uige Province in northern Angola, West Africa (Fig. 1). On 15 March, specimens were collected and sent to CDC in Atlanta, Ga., for testing for evidence of infection with viruses associated with VHF which were known to be present in West Africa. To be thorough, the VHF-associated Marburg virus was also included despite the origin of all earlier outbreaks being restricted to East Africa. Testing included virus isolation, antigen capture ELISA, IgM and IgG ELISA, RT-PCR and Q-RT-PCR assays, and in one case IHC staining. Contrary to expectations, evidence of acute marburgvirus infection was found in 12 of the 15 initial specimens (Table 1), all of which were from fatal cases.
Clearly the most sensitive diagnostic test was the Q-RT-PCR assay, which was designed to detect a conserved sequence of VP40 present in all known species of marburgvirus, including the more distantly related Ravn strain. All positive Q-RT-PCR results were obtained in duplicate, with each set yielding amplification curves within 0.4 CT units. Virus isolation proved to be nearly as sensitive as the Q-RT-PCR assay, although limited clinical material in some cases made virus isolation attempts difficult. The antigen capture assay confirmed the RT-PCR results in four cases yet overall proved less sensitive. None of the initial samples were positive in either the marburgvirus-specific IgM or IgG assays performed in parallel. In those ELISAs, the positive and negative controls in each assay performed as expected, thus suggesting that the negative results with the human specimens were not the result of any systematic errors. In no instances did specimens test positive by antigen capture, IgM, IgG, or virus isolation and not test positive by RT-PCR (data not shown).
Three samples (samples 1411 to 1413 in Table 1) arrived the following week. One specimen was a skin biopsy from a patient in Uige, while the other two samples, 1411 and 1412 (a blood specimen and a serum specimen, respectively, drawn at the same time), were from a physician in Luanda with a recent contact history with the suspected VHF patients in Uige Province. The physician traveled directly to the capital city of Luanda only days before her death. Both samples tested positive by Q-RT-PCR and virus isolation, while the skin biopsy tested positive by IHC staining.
To establish the genetic relationship of the Angola marburgvirus relative to prior East African viruses, and to determine the extent of sequence variation circulating within the Angola VHF outbreak, complete marburgvirus genomes were sequenced from 11 clinical samples collected from fatal cases throughout a time period of almost 3 months and encompassing the known geographic distribution of the outbreak. There were four municipalities within Uige Province from which virus-positive samples were collected (Fig. 1). The municipalities were Uige (samples 1379 to 1386 and 0998), Songo (sample 0754), Bungo (samples 0181, 0214, and 0215), and Damba (sample 0126). In addition, we included the sample from the Angolan capital, Luanda (sample 1411) (Fig. 1). RNAs were extracted from multiple types of clinical specimens, including oral swabs, which collectively contained a wide range of viral loads. It should be noted that following the initial Marburg VHF diagnosis, field labs were established by CDC and the Public Health Agency of Canada on site in Luanda and Uige, respectively, at the request of the Ministry of Health of Angola and WHO. Samples 0126, 0181, 0214, 0215, 0754, and 0998 (Table 2) were obtained and tested within 24 h of specimen collection and then stored frozen until nucleotide sequence analysis could be initiated.
To provide a more complete representation of the overall potential for genetic diversity within a single marburgvirus outbreak and to provide a basis to which to compare the genetic diversity (or lack thereof) circulating in the Angola outbreak, we determined the complete genome sequences of four marburgvirus isolates obtained from the two most recent marburgvirus outbreaks prior to the one in Angola. Three of the isolates, 05DRC, 07DRC, and 09DRC, were from epidemiologically unlinked cases that occurred at different times during the 1998 outbreak in Durba, DRC, while the fourth isolate, Ravn (Rav), was obtained in 1987 from a 15-year-old Danish boy with a recent history of travel to Kitum Cave in Mount Elgon, Kenya, 9 days prior to disease onset.
All 16 newly determined complete marburgvirus sequences (19,114 nucleotides in length), in addition to those of Popp (Pop), Musoke (Mus), and Ozolin (Ozo), were analyzed to determine their phylogenetic relationship (Fig. 2A) and nucleotide distances (Fig. 2B). A maximum-likelihood analysis placed the Angola (Ang) isolates firmly (100% bootstrap support) within the clade containing the majority of the East African isolates. This result was surprising given the large geographic distance, ~1,000 miles, of Uige, Angola, from the locations of all other known sources of marburgvirus. The branching pattern of the phylogenetic tree shows five overall branches, with the Rav/09DRC branch being the most divergent, showing nucleotide differences greater than 21% relative to viruses in other lineages, including Ang. 05DRC, 07DRC, and Ozo comprise a second well-defined lineage, differing from viruses in all other lineages by greater than 7%. Finally, using a cutoff of ~5% nucleotide difference, Ang, Mus, and Pop form three additional lineages, with the Angola sequence diverging by 6.8 and 7.1% from Mus and Pop, respectively, and by greater than 7.4% from the Ozo, 05DRC, and 07DRC lineages. Mus and Pop differ from each other by 5.9%. For comparison, the difference between any of the marburgvirus genomes and those of either Zaire ebolavirus or Sudan ebolavirus is greater than 65%, while the Zaire and Sudan ebolaviruses differ from each other by greater than 40%.
Among the 11 clinical samples selected from the Angola outbreak, the genetic sequences were well conserved throughout the entire length (19,114 nucleotides) of the genome (Fig. 3). Ten of 11 genomes had five changes or fewer compared to the reference isolate Ang1379c. Remarkably, four of the sequences, obtained from clinical samples collected over a month and a half, showed no nucleotide differences for the entire 19,114-nucleotide genome. This confirmed the high fidelity of the RT-PCR-based sequence analysis performed and demonstrated that human-to-human passage of marburgvirus could occur in the absence of virus evolution. In addition, our reference sequence, Ang1379c, was also 100% identical to the sequence obtained from the corresponding virus isolate, Ang1379v, indicating the lack of selection occurring during culture of the virus from the clinical specimen. The most genetically diverse genome came from a specimen collected in the Songo municipality (specimen 0754), which had a total of 11 nucleotide changes (out of 19,114 bases) relative to the reference isolate (0.07% variation). Each of the unique changes was independently confirmed by generation of small (<1-kb) RT-PCR fragments followed by sequence analysis. In contrast, the analysis of three isolates from within the earlier Durba, DRC, outbreak showed at least 10 times greater sequence diversity, ranging from 0.8 to 21% nucleotide difference. The 05DRC and 07DRC isolates differed by 0.8% (>150 nucleotide differences) and differed by over 21% when each was compared to the 09DRC isolate. The 09DRC isolate is noteworthy because it represents the second member of the most distinct lineage within the marburgviruses, first defined by the Rav isolate (20).
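The percent nucleotide differences quoted in this section are pairwise comparisons of aligned genomes. As a rough illustration, the Python sketch below computes an uncorrected pairwise difference between two aligned sequences, skipping gapped columns. This is a generic calculation under stated assumptions, not the GCG or Paup procedure used in the study, and the short sequences are toy examples.

```python
def percent_difference(seq_a: str, seq_b: str) -> float:
    """Uncorrected percent difference between two aligned sequences.

    Columns where either sequence has a gap ('-') are ignored, which is
    one common convention and an assumption of this sketch.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    differences = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == "-" or b == "-":
            continue
        compared += 1
        if a != b:
            differences += 1
    if compared == 0:
        raise ValueError("no comparable positions")
    return 100.0 * differences / compared


GENOME_LENGTH = 19_114  # genome length reported in the text

if __name__ == "__main__":
    # Toy aligned sequences, for illustration only.
    print(percent_difference("ACGTACGTAC", "ACGTTCGTAC"))
    # A 0.8% difference over the full genome, in nucleotides.
    print(round(0.008 * GENOME_LENGTH))
```

Applied to the full genome length, a 0.8% difference corresponds to about 153 positions, in line with the ">150 nucleotide differences" reported for 05DRC versus 07DRC.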
The minimal genetic diversity observed in the Angola outbreak (maximum of 0.07% variation) relative to that seen in the Durba outbreak (0.8 to 21% variation) is consistent with the Angola marburgvirus outbreak being the result of a rare introduction of virus into the human population from the unknown reservoir followed by direct human-to-human transmission.
The determination of the Ang, Rav, 05DRC, 07DRC, and 09DRC full-length sequences almost tripled the number of marburgvirus genomes available for analysis. Therefore, using this more extensive database of full-length sequences, a comprehensive effort was undertaken to reexamine many genetic features throughout the genome and determine the degree to which these elements are maintained across all eight marburgvirus strains. An initial macroscopic perspective of genome similarity is shown in Fig. 4, in which full-length genomes were analyzed for similarity using a sliding window of 50 nucleotides. The similarity plot reveals a striking pattern of sequence conservation among the open reading frames flanked by regions of much greater variation in the noncoding sequences. However, the more variable noncoding regions are punctuated with spikes of high identity corresponding to the regions containing transcription start and stop sequences (arrows). Alignments of the individual cis-acting regulatory features demonstrate that 12 of 14 transcription start and stop sequences are 100% identical (Fig. 5). Of the two start/stop sequences that show variation, the differences are merely single-nucleotide transitions and are within the Ravn/09DRC lineages. In all genomic elements examined, the Angola sequence shows 100% identity with the consensus sequence. Well-conserved genomic features include the lengths of the 5′ and 3′ untranslated regions of all seven predicted mRNAs, some of which differ slightly from previous reports (8, 15), as well as the length and composition of the six intergenic region (IR) sequences. A few notable variations within the IRs are a requirement for a purine residue in the first position of the trinucleotide IR between VP24 and L and conserved differences within the Ravn/09DRC lineage at three of seven positions in the IR between NP and VP35.
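A sliding-window similarity scan of the kind described for the full-length genomes can be sketched in a few lines of Python. The scoring used here, the percentage of alignment columns in each window at which all sequences are identical, is an assumption made for illustration; the text does not specify the similarity measure used by the plotting software. The alignment below is a toy example with a small window.

```python
def column_identity(alignment: list[str]) -> list[float]:
    """Return 1.0 for columns where all sequences agree, else 0.0."""
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("all sequences must have the same aligned length")
    return [
        1.0 if len({seq[i] for seq in alignment}) == 1 else 0.0
        for i in range(length)
    ]


def sliding_similarity(alignment: list[str], window: int = 50, step: int = 1) -> list[float]:
    """Percent of fully conserved columns in each window along the alignment."""
    identity = column_identity(alignment)
    if window > len(identity):
        raise ValueError("window is longer than the alignment")
    return [
        100.0 * sum(identity[start:start + window]) / window
        for start in range(0, len(identity) - window + 1, step)
    ]


if __name__ == "__main__":
    # Toy alignment: conserved first half, variable second half.
    toy = ["AAAAAAAA", "AAAATTTT"]
    print(sliding_similarity(toy, window=4))
```

With real genome alignments the default window of 50 nucleotides would be used, and conserved elements such as transcription start and stop signals would appear as narrow peaks within otherwise variable noncoding stretches.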
An examination of the nucleotide and amino acid distances for each of the seven marburgvirus gene products is shown in Fig. 6A and B, respectively. At the nucleotide level the most conserved genes are, in order, VP40, NP, VP24, and VP35, showing 0.2 to 15.2% variation, closely followed by VP30, showing 0.3 to 17.4% variation. The gene with the greatest nucleotide differences is GP (0.7 to 22.5%), consistent with previous alignments using fewer marburgvirus strains (8, 20, 45). Surprisingly, the nucleotide difference in the polymerase (0.5 to 21.4%) is almost as much as that seen in the GP region despite having some stretches of very high conservation (Fig. 7G).
At the amino acid level, the percent differences among the marburgvirus strains for each of the seven gene products are quite different from those seen within the same genes at the nucleotide level. The degrees of variation within all the open reading frames, except GP, are decreased by 2- to 10-fold. The most striking example is the matrix protein VP40, in which the decrease is about 10-fold, demonstrating a distinct intolerance for amino acid changes (1.65% maximum variation). This intolerance is suggestive of tight physical constraints for VP40 in the virus assembly process. At the other end of the spectrum, GP showed no decrease at all in the level of amino acid variation (~23%) relative to that seen at the nucleotide level, suggesting a selective pressure for nonsynonymous changes, most likely exerted by the immune system of the natural reservoir host(s).
We next examined known protein domains and motifs in each of the seven open reading frames by comparative alignment of the amino acid sequences (Fig. 7A to G). The analysis of GP (Fig. 7D) shows that the area of greatest diversity is a continuous 300-amino-acid (aa) stretch from residue 201 to 501, a region previously divided into two smaller variable domains (45). Despite this diversity, a number of previously described features remain well conserved, many of which reside within this central variable domain. These features include 13 of 14 proposed N-glycosylation sites (N-x-T/S) and 12 of 12 cysteines (45). Elsewhere in GP, the transmembrane domain (aa 649 to 670) and proposed fusion domain (aa 526 to 540) show 100% identity among all eight strains of marburgvirus, as does the furin cleavage site R-X-K/R-R (aa 432 to 435) (54). Volchkov et al. (52) proposed the presence of an immunosuppressive domain (ISD) in GP2 of ebolavirus based on analogy to retroviruses. This 26-amino-acid motif is also present in marburgvirus (55) and shows a single Thr-to-Ala substitution at position 12 of the Angola sequence. Lending importance to the function of the ISD motif, recent studies have shown that 17-residue monomers of filovirus ISDs are capable of suppressing T-cell activation and Th1-related cytokine production in activated human and nonhuman primate peripheral blood mononuclear cells (59). All other positions of the ISD show complete conservation. Another posttranslational modification ascribed to GP is the potential for phosphorylation at serine residues in two independent motifs between amino acids 260 and 273 (46). The GP alignments demonstrate that only one phosphorylation site, encompassing amino acids 268 to 273, is conserved among all eight marburgviruses. The other site, a diserine motif at amino acids 260 to 261, is not present in five of eight strains, thus discounting this motif as a general feature of all marburgviruses.
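Short linear motifs such as the N-glycosylation sequon (N-x-T/S) and the furin cleavage site (R-X-K/R-R) can be located in a protein sequence with simple pattern matching. The Python sketch below uses regular expressions written directly from the motif definitions given in the text. The peptide is a made-up example rather than a marburgvirus GP sequence, and the patterns do not apply the common refinement of excluding proline at the variable sequon position.

```python
import re

# Lookahead patterns so that overlapping motif occurrences are all reported.
SEQUON = re.compile(r"(?=(N.[ST]))")    # N-x-T/S as written in the text
FURIN = re.compile(r"(?=(R.[KR]R))")    # R-X-K/R-R as written in the text


def find_motif(pattern: re.Pattern, protein: str) -> list[tuple[int, str]]:
    """Return (1-based position, matched residues) for every occurrence."""
    return [(m.start() + 1, m.group(1)) for m in pattern.finditer(protein.upper())]


if __name__ == "__main__":
    peptide = "MNQTRAKRNGSA"  # hypothetical peptide, for illustration only
    print(find_motif(SEQUON, peptide))
    print(find_motif(FURIN, peptide))
```

Running such a scan on each aligned strain and comparing the reported positions is one simple way to count how many proposed sites are conserved across strains.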
Within the alignment of marburgvirus L amino acid sequences (Fig. 7G), the regions showing the greatest variation are clustered into five domains, i.e., amino acids 114 to 135, 262 to 348, 1143 to 1206, 1623 to 1645, and 1677 to 1866. The last domain, encompassing residues 1677 to 1866, may contain a hinge region within the polymerase based on analogy to morbilliviruses, which have been shown to tolerate the insertion of the green fluorescent protein within the proposed hinge (13, 31). The areas of greatest conservation within the marburgvirus L sequences are in three large blocks, amino acids 349 to 1142, 1207 to 1622, and 1867 to 2322. The first of these three areas, amino acids 349 to 1142, contains the box A, B, and C sequences common to paramyxo- and rhabdovirus L proteins (3, 36). Boxes A and B share 100% identity among the aligned marburgvirus sequences, while box C has a single K-to-R substitution in the Rav and 09DRC lineages. Two other motifs purported to be present in all polymerases of negative-sense RNA viruses are the diresidue D-D and QGDNQ motifs (3, 21). These were previously identified in the Mus marburgvirus L sequence (36) and are thought to be essential components of the polymerase catalytic core. In this alignment, these potential catalytic core motifs are all 100% conserved, with the exception of one diresidue D-D motif at amino acids 91 to 93, which shows degeneracy at two positions in the Ravn and 09DRC lineages. In addition, there is a putative ATP and/or purine ribonucleoside triphosphate binding domain found in all known L proteins of single-strand negative-sense viruses (amino acids 1931 to 1956) (36), referred to as motif C in an alignment of filovirus L sequences (53), that shows 100% identity at the core consensus glycine residues and identity at 23 of 26 positions overall within the motif.
Three other potential ATP binding motifs, at residues 1325 to 1360, 1390 to 1420, and 1560 to 1593 (36), show complete identity among all taxa analyzed. Finally, 46 of 52 consensus cysteine residues are completely conserved, including the dicysteine motif at positions 1376 to 1377, as previously noted (36). Such cysteines are found in most L proteins at similar locations and are believed to anchor protein secondary structure to maintain the necessary conformation of the putative active sites.
Outside of the polymerase and glycoprotein, a few notable genomic elements reside in the NP, VP35, and VP30 genes. An alignment of NP amino acid sequences (Fig. 7A) shows that nearly all the variability is in the C-terminal half of the protein, similar to that seen in a recent comparison of Zaire and Sudan ebolavirus full-length genome sequences (44). Within this same half of the protein are seven unique phosphorylation domains (28), each containing one or more serine/threonine kinase substrate motifs. Yet, despite the overall variation, the majority of individual kinase recognition motifs are conserved to the point that, regardless of the marburgvirus strain examined, at least one motif is present within each of the seven domains. The maintenance of these motifs highlights the potential role for phosphorylation in this region of NP, an area postulated to effect protein-protein interactions.
The VP35 protein of marburgvirus plays an essential role in transcription and replication of viral RNA. A predicted coiled-coil domain spanning residues 70 to 120 may effect VP35 oligomerization, an interaction which in turn may be necessary for VP35 to bridge NP with L to form the active polymerase complex (35). At the heart of the presumed coiled-coil domain are heptad repeats containing hydrophobic residues at the first and fourth positions. This domain shows variability at nine positions (Fig. 7B). Yet despite this variation, the spacing of the hydrophobicity is strictly maintained among all the aligned sequences. The Angola sequence shows no variation whatsoever from the consensus sequence throughout the 50-amino-acid domain. In addition to its role in RNA replication, VP35 of ebolavirus has recently been shown to contain an 11-amino-acid motif (Zaire ebolavirus residues 304 to 314) that is thought to be essential for type I interferon (IFN) antagonism (19). This motif, which is possibly involved in RNA binding, is also present in marburgvirus (residues 293 to 303) and has identity at 9 of 11 positions with that of ebolavirus, including the three basic amino acids experimentally demonstrated to be important. Further highlighting the potential importance of this VP35 domain, the alignment (Fig. 7B) reveals that this domain is a general feature of all known marburgviruses.
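The observation that hydrophobic spacing is maintained in the heptad repeats despite sequence variation can be expressed as a simple check on the first (a) and fourth (d) positions of each heptad. The Python sketch below is illustrative only. The hydrophobic residue set, the assumed heptad register, the function names, and the three toy peptides are assumptions and are not taken from the VP35 alignment.

```python
# Assumed hydrophobic set; published definitions of hydrophobicity vary.
HYDROPHOBIC = set("AVLIMFWC")


def heptad_core_positions(segment, register_start=0):
    """Return (index, residue) pairs at heptad positions a and d.

    `register_start` is the index of the first position a; the register is
    an input assumption here, not something this sketch predicts.
    """
    core = []
    for a_index in range(register_start, len(segment), 7):
        core.append((a_index, segment[a_index]))          # position a
        d_index = a_index + 3
        if d_index < len(segment):
            core.append((d_index, segment[d_index]))      # position d
    return core


def spacing_maintained(segment, register_start=0):
    """True when every a and d position holds a hydrophobic residue."""
    return all(
        residue in HYDROPHOBIC
        for _index, residue in heptad_core_positions(segment, register_start)
    )


# Hypothetical toy peptides, three heptads each.
reference = "LKEIEKKVQELEKRIAELKKE"
variant = "LREIDKKVQDLEKKIAELRKE"    # differs only outside positions a and d
disrupted = "LKEIEKKVQEKEKRIAELKKE"  # lysine placed at a d position

for name, peptide in [("reference", reference), ("variant", variant), ("disrupted", disrupted)]:
    print(name, spacing_maintained(peptide))
```

The variant peptide differs from the reference at five positions yet still passes the check, which parallels the situation described for the coiled-coil domain: substitutions are tolerated as long as the a and d positions remain hydrophobic.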
VP30 of Zaire ebolavirus has been shown to contain an unconventional Cys3-His zinc binding domain whose integrity is required for VP30 function in virus transcription (33). The alignment in Fig. 7E shows the high conservation of all four zinc-coordinating residues among all eight marburgvirus taxa. Adjacent to the zinc binding domain is a well-conserved tetraleucine motif which could facilitate VP30 oligomerization, similar to that observed with ebolavirus VP30 (18).
Finally, VP24, whose alignment is shown in Fig. 7F, is highly conserved throughout the protein. Recent studies of marburgvirus VP24 have implicated it in nucleocapsid assembly and in interactions between nucleocapsids and budding sites at the plasma membrane (2). In addition, ebolavirus VP24 has been shown to bind karyopherin α1 and to block STAT1 nuclear accumulation, thus implicating VP24 as a virulence determinant that allows ebolavirus to evade antiviral effects of IFNs (41). Consistent with the idea that VP24 may be a virulence determinant, both mouse- and guinea pig-adapted ebolaviruses, each of which is capable of 100% lethality in its respective animal model, have amino acid changes that map to VP24 (7, 51).
In this report we describe the laboratory confirmation of the first known marburgvirus outbreak in West Africa, along with an extensive comparative genomic analysis of marburgviruses. The viruses associated with previous Marburg HF outbreaks had all originated in East Africa, and up to 21% virus genetic diversity had been found based on analysis of partial genome sequences. Consequently, the discovery of Marburg HF in Angola was a surprise, as was the finding that the Angolan marburgvirus was not more distinct, differing from the main group of East African marburgviruses by only ~7%. Presumably the absence of a greater genetic difference reflects virus natural host reservoir similarities between the East and West African locations. However, due to the resource-poor nature of the affected area, an index case was never identified, and thus the importation of the virus from East Africa, while unlikely, cannot be ruled out. By the same token, the large virus genetic difference seen between the Ravn/09DRC lineage and the other marburgviruses may represent reservoir host differences, such that viruses within this lineage may be associated with a different host species or subspecies than the other marburgviruses. Interestingly, earlier ecological niche modeling approaches had predicted that parts of northern and eastern Angola were within the range for potential filovirus reservoirs (39, 40).
The natural reservoir for marburgvirus remains unknown, but marburgvirus emergence in Angola will likely extend the scope of the reservoir search beyond East Africa. As recently reported (5), the marburgvirus outbreak in Durba, DRC, was closely associated with illegal mining activity. Epidemiological data linked over 70% of the cases with mines or caves, suggesting that the natural reservoir could well be associated with such environments. With the Angola outbreak, difficulties in surveillance and contact tracing, combined with the delay in the identification of the outbreak, led to poor epidemiological linkage of marburgvirus cases and ultimately to a lack of success in identifying a point source or mounting any ecological study. Filovirus outbreaks in general are relatively rare events. A recent report has suggested that bat species with rather wide geographic distributions may be a potential reservoir for ebolavirus (24). If the natural reservoir of marburgvirus is similar to that of ebolavirus, the emergence of marburgvirus in western Africa should not be surprising, as the sites of multiple large ebolavirus outbreaks are less than 500 miles away, including areas which have experienced almost yearly activity over the last decade (25, 26).
The initial diagnosis was based on 9 of 12 clinical specimens testing positive by virus isolation, antigen, and/or PCR-based methods, including a newly developed Q-RT-PCR assay designed to detect the VP40 genome region of all known strains of marburgvirus. This Q-RT-PCR assay outperformed all other assays designed to detect acute marburgvirus infection, including the “gold standard” virus isolation assay. The assay was also sufficiently robust to allow deployment into a field setting in Angola. The extensive virus genomic analysis presented here also confirms that the VP40 gene target was an excellent choice for a broadly reactive marburgvirus detection assay, as it is the most conserved virus gene. These features suggest this should be a highly useful assay for detection of potential naturally occurring or deliberate Marburg HF outbreaks in the future.
Following the initial diagnosis, we determined the full-length genome sequence directly from clinical material from 11 patients representing the temporal and geographic distribution of the outbreak. Our purpose was to fully characterize the level of sequence variation generated during the course of a large marburgvirus outbreak with a high case fatality rate and to use this information to determine whether multiple marburgvirus introductions into the human population had occurred. To generate a context with which to compare the Angola marburgvirus sequence data, we determined the complete genome sequence of the Ravn marburgvirus, first isolated in 1987 in Kenya, as well as three additional, uncharacterized marburgviruses from Durba, DRC, the location of the largest outbreak on record prior to the one in Angola. Our data increased the number of complete genome reference sequences from three to eight. Therefore, we used this expanded database to review the known genetic elements throughout the genome in an effort to assess their importance based upon the degree to which the features are maintained across the eight marburgvirus strains. One of the most striking features is the high degree to which nearly all the genetic elements previously identified were maintained. To some extent this was expected, but we predicted that perhaps the Ravn and 09DRC lineages, differing by greater than 21% from all other marburgviruses at the nucleotide level, would show more versatility as to how the viral proteins could potentially carry out their functions. Diversity within marburgviruses is still, even with this more complete data set, considerably less than that seen with the ebolaviruses, to the point that the Ravn/09DRC lineage would not be considered a different species. In fact, the three most conserved genes, VP40, VP24, and L, among the major species of ebolavirus (Zaire, Sudan, and Reston) show 20 to 25% protein diversity (44). In contrast, those same genes within the marburgvirus lineages shown here have, respectively, 1.6, 4.3, and 12.3% maximum diversity.
Given the general error-prone nature of RNA virus polymerases, the fact that there were comparatively few changes observed among the 11 Angolan marburgvirus isolates was somewhat surprising. In fact, four of the complete genome sequences (19,114 nucleotides in length) from specimens collected over a month and a half and representing at least two to three human-to-human transmission cycles were 100% identical. In the absence of precise knowledge of transmission events from the natural reservoir, possibly bats (24, 47), this analysis attempts to answer the question of what level of sequence divergence constitutes a distinct lineage versus the degree of sequence variation to be expected in a large outbreak of single origin. To answer this question, we felt it important to compare the diversity seen in the Angola outbreak with that in the Durba, DRC, outbreak. Among the sequences from the Angola outbreak, the representative from the Songo municipality (specimen 0754) had 11 nucleotide differences from the reference isolate (0.07% variation). However, this maximum level of variation is about 1/10 of the minimum diversity seen between the two closest lineages within the Durba outbreak, 05DRC and 07DRC (0.8% variation). Based on this comparison, we do not consider the Songo lineage to be indicative of a second introduction, although such a possibility cannot be ruled out. For the Angola marburgvirus outbreak, it would be interesting to measure the rate of accumulation of nucleotide changes over time, but unfortunately, the epidemiological links between the 11 patients could not be established and there is no Angola marburgvirus isolate from the presumed beginning of the outbreak in October 2004.
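The comparison above rests on pairwise percent nucleotide differences between aligned genomes. A minimal Python sketch of such a calculation is given below. It is not the analysis pipeline used for this study: the function name, the decision to skip gap columns, and the short toy sequences are assumptions for illustration. The only values taken from the text are the two percentages used in the final ratio.

```python
def percent_difference(seq_a, seq_b, gap="-"):
    """Percent of compared positions that differ between two aligned sequences.

    Columns in which either sequence has a gap are skipped (an assumed
    convention; other treatments of gaps are possible).
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have the same length")

    compared = 0
    differing = 0
    for base_a, base_b in zip(seq_a.upper(), seq_b.upper()):
        if base_a == gap or base_b == gap:
            continue
        compared += 1
        if base_a != base_b:
            differing += 1

    if compared == 0:
        raise ValueError("no ungapped positions to compare")
    return 100.0 * differing / compared


# Hypothetical toy sequences, not marburgvirus data.
toy_reference = "ATGGCATTAC"
toy_variant = "ATGACATTAC"   # one substitution
toy_gapped = "ATG-CATTAC"    # one gap column, no substitutions

print(percent_difference(toy_reference, toy_variant))
print(percent_difference(toy_reference, toy_gapped))

# Ratio of the two percentages quoted in the text (Angola maximum versus
# Durba minimum between closest lineages).
ratio = 0.07 / 0.8
print(f"Angola maximum is about {ratio:.2f} of the Durba minimum")
```

The ratio of the two quoted values is a little under 0.09, in line with the statement that the maximum variation within the Angola outbreak is roughly one-tenth of the minimum seen between the closest Durba lineages.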
Genome plasticity and rapid evolution in response to positive selection are general features of RNA viruses which possess error-prone polymerases and lack proofreading mechanisms (12). Such features have been well documented in recent outbreaks of emergent RNA viruses, such as severe acute respiratory syndrome coronavirus and human immunodeficiency virus (HIV), in which genetic differences have been shown to accumulate during human-to-human transmission events and, in some instances, within the same tissues of a single host (27, 43). Severe acute respiratory syndrome coronavirus has recently been calculated to accumulate approximately two mutations per genome per human transmission event (50). Why, then, was so little variation found within this marburgvirus outbreak? The answer likely involves at least two related components, namely, (i) the time of progression within the host from initial infection to disease outcome and (ii) the host immune response to marburgvirus infection. With HIV, highly diverse quasispecies have been shown to develop from homogeneous populations in response to the development of host neutralizing antibodies and cytotoxic T-lymphocyte responses (6). Yet, in the later stages of HIV infection towards the development of AIDS, the immune system is no longer able to exert strong selective pressure on the replicating quasispecies and, as a result, HIV evolution dramatically slows (11). For marburgvirus, like Zaire ebolavirus, disease progression is so rapid that most individuals die before an effective immune response can be mounted (1, 22, 48). For ebolavirus the median time to death after onset of symptoms is only 8 to 9 days (22, 48). In addition to the rapid course of infection, ebolavirus actively suppresses the immune system by (i) directly antagonizing the host interferon system through an uncharacterized mechanism involving the virus VP35 protein (4) and (ii) preferentially infecting dendritic cells and macrophages as sites of primary infection, thus diminishing host cell populations that are critical for the establishment of innate and adaptive immune responses (16, 17). These features together allow for explosive virus growth within the host in a manner virtually unchecked by the immune system. The lack of IgM and IgG responses shown in Table 1 is consistent with this view. The lack of genetic diversity within the Angola marburgvirus outbreak is therefore not surprising and is in fact consistent with the absence of genetic differences seen among genome fragments of viruses analyzed in previous Zaire and Sudan ebolavirus outbreaks (42, 48).
A trial sampling procedure that emerged during the outbreak was the use of oral swabs for sampling the corpses of suspected Marburg VHF cases. RNA extracted from oral swabs sometimes gave low CT values in the Q-RT-PCR assay (Table 2 and data not shown), indicating the potential for high viral loads in these secretions. Unfortunately, definitive interpretations of CT values from oral swabs are difficult, since the swabbing technique and the volumes recovered are inherently variable. In general, very high viral loads are seen in sera of individuals infected with filoviruses. For instance, during human infections with Sudan ebolavirus, viral loads in patient serum can vary from 10^5 RNA copies/ml at the time of fever onset to greater than 10^10 RNA copies/ml at the time of death (48). However, the timing and extent to which this range of viral load is reflected in human oral/nasal secretions are currently unknown. Of particular concern is the use of this sampling technique as a means of establishing a particular etiology of a VHF outbreak when multiple agents must be considered and tested, many of which may not ever produce viral loads that can be detected by oral swabs. Blood-based sampling as a reliable source of virus for serology-, antigen-, and PCR-based detection assays is well documented and should remain the method of choice for sampling suspected VHF patients.
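To give a sense of scale for the relationship between CT values and viral load, the Python sketch below converts between a difference in CT and a fold difference in template. It assumes idealized amplification in which template doubles every cycle (100% efficiency). That assumption, and the function names, are illustrative only; they do not describe how the Q-RT-PCR assay in this study was calibrated, and real quantification requires a standard curve for the specific assay.

```python
import math


def fold_change_from_delta_ct(delta_ct, efficiency=1.0):
    """Fold difference in template implied by a CT difference.

    `efficiency` is the assumed per-cycle amplification efficiency
    (1.0 means perfect doubling); it is not a measured value.
    """
    return (1.0 + efficiency) ** delta_ct


def delta_ct_for_fold_change(fold_change, efficiency=1.0):
    """CT difference expected for a given fold difference in template."""
    return math.log(fold_change) / math.log(1.0 + efficiency)


# Serum viral load range quoted in the text for Sudan ebolavirus:
# 10^5 to 10^10 RNA copies/ml, i.e., a 10^5-fold span.
ct_span = delta_ct_for_fold_change(1e10 / 1e5)
print(f"A 10^5-fold range corresponds to about {ct_span:.1f} cycles")
print(fold_change_from_delta_ct(3))
```

Under this idealized assumption, the 10^5-fold range in serum viral load corresponds to a CT span of roughly 17 cycles. This illustrates why variability in swab technique and recovered volume makes a single oral swab CT value difficult to interpret quantitatively.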
The management of the outbreak was greatly assisted by the Ministry of Health, Republic of Angola; the World Health Organization; and the International Response Team. We thank the Instituto Nacional de Saúde Publica and Amilcar Tanuri of the CDC Global AIDS Program for their excellent technical and logistical assistance with the CDC field lab and their devotion to the resolution of this public health crisis. The efforts of Daniel Kertesz of the World Health Organization were critical for sending the initial sets of diagnostic specimens to CDC, Atlanta, Ga. We also thank the U.S. Embassy to Angola and the U.S. Office of Foreign Disaster Assistance for their essential financial and logistical support and the Chevron Corporation for the loan of a crucial electrical generator. The CDC field lab was well supported by the technical assistance of Jennifer B. Oliver, Deborah L. Cannon, Kimberly A. Slaughter, and Thomas L. Stevens, while the field lab of the Public Health Agency of Canada was similarly supported by Lisa Fernando, Allen Grolla, Steven M. Jones, Jim E. Strong, and Heinz Feldmann. We also acknowledge Mike Frace and Brian Halloway for making available multiple components of the CDC Biotechnology Core Facility, which immensely facilitated this study.
The findings and conclusions in this report are those of the authors and do not necessarily represent the views of the funding agencies.