As the only known HCV recombinant in widespread circulation, CRF01_1b2k presents an interesting question in HCV epidemiology and evolution. Investigating its evolutionary origins and transmission history helps us to understand the circumstances that led to its unique properties. In contrast to the situation for HIV, which has 49 known CRFs and a much greater number of unique recombinant forms (30), recombination typically contributes little to the generation and maintenance of HCV genetic diversity. Given that HCV has a higher global prevalence than HIV and that, all else being equal, dual infections with divergent HCV strains are therefore likely, it is unlikely that epidemiological factors are restricting the opportunities for HCV to generate CRFs. Mixed infections with divergent HCV strains have been reported for many different populations and are noted to be prevalent among high-risk groups, particularly IDUs and some hemophiliacs (3, 5, 16, 50, 53).
Since the opportunities for HCV recombination are not limited, it is more likely that fundamental molecular and evolutionary differences between HIV and HCV explain why HIV has many CRFs and HCV has few. These could include differences in the rate of template switching or differences in genomic or immunological constraints, such that HCV recombinants have, on average, lower fitness than HIV recombinants and therefore are rarely transmitted (72). Although both viruses are associated with chronic infections, HCV, unlike HIV, can be spontaneously cleared by the host. This may explain in part the difference in the number of recombinants between HIV and HCV, as partial protective immunity against the latter reduces the chance of in vivo recombination of HCV strains (1, 43, 69). However, the high rate of mixed infections observed suggests that this is likely to play at best a minor role in HCV recombination. The low frequency of HCV recombinants is more likely to reflect mechanistic constraints on viral replication. There is evidence that template switching in HCV is especially rare and that the replication complex is typically encoded on the same genomic strand that it will replicate and transcribe (2). It is also interesting that when replication complexes are exchanged between different genotypes, the replication efficiency is substantially reduced (15). The pseudodiploidy of the HIV genome certainly increases the likelihood of recombination occurring due to the ability of the virus to package two RNA templates (17), while the secondary RNA structure in the HCV genome may limit the production of viable hybrid HCVs (59, 68).
Our study of previously reported and newly obtained HCV isolates provides the first estimates of the date of the recombination event that generated CRF01_1b2k. We estimated the time of origin of CRF01_1b2k to be between 1923 and 1956, which is not much later than the origin and global spread of the parental subtype 1b (33). This date is considerably earlier than we expected; we had anticipated that the CRF's creation might be linked to the dramatic increase in IDU behavior following the breakup of the former Soviet Union. This result is robust to the manner of isolate sampling: if, because of nonrandom sampling, our isolates are more closely related to each other than they would be under random sampling, then the TMRCA of CRF01_1b2k would be biased toward more recent dates, so any such bias would only make the true origin earlier still. Furthermore, despite its small size, our data set provides a relatively short time window during which the recombinant must have arisen. The involvement of subtype 1b in the recombinant is not surprising, as it is one of the most prevalent subtypes worldwide. However, to fully appreciate the origin of CRF01_1b2k, we need to consider the evolutionary history of both parental subtypes and of the recombinant lineage itself. Genotype 2 harbors considerable genetic diversity, especially in West Africa, which is where the genotype is thought to have originated (34). Although the small number of subtype 2k isolates sampled to date likely underestimates the true extent of subtype 2k distribution, such viruses have been isolated from Martinique and Madagascar, implicating a role for the historical trans-Atlantic slave trade in the dissemination of the virus from West Africa (34).
The current distribution of subtype 2k is associated with francophone regions and former Soviet Union countries. In contrast, CRF01_1b2k is more spatially limited, with all isolates being directly or indirectly linked to the former Soviet Union. As the nonrecombinant subtype 2k isolates that are most closely related to CRF01_1b2k are from Moldova and Azerbaijan (see Fig. S1 in the supplemental material), it seems most likely that CRF01_1b2k was generated in the former Soviet Union. An equivalent analysis of subtype 1b viruses provides no reliable phylogeographic linkage, due to the low phylogenetic resolution of the NS5B data set.
Our estimated date of CRF origin coincides with an interesting period of history in the former Soviet Union, which was an early leader in transfusion technology. Under the leadership of Alexander Bogdanov in the 1920s, a nationwide network of blood transfusion centers and research institutes was established throughout the Soviet republics, together with the Central Institute of Hematology, founded in Moscow, Russia, in 1926 (61). This expanded into a network of ~1,500 blood donating centers across the republics (18). The Soviets also adopted blood storage and preservation techniques at an early stage. They established more than 60 primary and 500 subsidiary blood storage centers by the mid-1930s, which shipped blood across the entire Soviet Union (61). During the Second World War, these networks were swiftly readapted to support the front line; in Moscow alone, about 2,000 blood donations were given per day (18, 61). The impressive scale of the blood service in the former Soviet Union is likely to have favored HCV transmission by increasing the efficiency and geographic range of the virus's dissemination. Whether specific medical practices at this time increased the probability of mixed viral infections remains unknown. It is interesting to note that Bogdanov himself was fascinated by the ideological interpretation of blood sharing and frequently practiced what he called “physiological collectivism”: the exchange of blood with others through mutual transfusions (61).
Although unscreened blood transfusions provide a credible hypothesis for the origin of CRF01_1b2k in the Soviet Union at some time between 1923 and 1956, we must also attempt to explain how subtype 2k or the CRF itself arrived in the Soviet Union from West Africa or the Caribbean. Migration from Africa to the former Soviet Union did occur during the late 1950s and 1970s as a result of alliances forged by the Soviet government with newly independent African states such as Ghana and Angola (35). However, according to our dating estimates, these connections are too late to have contributed to the emergence of CRF01_1b2k. Although we cannot reject the hypothesis that the CRF was formed in West Africa and subsequently moved to the Soviet Union, our results are more consistent with the recombination event occurring in the latter. This uncertainty is likely to be reduced with further samples, especially subtype 2k viruses, from African and former Soviet Union locations.
The epidemic history of CRF01_1b2k since its emergence is similar to that estimated for other epidemic subtypes of HCV (e.g., see reference 33). The growth in the CRF01_1b2k effective population size coincides with a substantial increase in blood transfusion, including during the Second World War, and with the subsequent rise in intravenous drug usage. CRF01_1b2k transmission seems to have slowed or stabilized since the early 1990s, coinciding with the onset of the anti-HCV screening of donors. In the absence of any data to the contrary, the transmission of this recombinant and its spread from the former Soviet Union reflect the peculiar epidemiological properties of the risk groups with which it has been associated rather than any intrinsic properties of the virus.
We demonstrate the practicality and benefits of using a hierarchical phylogenetic model to jointly estimate parameters of interest when analyzing multipartite sequence data that result from genetic exchange. This method yields more accurate parameter estimates than previous methods (e.g., see references 27 and 67) by incorporating the phylogenetic information and uncertainty in different genomic regions. We recommend that this improved statistical framework be used in future investigations of recombination in fast-evolving RNA viruses.
This study has taken significant steps toward understanding the epidemic history and spread of the unique circulating HCV recombinant 2k/1b. Most significantly, we show that this strain originated many decades before the post-Soviet rise in injection behavior with which it is currently associated. On the basis of the date of its origin and its molecular epidemiology, there are reasonable grounds to suppose that the Soviet Union's revolutionary blood service was instrumental in the CRF's early generation and continental-scale spread. This infrastructure may also have facilitated the pan-Eurasian spread of other parenterally transmitted blood-borne infections, which is an interesting question for future research.