Thirty complete or nearly full-length HIV-1 URF_01AE/B′ genome sequences sampled in Southeast Asia were obtained from the Los Alamos HIV Sequence Database. Phylogenetic and recombination analyses revealed that three of these sequences shared an identical recombinant structure. Of note, the three subjects harboring the novel CRF01_AE/B recombinant had no apparent epidemiological linkage. The sequences therefore fulfilled the criteria for designation as a new circulating recombinant form (CRF) and constitute the 52nd CRF identified in the worldwide HIV-1 pandemic. In this chimera, two short subtype B segments are inserted into a CRF01_AE backbone. The breakpoints correspond approximately to HXB2 nucleotide positions 2930, 3251, 8521, and 9004. This is the first CRF identified by curating and reanalyzing sequences already deposited in the Los Alamos HIV Sequence Database, which indicates that attention should be paid not only to sequences with an explicit subtype assignment but also to those so far classified as unique recombinant forms (URFs).
Mutation, coupled with recombination, has resulted in the extensive genetic diversity of HIV-1. This diversity allows HIV-1 to be classified into groups, subtypes, sub-subtypes, and circulating recombinant forms (CRFs).1 Group M HIV-1 dominates the AIDS pandemic, with at least nine subtypes (A to D, F to H, J, and K) and multiple intersubtype recombinants. Currently, 51 CRFs are listed on the Los Alamos HIV database web site (http://hiv-web.lanl.gov/CRFs/CRFs.html). In addition, multiple unique recombinant forms (URFs) have been reported in many regions around the world.
Southeast Asia is thought to be an epicenter of the HIV-1 epidemic in Asia. Throughout its early phase, the AIDS epidemic in this region was driven by two different strains in two separate risk groups: subtype B among injecting drug users (IDUs) and CRF01_AE among those heterosexually exposed.2,3 CRF01_AE then crossed over from the heterosexual epidemic to IDUs, and by 1995 it had also become the predominant strain in IDUs.4 The cocirculation of the two strains led to the generation of more complex recombinant strains, which have emerged in almost every country in Southeast Asia.5–10 Furthermore, some of these recombinant strains have spread widely in populations and have become CRFs. To date, five CRFs derived from the CRF01_AE and subtype B′ lineages have been reported in Southeast Asia: CRF15_01B (ref. 11) and CRF34_01B (ref. 12) from Thailand, CRF33_01B (ref. 13) and CRF48_01B (ref. 14) from Malaysia, and CRF51_01B (ref. 15) from Singapore. The HIV-1 epidemic in this region has thus become increasingly heterogeneous.
The extensive use of complete and nearly full-length genome sequencing of HIV-1 strains has provided a powerful and accurate approach to the molecular epidemiology of regional epidemics. Consequently, the number of complete or nearly full-length HIV-1 genome sequences has grown steadily. Most of these can be classified unambiguously as subtypes or CRFs, while a minority remain classified as URFs. To define a CRF, at least three epidemiologically unlinked HIV-1 sequences with identical mosaic structures must be characterized, at least two of them as nearly full-length genomes (8 kb).1 In the present study, we identified a novel CRF01_AE/B′ candidate that emerged in Southeast Asia, using sequences obtained from the Los Alamos HIV Sequence Database (www.hiv.lanl.gov).
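The CRF definition stated above can be written as a simple check. The following is a minimal sketch only: the `Genome` record, the function name, and the 8000-nucleotide cutoff (taken from the "8 kb" figure in the text) are illustrative assumptions, the input is assumed to contain only epidemiologically unlinked sequences, and "identical mosaic structure" is simplified to exact equality of breakpoint positions, whereas real breakpoints are approximate.

```python
from collections import defaultdict
from typing import NamedTuple, Tuple


class Genome(NamedTuple):
    """Hypothetical record for one epidemiologically unlinked sequence."""
    name: str
    length_nt: int
    breakpoints: Tuple[int, ...]  # HXB2 breakpoint positions (assumed exact)


def meets_crf_criteria(genomes, near_full_length_nt=8000,
                       min_genomes=3, min_near_full=2):
    """Return True if some mosaic structure is shared by at least
    `min_genomes` sequences, at least `min_near_full` of which are
    nearly full-length."""
    groups = defaultdict(list)
    for g in genomes:
        # Group sequences that share the same breakpoint pattern.
        groups[g.breakpoints].append(g)
    for members in groups.values():
        near_full = sum(1 for g in members
                        if g.length_nt >= near_full_length_nt)
        if len(members) >= min_genomes and near_full >= min_near_full:
            return True
    return False
```

In practice the comparison of mosaic structures would need a tolerance around each breakpoint and confirmation by subregion phylogenetic analysis rather than exact matching.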
Thirty complete or nearly full-length HIV-1 URF_01AE/B′ genome sequences sampled in Southeast Asia were obtained from the Los Alamos HIV Sequence Database. These sequences had been generated from plasma or peripheral blood mononuclear cell (PBMC) samples collected in Thailand (n=19), Malaysia (n=8), Myanmar (n=2), and China (n=1). In addition to these 30 sequences, we retrieved subtype reference sequences from the same database. All sequences were combined into one alignment using SynchAlign software (http://www.hiv.lanl.gov/content/sequence/SYNCH_ALIGNS/SynchAligns.html) and edited using BioEdit software. A phylogenetic tree was constructed by the neighbor-joining method based on Kimura's two-parameter distance matrix, with 1000 bootstrap replicates, using MEGA 5.04. For recombination analysis, RIP software (http://www.hiv.lanl.gov/content/sequence/RIP/RIP.html) was applied first; jpHMM software (http://jphmm.gobics.de/) was then used to scrutinize and define the recombination breakpoints. The origin of each segment was analyzed by subregion neighbor-joining tree analysis. Detailed information about the three subjects harboring CRF52_01B recombinants was collected from the studies in which they were reported.13,16,17 Furthermore, we retrieved the records for the three subjects from the Los Alamos HIV Sequence Database to supplement elements such as patient name and risk factor.
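For readers unfamiliar with the distance measure named above, the following is a minimal sketch of Kimura's two-parameter distance for one pair of aligned sequences. It is not the MEGA implementation: the function name is hypothetical, and skipping any site with a gap or ambiguous base (pairwise deletion) is an assumption about gap handling that the text does not specify.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def k2p_distance(seq1, seq2):
    """Kimura two-parameter distance between two aligned DNA sequences."""
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")
    transitions = transversions = compared = 0
    for a, b in zip(seq1.upper(), seq2.upper()):
        # Assumption: skip sites with gaps or ambiguity codes.
        if a not in "ACGT" or b not in "ACGT":
            continue
        compared += 1
        if a == b:
            continue
        # A <-> G and C <-> T are transitions; all others are transversions.
        if {a, b} <= PURINES or {a, b} <= PYRIMIDINES:
            transitions += 1
        else:
            transversions += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    p = transitions / compared    # proportion of transitions
    q = transversions / compared  # proportion of transversions
    # d = -1/2 * ln((1 - 2P - Q) * sqrt(1 - 2Q)); math.log raises
    # ValueError if the sequences are too divergent for the model.
    return -0.5 * math.log((1 - 2 * p - q) * math.sqrt(1 - 2 * q))
```

A neighbor-joining tree is then built from the matrix of such pairwise distances, and bootstrap support is obtained by resampling alignment columns and repeating the procedure.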
Phylogenetic analysis showed that three nearly full-length genome sequences from three patients formed a monophyletic branch supported by a high bootstrap value of 99% and located outside all HIV-1 subtypes and known CRFs in Southeast Asia (Fig. 1). Both RIP and jpHMM revealed that the three nearly full-length URF sequences shared an identical recombinant structure composed of CRF01_AE and subtype B. In this chimera, two short subtype B segments are inserted into a CRF01_AE backbone. The breakpoints correspond approximately to HXB2 nucleotide positions 2930, 3251, 8521, and 9004 (Fig. 2).
To confirm the subtype structure of CRF52_01B and to estimate the likely parental lineages, we divided the HIV-1 genomes into five segments according to the breakpoints. Subregion phylogenetic analysis confirmed the four breakpoints identified using jpHMM. Genomic segments I, III, and V clustered with the CRF01_AE reference sequences, while segments II and IV branched with the subtype B reference sequences (Fig. 3). As shown in Fig. 3, the CRF01_AE segments (I, III, and V) fell within the cluster of Thai CRF01_AE rather than African CRF01_AE, indicating that all three CRF01_AE segments originated from CRF01_AE of Thai origin. Similarly, analysis of subtype B segment IV revealed that it originated from Thai B rather than Western B. However, we could not determine the origin of the remaining subtype B segment II because it was too short to yield high bootstrap values.
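The division into five segments can be sketched as follows, assuming each query sequence is aligned to an HXB2 reference row in which gaps are written as "-". The function names are hypothetical, and placing the breakpoint position itself at the start of the following segment is an arbitrary convention, since the breakpoints are approximate.

```python
# Approximate HXB2 breakpoint positions reported for CRF52_01B.
BREAKPOINTS_HXB2 = [2930, 3251, 8521, 9004]
# Parental assignment of segments I to V, as described in the text.
SEGMENT_PARENTS = ["CRF01_AE", "B", "CRF01_AE", "B", "CRF01_AE"]


def hxb2_to_column(hxb2_aligned, position):
    """Return the 0-based alignment column of a 1-based HXB2 position."""
    count = 0
    for col, base in enumerate(hxb2_aligned):
        if base != "-":
            count += 1
            if count == position:
                return col
    raise ValueError(f"HXB2 position {position} is beyond the alignment")


def split_by_breakpoints(hxb2_aligned, query_aligned, breakpoints):
    """Cut an aligned query sequence at the given HXB2 breakpoints."""
    if len(hxb2_aligned) != len(query_aligned):
        raise ValueError("reference and query must have equal aligned length")
    cols = [hxb2_to_column(hxb2_aligned, bp) for bp in sorted(breakpoints)]
    edges = [0] + cols + [len(query_aligned)]
    # n breakpoints yield n + 1 segments.
    return [query_aligned[start:end] for start, end in zip(edges, edges[1:])]
```

Each resulting segment alignment can then be used for a separate neighbor-joining tree together with the corresponding region of the reference sequences.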
The mosaic structure of the three sequences was distinct from those of the other five CRFs derived from CRF01_AE and subtype B′ that have been reported in Southeast Asia. Of note, the three nearly full-length genome sequences (01B.MY.2003.03MYKL018_1, 01B.TH.2000.00TH_R1741, and 01B.TH.1996.M043) were reported in three different studies13,16,17 and came from three different patients (Table 1) with no apparent epidemiological linkage. These data establish that the three sequences share the same recombinant structure over the whole genome, fulfilling the criteria for designation of a new CRF. They constitute the 52nd CRF identified in the worldwide HIV-1 pandemic, herein called CRF52_01B.
Detailed information about the three subjects harboring CRF52_01B recombinants is shown in Table 1. Most interestingly, and unlike for other CRFs identified worldwide, the three subjects came from two different countries within Southeast Asia, two from Thailand and one from Malaysia. The subjects were reported with the year of their first positive HIV-1 test, except for subject M043, for whom it was not available. From Table 1, it is apparent that subject M043 was infected in or before 1996. The wide geographic separation and early infection dates of the three subjects suggest that CRF52_01B arose at least 15 years ago. To obtain accurate estimates of the time of origin and the direction of transmission of the novel CRF in Southeast Asia, more sequences from different locations and sampling times are needed for Bayesian coalescent analysis using BEAST software.18,19
All three subjects reported sexual exposure, two heterosexual and one bisexual. It has been reported that most women infected in Southeast Asia have been the monogamous wives or regular partners of higher risk men.20 Of interest, subject R1741 was one of the participants recruited from public family planning clinics in Thailand, and her only reported risk factor for HIV-1 infection was heterosexual exposure.16 She is therefore likely to have acquired HIV-1 infection from her male sexual partner. This suggests that the novel CRF spread into low-risk populations early.
In this study, we identified a novel CRF (CRF52_01B) composed of CRF01_AE and subtype B in Southeast Asia by analyzing 30 complete or nearly full-length genome sequences obtained from the Los Alamos HIV Sequence Database. CRF52_01B is the sixth CRF derived from the CRF01_AE and subtype B lineages to be identified within Southeast Asia in the past several years. Most interestingly, it is the first CRF identified by curating and reanalyzing sequences already deposited in the Los Alamos HIV Sequence Database. This emphasizes that attention should be paid not only to sequences with an explicit subtype assignment but also to those so far classified as URFs. The prevalence and public health significance of the novel CRF remain to be elucidated.
This work was supported by the National Key S&T Special Projects on Major Infectious Diseases (Grants 2008ZX10001-004 and 2008ZX10001-012).
No competing financial interests exist.