Few cohorts of individuals who have survived infection with HIV-1 for more than 20 years have been reported and followed in the literature, and even fewer come from Africa. Here we present data on a cohort of subtype C-infected individuals from rural northern Malawi. By sequencing multiple clones from long-term survivors at different time points, and using multiple genotyping approaches, we show that 5 of the 11 individuals are predicted as CXCR4-using (by ≥3/5 predictors) but only one individual is predicted as CXCR4-using by all five algorithms. Using any one genotyping approach overestimates the number of predicted CXCR4 sequences. Patterns of diversity and divergence varied among the HIV-1 long-term survivors, with some individuals showing very little variation and change and others showing more; both patterns are consistent with what has been described in the literature.
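The "≥3/5 predictors" rule above amounts to a majority vote across tools. The following is a minimal sketch of such a vote, not the study's pipeline: the predictor names (`tool_a` to `tool_e`), the "X4"/"R5" labels, and the calls themselves are hypothetical placeholders.

```python
def consensus_call(predictions, min_votes=3):
    """Return 'X4' when at least min_votes predictors call CXCR4 use, else 'R5'."""
    x4_votes = sum(1 for call in predictions.values() if call == "X4")
    return "X4" if x4_votes >= min_votes else "R5"


# Hypothetical calls from five placeholder predictors for one individual.
calls = {
    "tool_a": "X4",
    "tool_b": "X4",
    "tool_c": "R5",
    "tool_d": "X4",
    "tool_e": "R5",
}

majority = consensus_call(calls)                        # 3 of 5 votes
unanimous = consensus_call(calls, min_votes=len(calls))  # requires 5 of 5
print(majority, unanimous)
```

Raising `min_votes` to the number of predictors reproduces the stricter "all five algorithms" criterion, which is why far fewer individuals qualify under it.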
The challenge presented by high-throughput sequencing necessitates the development of novel tools for accurate alignment of reads to reference sequences. Current approaches focus on using heuristics to map reads quickly to large genomes, rather than generating highly accurate alignments in coding regions. Such approaches are, thus, unsuited for applications such as amplicon-based analysis and the realignment phase of exome sequencing and RNA-seq, where accurate and biologically relevant alignment of coding regions is critical. To facilitate such analyses, we have developed a novel tool, RAMICS, that is tailored to mapping large numbers of sequence reads to short lengths (<10 000 bp) of coding DNA. RAMICS utilizes profile hidden Markov models to discover the open reading frame of each sequence and aligns to the reference sequence in a biologically relevant manner, distinguishing between genuine codon-sized indels and frameshift mutations. This approach facilitates the generation of highly accurate alignments, accounting for the error biases of the sequencing machine used to generate reads, particularly at homopolymer regions. Performance improvements are gained through the use of graphics processing units, which increase the speed of mapping through parallelization. RAMICS substantially outperforms all other mapping approaches tested in terms of alignment quality while maintaining highly competitive speed performance.
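The distinction between codon-sized indels and frameshifts can be illustrated with a simple length check on alignment gaps. This is only a sketch of the distinction itself, not of RAMICS or its profile hidden Markov models; the function names and the example alignment are hypothetical.

```python
from itertools import groupby


def gap_runs(aligned, gap_char="-"):
    """Return the length of each run of consecutive gap characters."""
    return [
        len(list(run))
        for is_gap, run in groupby(aligned, key=lambda c: c == gap_char)
        if is_gap
    ]


def classify_gap(length):
    """A gap whose length is a multiple of three preserves the reading frame."""
    if length <= 0:
        raise ValueError("gap length must be positive")
    return "codon-sized indel" if length % 3 == 0 else "frameshift"


# Hypothetical aligned read: one 3-base gap and one 1-base gap.
aligned_read = "ATG---GCC-TTAA"
labels = [classify_gap(n) for n in gap_runs(aligned_read)]
print(labels)
```

A single-base gap in a homopolymer region is more plausibly a sequencing error than a genuine mutation, which is the kind of case a frame-aware aligner needs to treat differently from a three-base deletion.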
Many high-throughput sequencing (HTS) approaches, such as the Roche/454 platform, produce sequences in which the quality of the sequence (as measured by Phred-like quality scores) decreases linearly across a sequence read. Quality trimming of these data is essential to enable confidence in the results of subsequent downstream analysis. Here, we have developed a novel, highly sensitive and accurate approach (QTrim) for the quality trimming of sequence reads generated using the Roche/454 sequencing platform (or any platform with long reads that outputs Phred-like quality scores).
The performance of QTrim was evaluated against all other available quality trimming approaches on both poor and high quality 454 sequence data. In all cases, QTrim appears to perform as well as the best alternative approach (PRINSEQ), with these two methods significantly outperforming all other methods. Further analysis of the trimmed data revealed that the novel trimming approach implemented in QTrim ensures that the prevalence of low quality bases in the resulting trimmed data is substantially lower than with PRINSEQ or any of the other approaches tested.
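To make the idea of Phred-based trimming concrete, here is a generic 3' window-trimming sketch. It is not the QTrim algorithm, whose details are not given here; the function names, the window size, the threshold of 20, and the example read are all assumptions chosen for illustration. A Phred score Q corresponds to an error probability of 10^(-Q/10).

```python
def phred_to_error_prob(q):
    """Convert a Phred quality score to a base-call error probability."""
    return 10 ** (-q / 10)


def trim_3prime(seq, quals, min_mean_q=20, window=3):
    """Drop bases from the 3' end until the last `window` bases
    have a mean quality of at least `min_mean_q`."""
    end = len(seq)
    while end > 0:
        tail = quals[max(0, end - window):end]
        if sum(tail) / len(tail) >= min_mean_q:
            break
        end -= 1
    return seq[:end], quals[:end]


# Hypothetical read whose quality decays towards the 3' end.
read = "ACGTACGTAC"
quals = [38, 37, 36, 35, 33, 30, 25, 12, 8, 5]
trimmed_seq, trimmed_quals = trim_3prime(read, quals)
print(trimmed_seq, trimmed_quals)
```

A window mean tolerates an isolated low-quality base near the cut point, so stricter schemes typically add a per-base floor as well.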
QTrim is a novel, highly sensitive and accurate algorithm for the quality trimming of Roche/454 sequence reads. It is implemented both as an executable program that can be integrated with standalone sequence analysis pipelines and as a web-based application to enable individuals with little or no bioinformatics experience to quality trim their sequence data.
Quality trimming; Next-generation sequencing; High-throughput sequencing; Phred scores
N-linked glycans attached to specific amino acids of the gp120 envelope trimer of an HIV virion can modulate the binding affinity of gp120 to CD4, influence coreceptor tropism, and play an important role in neutralising antibody responses. Because of the challenges associated with crystallising fully glycosylated proteins, most structural investigations have focused on describing the features of a non-glycosylated HIV-1 gp120 protein. Here, we use a computational approach to determine the influence of N-linked glycans on the dynamics of the HIV-1 gp120 protein and, in particular, the V3 loop. We compare the conformational dynamics of a non-glycosylated gp120 structure to that of two glycosylated gp120 structures, one with a single, and a second with five, covalently linked high-mannose glycans. Our findings provide a clear illustration of the significant effect that N-linked glycosylation has on the temporal and spatial properties of the underlying protein structure. We find that glycans surrounding the V3 loop modulate its dynamics, conferring on the loop a marked propensity towards a narrower conformation relative to its non-glycosylated counterpart. The conformational effect on the V3 loop provides further support for the suggestion that N-linked glycosylation plays a role in determining HIV-1 coreceptor tropism.
South Africa’s national antiretroviral (ARV) treatment program expanded in 2010 to include the nucleoside reverse transcriptase (RT) inhibitors (NRTI) tenofovir (TDF) for adults and abacavir (ABC) for children. We investigated the associated changes in genotypic drug resistance patterns in patients with first-line ARV treatment failure since the introduction of these drugs, and protease inhibitor (PI) resistance patterns in patients who received ritonavir-boosted lopinavir (LPV/r)-containing therapy.
We analysed ARV treatment histories and HIV-1 RT and protease mutations in plasma samples submitted to the Tygerberg Academic Hospital National Health Service Laboratory.
Between 2006 and 2012, 1,667 plasma samples from 1,416 ARV-treated patients, including 588 children and infants, were submitted for genotypic resistance testing. Compared with 720 recipients of a d4T- or AZT-containing first-line regimen, the 153 recipients of a TDF-containing first-line regimen were more likely to have the RT mutations K65R (46% vs. 4.0%; p<0.001), Y115F (10% vs. 0.6%; p<0.001), L74VI (8.5% vs. 1.8%; p<0.001), and K70EGQ (7.8% vs. 0.4%), and recipients of an ABC-containing first-line regimen were more likely to have K65R (17% vs. 4.0%; p<0.001), Y115F (30% vs. 0.6%; p<0.001), and L74VI (56% vs. 1.8%; p<0.001). Among the 490 LPV/r recipients, 55 (11%) had ≥1 LPV-resistance mutation, including 45 (9.6%) with intermediate or high-level LPV resistance. Low (20 patients) and intermediate (3 patients) darunavir (DRV) cross-resistance was present in 23 (4.6%) patients.
Among patients experiencing virological failure on a first-line regimen containing two NRTI plus one NNRTI, the use of TDF in adults and ABC in children was associated with an increase in four major non-thymidine analogue mutations. In a minority of patients, LPV/r-use was associated with intermediate or high-level LPV resistance with predominantly low-level DRV cross-resistance.
The role of HIV-1 RNA in the emergence of resistance to antiretroviral therapies (ARTs) is well documented while less is known about the role of historical viruses stored in the proviral DNA. The primary focus of this work was to characterize the genetic diversity and evolution of HIV drug resistant variants in an individual’s provirus during antiretroviral therapy using next generation sequencing.
Blood samples were collected prior to antiretroviral therapy exposure and during the course of treatment from five patients in whom drug resistance mutations had previously been identified using consensus sequencing. The spectrum of viral variants present in the provirus at each sampling time-point was characterized using 454 pyrosequencing from multiple combined PCR products. The prevalence of viral variants containing drug resistance mutations (DRMs) was characterized at each time-point.
Low abundance drug resistant viruses were identified in 14 of 15 sampling time-points from the five patients. In all individuals DRMs against current therapy were identified at one or more of the sampling time-points. In two of the five individuals studied these DRMs were present prior to treatment exposure and were present at high prevalence within the amplified and sequenced viral population. DRMs to drugs other than those being currently used were identified in four of the five individuals.
The presence of DRMs in the provirus, regardless of their observed prevalence, did not appear to affect clinical outcomes in the short term, suggesting that the drug resistant viral variants present in the proviral DNA do not play a short-term role in facilitating the emergence of drug resistance.
HIV-1; Drug resistance; Subtype C; Malawi; Ultradeep sequencing; Proviral DNA
The “glycan shield” exposed on the surface of the HIV-1 gp120 env glycoprotein has been previously proposed as a novel target for anti-HIV treatments. While such targeting of these glycans provides an exciting prospect for HIV treatment, little is known about the conservation and variability of glycosylation patterns within and between the various HIV-1 group M subtypes and circulating recombinant forms. Here, we present evidence of strong strain-specific glycosylation patterns and show that the epitope for the 2G12 neutralising antibody is poorly conserved across HIV-1 group M. The unique glycosylation patterns within the HIV-1 group M subtypes and CRFs appear to explain their varying susceptibility to neutralisation by broadly cross-neutralising (BCN) antibodies. Compensatory glycosylation at linearly distant yet three-dimensionally proximal amino acid positions appears to maintain the integrity of the glycan shield while conveying resistance to neutralisation by BCN antibodies. We find that highly conserved clusters of glycosylated residues do exist on the gp120 trimer surface and suggest that these positions may provide an exciting target for the development of BCN anticarbohydrate therapies.
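Comparing glycosylation patterns across sequences usually starts from potential N-linked glycosylation sites, conventionally defined by the sequon N-X-S/T, where X is any residue except proline. The sketch below assumes that convention; it is not the study's method, the function name is hypothetical, and the peptide is an invented fragment, so the reported positions are fragment-relative rather than HXB2 numbering.

```python
import re

# N followed by any residue except P, then S or T.
# The lookahead lets overlapping sequons (e.g. "NNT") be found.
SEQUON = re.compile(r"N(?=[^P][ST])")


def pngs_positions(protein):
    """Return 1-based positions of asparagines in N-X-S/T sequons (X != P)."""
    return [match.start() + 1 for match in SEQUON.finditer(protein.upper())]


# Invented fragment: "NNT" and "NIS" qualify, "NPS" does not (proline at X).
fragment = "CTRPNNNTRKSNPSGNIS"
sites = pngs_positions(fragment)
print(sites)
```

Running this over aligned sequences from each subtype and tabulating the positions would give the kind of per-site conservation profile the paragraph above describes.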
Standard genotypic antiretroviral resistance testing, performed by bulk sequencing, does not readily detect variants that comprise <20% of the circulating HIV-1 RNA population. Nevertheless, it is valuable in selecting an antiretroviral regimen after antiretroviral failure. In patients with poor adherence, resistant variants may not reach this threshold. Therefore, deep sequencing would be potentially valuable for detecting minority resistant variants. We compared bulk sequencing and deep sequencing to detect HIV-1 drug resistance at the time of a second-line protease inhibitor (PI)-based antiretroviral regimen failure. Eligibility criteria were virologic failure (HIV-1 RNA load of >500 copies/ml) of a first-line nonnucleoside reverse transcriptase inhibitor-based regimen, with at least the M184V mutation (lamivudine resistance), and second-line failure of a lopinavir/ritonavir (LPV/r)-based regimen. An amplicon-sequencing approach on the Roche 454 system was used. Six patients with viral loads of >90,000 copies/ml and one patient with a viral load of 520 copies/ml were included. Mutations not detectable by bulk sequencing during first- and second-line failure were detected by deep sequencing during second-line failure. Low-frequency variants (>0.5% of the sequence population) harboring major protease inhibitor resistance mutations were found in 5 of 7 patients despite poor adherence to the LPV/r-based regimen. In patients with intermittent adherence to a boosted PI regimen, deep sequencing may detect minority PI-resistant variants, which likely represent early events in resistance selection. In patients with poor or intermittent adherence, there may be low evolutionary impetus for such variants to reach fixation, explaining the low prevalence of PI resistance.
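The two thresholds mentioned above (roughly 20% for bulk sequencing and 0.5% for the deep-sequencing analysis) can be expressed as a simple frequency filter. This is a minimal sketch under assumptions: the function names are hypothetical, the read counts are invented, and real pipelines would also need error correction before applying such cut-offs.

```python
from collections import Counter


def variant_frequencies(residues):
    """Frequency of each amino acid observed at one position across reads."""
    counts = Counter(residues)
    total = sum(counts.values())
    return {aa: n / total for aa, n in counts.items()}


def minority_variants(residues, resistance_residues, low=0.005, high=0.20):
    """Resistance-associated residues above `low` but below the level
    that bulk sequencing would be expected to detect (`high`)."""
    freqs = variant_frequencies(residues)
    return {
        aa: f
        for aa, f in freqs.items()
        if aa in resistance_residues and low < f < high
    }


# Invented counts at one protease position across 1000 reads.
reads_at_site = ["V"] * 985 + ["A"] * 12 + ["F"] * 3
found = minority_variants(reads_at_site, resistance_residues={"A", "F"})
print(found)
```

Here the residue seen in 1.2% of reads is reported, while the one at 0.3% falls below the lower threshold and is ignored.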
In human immunodeficiency virus type 1 (HIV-1) infection, transmitted viruses generally use the CCR5 chemokine receptor as a coreceptor for host cell entry. In more than 50% of subtype B infections, a switch in coreceptor tropism from CCR5- to CXCR4-use occurs during disease progression. Phenotypic or genotypic approaches can be used to test for the presence of CXCR4-using viral variants in an individual’s viral population that would result in resistance to treatment with CCR5-antagonists. While genotyping approaches for coreceptor-tropism prediction in subtype B are well established and verified, they are less so for subtype C.
Here, using a dataset comprising V3 loop sequences from 349 CCR5-using and 56 CXCR4-using HIV-1 subtype C viruses we perform a comparative analysis of the predictive ability of 11 genotypic algorithms in their prediction of coreceptor tropism in subtype C. We calculate the sensitivity and specificity of each of the approaches as well as determining their overall accuracy. By separating the CXCR4-using viruses into CXCR4-exclusive (25 sequences) and dual-tropic (31 sequences) we evaluate the effect of the possible conflicting signal from dual-tropic viruses on the ability of each of the approaches to correctly predict coreceptor phenotype.
We determined that geno2pheno with a false positive rate of 5% is the best approach for predicting CXCR4-usage in subtype C sequences with an accuracy of 94% (89% sensitivity and 99% specificity). Contrary to what has been reported for subtype B, the optimal approaches for prediction of CXCR4-usage in sequences from viruses that use CXCR4 exclusively also perform best at predicting CXCR4-use in dual-tropic viral variants.
The accuracy of genotyping approaches at correctly predicting the coreceptor usage of V3 sequences from subtype C viruses is very high. We suggest that genotyping approaches can be used to test for coreceptor tropism in HIV-1 group M subtype C with a high degree of confidence that they will identify CXCR4-usage in both CXCR4-exclusive and dual-tropic variants.
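The evaluation metrics named above can be computed from a confusion matrix in which CXCR4-use is the positive class. The sketch below is illustrative only: the counts are invented (they merely sum to the 56 CXCR4-using and 349 CCR5-using sequences in the dataset), the function name is hypothetical, and both a pooled and a balanced accuracy are shown because the definition used in the study is not stated here.

```python
def tropism_metrics(tp, fn, tn, fp):
    """Metrics with CXCR4-use as the positive class.

    tp/fn: CXCR4-using sequences called X4 / called R5
    tn/fp: CCR5-using sequences called R5 / called X4
    """
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        # Fraction of all sequences classified correctly.
        "overall_accuracy": (tp + tn) / (tp + fn + tn + fp),
        # Mean of sensitivity and specificity; robust to class imbalance.
        "balanced_accuracy": (sensitivity + specificity) / 2,
    }


# Invented counts for one hypothetical predictor.
metrics = tropism_metrics(tp=50, fn=6, tn=340, fp=9)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

With far more CCR5-using than CXCR4-using sequences, pooled accuracy is dominated by specificity, so the two accuracy figures can differ noticeably.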
Human immunodeficiency virus; Coreceptor; Chemokine receptors; CXCR4; CCR5; Genotype; Phenotype; Subtype C
Here we present new sequence data from HIV-1 subtype C-infected long-term survivors (LTS) from Karonga District, Malawi. Gag and env sequence data were produced from nine individuals each of whom has been HIV-1 positive for more than 20 years. We show that the three amino acid deletion in gag p17 previously described from these LTS is not real and was a result of an alignment error. We find that the use of dried blood spots for DNA-based studies is limited after storage for 20 years. We also show some unlikely amino acid changes in env C2-V3 in LTS over time and different patterns of genetic divergence among LTS. Although no clear association between mutations and survival could be shown, amino acid changes that are present in more than one LTS may, in the future, be shown to be important.
Drug resistance testing before initiation of, or during, antiretroviral therapy (ART) is not routinely performed in resource-limited settings. High levels of viral resistance circulating within the population will have an impact on treatment programs by increasing the chances of transmission of resistant strains and treatment failure. Here, we investigate Drug Resistance Mutations (DRMs) from blood samples obtained at regular intervals from patients on ART (Baseline-22 months) in Karonga District, Malawi. One hundred and forty-nine reverse transcriptase (RT) consensus sequences were obtained via nested PCR and automated sequencing from blood samples collected at three-month intervals from 75 HIV-1 subtype C infected individuals in the ART programme.
Fifteen individuals showed DRMs, and in ten individuals DRMs were seen from baseline samples (reported to be ART naïve). Three individuals in whom no DRMs were observed at baseline showed the emergence of DRMs during ART exposure. Four individuals who did show DRMs at baseline showed additional DRMs at subsequent time points, while two individuals showed evidence of DRMs at baseline and either no DRMs, or different DRMs, at later timepoints. Three individuals had immune failure but none appeared to be failing clinically.
Despite the presence of DRMs to drugs included in the current regimen in some individuals, and immune failure in three, no signs of clinical failure were seen during this study. This cohort will continue to be monitored as part of the Karonga Prevention Study so that the long-term impact of these mutations can be assessed. Documenting the proviral population is also important in monitoring the emergence of drug resistance: as selective pressure provided by ART compromises the current plasma population, archived viruses can re-emerge.
HIV-1; drug resistance; subtype C; ART; Malawi; Reverse transcriptase
The systematics of the poriferan Order Haplosclerida (Class Demospongiae) has been under scrutiny for a number of years without resolution. Molecular data suggest that the order needs revision at all taxonomic levels. Here, we provide a comprehensive view of the phylogenetic relationships of the marine Haplosclerida using many species from across the order, and three gene regions. Gene trees generated using 28S rRNA, nad1 and cox1 gene data, under maximum likelihood and Bayesian approaches, are highly congruent and suggest the presence of four clades. Clade A is comprised primarily of species of Haliclona and Callyspongia, and Clade B is comprised of H. simulans and H. vansoesti (Family Chalinidae), Amphimedon queenslandica (Family Niphatidae) and Tabulocalyx (Family Phloeodictyidae). Clade C is comprised primarily of members of the Families Petrosiidae and Niphatidae, while Clade D is comprised of Aka species. The polyphyletic nature of the suborders, families and genera described in other studies is also found here.
In this preliminary study we show that in 2008, 3 years after antiretroviral therapy was introduced into the Karonga District, Malawi, a greater than expected number of drug-naive individuals have been infected with HIV-1 subtype C virus harboring major and minor drug resistance mutations (DRMs). From a sample size of 40 reverse transcriptase (RT) consensus sequences from drug-naive individuals we found five showing NRTI and four showing NNRTI mutations with one individual showing both. From 29 protease consensus sequences, again from drug-naive individuals, we found evidence of minor DRMs in three. Additional major and minor DRMs were found in clonal sequences from a number of individuals that were not present in the original consensus sequences. This clearly illustrates the importance of sequencing multiple HIV-1 variants from individuals to fully assess drug resistance.
The extent to which prokaryotic evolution has been influenced by horizontal gene transfer (HGT) and therefore might be more of a network than a tree is unclear. Here we use supertree methods to ask whether a definitive prokaryotic phylogenetic tree exists and whether it can be confidently inferred using orthologous genes. We analysed an 11-taxon dataset spanning the deepest divisions of prokaryotic relationships, a 10-taxon dataset spanning the relatively recent gamma-proteobacteria and a 61-taxon dataset spanning both, using species for which complete genomes are available. Congruence among gene trees spanning deep relationships is not better than random. By contrast, a strong, almost perfect phylogenetic signal exists in gamma-proteobacterial genes. Deep-level prokaryotic relationships are difficult to infer because of signal erosion, systematic bias, hidden paralogy and/or HGT. Our results do not preclude levels of HGT that would be inconsistent with the notion of a prokaryotic phylogeny. This approach will help decide the extent to which we can say that there is a prokaryotic phylogeny and where in the phylogeny a cohesive genomic signal exists.
Recent studies have demonstrated the emergence of human immunodeficiency virus type 1 (HIV-1) subtypes with various levels of fitness. Using heterogeneous maximum-likelihood models of adaptive evolution implemented in the PAML software package, with env sequences representing each HIV-1 group M subtype, we examined the various intersubtype selective pressures operating across the env gene. We found heterogeneity of evolutionary mechanisms between the different subtypes with a category of amino acid sites observed that had undergone positive selection for subtypes C, F1, and G, while these sites had undergone purifying selection in all other subtypes. Also, amino acid sites within subtypes A and K that had undergone purifying selection were observed, while these sites had undergone positive selection in all other subtypes. The presence of such sites indicates heterogeneity of selective pressures within HIV-1 group M subtype evolution that may account for the various levels of fitness of the subtypes.
Human immunodeficiency virus type 1 (HIV-1) subtype C is responsible for more than 55% of HIV-1 infections worldwide. When this subtype first emerged is unknown. We have analyzed all available gag (p17 and p24) and env (C2-V3) subtype C sequences with known sampling dates, which ranged from 1983 to 2000. The majority of these sequences come from the Karonga District in Malawi and include some of the earliest known subtype C sequences. Sequence divergence estimates (obtained with four different approaches) were regressed linearly against sample year to estimate the year in which there was zero divergence from the reconstructed ancestral sequence. Here we suggest that the most recent common ancestor of subtype C appeared in the mid- to late 1960s. Sensitivity analyses, by which possible biases due to oversampling from one district were explored, gave very similar estimates.
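The dating logic described above is a root-to-tip style regression: fit divergence against sampling year and solve for the year at which the fitted line reaches zero. The sketch below shows only that arithmetic with ordinary least squares. The function names are hypothetical and the data points are invented and exactly linear, so the resulting year is built in by construction and is not an estimate from the study's sequences.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit; returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def zero_divergence_year(years, divergences):
    """Year at which the fitted line crosses zero divergence (x-intercept)."""
    slope, intercept = fit_line(years, divergences)
    if slope <= 0:
        raise ValueError("divergence must increase with time to date an origin")
    return -intercept / slope


# Invented data: divergence grows by 0.002 per year from a 1965 origin.
years = [1985, 1990, 1995, 2000]
divergences = [0.04, 0.05, 0.06, 0.07]
origin = zero_divergence_year(years, divergences)
print(round(origin, 1))
```

The fitted slope doubles as an estimate of the substitution rate per site per year, and real data would call for confidence intervals on the intercept rather than a single point estimate.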