1.  Recursive Partitioning for Monotone Missing at Random Longitudinal Markers 
Statistics in medicine  2012;32(6):978-994.
The development of HIV resistance mutations reduces the efficacy of specific antiretroviral drugs used to treat HIV infection, and cross-resistance within classes of drugs is common. Recursive partitioning has been extensively used to identify resistance mutations associated with a reduced virologic response measured at a single time point; here we describe a statistical method that accommodates a large set of genetic or other covariates and a longitudinal response. This recursive partitioning approach for continuous longitudinal data uses the kernel of a U-statistic as the splitting criterion, and avoids the need for parametric assumptions regarding the relationship between observed response trajectories and covariates. We propose an extension of this approach that allows longitudinal measurements to be monotone missing at random by making use of inverse probability weights. We assess the performance of our method using extensive simulation studies, and apply it to data collected by the Forum for Collaborative HIV Research as part of an investigation of the viral genetic mutations associated with reduced clinical efficacy of the drug abacavir.
PMCID: PMC3754451  PMID: 22941582
Inverse Probability Weighting; Recursive Partitioning; U-statistics
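The inverse probability weighting idea mentioned in entry 1 can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not give their weight model, so the function name `monotone_ipw` and the pre-estimated per-visit probabilities are assumptions made for illustration. Under monotone dropout, a standard construction weights each observed visit by the inverse of the cumulative probability of still being under observation at that visit.

```python
from itertools import accumulate
from operator import mul


def monotone_ipw(stay_probs, last_visit):
    """Inverse probability weights for one subject under monotone dropout.

    stay_probs[t] is an ASSUMED, pre-estimated probability that the subject is
    still observed at visit t given they were observed at visit t-1
    (stay_probs[0] is normally 1.0 for the baseline visit).
    last_visit is the index of the last visit actually observed.
    """
    if any(not 0.0 < p <= 1.0 for p in stay_probs):
        raise ValueError("each probability must lie in (0, 1]")
    # Probability of remaining under observation through each visit.
    cumulative = accumulate(stay_probs, mul)
    # Observed visits get weight 1 / P(still observed); missing visits get 0.
    return [1.0 / p if t <= last_visit else 0.0
            for t, p in enumerate(cumulative)]


# Hypothetical subject: three planned visits, dropped out after the second.
print(monotone_ipw([1.0, 0.5, 0.5], last_visit=1))  # [1.0, 2.0, 0.0]
```

In practice the per-visit probabilities would themselves be estimated, for example from a regression on the observed history; that step is omitted here.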
2.  N348I in the Connection Domain of HIV-1 Reverse Transcriptase Confers Zidovudine and Nevirapine Resistance 
PLoS Medicine  2007;4(12):e335.
The catalytically active 66-kDa subunit of the human immunodeficiency virus type 1 (HIV-1) reverse transcriptase (RT) consists of DNA polymerase, connection, and ribonuclease H (RNase H) domains. Almost all known RT inhibitor resistance mutations identified to date map to the polymerase domain of the enzyme. However, the connection and RNase H domains are not routinely analysed in clinical samples and none of the genotyping assays available for patient management sequence the entire RT coding region. The British Columbia Centre for Excellence in HIV/AIDS (the Centre) genotypes clinical isolates up to codon 400 in RT, and our retrospective statistical analyses of the Centre's database have identified an N348I mutation in the RT connection domain in treatment-experienced individuals. The objective of this multidisciplinary study was to establish the in vivo relevance of this mutation and its role in drug resistance.
Methods and Findings
The prevalence of N348I in clinical isolates, the time taken for it to emerge under selective drug pressure, and its association with changes in viral load, specific drug treatment, and known drug resistance mutations were analysed from genotypes, viral loads, and treatment histories from the Centre's database. N348I increased in prevalence from below 1% in 368 treatment-naïve individuals to 12.1% in 1,009 treatment-experienced patients (p = 7.7 × 10⁻¹²). N348I appeared early in therapy and was highly associated with thymidine analogue mutations (TAMs) M41L and T215Y/F (p < 0.001), the lamivudine resistance mutations M184V/I (p < 0.001), and non-nucleoside RTI (NNRTI) resistance mutations K103N and Y181C/I (p < 0.001). The association with TAMs and NNRTI resistance mutations was consistent with the selection of N348I in patients treated with regimens that included both zidovudine and nevirapine (odds ratio 2.62, 95% confidence interval 1.43–4.81). The appearance of N348I was associated with a significant increase in viral load (p < 0.001), which was as large as the viral load increases observed for any of the TAMs. However, this analysis did not account for the simultaneous selection of other RT or protease inhibitor resistance mutations on viral load. To delineate the role of this mutation in RT inhibitor resistance, N348I was introduced into HIV-1 molecular clones containing different genetic backbones. N348I decreased zidovudine susceptibility 2- to 4-fold in the context of wild-type HIV-1 or when combined with TAMs. N348I also decreased susceptibility to nevirapine (7.4-fold) and efavirenz (2.5-fold) and significantly potentiated resistance to these drugs when combined with K103N. Biochemical analyses of recombinant RT containing N348I provide supporting evidence for the role of this mutation in zidovudine and NNRTI resistance and give some insight into the molecular mechanism of resistance.
This study provides the first in vivo evidence that treatment with RT inhibitors can select a mutation (i.e., N348I) outside the polymerase domain of the HIV-1 RT that confers dual-class resistance. Its emergence, which can happen early during therapy, may significantly impact on a patient's response to antiretroviral therapies containing zidovudine and nevirapine. This study also provides compelling evidence for investigating the role of other mutations in the connection and RNase H domains in virological failure.
Analyzing HIV sequences from a Canadian cohort, Gilda Tachedjian and colleagues identify a common mutation in a little-studied domain of reverse transcriptase that confers resistance to two drug classes.
Editors' Summary
In the 1980s, infection with the human immunodeficiency virus (HIV), which causes acquired immunodeficiency syndrome (AIDS), was a death sentence. Although the first antiretroviral drugs (compounds that block HIV's life cycle) were developed quickly, single antiretrovirals only transiently suppress HIV infection. HIV rapidly accumulates random changes (mutations) in its genetic material, some of which make it drug resistant. Nowadays, there are many different antiretrovirals. Some inhibit the viral protease, an enzyme used to assemble new viruses. Others block reverse transcriptase (RT), which makes replicates of the genes of the virus. Nucleoside/nucleotide RT inhibitors (NRTIs; for example, zidovudine—also called AZT—and lamivudine) and non-nucleoside RT inhibitors (NNRTIs; for example, nevirapine and efavirenz) interfere with the activity of RT by binding to different sites in its so-called “DNA polymerase domain,” the part of the enzyme that constructs copies of the viral genes. Highly active antiretroviral therapy (HAART), which was introduced in the mid 1990s, combines several antiretrovirals (usually a protease inhibitor and two NRTIs or an NNRTI and two NRTIs) so that the replication of any virus that develops resistance to one drug is inhibited by the other drugs in the mix. When treated with HAART, HIV infection is usually a chronic, stable condition rather than a fatal disease.
Why Was This Study Done?
Unfortunately, HIV that is resistant to drugs still develops in some patients. To improve the prevention and management of drug resistance, a better understanding of the mutations that cause resistance is needed. Resistance to RT inhibitors usually involves mutations in the DNA polymerase domain that reduce the efficacy of NRTIs (including thymidine analogue mutations—also known as TAMs—and lamivudine-resistance mutations) and NNRTIs. Blood tests that detect these resistance mutations (genotype tests) have been used for several years to guide individualized selection of HIV drugs. Recently, however, mutations outside the DNA polymerase domain have also been implicated in resistance to RT inhibitors. In this study, the researchers have used data and samples collected since the mid 1990s by Canada's British Columbia Centre for Excellence in HIV/AIDS to investigate the clinical relevance of a mutation called N348I. This mutation changes an asparagine (a type of amino acid) to an isoleucine in a region of RT known as the connection domain. The researchers have also investigated how this mutation causes resistance to RT inhibitors in laboratory tests.
What Did the Researchers Do and Find?
The researchers analyzed the first two-thirds of the RT gene in viruses isolated from a large number of the Centre's patients. Virus carrying the N348I mutation was present in less than one in 100 patients whose HIV infection had never been treated, but in more than one in 10 treatment-experienced patients. The mutation appeared early in therapy, often in viruses that had TAMs, a lamivudine-resistance mutation called M184V/I, and/or NNRTI resistance mutations. Patients treated with zidovudine and nevirapine were 2.6 times more likely to have the N348I mutation than patients not treated with these drugs. Furthermore, the appearance of the N348I mutation often coincided with an increase in viral load, although other mutations that appeared at a similar time could have contributed to this increase. When the researchers introduced the N348I mutation into HIV growing in the laboratory, they found that it decreased the susceptibility of the virus to zidovudine and to NNRTIs.
What Do These Findings Mean?
These findings show that the treatment of patients with RT inhibitors can select a drug-resistant HIV variant that has a mutation outside the enzyme's DNA polymerase domain. Because this N348I mutation, which is commonly selected in vivo and has also been seen in other studies, confers resistance to two classes of RT inhibitors and can emerge early during therapy, it could have a large impact on patient responses to antiviral regimens that contain zidovudine and nevirapine. Although these findings do not show that the N348I mutation alone causes treatment failure, they may have implications for genotypic and phenotypic resistance testing, which is often used to guide treatment decisions. At present, genotype tests for resistance to RT inhibitors look for mutations only in the DNA polymerase domain of RT. This study is the first to demonstrate that it might be worth looking for the N348I mutation (and for other mutations outside the DNA polymerase domain) to improve the ability of genotypic and phenotypic resistance tests to predict treatment outcomes.
Additional Information.
Please access these Web sites via the online version of this summary at
Information is available from the US National Institute of Allergy and Infectious Diseases on HIV infection and AIDS
HIV InSite has comprehensive information on all aspects of HIV/AIDS, including links to fact sheets (in English, French, and Spanish) about antiretrovirals, and chapters explaining antiretroviral resistance testing
NAM, a UK registered charity, provides information about all aspects of HIV and AIDS, including fact sheets on types of HIV drugs, drug resistance, and resistance tests (in English, Spanish, French, Portuguese, and Russian)
The US Centers for Disease Control and Prevention provides information on HIV/AIDS and on treatment (in English and Spanish)
AIDSinfo, a service of the US Department of Health and Human Services provides information for patients on HIV and its treatment
PMCID: PMC2100143  PMID: 18052601
3.  A novel tree-based procedure for deciphering the genomic spectrum of clinical disease entities 
Dissecting the genomic spectrum of clinical disease entities is a challenging task. Recursive partitioning (or classification trees) methods provide powerful tools for exploring complex interplay among genomic factors, with respect to a main factor, that can reveal hidden genomic patterns. To take confounding variables into account, the partially linear tree-based regression (PLTR) model has been recently published. It combines regression models and tree-based methodology. It is however computationally burdensome and not well suited for situations for which a large number of exploratory variables is expected.
We developed a novel procedure that represents an alternative to the original PLTR procedure, and considered different selection criteria. A simulation study with different scenarios was performed to compare the performance of the proposed procedure with that of the original PLTR strategy.
The proposed procedure with a Bayesian Information Criterion (BIC) performed well in detecting the hidden structure compared with the original procedure. The novel procedure was used for analyzing patterns of copy-number alterations in lung adenocarcinomas, with respect to Kirsten Rat Sarcoma Viral Oncogene Homolog gene (KRAS) mutation status, while controlling for a cohort effect. Results highlight two subgroups of pure or nearly pure wild-type KRAS tumors with particular copy-number alteration patterns.
The proposed procedure with the BIC represents a powerful and practical alternative to the original procedure. Our procedure performs well in a general framework and is simple to implement.
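To make the BIC-based selection step concrete, here is a minimal sketch of choosing among candidate models by the standard definition BIC = k ln(n) − 2 ln(L). It is not the procedure from the paper: the function names, the candidate labels, and the log-likelihood values are hypothetical and serve only to show how the criterion trades fit against the number of parameters.

```python
from math import log


def bic(log_likelihood, n_params, n_obs):
    """Bayesian Information Criterion: k * ln(n) - 2 * ln(L). Lower is better."""
    return n_params * log(n_obs) - 2.0 * log_likelihood


def select_by_bic(candidates, n_obs):
    """Return the name of the candidate with the smallest BIC.

    candidates maps a label to a (log_likelihood, n_params) pair.
    """
    return min(candidates, key=lambda name: bic(*candidates[name], n_obs))


# Hypothetical candidate trees of increasing size (values are made up).
candidates = {
    "2 leaves": (-120.0, 3),
    "3 leaves": (-112.0, 4),
    "6 leaves": (-110.0, 7),
}
print(select_by_bic(candidates, n_obs=100))  # 3 leaves
```

The largest tree fits slightly better, but the penalty term outweighs the gain, so the intermediate tree is selected.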
PMCID: PMC4129184  PMID: 24739673
Recursive partitioning; Tree-based regression; Lung cancer; Disease taxonomy; Genomic
4.  A recursively partitioned mixture model for clustering time-course gene expression data 
Translational cancer research  2014;3(3):217-232.
Longitudinally collected gene expression data provides an opportunity to investigate the dynamic behavior of gene expression and is crucial for establishing causal links between changes on a molecular level and disease development and progression. In terms of the analysis of such data, clustering of subjects based on time-course expression data may improve our understanding of temporal expression patterns that result in disease phenotypes. Although there are numerous existing methods for clustering subjects using gene expression data, most are not suitable when expression measurements are repeatedly collected over a time-course.
We present a modified version of the recursively partitioned mixture model (RPMM) for clustering subjects based on longitudinally collected gene expression data. In the proposed time-course RPMM (TC-RPMM), subjects are clustered on the basis of their temporal profiles of gene expression using a mixture of mixed effects models framework. This framework captures changes in gene expression over time and models the autocorrelation between repeated gene expression measurements for the same subject. We assessed the performance of TC-RPMM using extensive simulation studies and a dataset from a multi-center research study of inflammation and response to injury, which consisted of time-course gene expression data for 140 subjects.
Our simulation studies encompassed several different scenarios and were aimed at assessing the ability of TC-RPMM to correctly recover true class memberships when the expression trajectories that characterized those classes differed. Overall, our simulation studies revealed favorable performance of TC-RPMM compared to competing approaches; however, clustering performance was observed to be highly dependent on the proportion of class-discriminating genes used in clustering analysis. When applied to real epidemiologic data with repeated-measures, longitudinal gene expression measurements, TC-RPMM identified clusters that had strong biological and clinical significance.
Methods for clustering subjects based on temporal gene expression profiles are a high priority for molecular biology and bioinformatics research. Along these lines, the proposed TC-RPMM represents a promising new approach for analyzing time-course gene expression data.
PMCID: PMC4208690  PMID: 25346887
Longitudinal gene expression data; repeated-measures microarrays; time-course microarrays; clustering; mixture models
5.  Minority HIV-1 Drug Resistance Mutations Are Present in Antiretroviral Treatment–Naïve Populations and Associate with Reduced Treatment Efficacy 
PLoS Medicine  2008;5(7):e158.
Transmitted HIV-1 drug resistance can compromise initial antiretroviral therapy (ART); therefore, its detection is important for patient management. The absence of drug-associated selection pressure in treatment-naïve persons can cause drug-resistant viruses to decline to levels undetectable by conventional bulk sequencing (minority drug-resistant variants). We used sensitive and simple tests to investigate evidence of transmitted drug resistance in antiretroviral drug-naïve persons and assess the clinical implications of minority drug-resistant variants.
Methods and Findings
We performed a cross-sectional analysis of transmitted HIV-1 drug resistance and a case-control study of the impact of minority drug resistance on treatment response. For the cross-sectional analysis, we examined viral RNA from newly diagnosed ART-naïve persons in the US and Canada who had no detectable (wild type, n = 205) or one or more resistance-related mutations (n = 303) by conventional sequencing. Eight validated real-time PCR-based assays were used to test for minority drug resistance mutations (protease L90M and reverse transcriptase M41L, K70R, K103N, Y181C, M184V, and T215F/Y) above naturally occurring frequencies. The sensitive real-time PCR testing identified one to three minority drug resistance mutation(s) in 34/205 (17%) newly diagnosed persons who had wild-type virus by conventional genotyping; four (2%) individuals had mutations associated with resistance to two drug classes. Among 30/303 (10%) samples with bulk genotype resistance mutations we found at least one minority variant with a different drug resistance mutation. For the case-control study, we assessed the impact of three treatment-relevant drug resistance mutations at baseline from a separate group of 316 previously ART-naïve persons with no evidence of drug resistance on bulk genotype testing who were placed on efavirenz-based regimens. We found that 7/95 (7%) persons who experienced virologic failure had minority drug resistance mutations at baseline; however, minority resistance was found in only 2/221 (0.9%) treatment successes (Fisher exact test, p = 0.0038).
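The case-control comparison above (7 of 95 virologic failures versus 2 of 221 treatment successes with minority resistance at baseline) can be checked with a short Fisher exact test. The sketch below is an illustration, not the authors' analysis code. It assumes the common two-sided definition that sums the probabilities of all tables no more likely than the observed one, and the abstract does not say which variant was used.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(k):
        # Hypergeometric probability of k in the top-left cell, margins fixed.
        return comb(row1, k) * comb(row2, col1 - k) / denom

    lowest, highest = max(0, col1 - row2), min(row1, col1)
    p_observed = prob(a)
    # Sum over all tables that are no more likely than the observed table
    # (small relative tolerance guards against floating-point ties).
    return sum(p for p in map(prob, range(lowest, highest + 1))
               if p <= p_observed * (1 + 1e-9))


# Rows: virologic failure, treatment success.
# Columns: minority resistance present, absent (counts from the abstract).
p_value = fisher_exact_two_sided(7, 95 - 7, 2, 221 - 2)
print(f"two-sided p = {p_value:.4f}")
```

The result should be of the same order as the reported p = 0.0038; small differences are possible if a different two-sided convention was applied.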
These data suggest that a considerable proportion of transmitted HIV-1 drug resistance is undetected by conventional genotyping and that minority mutations can have clinical consequences. With no treatment history to help guide therapies for drug-naïve persons, the findings suggest an important role for sensitive baseline drug resistance testing.
Using real-time PCR to detect HIV resistance mutations present at low levels, Jeffrey Johnson and colleagues investigate prevalence and clinical implications of minority transmitted mutations.
Editors' Summary
Since the mid-1990s, several powerful antiretroviral drug combinations have been developed that have greatly improved the prognosis of HIV infection. All antiretroviral therapy (ART) regimens combine drugs that act against HIV in different ways (so-called different drug classes). Multiple drugs are necessary because HIV continually accumulates random changes (mutations) in its genetic material (genome). Some of these mutations make HIV resistant to individual antiretroviral drugs, so a mixture of drugs is needed to keep the virus in check. However, the efficacy of ART (which itself selects for drug-resistant variants by giving them a growth advantage over drug-sensitive variants) is substantially reduced when these variants account for more than about 20% of the viruses in an infected person. This level of variant virus can be detected in blood samples with a technique called bulk sequencing. In North America and Europe, where ART has been widely used for many years, around 20% of HIV-infected people who have taken ART themselves develop this level of drug-resistant virus, which can be transmitted by the same routes as nonresistant HIV (typically unprotected sexual intercourse or needle sharing). In such cases, the person acquiring drug-resistant HIV may experience treatment failure when drugs later fail to work against the resistant virus. In these countries, therefore, resistance testing by bulk sequencing is done routinely before ART is initiated to decide which antiviral drugs are likely to be effective.
Why Was This Study Done?
Several years usually elapse between the time a person becomes infected with HIV and the time he or she starts ART. During this time, the absence of selection pressure from antiviral drugs means that transmitted drug-resistant variants tend to decline to levels undetectable by bulk sequencing. These “minority drug-resistant variants” can be detected using other more sensitive tests but it is not known what proportion of HIV-infected people who have never taken ART carry minority drug-resistant variants (the “prevalence” of these variants). It is also unknown whether the presence of minority drug-resistant variants reduces the success of ART. In this paper, the researchers first report a “cross-sectional” study in North America using a sensitive assay to determine the prevalence of minority drug-resistant viruses among HIV-infected people who had never received ART. They then investigate whether minority drug-resistant variants have any impact on the effectiveness of ART in a “case-control” study.
What Did the Researchers Do and Find?
In their cross-sectional study, the researchers used a highly sensitive test for detecting mutations (called a real-time PCR-based assay) to look for low levels of viruses carrying any of eight major drug-resistance mutations in people with newly diagnosed HIV infection who reported no prior treatment with ART. Seventeen percent of the people who had only wild-type (nonmutated) virus by bulk sequencing (205 participants) were found, in fact, to carry low levels of virus variants with 1–3 drug-resistance mutations; 2% of them carried viruses resistant to two different drug classes (called multi-drug resistance). Among the people with resistance mutations detected by bulk sequencing (303 participants), 10% had at least one additional minority drug-resistant variant, often a viral variant that was resistant to a drug class different from that detected by bulk sequencing. In the case-control study, the researchers used their sensitive assays to measure the levels of viruses containing any of the three most common drug resistance mutations likely to affect viral responses to the antiretroviral drugs efavirenz and lamivudine in 316 people just before they started their first HIV treatment, which included these drugs. Of people for whom ART failed, 7% were infected with minority drug-resistant virus variants at baseline compared with only 0.9% of people for whom ART worked; this difference was statistically significant.
What Do These Findings Mean?
The findings of the cross-sectional study indicate that conventional bulk sequencing fails to detect a large proportion of transmitted HIV drug resistance and suggest that the transmission of drug-resistant variants from infectious ART-experienced people to ART-naïve individuals might not be uncommon. The findings of the case-control study suggest that the minority drug-resistant HIV variants may have clinical consequences. That is, the presence of such variants in individuals who have not previously taken ART may reduce the efficacy of some ART regimens. However, the number of participants meeting the criteria for analysis in the cross-sectional study was limited, and the association between minority resistance and treatment failure may have been influenced by other factors. Taken together, these findings suggest that, to ensure that first-line ART is as effective as possible, greater efforts should be made to prevent HIV transmission, whether from ART-experienced or ART-naïve people. However, because data on minority drug-resistant virus are limited, more studies, particularly with recent populations, are needed before testing for these variants can be considered appropriate in the clinical management of newly diagnosed HIV infection.
Additional Information.
Please access these Web sites via the online version of this summary at
This study is further discussed in a PLoS Medicine Perspective by Steven G. Deeks
Information is available from the US National Institute of Allergy and Infectious Diseases on HIV infection and AIDS
HIV InSite has comprehensive information on all aspects of HIV/AIDS, including links to fact sheets (in English, French, and Spanish) about antiretrovirals and information on genetic testing for HIV drug resistance
NAM, a UK registered charity, provides information about all aspects of HIV and AIDS, including fact sheets on types of HIV drugs, drug resistance, and resistance tests (in English, Spanish, French, Portuguese, and Russian)
The US Centers for Disease Control and Prevention provides information on HIV/AIDS and on treatment (in English and Spanish)
PMCID: PMC2488194  PMID: 18666824
6.  Prognostic scores in brain metastases from breast cancer 
BMC Cancer  2009;9:105.
Prognostic scores might be useful tools both in clinical practice and clinical trials, where they can be used as a stratification parameter. The available scores for patients with brain metastases have never been tested specifically in patients with primary breast cancer. It is therefore unknown which score is most appropriate for these patients.
Five previously published prognostic scores were evaluated in a group of 83 patients with brain metastases from breast cancer. All patients had been treated with whole-brain radiotherapy with or without radiosurgery or surgical resection. In addition, it was tested whether the parameters that form the basis of these scores actually have a prognostic impact in this biologically distinct group of brain metastases patients.
The scores that performed best were the recursive partitioning analysis (RPA) classes and the score index for radiosurgery (SIR). However, the parameters that form the basis of these scores disagreed with those that determine survival in the present group of patients and in many reported data from the literature on brain metastases from breast cancer. With the four statistically significant prognostic factors identified here, a 3-tiered score can be created that performs slightly better than RPA and SIR. In addition, a 4-tiered score is also possible, which performs better than the three previous 4-tiered scores, including the graded prognostic assessment (GPA) score and the basic score for brain metastases (BSBM).
A variety of prognostic models describe the survival of patients with brain metastases from breast cancer to a more or less satisfactory degree. However, the standard brain metastases scores might not fully appreciate the unique biology and time course of this disease, e.g., compared to lung cancer. It appears possible that inclusion of emerging prognostic factors will improve the results and allow for development and validation of a consensus score for broad clinical application. The model that is based on the authors' own patient group, which is not large enough to fully evaluate a large number of potential prognostic factors, is meant to illustrate this point rather than to provide the definitive score.
PMCID: PMC2674059  PMID: 19351389
7.  Emergence of Drug Resistance Is Associated with an Increased Risk of Death among Patients First Starting HAART 
PLoS Medicine  2006;3(9):e356.
The impact of the emergence of drug-resistance mutations on mortality is not well characterized in antiretroviral-naïve patients first starting highly active antiretroviral therapy (HAART). Patients may be able to sustain immunologic function with resistant virus, and there is limited evidence that reduced sensitivity to antiretrovirals leads to rapid disease progression or death. We undertook the present analysis to characterize the determinants of mortality in a prospective cohort study with a median of nearly 5 y of follow-up. The objective of this study was to determine the impact of the emergence of drug-resistance mutations on survival among persons initiating HAART.
Methods and Findings
Participants were antiretroviral therapy naïve at entry and initiated triple combination antiretroviral therapy between August 1, 1996, and September 30, 1999. Marginal structural modeling was used to address potential confounding between time-dependent variables in the Cox proportional hazard regression models. In this analysis resistance to any class of drug was considered as a binary time-dependent exposure to the risk of death, controlling for the effect of other time-dependent confounders. We also considered each separate class of mutation as a binary time-dependent exposure, while controlling for the presence/absence of other mutations. A total of 207 deaths were identified among 1,138 participants over the follow-up period, with an all-cause mortality rate of 18.2%. Among the 679 patients with HIV-drug-resistance genotyping done before initiating HAART, HIV-drug resistance to any class was observed in 53 (7.8%) of the patients. During follow-up, HIV-drug resistance to any class was observed in 302 (26.5%) participants. Emergence of any resistance was associated with mortality (hazard ratio: 1.75 [95% confidence interval: 1.27, 2.43]). When we considered each class of resistance separately, persons who exhibited resistance to non-nucleoside reverse transcriptase inhibitors had the highest risk: mortality rates were 3.02 times higher (95% confidence interval: 1.99, 4.57) for these patients than for those who did not exhibit this type of resistance.
We demonstrated that emergence of resistance to non-nucleoside reverse transcriptase inhibitors was associated with a greater risk of subsequent death than was emergence of protease inhibitor resistance. Future research is needed to identify the particular subpopulations of men and women at greatest risk and to elucidate the impact of resistance over a longer follow-up period.
Emergence of resistance to both non-nucleoside reverse transcriptase inhibitors and protease inhibitors was associated with a higher risk of subsequent death, but the risk was greater in patients with NNRTI-resistant HIV.
Editors' Summary
In the 1980s, infection with the human immunodeficiency virus (HIV) was effectively a death sentence. HIV causes AIDS (acquired immunodeficiency syndrome) by replicating inside immune system cells and destroying them, which leaves infected individuals unable to fight off other viruses and bacteria. The first antiretroviral drugs were developed quickly, but it soon became clear that single antiretrovirals only transiently suppress HIV infection. HIV mutates (accumulates random changes to its genetic material) very rapidly and, although most of these changes (or mutations) are bad for the virus, by chance some make it drug resistant. Highly active antiretroviral therapy (HAART), which was introduced in the mid-1990s, combines three or four antiretroviral drugs that act at different stages of the viral life cycle. For example, they inhibit the reverse transcriptase that the virus uses to replicate its genetic material, or the protease that is necessary to assemble new viruses. With HAART, the replication of any virus that develops resistance to one drug is inhibited by the other drugs in the mix. As a consequence, for many individuals with access to HAART, AIDS has become a chronic rather than a fatal disease. However, being on HAART requires patients to take several pills a day at specific times. In addition, the drugs in the HAART regimens often have side effects.
Why Was This Study Done?
Drug resistance still develops even with HAART, often because patients don't stick to the complicated regimens. The detection of resistance to one drug is usually the prompt to change a patient's drug regimen to head off possible treatment failure. Although most patients treated with HAART live for many years, some still die from AIDS. We don't know much about how the emergence of drug-resistance mutations affects mortality in patients who are starting antiretroviral therapy for the first time. In this study, the researchers looked at how the emergence of drug resistance affected survival in a group of HIV/AIDS patients in British Columbia, Canada. Here, everyone with HIV/AIDS has access to free medical attention, HAART, and laboratory monitoring, and full details of all HAART recipients are entered into a central reporting system.
What Did the Researchers Do and Find?
The researchers enrolled people who started antiretroviral therapy for the first time between August 1996 and September 1999 into the HAART Observational Medical Evaluation and Research (HOMER) cohort. They then excluded anyone who was infected with already drug-resistant HIV strains (based on the presence of drug-resistance mutations in viruses isolated from the patients) at the start of therapy. The remaining 1,138 patients were followed for an average of five years. All the patients received either two nucleoside reverse transcriptase inhibitors and a protease inhibitor, or two nucleoside and one non-nucleoside reverse transcriptase inhibitor (NNRTI). Nearly a fifth of the study participants died during the follow-up period. Most of these patients actually had drug-sensitive viruses, possibly because they had neglected taking their drugs to such an extent that there had been insufficient drug exposure to select for drug-resistant viruses. In a quarter of the patients, however, HIV strains resistant to one or more antiretroviral drugs emerged during the study (again judged by looking for mutations). Detailed statistical analyses indicated that the emergence of any drug resistance nearly doubled the risk of patients dying, and that people carrying viruses resistant to NNRTIs were three times as likely to die as those without resistance to this class of antiretroviral drug.
What Do These Findings Mean?
These results provide new information about the emergence of drug-resistant HIV during HAART and possible effects on the long-term survival of patients. In particular, they suggest that clinicians should watch carefully for the emergence of resistance to NNRTIs in their patients. Because this type of resistance is often due to poor adherence to drug regimens, these results also suggest that increased efforts should be made to ensure that patients comply with the prescribed HAART regimens, especially those whose antiretroviral therapy includes NNRTIs. As with all studies in which a group of individuals who share a common characteristic are studied over time, it is possible that some other, unmeasured difference between the patients who died and those who didn't—rather than emerging drug resistance—is responsible for the observed differences in survival. Additional studies are needed to confirm the findings here, and to investigate whether specific subpopulations of patients are at particular risk of developing drug resistance and/or dying during HAART.
Additional Information.
Please access these Web sites via the online version of this summary at
US National Institute of Allergy and Infectious Diseases fact sheet on HIV infection and AIDS
US Department of Health and Human Services information on AIDS, including details of approved drugs for the treatment of HIV infection
US Centers for Disease Control and Prevention information on HIV/AIDS
Aidsmap, information on HIV and AIDS provided by the charity NAM, which includes details on antiretroviral drugs
PMCID: PMC1569883  PMID: 16984218
8.  Random forest methodology for model-based recursive partitioning: the mobForest package for R 
BMC Bioinformatics  2013;14:125.
Recursive partitioning is a non-parametric modeling technique widely used in regression and classification problems. Model-based recursive partitioning is used to identify groups of observations with similar values of parameters of the model of interest. The mob() function in the party package in R implements the model-based recursive partitioning method. This method produces predictions based on single tree models, which are very sensitive to small changes in the learning sample. We extend the model-based recursive partitioning method to produce predictions based on multiple tree models constructed on random samples obtained either through bootstrapping (random sampling with replacement) or subsampling (random sampling without replacement) of the learning data.
Here we present an R package called “mobForest” that implements bagging and random forests methodology for model-based recursive partitioning. The mobForest package constructs a large number of model-based trees, and the predictions are aggregated across these trees, resulting in more stable predictions. The package also includes functions for computing predictive accuracy estimates and plots, residual plots, and variable importance plots.
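The resampling-and-aggregation idea described above can be illustrated with a minimal Python sketch. This is not the mobForest API (mobForest is an R package); the base learner `fit_mean` is a hypothetical stand-in for a model-based tree, and the names `draw_sample` and `bagged_predict`, the subsample fraction, and the toy data are all assumptions made for illustration.

```python
import random
from statistics import mean

def draw_sample(data, rng, replace=True, fraction=0.632):
    """Draw a learning sample: bootstrap (with replacement) or subsample (without)."""
    if replace:
        # Bootstrap: same size as the data, observations may repeat.
        return [rng.choice(data) for _ in data]
    # Subsample: a smaller sample with no repeated observations.
    k = max(1, int(fraction * len(data)))
    return rng.sample(data, k)

def fit_mean(sample):
    """Hypothetical base learner: predicts the mean response of its sample."""
    y_bar = mean(y for _, y in sample)
    return lambda x: y_bar

def bagged_predict(data, x, n_models=50, replace=True, seed=0):
    """Fit one base learner per random sample and average their predictions."""
    rng = random.Random(seed)
    models = [fit_mean(draw_sample(data, rng, replace)) for _ in range(n_models)]
    return mean(m(x) for m in models)

# Toy learning data: (covariate, response) pairs.
data = [(i, float(2 * i + 1)) for i in range(20)]
print(bagged_predict(data, x=5, replace=True))   # bootstrap aggregation
print(bagged_predict(data, x=5, replace=False))  # subsample aggregation
```

Averaging over many resampled fits is what dampens the sensitivity of any single tree to small changes in the learning sample; in a real forest the base learner would be a fitted tree rather than a constant predictor.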
The mobForest package implements a random forest type approach for model-based recursive partitioning. The R package along with its source code is available at
PMCID: PMC3626834  PMID: 23577585
Random forests; Model-based recursive partitioning; Ensemble; R
9.  Frequent Emergence of N348I in HIV-1 Subtype C Reverse Transcriptase with Failure of Initial Therapy Reduces Susceptibility to Reverse-Transcriptase Inhibitors 
N348I emerges frequently with failure of first-line antiretroviral therapy (ART) in subtype C human immunodeficiency virus type 1 infection and affects susceptibility to nevirapine, efavirenz, etravirine, and zidovudine. This finding has implications for cross-resistance to subsequent ART regimens in resource-limited settings.
Background. It is not known how often mutations in the connection and ribonuclease H domains of reverse transcriptase (RT) emerge with failure of first-line antiretroviral therapy (ART) in subtype C human immunodeficiency virus type 1 (HIV-1) infection and how these mutations affect susceptibility to other antiretrovirals.
Methods. We compared full-length RT sequences in plasma obtained before therapy and at virologic failure of initial ART among 63 participants with subtype C HIV-1 infection enrolled in the Comprehensive International Program of Research on AIDS in South Africa (CIPRA-SA) study. Recombinant viruses containing full-length plasma-derived RT sequences from participants with N348I at virologic failure were assayed for drug susceptibility.
Results. Y181C and M184V mutations in the RT polymerase domain were associated with failure of stavudine-lamivudine-nevirapine (d4T/3TC/NVP; P < .01), and K103N, V106M, and M184V with failure of d4T/3TC/efavirenz (EFV; P < .01). N348I in the RT connection domain emerged in 45% (P = .002) and 12% (P = .06) of participants receiving failing regimens containing NVP or EFV, respectively. Longitudinal analyses revealed that nonnucleoside RT inhibitor resistance mutations in the polymerase domain generally appeared first. N348I emerged at the same time, or after, M184V. N348I in the context of polymerase domain mutations reduced susceptibility to NVP (8.9–13-fold), EFV (4–56-fold), etravirine (ETV; 1.9–4.7-fold) and decreased hypersusceptibility to zidovudine (AZT; 1.4–2.2-fold).
Conclusions. N348I emerges frequently with virologic failure of first-line ART in subtype C HIV-1 infection and reduces susceptibility to NVP, EFV, ETV, and AZT. Additional studies are warranted to characterize the effects of N348I on virologic response to second- and third-line regimens in resource-limited settings where subtype C predominates.
PMCID: PMC3491849  PMID: 22618567
10.  Radiotherapy and temozolomide for newly diagnosed glioblastoma and anaplastic astrocytoma: validation of Radiation Therapy Oncology Group-Recursive Partitioning Analysis in the IMRT and temozolomide era 
Journal of Neuro-Oncology  2010;104(1):339-349.
Since the development of the Radiation Therapy Oncology Group-Recursive Partitioning Analysis (RTOG-RPA) risk classes for high-grade glioma, radiation therapy in combination with temozolomide (TMZ) has become standard care. While this combination has improved survival, the prognosis remains poor in the majority of patients. Therefore, strong interest in high-grade gliomas from basic research to clinical trials persists. We sought to evaluate whether the current RTOG-RPA retains prognostic significance in the TMZ era or alternatively, if modifications better prognosticate the optimal selection of patients with similar baseline prognosis for future clinical protocols. The records of 159 patients with newly-diagnosed glioblastoma (GBM, WHO grade IV) or anaplastic astrocytoma (AA, WHO grade III) were reviewed. Patients were treated with intensity-modulated radiation therapy (IMRT) and concurrent followed by adjuvant TMZ (n = 154) or adjuvant TMZ only (n = 5). The primary endpoint was overall survival. Three separate analyses were performed: (1) application of RTOG-RPA to the study cohort and calculation of subsequent survival curves, (2) fit a new tree model with the same predictors in RTOG-RPA, and (3) fit a new tree model with an expanded predictor set. All analyses used a regression tree analysis with a survival outcome fit to formulate new risk classes. Overall median survival was 14.9 months. Using the RTOG-RPA, the six classes retained their relative prognostic significance and overall ordering, with the corresponding survival distributions significantly different from each other (P < 0.01, χ2 statistic = 70). New recursive partitioning limited to the predictors in RTOG-RPA defined four risk groups based on Karnofsky Performance Status (KPS), histology, age, length of neurologic symptoms, and mental status. 
Analysis across the expanded predictors defined six risk classes, including the same five variables plus tumor location, tobacco use, and hospitalization during radiation therapy. Patients with excellent functional status, AA, and frontal lobe tumors had the best prognosis. For patients with newly-diagnosed high-grade gliomas, RTOG-RPA classes retained prognostic significance in patients treated with TMZ and IMRT. In contrast to RTOG-RPA, in our modified RPA model, KPS rather than age represented the initial split. New recursive partitioning identified potential modifications to RTOG-RPA that should be further explored with a larger data set.
PMCID: PMC3151374  PMID: 21181233
Glioblastoma; Anaplastic Astrocytoma; RTOG-RPA; Validation; Temozolomide; IMRT
11.  Model-based clustering of DNA methylation array data: a recursive-partitioning algorithm for high-dimensional data arising as a mixture of beta distributions 
BMC Bioinformatics  2008;9:365.
Epigenetics is the study of heritable changes in gene function that cannot be explained by changes in DNA sequence. One of the most commonly studied epigenetic alterations is cytosine methylation, which is a well recognized mechanism of epigenetic gene silencing and often occurs at tumor suppressor gene loci in human cancer. Arrays are now being used to study DNA methylation at a large number of loci; for example, the Illumina GoldenGate platform assesses DNA methylation at 1505 loci associated with over 800 cancer-related genes. Model-based cluster analysis is often used to identify DNA methylation subgroups in data, but it is unclear how to cluster DNA methylation data from arrays in a scalable and reliable manner.
We propose a novel model-based recursive-partitioning algorithm to navigate clusters in a beta mixture model. We present simulations that show that the method is more reliable than competing nonparametric clustering approaches, and is at least as reliable as conventional mixture model methods. We also show that our proposed method is more computationally efficient than conventional mixture model approaches. We demonstrate our method on the normal tissue samples and show that the clusters are associated with tissue type as well as age.
Our proposed recursively-partitioned mixture model is an effective and computationally efficient method for clustering DNA methylation data.
PMCID: PMC2553421  PMID: 18782434
12.  Class-Sparing Regimens for Initial Treatment of HIV-1 Infection 
The New England journal of medicine  2008;358(20):10.1056/NEJMoa074609.
The use of either efavirenz or lopinavir–ritonavir plus two nucleoside reverse-transcriptase inhibitors (NRTIs) is recommended for initial therapy for patients with human immunodeficiency virus type 1 (HIV-1) infection, but which of the two regimens has greater efficacy is not known. The alternative regimen of lopinavir–ritonavir plus efavirenz may prevent toxic effects associated with NRTIs.
In an open-label study, we compared three regimens for initial therapy: efavirenz plus two NRTIs (efavirenz group), lopinavir–ritonavir plus two NRTIs (lopinavir–ritonavir group), and lopinavir–ritonavir plus efavirenz (NRTI-sparing group). We randomly assigned 757 patients with a median CD4 count of 191 cells per cubic millimeter and a median HIV-1 RNA level of 4.8 log10 copies per milliliter to the three groups.
At a median follow-up of 112 weeks, the time to virologic failure was longer in the efavirenz group than in the lopinavir–ritonavir group (P = 0.006) but was not significantly different in the NRTI-sparing group from the time in either of the other two groups. At week 96, the proportion of patients with fewer than 50 copies of plasma HIV-1 RNA per milliliter was 89% in the efavirenz group, 77% in the lopinavir–ritonavir group, and 83% in the NRTI-sparing group (P = 0.003 for the comparison between the efavirenz group and the lopinavir–ritonavir group). The groups did not differ significantly in the time to discontinuation because of toxic effects. At virologic failure, antiretroviral resistance mutations were more frequent in the NRTI-sparing group than in the other two groups.
Virologic failure was less likely in the efavirenz group than in the lopinavir–ritonavir group. The virologic efficacy of the NRTI-sparing regimen was similar to that of the efavirenz regimen but was more likely to be associated with drug resistance. (ClinicalTrials.gov number, NCT00050895.)
PMCID: PMC3885902  PMID: 18480202
13.  Efavirenz Therapy in Rhesus Macaques Infected with a Chimera of Simian Immunodeficiency Virus Containing Reverse Transcriptase from Human Immunodeficiency Virus Type 1 
The specificity of nonnucleoside reverse transcriptase (RT) inhibitors (NNRTIs) for the RT of human immunodeficiency virus type 1 (HIV-1) has prevented the use of simian immunodeficiency virus (SIV) in the study of NNRTIs and NNRTI-based highly active antiretroviral therapy. However, a SIV-HIV-1 chimera (RT-SHIV), in which the RT from SIVmac239 was replaced with the RT-encoding region from HIV-1, is susceptible to NNRTIs and is infectious to rhesus macaques. We have evaluated the antiviral activity of efavirenz against RT-SHIV and the emergence of efavirenz-resistant mutants in vitro and in vivo. RT-SHIV was susceptible to efavirenz with a mean effective concentration of 5.9 ± 4.5 nM, and RT-SHIV variants selected with efavirenz in cell culture displayed 600-fold-reduced susceptibility. The efavirenz-resistant mutants of RT-SHIV had mutations in RT similar to those of HIV-1 variants that were selected under similar conditions. Efavirenz monotherapy of RT-SHIV-infected macaques produced a 1.82-log-unit decrease in plasma viral-RNA levels after 1 week. The virus load rebounded within 3 weeks in one treated animal and more slowly in a second animal. Virus isolated from these two animals contained the K103N and Y188C or Y188L mutations. The RT-SHIV-rhesus macaque model may prove useful for studies of antiretroviral drug combinations that include efavirenz.
PMCID: PMC514752  PMID: 15328115
14.  Minor HIV-1 Variants with the K103N Resistance Mutation during Intermittent Efavirenz-Containing Antiretroviral Therapy and Virological Failure 
PLoS ONE  2011;6(6):e21655.
The impact of minor drug-resistant variants of human immunodeficiency virus type 1 (HIV-1) on the failure of antiretroviral therapy remains unclear. We have evaluated the importance of detecting minor populations of viruses resistant to non-nucleoside reverse-transcriptase inhibitors (NNRTI) during intermittent antiretroviral therapy, a high-risk context for the emergence of drug-resistant HIV-1. We carried out a longitudinal study on plasma samples taken from 21 patients given efavirenz and enrolled in the intermittent arm of the ANRS 106 trial. Allele-specific real-time PCR was used to detect and quantify minor K103N mutants during off-therapy periods. The concordance with ultra-deep pyrosequencing was assessed for 11 patients. The pharmacokinetics of efavirenz was assayed to determine whether its variability could influence the emergence of K103N mutants. Allele-specific real-time PCR detected K103N mutants in 15 of the 19 analyzable patients at the end of an off-therapy period, while direct sequencing detected mutants in only 6 patients. The frequency of K103N mutants was <0.1% in 7 patients by allele-specific real-time PCR without further selection, and >0.1% in 8. It was 0.1%–10% in 6 of these 8 patients. The mutated virus populations of 4 of these 6 patients underwent further selection, and treatment failed for 2 of them. The K103N mutant frequency was >10% in the remaining 2 patients; treatment failed for one of them. The copy numbers of K103N variants quantified by allele-specific real-time PCR and ultra-deep pyrosequencing agreed closely (ρ = 0.89, P < 0.0001). The half-life of efavirenz was longer (50.5 hours) in the 8 patients in whom K103N emerged (>0.1%) than in the 11 patients in whom it did not (32 hours) (P = 0.04). Thus, ultrasensitive methods could prove more useful than direct sequencing for predicting treatment failure in some patients. However, the presence of minor NNRTI-resistant viruses need not always result in virological escape.
Trial registration NCT00122551
PMCID: PMC3124548  PMID: 21738752
15.  Association between Prenatal Exposure to Antiretroviral Therapy and Birth Defects: An Analysis of the French Perinatal Cohort Study (ANRS CO1/CO11) 
PLoS Medicine  2014;11(4):e1001635.
Jeanne Sibiude and colleagues use the French Perinatal Cohort to estimate the prevalence of birth defects in children born to HIV-infected women receiving antiretroviral therapy during pregnancy.
Please see later in the article for the Editors' Summary
Antiretroviral therapy (ART) has major benefits during pregnancy, both for maternal health and to prevent mother-to-child transmission of HIV. Safety issues, including teratogenic risk, need to be evaluated. We estimated the prevalence of birth defects in children born to HIV-infected women receiving ART during pregnancy, and assessed the independent association of birth defects with each antiretroviral (ARV) drug used.
Methods and Findings
The French Perinatal Cohort prospectively enrolls HIV-infected women delivering in 90 centers throughout France. Children are followed by pediatricians until 2 y of age according to national guidelines.
We included 13,124 live births between 1994 and 2010, among which, 42% (n = 5,388) were exposed to ART in the first trimester of pregnancy. Birth defects were studied using both European Surveillance of Congenital Anomalies (EUROCAT) and Metropolitan Atlanta Congenital Defects Program (MACDP) classifications; associations with ART were evaluated using univariate and multivariate logistic regressions. Correction for multiple comparisons was not performed because the analyses were based on hypotheses emanating from previous findings in the literature and the robustness of the findings of the current study. The prevalence of birth defects was 4.4% (95% CI 4.0%–4.7%), according to the EUROCAT classification. In multivariate analysis adjusting for other ARV drugs, maternal age, geographical origin, intravenous drug use, and type of maternity center, a significant association was found between exposure to zidovudine in the first trimester and congenital heart defects: 2.3% (74/3,267), adjusted odds ratio (AOR) = 2.2 (95% CI 1.3–3.7), p = 0.003, absolute risk difference attributed to zidovudine +1.2% (95% CI +0.5; +1.9%). Didanosine and indinavir were associated with head and neck defects, respectively: 0.5%, AOR = 3.4 (95% CI 1.1–10.4), p = 0.04; 0.9%, AOR = 3.8 (95% CI 1.1–13.8), p = 0.04. We found a significant association between efavirenz and neurological defects (n = 4) using the MACDP classification: AOR = 3.0 (95% CI 1.1–8.5), p = 0.04, absolute risk +0.7% (95% CI +0.07%; +1.3%). But the association was not significant using the less inclusive EUROCAT classification: AOR = 2.1 (95% CI 0.7–5.9), p = 0.16. No association was found between birth defects and lopinavir or ritonavir with a power >85% for an odds ratio of 1.5, nor for nevirapine, tenofovir, stavudine, or abacavir with a power >70%. Limitations of the present study were the absence of data on termination of pregnancy, stillbirths, tobacco and alcohol intake, and concomitant medication.
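As a worked illustration of the arithmetic behind a figure such as "2.3% (74/3,267)", the following Python sketch computes a proportion and a normal-approximation (Wald) 95% confidence interval. The choice of the Wald interval is an assumption; the abstract does not state which interval method the authors used, and the function name `proportion_ci` is hypothetical. The adjusted odds ratios cannot be reproduced this way, since they come from a multivariate logistic regression on data not given here.

```python
import math

def proportion_ci(events, total, z=1.96):
    """Return (proportion, lower, upper) using the Wald normal approximation."""
    p = events / total
    se = math.sqrt(p * (1 - p) / total)  # standard error of a binomial proportion
    return p, max(0.0, p - z * se), min(1.0, p + z * se)

# Congenital heart defects among children exposed to zidovudine
# in the first trimester, using the counts quoted in the abstract.
p, lo, hi = proportion_ci(74, 3267)
print(f"{100 * p:.1f}% (95% CI {100 * lo:.1f}% to {100 * hi:.1f}%)")
```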
We found a specific association between in utero exposure to zidovudine and heart defects; the mechanisms need to be elucidated. The association between efavirenz and neurological defects must be interpreted with caution. For the other drugs not associated with birth defects, the results were reassuring. Finally, whatever the impact that some ARV drugs may have on birth defects, it is surpassed by the major role of ART in the successful prevention of mother-to-child transmission of HIV.
Editors' Summary
AIDS and HIV infection are commonly treated with antiretroviral therapy (ART), a combination of individual drugs that work together to prevent the replication of the virus and further spread of the infection. Starting in the 1990s, studies have shown that ART of HIV-infected women can substantially reduce transmission of the virus to the child during pregnancy and birth. Based on these results, ART was subsequently recommended for pregnant women. Since 2004, ART has been standard therapy for pregnant women with HIV/AIDS in high-income countries, and it is now recommended for all HIV-infected women worldwide. Several different antiviral drug combinations have been shown to be effective and are used to prevent mother-to-infant transmission. However, as with any other drugs taken during pregnancy, there is concern that ART can harm the developing fetus.
Why Was This Study Done?
Several previous studies have assessed the risk that ART taken by a pregnant woman might pose to her developing fetus, but the results have been inconsistent. Animal studies suggested an elevated risk for some drugs but not others. While some clinical studies have reported increases in birth defects in children born to mothers on ART, others have shown no such increase.
The discrepancy may be due to differences between the populations included in the studies and the different methods used to diagnose birth defects. Additional large studies are therefore necessary to obtain more and better evidence on the potential harm of individual anti-HIV drugs to children exposed during pregnancy. So in this study, the authors conducted a large cohort study in France to assess the relationship between different antiretroviral drugs and specific birth defects.
What Did the Researchers Do and Find?
The researchers used a large national health database known as the French Perinatal Cohort that contains information on HIV-infected mothers who delivered infants in 90 centers throughout France. Pediatricians follow all children, whatever their HIV status, to two years of age, and health statistics are collected according to national health-care guidelines. Analyzing the records, the researchers estimated the rate at which birth defects occurred in children exposed to antiretroviral drugs during pregnancy.
The researchers included 13,124 children who were born alive between 1994 and 2010 and had been exposed to ART during pregnancy. Children exposed in the first trimester of pregnancy, and those exposed during the second or third trimester, were compared to a control group (children not exposed to the drug during the whole pregnancy). Using two birth defect classification systems (EUROCAT and MACDP—MACDP collects more details on disease classification than EUROCAT), the researchers sought to detect a link between the occurrence of birth defects and exposure to individual antiretroviral drugs.
They found a small increase in the risk for heart defects in children with exposure to zidovudine. They also found an association between efavirenz exposure and a small increase in neurological defects, but only when using the MACDP classification system. The authors found no association between any type of birth defect and the other antiretroviral drugs studied, including nevirapine (which acts similarly to efavirenz); tenofovir, stavudine, and abacavir (all three of which act similarly to zidovudine); and lopinavir and ritonavir (protease inhibitors).
What Do These Findings Mean?
These findings show that, overall, the risks of birth defects in children exposed to antiretroviral drugs in utero are small when considering the clear benefit of preventing mother-to-child transmission of HIV. However, where there are safe and effective alternatives, it might be appropriate to avoid use by pregnant women of those drugs that are associated with elevated risks of birth defects.
Worldwide, a large number of children are exposed to zidovudine in utero, and these results suggest (though cannot prove) that these children may be at a slightly higher risk of heart defects. Current World Health Organization (WHO) guidelines for the prevention of mother-to-child transmission no longer recommend zidovudine for first-line therapy.
The implications of the higher rate of neurological birth defects observed in infants exposed to efavirenz in the first trimester are less clear. The EUROCAT classification excludes minor neurological abnormalities without serious medical consequences, and so the WHO guidelines that stress the importance of careful clinical follow-up of children with exposure to efavirenz seem adequate, based on the findings of this study. The study is limited by the lack of data on the use of additional medication and alcohol and tobacco use, which could have a direct impact on fetal development, and by the absence of data on birth defects and antiretroviral drug exposure from low-income countries. However, the findings of this study overall are reassuring and suggest that apart from zidovudine and possibly efavirenz, other antiretroviral drugs are not associated with birth defects, and their use during pregnancy does not pose a risk to the infant.
Additional Information
Please access these websites via the online version of this summary at
This study is further discussed in a PLOS Medicine Perspective by Mofenson and Watts
The World Health Organization has a webpage on mother-to-child transmission of HIV
The US National Institutes of Health provides links to additional information on mother-to-child transmission of HIV
The Elizabeth Glaser Pediatric AIDS Foundation also has a webpage on mother-to-child transmission
The French Perinatal Cohort has a webpage describing the cohort and its main publications (in French, with a summary in English)
PMCID: PMC4004551  PMID: 24781315
16.  Top scoring pairs for feature selection in machine learning and applications to cancer outcome prediction 
BMC Bioinformatics  2011;12:375.
The widely used k top scoring pair (k-TSP) algorithm is a simple yet powerful parameter-free classifier. It owes its success in many cancer microarray datasets to an effective feature selection algorithm that is based on relative expression ordering of gene pairs. However, its general robustness does not extend to some difficult datasets, such as those involving cancer outcome prediction, which may be due to the relatively simple voting scheme used by the classifier. We believe that the performance can be enhanced by separating its effective feature selection component and combining it with a powerful classifier such as the support vector machine (SVM). More generally the top scoring pairs generated by the k-TSP ranking algorithm can be used as a dimensionally reduced subspace for other machine learning classifiers.
We developed an approach integrating the k-TSP ranking algorithm (TSP) with other machine learning methods, allowing combination of the computationally efficient, multivariate feature ranking of k-TSP with multivariate classifiers such as SVM. We evaluated this hybrid scheme (k-TSP+SVM) in a range of simulated datasets with known data structures. Compared with other feature selection methods, such as a univariate method similar to Fisher's discriminant criterion (Fisher) or a recursive feature elimination embedded in SVM (RFE), TSP becomes increasingly more effective as the informative genes become progressively more correlated, which is demonstrated both in terms of the classification performance and the ability to recover true informative genes. We also applied this hybrid scheme to four cancer prognosis datasets, in which k-TSP+SVM outperforms the k-TSP classifier in all datasets, and achieves either comparable or superior performance to that using SVM alone. In concurrence with what is observed in simulation, TSP appears to be a better feature selector than Fisher and RFE in some of the cancer datasets.
The k-TSP ranking algorithm can be used as a computationally efficient, multivariate filter method for feature selection in machine learning. SVM in combination with k-TSP ranking algorithm outperforms k-TSP and SVM alone in simulated datasets and in some cancer prognosis datasets. Simulation studies suggest that as a feature selector, it is better tuned to certain data characteristics, i.e. correlations among informative genes, which is potentially interesting as an alternative feature ranking method in pathway analysis.
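The pair ranking based on relative expression ordering can be sketched in Python as follows. This uses the commonly published form of the top scoring pair score, the absolute difference between classes in the frequency with which gene i is expressed below gene j; it is a minimal sketch, not the authors' implementation. The function names, the greedy choice of disjoint pairs, and the toy data are assumptions, and the secondary tie-breaking score used in published k-TSP variants is omitted.

```python
from itertools import combinations

def tsp_scores(samples, labels):
    """Score every gene pair (i, j) by |P(x_i < x_j | class 0) - P(x_i < x_j | class 1)|."""
    n_genes = len(samples[0])
    groups = {c: [s for s, y in zip(samples, labels) if y == c] for c in (0, 1)}
    scores = {}
    for i, j in combinations(range(n_genes), 2):
        freq = [sum(s[i] < s[j] for s in groups[c]) / len(groups[c]) for c in (0, 1)]
        scores[(i, j)] = abs(freq[0] - freq[1])
    return scores

def top_k_pairs(scores, k):
    """Greedily pick the k highest-scoring pairs that share no gene."""
    chosen, used = [], set()
    for (i, j), _ in sorted(scores.items(), key=lambda kv: -kv[1]):
        if i not in used and j not in used:
            chosen.append((i, j))
            used.update((i, j))
        if len(chosen) == k:
            break
    return chosen

# Toy expression data: genes 0 and 1 reverse their ordering between classes.
samples = [[1, 5, 3, 3.5], [2, 6, 4, 3], [1, 4, 2, 2.5],
           [7, 2, 3, 3.5], [8, 3, 4, 3], [6, 1, 2, 2.5]]
labels = [0, 0, 0, 1, 1, 1]
scores = tsp_scores(samples, labels)
print(top_k_pairs(scores, 1))
```

The genes appearing in the selected pairs could then be passed as a reduced feature set to another classifier such as an SVM, which is the hybrid scheme the abstract describes.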
PMCID: PMC3223741  PMID: 21939564
17.  Antiretroviral treatment in HIV-1 infected pediatric patients: focus on efavirenz 
Efavirenz is a non-nucleoside reverse transcriptase inhibitor (NNRTI), used for the treatment of human immunodeficiency virus (HIV)-1 infection. Approved by the US Food and Drug Administration in 1998, its indication was recently extended to include children as young as 3 months of age. The World Health Organization and many national guidelines consider efavirenz to be the preferred NNRTI for first-line treatment of children over the age of 3 years. Clinical outcomes of patients on three-drug antiretroviral regimens which include efavirenz are as good as or better than those for patients on all other currently approved HIV medications. Efavirenz is dosed once daily and has pediatric-friendly formulations. It is usually well tolerated, with central nervous system side effects being of greatest concern. Efavirenz increases the risk of neural tube defects in nonhuman primates and therefore its use during the first trimester of pregnancy is limited in some settings. With minimal interactions with antituberculous drugs, efavirenz is preferred for use among patients with HIV/tuberculosis coinfection. Efavirenz can be rendered inactive by a single point mutation in the reverse transcriptase enzyme. Newer NNRTI drugs such as etravirine, not yet approved for use in children under the age of 6 years, may maintain their activity following development of efavirenz resistance. This review highlights key points from the existing literature regarding the use of efavirenz in children and suggests directions for future investigation.
PMCID: PMC4412603  PMID: 25937791
efavirenz; human immunodeficiency virus; pediatrics; antiretroviral
18.  Identifying compositionally homogeneous and nonhomogeneous domains within the human genome using a novel segmentation algorithm 
Nucleic Acids Research  2010;38(15):e158.
It has been suggested that the mammalian genome is composed mainly of long compositionally homogeneous domains. Such domains are frequently identified using recursive segmentation algorithms based on the Jensen–Shannon divergence. However, a common difficulty with such methods is deciding when to halt the recursive partitioning and what criteria to use in deciding whether a detected boundary between two segments is real or not. We demonstrate that commonly used halting criteria are intrinsically biased, and propose IsoPlotter, a parameter-free segmentation algorithm that overcomes such biases by using a simple dynamic halting criterion and tests the homogeneity of the inferred domains. IsoPlotter was compared with an alternative segmentation algorithm, DJS, using two sets of simulated genomic sequences. Our results show that IsoPlotter was able to infer both long and short compositionally homogeneous domains with low GC content dispersion, whereas DJS failed to identify short compositionally homogeneous domains and sequences with low compositional dispersion. By segmenting the human genome with IsoPlotter, we found that one-third of the genome is composed of compositionally nonhomogeneous domains and the remaining is a mixture of many short compositionally homogeneous domains and relatively few long ones.
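One step of a recursive segmentation based on the Jensen–Shannon divergence can be sketched in Python as below. This is an illustrative sketch under assumptions, not IsoPlotter or DJS: it reduces the sequence to a two-letter GC/AT alphabet, uses the length-weighted form of the divergence, and leaves out the halting criterion entirely, which is the part the abstract identifies as difficult. The names `entropy` and `best_split` are hypothetical.

```python
import math

def entropy(gc, n):
    """Shannon entropy (bits) of a two-letter (GC vs AT) composition."""
    h = 0.0
    for count in (gc, n - gc):
        if count:
            p = count / n
            h -= p * math.log2(p)
    return h

def best_split(seq):
    """Return (position, divergence) maximizing the weighted Jensen-Shannon divergence."""
    is_gc = [c in "GC" for c in seq.upper()]
    n, total_gc = len(is_gc), sum(is_gc)
    h_whole = entropy(total_gc, n)
    best_pos, best_djs, left_gc = None, 0.0, 0
    for pos in range(1, n):
        left_gc += is_gc[pos - 1]  # GC count in seq[:pos]
        h_left = entropy(left_gc, pos)
        h_right = entropy(total_gc - left_gc, n - pos)
        # Divergence = entropy of the whole minus length-weighted segment entropies.
        djs = h_whole - (pos / n) * h_left - ((n - pos) / n) * h_right
        if djs > best_djs:
            best_pos, best_djs = pos, djs
    return best_pos, best_djs

pos, djs = best_split("ATATATATAT" + "GCGCGCGCGC")
print(pos, djs)
```

A full segmentation would apply `best_split` again to each resulting segment and stop when a halting criterion judges the candidate boundary not to be real.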
PMCID: PMC2926622  PMID: 20571085
19.  Fusing Dual-Event Datasets for Mycobacterium Tuberculosis Machine Learning Models and their Evaluation 
The search for new tuberculosis treatments continues as we need to find molecules that can act more quickly, be accommodated in multi-drug regimens, and overcome ever increasing levels of drug resistance. Multiple large scale phenotypic high-throughput screens against Mycobacterium tuberculosis (Mtb) have generated dose response data, enabling the generation of machine learning models. These models also incorporated cytotoxicity data and were recently validated with a large external dataset.
A cheminformatics data-fusion approach followed by Bayesian machine learning, Support Vector Machine or Recursive Partitioning model development (based on publicly available Mtb screening data) was used to compare individual datasets and subsequent combined models. A set of 1924 commercially available molecules with promising antitubercular activity (and lack of relative cytotoxicity to Vero cells) was used to evaluate the predictive nature of the models. We demonstrate that combining three datasets incorporating antitubercular and cytotoxicity data in Vero cells from our previous screens results in an external validation receiver operator characteristic (ROC) value of 0.83 (Bayesian or RP Forest). Models that do not have the highest five-fold cross validation ROC scores can outperform other models in a test set dependent manner.
We demonstrate with predictions for a recently published set of Mtb leads from GlaxoSmithKline that no single machine learning model may be enough to identify compounds of interest. Dataset fusion represents a further useful strategy for machine learning construction as illustrated with Mtb. Coverage of chemistry and Mtb target spaces may also be limiting factors for the whole-cell screening data generated to date.
PMCID: PMC3910492  PMID: 24144044
Bayesian models; Collaborative Drug Discovery Tuberculosis database; Dual-event models; Function class fingerprints; Lead optimization; Mycobacterium tuberculosis; Recursive partitioning; Support vector machine; Tuberculosis
20.  Efavirenz: a decade of clinical experience in the treatment of HIV 
Efavirenz, a non-nucleoside reverse transcriptase inhibitor, has been an important component of the treatment of HIV infection for 10 years and has contributed significantly to the evolution of highly active antiretroviral therapy (HAART). The efficacy of efavirenz has been established in numerous randomized trials and observational studies in HAART-naive patients, including those with advanced infection. In the ACTG A5142 study, efavirenz showed greater virological efficacy than the boosted protease inhibitor (PI), lopinavir. Efavirenz is more effective as a third agent than unboosted PIs or the nucleoside analogue abacavir. Some, but not all, studies have suggested that efavirenz (added to two nucleoside reverse transcriptase inhibitors) is more effective than nevirapine. Virological and immunological responses achieved with efavirenz-based HAART have been maintained for 7 years. Dosing convenience predicts adherence, and studies have demonstrated that patients can be switched from PI-based therapy to simplified, once-daily efavirenz-based regimens without losing virological control. The one-pill, once-daily formulation of efavirenz plus tenofovir and emtricitabine offers a particular advantage in this regard. Efavirenz also retains a role after failure of a first PI-based regimen. Efavirenz is generally well tolerated: rash and neuropsychiatric disturbances are the most notable adverse events. Neuropsychiatric disturbances generally develop early in treatment and they tend to resolve with continued administration, but they are persistent and troubling in a minority of patients. Efavirenz has less effect on plasma lipid profiles than some boosted PIs. Lipodystrophy can occur under treatment with efavirenz but it may be reduced if the concurrent use of thymidine analogues is avoided. Efavirenz resistance mutations (especially K103N) can be selected during long-term treatment, underscoring the importance of good adherence. 
Recent data have confirmed that efavirenz is a cost-effective option for first-line HAART. In light of these features, efavirenz retains a key role in HIV treatment strategies and is the first-line agent recommended in some guidelines.
PMCID: PMC2760464  PMID: 19767318
HAART; treatment simplification; adherence; resistance
21.  Censored quantile regression with recursive partitioning-based weights 
Biostatistics (Oxford, England)  2013;15(1):170-181.
Censored quantile regression provides a useful alternative to the Cox proportional hazards model for analyzing survival data. It directly models the conditional quantile of the survival time and hence is easy to interpret. Moreover, it relaxes the proportionality constraint on the hazard function associated with the popular Cox model and is natural for modeling heterogeneity of the data. Recently, Wang and Wang (2009. Locally weighted censored quantile regression. Journal of the American Statistical Association 103, 1117–1128) proposed a locally weighted censored quantile regression approach that allows for covariate-dependent censoring and is less restrictive than other censored quantile regression methods. However, their kernel smoothing-based weighting scheme requires all covariates to be continuous and encounters practical difficulty with even a moderate number of covariates. We propose a new weighting approach that uses recursive partitioning, e.g. survival trees, that offers greater flexibility in handling covariate-dependent censoring in moderately high dimensions and can incorporate both continuous and discrete covariates. We prove that this new weighting scheme leads to consistent estimation of the quantile regression coefficients and demonstrate its effectiveness via Monte Carlo simulations. We also illustrate the new method using a widely recognized data set from a clinical trial on primary biliary cirrhosis.
PMCID: PMC3862210  PMID: 23975800
Censored quantile regression; Recursive partitioning; Survival analysis; Survival ensembles
22.  Estimating HIV-1 Fitness Characteristics from Cross-Sectional Genotype Data 
PLoS Computational Biology  2014;10(11):e1003886.
Despite the success of highly active antiretroviral therapy (HAART) in the management of human immunodeficiency virus (HIV)-1 infection, virological failure due to drug resistance development remains a major challenge. Resistant mutants display reduced drug susceptibilities, but in the absence of drug, they generally have a lower fitness than the wild type, owing to a mutation-incurred cost. The interaction between these fitness costs and drug resistance dictates the appearance of mutants and influences viral suppression and therapeutic success. Assessing in vivo viral fitness is a challenging task and yet one that has significant clinical relevance. Here, we present a new computational modelling approach for estimating viral fitness that relies on common sparse cross-sectional clinical data by combining statistical approaches to learn drug-specific mutational pathways and resistance factors with viral dynamics models to represent the host-virus interaction and actions of drug mechanistically. We estimate in vivo fitness characteristics of mutant genotypes for two antiretroviral drugs, the reverse transcriptase inhibitor zidovudine (ZDV) and the protease inhibitor indinavir (IDV). Well-known features of HIV-1 fitness landscapes are recovered, both in the absence and presence of drugs. We quantify the complex interplay between fitness costs and resistance by computing selective advantages for different mutants. Our approach extends naturally to multiple drugs and we illustrate this by simulating a dual therapy with ZDV and IDV to assess therapy failure. The combined statistical and dynamical modelling approach may help in dissecting the effects of fitness costs and resistance with the ultimate aim of assisting the choice of salvage therapies after treatment failure.
Author Summary
Mutations conferring drug resistance represent major threats to the therapeutic success of highly active antiretroviral therapy (HAART) against human immunodeficiency virus (HIV)-1 infection. Viral mutants differ in their fitness and assessing viral fitness is a challenging task. In this article, we estimate drug-specific mutational pathways by learning from clinical data using statistical techniques and incorporate these into mathematical models of in vivo viral infection dynamics. This approach enables us to estimate mutant fitness characteristics. We illustrate our method by predicting fitness characteristics of mutant genotypes for two different antiretroviral therapies with the drugs zidovudine and indinavir. We recover several established features of mutant fitnesses and quantify fitness characteristics both in the absence and presence of drugs. Our model extends naturally to multiple drugs and we illustrate this by simulating a dual therapy with ZDV and IDV to assess therapy failure. Additionally, our modelling approach relies only on cross-sectional clinical data. We believe that such an approach is a highly valuable tool in assisting the choice of salvage therapies after treatment failure.
PMCID: PMC4222584  PMID: 25375675
23.  On the Adaptive Partition Approach to the Detection of Multiple Change-Points 
PLoS ONE  2011;6(5):e19754.
With an adaptive partition procedure, we can partition a “time course” into consecutive non-overlapped intervals such that the population means/proportions of the observations in two adjacent intervals are significantly different at a given significance level. However, the widely used recursive combination or partition procedures do not guarantee a global optimization. We propose a modified dynamic programming algorithm to achieve a global optimization. Our method can provide consistent estimation results. In a comprehensive simulation study, our method shows an improved performance when it is compared to the recursive combination/partition procedures. In practice, the significance level can be determined based on a cross-validation procedure. As an application, we consider the well-known Pima Indian Diabetes data. We explore the relationship among the diabetes risk and several important variables including the plasma glucose concentration, body mass index and age.
PMCID: PMC3101215  PMID: 21629694
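To show why dynamic programming can reach a global optimum where greedy recursive splitting cannot, here is a minimal sketch of the generic textbook recursion for partitioning a series into a fixed number of consecutive intervals. It is not the authors' modified algorithm: it assumes a within-interval sum-of-squares cost and a segment count `k` chosen by the user, whereas the abstract describes a criterion based on significance tests between adjacent intervals. All names are hypothetical.

```python
def segment_cost(prefix, prefix_sq, i, j):
    """Sum of squared deviations from the mean for x[i:j], via prefix sums."""
    n = j - i
    s = prefix[j] - prefix[i]
    return (prefix_sq[j] - prefix_sq[i]) - s * s / n

def optimal_partition(x, k):
    """Globally optimal split of x into k consecutive intervals.

    Returns (change_points, total_cost), where change_points are the
    start indices of intervals 2..k.
    """
    n = len(x)
    prefix = [0.0] * (n + 1)
    prefix_sq = [0.0] * (n + 1)
    for t, v in enumerate(x):
        prefix[t + 1] = prefix[t] + v
        prefix_sq[t + 1] = prefix_sq[t] + v * v

    inf = float("inf")
    # best[m][j]: minimal cost of splitting x[:j] into m intervals.
    best = [[inf] * (n + 1) for _ in range(k + 1)]
    back = [[0] * (n + 1) for _ in range(k + 1)]
    best[0][0] = 0.0
    for m in range(1, k + 1):
        for j in range(m, n + 1):
            for i in range(m - 1, j):
                cost = best[m - 1][i] + segment_cost(prefix, prefix_sq, i, j)
                if cost < best[m][j]:
                    best[m][j] = cost
                    back[m][j] = i

    # Walk the back-pointers from the end to recover interval starts.
    starts = []
    j = n
    for m in range(k, 0, -1):
        j = back[m][j]
        starts.append(j)
    return sorted(starts[:-1]), best[k][n]

if __name__ == "__main__":
    print(optimal_partition([0, 0, 0, 5, 5, 5, 10, 10], 3))
```

The triple loop examines every admissible last interval for every prefix, so the result is optimal over all partitions into `k` intervals, at a cost of roughly `k * n**2` evaluations.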
24.  Evaluating uses of data mining techniques in propensity score estimation: a simulation study† 
In propensity score modeling, it is a standard practice to optimize the prediction of exposure status based on the covariate information. In a simulation study, we examined in what situations analyses based on various types of exposure propensity score (EPS) models using data mining techniques such as recursive partitioning (RP) and neural networks (NN) produce unbiased and/or efficient results.
We simulated data for a hypothetical cohort study (n=2000) with a binary exposure/outcome and 10 binary/continuous covariates with seven scenarios differing by non-linear and/or non-additive associations between exposure and covariates. EPS models used logistic regression (LR) (all possible main effects), RP1 (without pruning), RP2 (with pruning), and NN. We calculated c-statistics (C), standard errors (SE), and bias of exposure-effect estimates from outcome models for the PS-matched dataset.
Data mining techniques yielded higher C than LR (mean: NN, 0.86; RP1, 0.79; RP2, 0.72; and LR, 0.76). SE tended to be greater in models with higher C. Overall bias was small for each strategy, although NN estimates tended to be the least biased. C was not correlated with the magnitude of bias (correlation coefficient [COR]=−0.3, p=0.1) but increased SE (COR=0.7, p<0.001).
Effect estimates from EPS models by simple LR were generally robust. NN models generally provided the least numerically biased estimates. C was not associated with the magnitude of bias but was associated with increased SE.
PMCID: PMC2905676  PMID: 18311848
propensity score; logistic regression; neural networks; recursive partitioning
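For readers unfamiliar with the c-statistic used above to compare propensity score models, the following minimal sketch computes it as the proportion of exposed/unexposed pairs in which the exposed subject has the higher estimated propensity score, counting ties as one half. The function name and the toy data are illustrative assumptions and are not taken from the study.

```python
def c_statistic(scores, exposed):
    """Concordance (c) statistic of propensity scores against exposure status.

    scores:  estimated propensity scores, one per subject
    exposed: matching truthy/falsy exposure indicators
    """
    pos = [s for s, e in zip(scores, exposed) if e]
    neg = [s for s, e in zip(scores, exposed) if not e]
    if not pos or not neg:
        raise ValueError("need at least one exposed and one unexposed subject")
    concordant = 0.0
    for p in pos:
        for q in neg:
            if p > q:
                concordant += 1.0   # exposed subject ranked higher
            elif p == q:
                concordant += 0.5   # tie counts as half
    return concordant / (len(pos) * len(neg))

if __name__ == "__main__":
    # Hypothetical scores for four subjects, two of them exposed.
    print(c_statistic([0.9, 0.8, 0.3, 0.2], [1, 0, 1, 0]))
```

A value of 0.5 corresponds to no discrimination between exposed and unexposed subjects, and 1.0 to perfect separation.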
25.  Medical and health economic assessment of radiosurgery for the treatment of brain metastasis 
Radiotherapy for patients suffering from malignant neoplasms has developed greatly during the past decades. Stereotactic radiosurgery (SRS) is one important radiotherapeutic option which is defined by a single and highly focussed application of radiation during a specified time interval. One of its important indications is the treatment of brain metastases.
The objective of this HTA is to summarise the current literature concerning the treatment of brain metastasis and to compare SRS as a single or additional treatment option to alternative treatment options with regard to their medical effectiveness/efficacy, safety and cost-effectiveness as well as their ethical, social and legal implications.
A structured search and hand search of identified literature are performed from January 2002 through August 2007 to identify relevant publications published in English or German. Studies targeting patients with single or multiple brain metastases are included. The methodological quality of included studies is assessed according to quality criteria, based on the criteria of evidence based medicine.
Of 1,495 publications 15 medical studies meet the inclusion criteria. Overall study quality is limited and with the exception of two randomized controlled trials (RCTs) and two meta-analyses only historical cohort studies are identified. Reported outcome measures are highly variable between studies. Studies with high methodological quality provide evidence that whole-brain radiotherapy (WBRT) in addition to SRS and SRS in addition to WBRT are associated with improved local tumour control rates and neurological function. However, only in patients with single brain metastasis, RPA-class 1 (RPA = Recursive partitioning analysis) and certain primary tumour entities is this combination of SRS and WBRT associated with superior survival compared to WBRT alone. Studies report no significant differences in adverse events between treatment groups. Methodologically less rigorous studies provide no conclusive evidence with regard to medical effectiveness and safety, comparing SRS to WBRT, neurosurgery (NS) or hypofractionated radiotherapy (HCSRT). The quality of life is not investigated in any of the studies.
Within the searched databases a total of 320 economic publications are identified. Five publications are eligible for this report. The five reports are of quite variable quality. Concerning the economic efficiency of alternative equipment, while assuming equal effectiveness, the calculations show that economic efficiency depends to a large extent on the number of patients treated. If the two alternative types of equipment are used solely for SRS, the Gamma Knife might be more cost-efficient. Otherwise an adapted linear accelerator is most likely to be beneficial because of its flexibility. One Health Technology Assessment (HTA) states that the costs for a Gamma Knife and a dedicated linear accelerator are comparable, while an adapted version is cheaper.
No reports concerning ethical, legal and social aspects are identified.
Overall, the quantity and quality of identified studies are limited. However, the identified studies indicate that the prognosis of patients with brain metastases remains limited despite highly developed, modern treatment regimes. Conclusive evidence with regard to the effectiveness of identified interventions is only available for the combined treatment of SRS and WBRT compared to SRS or WBRT alone. Furthermore, there is insufficient evidence to compare SRS with WBRT, NS or HCSRT.
The efficiency of the different types of equipment depends to a great extent on the number and the indications of the patients treated. If dedicated systems are used to their full capacity, there is some evidence for superior cost-effectiveness. If more treatment flexibility is required, adapted systems seem to be advantageous. However, equal treatment effectiveness is a necessary assumption for these conclusions. The need for treatment precision can influence the purchase decision. No reports concerning more recent therapeutic alternatives are currently available.
Combination of SRS and WBRT is associated with improved local tumour control and neurological function compared to SRS or WBRT alone. However, only for patients with single metastasis there is strong evidence that this results in improved survival compared to WBRT alone. Methodologically rigorous studies are warranted to investigate SRS compared to WBRT and NS and to investigate the quality of life in patients undergoing these treatment regimes.
Concerning the type of equipment used, economic efficiency depends to a great extent on the capacity at which the system can be used. Dedicated systems might be favourable for a high number of patients, while lower patient counts probably favour adapted systems with their superior treatment flexibility. Using the equipment at its full capacity may result in a limited number of machines, which in turn may give rise to the question of equal and easy access to this technology. Studies focusing on the comparative effectiveness and cost-effectiveness of different treatment options and their combinations, especially for the German setting, are warranted.
PMCID: PMC3011285  PMID: 21289890
brain metastasis; radiosurgery

