The Department of Health and Human Services (DHHS) and the WHO [10] have guidelines on which ARV regimens to use for initial and second-line therapy of HIV-1-infected patients. However, many clinical scenarios are not addressed by these guidelines, including the management of (i) patients who began ARV therapy with suboptimal regimens, a problem particularly common in the U.S. and Europe in the past and in many middle-income countries, where previously available ARVs were considerably less potent and drugs were used as they became available rather than as part of a national treatment program; (ii) patients with transmitted resistance; and (iii) heavily treated patients and patients whose viruses have complex patterns of drug-resistance mutations.
The difficulty of recommending therapy for such patients has motivated researchers to study how pre-treatment characteristics influence the response to a change in ARV therapy. Indeed, many studies have correlated the presence of baseline ARV-resistance mutations with the response to a new ARV regimen while accounting for essential covariates such as past treatment history, baseline plasma HIV-1 RNA levels, and baseline CD4+ counts. Such studies, however, often differ in their inclusion criteria (i.e. past ARV treatments, timing of plasma HIV-1 RNA levels and genotypic resistance data), salvage therapy requirements [11-18], and definition of virological response. For example, some studies define virological response by the extent of reduction in plasma HIV-1 RNA levels, whereas others define it as the suppression of plasma HIV-1 RNA levels below the limits of quantification [19, 20]. Most studies have examined plasma HIV-1 RNA levels at fixed time points, whereas an approach based on time to virological failure has recently been proposed [21].
A standardized representation of data such as that found in the TCE XML Schema makes it possible to apply uniform inclusion and virological endpoint criteria across TCEs from different studies. The combined data can be analyzed for three purposes: (i) to reproduce prior results, (ii) to apply and test new analytic methods, and (iii) to generate or test new hypotheses. Many tools are available to validate and transform the contents of XML documents that conform to an XML Schema. XML Schemas therefore ensure that the data are represented consistently and can be readily integrated into different applications.
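As a rough illustration of how uniform criteria might be applied once records share a common XML representation, the following Python sketch filters a few made-up TCE records and evaluates two of the endpoint definitions discussed above. The element and attribute names (`TCE`, `BaselineRNA`, `FollowUpRNA`, `week`, `copies`), the sample values, and the thresholds (a 12-week baseline window, a 16 to 32 week follow-up window, 50 copies/mL, a 1.0 log10 reduction) are all assumptions chosen for the example; they are not taken from the actual TCE XML Schema or from any of the cited studies.

```python
import math
import xml.etree.ElementTree as ET

# Hypothetical records: names and values are illustrative assumptions,
# NOT the element names defined by the actual TCE XML Schema.
SAMPLE_XML = """
<TCEs>
  <TCE id="A1">
    <BaselineRNA week="-2" copies="52000"/>
    <FollowUpRNA week="24" copies="40"/>
  </TCE>
  <TCE id="B7">
    <BaselineRNA week="-20" copies="8000"/>
    <FollowUpRNA week="24" copies="300"/>
  </TCE>
  <TCE id="C3">
    <BaselineRNA week="-1" copies="100000"/>
    <FollowUpRNA week="26" copies="900"/>
  </TCE>
</TCEs>
"""


def load_tces(xml_text):
    """Parse the hypothetical XML into a list of plain dictionaries."""
    root = ET.fromstring(xml_text.strip())
    records = []
    for tce in root.iter("TCE"):
        base = tce.find("BaselineRNA")
        follow = tce.find("FollowUpRNA")
        records.append({
            "id": tce.get("id"),
            # Weeks are relative to the treatment change (week 0).
            "baseline_week": int(base.get("week")),
            "baseline_copies": float(base.get("copies")),
            "followup_week": int(follow.get("week")),
            "followup_copies": float(follow.get("copies")),
        })
    return records


def meets_inclusion(rec, max_weeks_before=12, window=(16, 32)):
    """Uniform inclusion rule (assumed): recent baseline, follow-up in window."""
    recent_baseline = -max_weeks_before <= rec["baseline_week"] <= 0
    in_window = window[0] <= rec["followup_week"] <= window[1]
    return recent_baseline and in_window


def suppressed(rec, limit=50.0):
    """Endpoint 1: follow-up RNA below an assumed quantification limit."""
    return rec["followup_copies"] < limit


def responded_by_reduction(rec, min_drop=1.0):
    """Endpoint 2: log10 RNA reduction of at least min_drop (assumed cutoff)."""
    drop = math.log10(rec["baseline_copies"]) - math.log10(rec["followup_copies"])
    return drop >= min_drop


if __name__ == "__main__":
    included = [r for r in load_tces(SAMPLE_XML) if meets_inclusion(r)]
    for rec in included:
        print(rec["id"], suppressed(rec), responded_by_reduction(rec))
```

With these made-up values, record B7 is excluded because its baseline measurement falls outside the assumed window, and record C3 counts as a responder under the reduction-based definition but not under the suppression-based one. The same disagreement can arise between studies that use different endpoint definitions.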
Studies of TCEs typically do not analyze the complete treatment history of a patient. Rather, these studies parameterize essential features of the patient’s past ARV exposures. This condensed treatment history combined with the response to a new therapy was called a “treatment-change episode” (TCE) by Larder et al. of the Resistance Database Initiative (RDI) [22]. The TCE XML Schema is therefore much less complex than the relational database implemented by the HIV Cohort Data Exchange Protocol (HICDEP) [23]. Moreover, the fact that the TCE XML Schema does not require demographic or epidemiologic data and allows relative (rather than absolute) dates makes it impossible to identify individual patients or clinics [24].
The TCE XML suite comprises four medical informatics tools: (1) the XML Schema; (2) the TCE Viewer, an online program that creates a graphical representation of data in the XML document; (3) the TCE Repository, which provides the proof-of-concept that the TCE XML Schema can be used to exchange data from multiple clinics; and (4) the TCE Finder, a search engine to identify TCEs meeting specific criteria. The TCE XML suite is useful for comparing genotypic resistance interpretations and for generating and testing hypotheses. It should therefore be distinguished from ongoing projects designed to optimize therapy for individual patients such as RDI’s HIV Treatment Response Prediction System (TREPS) [22] and Genafor’s Theo [25]. However, because the data in the TCE Repository are publicly available, they can be used to increase the training sets for machine learning systems such as Theo and TREPS.
Despite its nascent stage, the TCE Repository has already been shown to be useful for comparing different genotypic resistance test interpretation systems. Specifically, 734 of the TCEs were previously used in a study comparing the predictive value of three algorithms [26]. Without such a repository, comparisons of genotypic resistance interpretation systems can be performed only with proprietary datasets. In addition, we demonstrate here that the TCE Repository makes it possible to generate novel hypotheses that may be relevant to salvage therapy in resource-limited regions. Indeed, at least one other research team has proposed the use of three rather than two NRTIs for certain salvage therapy scenarios in regions without access to newer ARV classes [27]. However, considering the large number of covariates associated with treatment response, very large numbers of TCEs will be required to adequately test novel hypotheses.
Although the XML Schema and Viewer are useful to individual research groups and collaborations, the usefulness of the Finder and Repository depends on the willingness of researchers to contribute data to this effort. Therefore, we have collaborated with four research groups to demonstrate the utility of the XML suite of applications for collaboration between multiple clinics. We are continuing to work with clinics in North America, Spain, and the EuResist Network Database to expand the Repository with TCEs that are relevant to resource-limited regions (i.e. the regimens are confined primarily to NRTIs, NNRTIs, and PIs) and with TCEs involving the use of more recently approved ARV classes, including the integrase inhibitors and maraviroc.