Antiretroviral drug resistance is a major obstacle to the successful treatment of human immunodeficiency virus type 1 (HIV-1) infection. A large number of retrospective and prospective studies have demonstrated that the presence of drug resistance before starting a treatment regimen is an independent predictor of the success of that regimen (1). As a result, several expert panels have recommended that HIV reverse transcriptase (RT) and protease sequencing be done to help physicians select antiretroviral drugs for their patients, and genotypic resistance testing has been part of routine clinical care for the past several years (2).
The HIV RT and protease sequence database (HIVRT&PrDB) is intended to assist scientists designing new HIV-1 drugs, clinical investigators studying HIV-1 drug resistance and clinicians using genotypic HIV-1 drug resistance tests (3). The database links sequence changes in the molecular targets of HIV-1 therapy to other forms of data including treatment history and phenotypic (drug susceptibility) data. Data on the virological response (plasma HIV-1 RNA levels) to a new treatment regimen have been added and will soon be accessible over the web.
The HIVRT&PrDB is a relational database with 19 normalized (nonredundant) core tables, 10 look-up tables and about 20 derived tables. The database is implemented using MySQL on a Linux platform. There are several major hierarchical relationships linking key entities in the database: (i) patient → treatment history (list of drug regimens and their start and stop dates); (ii) patient → isolate (clinical) → sequence → drug susceptibility result; (iii) isolate (laboratory) → drug susceptibility result; and (iv) patient → plasma HIV-1 RNA level. Sequences are stored in a virtual alignment with the subtype B consensus sequence; thus amino acid sequences are also represented as lists of differences from the consensus sequence.
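The idea of representing a sequence as a list of differences from a consensus can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names, the 'E4Q'-style mutation notation and the toy ten-residue consensus are hypothetical and are not taken from the database's implementation.

```python
from typing import List


def mutations_from_consensus(consensus: str, aligned: str, start: int = 1) -> List[str]:
    """Return differences from the consensus as strings such as 'E4Q'.

    Assumes both sequences are already aligned and of equal length.
    Positions where the isolate has '.' or '-' (not sequenced or gap) are skipped.
    """
    if len(consensus) != len(aligned):
        raise ValueError("sequences must be aligned to the same length")
    diffs = []
    for offset, (ref, obs) in enumerate(zip(consensus, aligned)):
        if obs in ".-" or obs == ref:
            continue
        diffs.append(f"{ref}{start + offset}{obs}")
    return diffs


def apply_mutations(consensus: str, mutations: List[str], start: int = 1) -> str:
    """Rebuild a full sequence from the consensus and a mutation list."""
    residues = list(consensus)
    for mutation in mutations:
        ref, position, obs = mutation[0], int(mutation[1:-1]), mutation[-1]
        index = position - start
        if residues[index] != ref:
            raise ValueError(f"{mutation} does not match the consensus at position {position}")
        residues[index] = obs
    return "".join(residues)


# Toy data (hypothetical, not real HIV-1 sequence).
toy_consensus = "ACDEFGHIKL"
toy_isolate = "ACDQFGHIRL"
toy_mutations = mutations_from_consensus(toy_consensus, toy_isolate)
```

Storing only the differences keeps each record short, and the full sequence can be recovered with `apply_mutations` whenever the isolate has no unsequenced positions.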
The HIVRT&PrDB contains data from more than 420 published papers. Sequences are available on HIV-1 isolates from more than 7000 individuals and from about 500 laboratory isolates containing mutations generated by virus passage or site-directed mutagenesis. About 20 000 drug susceptibility results from tests performed on more than 2000 virus isolates are available. Figures 1 and 2 contain composite alignments showing 193 protease and 395 RT mutations present at a frequency of >0.1% in HIV-1 isolates from treated and untreated persons. Figure 3 shows a summary of the drug susceptibility results available on each of the 16 approved antiretroviral drugs.
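A frequency cut-off such as the >0.1% threshold used for the composite alignments could be applied roughly as in the Python sketch below. The function name, the 'E4Q'-style mutation strings and the choice of denominator (all isolates, rather than only those sequenced at a given position) are assumptions made for illustration; the procedure actually used to build the alignments is not described here.

```python
from collections import Counter
from typing import Dict, Iterable, List


def frequent_mutations(
    mutation_lists: Iterable[List[str]], min_frequency: float = 0.001
) -> Dict[str, float]:
    """Return mutations whose frequency across isolates exceeds min_frequency.

    Each inner list holds the mutations of one isolate. A mutation is
    counted at most once per isolate. The default of 0.001 corresponds to 0.1%.
    """
    isolates = list(mutation_lists)
    if not isolates:
        return {}
    counts: Counter = Counter()
    for mutations in isolates:
        counts.update(set(mutations))
    total = len(isolates)
    return {
        mutation: count / total
        for mutation, count in counts.items()
        if count / total > min_frequency
    }


# Toy input (hypothetical): four isolates, two of them without mutations.
toy_isolates = [["E4Q"], ["E4Q", "K9R"], [], []]
toy_result = frequent_mutations(toy_isolates, min_frequency=0.3)
```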
Figure 1. Composite sequence alignment of HIV-1 protease, positions 1–99. This figure resulted from a query that retrieved all HIV-1 sequences in the database including those belonging to different subtypes and those obtained from treated and untreated ...
Figure 2. Composite sequence alignment of HIV-1 RT, positions 1–240. Although the RT enzyme has 560 positions, nearly all drug-resistance mutations are found between positions 40 and 240. This figure resulted from a query that retrieved all HIV-1 sequences ...
Figure 3. Phenotypic drug susceptibility data on about 2000 HIV-1 isolates. Drug susceptibility to each of the 16 FDA-approved drugs is shown. The first column contains the nucleoside/nucleotide RT inhibitors: 3TC (lamivudine), ABC (abacavir), AZT (zidovudine), ...
The database allows users to retrieve sets of sequences meeting specific criteria. Commonly submitted queries include: (i) the retrieval of sequences of HIV-1 isolates from patients receiving a specific drug treatment, (ii) the retrieval of sequences of HIV-1 isolates containing mutations at specific protease or RT positions, (iii) the retrieval of drug susceptibility data on HIV-1 isolates containing specific mutations or combinations of mutations, and (iv) the retrieval of a summary of the data in any particular reference.
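A query of type (iii) can be sketched against a small relational schema. The example below is a hedged illustration, not the database's actual design: it uses Python's built-in sqlite3 module in place of MySQL so that it runs without a server, and every table name, column name and data row is invented for the demonstration (the fold-change values are not real measurements).

```python
import sqlite3
from typing import List, Tuple


def build_demo_db() -> sqlite3.Connection:
    """Create an in-memory database with a hypothetical, simplified schema."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE patient (patient_id INTEGER PRIMARY KEY);
        CREATE TABLE isolate (
            isolate_id INTEGER PRIMARY KEY,
            patient_id INTEGER REFERENCES patient(patient_id),
            gene TEXT NOT NULL
        );
        CREATE TABLE mutation (
            isolate_id INTEGER REFERENCES isolate(isolate_id),
            position INTEGER NOT NULL,
            amino_acid TEXT NOT NULL
        );
        CREATE TABLE susceptibility (
            isolate_id INTEGER REFERENCES isolate(isolate_id),
            drug TEXT NOT NULL,
            fold_change REAL NOT NULL
        );
        """
    )
    # Invented rows for illustration only.
    conn.executemany("INSERT INTO patient VALUES (?)", [(1,), (2,)])
    conn.executemany(
        "INSERT INTO isolate VALUES (?, ?, ?)", [(10, 1, "RT"), (11, 2, "RT")]
    )
    conn.executemany(
        "INSERT INTO mutation VALUES (?, ?, ?)", [(10, 184, "V"), (11, 41, "L")]
    )
    conn.executemany(
        "INSERT INTO susceptibility VALUES (?, ?, ?)",
        [(10, "3TC", 200.0), (11, "3TC", 1.2)],
    )
    return conn


def susceptibility_for_mutation(
    conn: sqlite3.Connection, gene: str, position: int, amino_acid: str
) -> List[Tuple[int, str, float]]:
    """Return (isolate_id, drug, fold_change) for isolates carrying a mutation."""
    query = """
        SELECT i.isolate_id, s.drug, s.fold_change
        FROM isolate AS i
        JOIN mutation AS m ON m.isolate_id = i.isolate_id
        JOIN susceptibility AS s ON s.isolate_id = i.isolate_id
        WHERE i.gene = ? AND m.position = ? AND m.amino_acid = ?
        ORDER BY i.isolate_id, s.drug
    """
    return conn.execute(query, (gene, position, amino_acid)).fetchall()


demo_conn = build_demo_db()
demo_rows = susceptibility_for_mutation(demo_conn, "RT", 184, "V")
```

Parameterized placeholders (`?`) are used so that user-supplied positions and residues are never spliced directly into the SQL text.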
Each query initially returns data in the form of a table, and each record in the returned table contains eight or more columns of data. The data returned include: (i) hyperlinks to the MEDLINE abstract and GenBank record, (ii) a list of mutations in the sequence, (iii) a classification of the sequence by patient and time point, (iv) drug treatment history, and (v) additional data depending upon the query (e.g. drug susceptibility results, phylogenetic data, technical data about virus isolation and sequencing). Together with this table, users are given the option of downloading or viewing the raw sequence data in a variety of formats.