We have made use of sequence data obtained for resistance genotyping at the largest clinical centre treating patients with HIV in London to reconstruct the transmission network in this population. In contrast to previous studies based on sparsely sampled populations, by examining all pairwise comparisons among sequences from more than 2,000 patients, we were able to identify a subset of 402 subtype B pairs with a genetic distance of 5% or less. Detailed phylogenetic analysis identified a number of large clusters among these patients, which together comprised 25% of this group. “Dated phylogeny” analysis of these clusters [28] revealed an episodic pattern, with many of the transmissions within them occurring within a short space of time.
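The pairwise screening step described above can be illustrated with a minimal sketch. It assumes pre-aligned sequences and uses a simple uncorrected p-distance (proportion of differing sites) as a stand-in; the distance measure actually used in the study is not specified in this passage and may well have been model-based. The function names `p_distance`, `linked_pairs`, and `link_clusters` are hypothetical, and the linked groups here are plain connected components, not the phylogenetically defined clusters of the study.

```python
from itertools import combinations


def p_distance(a, b):
    """Proportion of differing sites between two aligned sequences.

    Sites where either sequence has a gap or ambiguity code are skipped.
    This is an uncorrected distance, used here only as an illustration.
    """
    compared = diffs = 0
    for x, y in zip(a.upper(), b.upper()):
        if x in "ACGT" and y in "ACGT":
            compared += 1
            diffs += x != y
    return diffs / compared if compared else float("nan")


def linked_pairs(seqs, threshold=0.05):
    """All pairs of sequence IDs whose distance is at or below the threshold."""
    return [
        (i, j)
        for i, j in combinations(sorted(seqs), 2)
        if p_distance(seqs[i], seqs[j]) <= threshold
    ]


def link_clusters(pairs):
    """Group linked IDs into connected components with a small union-find."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for a, b in pairs:
        parent[find(a)] = find(b)

    groups = {}
    for x in parent:
        groups.setdefault(find(x), set()).add(x)
    # Largest components first, ties broken alphabetically
    return sorted((sorted(g) for g in groups.values()), key=lambda g: (-len(g), g))


# Toy alignment (made-up sequences, 20 sites each)
toy = {
    "p1": "ACGTACGTACGTACGTACGT",
    "p2": "ACGTACGTACGTACGTACGA",  # 1 difference from p1
    "p3": "TCGTACGTACGTACGTACGA",  # 1 difference from p2, 2 from p1
    "p4": "TTTTTTTTTTTTTTTTTTTT",  # unrelated
}
pairs = linked_pairs(toy)
components = link_clusters(pairs)
```

In the toy data, p1 and p3 are not within the threshold of each other but fall in the same component through p2, which is why a distance screen of this kind only nominates candidates for the more detailed phylogenetic analysis.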
New HIV diagnoses among MSM have risen steadily in the United Kingdom for almost 10 years, and are now approaching twice the number recorded annually in the mid to late 1990s [32]. Efforts to characterize the changes in this population that have been responsible, including the National Survey of Sexual Attitudes and Lifestyles (NATSAL; [33]), have provided substantial amounts of information on current risk behaviour of this population. Unprotected anal intercourse with one or more partners in the past year was reported by between 32% and 45% of MSM [34] recruited in different surveys; approximately 18% of respondents in another study reported unprotected anal intercourse with individuals of unknown HIV status [35], and 3.2% of respondents reported unprotected anal intercourse with five or more partners in the previous year [34]. There was also a notable and significant increase in prevalence of risk activities between the 1990 and 2000 NATSAL surveys [36].
We have used a database of HIV sequences collected in the course of routine clinical treatment from 2,126 patients to characterize the relationships between viruses infecting different individuals attending a large clinic in London. The depth of sampling meant this study was much more informative about the transmission patterns than previous studies [15]. We identified 402 individuals whose virus had a close relationship with at least one other, and using two different approaches showed that almost 90 of these individuals could be linked in clusters of 10 or more individuals. Using information on the date each sample was taken, we have reconstructed dated phylogenies that revealed that at least 25% of transmissions among these individuals occurred within a few months of their infection. The tightness of clustering is striking, with most of the linked transmissions occurring within periods of at most 3–4 y. The closeness of these events is inevitably underestimated as a result of incomplete data (i.e., intervening individuals not sampled), so the actual average time between transmissions in these clusters is likely to be smaller.
Extrapolation of the conclusions from this clinic population more widely depends on the degree to which patients attending the Chelsea and Westminster clinic reflect the UK MSM population with HIV as a whole. This clinic is the largest HIV clinic in the UK and has contributed 6,551 (24%) out of a (2006) total of 26,811 patients to the UK CHIC study [37], comprising 29% of all the patients from London. While the location of its primary catchment area within central London suggested it would be likely to be representative, we have recently been able to extend these studies to the entire UK CHIC patient population. Preliminary analysis of HIV genotypes from 8,088 patients that have been investigated for clustering using the genetic distance approach revealed 2,150 individuals with a link to at least one other patient. Among these, several large clusters have been observed, with Chelsea and Westminster patients distributed among patients from other clinics (data not shown). We are therefore confident the pattern described in the current study reflects that of the wider UK population of MSM with HIV.
Other possible limitations of the study should be recognised. Although the use of a phylogenetic definition of clusters avoids the necessity to select an arbitrary distance value, there are clear restrictions on what can be concluded from the phylogeny. The similarity between many of the sequences within these clusters is frequently so high that there is little power to estimate the internal order of transmissions with any confidence. Neither is it possible, from the phylogeny alone, to determine direction of transmission, or how many individuals in the cluster were transmitters. Thus, cluster E could have been generated by transmissions from a minimum of three individuals transmitting to six, one, and two others, respectively; or alternatively, from a maximum of seven, where one transmits to three others and all others transmit once. The former situation would be expected under a more skewed distribution of partner numbers, which has been suggested by several studies [8]. These results therefore complement rather than replace studies that would define parameters such as partner number.
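The two extreme scenarios given for cluster E can be checked with a few lines of arithmetic. The sketch below is only an illustration: it takes the onward-transmission counts quoted in the text, and the scenario labels and the `summarize` helper are hypothetical names introduced here. Both scenarios account for the same number of transmission events and differ only in how those events are shared among transmitters.

```python
def summarize(onward):
    """Summarize a scenario given onward transmissions per transmitter."""
    total = sum(onward)
    return {
        "transmitters": len(onward),
        "transmissions": total,
        # Share of all events attributable to the most active transmitter
        "top_share": max(onward) / total,
    }


# Counts as quoted in the text for cluster E
skewed = [6, 1, 2]        # minimum: three transmitters
even = [3] + [1] * 6      # maximum: seven transmitters

skewed_summary = summarize(skewed)
even_summary = summarize(even)

for label, s in (("skewed", skewed_summary), ("even", even_summary)):
    print(f"{label}: {s['transmitters']} transmitters, "
          f"{s['transmissions']} transmissions, "
          f"top share {s['top_share']:.2f}")
```

Under the skewed scenario a single individual accounts for two-thirds of the events, compared with one-third under the even scenario, which is the contrast the phylogeny alone cannot resolve.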
One of the possible consequences of rapid transmission within clusters is a local increase in the transmission of drug-resistant strains [39]. Whether this had occurred was examined by maximum likelihood reconstruction of ancestral states at all sites associated with drug resistance in the six large clusters. In no case was a drug-resistant virus identified at the root of these clusters (unpublished data), which may have reflected the time at which these particular events were occurring (B). The distribution of mutations at the tips of the trees also does not suggest extensive transmission of resistance-associated mutations. There were only two cases where nearest-neighbour patients both had mutations, and the mutations differed between the pair in each case, suggesting they were all examples of secondary rather than primary resistance.
As the time-dependent phylogenies are calibrated in calendar years, we are able to estimate when most of the transmissions in each cluster occurred. For cluster A, the largest cluster comprising 30 individuals, most of the transmissions between them occurred within about 6 y preceding 1999 (B); for cluster C, the transmissions were tightly restricted to 1995–1997, with many estimated to have occurred in 1996; for cluster B, however, the transmissions mostly occurred between 1991 and 1995. Many of the transmission intervals (with the exception of cluster B) lie within or closely precede the period during which HIV diagnoses have been increasing. Given an expected delay from infection to diagnosis of 3–5 y, we could infer that these clustered transmissions contributed to the increase in prevalence that occurred in London and the United Kingdom in that time [32]. From these results we can also say that many of these transmission clusters were initiated relatively early in the highly active antiretroviral therapy (HAART) era and before transmitted ARV resistance became a significant problem in the UK [40].
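The inference from transmission dates to diagnosis dates rests on simple interval arithmetic, sketched below. The cluster intervals are read from the text (cluster A is approximated here as 1993 to 1999, an assumption based on "about 6 y preceding 1999"), and the delay is applied as a fixed 3 to 5 y shift. The `diagnosis_window` helper is a hypothetical name, and the calculation ignores the full distribution of diagnosis delays.

```python
def diagnosis_window(first_year, last_year, delay=(3, 5)):
    """Shift a transmission interval by a (min, max) infection-to-diagnosis delay."""
    min_delay, max_delay = delay
    return first_year + min_delay, last_year + max_delay


# Approximate transmission intervals per cluster, taken from the text
transmission_intervals = {
    "A": (1993, 1999),  # assumption: "about 6 y preceding 1999"
    "B": (1991, 1995),
    "C": (1995, 1997),
}

windows = {
    name: diagnosis_window(start, end)
    for name, (start, end) in transmission_intervals.items()
}

for name, (start, end) in sorted(windows.items()):
    print(f"cluster {name}: diagnoses expected roughly {start} to {end}")
```

On these assumptions, infections in clusters A and C would be expected to surface as diagnoses mostly from the late 1990s onward, while those in cluster B would appear somewhat earlier.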
As epidemiological models become increasingly complex, incorporating variable mixing patterns [41], quantitative data on the transmission network structure and the dynamics of transmission will be vital to ensure appropriate parameterization. The level of epidemiologically relevant information yielded by the time-dated phylogeny with respect to the structure and dynamics of the HIV transmission network in this population represents a substantial increase in our depth of knowledge on which interventions can be based.