To directly visualise the change in genetic diversity during horizontal HIV-1 transmission between the donor-recipient pair studied, we first inferred the phylogenetic relationships among their HIV-1 sequences using maximum likelihood methods. The phylogenies for *env V1-V4* and *gag p24* depicted in Figure show that branch lengths are substantially shortened immediately after transmission, illustrating that a significant reduction in diversity has occurred.

To investigate the demographics of viral transmission in this transmitter pair more closely, four coalescent models were fitted to the sequence data. Crucially, samples were available both before and after the transmission event, allowing distinct demographic functions for donor and recipient HIV-1 populations (Equations 1 to 5), with the time of transition between them estimated from the data [26]. In addition to a null model that constrained the effective population size in donor (*N*_{D}) and recipient (*N*_{R}) to be identical (so that there is no bottleneck at transmission), models with constant, exponential and logistic demographic functions for the recipient population were fitted. In all cases the donor population size was assumed to be constant.
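The piecewise structure described above (constant donor population, then a separate recipient function after the transition time) can be sketched as follows. This is a minimal illustration using a textbook logistic curve; it is not necessarily the parameterisation of Equations 4 and 5, and the function name, argument names and numerical values are all assumptions chosen for illustration.

```python
import math

def n_tau(t, t_trans, donor_size, founder_size, carrying_capacity, growth_rate):
    """Diversity (N*tau) at time t, in days relative to the first recipient sample.

    Before t_trans the donor population is constant; from t_trans onwards the
    recipient population grows logistically from founder_size towards
    carrying_capacity. Textbook logistic form, assumed for illustration.
    """
    if t < t_trans:
        return donor_size
    ratio = (carrying_capacity - founder_size) / founder_size
    elapsed = t - t_trans  # always >= 0 here, so the exponent cannot overflow
    return carrying_capacity / (1.0 + ratio * math.exp(-growth_rate * elapsed))

# Hypothetical parameter values, not estimates from the study.
example = dict(t_trans=-30.0, donor_size=1000.0, founder_size=2.0,
               carrying_capacity=1200.0, growth_rate=0.5)

for day in (-60, -30, 0, 60):
    print(day, round(n_tau(day, **example), 1))
```

Setting `founder_size` equal to `donor_size` and `carrying_capacity` recovers a curve with no change at transmission, which corresponds to the idea behind the null model.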

The relative Bayesian posterior scores for each demographic model are listed in Table . For both *env V1-V4* and *gag p24*, the model with the lowest AIC (the preferred model) fits a constant population size in the donor and logistic growth in the recipient (Equations 4 and 5). The null hypothesis that there has been no change in population size at transmission was therefore rejected. Using the estimated model parameters we reconstructed the demographic profiles of genetic diversity (*Nτ*, the product of the effective population size and generation length in days [27]) against time for each gene (Figure ).
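Model choice by lowest AIC, as used above, can be illustrated with the standard definition AIC = 2*k* − 2 ln *L*, where *k* is the number of free parameters and ln *L* the maximised log-likelihood. The log-likelihoods and parameter counts below are invented for illustration and are not the values from this study; only the selection rule is the point.

```python
def aic(log_likelihood, n_params):
    """Akaike information criterion: 2k - 2 ln L (lower is preferred)."""
    return 2 * n_params - 2 * log_likelihood

# Hypothetical (log-likelihood, parameter count) pairs for the four models.
models = {
    "null (N_D = N_R)":     (-5020.0, 1),
    "constant-constant":    (-5010.0, 3),
    "constant-exponential": (-5002.0, 4),
    "constant-logistic":    (-4995.0, 5),
}

scores = {name: aic(lnl, k) for name, (lnl, k) in models.items()}
preferred = min(scores, key=scores.get)

# Report each model's AIC relative to the preferred model.
for name, score in sorted(scores.items(), key=lambda item: item[1]):
    print(f"{name:22s} AIC = {score:8.1f}  delta = {score - scores[preferred]:5.1f}")
```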

To further test the extent of the transmission bottleneck, the demographic history of the population was reconstructed using the Bayesian skyline plot [see Methods, [28]]. The results for *env V1-V4* and *gag p24* are shown in Figure . In both cases there is a good fit between the demographic profiles estimated using the two different methods. Notably, the timing of the transmission bottleneck is the same, and evidence for a bottleneck is readily apparent under both models.

The Bayesian skyline plot also justifies our use of the logistic-constant demographic model to estimate the diversity that survives during horizontal transmission of HIV-1. Using the logistic growth model (Equations 4 and 5) we were able to calculate diversity in the recipient *N*_{R}*τ* at the estimated time of transmission *t*_{trans}. We estimated *t*_{trans} to be approximately 30 days prior to collection of the first recipient sample (day 0) for *env* and 40 days for *gag* (Table ). We calculated *N*_{R}*τ*(*t*_{trans}) to be 1.6 for *env V1-V4*, and 2.0 for *gag p24* (Table ). These values are near the lower prior boundary of one and their posterior distributions both exhibit a large positive skew (Figure ). The level of diversity in the donor at the time of transmission *N*_{D}*τ* was compared with that which was transmitted *N*_{R}*τ*(*t*_{trans}) as a percentage ratio *δ*. For *env*, *N*_{D}*τ* was estimated to be 1014, giving a value of *δ* as 0.17%. For *gag p24*, *N*_{D}*τ* was 771, giving *δ* as 0.29% (Table ).
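The percentage ratio *δ* described above is the transmitted diversity divided by the donor diversity, multiplied by 100. The sketch below applies that ratio to the rounded point estimates quoted in the text; the function and variable names are illustrative. Note that the rounded inputs give values slightly below the reported 0.17% and 0.29%, which may reflect rounding of the quoted estimates or calculation of *δ* from the full posterior; that explanation is an assumption, not something stated in the text.

```python
def percent_surviving(recipient_n_tau, donor_n_tau):
    """Percentage of donor diversity present in the recipient at transmission."""
    return 100.0 * recipient_n_tau / donor_n_tau

# Rounded point estimates quoted in the text: (N_R*tau at t_trans, N_D*tau).
estimates = {
    "env V1-V4": (1.6, 1014.0),
    "gag p24":   (2.0, 771.0),
}

deltas = {gene: percent_surviving(nr, nd) for gene, (nr, nd) in estimates.items()}

for gene, delta in deltas.items():
    print(f"{gene}: delta = {delta:.2f}%  (diversity lost = {100.0 - delta:.2f}%)")
```

For both genes the ratio falls well below 1%, meaning more than 99% of donor diversity was lost at transmission.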

| **Table 2** Parameter estimates used to calculate the percentage diversity that survived transmission |

Importantly, if selection were acting on *env* to restrict the proportion of variants capable of establishing a new infection, we would expect a greater loss of diversity in this region than in *gag*, assuming recombination between the two regions. Therefore, the similarity in *δ* between *env* and *gag* argues against strong selection at transmission.

We conclude that > 99% of genetic diversity in the donor viral population, in both *env* and *gag*, was lost during this case of horizontal transmission. A reduction in viral diversity after horizontal transmission has been reported frequently in the literature [1, 3-6]. However, information regarding the diversity present in the donor is often lacking, and even in cases where these data exist [2, 7] it is difficult to measure levels of diversity close to the transmission event. The method implemented here overcomes this problem by estimating genetic diversity at the inferred time of transmission, and therefore allows accurate quantification of the transmission bottleneck.

To generalise this result we next investigated diversity (*Nτ*) of the founding viral population in nine patients infected through homosexual contact for whom donor sequences were unavailable. Sequences had been published previously [29]. Assuming the best-fit demographic model, *Nτ* at seroconversion was found to vary between approximately 8 and 1720 (mean: 406; Table ). In the recipient of the transmitter pair, *Nτ* at seroconversion (day 0) was 1150 (HPD upper: 1930), which is not significantly different from the values in these nine patients (*p* = 0.302; one-sample *t*-test).

| **Table 3** Estimates of viral diversity close to the time of transmission |

Finally, to compare the diversity present close to the time of infection in patients infected via two different modes of transmission, we estimated *Nτ* at birth (transmission) in 27 vertically infected infants. The average *Nτ* at birth was 696 (Table ). Although we were unable to detect a bottleneck at transmission in eight of the infants (p2, p3, p6, p8, pa, pd, pc and pd), the estimates for *Nτ* close to the time of infection in the horizontally and vertically infected patient groups were not significantly different (*p* = 0.320; two-sample *t*-test).