All subjects gave written informed consent. The protocols to collect the shedding data were approved by the University of Washington Human Subjects Review Committee. The modelling and statistical analyses were certified exempt by the UCLA Office for Protection of Research Subjects.

The women in the longitudinal study ranged from 22 to 51 years of age (median age 27) at baseline, were seronegative for HSV‐1 and were not taking antiviral therapy at the time of data collection for this analysis. They had enrolled in two clinical trials.^{6}^{,7} At baseline, the median time since HSV‐2 acquisition was 8 months (range 2 months to 2 years), and at follow‐up, the median time was 3.2 years (range 2–4.6 years). Acquisition of genital HSV‐2 was defined clinically and all infections were confirmed virologically and serologically. The women collected separate cervicovaginal and vulvar swabs once daily by passing over these mucosal surfaces with a Dacron swab. The women collected mucosal swabs for 57–85 days (median 67.5 days) during the baseline period and 45–82 days (median 66 days) during the follow‐up period. The swabs were analysed by a real‐time quantitative HSV DNA PCR assay.^{8}

We defined a sample collection time point to be positive for shedding if 500 HSV DNA copies/ml of specimen were detected in either specimen collected that day, and negative otherwise. The women also recorded recurrences of genital lesions in diaries. A recurrence was defined as an episode of genital lesions from the onset of the first lesion until all lesions were completely healed. We annualised recurrence rates as (number of recurrences)/(length of observation period). Symptomatic shedding was defined as an HSV positive result with lesions present; asymptomatic shedding was defined as an HSV positive result with no lesions present.
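The daily classification and the annualised recurrence rate defined above can be sketched in a few lines of Python. This is an illustration only, not the code used for the analysis: the function names are hypothetical, the per-site positivity flags are assumed to have been derived from the PCR threshold already, and the observation period is assumed to be converted to years by dividing days by 365.

```python
def classify_day(cervicovaginal_positive, vulvar_positive, lesions_present):
    """Classify one sample collection time point.

    A day is HSV positive if either specimen collected that day is positive.
    The positivity flags are assumed to be pre-computed from the PCR result.
    """
    if not (cervicovaginal_positive or vulvar_positive):
        return "negative"
    return "symptomatic" if lesions_present else "asymptomatic"


def annualised_recurrence_rate(n_recurrences, days_observed):
    """Recurrences per year, assuming the observation period is given in days."""
    return n_recurrences / (days_observed / 365.0)


if __name__ == "__main__":
    # Hypothetical inputs, for illustration only.
    print(classify_day(False, True, False))
    print(annualised_recurrence_rate(2, 73))
```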

The numbers of women with HSV‐2 shedding and recurrences at baseline and follow‐up were compared using Fisher's exact test. Differences between baseline and follow‐up in percentage of sample collection time points that were HSV positive, percentage of such time points that were asymptomatic, percentage of days with recurrences, and number of recurrences per year were assessed using the sign test for paired samples. The median duration of recurrences in each time period was estimated by aggregating all recurrences within the relevant time period.
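As an illustration of the paired comparison, the following is a minimal sketch of an exact two-sided sign test using only the Python standard library. The text does not state how ties were handled or whether an exact or approximate version of the test was used; dropping tied pairs and using the exact binomial distribution are assumptions of this sketch, and the function name is hypothetical.

```python
from math import comb


def sign_test_p_value(baseline, follow_up):
    """Exact two-sided sign test for paired samples.

    Tied pairs are dropped (an assumption of this sketch). Under the null
    hypothesis, the number of positive differences is Binomial(n, 1/2).
    """
    diffs = [f - b for b, f in zip(baseline, follow_up) if f != b]
    n = len(diffs)
    if n == 0:
        return 1.0
    n_positive = sum(d > 0 for d in diffs)
    tail = min(n_positive, n - n_positive)
    # Two-sided p-value: double the smaller tail, capped at 1.
    p = 2 * sum(comb(n, j) for j in range(tail + 1)) / 2 ** n
    return min(1.0, p)
```

For example, with four non-tied pairs that all decrease from baseline to follow‐up, the sketch returns 2 × (1/16) = 0.125.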

Viral dynamic model

The model of HSV within‐host dynamics, depicted in fig 1, was developed to describe HSV pathology^{2}^{,9} and links HSV reactivation from latency in the ganglia with shedding from the mucosa. The model postulates that each individual experiences intermittent reactivation of HSV‐2 with a characteristic frequency. As reactivation is believed to be triggered by a variety of stimuli that tend to occur sporadically or randomly over time, we model it as a Poisson process.^{10} Thus, the model postulates that an individual i has HSV‐2 reactivations over time according to a Poisson process with rate λ_{i}, defined as the individual's mean number of reactivations per year. Figure 1A provides an example of the timing of reactivations.

Each reactivation is associated with a short‐term infection of the epithelium, during which HSV is shed. We define the duration of a reactivation as the duration of HSV shedding detectable through swabbing and PCR associated with the reactivation‐inducing stimulus. The durations vary within individuals, and we model them as independent and identically distributed (iid) exponential random variables, with parameter θ_{i} corresponding to individual i's mean duration of a reactivation, measured in days. Figure 1B provides an illustration.

It has been observed that while shedding is occurring in one region of the mucosa, HSV can become newly detectable in another region.^{7}^{,11} This could be due to continued transport of HSV from the ganglia in response to the original stimulus, or might reflect the arrival of a new stimulus. The model encompasses both scenarios. In the first scenario, the shedding is subsumed into the duration of the original reactivation. To address the second scenario, the model postulates that stimuli continue to occur at the same rate, λ_{i}, during a shedding episode. Thus, another reactivation can occur before the shedding associated with previous reactivations has cleared, resulting in a shedding episode involving >1 reactivation. A shedding episode with overlapping reactivations is illustrated in fig 1C.
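The way overlapping reactivations combine into a single shedding episode can be sketched as an interval merge. The function below is a hypothetical illustration, not part of the fitted model: it takes reactivation onsets and durations (in days) and returns the resulting shedding episodes.

```python
def merge_into_episodes(reactivations):
    """Merge reactivations into shedding episodes.

    reactivations: iterable of (onset_day, duration_days) pairs.
    Returns a list of (start_day, end_day) episodes; a reactivation that
    begins before the current episode has cleared extends that episode.
    """
    episodes = []
    for onset, duration in sorted(reactivations):
        end = onset + duration
        if episodes and onset <= episodes[-1][1]:
            start, current_end = episodes[-1]
            episodes[-1] = (start, max(current_end, end))
        else:
            episodes.append((onset, end))
    return episodes
```

With reactivations at days 0 (lasting 3 days), 2 (lasting 4 days) and 10 (lasting 1 day), the first two overlap and form one episode from day 0 to day 6, followed by a separate episode from day 10 to day 11.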

Figure 1D illustrates the collection of mucosal swabs at daily time points. Each swab pools secretions from the entire swabbed surface, abrogating the ability to identify the distinct areas where shedding is occurring. Thus the number of potentially overlapping reactivations cannot be determined directly from the mucosal swab data. However, it can be estimated using the model, as described below.

We fit the model to mucosal shedding data to estimate each individual's reactivation frequency λ_{i}, mean duration θ_{i} and other quantities using Bayesian analysis with Gibbs sampling as described elsewhere.^{3} We allowed individuals to have different parameters during baseline and follow‐up and regressed these parameters on time since HSV‐2 acquisition using ψ_{i} = (log λ_{1i}, log θ_{1i}, log λ_{2i}, log θ_{2i}) ~ iid N(X_{i}β, Σ), where (λ_{1i}, θ_{1i}) and (λ_{2i}, θ_{2i}) are individual i's reactivation parameters during baseline and follow‐up, X_{i} is a covariate matrix indicating individual i's time since HSV‐2 acquisition, β is a regression coefficient vector, and Σ is a covariance matrix. Estimates of β, Σ, and ψ_{i}, i = 1,...,n, were obtained as posterior distributions, which we summarised using the median and interquartile range (IQR). Results are based on 15 000 posterior simulations following a burn‐in of 5 000.

Transformations of ψ_{i} yielded estimates of the following quantities.

(i) Frequency and mean duration of shedding episodes, λe^{−λθ/365} and 365(e^{λθ/365} − 1)/λ, respectively. These quantities are uncertain when swabs are collected at spaced time points, as the onset and resolution times of shedding episodes are not observed and episodes occurring between two sample points could be missed (fig 1D).

(ii) Frequency and mean duration of reactivation from latency, λ and θ. These estimates are made under the assumption that reactivations continue to occur at the individual's characteristic rate regardless of whether the individual is currently shedding HSV, as described above; thus, some reactivations can occur during shedding episodes.

(iii) The percentage of reactivations that occurred during shedding episodes, modelled by 1 − e^{−λθ/365}. This is equivalent to the probability that a reactivation occurs while the individual is shedding and also equivalent to the percentage of time with mucosal shedding.
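The transformations above are simple to evaluate numerically. The sketch below is illustrative rather than the code used for the analysis; the function name and dictionary keys are hypothetical. It takes λ in reactivations per year and θ in days, as defined in the text.

```python
import math


def derived_quantities(lam, theta):
    """Evaluate the transformations of (lambda, theta) given in the text.

    lam:   reactivation frequency, per year (must be > 0)
    theta: mean duration of a reactivation, in days
    """
    x = lam * theta / 365.0  # expected number of reactivations per mean duration
    return {
        # lambda * exp(-lambda*theta/365)
        "episode_frequency_per_year": lam * math.exp(-x),
        # 365 * (exp(lambda*theta/365) - 1) / lambda
        "mean_episode_duration_days": 365.0 * math.expm1(x) / lam,
        # 1 - exp(-lambda*theta/365)
        "fraction_time_shedding": -math.expm1(-x),
    }
```

A useful consistency check is that episode frequency multiplied by mean episode duration, divided by 365, equals the fraction of time with shedding, because λe^{−λθ/365} × (e^{λθ/365} − 1)/λ = 1 − e^{−λθ/365}.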

Model validation

Our model validation methods followed recommendations for posterior predictive model checking for Bayesian data analysis.^{5}^{,12} Our model validation approach differs from procedures that fit the model to one set of individuals and use the estimates to predict outcomes for another set of individuals. This type of prediction is not possible with our model because our model postulates that each individual has his/her own characteristic HSV‐2 reactivation frequency and duration, and it is not possible under the current state of knowledge to predict a particular individual's frequency and duration from an analysis of an independent group of subjects. Rather, our objective was to examine whether the model assumptions, that reactivations occur according to a Poisson process and have durations that are iid exponential, provide a reasonable explanation of observed patterns of shedding. Thus, our approach was to simulate HSV reactivation as postulated by the model for a diverse group of individuals, producing simulated mucosal shedding data that show what the data would look like if the model were valid. We then compared the simulated data to actual mucosal shedding data collected from the individuals.

The individuals used for model validation were different from the individuals in the longitudinal study and included 27 men and 40 women with genital HSV‐2 infection who had either recent diagnosis of genital herpes (n = 25, median time since acquisition 5.8 months, range 2–8.5 months) or established HSV with frequent recurrences (n = 42, median time since acquisition 5.5 years, range 9 months to 20 years). The subjects had enrolled in a clinical trial,^{13} and included 22 women and 16 men seropositive for HSV‐2 only, and 18 women and 11 men seropositive for HSV‐1 and 2. Patients were not taking antiviral therapy at the time of data collection for this analysis, and provided mucosal swabs for a median of 55 days (range 39–63 days). The individuals collected one daily specimen by swabbing the entire cervicovaginal, vulvar and perianal areas (women), or penile skin and perianal areas (men).

To produce simulated mucosal shedding data, we first obtained estimates of each individual's reactivation frequency and duration by fitting the model as described above, using ψ_{i} = (log λ_{i}, log θ_{i}). For each individual i, we drew a reactivation frequency and mean duration (λ_{i}^{(m)}, θ_{i}^{(m)}) from his/her posterior distribution and simulated reactivation times r_{1}, r_{2}, ..., with r_{1} ~ Exp(λ_{i}^{(m)}) and r_{j+1} = r_{j} + Exp(λ_{i}^{(m)}) for j = 1, 2, ..., and associated durations d_{1}, d_{2}, ... ~ Exp(1/θ_{i}^{(m)}). We superimposed the reactivations to obtain a continuous‐time mucosal shedding pattern as in fig 1C, then drew a random start time and ascertained the positive/negative shedding state at time points equal in number and spacing to the individual's actual swab collection time points, producing simulated data as in fig 1D. We produced 300 sets of simulated data for each individual, using a different draw of (λ_{i}^{(m)}, θ_{i}^{(m)}) from the posterior distribution for each replication, in order to account for random variation in the reactivation process and uncertainty in parameter values.
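The simulation procedure described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name and arguments are hypothetical, λ is converted from a yearly to a daily rate, and a warm‐up period (an assumption of this sketch) precedes the randomly drawn start time so that reactivations begun before the observation window can still be shedding when it opens. The posterior draw (λ^{(m)}, θ^{(m)}) is taken as an input rather than sampled here.

```python
import random


def simulate_swab_series(lam_per_year, theta_days, swab_offsets_days,
                         warmup_days=365.0, rng=None):
    """Simulate positive/negative swab results for one posterior draw.

    lam_per_year:      reactivation frequency (per year, > 0)
    theta_days:        mean reactivation duration (days, > 0)
    swab_offsets_days: swab times in days relative to the first swab,
                       matching the individual's actual collection schedule
    """
    rng = rng or random.Random()
    rate_per_day = lam_per_year / 365.0
    # Random start time after a warm-up period (assumption of this sketch).
    start = warmup_days + rng.uniform(0.0, warmup_days)
    horizon = start + max(swab_offsets_days)

    # Poisson process: exponential gaps between reactivation onsets,
    # each with an independent exponential duration of mean theta_days.
    intervals = []
    t = rng.expovariate(rate_per_day)
    while t <= horizon:
        intervals.append((t, t + rng.expovariate(1.0 / theta_days)))
        t += rng.expovariate(rate_per_day)

    # A swab is positive if any reactivation is ongoing at the swab time.
    return [any(a <= start + off <= b for a, b in intervals)
            for off in swab_offsets_days]


if __name__ == "__main__":
    # Hypothetical posterior draws and schedule, for illustration only.
    draws = [(18.0, 2.5), (22.0, 3.1), (15.0, 2.0)]
    schedule = list(range(55))
    rng = random.Random(1)
    for lam, theta in draws:
        series = simulate_swab_series(lam, theta, schedule, rng=rng)
        print(sum(series), "positive swabs out of", len(series))
```

Repeating the call with a fresh posterior draw for each replication, as in the usage example, propagates both the randomness of the reactivation process and the uncertainty in the parameter values.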

We compared the simulated and observed data in terms of percentage of time with mucosal shedding and the dispersal of shedding over time. To characterise dispersal, we tabulated the number and average length of HSV positive “runs”, where an HSV positive “run” is defined as a sequence of HSV positive results preceded and succeeded by HSV negative results. The concept of a run is widely used in statistics to characterise patterns in sequences of data.^{14}
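The run statistics can be computed directly from a daily series of positive/negative results. The sketch below is illustrative; the function names are hypothetical. It also counts runs that touch the first or last day of the series, which is an assumption of this sketch, since the definition in the text refers to runs bounded by negative results on both sides.

```python
from itertools import groupby


def positive_runs(series):
    """Lengths of maximal runs of consecutive positive results."""
    return [sum(1 for _ in group) for is_positive, group in groupby(series)
            if is_positive]


def shedding_summary(series):
    """Percentage of positive time points, number of runs, mean run length."""
    runs = positive_runs(series)
    n_runs = len(runs)
    mean_length = sum(runs) / n_runs if n_runs else 0.0
    pct_positive = 100.0 * sum(bool(v) for v in series) / len(series)
    return pct_positive, n_runs, mean_length
```

For the series 0, 1, 1, 0, 1, 0, 0, 1, 1, 1 the positive runs have lengths 2, 1 and 3, giving three runs with a mean length of 2 and 60% of time points positive.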