Most HSV-2 seropositive persons have genital HSV detected over observation periods lasting 30 days or longer, and the proportion classified as "shedders" increases with the number of samples collected. In fact, it is likely that most if not all HSV-2 seropositive persons shed HSV in the genital tract if observed for long enough periods (20). Therefore, the term "shedder" is context specific.
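The dependence of the "shedder" label on sampling effort can be sketched numerically. The snippet below is only an illustration under a strong simplifying assumption that is not taken from the data in this paper: every swab is treated as independent, with a fixed per-swab probability of HSV detection. The function name and the 5% per-swab probability are hypothetical.

```python
def prob_classified_shedder(per_swab_prob: float, n_swabs: int) -> float:
    """Probability of at least one positive swab among n_swabs swabs.

    Assumes independent swabs with a constant detection probability,
    which ignores the episodic nature of real shedding.
    """
    if not 0.0 <= per_swab_prob <= 1.0:
        raise ValueError("per_swab_prob must lie in [0, 1]")
    if n_swabs < 0:
        raise ValueError("n_swabs must be non-negative")
    return 1.0 - (1.0 - per_swab_prob) ** n_swabs


if __name__ == "__main__":
    # Hypothetical person who sheds on 5% of sampled days.
    for n in (5, 30, 60, 365):
        print(n, round(prob_classified_shedder(0.05, n), 3))
```

Under these assumptions the same person is increasingly likely to be labeled a "shedder" as more swabs are collected, even though the underlying shedding rate never changes.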

It is appealing to dichotomize individuals into "shedders" and "not shedders" or into "high shedders" versus others since the computation is straightforward and may provide a qualitative distinction between those who shed frequently and those who do not. However, the data presented herein demonstrate the trouble with drawing conclusions based on the quantitative ratio of “shedders” or “high shedders”. Real data examples using HIV status and HSV shedding show that dichotomization can cause group differences to appear larger or smaller than shedding rate ratios, depending on observation length. Theoretical examples also confirm these differences.
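A small worked example may help show why the ratio of "shedders" need not track the shedding rate ratio. The sketch below is not the authors' analysis: it assumes independent swabs with a constant per-swab detection probability in each group, and the rates of 20% and 10% are hypothetical. In this simplified setup the ratio of shedders only shrinks toward 1 as sampling lengthens; it does not reproduce the cases in which dichotomization exaggerates a difference.

```python
def shedder_ratio(rate_a: float, rate_b: float, n_swabs: int) -> float:
    """Ratio of the expected proportions of 'shedders' in two groups.

    A 'shedder' is anyone with at least one positive swab out of n_swabs.
    Swabs are assumed independent with constant per-swab rates.
    """
    if n_swabs < 1:
        raise ValueError("n_swabs must be at least 1")
    prop_a = 1.0 - (1.0 - rate_a) ** n_swabs
    prop_b = 1.0 - (1.0 - rate_b) ** n_swabs
    return prop_a / prop_b


if __name__ == "__main__":
    rate_a, rate_b = 0.20, 0.10  # hypothetical per-swab shedding rates
    print("rate ratio:", rate_a / rate_b)
    for n in (1, 5, 13, 30):
        print(n, "swabs -> ratio of shedders:", round(shedder_ratio(rate_a, rate_b, n), 3))
```

The rate ratio stays fixed at 2 in this example, while the ratio of shedders depends entirely on how many swabs each person contributes.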

Conclusions based on dichotomization of the data may be misleading, particularly for studies that attempt to determine risk factors for viral transmission to others. Dichotomization may provide an arbitrary and context-dependent distinction between groups when defining risk factors for shedding, and it may not be predictive of disease status or infectiousness. A better summary measure is the overall shedding rate: the number of positive samples out of all swabs taken from persons in that group. Even the shortest observation intervals provide accurate information on shedding rate, though more samples increase precision. Group differences can then be described using Poisson regression, and zero-inflated models can be of use when most persons are not observed to shed (28, 29).
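The overall shedding rate and a two-group rate ratio can be computed directly from swab counts. The sketch below is a minimal illustration, not the analysis used in this paper. For a Poisson model with a single binary group indicator and the log number of swabs as an offset, the estimated rate ratio equals the ratio of the crude group rates, so no regression library is needed here. The confidence interval is a simple Wald interval on the log scale that treats swabs as independent; repeated swabs from the same person are correlated, so a real analysis would need a variance estimate that accounts for clustering. All counts and function names are hypothetical.

```python
import math


def shedding_rate(positive_swabs: int, total_swabs: int) -> float:
    """Overall shedding rate: positive swabs divided by all swabs in the group."""
    if total_swabs <= 0:
        raise ValueError("total_swabs must be positive")
    return positive_swabs / total_swabs


def rate_ratio_with_ci(pos_a, swabs_a, pos_b, swabs_b, z=1.96):
    """Rate ratio (group A vs. group B) with a naive Wald interval.

    The standard error of log(RR) is sqrt(1/pos_a + 1/pos_b), which
    assumes Poisson counts and ignores within-person correlation.
    """
    if pos_a <= 0 or pos_b <= 0:
        raise ValueError("each group needs at least one positive swab")
    rr = shedding_rate(pos_a, swabs_a) / shedding_rate(pos_b, swabs_b)
    se_log_rr = math.sqrt(1.0 / pos_a + 1.0 / pos_b)
    lower = rr * math.exp(-z * se_log_rr)
    upper = rr * math.exp(z * se_log_rr)
    return rr, lower, upper


if __name__ == "__main__":
    # Hypothetical counts: 120 of 1000 swabs positive vs. 60 of 1000.
    rr, lower, upper = rate_ratio_with_ci(120, 1000, 60, 1000)
    print(f"rate ratio = {rr:.2f} (95% CI {lower:.2f} to {upper:.2f})")
```

Because the rate uses every swab in both numerator and denominator, the estimate targets the same quantity whether each person contributes 6 swabs or 60; only its precision changes.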

Comparisons among "shedders" do not currently appear to be appropriate in any context as their value depends on the length of observation. However, some authors have provided group comparisons based both on "shedders" and the shedding rate ratio (11, 16, 17, 30, 31) and some of the findings are discrepant. For example, in a crossover design with 6 samples per arm, Nagot (16) found a significant decrease in the rate of genital HIV-1 RNA detection while on valacyclovir using Poisson regression to compute rate ratios (RR=.77, p=.006) but found no effect of treatment using the ratio of shedders (RS=.93, p=.30). Mayaud (30) found a difference in HSV-2 shedding rate by HIV-1 serostatus and HAART use (p<.001) but no difference in the rates of “shedders” over 12 visits between these groups (p=.17). Cowan (11) found a 40% reduction in HSV-2 shedding with acyclovir using the ratio of shedders over 13 samples (p=.004) and a 76% reduction (p<.001) using the shedding rate ratio. Thus the difference in inference by method of measurement indicates that comparisons based on "shedders" may not be reliable.

Limitations of our methods include an imperfect ability to distinguish overlapping reactivations from the ganglia within a single shedding episode (27). Recent work has shown that frequent sampling allows identification of more distinct episodes (8). However, our purpose was to assess shedding frequency rather than reactivation duration. Further, we did not consider subject-level variability in episode characteristics. However, we anticipate the impact of this additional consideration to be negligible.

Dichotomization at a pre-determined cutoff, whether that cutoff is *any shedding* or some fixed rate, can result in non-differential misclassification and subsequent bias since those with lowest shedding rates or longest episode lengths are most likely to be classified as non- or low-shedders. Others have warned of the potential for bias in related contexts. Copeland demonstrated that non-differential misclassification can attenuate risk ratio estimates while differential misclassification can attenuate or exaggerate them (32). Irwig et al. showed the potential for attenuation of the exposure-response relationship when dichotomizing a continuous outcome at cutoffs related to the observed standard error (33). We have found similar bias for cutoffs determined a priori in both episodic and non-episodic shedding patterns. Based on this evidence, dichotomization of shedding is discouraged and group comparisons based on proportion of "shedders" in each group are considered unreliable. Computation of overall shedding rates and group comparisons using Poisson regression is recommended.

Key messages

- Most HSV-2 seropositive persons are expected to shed HSV from the genital mucosa.
- Dichotomization of persons into “shedders” versus “nonshedders” over repeated samples leads to inference that varies by the number of samples collected.
- Poisson regression is recommended as it can accurately determine group differences in shedding rates regardless of the length of the sampling interval.