This study found no between-group differences in the intention-to-treat analyses among unselected university students, indicating an absence of evidence of benefit from assessment or feedback. There was one between-group difference in the per-protocol analysis among those who were risky drinkers at study entry. This provides, at most, very modest evidence of benefit attributable to receiving feedback in addition to assessment, and it should be interpreted in the context of the wider study finding of no effects.
There are three principal obstacles to making more definitive statements about intervention ineffectiveness in the present study. First, the differential attrition between group 3 and groups 1 and 2 suggests nonequivalence between the groups and thus bias in direct comparisons. Second, the present study was undertaken to prepare for a subsequent large trial and was not originally designed to produce such conclusions. Third, the approach taken to outcomes evaluation, in which populations are randomized and compared regardless of their need for intervention, should be carefully considered. Each of these issues is addressed in some detail below.
Because of the randomized nature of this study, it can be inferred that the differential attrition was caused by groups 1 and 2 becoming involved with the study earlier than group 3. The earlier invitation to participate in the alcohol e-SBI was apparently not sufficiently different from the later alcohol survey, or the mere fact of a second alcohol-related email invitation may have reduced the likelihood of accepting it. Whatever the precise mechanism, the main implication is clear: selection bias is possible, if not likely, and outcomes for group 3 cannot validly be assumed to be directly comparable with those for groups 1 and 2 in relation to intervention effectiveness. Because of the lack of contact with group 3, the basis for evaluating this possibility is limited to characteristics collected at follow-up that are unlikely to have been altered differentially between groups by the passage of time. Here there is evidence of a slight tendency toward a greater proportion of male participants, and the presence of other unmeasured confounders cannot be ruled out. It should be noted that differential attrition particularly complicates randomized comparisons involving group 3 and is less of a problem for comparisons restricted to groups 1 and 2 only. A stronger conclusion can thus be drawn in relation to whether feedback added to the effects of assessment only, and there is little evidence here that it did, notwithstanding one statistically significant difference among the per-protocol analysis outcomes.
The second reason for caution relates to the highly naturalistic Internet study context and the need to confront the difficult methodological challenges this implies during the course of a long-term research program. Attrition has been a major source of difficulty in previous work developing e-SBI at Linköping and other Swedish universities [31]. It is also a significant problem in the conduct of online trials in other populations, as underscored by Eysenbach [32] and others [33]. The initial take-up of the routine service provision of e-SBI has been similar to that observed here and is likely to be influenced by factors such as patterns of email use, rates of risky drinking, the salience of alcohol, and interest in intervention. In Sweden, as elsewhere, there are also seasonal influences on drinking, including proximity to exams, and these complicate any analyses of change over time. Randomization, it should be noted, safeguards the validity of between-group comparisons only if attrition and other similar sources of bias are equivalently distributed between groups [34]. In the previous follow-up studies undertaken at Linköping University, fewer than half of those who participated at baseline did so at first follow-up, and approximately one-quarter participated in second follow-ups [19]. Here, rather than follow-up emails being sent by the student health care service as was done previously, participants were blinded to trial conduct through an explicit attempt to separate the experience of follow-up from the earlier e-SBI delivery. An email was sent by the first author (PB) requesting participation in a survey of student alcohol consumption, partially following the approach of Kypri and colleagues, who invited participation in a series of surveys at the outset and obtained high follow-up rates [20]. As has been seen, this innovation, along with the use of cinema ticket incentives, was partially successful in restricting attrition at follow-up, although, as discussed, it also introduced differential attrition. To rectify this, in the next trial we will abbreviate the alcohol outcome measures and conceal them within a lifestyle questionnaire at follow-up. The overall attrition rate could be further improved with stronger incentives, though this would potentially compromise the pragmatic nature of the study [35].
The third main reason for caution in drawing conclusions from the present study relates to our intention-to-treat approach to outcomes evaluation, which was highly conservative. The intervention comprised an automated email providing a means of accessing a website in an unselected population with an elevated prevalence of hazardous and harmful drinking. The intervention was thus delivered more widely than was necessary, as we wished to intervene only with risky drinkers. The intervention could be defined more narrowly as being delivered to those who accessed the website, with the email merely being the means of recruitment. Even under this definition, the intervention would still have been accessed by students whose drinking was not risky and who would thus not have been deemed to merit individual targeting for intervention. More narrowly still, outcome evaluation could have been restricted to those whose drinking was found to be risky. The overarching problem is that more people were randomly assigned than would have been targeted for intervention. The primary rationale for proceeding in this way was that assessment of eligibility, baseline data collection, and intervention delivery were all integrated in 1 brief online session. We shaped our research design pragmatically around the real-world intervention opportunity, matching the research study to routine practice as it is delivered rather than interfering with it for research purposes, which would have introduced external validity problems. This approach takes advantage of an opportunity to avoid any research participation effects that may be associated with screening and other aspects of study entry, in much the same way as cluster randomized trials can be used for this purpose. The obvious major disadvantage of this approach is that it biases hypothesis testing toward the null and is thus highly conservative. Thinking about outcomes evaluation needs to take account of these issues.
There then arises the question of the consistency of the study findings with the existing literature. Put simply, there are no existing studies against which to compare our intention-to-treat findings, as none have used no-contact control groups. The per-protocol comparisons more closely resemble existing studies, and smaller between-group differences were observed. Thus, for both internal and external validity purposes, our test of the third hypothesis is particularly important. Comparisons with the existing literature also need to take account of the highly naturalistic study context. If our results are confirmed in further studies, they have important implications for the effectiveness of online alcohol interventions.
Our unusual study design entails many limitations, as well as strengths, some of which have already been considered. We used the AUDIT as an efficient summary measure of alcohol consumption and of whether it may be hazardous or harmful. Although the AUDIT has been validated in online student contexts [9
], this does not extend to its use as an outcome measure in a trial. Beyond uncertainty about such use, more direct behavioral measures of drinking may be better suited to universal prevention contexts. Because the study was completely automated, there was no potential for subversion of randomization, nor for observer bias in the ascertainment of study outcomes. The initial take-up, or reach, of the intervention is neither simply a strength nor a limitation of this study, being itself part of the object of evaluation. The outcomes were necessarily self-reported and, although computerized data collection may minimize social desirability bias, the validity of self-reported outcome data in brief alcohol intervention trials needs to be studied. The approach used here involves deception, so it is appropriate to consider whether less ethically problematic methods could be used. For example, if we were concerned only with constraining assessment reactivity, would it not have been possible to adopt informed consent procedures and simply withhold assessment? This would indeed have been possible had we been interested only in exploring assessment reactivity effects. We are aware, however, of the potential for other research participation effects [29
] and specifically wished to control for this possibility here. Meeting this need requires novel or underused approaches to research design; for example, studies may involve avoiding informed consent [36
]. As well as developing research methods in this program of study, we are very conscious of the need both to undertake ethical analyses in parallel and to conduct dedicated empirical studies to assist ethical evaluations. Settings that provide access to large samples are likely to be useful for further substantive effectiveness trials, along with dedicated methodological and ethical studies of the issues contended with here.