The utility of a platform for the remote conduct of GxE studies has been demonstrated. In a representative sample of older people, 11% of those estimated to be connected to the internet consented to participate, of whom 99.9% provided data. A randomised trial nested within the study found that the request for a bio-sample had little effect on participation. The donation rate of those who agreed to provide a bio-sample was over 70%.
There were several challenges to overcome before ethical approval for this study was obtained. These were largely due to the combination of technologies that were being proposed. The major issue was the linking of genetic information with clinical records. The commitment to use a fully secure and de-identified database for linkage and subsequent analyses was considered to provide adequate protection for participants. 
A further issue was the possibility of identity fraud. However, it was accepted that the likelihood of this was low and, due to the use of de-identified data for analyses, the consequence would be to add noise to the data rather than pose any risk to the individual. On this basis obtaining consent without a ‘wet’ signature was also approved. A supporting argument for conducting the study remotely was the availability of telephone support which enabled any prospective participant to discuss the study with the research team, including the Principal Investigator. This facility is not usually available in large (usually multi-centre) studies using face-to-face methods.
The extremely low rate of complaint (0.05%) strongly confirmed the evidence of previously conducted qualitative studies that, in principle, remote methods are acceptable to the public. 
Complaints were almost entirely due to a misperception that the research team had access to personally identifying information prior to it being given by participants. As initial contact was achieved via a third party, this issue was easily addressed. Although the complaint rate may not fully reflect the acceptability of the study, it reflects a reassuringly high level of acceptability. The high completion rates for all modules (82%–99%), which differed widely in format and content, suggest that a web platform has application to a wide range of epidemiologic studies.
The response rates achieved here are difficult to assess accurately as it was not known beforehand which invitees had internet access. Based on Government figures, it was likely that 6,011 invitees were internet connected, giving a response rate of 11%. The Government figures were based on a representative sample of 7,728 households throughout Wales (reflecting a 71% response rate) surveyed in 2007. Of the full mailed sample in our study, 4.5% responded. Given that this is an older population with limited internet connectivity, this response may be considered comparable to the 5.4% achieved in UK Biobank.
Over time, differences in selection bias between web-based and face-to-face samples are likely to reduce as a greater proportion of the population becomes connected.
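The relationship between the two response rates quoted above can be made explicit with a little arithmetic. The sketch below uses only the rounded figures given in the text (6,011 connected invitees, 11% and 4.5%); the responder count, mailed-sample size and connectivity proportion it derives are back-calculated approximations for illustration, not figures reported by the study, and the variable names are our own.

```python
# Rounded figures quoted in the text.
connected_invitees = 6011   # invitees estimated to be internet connected
rate_connected = 0.11       # response rate among connected invitees
rate_mailed = 0.045         # response rate among all mailed invitees

# Both rates share the same numerator (responders), so each unknown
# follows from the other two. These are approximations only, because
# the published percentages are rounded.
implied_responders = rate_connected * connected_invitees
implied_mailed = implied_responders / rate_mailed
implied_connectivity = rate_mailed / rate_connected

print(f"Implied responders:      ~{implied_responders:.0f}")
print(f"Implied mailed sample:   ~{implied_mailed:.0f}")
print(f"Implied connected share: ~{implied_connectivity:.0%}")
```

The ratio of the two rates is simply the proportion of the mailed sample assumed to be connected, which is why the response rate among connected invitees will tend towards the mailed-sample rate as connectivity approaches 100%.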
Remote methods, in which recruitment costs are minimised and participation is restricted by computer access, bring the issue of selection bias into sharp focus. A helpful distinction is between descriptive and etiologic studies. The former describe specific populations. For descriptive studies to achieve unbiased estimates of prevalence, incidence or normal ranges, representative samples are required. Etiologic studies investigate mechanisms that occur across populations. For these studies heterogeneous population samples are required so that the range of values for an exposure is available to the analysis. Also required is the non-differential ascertainment of incident outcomes. GxE studies are not designed to describe specific populations. As such, response rates affect cost rather than bias. Similarly, remote methods are generally suited not to descriptive studies but to testing etiologic hypotheses. Our study has demonstrated, in terms of age, sex and deprivation, that heterogeneity can be achieved using remote methods.
Heterogeneity in this study was achieved by dint of numbers rather than by a systematic method, such as random sampling. It is unlikely that the heterogeneity available to the analysis would have been materially affected had we used a different recruitment method, such as a media campaign, provided we recruited sufficient numbers.
Rather than requiring all studies to be representative, a preferred strategy is to identify mechanisms using etiologic studies and then apply that knowledge to specific populations. Clarifying and separating these goals enables more efficient study design, as etiologic studies may be conducted without the unnecessary burden (and cost) of having to achieve representativeness, and descriptive studies may be conducted without the unnecessary burden (and cost) of having to achieve large sample size. By separating these goals each design can be prosecuted more efficiently.
A further issue is the validity and reliability of web-based assessment. Evidence largely supports comparability between measurement media for questionnaires. 
We found very little difference in the distributions of several self-evaluation questionnaires between web-based and clinic-based methods. Less is known in relation to cognitive testing. In part, this is due to the difficulties of cognitive testing in both face-to-face and remote contexts and the wide range of cognitive tests in use. In this study a cognitive battery, designed specifically for epidemiologic use, was compared between web-based and clinic-based assessment. The distributions were closely similar between measurement contexts. Although these between-cohort comparisons are indirect (neither randomised nor repeat measurement), they are the best comparisons available.
Requesting a bio-sample appeared to have only a small effect on participation. It appears that if the rationale for the study is persuasive, the donation of genetic material is not problematic. Furthermore, although the donation of dried blood was not a painless exercise, this also was widely acceptable. The actual donation rates (70–75% according to sample), although useful for planning purposes, are likely to be conservative due to the passage of several months between most participants joining the study and being mailed the sampling kit.
Etiologic studies also require non-differential ascertainment of outcomes. In practice, this means very high follow-up rates. Here the principal follow-up method was by record linkage. The high level of linkage achieved may not be surprising given that the initial invitations were based on the National Health Service Administrative Register database. However, although follow-up by electronic linkage may virtually eliminate attrition, follow-up by re-contact is required for many hypotheses, e.g. those involving outcomes that are inadequately measured in routine records, such as common mental disorder, or those involving change over time, such as cognitive decline.
This study cost around £100 per participant. This mostly involved IT development costs, reflecting the front-loaded cost structure of remote studies compared with traditional methods. For larger studies, assuming that subsequent costs are due largely to mailing and bio-sampling, we crudely estimate, on the basis of a 10% response rate and a 70% bio-sample donation rate, that recruitment and bio-sampling for a GxE study of 50,000 would cost around £15 per participant over an 18-month period. These per capita costs would be reduced if the response and donation rates were improved or if the study size was increased. Costs would likely be reduced further if recruitment was achieved without an initial mailing, i.e. through a media campaign, although this is conjectural. By comparison, costs using traditional methods would be additional to those estimated here. Currently, initial contact costs would be similar, as response rates between this study and UK Biobank were comparable. However, with time, as internet connectivity becomes more prevalent, response rates from mailed invitations for internet studies are likely to increase, thus reducing contact costs. Additional costs for traditionally conducted studies would include premises hire (for assessment and bio-sampling) and staff (clinicians for bio-sampling as well as technicians for assessment and for sample preparation). In relation to bio-sampling, depending upon the sample, remote methods will usually generate lower costs. A mailed dry-blood sample, for example, requires less processing and storage space than a wet sample, which requires venepuncture and processing consumables, transport to a repository, and long-term very low temperature storage. Although we are not in a position to put numbers to these cost headings, the additional costs are substantial.
Finally, linkage follow-up costs will be closely similar between remote and traditional methods, depending upon the quality of personally identifying information available.
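The way in which per capita cost depends on response rate, donation rate and study size can be illustrated with a simple model. The sketch below is not the costing used in this study: the function name, the unit costs (£0.50 per mailed invitation, £5 per sampling kit) and the fixed cost are hypothetical values chosen purely for illustration, and we assume that the target of 50,000 refers to participants who return a bio-sample.

```python
def cost_per_participant(target, response_rate, donation_rate,
                         mailing_cost, kit_cost, fixed_cost=0.0):
    """Crude per-participant cost of recruitment plus bio-sampling.

    target        -- participants returning a bio-sample (assumed reading)
    response_rate -- proportion of mailed invitees who join the study
    donation_rate -- proportion of those sent a kit who return a sample
    mailing_cost  -- hypothetical cost per mailed invitation (GBP)
    kit_cost      -- hypothetical cost per sampling kit sent (GBP)
    fixed_cost    -- hypothetical fixed cost, e.g. IT development (GBP)
    """
    if not (0 < response_rate <= 1 and 0 < donation_rate <= 1):
        raise ValueError("rates must lie in the interval (0, 1]")
    kits_sent = target / donation_rate        # recruits who must be sent a kit
    invitations = kits_sent / response_rate   # invitations needed to recruit them
    total = fixed_cost + invitations * mailing_cost + kits_sent * kit_cost
    return total / target


# Rates from the text; unit costs are hypothetical placeholders.
base = cost_per_participant(50_000, 0.10, 0.70, mailing_cost=0.5, kit_cost=5.0)
better_response = cost_per_participant(50_000, 0.20, 0.70, 0.5, 5.0)
print(f"Base case:       GBP {base:.2f} per participant")
print(f"Higher response: GBP {better_response:.2f} per participant")
```

Under this model, variable costs per participant scale inversely with the response and donation rates, while any fixed cost is spread over the whole sample, so that per capita cost falls as the study size increases.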
Many limitations remain to be overcome before remote methods are as clearly understood and accepted as face-to-face methods. Although we have achieved linkage we have not downloaded data, but the system proposed for this is currently being used in a large e-cohort of births in Wales. 
We have not tested our ability to re-measure participants. The incentivisation of participants for re-measurement is a critical issue for web-based cohorts. Options available include the provision of feedback at the individual, as well as study, level. A further limitation is the quality of available bio-sample. We have shown that donation of either buccal cells or dry blood is feasible. In addition, other pilot work (not shown) has demonstrated that the remote donation of saliva for genetic determination is also feasible. Any of these methods is adequate for the retrieval of DNA, allowing genotyping and the use of genotypes as instrumental variables in Mendelian randomisation studies. However, dry blood may also be used for an increasingly broad range of assays. Although dry blood may not currently be suitable for cutting-edge molecular biology, it is suitable for assessing many established risk factors and so is informative in GxE studies. A particular limitation of remote studies is objective measurement. Although this has been largely solved for cognitive performance, remote methods and devices for assessing anthropometry, physical activity and other aetiologically important risk factors need to be developed before remote methods will have a broad-based epidemiologic impact.
The Way Forward
GxE studies offer the prospect of robust causal inference through both gene identification and instrumental variable approaches. 
As such they are a major and much needed epidemiologic development. The value of remote methods is increasingly recognised and they are being adopted in a variety of epidemiologic contexts. 
We have shown that, even in their infancy, remote methods can be extended to GxE studies as a cost-effective alternative to traditional approaches. While acknowledging the need for further methodological refinement, we expect this efficiency to improve as the field matures. We have also shown evidence that, over a range of psychological and cognitive assessments, the data are comparable with those collected face-to-face. We expect the range of remotely assessed measures to increase, particularly with the development of small objective measurement devices such as accelerometers, with remote measures being preferable in many areas. By these means, even in an age of fiscal restraint, remote methods provide an opportunity to increase the capacity for GxE studies, offering the prospect of GxE studies going beyond broad-based investigations of chronic disease to more finely niched investigations focussing on more refined outcomes in more closely defined population strata.
In an age of increasingly diverse public expectation, a growing desire for robust inference, and ubiquitous information technology, a burgeoning of remotely conducted GxE studies is not an unrealistic expectation.