The main conclusion of our study is that humans tend to respond realistically at subjective, physiological, and behavioural levels in interaction with virtual characters notwithstanding their cognitive certainty that the characters are not real. The specific conclusion of this study is that, within the context of the particular experimental conditions described, participants became stressed as a result of giving ‘electric shocks’ to the virtual Learner. It could even be said that many showed care for the well-being of the virtual Learner – demonstrated, for example, by their delay in administering the shocks after her failure to answer towards the end of the experiment. To some extent this was to be expected on the basis of previous evidence. In fact, it has even been taken for granted that virtual humans can substitute for real humans when studying the responses of people to a social situation. For example, this was the strategy used in an fMRI study in which participants passively observed virtual characters gazing at the participants themselves or at other virtual characters. However, no previous experiments have studied what might happen when participants have to actively engage in behaviours that would have consequences for the virtual humans. The evidence of our experiments suggests that presence is maintained and that people do tend to respond to the situation as if it were real. We review the evidence for this in subsequent paragraphs.
First, several participants withdrew from the experiment before termination. We have been conducting experimental studies with virtual environments since the early 1990s, with altogether hundreds of participants. Ethical rules require us to inform the participants that they may withdraw from the experiment at any time without giving reasons. Nevertheless, withdrawal is extremely rare, and has only previously occurred due to simulator sickness with no more than about 5 participants out of all the hundreds. Second, there were physiological responses that indicated stress (the SCL, SCR and ECG analysis). There were differential responses within groups (comparing the baseline to the learning session) and between groups (comparing those in the VC with those in the HC). Third, subjectively reported physiological symptoms also differed between groups. Finally, there were clear behavioural differences between the HC and the VC regarding responses to a failure of the Learner to reply to the questions. All these factors, together with the non-quantifiable participant behaviour observed by the experimenters, show a pattern of responses similar to those found in the original Milgram studies, although at lesser intensity.
In the original studies by Milgram it was found that the smaller the ‘distance’ between the Learner and the Teacher the more likely that the Teacher would refuse to give the higher level of shocks. For example, at one extreme the Learner was hidden as in the case of our HC, although unlike in our condition he protested by banging on the wall. At another extreme the subjects had to force the Learner's hand onto the shock machine in order to administer the shock. A similar result regarding ‘distance’ was found here, comparing the responses of the HC with the VC. However, it must also be said that the objections of the virtual Learner were much less extreme and violent than those of Milgram's actor. The virtual Learner complained and even screamed, but there was none of the banging and shouting and protestations of a heart condition expressed by the original actor. One of our participants, for example, reported that although he was affected by the protestations of the virtual Learner, he wasn't too upset, because she didn't protest enough, did not for example scream at and insult him nor writhe in agony in the chair.
Our study leaves open many avenues of further research. We carried out this experiment using two conditions that are far apart. However, we do not know what would have happened if the virtual Learner in the HC had issued protests through text. Neither do we know whether simply the voice of the virtual Learner would have been sufficient to provoke the responses, nor what would have happened if the protests of the Learner had been extremely violent. During our pilot studies we did try a condition with three participants where the Learner was seen but did not show any signs of discomfort and did not protest. One of those participants claimed to see signs of discomfort in the behaviour of the Learner (even though none had been programmed), and said that he felt uncomfortable continuing with the experiment. It is possible that very minimal cues are sufficient to provoke the stress responses in some people.
This issue of minimal cues is important in another sense. Our virtual Learner could never be confused with a real human. Her visual representation was not realistic, and her behaviours were only as realistic as could be programmed with the resources available to us (see, for example, Movie S1). Nevertheless, there were evidently strong responses to her. How is this possible? It has been pointed out before that the phenomenon of presence in virtual environments is an important research question in its own right, closely related to the question of consciousness. People tend to respond to virtual environments as if the objects and events depicted are real, in spite of low fidelity representations and certain knowledge that the events taking place are within a virtual reality. However, the perceptual and neural mechanisms that underlie this are largely unexplored.
The line of research opened up by Milgram stopped forty years ago due to ethical concerns, despite the tremendous importance of this work in the understanding of human behaviour. It has been argued before that immersive virtual environments can provide a useful tool for social psychology. Our results reinforce this argument and show that virtual environments can provide an alternative methodology for pursuing laboratory-based experimental research even in this type of extreme social situation. For example, in future experiments within the Milgram obedience paradigm we plan to make the experimenter a virtual character, thus allowing manipulations of the type of person that the experimenter represents (for example, personality type, clothing, and so on) and also supporting a greater degree of conflict between the demands of the experimenter and the protests of the Learner than is possible when the experimenter is a real person.
The argument regarding the utility of virtual environments applies not simply to obedience research but to all social and psychological research where, for ethical or safety reasons, it is not possible to immerse experimental participants into the actual phenomena to be studied. For example, one of the motivations for our Milgram study was a longer term goal to explore ‘bystander behaviour’ in street violence. There is a well-known result in social psychology that counter-intuitively predicts, amongst other things, that the greater the size of a crowd that is watching street violence, the less likely it is that anyone will attempt to intervene to stop it. This is a vital area of current social-psychological research given the current level of perceived crime in urban areas – yet in order to study this researchers are forced at best to use videos that require people to judge likely responses to such situations, and the same techniques have been used in the Milgram obedience paradigm. Milgram's own results clearly show that taking people's opinions about their own or others' behaviours in such circumstances at face value is far from reliable. We suggest that immersive virtual environments provide an alternative way forward in this area of research.
Speculations on Obedience in Virtual Reality
Although, as stated in the opening paragraphs, we did not set out to study obedience in this experiment, it is nevertheless interesting to speculate to what extent the results throw light on this issue. The first point to note is that the problem of major deception that arose in the original experiments by Milgram was avoided here – since every participant knew for sure that the Learner was a virtual character, and therefore no one could believe that they were inflicting pain on anyone else. We refer to this as the explicit knowledge of the participants that they were not harming anyone.
Consider the actual experience of the participants, however. They arrived at the laboratory and were asked to complete various questionnaires. The experimenters were very serious, one introduced as a Professor. The instructions were given to them in written form and again read out loud by the experimenter. For example, they were told: “Thank you for taking part in this experiment. As part of our research program a virtual character has learned a set of word-pair associations. The learning is sometimes not exact, but we are testing a reinforcement learning procedure, to see if the infliction of discomfort motivates her, the virtual character, to remember the word-pair associations better.” The Learner had a quite realistic face, with eye movements and facial expressions; she visibly breathed, spoke, and appeared to respond with pain to the ‘electric shocks’. Not only that but she seemed to be aware of the presence of the participant by gazing at him or her, and also of the experimenter - even answering him back at one point (“I don't want to continue – don't listen to him!”). Finally, of course, the electric shocks and resulting expressions of discomfort were clearly caused by the actions of the participants.
The participants were therefore put into a situation where everything conspired to give the impression that this was a serious matter. In keeping with this, not a single participant queried the statement about the ‘infliction of discomfort’ motivating the virtual character to ‘remember the word-pair associations better’ even though this is not rational.
Therefore we would argue that in spite of their explicit knowledge that they were not actually causing pain to any real person, the situation established for the participants an implicit knowledge that their actions were causing distress to an animated entity (and one that resembled a human being). For most participants this caused increasing discomfort as witnessed by their physiological responses and later comments during the post-experimental interviews, and this discomfort was higher for those who saw and heard the Learner (VC) compared with those who only interacted with her through text (HC).
The majority of participants followed the experimental instructions to the end, though a number of those in the VC withdrew without completing all the shocks. Can this compliance be construed as ‘obedience’? It could be argued that rather than obedience this was a matter of participants being willing to put up with their own discomfort for the sake of honouring their agreement to be a participant in the experiment. Similar arguments have been made in relation to the original experiments by Milgram – for example, that his subjects were not necessarily being obedient, but were deferring to the expert scientific authority; in other words, since the behaviour of the experimenter indicated that nothing out of the ordinary was happening, this signalled to the subjects that everything must be going according to plan.
We argue that whether participants complied because of ‘obedience to authority’, politeness, or respect for expertise does not really matter. The fact is that they continued to carry out a task that they found to be unpleasant, when there was no reason for them to do so. Unlike the situation in, for example, the military, there were no real negative consequences that would follow from withdrawal – indeed participants had been advised that they were free to withdraw at any time without giving reasons. Hence, our experiment shows that it is possible to set up a situation in virtual reality where people will comply with requests to follow instructions that appear to cause pain to another entity, thus causing discomfort to themselves. Explicitly they know that there is no pain, but it may be that the totality of their perceptions in that situation results in an implicit knowledge that their actions are indeed causing another entity to suffer. This idea fits with the evidence that participants in the VC tended to wait a relatively long time before giving the shocks after the Learner had stopped responding. From the point of view of their explicit knowledge waiting made no sense, but it did make sense at the implicit level.
Although this particular experiment did not address Milgram's hypothesis about destructive obedience (in particular, many of the variations on the basic experiment that Milgram carried out were not addressed here), our conclusion is that virtual reality could be used successfully for this purpose. However, it is important to bear in mind the limitations inherent in the distinction between the explicit knowledge that the situation is fake, and the implicit knowledge that is embedded in the virtual reality portrayal. As one of our participants noted, she had to keep reminding herself that this was a virtual reality and that no one real was being hurt. The actual conditions of Milgram's experiments can, of course, never be exactly replicated in virtual reality, since the participants will always know that the situation is unreal; and if virtual reality eventually became so indistinguishable from reality that the participants could not readily discriminate between the two, then the ethics issue would arise again.