Patients are commonly presented with complex documents that they have difficulty understanding. The objective of this study was to design and evaluate an animated computer agent to explain research consent forms to potential research participants.
Subjects were invited to participate in a simulated consent process for a study involving a genetic repository. Explanation of the research consent form by the computer agent was compared to explanation by a human and a self-study condition in a randomized trial. Responses were compared according to level of health literacy.
Participants were most satisfied with the consent process and most likely to sign the consent form when it was explained by the computer agent, regardless of health literacy level. Participants with adequate health literacy demonstrated the highest level of comprehension with the computer agent-based explanation compared to the other two conditions. However, participants with limited health literacy showed poor comprehension levels in all three conditions. Participants with limited health literacy reported several reasons, such as lack of time constraints, ability to re-ask questions, and lack of bias, for preferring the computer agent-based explanation over a human-based one.
Animated computer agents can perform as well as or better than humans in the administration of informed consent.
Animated computer agents represent a viable method for explaining health documents to patients.
Face-to-face encounters with a health provider, in conjunction with written instructions, remain one of the best methods for communicating health information to patients in general, but especially those with low health literacy [1–4]. Face-to-face consultation is effective because providers can use verbal and nonverbal behaviors, such as head nods, hand gestures, eye gaze cues and facial displays, to communicate factual information to patients, as well as to communicate empathy and immediacy to elicit patient trust. Face-to-face conversation also allows providers to make their communication more explicitly interactive by asking patients to do, write, say, or show something that demonstrates their agreement and understanding. Finally, face-to-face interaction allows providers to dynamically assess a patient’s level of understanding based on the patient’s verbal and nonverbal behavior and to repeat or elaborate information as necessary.
However, several pervasive problems limit clinicians’ capacity to communicate effectively. Providers can only spend a limited amount of time with each patient. Time pressures can result in patients feeling too intimidated to ask questions. Another problem is that of “fidelity”: providers do not always perform in accordance with recommended guidelines, resulting in significant variation in the delivery of health information.
Given the efficacy of face-to-face consultation, a promising approach for conveying health information to patients with limited health literacy is the use of computer animated agents that simulate face-to-face conversation with a provider. The benefits of using conversational agents include:
According to the 2004 National Assessment of Adult Literacy, fully 36% of American adults have limited health literacy skills, with even higher rates of prevalence among patients with chronic diseases, those who are older, minorities, and those who have lower levels of education [13, 14]. Seminal reports about the problem of health literacy include a sharp critique of current norms for overly complex documents in health care such as informed consent [15, 16]. Indeed, a significant and growing body of research has brought attention to the ethical and health impact of overly complex documents in healthcare [17, 18]. Computer agents may provide a particularly effective solution for addressing this problem, by having the agents describe health documents to patients using exemplary communication techniques for patients with limited health literacy and by providing this information in a context unconstrained by time pressures.
Informed consent agreements for participation in medical research are particularly challenging for individuals with limited health literacy to understand, since they typically encode many subtle and counter-intuitive legal and medical concepts. They are often written at a reading level that is far beyond the capacity of most subjects. Researchers may not have the resources to ensure that participants understand all the terms of the consent agreement. Indeed, many potential research subjects sign consent forms that they do not understand [21–23].
Consequently, we modified an existing computer agent framework designed for health counseling to provide explanation of health documents such as research informed consent forms. In this paper, we describe the development of this agent and then present a preliminary evaluation of the computer agent in a three-arm randomized trial in which the agent explains an informed consent document for participation in a genetic repository.
We conducted two empirical studies to characterize how human experts explain health documents to their clients in face-to-face interactions. The first study was conducted with four different experts explaining two different health documents to research confederates. The second study was conducted with one expert explaining health documents to three laypersons with different levels of health literacy. Our primary focus was a micro-analysis of the nonverbal behavior exhibited by the expert in order to inform the development of a computational model of document explanation. We found that one kind of nonverbal behavior was nearly ubiquitous: the use of pointing gestures towards the document by the expert (Figure 1). Of the 1,994 expert utterances analyzed, 26% were accompanied by a hand gesture, and 90% of these involved pointing at the document.
We derived a predictive model of the occurrence and form of referential hand gestures and other nonverbal behavior used by the experts during their explanations. We found that initial mentions of part of a document were more likely to be accompanied by a pointing gesture (43% vs. 19%) and that the kind of document object referred to (page vs. section vs. word or image) was predictive of the kind of hand gesture used (e.g., using a flat hand to refer to a page vs. pointing with a finger to refer to a word). We also found that the expert in the second study omitted a significant amount of detail and used more scaffolding (description of document structure) when describing a health document to listeners with low health literacy, compared to listeners with adequate health literacy.
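As an illustration only, the two findings above can be sketched as a small rule-based selector. Everything in this sketch is an assumption made for exposition: the function and constant names are hypothetical, the observed rates (43% and 19%) are simply reused as sampling probabilities, and the hand shapes for sections and images are guesses, since only the page and word cases are described in the text. It is not the model derived from the data.

```python
import random

# Pointing rates reported above: first mention of a document part vs. later mentions.
P_POINT_FIRST_MENTION = 0.43
P_POINT_LATER_MENTION = 0.19

# Hand shape by kind of document object. Only "page" and "word" come from the
# text; the entries for "section" and "image" are assumptions for illustration.
HAND_SHAPE = {
    "page": "flat_hand",
    "section": "flat_hand",    # assumption
    "word": "index_finger",
    "image": "index_finger",   # assumption
}


def choose_gesture(object_kind, first_mention, rng=random):
    """Return a hand shape for this utterance, or None if no pointing gesture is made."""
    p = P_POINT_FIRST_MENTION if first_mention else P_POINT_LATER_MENTION
    if rng.random() >= p:
        return None  # most utterances are not accompanied by a pointing gesture
    return HAND_SHAPE[object_kind]
```

For example, `choose_gesture("word", first_mention=True)` returns `"index_finger"` on roughly 43% of calls and `None` otherwise.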
An existing computer agent framework designed for health counseling was modified to provide explanation of health documents. The framework features an animated computer agent whose nonverbal behavior is synchronized with a text-to-speech engine (Figure 2). Patient contributions to the conversation are made via a touch screen selection from a multiple choice menu of utterance options, updated at each turn of the conversation. Dialogues are scripted using a custom hierarchical transition network-based scripting language. Agent utterances can be dynamically tailored based on information about the patient, information from previous conversations, and the unfolding discourse context. The animated agent has a range of nonverbal behaviors that it can use, including: hand gestures, body posture shifts, gazing at and away from the patient, raising and lowering eyebrows, head nods, different facial expressions, and variable proximity.
The framework was extended for document explanation in several ways. A set of animation system commands was added to allow document pages to be displayed by the character (Figure 2), with page changes accompanied by a page-turning sound. A set of document pointing gestures was added so that the agent could be commanded to point anywhere in the document with either a pointing hand or an open hand. While the document is displayed, the agent can continue using its full range of head and facial behavior, with gaze-aways modified so that the agent looks at the document when not looking at the patient (in our studies of human experts, the expert gazed at the document 65% of the time and at the patient 30% of the time). We also extended our text-to-embodied-speech translation system (“BEAT”) to automatically generate document pointing gestures given the verbal content of the document explanation script, based on models from our earlier studies.
We conducted a preliminary study of the document explanation agent in a three-arm randomized trial with 18 participants aged 19–33, in which each participant experienced two of the three conditions. We compared agent-based health document explanation with explanation by human experts and a self-study condition. While there were no significant effects of study condition on comprehension of the documents (measured by post-intervention knowledge tests), the participants who interacted with both the agent and human were significantly more satisfied with the agent (paired t(5)=2.7, p<.05) and with the overall experience (paired t(5)=2.9, p<.05), compared to the human.
While the preliminary study provided feedback on the promise of using agents for health document explanation, it lacked ecological validity because the participants were primarily college students who had a fairly high level of health literacy. Thus, the primary purpose of the third research activity was to repeat the pilot evaluation with a population in which limited health literacy is represented.
We conducted an evaluation study to test the efficacy of our agent-based document explanation system, compared with a standard of care control (explanation by a human) and a non-intervention control (self-study of the document in question) for individuals with adequate and inadequate health literacy. The study was a 3-arm (COMPUTER AGENT vs. HUMAN vs. SELF) between-subjects randomized experimental design.
An interaction script was created to present an informed consent form for participation in a genetic repository, based on the preliminary work described above. The consent document used was taken with minor revisions from an existing National Institute of General Medical Sciences template for genetic repository research that has been used in multiple NIH-funded projects. The example of a study involving a genetic repository was chosen because we wanted little overlap between the simulated consent experience and the actual consent document used for participating in the current study, and we wanted material that would be largely foreign to participants to decrease the influence of prior knowledge. In each script, patients could simply advance linearly through the explanation (by selecting “OK”), ask for any utterance to be repeated (“Could you repeat that please?”), request major sections of the explanation to be repeated, or request that the entire explanation be repeated. Any number of repeats could be requested and, although the scripting language has the ability to encode rephrasings when an utterance is repeated, for the current study the agent would repeat the exact same utterance when a repeat was requested for any part of the script. The agent was deployed on a mobile cart with a touch screen attached via an articulated arm.
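The navigation options described above (advance, repeat an utterance, repeat a section, repeat the entire explanation) can be pictured with the following minimal sketch. It is a simplification under stated assumptions: the script is flattened into a list rather than a hierarchical transition network, and the `Utterance` type, the `next_index` function, and the menu labels are hypothetical names introduced here, not part of the actual scripting language.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Utterance:
    section: str  # name of the major section this utterance belongs to
    text: str     # what the agent says


def next_index(script, current, choice):
    """Return the index of the utterance the agent should speak next.

    `choice` is one of the hypothetical menu labels "ok", "repeat",
    "repeat_section", or "repeat_all". Returns None when the script is finished.
    """
    if choice == "ok":
        nxt = current + 1
        return nxt if nxt < len(script) else None
    if choice == "repeat":
        return current  # the exact same utterance is spoken again
    if choice == "repeat_section":
        section = script[current].section
        i = current
        # Walk back to the first utterance of the current section.
        while i > 0 and script[i - 1].section == section:
            i -= 1
        return i
    if choice == "repeat_all":
        return 0
    raise ValueError(f"unknown choice: {choice!r}")
```

Because a repeat returns the same index or an earlier one, any number of repeats can be requested without additional state, which matches the unlimited-repeat behavior described above.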
In addition to basic demographics, we assessed health literacy using the 66-word version of the Rapid Estimate of Adult Literacy in Medicine (REALM). We defined limited health literacy as a reading level of 8th grade and below and adequate health literacy as 9th grade and above for our analyses, as prior authors have done [28–31]. We also created a knowledge test for the consent document, based on the Brief Informed Consent Evaluation Protocol (BICEP). This test was administered in an “open book” fashion with the participant able to refer to a paper copy of the consent form during the test. We augmented the BICEP with scale measures of likelihood to sign the consent document, overall satisfaction with the consent process, and perceived pressure to sign the consent document. In the COMPUTER AGENT and HUMAN conditions, the number of questions or requests for clarifications asked by participants during the explanation of the consent document was also counted.
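The literacy classification described above can be sketched as follows. The raw-score cutoffs are the commonly published scoring bands for the 66-word REALM and are not stated in this paper, so they should be read as an assumption; the function names are hypothetical.

```python
def realm_grade_band(raw_score):
    """Map a 66-word REALM raw score (number of words read correctly) to a grade band.

    Cutoffs follow the commonly published REALM scoring bands (an assumption here).
    """
    if not 0 <= raw_score <= 66:
        raise ValueError("REALM raw score must be between 0 and 66")
    if raw_score <= 18:
        return "3rd grade or below"
    if raw_score <= 44:
        return "4th-6th grade"
    if raw_score <= 60:
        return "7th-8th grade"
    return "high school"


def health_literacy(raw_score):
    """Apply the study's definition: 8th grade and below is limited, 9th and above is adequate."""
    return "adequate" if realm_grade_band(raw_score) == "high school" else "limited"
```

Under these cutoffs, a participant who reads 52 of the 66 words correctly falls in the 7th–8th grade band and is classified as having limited health literacy.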
Twenty-nine subjects participated in the study. They were recruited via fliers posted around the Northeastern University neighborhood and in a nearby apartment complex whose residents were mostly older minority adults, and were compensated for their time. Participants had to be 18 years of age or older and able to speak English. Participants were 66% female, aged 28–91 (mean 60.2). On the REALM, three were categorized as reading at the 3rd grade level or below, four at the 4th–6th grade level, six at the 7th–8th grade level, and the rest at the high school level.
The study took place either in a common room of the apartment complex or the Human-Computer Interaction laboratory at Northeastern University. After arriving, people who consented to participate filled out a demographic questionnaire and then had the REALM health literacy evaluation administered.
Following this, they were exposed to one of three treatments in which a consent document was explained to them by either the COMPUTER AGENT or a HUMAN, or were given time to read the document on their own (SELF). For the COMPUTER AGENT condition, they were given a brief training session on how to interact with the computer agent. The experimenter then gave the participant a paper copy of the consent document so they could follow along with the computer agent’s explanation, and left the room. At the end of the interaction, the computer agent informed the participant that they could take as much time as they needed to review the document before signaling to the experimenter that they were ready to continue. For the HUMAN condition, a second research assistant explained the document to the study participant. Two different female instructors played this second role, and both had significant experience administering informed consent for research studies. The instructor was blind to the computer agent interaction script content and evaluation instruments, and was simply asked to explain the document in question to the participant. For the SELF condition, participants were handed the document and told to take as much time as they needed to read and understand it, and were then left alone in the observation room until they signaled they were ready to continue.
The research assistant then verbally administered the knowledge test in “open book” format, with the participant being able to reference their paper copy during the test. The process measures were then verbally administered and a semi-structured interview was conducted to ask participants about their impressions of the study.
Of the 29 participants, 13 (45%) had inadequate health literacy. We conducted full-factorial ANOVAs for all measures, with study CONDITION (COMPUTER AGENT, HUMAN, SELF) and health LITERACY (ADEQUATE, INADEQUATE) as independent factors, and LSD post-hoc tests when applicable. Table 1 shows descriptive statistics for the outcome measures.
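As a reading aid for the F statistics that follow, the degrees of freedom of a full-factorial 3 × 2 between-subjects ANOVA follow directly from the design. The worked values below assume that all 29 participants contribute to a given measure, which is an assumption rather than something stated for each outcome:

```latex
\begin{align*}
df_{\text{CONDITION}} &= 3 - 1 = 2 \\
df_{\text{LITERACY}} &= 2 - 1 = 1 \\
df_{\text{CONDITION} \times \text{LITERACY}} &= (3 - 1)(2 - 1) = 2 \\
df_{\text{error}} &= N - (3 \times 2) = 29 - 6 = 23
\end{align*}
```

Under that assumption, both the main effect of CONDITION and the CONDITION × LITERACY interaction are tested as F(2,23).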
There was a significant interaction between CONDITION and LITERACY on knowledge test comprehension scores, F(2,23)=4.41, p<.05 (Figure 3). Post-hoc tests indicated that, for participants with adequate health literacy, explanations by HUMAN and COMPUTER AGENT resulted in significantly greater comprehension compared to SELF study (with no significant difference between HUMAN and COMPUTER AGENT). However, for participants with inadequate health literacy, there were no significant differences on comprehension between study conditions, and they scored significantly lower as a group compared to participants with adequate health literacy.
There was a main effect of study CONDITION on satisfaction with the consent process, F(2,23)=4.78, p<.05, with participants being significantly more satisfied with explanations by the COMPUTER AGENT compared to the HUMAN (participants were also more satisfied with the COMPUTER AGENT compared to SELF study, with post-hoc tests approaching significance, p=.09).
There was also a main effect of study CONDITION on self-reported likelihood to sign the consent document, F(2,23)=5.46, p<.05, with participants significantly more likely to sign the consent form following explanation by the COMPUTER AGENT, compared to either explanation by the HUMAN or SELF study.
There were no significant differences between groups on perceived PRESSURE to sign the consent form, F(2,23)=0.20, p=0.72.
Finally, it appeared that participants with limited health literacy asked more questions of the computer agent compared to the human, while those with adequate health literacy asked more questions of the human, although this interaction was not significant, F(1,13)=1.76, p=0.21 (Figure 4).
Participant responses to semi-structured interview questions were transcribed from the videotape and common themes were identified.
When asked about their impressions of the computer agent, the most frequently mentioned theme (7 participants) was that the computer agent was clear, direct and easy to understand. One participant explicitly said that this clarity was due to the computer agent’s ability to point at the virtual document, with the participant following along:
“She was very direct and very clear when she was explaining it, she was explaining it very nice and slow. And she was pointing to the areas that needed to be focused on. When she was explaining it, she was breaking it down on the paper. Where you couldn’t get lost if you were concentrating on what she was saying. Because it was right there in front of you [points at computer] and it’s like right here [points on paper document], and it’s just she was explaining the whole thing. And I was very comfortable with it because as I was reading it, I understood what she was saying and what I was seeing in front of me.” (49 year old female, adequate literacy)
The second most common impression of the computer agent (4 mentions) was that participants felt they could take as much time as they needed, and did not feel embarrassed asking the computer agent to repeat itself:
“Elizabeth [the name of the agent] was very, uh, patient, and if she says something to you that you don’t understand, she will repeat it again if you push the button. And she would take her time.” (68 year old female, limited literacy).
“For me, you know, when it’s on the computer I can do it five times over if I want to. I can just hit repeat, wait I didn’t understand it, I can just repeat it again. You know, but I wouldn’t do that with you [a human] because if I didn’t understand it I might ask you one time to repeat it, and if I still didn’t understand it I wouldn’t ask you to repeat it. Because I wouldn’t want to seem stupid.” (47 year old female, adequate literacy).
Two participants said that they liked the computer agent because she was polite and did not talk down to them:
“She was really polite, she was really polite. That I liked. Besides the fact, more important than anything else, she looked at me and she talked to me. You know, she was talking to me as a person, as opposed to, um, looking down on me and saying ‘did you understand me!’ you know? And that made me feel really good.” (50 year old female, adequate literacy)
Other positive comments included that the computer agent was “informative” and “correct” (2 mentions), that the computer agent was “honest” (1 mention), and that the respondents liked using the touch screen instead of a mouse (1 mention).
There were two negative comments about the computer agent. One participant mentioned that the computer agent seemed “impersonal”, and another felt the computer agent was too “robotic”.
When asked whether they would prefer that health documents be explained to them by a person or a computer agent, 3 of the 9 participants who interacted with the agent said they would prefer the agent, 1 said they would prefer a human, and 1 said that either would be equally acceptable (the others did not respond). For example:
“I think she did the same as talking to an ordinary person in the hospital. Uh, except she would give you a little more information than they do. Because sometimes they only tell you a little bit, you know what I’m saying, and she explain the whole thing.” (68 year old female, limited literacy).
The computer agent did as well as or better than the human on all measures, with participants (regardless of literacy level) reporting higher levels of satisfaction with the consent process and greater likelihood to sign the consent document when it was explained by the computer agent, compared to either explanation by a human or self study. In addition, explanation by the computer agent led to the greatest comprehension of the document, but only for those participants with adequate levels of health literacy; participants with limited literacy scored poorly on comprehension in all treatment conditions.
The tendency for participants with inadequate health literacy to ask more questions of the agent may be due to their being comfortable asking a computer repeated questions without feeling “stupid” (as one participant put it). However, an alternate hypothesis is that they asked more questions of the agent because they had a more difficult time understanding it.
The low comprehension scores for participants with inadequate health literacy indicate that much work remains to make the computer agent effective for this population. One pedagogical methodology espoused for patients with limited health literacy is “teach back”, in which the patient is asked to teach what they have learned back to the health educator. While there are some problems implementing this in an unconstrained way within our system, it is at least possible to add comprehension checks at key places in the agent-patient conversation and to have the computer agent provide additional information or review if it appears the patient is having problems.
Limitations of our study include the generalizability of our findings, especially given the very small convenience sample used. The research assistants who explained the consent forms to participants may not be representative of most researchers who perform this function. There are also ecological validity issues with our study settings, although we would expect that in a rushed clinical environment the agent may outperform a typical research assistant by an even wider margin than we observed.
Our future work is focused on several extensions to the system and more extensive evaluation. We plan to add audio prompts to the user interface so that patients who are unable to read the text of their conversational responses can still use the system. We are also developing a framework that will allow health document templates to be instantiated and explained, so that, for example, consent form “boilerplates” can be instantiated with the details of a research study, and the computer agent would be able to explain the document to a patient without further scripting or programming. We have also developed the capability for the agent to keep track of specific issues and questions that it could not resolve for the patient, and output these at the end of the session for follow up by a human research assistant or clinician. We also plan to explore the integration of the conversational agent with other multimedia content, such as video clips, to further explain complex topics such as randomization, or numerical concepts like rates, which can be hard to convey verbally. Finally, we plan to replicate the evaluation study in a clinic or hospital environment, where we would expect that the advantages of the computer agent-based approach would be even greater given the time pressures that most human providers are under.
This work suggests that animated computer agents can perform as well as people in explaining health documents to patients. For the administration of informed consent in particular, it is possible to construct computer agents that yield at least as much understanding of the consent form, satisfaction with the process, and study participation as administration of informed consent by human research assistants. Time and cost savings for research studies or medical procedures requiring informed consent could be significant when large numbers of patients are involved. The use of this technology may also lead to more ethical treatment of patients through a more controlled administration of informed consent and automated comprehension tests.
Role of Funding
This work was supported by a grant from the NIH National Heart Lung and Blood Institute (HL081307-01). The sponsors had no involvement in the study design, collection, analysis and interpretation of data, in the writing of the report, or in the decision to submit the paper for publication.
Thanks to Lindsey Hollister and Maggie McElduff for their assistance in conducting the study.
Conflict of interest
The authors have no conflicts of interest that could influence this work.
We confirm all patient/personal identifiers have been removed or disguised so the patient/person(s) described are not identifiable and cannot be identified through the details of the story.
Timothy W. Bickmore, Northeastern University College of Computer and Information Science, Boston, MA, USA.
Laura M. Pfeifer, Northeastern University College of Computer and Information Science, Boston, MA, USA.
Michael K. Paasche-Orlow, Boston University School of Medicine, Boston, MA, USA.