In a liberal democracy, policy decisions regarding ethical controversies, including those in research ethics, should incorporate the opinions of its citizens. Eliciting informed and well-considered ethical opinions can be challenging. The issues may not be widely familiar and they may involve complex scientific, legal, historical, and ethical dimensions. Traditional surveys risk eliciting superficial and uninformed opinions that may be of dubious quality for policy formation. We argue that the theory and practice of deliberative democracy (DD) is especially useful in overcoming such inadequacies. We explain DD theory and practice, discuss the rationale for using DD methods in research ethics, and illustrate in depth the use of a DD method for a long-standing research ethics controversy involving research based on surrogate consent. The potential pitfalls of DD and the means of minimizing them as well as future research directions are also discussed.
How should our society address significant yet unsettled ethical issues in research with human subjects—issues that do not have clear guidance in the current regulations? Should it be left entirely to politics as usual, in which stakeholder groups lobby those who have the power to make or implement policy? Or should we also seek out the considered moral opinions of citizens that reflect a shared common good, in the hopes of informing policy with such opinions? Bioethics researchers often choose the latter. They attempt to tap the moral opinions of laypersons, not necessarily to incorporate such opinions directly into policy but at least to take the normativity of such opinions seriously and to give them due weight in policymaking and implementation. For the purpose of this paper, we call such surveys “normative opinion surveys.” Normative opinion surveys pose unique validity challenges, beyond the usual ones for opinion research. In this paper, we discuss one promising way of meeting those challenges, namely, the theory and practice of deliberative democracy (DD).
The idea that citizens’ views about the content and application of ethical norms should have standing in ethics policy debates may seem obvious. But note that surveys on ethics controversies need not always have such a purpose. An opinion survey, for example, might be used as a “temperature-taking” device by organizations that already have a set opinion on the topic. Their goal is not to gather more information to help them find an answer to the ethical problem; rather, their goal may be to gather information to influence public opinion. An opinion survey may also be a part of journalistic reporting of public opinion to predict the outcome of a ballot initiative. Such uses of opinion surveys do not directly engage the normativity of the elicited opinions. Such surveys are strategic or pragmatic opinion surveys.
In contrast, bioethics opinion surveys can be conducted with the goal of informing policy, to provide normative guidance. Although exactly how such opinions might be incorporated into policy is just one issue in the large and complex topic of how policies are made (Kingdon, 2002), normative guidance need not mean wholesale adoption of public opinion. The amount of weight given to public opinion may vary depending on the topic and context. But the core idea is that the public’s moral opinions are taken seriously as a source of guidance, not simply something to be measured and influenced.
The immediate reaction to such an idea may be that the surveys of the public measure uninformed opinions that cannot be trusted in weighty and complex ethical matters. For instance, it might be said that the average citizen lacks the scientific background to give an opinion that research ethicists can rely on. But this does not mean that thoughtful, informed opinions of the public are not ascertainable.
Efforts to discover the moral opinions of the public for the purposes of guiding public policy confront all the problems that beset survey research. There are the usual concerns regarding external and internal validity and reliability (Carmines & Zeller, 1979). In most cases, surveys are judged by their external validity, or generalizability. Generalizability rests on proper sampling, a quantitative issue, amenable to objective evaluation.
Internal validity is equally important but more difficult to measure and to improve. More than 60 years ago, George Gallup observed, “Students of public opinion are fast coming to the realization that, relatively speaking, too much attention has been directed toward sample design, and too little toward question design … differences in question design often bring results which show far greater variation than those normally found by different sampling techniques …” (Gallup quoted in Foddy, 1993, p. ix). Gallup’s comments are still relevant today. It is much easier to criticize sampling design than to answer the more elusive question of whether one is accurately eliciting the respondents’ opinions.
Surveys of normative opinion raise unique validity issues. For example, consider the use of a ballot proposition to settle a controversial research policy question. In a recent Michigan election, voters were asked to approve the use of embryos in stem cell research (Associated Press, 2008). The names of the two opposing advocacy groups give a flavor for the strategic maneuvering by both sides: the opposition group called itself the Michigan Citizens Against Unrestricted Science & Experimentation and the pro-research group called itself CureMichigan. Those conducting a strategic opinion survey for purposes of predicting outcome need not be concerned with how informed the respondent’s opinion is, as long as the survey predicts what will happen at the polls. Even a complex bioethics issue can be reduced to a brief paragraph (or even a sentence) on such a survey. There is no need to ask questions like: Did the citizens receive sufficient information to provide a truly informed preference on the issues? Are their opinions based on balanced and accurate information? Are their judgments reflective of their considered views, something they would arrive at after reflection and deliberation with their peers? These questions, irrelevant to political pollsters, do matter if we are interested in the validity of normative opinion surveys.
If the purpose is to take the respondents’ normative opinions seriously, what are the implications for how we elicit those opinions? How can we be sure that the opinions elicited are valid? What does “valid” mean for such surveys?
The challenges are many. For research ethics questions, the issues will often be unfamiliar to people. The topics may involve complex scientific, legal, historical, and ethical dimensions. It is easy to elicit pat answers to complex questions with poorly designed surveys (e.g., it is unlikely that a person will express a preference for less privacy, less safety, or fewer protections, unless the overall tradeoffs or countervailing costs and burdens are presented accurately). Ultimately, there needs to be some assurance that the opinions elicited are, in a robust sense, considered or optimal moral opinions, or at least as valid as possible given the available resources. Given these complexities, normative opinion surveys in bioethics require serious theoretical and methodological reflection.
We can conceptualize the methodological issues as falling under two broad but interacting domains. First, how can we be sure that such surveys are based on the respondents receiving accurate, unbiased, and reasonably comprehensive information? Traditional surveys often convey very limited information to the respondents, even for complex issues, creating considerable doubt as to whether the respondents’ opinions are based on adequate information.
Second, how can we be sure that the ethical opinions elicited are in some robust sense considered opinions? Since most laypersons may not be familiar with a given ethical issue—or be familiar only with public information generated by partisan interest groups—and given the human tendency to seek information that affirms preconceived opinions, rather than the more difficult work of keeping one’s mind open and challenging one’s own views, the respondents will need a chance to engage in a process that helps them develop, examine, and challenge their views. Ideally, this should be an inter-personal process that involves consultation with experts representing diverse viewpoints and deliberations with peers in a setting that promotes careful, considered, and civil dialogue.
This is a tall order for a researcher who seeks to conduct normative opinion surveys in bioethics. The amount of work needed to satisfy the above criteria will vary depending on the topic. For a select few issues, perhaps the media have done a good job of educating the public and the issues are familiar enough so that most citizens are exposed to the full range of arguments regarding the issue. In such a situation, a traditional survey may suffice in eliciting the considered opinions of the public.
But for most issues in bioethics, where the complex scientific, historical, and ethical dimensions are not familiar to the average citizen, meeting those criteria will be quite difficult. Fortunately, the researcher who wants to conduct the highest quality surveys of this kind does not need to invent the process de novo. Over the past few decades, political philosophers and political scientists have developed theories of deliberative democracy (DD), an approach to incorporating public opinion in policymaking (Chambers, 2003). These scholars, together with policy researchers and citizen activists, have been using a variety of empirical methods to engage the public on various policy issues (Carson et al., 2005; Gastil & Keith, 2005; Thompson, 2008). Their work provides a valuable resource for the ethics researcher. In the remainder of this paper, we illustrate how this work can be applied in bioethics and in research ethics specifically.
In section II, we explain what DD is, briefly sketch how it is being used as an empirical research method in various fields, and provide a rationale for why it is particularly suited for studying bioethics (including research ethics) issues that involve normative opinion surveys.
In section III, we provide a detailed description of our own DD work (on the controversy over surrogate consent for dementia research) and in section IV, we discuss the limitations and potential pitfalls of DD and how they can be addressed. In these sections, we provide concrete and detailed guidance to those who may wish to employ the method in their own work. We do this by focusing on and elaborating in detail those elements of research design and implementation that are particularly relevant in conducting DD research. Finally, we close with some comments on the future of DD studies in research ethics.
There has been a proliferation of deliberative democratic theories in the past two to three decades (Chambers, 2003). Although the boundaries and definition of DD are not always clear, what is common to all theories of DD is the idea that informed and deliberative input of citizens is seen as the ideal for democratic governance (Bohman & Rehg, 1997). This deliberative input is more than simply summing up or aggregating opinions in some coordinated fashion. Some DD theorists, for example, argue for an ideal of reciprocal justification. As Gutmann and Thompson note in one of the earliest attempts to delineate the implications of DD for bioethics, DD theory asks “citizens and officials to justify any demands for collective action by giving reasons that can be accepted by those who are bound by the action” (Gutmann & Thompson, 1997).
On the surface, deliberative democratic ideals seem to embody a fairly non-controversial vision of citizen participation in the political process. But the popularity of deliberative democratic theory in political science and philosophy is a relatively recent phenomenon. As Bohman and Rehg note, during the middle of the 20th century, the dominant theories of democracy tended to be suspicious of public deliberation. Some explicitly espoused an elite theory of democracy, driven by the sociological reality that most citizens seemed “politically uninformed, apathetic, and manipulable” (p. x). Wide public participation and deliberation were seen as undesirable and unrealistic, given that politically powerful leaders can manipulate the masses. Although the masses might exercise negative discipline by voting out leaders periodically, the real policymaking was seen as a matter of coordinating special interests, conducted by the elite. On this view, political interaction is not about citizens reasoning together about a common good; it is about bargaining or negotiating among elite leaders who represent a variety of special interests.
Another dominant model of democracy from the 20th century was based on economic theory of democracy in which “parties function as entrepreneurs who compete to sell their policies in a market of political consumers” (Bohman & Rehg, 1997). Citizens’ votes are seen as expressions of preferences that are fixed (rather than something to be transformed through public dialogue); they are summed together by voting but there is no expectation of the citizens reasoning together to seek a common good of some sort.
In neither of these theories is the actual content of the citizen’s opinions taken seriously as something to be evaluated as a better or worse expression of what should happen in a society. In both of these theories, reason is instrumental, as a tool for gaining the upper hand in interest group politics, as a tool in bargaining with opponents, or as a means of aggregating preferences most efficiently.
In contrast to these traditions, deliberative democratic theory is a normative theory that takes citizens’ views seriously as a source of public policy. Much of the philosophical literature on DD is devoted to working out the essential elements and critiques of this core idea of citizen deliberation as the basis of governance (Bohman & Rehg, 1997; Chambers, 2003; Elster, 1998; Freeman, 2000). Deliberative democratic theory is more optimistic about the idea that citizens can deliberate and reason together and come to share a common good, at least to some degree.
It is not hard to see why deliberative democratic theories have important implications for bioethics. Such theories provide a broad theoretical framework for what is implicitly accepted within bioethics. How should our public policy regarding controversial research ethics issues be resolved? One approach is to provide some background rules (no violence, no fraud, etc.), but allow the flow of money and influence to determine the outcome, without an attempt to promote a common good perspective. Let the stronger interest group win, as long as they play by a minimum set of rules.
Alternatively, we could attempt to resolve the conflict by promoting a deliberation among citizens during which an attempt is made to find a common perspective. The possibility that people can change their minds based on giving reasons and evaluating others’ reasons is left open. Such a perspective would respect the citizens’ abilities to consider tradeoffs necessary in public policy, rather than assuming that they will only protect their own personal interests. The interaction is not just for purposes of influence; people treat one another as decision-makers who deserve respect by being provided with reasons that they can evaluate and respond to.
It would be discouraging if policy solutions to modern bioethics dilemmas were addressed only by means of special interest group politics, or based only on economic considerations or other forms of power. If this were the normatively accepted way of resolving ethics policy problems, it is hard to see why such problems would be considered ethical problems at all. This concern has been implicitly recognized in that, historically, the usual mode of addressing bioethics issues has been to turn to expert panels and commissions (Jonsen, 1998). These commissions have been hugely influential, with much of the current human subject research regulations, for example, being direct descendants of one of those commissions (Levine, 1986). These commissions do embody some key components of DD, insofar as they publicly deliberate, attempt to incorporate diverse and opposing viewpoints, are led by evaluation of reasons and arguments, and to a limited degree, represent an interdisciplinary perspective, including representatives from the public or from patient advocacy groups.
However, there are limitations to resorting to expert panels in addressing bioethics issues. First, such panels are constituted by a political process and there is no assurance that they represent the diversity of opinions that one may find among the citizens. Second, at the end of the day, the moral opinions of the panel members are simply the intuitions of a few non-randomly chosen people. Also, because they are selected to represent some interest group or a discipline (science, patient advocacy groups, law, ethics, etc.), the perspective of their deliberation is not as ordinary citizens but rather as specialists; thus, they run the risk, despite good intentions, of acting as representatives of special interests. This is especially worrisome if there is a paucity of reliable and valid empirical data regarding the considered moral opinions of the public or other key stakeholders who are not empowered to sit at the table (which is often the case). There is no reason to privilege the intuitions of the few over others’ views when it comes to, for instance, how much risk a competent research subject should be allowed to take on in a research project. Why should the panel members’ intuitions be privileged over the opinions of citizens?
It is also important not to confuse DD with representation by layperson groups, such as patient advocacy groups. Currently there is increasing political advocacy pressure to develop “home run” therapies for certain illnesses, and to take more chances with invasive and riskier approaches (Neergard, 2005). Political action by advocacy groups is an essential part of our democratic system. But we must not forget that they are operating as an interest group. Thus, the inclusion of such groups in deliberations of ethics policy issues is not an automatic indication of democratic deliberation.
The above discussion should make fairly clear that the very nature of modern bioethics rejects a political theory that assumes fixed preferences of citizens simply need to be aggregated efficiently, or somehow coordinated as a matter of special interest politics. Of course, this does not mean that attempts to influence public policy by lobbying and public relations are illegitimate. Nor does it deny that some people (and groups) probably do engage in bioethics discourse from a mostly strategic perspective. The point is that we should not take the activity of bioethics to be entirely reducible to such activities. In this way, bioethics and DD are natural allies. If so, then bioethics could learn from the empirical research practices in the field of DD.
Deliberative democracy is not only a theory. In fact, the pursuit of public deliberation has had a long and active history of its own, quite apart from the theoretical developments in political philosophy (Gastil & Keith, 2005). There are now many models designed to implement deliberative democratic methods in policymaking. These models go by the name of Deliberative Polling (Fishkin, 1997), Citizens Jury (Crosby et al., 2005), 21st Century Town Meetings (Lukensmeyer et al., 2005), National Issues Forums (Melville et al., 2005), and others (Gastil & Kelshaw, 2000). Thus it is not correct to talk about “the” deliberative democratic method since there are many such methods. Some have adapted and combined these methods, in attempts to improve public deliberative input into policy (Carson et al., 2005). In fact, there has been such a proliferation of empirical work in DD that there are now multiple reviews and theoretical reflections on the relationship between the theory and practice of DD (Carpini, Cook, & Jacobs, 2004; Chambers, 2003; Ryfe, 2005; Thompson, 2008).
Some methods are initiated and implemented by government agencies. For example, between 2001 and 2005, the Minister of Planning and Infrastructure in Western Australia used DD methods to directly inform policymaking through almost 40 deliberative engagements with the public (Gregory, Hartz-Karp, & Watson, 2008). At the other end of the spectrum are methods that are designed to elicit views of citizens that can be used to inform policy discussions, in a consultative model. Such methods may take individual polls but do not in fact require the participants to develop a consensus around a policy (Chambers, 2003). In our work, as will be seen, we take a somewhat middle road: we use a consultative method, without official government sanction, but our procedures require the participants to deliberate toward achieving a policy recommendation.
Deliberative techniques have been used to study a range of topics. Robert Luskin, James Fishkin, and colleagues have conducted deliberations (Deliberative Polls) in the United States, the European Union, and Australia to inform a range of issues in economic, social, and foreign policy. The very first Deliberative Poll, conducted in Manchester, England in 1994, is illustrative. This two-day deliberative study focused on rising crime rates and potential policy interventions (Luskin, Fishkin, & Jowell, 2002). Significant opinion changes occurred for 35 of the 52 policy items, with major shifts in the opinions pertaining to punishment, policing, and the social root causes of crime (Luskin et al., 2002). For example, after deliberation, participants were inclined to send fewer criminals to prison, to decrease the length of prison sentences, and were open to alternative approaches for dealing with criminals, such as reform programs instead of prison sentencing (Luskin et al., 2002). The authors state that these changes likely are the result of participants’ improved appreciation that prisons are expensive to use and may not be very effective in limiting crime rates.
A recent literature review of public engagement on public priority setting and resource allocation indicates that deliberative approaches for eliciting public opinion are on the rise. Mitton et al. reviewed articles reporting empirical work on public engagement between 1981 and 2006, and found 175 such articles reporting on 190 studies conducted in the US, UK, Canada, Australia, and New Zealand (Mitton et al., 2009). Although not all involved deliberative methods, the authors note that “in each successive time period…the proportion of cases in which at least one deliberative method was employed increased,” with 37% of studies involving deliberation by the 2000–2006 period (Mitton et al., 2009).
A recent New Zealand study addressing the question “[S]hould the New Zealand government offer free screening mammograms to all women aged 40–49 years?” provides a dramatic example of the potential impact of deliberative procedures (Paul et al., 2008). In this study, a citizens’ jury method was used in which eleven women aged 40–49 years participated as the jury. Various experts presented information and made arguments for their case; the members of the jury then deliberated and made their decisions. At baseline, all eleven women were in support of this screening policy. After deliberation, all but one of the women had changed their minds, citing the high rates of false negatives and false positives of mammography for this age group and the minimal evidence that mammography actually saves lives when done before age 50 (Paul et al., 2008).
Scully and colleagues examined the moral arguments used by lay people deliberating on the ethically complex question of “social sex selection” (parents using pre-implantation genetic diagnosis to fulfill their wish for a boy or girl child) (Scully, Banks, & Shakespeare, 2006). They found that ordinary citizens can articulate basic moral norms, question them, acknowledge competing moral considerations, and provide cogent arguments in support of their positions. In the midst of the common objections to sex selection such as unequal proportions of males and females in the population, the uneasy feeling of “playing God,” and the discomfort with “unnatural” processes, participants spoke of children as “gifts.” By examining this metaphor and its contrast with “commodity,” the researchers gained valuable insight into how lay persons can think about this medical technology.
DD methods are beginning to be used in research ethics. In compliance with the Federal regulations regarding emergency research without informed consent (21 CFR 50.24), a variety of methods have been used to meet the “community consultation” requirement (Baren & Biros, 2007), including public meetings. For example, one study compared three different “communities” using a random-digit telephone survey, in-person interviews, and public meetings and found that acceptance of emergency research varied by community and by method of consultation (Contant et al., 2006). More recently, two deliberative studies have assessed public opinion to inform the issue of medical records research, one within the Veterans Administration and another focusing on biobank research in British Columbia (Damschroder et al., 2007; Secko et al., 2009). Damschroder and colleagues did not find significant changes in attitude toward consent issues from their deliberation procedures (which did not use facilitated discussions) but were able to detect a close relationship between the role of trust and the veterans’ preferences regarding consent for medical records research (Damschroder et al., 2007). Secko and colleagues found that deliberation led to changes in attitude. For example, after deliberation, their respondents more strongly disagreed with mandatory re-contact for each time a researcher wants to use a biobanked sample and became less sure of endorsing a right of a subject to withdraw samples at any time (Secko et al., 2009).
Our group recently conducted a study of caregivers and primary decision-makers for persons with dementia (Kim et al., in press), assessing their views regarding surrogate consent for dementia research. The study involved 178 locally recruited caregivers or primary decision-makers for persons with dementia. They were, on average, 59 years old and 73% were women. The subjects were randomly assigned to either an all-day DD session group or a traditional survey–only control group. The acceptability of family surrogate consent for dementia research (“surrogate-based research” or SBR) from a societal policy perspective and also the acceptability of the research from the more personal perspectives of deciding for oneself or deciding for a loved one (self and surrogate perspectives) were assessed at baseline, immediately post-DD session, and a month after the DD date, for four research scenarios of varying risk-benefit profiles. We found that the support for a policy of family consent for SBR increased for the DD group (e.g., support for SBR for gene transfer research changed from 54% pre-DD session to 75% after DD session to 76% at one month after the DD session), but not for the control group. In the DD group, there were transient increases in acceptability of SBR from surrogate or self-perspectives but these changes were not sustained a month later; in the control group, there were no changes from baseline in attitude toward surrogate consent from any perspective. The participants of the DD session saw the process as informative, engaging, and fair; in fact, 9 of 10 DD participants were willing to abide by the policy decision put forward by their small group, “even if [they] personally have a different view.”
The caregiver DD study sample reflects persons most likely to serve as surrogate decision-makers for persons enrolling in dementia research. However, the findings cannot be generalized to a genuine societal perspective. Therefore, we are currently conducting a DD study on surrogate consent for dementia research, using a random sample of older general public, which we now describe in detail.
The perennial debate over surrogate consent for research is a good candidate for using DD methods. The issue has been debated for several decades with little progress (Kim et al., 2004). The current US regulations allow research with incapacitated adults based on consent by their legally authorized representatives (LAR) (45CFR46, 102c, 111.a4, and 116). However, the regulations defer to states for defining the LAR and few states have done so (Hoffmann & Schwartz, 1998; Saks et al., 2008). Three recent state laws (California, Virginia, and New Jersey) have diverged on how to balance the potential benefits with risks (Saks et al., 2008). The US federal government is currently revisiting SBR oversight policy (“Request for Information and Comments on Research That Involves Adult Individuals With Impaired Decision-making Capacity,” 2007). The UK is also currently focusing on ethical issues in dementia, including the ethics of SBR (Nuffield Council on Bioethics, 2008).
Despite its long history of controversy, there are limited layperson opinion data regarding the ethics of SBR, with only a few relevant studies (Bravo, Paquet, & Dubois, 2003; Karlawish et al., 2009; Kim et al., 2005; Wendler et al., 2002). Our recent study involving caregivers of persons with dementia is the only study that uses deliberative methods (Kim et al., in press).
A DD method is well-suited for this controversial topic. The topic involves several domains of specialized knowledge that need to be conveyed: the nature and purpose of scientific procedures in Alzheimer’s disease (AD) research, the rationale and structure of the current human subject protections system (including history of abuses, current oversight system through Research Ethics Committees, the ethical framework behind current regulations), the context of current AD research (the actual remaining capabilities of persons with moderate AD, the role of family members, etc.). It is a challenge to present all of the necessary background material in an understandable and comprehensive manner. Further, given that the topic is not a widely discussed issue among the public, it is unlikely that the public will have been exposed to the range of arguments for and against various policies.
The study elicits the public’s views regarding a policy for surrogate consent for four dementia research scenarios of varying risks and potential benefit: a lumbar puncture study to develop a diagnostic test, a randomized clinical trial of a drug, an efficacy study of a vaccine, and an early phase neurosurgical gene transfer study.
Participants are members of the general public, aged 51 and older, recruited via random-digit dialing telephone calls within a 60-mile radius of Ann Arbor, Michigan. The participants are randomized into three equal-sized arms: persons who complete surveys only (simulating a traditional survey, “survey-only group”); persons who receive written educational materials about ethics and science of surrogate-based research (educational materials plus survey group, hereafter referred to as “education group”); and persons who participate in an all-day democratic deliberation session (“DD session group”).
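The allocation into three equal-sized arms can be illustrated with a minimal sketch. It is not the study's actual randomization procedure, which is not specified here; the arm labels, the function name `randomize_equal_arms`, and the shuffle-then-deal scheme are assumptions made only for illustration.

```python
import random

# Hypothetical labels for the three arms described in the text.
ARMS = ("survey_only", "education", "dd_session")


def randomize_equal_arms(participant_ids, seed=None):
    """Assign participants to the three arms in (near-)equal numbers.

    Assumed scheme: shuffle the participant list, then deal participants
    to the arms in rotation, so arm sizes differ by at most one.
    """
    rng = random.Random(seed)  # a seed makes the allocation reproducible
    ids = list(participant_ids)
    rng.shuffle(ids)
    assignment = {}
    for position, pid in enumerate(ids):
        assignment[pid] = ARMS[position % len(ARMS)]
    return assignment


if __name__ == "__main__":
    # Nine made-up participant identifiers, three per arm after allocation.
    allocation = randomize_equal_arms(range(1, 10), seed=2009)
    for pid in sorted(allocation):
        print(pid, allocation[pid])
```

Dealing from a shuffled list, rather than drawing an arm independently for each person, is what keeps the arms equal in size; independent draws would leave arm sizes to chance.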
There are three measurement time points: a survey sent to all participants one month prior to the DD session date (after which the subjects are randomized into the three arms); a second survey after the DD session for the DD group, or around that time by mail for the two control arms (survey-only group and education group) who do not attend the DD session; a third survey to all groups by mail about one month after the DD session date.
Participants assigned to the DD session group attend a day-long meeting that consists of plenary and small group sessions led by trained facilitators. Two national experts (in AD research and in research ethics) present detailed information and are also available throughout the day, traveling together from table to table to answer questions.
We discuss below several particularly important issues for a DD study: rationale for an experimental design; special recruitment considerations; development of study materials; procedures for the democratic deliberation session itself; necessity of facilitators; and monitoring and evaluation of the quality of the DD process.
Because the DD method is relatively new to bioethics research, it is important to assess whether a DD procedure has any impact at all on people’s opinions. The design of our study is a randomized controlled experiment to maximize internal validity, by accounting for any “secular trends” caused by time or news events (or other factors). The survey-only control group provides a traditional cross-sectional survey comparison. The education group, whose members receive largely the same background information as the DD attendees but without the intensive deliberative component, provides a comparison that will be useful in assessing the potential mechanism for the impact of DD. That is, if the DD session does seem to make a difference in the respondents’ views, is it explained by the increase in information provided to the respondents or is there some more specific effect of the deliberation day itself? Given the cost and effort of DD sessions, this is an important question.
Our experimental design maximizes internal validity in a setting where the limitations to external validity are unavoidable. Although practitioners of deliberative consultation methods like ours cite generalizability as a strength (Fishkin & Luskin, 2005), it is important to recognize and address the limitations in sampling. Our DD procedures take a full day and require the completion of several surveys. Setting aside an entire day for research involvement will necessarily reduce the participation rate and will create a certain level of selection bias. This is unavoidable as long as research relies on volunteers. However, by using a randomized design, we increase the rigor with which we can examine the causal effect of the DD intervention.
Acknowledging these challenges, it is important to take measures to optimize recruitment. Subjects are recruited via random-digit dialing by the Institute for Social Research at the University of Michigan—an international leader in survey research. We recruit members of the older general public (age 51 and older), for two reasons. First, a general older public sample balances the need for a representative sample with the need to optimize deliberation among those for whom the issue under discussion is “live” and pertinent. Age is the strongest risk factor for dementia and thus SBR is most relevant to the older population, especially since many persons in that age group will have had personal experiences with loved ones suffering from dementia. Second, as explained in the final section below, using randomly selected persons from this age group allows some exploratory, but intriguing, secondary analyses extrapolating our DD study results to a national survey of older Americans aged 51 and older (Kim et al., 2009).
Because the DD session requires considerable commitment from volunteers, minimization of dropouts is crucial. We communicate to the subjects at the beginning of the project that there will be frequent telephone and mail contacts so that they know ahead of time that we will be keeping in close touch with them throughout. The research team, in other words, needs to develop a working relationship with the participants.
An all-day on-site session provides a unique opportunity to creatively inform and educate the participants. During the DD session, we provide a highly informative yet experientially driven background on Alzheimer’s disease with a 30-minute segment of a critically acclaimed public television documentary, The Forgetting: A Portrait of Alzheimer’s (Arledge, 2004). The 30-minute segment was generated by discussing successive subsections of the film with four lay persons who have personal experience with AD (all of whom participated in a previous study we conducted).
The presentations on AD clinical research and on the ethics of SBR require extreme care to ensure that they are, in combination, balanced and fair. The integrity of the DD method depends on this. Further, the presentations must be audience-friendly and comprehensive enough for informed opinion formation, revision, or refinement. These presentations were first developed for our caregiver DD study (Kim et al., in press). The research team worked closely with an advisory panel that consisted of a political science expert in deliberative democracy methods, a senior AD researcher, a bioethicist-sociologist, a geriatrician, a director of a human subject protections program at an academic medical center, a qualitative research expert, a gerontological nurse, and a caregiver of a person with AD.
These presentations were further refined for use with the general public, based on a final systematic review by the members of the advisory panel, additional external experts (in both AD research and bioethics), and lay persons, totaling 11 persons. The review was conducted using a survey. Each separate subsection of the presentations was evaluated quantitatively (using a five-item Likert scale with a range of strongly disagree to strongly agree) on accuracy, balance, and on how easy the material was to understand. The reviewers provided open-ended comments as well, suggesting items to ensure completeness of information and any changes needed to ensure that the materials were adequate for a general public audience.
The survey is a shortened version of an earlier instrument that has been validated and used in a previous study (Kim et al., 2005). The survey elicits respondents’ attitudes toward surrogate consent for dementia research, using four research scenarios of about 120 words each that depict a lumbar puncture study, a randomized clinical trial for a medication, a vaccine trial, and an early phase gene transfer trial.
Two weeks prior to the DD date, the members of the DD session group are provided with copies of the experts’ presentations. Participants are asked to read through the presentations before the meeting, and to prepare any questions they have for the experts.
On the day of the DD session, DD group participants are randomly assigned to tables, in groups ranging from five to seven persons per table, with the aim of having six participants per table (a number we feel is ideal, based on previous trials), along with a facilitator.
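The seating rule described above (random assignment, five to seven persons per table, six as the target) can be illustrated with a short sketch. The function name `assign_tables`, its parameters, and the way the number of tables is chosen are assumptions made for illustration; the text does not specify how the study team handled attendance counts that do not divide evenly.

```python
import random


def assign_tables(participant_ids, target=6, smallest=5, largest=7, seed=None):
    """Randomly seat participants at tables of `smallest` to `largest` people.

    The number of tables is chosen so that the average table size is as
    close as possible to `target` while every table stays within bounds.
    """
    ids = list(participant_ids)
    n = len(ids)
    # A table count t is feasible when n people can fill t tables within bounds.
    feasible = [t for t in range(1, n + 1) if smallest * t <= n <= largest * t]
    if not feasible:
        raise ValueError(f"{n} participants cannot be seated {smallest}-{largest} per table")
    n_tables = min(feasible, key=lambda t: abs(n / t - target))

    rng = random.Random(seed)
    rng.shuffle(ids)
    tables = [[] for _ in range(n_tables)]
    for position, pid in enumerate(ids):
        # Dealing in rotation keeps table sizes within one of each other.
        tables[position % n_tables].append(pid)
    return tables


if __name__ == "__main__":
    for number, table in enumerate(assign_tables(range(1, 33), seed=7), start=1):
        print(f"Table {number}: {sorted(table)}")
```

In this sketch, 32 attendees would be seated at five tables (two of seven and three of six); a facilitator would then be added to each table.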
The events of the DD day itself are scheduled to achieve the two main goals: one, conveying accurate, unbiased, and comprehensive information; and two, providing optimum conditions for deliberation. In terms of the content of the information, most of the hard work is done in preparation of the presentation materials; this is enhanced by well-rehearsed presentations of those materials by the two experts who take many questions from the audience, and further make themselves available throughout the day. In terms of optimizing the deliberative process, the sequence of events is designed to create an atmosphere of openness, respect, and collaboration within the groups (breaks and lunch are not listed below):
To maintain balanced, expert responses to all questions, the two experts (AD clinical researcher and bioethicist) are available and travel together from table to table to answer questions throughout the day. The principal investigator of the project is also available to answer simple factual questions or questions regarding the DD day procedures, such as how to interpret the small group exercise (e.g., “What do you mean by…?”). The extensive interactive component of the day minimizes the chance that the participants are basing their views on incorrect or incomplete information.
The entire DD procedure is based on the premise that optimal conditions must be provided to encourage rigorous, high-quality deliberation over a controversial ethical issue. The small group sessions are designed to create an open, respectful, and comprehensive reflection and interchange of viewpoints by the participants. Although such high-quality group deliberation could occur spontaneously with luck, having a trained facilitator at each table removes the element of luck.
The facilitators’ role is to optimize the process of discussion rather than to act as content experts. Facilitators can come from a variety of backgrounds, including previous study participants who have stood out in their ability to promote constructive dialogue at their tables. Prior to each DD session, all facilitators, both new and experienced, participate in a training session, led by the principal investigator. All facilitators are asked to review written materials prior to the training session: the research plan of the DD project, the presentations that will be used by the experts, a detailed description of the DD session day, and the facilitator guidelines.
The training session consists of two parts. First, we review the role of the facilitator, list of specific tasks, strategies for dealing with difficult situations, and an annotated guide for leading each of the three small group deliberation sessions scheduled for the day. Second, there is an in-depth discussion and role play using scenarios collected from analyzing the previous DD small group sessions. We have identified various points that are potential problems and examples of particularly good facilitation; these are used to demonstrate how to conduct good facilitation and how to navigate through potential group problems.
Standardizing the roles for facilitators is crucial to promoting good deliberation and minimizing unwanted group dynamic effects (Crosby et al., 2005). They must remain vigilant and active procedurally in outlining the ground rules for discussion, keeping the group on task, promoting respectful exchange of information, prompting clarification of statements, encouraging expression of opposing viewpoints and participation by everyone, and limiting domination of discussion by some participants.
At the same time, because laypersons naturally begin to defer to expert opinion (Levine, Fung, & Gastil, 2005), the facilitators are reminded of this potential pitfall. Facilitators should not insert themselves into the discussion such that their opinion dictates the content of the group’s outcome.
The rule of thumb the facilitators are encouraged to keep in mind in determining whether to intervene in a particular situation is: Be neutral in content but active in process.
Do the participants in a DD session actually engage in reasonably high-quality deliberation that is informed, thoughtful, and marked by a civil exchange of ideas? Since the validity of the outcome relies on DD fulfilling its promise, a DD study should monitor and report on the quality of the deliberation. These are not simple phenomena to measure, and a single approach is not sufficient (Fishkin & Luskin, 2005; Neblo, 2007; Steiner et al., 2004; Thompson, 2008). We assess quality in various ways.
First, we assess the participants’ perceptions of civility and respect, fairness of the process, participant trust, and the value participants place on information and deliberation (Steiner et al., 2004; Thompson, 2008), using a self-report questionnaire administered at the end of the DD session day. The DD Session Evaluation Form contains the following questions (using a Likert response scale):
Second, to assess the level of engagement and equality of participation (Cohen, 2007; Thompson, 2008), we keep track of some simple metrics. The number and types of questions from the audience during the expert presentations and during small group deliberations are recorded. Also, we count how many times each participant speaks during small group deliberations and the length of each contribution (measured as text length). Equality and good deliberation do not require that everyone speak the same number of times or for the same duration, but they do require that no one or two individuals dominate the discussions and that everyone has the opportunity to speak their mind (Neblo, 2007).
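The participation metrics just described can be computed from a transcript in a straightforward way. The sketch below is a minimal illustration, not the study's analysis code: it assumes the transcript is available as (speaker, text) pairs, measures length in words, and flags possible domination with a `dominance_factor` threshold that is purely hypothetical, since no numeric cutoff is specified in the text.

```python
def participation_summary(turns, roster=None, dominance_factor=2.0):
    """Summarize speaking turns and text length for one small group.

    turns:  list of (speaker, text) pairs in transcript order.
    roster: optional list of everyone at the table, so that silent
            participants appear with zero turns.
    dominance_factor: hypothetical threshold; a speaker is flagged when
            their share of all words exceeds this multiple of an equal share.
    """
    counts = {speaker: 0 for speaker in (roster or [])}
    words = {speaker: 0 for speaker in (roster or [])}
    for speaker, text in turns:
        counts[speaker] = counts.get(speaker, 0) + 1
        words[speaker] = words.get(speaker, 0) + len(text.split())

    if not counts:
        return {}
    total_words = sum(words.values())
    equal_share = 1 / len(counts)
    summary = {}
    for speaker in counts:
        share = words[speaker] / total_words if total_words else 0.0
        summary[speaker] = {
            "turns": counts[speaker],
            "words": words[speaker],
            "word_share": share,
            "flagged": share > dominance_factor * equal_share,
        }
    return summary


if __name__ == "__main__":
    # Invented transcript fragment, for illustration only.
    example_turns = [
        ("A", "I think the family should be able to decide in most cases"),
        ("B", "Why?"),
        ("A", "Because they know the person best"),
        ("C", "I am not so sure"),
    ]
    for speaker, row in participation_summary(example_turns, roster=["A", "B", "C", "D"]).items():
        print(speaker, row)
```

A flag from such a count is only a prompt for closer reading of the transcript; whether a participant actually dominated the discussion remains a qualitative judgment.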
Third, the small group deliberations are qualitatively analyzed for their content. The technique used is more traditional qualitative analysis, and it addresses a variety of questions that touch on the quality of deliberation: How are disagreements resolved? Are mediation and compromise common (Steiner et al., 2004)? What are the common themes and rationales for the groups’ policy recommendations? Are facts used accurately—and if not, do the participants correct each other? Are the participants keeping to the task? Is there evidence of polarization? Do the comments reflect appeals to a common good perspective or are the reasons given for opinions mostly based on self-interest?
Much of a DD study’s procedures and collection of data are concentrated at key time points (i.e., the one-day DD sessions) and the procedures are logistically demanding. Thus, a pilot session can be very useful in several ways. First, recruitment can be a challenge because participation is quite demanding. A pilot DD session allows the development of a realistic recruitment plan, e.g., to estimate how many people need to be recruited by phone for an adequate attendance rate for the DD sessions. Second, a pilot study also serves as the final pretest for the various study materials and procedures. Third, it provides an opportunity to develop context-sensitive and realistic facilitator guides and training materials. Finally, debriefing at the end of the pilot DD session day while the events of the day are fresh in the minds of everyone (participants, PI, presenters, study team members, and facilitators) can be very useful.
The potential pitfalls of the deliberative process are widely discussed among scholars of deliberation, with the main concerns having to do with group dynamic factors (Carpini et al., 2004; Levine, Fung, & Gastil, 2005; Mendelberg et al., 2002; Ryfe, 2005). For example, won’t minority view members feel pressured to adopt the majority view? Although experimental studies have shown that there is a tendency for the minority to shift toward the views of the majority, it is also true that “minority opinion can lead majorities to consider new alternatives and perspectives” (Carpini et al., 2004; Mendelberg et al., 2002). Overall, a comprehensive review of social psychology of small group deliberation (not specifically designed to test DD theory) as well as a variety of studies (specifically designed to test DD theory) concludes that there is “substantial evidence” that deliberation can lead to the benefits postulated by DD theory (Carpini et al., 2004).
Perhaps the most widely discussed criticism of DD methods is that rather than arriving at a common good–based policy recommendation, DD will lead to group polarization (Sunstein, 2002, 2007). Group polarization is a phenomenon by which deliberators tend to move to more extreme positions in the direction of their own pre-deliberation opinions (Sunstein, 2002). This may occur because participants gather selective information supportive of the views they already hold from peers who are like-minded. Thus their prior opinions are not challenged and indeed grow more extreme. Polarization may also occur because of social comparison or peer pressure (the simple desire to be perceived favorably by the group) and confidence by corroboration (people gain confidence after feeling group support, so more extreme viewpoints can be expressed) (Sunstein, 2007). Under these circumstances, Sunstein suggests that side-switching is unlikely in group deliberations (Sunstein, 2007).
However, group polarization may not be a concern for some types of DD methods and steps can be taken to minimize it. For natural gatherings and deliberations among people who already have specific opinions on the topic, polarization is more likely. A DD method that randomly selects its participants may be less susceptible. Further, procedures can minimize the potential for polarization. For example, Sunstein acknowledges that DD procedures that are careful to introduce balanced information may be less susceptible (Sunstein, 2002). The key seems to be, as Sunstein puts it, to pay “far more attention to the circumstances and nature of deliberation, not merely to the fact that it is occurring” (Sunstein, 2002). Thus, in conducting DD studies, specific, context-sensitive considerations are important. First, if the topic itself is already polarizing (e.g., embryonic stem cell debate, abortion, etc.), then there is a need to be especially careful that the exercise does not simply promote more polarization. Second, sample selection will have an important impact; homogeneous groups may be more likely to lead to group polarization and representative, probability samples may mitigate this. Third, to counteract the reduced argument pool in groups that already favor one view over another, balanced expert information is crucial, and careful vetting by various perspectives is important. Fourth, group facilitation by an independent facilitator is crucial in helping to curb the natural group polarizing tendencies by ensuring that participation is equitable and respectful even of minority views, and by encouraging arguments based on giving of reasons and rationales rather than simply asserting strongly held positions.
Another potential limitation of DD methods is that the information presented to the participants may be inadequate or biased, despite a rigorous review and multidisciplinary vetting process. There may be intangible factors in the experts’ act of presentation, and in the question-and-answer interaction, which may affect participant opinion differently depending on the expert presenters involved. These issues may need further study.
No method of opinion elicitation is perfect and every method has its own potential pitfalls. The question is a comparative one. It is not a question of whether DD methods are perfect but instead whether they are better than the traditional survey methods of eliciting normative opinions on controversial ethical issues. We believe that those advantages of DD method over traditional surveys that we have laid out in this paper can be maintained, as long as methodological precautions are implemented carefully.
There are many potential bioethics topics that could be addressed using DD-based methods. Some of these topics have already been well-described in the literature (Gutmann & Thompson, 1997). In research ethics, any topic with broad policy implications that require a genuine tradeoff of goods (e.g., individual privacy and control versus welfare of community) would be a candidate. However, because DD studies are challenging and expensive, researchers should also continue to address methodological issues. In particular, future refinements should focus on maximizing validity while at the same time minimizing cost. We offer two examples of potential refinements.
One way to achieve this is to combine two types of studies, i.e., to conduct a DD study in combination with a relatively less expensive traditional survey. In our case, we conducted the first nationally representative survey of the older public’s attitudes toward surrogate consent for dementia research with excellent external validity (Kim et al., 2009). Of course, even without an explicit quantitative linking of the national survey with our DD project, the DD project can provide some important insights and educated speculations about what might happen at a national level. For example, we will be able to say, “On a brief traditional national survey, X% of the older general public supports phase I gene transfer studies for AD using surrogate consent; further, our DD experiment shows that were we to optimize how we obtain the public’s opinion via in-depth education and deliberation, it would probably lead to [increase/decrease/no change] in support.” This qualitative statement about the potential impact of DD at a national level (“increase/decrease/no change”) can be made more specific by statistically linking the DD study with the national survey. For instance, we can generate propensity scores to weight the DD sample to provide an estimate for the effect of DD generalized to the national survey sample. This quantitative estimation, although speculative, could provide a sense of the potential magnitude of difference that DD might make if conducted nationally.
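The weighting idea can be made concrete with a deliberately simplified sketch. It assumes a single categorical covariate (an age band), so that the propensity of belonging to the DD sample reduces to a within-stratum proportion; an actual analysis would more likely fit a model on several covariates and incorporate the national survey's sampling weights. All names and numbers below are invented for illustration and are not study data.

```python
from collections import Counter


def inverse_odds_weights(dd_strata, national_strata):
    """Weight each DD participant by the odds of NOT being in the DD sample.

    With one categorical covariate, the propensity score for stratum s is
    p_s = n_dd_s / (n_dd_s + n_national_s), and the weight (1 - p_s) / p_s
    reweights the DD sample to resemble the national sample.
    """
    dd_counts = Counter(dd_strata)
    national_counts = Counter(national_strata)
    weights = []
    for stratum in dd_strata:
        p = dd_counts[stratum] / (dd_counts[stratum] + national_counts[stratum])
        weights.append((1 - p) / p)
    return weights


def weighted_mean(values, weights):
    """Weighted average of the outcome values."""
    total = sum(weights)
    if total == 0:
        raise ValueError("weights sum to zero")
    return sum(v * w for v, w in zip(values, weights)) / total


if __name__ == "__main__":
    # Hypothetical data: 1 = supports the policy after deliberation, 0 = does not.
    dd_age = ["51-64"] * 6 + ["65+"] * 4
    dd_support = [1, 1, 1, 1, 0, 0] + [1, 0, 0, 0]
    national_age = ["51-64"] * 40 + ["65+"] * 60

    w = inverse_odds_weights(dd_age, national_age)
    print("unweighted support:", sum(dd_support) / len(dd_support))
    print("weighted support:  ", round(weighted_mean(dd_support, w), 3))
```

In this invented example, the DD sample over-represents the younger band, so the unweighted support of 0.50 falls to about 0.42 once the sample is reweighted to the national age distribution. As the text notes, any such estimate is speculative.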
Another potential methodological refinement would allow researchers to expend more of their resources on their main ethical focus. Most DD studies on research ethics will need to educate the participants about the history of research ethics, the current oversight system, and the nature of human subject research (e.g., phases of treatment development). Perhaps validated education/deliberation modules can be developed that can then be used across various DD studies in research ethics, rather than each researcher re-inventing the wheel.
Democratic deliberation and bioethics are natural allies. They both take the normative opinions of the lay public seriously, and strive to elicit and be guided by those opinions. Of course, no method of eliciting such opinions is without limitations and potential pitfalls. However, DD methods, with proper precautions in methodology, have the potential of eliciting the considered moral judgment of ordinary citizens regarding challenging and controversial bioethics policy issues, such as disputes in how to ethically conduct clinical research.
S. Kim is a Greenwall Foundation Faculty Scholar in Bioethics. Additional funding was provided by grant R01-AG029550 from the National Institute on Aging. The contents are solely the responsibility of the authors and do not necessarily represent the official views of the National Institutes of Health.
Scott Y. H. Kim is Associate Professor of Psychiatry, a core faculty member of the Bioethics Program, and an investigator in the Center for Behavioral and Decision Sciences in Medicine, all at the University of Michigan. His book, Evaluation of Capacity to Consent to Treatment and Research, is due out in 2010 from Oxford University Press.
Ian Wall is a Research Associate in the Center for Behavioral and Decision Sciences in Medicine at the University of Michigan. He plans to pursue a Ph.D. in Medical Sociology to continue his research on the cultural influences and social structures that can shape health-related behaviors.
Aimee Stanczyk is a Research Associate in the Center for Behavioral and Decision Sciences in Medicine at the University of Michigan.
Raymond G. De Vries is Professor in the Bioethics Program, the Department of Obstetrics and Gynecology, and the Department of Medical Education, all at the University of Michigan. He is the author of A Pleasing Birth: Midwifery and Maternity Care in the Netherlands (Temple University Press, 2005), and co-editor of The View from Here: Bioethics and the Social Sciences (Blackwell, 2007). He is at work on a critical social history of bioethics, and is studying the regulation of science; international research ethics; the difficulties of informed consent; bioethics and the problem of suffering; and the social, ethical, and policy issues associated with non-medically-indicated surgical birth.