The tenets of fuzzy trace theory are summarized with respect to their relevance to health and medical decision making. Illustrations are given for HIV prevention, cardiovascular disease, surgical risk, genetic risk, and cancer prevention and control. A core idea of fuzzy trace theory is that people rely on the gist of information, its bottom-line meaning, as opposed to verbatim details in judgment and decision making. This idea explains why precise information (e.g., about risk) is not necessarily effective in encouraging prevention behaviors or in supporting medical decision making. People can get the facts right, and still not derive the proper meaning, which is key to informed decision making. Getting the gist is not sufficient, however. Retrieval (e.g., of health-related values) and processing interference brought on by thinking about nested or overlapping classes (e.g., in ratio concepts, such as probability) are also important. Theory-based interventions that work (and why they work) are presented, ranging from specific techniques aimed at enhancing representation, retrieval, and processing to a comprehensive intervention that integrates these components.
Research on medical decision making and health addresses urgent, practical problems, a goal that might seem at odds with theory. Using fuzzy trace theory (FTT) as a worked example, I argue, instead, that the practical questions about what works in medicine and public health are best pursued by answering questions about causal mechanisms, which is the province of theory. Scientific theory gives the researcher a blueprint for practical applications and allows for cumulative progress in these applications, in contrast to fads and trial-and-error approaches that currently characterize some decision aids and interventions.
In particular, I discuss 3 claims that are grounded in FTT that pertain to how information about health and medicine is processed by patients and by physicians: 1) why precise information (e.g., about risk) does not work, 2) why a bridge is needed between health-relevant information and action, and 3) theory-based interventions that work (and why they work). Illustrations are given from HIV prevention, cardiovascular disease, surgical risk, genetic risk, and cancer prevention and control.
FTT originated as an explanation of puzzling results.1-4 In experiments that spanned many of the major paradigms in developmental, experimental, and judgment and decision-making psychology, memory capacity for verbatim background facts in problems, such as numerical information, did not affect reasoning accuracy. Judgment and decision making relied preferentially on gist representations of information (e.g., about risk), as opposed to verbatim representations. Gist and verbatim are defined much as they are in everyday parlance, except that verbatim applies not only to verbal information but also to graphs, numbers, pictures, and any other form of information. Thus, a gist representation is vague and qualitative; it captures the bottom-line meaning of information, and it is a subjective interpretation of information based on emotion, education, culture, experience, worldview, and level of development. A verbatim representation, in contrast, is precise and quantitative, and it captures the exact surface form of information (i.e., it is literal).
Consider a 49-year-old woman attempting to understand her risk for breast cancer. Suppose that she comes across the Breast Cancer Risk Estimation Tool that is available on the National Cancer Institute Web site and answers the 9 questions found there (http://www.cancer.gov/bcrisktool/). Suppose further that, according to this tool, her estimated lifetime risk of developing invasive breast cancer is 22.2%.5 The verbatim level of risk given by this tool is “22.2%.” However, the interpretation of that risk, the gist, could range from “low” to “high” risk; the risk is low in that it is unlikely to occur (less than 50%), but the risk is high relative to an average risk of 11.3% for a 49-year-old woman (also generated by the tool). The gist of the risk that is extracted from an estimate such as 22.2% depends on contextual and individual factors, including a person's level of numeracy (i.e., ability to understand numbers).6 The gist representation is that individual's answer to the question “What does 22.2% mean?”
Table 1 summarizes evidence from experiments, mathematical models, neuroimaging, and other methods that support the conclusion that people extract separate gist and verbatim memory representations from many types of information: words, numbers, literal sentences, metaphors, pictures, graphs, narratives, and events. Table 2 presents definitions and examples of judgment and decision-making effects (many demonstrated in the context of health communication and medical decision making) explained by FTT.7-13 This evidence establishes that the theory can accommodate a wide array of known effects, lending credence to its assumptions and mechanisms.
More important, however, the theory has led to new discoveries, such as the memory independence effect (that reasoning accuracy is independent of memory accuracy);14 that reliance on gist-based intuition increases with development;15,16 that such intuition reduces unhealthy risk taking;17,18 that disentangling and making set relations transparent reduces errors in probability judgment such as base rate neglect;8,19 and that reliance on verbatim memory can impair reasoning performance.2 As detailed in the next section, FTT has also been extended to how laypersons (e.g., patients) and health care providers (e.g., physicians) understand, process, and apply representations of health-relevant information in a variety of contexts, including HIV prevention,7,18 cardiovascular disease,20 surgical risk,21 genetic risk,22 and cancer prevention and control.23
Taken together, these studies show that gist and verbatim representations are extracted roughly in parallel and independently and that people prefer to operate on the crudest gist representation that they can to make judgments or decisions. What this means is that our hypothetical 49-year-old woman encodes and stores the verbatim number “22.2%” along with separate representations of the gist of that number to her, such as “That's really bad; my risk is high.” Note that gist includes the emotional meaning, or affective interpretation, of the information.24,25 As has so often been demonstrated, people may pass a knowledge test about the literal content of risk communication messages (they remember the facts they have been taught), but their risk behavior is not necessarily affected by those messages.26,27 According to FTT, judgments and decisions, and, consequently, behavior are affected by the gist that people understand, rather than the verbatim facts they are presented with.
FTT is referred to as a dual-process theory, but dual gist and verbatim representations are endpoints of what is, in reality, a continuum of representations. (For a discussion of distinctions between FTT and standard dual-process accounts, see Reyna and Brainerd.10) In particular, people extract multiple levels or “hierarchies” of gist from information, although they might only use one representation at a time in reasoning or decision making. These hierarchies of gist can be thought of as analogous to scales of measurement, with nominal or categorical being the simplest distinction, then ordinal, and then finer grained distinctions, such as interval or ratio level. For example, our hypothetical 49-year-old woman might encode “my risk is high,” “I am going to get cancer like my sister did,” “my risk is higher than average,” “0.2 means that this estimate is exact,” “22.2% is about 1 in 5,” and so on.
The preference to operate on the crudest gist, the fuzzy-processing preference, increases with experience or expertise. For example, given a patient who presented in the emergency room with nontraumatic chest pain, experienced physicians homed in on the key dimension of imminent risk of myocardial infarction (MI), whereas less experienced physicians considered more dimensions.20 Similarly, experienced physicians focused on change in size over time of pigmented skin lesions, whereas less experienced physicians considered multiple dimensions, such as pigmentation and size.28 Experienced reasoners can nevertheless engage in “garden-path” thinking, which is sometimes fallacious because new instances do not always fit old experience. The quality of categorical thinking, then, is a function of the level of understanding of the thinker (see Reyna and Adam7 and Reyna and others9 for empirical methods for judging whether thinking is advanced). Gist-based thinking is not simply the retrieval of instances experienced in the past but instead is the distillation of the meaning of past experiences into an intuitive, bottom-line interpretation (that is then recognized in and applied to current instances). It is not experience per se that is important but what is understood or learned from past experience that can be applied to recognizing similar future instances.*
A mental representation, whether gist or verbatim, does not determine judgments and decisions by itself, however. After information is represented, people retrieve their values, principles, and knowledge and apply them to the representation.9 People can retrieve reasoning principles that are then applied to representations to derive judgments and decisions, or they can retrieve factual knowledge to further interpret or elaborate on representations. For example, when a woman infers that “I am going to get cancer like my sister did,” she has retrieved knowledge about her sister from memory (e.g., “I am like my sister and my sister got cancer”) and applied that to the interpretation of 22.2%. Similarly, inferring that “my risk is higher than average” requires knowledge of some kind about “average risk” and comparing 22.2% to that average. A woman who interprets her risk as high because of a family history or genetic mutation may then decide to have a prophylactic mastectomy because she has retrieved the value “better to avoid risk,” in this instance, of breast cancer. A woman might have competing values, such as appearance (avoiding disfigurement), but the priority of values in long-term memory and the cuing of values in the episodic context jointly determine their accessibility at a given point in time (i.e., the relative importance of the value and whether it is cued in context both determine how readily it comes to mind). The latter effect of contextual cuing contributes substantially to variability in judgments and decisions, and its effect is generally underestimated. Even health care professionals can fail to retrieve highly overlearned knowledge without retrieval cues in the environment.29,30
Table 3 presents examples of gist representations and retrieved values or principles in medical decision making and health. These representations are the bottom line or culmination of what might have been a far more detailed and elaborate thought process; therefore, they do not represent everything that a person knows about, for example, chemotherapy or screening, but they are the kinds of intuitive representations that guide decision making. For instance, in a sample of 33 adults offered a variety of alternatives, 91% endorsed the gist of screening when asymptomatic as a choice between feeling okay (without screening) and taking a chance on feeling okay (a negative test result) or not feeling okay (a positive test result). Because feeling okay is better than not feeling okay (a value), this gist clearly discourages screening (which is the only option that has not feeling okay as a possible outcome).
As is apparent from the examples in Table 3, the gist is only as good as the level of understanding of the decision maker. Reyna and Adam7 reported that the gist of sexual transmission of disease as “exchange of bodily fluids” was associated with overestimation of the effectiveness of condoms, even among physicians, because this prototypic gist does not encompass infections, such as human papillomavirus, that are also transmitted skin to skin. The gist of chemotherapy as poison, although widespread, has similar shortcomings, motivating many to search for “healthy” alternatives with lower or unproven efficacy compared with chemotherapy. The gist of surgery depicted in Table 3 (namely, as a technique for removing something bad from the body) explains why surgery would be unduly favored over equally effective medical approaches to cancer (e.g., 60% chose surgery for prostate cancer according to the National Prostate Cancer Coalition's annual Men's Health Survey, released 6 June 2006).31 Research has investigated how stereotypes reflect the gist of social categories and thus conform to predictions of FTT; for example, stereotypes show developmental trends that are similar to other kinds of gist-based thinking. Diseases, such as cancer, and therapies, such as surgery or chemotherapy, are also subject to stereotypes, which are inaccurate gist representations of the essence or bottom line of the category. As Table 3 indicates, these stereotyped representations, in concert with retrieved values, sometimes lead medical decisions away from efficacious treatments and health-promoting behaviors.
Although the aforementioned representational and retrieval assumptions are central to FTT, concepts such as processing interference and inhibition have also played a part from the outset.32,33 Processing interference (often caused by nested or overlapping classes, as in probability judgments in which the target class is included in both the numerator and the denominator) rather than memory load explains many examples of human errors and fallacies. For example, consider a diagnostic test that has an 80% accuracy rate (80% positive when disease is present and 80% negative when disease is absent). Given a 10% base rate (or prevalence) of disease, if the test result is positive, is the likelihood of disease closer to 30% or 70%? In this example, the classes correspond to instances of different test results and instances of disease or no disease, which overlap with one another (e.g., having a positive test result with disease v. having a positive test result with no disease, etc.). Overlapping classes create confusion about what is being referred to and interfere with thinking coherently about probabilities.10,19,13,34 Even experienced physicians perform poorly in this simple forced-choice version of a base rate neglect task, so named because respondents fail to adjust sufficiently for the base rate. In one study, for example, 82 physicians chose the correct response only 32% of the time, significantly below a chance level performance of 50%.7,12
Studies have shown that poor performance in this task is not due to a lack of conceptual understanding of probability.8,35 Interference among overlapping classes is the key. People become confused about which classes are referred to (present for all ratio concepts, including probability): whether it is the ratio of people with positive test results to those who have disease or the ratio of people with disease to those who have positive test results. As summarized by Reyna and Brainerd, “Class-inclusion reasoning, probability judgment, risk assessment, and many other tasks, such as conditional probability, conjunction fallacy, and various deductive reasoning tasks, are subject to what has been called inclusion illusions. ... [8,10,13,34] Inclusion illusions occur because part-whole relationships are difficult to process.”2(p34) Processing can be simplified and interference reduced by providing a notational system in which elements of parts and of wholes are distinctly represented, such as Venn diagrams, used to represent subsets and more inclusive sets using a system of overlapping circles.8,36
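The diagnostic test example can be worked through explicitly. The following is a minimal sketch, assuming (as the example states) 80% sensitivity, 80% specificity, and a 10% base rate; the function name and the hypothetical cohort of 1000 patients are illustrative choices, not part of the original studies. It separates the two ratios that people tend to confuse: positives among those with disease, and disease among those with positive results.

```python
from fractions import Fraction

def positive_predictive_value(base_rate, sensitivity, specificity):
    """P(disease | positive test), computed with Bayes' rule."""
    true_positives = base_rate * sensitivity
    false_positives = (1 - base_rate) * (1 - specificity)
    return true_positives / (true_positives + false_positives)

# Values taken from the example in the text.
base_rate = Fraction(10, 100)
sensitivity = Fraction(80, 100)   # P(positive | disease)
specificity = Fraction(80, 100)   # P(negative | no disease)

ppv = positive_predictive_value(base_rate, sensitivity, specificity)

# The same computation with the classes kept distinct (hypothetical cohort).
cohort = 1000
with_disease = cohort * base_rate                    # 100 people
without_disease = cohort - with_disease              # 900 people
positive_with_disease = with_disease * sensitivity   # 80 people
positive_without_disease = without_disease * (1 - specificity)  # 180 people
ppv_from_counts = positive_with_disease / (
    positive_with_disease + positive_without_disease
)

print(f"P(positive | disease) = {float(sensitivity):.2f}")
print(f"P(disease | positive) = {float(ppv):.3f}")
```

Under these assumptions, the likelihood of disease given a positive result is 4/13, or roughly 31%, so the answer closer to 30% is correct. Laying out the counts by class is the arithmetic counterpart of the notational aids described above, in which parts and wholes are represented distinctly.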
In summary, the published literature has focused on errors in understanding messages about health-related risks and on biases in medical decision making; FTT can explain the processing origins of many of these errors and biases. These origins have to do, in no small part, with the difficulties people have in translating numbers (and other health-related information) into meaningful representations or gist, with reliably retrieving and implementing their values and knowledge, and with inherent complexities involved in processing ratio concepts, such as probabilities, among other factors.37,38 The meaning of health-relevant information is seldom self-evident, and even health professionals have difficulty retrieving knowledge and processing nested or overlapping classes involved in probability judgments.10,23,29
In the previous section, an overview of human judgment and decision making was presented in which vague, imprecise gist representations were emphasized as the major means by which people encode and act on health-relevant information. If we take this characterization of human thinking seriously, it is clear why providing physicians, members of the public, and others with highly precise information (e.g., about risk) might have little effect on their judgments or decisions and that any effect would be expected to vary depending on how the information was interpreted (qualitatively). Even when it can be demonstrated that people have accurate memory for the health information presented to them (or when the information is in front of them), they will generally not rely on that verbatim memory. Instead, people (patients and physicians) rely on vague gist, not on precise information that is presented. Moreover, they start at the lowest level of gist—categorical—and then move up in precision if they are forced to (e.g., if response constraints require a precise point estimate).2,39 (Note that low is used to mean imprecise, which is often good for performance rather than bad.16,18,20) This fuzzy-processing preference creates framing effects, task variability (the same concept tapped in different tasks yields different levels of performance), and apparent construction of preferences. That is, preferences seem to shift, like will-o'-the-wisps blown by the wind, in response to trivial changes in the wording of options (i.e., they seem to be constructed on the spot), and such effects have been obtained for actual medical decisions.40,41 However, although qualitative interpretations and levels of representation used to understand information are shifting across contexts and presentation formats, the assumption in FTT is that core values and preferences are not necessarily shifting.
To illustrate these ideas about representation, research on the classic Asian disease problem is described,1 but the remarks apply equally to frequent medical decisions involving a sure status quo and a risky but beneficial procedure or operation. For example, consider a 45-year-old man who has permanently lost the vision in one eye and is considering surgery to remove cataracts in the other eye. The decision about whether to have surgery can be framed in terms of gaining sight or preventing further loss. The gain and loss versions for the case of cataract surgery capture two different perspectives on the same decision, one of which encourages risk aversion and the other of which encourages risk seeking, for reasons that are similar to those in the Asian disease problem. Many medical decisions have this gist, namely, choosing between some functionality with impairment and a procedure or operation that offers improvement but with some risk of death or even worse disability (e.g., hip replacement).
Standard theories of framing effects (e.g., shifts in preference from risk aversion for gains to risk seeking for losses; Table 2) ascribe the effects to the psychophysics of number perception.42,43 For instance, the number of people saved in the gamble for the Asian disease problem (600) is discounted because the function relating objective numerical outcomes to perceptions of those outcomes is not linear. Nonlinearities are also said to distort the weighting of probabilities. However, according to FTT, decisions are based on qualitative gist rather than precise numbers.1,2 For the Asian disease problems, the simplest contrast between none and some (the nominal level) distinguishes the options: in the gain frame, the options boil down to save some people for sure or take a risk and possibly save some people or save none. Because saving some people is better than saving none (a core value), the sure option is preferred. Analogously, in the loss frame, the options boil down to some people die for sure or take a risk and possibly some people die or none die. Because none dying is better than some dying (a core value), the risky option is preferred.
If the FTT account is correct, removing all or some of the numbers and replacing them with vague words, such as some, that preserve the bottom-line gist should preserve or even enhance framing effects.† Such nonnumerical framing effects have been obtained.1,2 These nonnumerical framing effects demonstrate that, despite pervasively low levels of numeracy as assessed with nationally representative samples,6 efforts to increase the precision of people's understanding of numbers are misguided; efforts should instead focus on qualitative relations among numbers (i.e., gist).
In addition to nonnumerical framing effects, further evidence supports the idea that simple gist representations (the contrast between some and none) guide these decisions. On one hand, in prospect and related theories, the complement in which no one is saved or none die (e.g., 2/3 probability that no one would be saved) literally contributes zero to predictions (e.g., 2/3 probability × 0 saved = 0 expected utility). However, the zero complements provide the pivotal categorical contrast in FTT and hence are essential for observing framing effects. On the other hand, the complement without zero outcomes (e.g., 1/3 probability that 600 people will be saved) is pivotal for prospect and related theories but conveys a gist of equivalence of options (200 ≈ 1/3 × 600) in FTT. Consistent with FTT, large framing effects were observed when people focused on choosing between the sure option and the zero complement of the gamble (although no information was missing as the entire gamble had been provided in background information), but framing effects disappeared when people focused on the very numbers that were supposed to be the source of the effect in standard theories. These findings support the conclusion that people use the simple, bottom-line gist of information to make decisions, rather than exact numbers.
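The contrast between the two accounts can be made concrete with the numbers of the gain frame. This is a minimal sketch, assuming the sure option saves 200 people and the gamble saves 600 with probability 1/3 and none with probability 2/3, as given above; the function names and the reduction of outcomes to "some" v. "none" are illustrative simplifications rather than a model taken from the cited studies.

```python
from fractions import Fraction

# Each option is a list of (probability, number of people saved) pairs.
sure_option = [(Fraction(1), 200)]
gamble = [(Fraction(1, 3), 600), (Fraction(2, 3), 0)]

def expected_value(option):
    """Verbatim-level analysis: probability-weighted number saved."""
    return sum(p * saved for p, saved in option)

def categorical_gist(option):
    """Nominal-level gist: reduce each outcome to 'some' or 'none' saved."""
    return sorted({"some" if saved > 0 else "none" for _, saved in option})

print(expected_value(sure_option), expected_value(gamble))  # 200 200
print(categorical_gist(sure_option))  # ['some']
print(categorical_gist(gamble))       # ['none', 'some']
```

The expected values are identical, and the zero complement contributes nothing to them, yet it is the only outcome that introduces "none" into the categorical description of the gamble. Dropping it leaves both options described as "some," which is the gist of equivalence noted above.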
Framing effects emerge with age from childhood to adolescence, as the fuzzy-processing preference (gist-based thinking) increases.16-18 Gist-based thinking explains risk aversion in laboratory tasks involving gains, which is mirrored in adolescent real-life risk taking. For example, adolescents (who take risks) treat increments in experimentation with drugs (e.g., trying drugs 0, 1, 2, 3, 4 times or more) as smoothly increasing in perceived risk, whereas adults (who take many fewer risks) treat anything above zero times of experimentation as sharply more risky.44 Consequently, adolescent–adult differences were largest when evaluating the harmfulness of trying drugs “once or twice.” Adults evaluate potentially catastrophic risks in categorical terms, as safe (no risk) or risky.18,27
For some decisions, there is no safe, sure option, only varying degrees of risk (e.g., a choice between the risk of side effects of medication and the risk of death from disease). In this case, the precision of representations is increased, and ordinal gist representations (e.g., lower v. higher risk) are used. Unlike verbatim representations of numbers, the ordinal gist of numbers is an interpretation that is colored by the context of other numbers.45,46 In a study by Fagerlin and others,47 for example, some women were asked for estimates of their lifetime risk of breast cancer, and other women were not asked for such estimates. The average estimate was 46% probability of breast cancer among the women who were asked (a large overestimate). When both groups of women were told that there was a 13% average lifetime risk, women comparing 13% with their previous estimates (averaging 46%) thought 13% was a “low” risk. Women who had not given an initial estimate were more likely to view the same 13% as a “high” risk. Similarly, Windschitl and others48 asked people to evaluate a risk of 12% of disease for women paired with either a risk of 4% for men or of 20% for men. The same 12% was viewed as higher in risk when it was paired with 4% relative to 20%. Thus, results support FTT's contention that the gist of risk is relative; that is, like any semantic interpretation, it depends on the context.47-50
One of the most important contexts for risk communication is informed consent for surgery. In a study of memory for risk information given to actual patients prior to a carotid endarterectomy (removing blockages from the carotid artery to prevent stroke or death), Lloyd and others51 found that many patients could not accurately report verbatim risk estimates quoted to them and supplied in writing. Even at the point of providing signed consent for the surgery, many patients grossly misestimated the risks of surgery or of stroke with and without surgery. Reyna and Hamilton21 argued that most patients’ estimates were consistent with the correct ordinal gist of the information (i.e., their estimates preserved the correct ordering of risks, with no surgery being highest in risk and surgery being lowest). They also pointed out that patients whose estimates were further off the mark quantitatively from the verbatim estimates could be argued to have given informed consent if erroneous estimates preserved the essential gist (e.g., that surgery involved risk), whereas some patients whose estimates were closer to the actual risk had not given informed consent. So, given a risk of 2% of dying on the table during surgery, an estimate of 10% for that risk would constitute informed consent (because the patient recognizes that the surgery has some risk), but an estimate of 0% would not constitute informed consent, even though it is numerically closer to 2%. The implication of these observations is that informed consent is a matter of getting the right gist as opposed to verbatim accuracy, a harder task because it requires achieving understanding. It is possible for patients to not get the facts right and still get the gist or to get the facts right but not get the gist (not derive the proper meaning). 
The key for informed consent, according to FTT, is getting the gist.52 Summarizing the evidence reviewed thus far regarding mental representations, people use the lowest (i.e., least precise) level of gist they can to accomplish a task (e.g., choice). These hierarchies of gist can be likened to scales of measurement (especially when considering numbers), such as nominal (categorical), ordinal, interval, and ratio scales. An example of categorical gist would be that surgery has some risk, as opposed to none. An example of ordinal gist is that surgery has less risk than not having surgery. Gist is relative in the sense that it is sensitive to context, such as the values of other quantities; 12% can be a high or low risk, depending on what it is compared to. Informed consent is about getting the gist of information right rather than getting the verbatim facts right. It can be seen from this summary why providing precise information (e.g., about risk) is not necessarily effective and why exhorting people to use such information does not solve the problem. This analysis also suggests that people who recall verbatim facts about risk might not be fully informed. To be compelling, information must appeal to gist-based intuition rather than verbatim-based analysis.
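The difference between verbatim accuracy and getting the gist can be illustrated with the surgical example (an actual risk of 2%, with patient estimates of 10% and 0%) and the context example (12% compared with 4% or 20%). The sketch below is only an illustration: the function names and the simple threshold rules are assumptions made for this example, not a scoring procedure from the cited studies.

```python
def categorical_gist(risk_percent):
    """Nominal level: is there any risk at all?"""
    return "some risk" if risk_percent > 0 else "no risk"

def ordinal_gist(risk_percent, comparison_percent):
    """Ordinal level: how does the risk rank against a comparison value?"""
    if risk_percent > comparison_percent:
        return "higher"
    if risk_percent < comparison_percent:
        return "lower"
    return "same"

def consent_check(estimate_percent, actual_percent):
    """Return (verbatim error, whether the categorical gist is preserved)."""
    error = abs(estimate_percent - actual_percent)
    gist_preserved = (
        categorical_gist(estimate_percent) == categorical_gist(actual_percent)
    )
    return error, gist_preserved

print(consent_check(10, 2))  # (8, True): further off, but the gist is right
print(consent_check(0, 2))   # (2, False): closer, but the gist is wrong
print(ordinal_gist(12, 4))   # higher
print(ordinal_gist(12, 20))  # lower
```

The estimate with the larger numerical error preserves the categorical gist that surgery carries some risk, whereas the numerically closer estimate does not. The last two lines show the same 12% taking on opposite ordinal gists depending on the comparison value.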
It might be argued that the points discussed thus far apply mainly to laypersons, especially to those low in numeracy or education. However, one of the main tenets of FTT is that gist-based intuition is advanced. Fuzzy-processing preference increases with age from childhood to adulthood and with increasing expertise in adulthood.16,18,20 For example, Reyna and Lloyd20 studied participants ranging in domain-specific expertise from medical students to attending physicians in internal medicine to expert cardiologists (who pursued specialized study beyond internal medicine). Three patient profiles were presented at each of 3 levels of overall risk (low, intermediate, and high) for unstable angina, according to American Heart Association (AHA) and American College of Cardiology (ACC) guidelines (profiles were prepared by an experienced physician based on the guidelines). Patient characteristics such as age, gender, and type of chest pain were varied. These 9 patient descriptions resemble summaries routinely used when physicians seek consultations from specialists, and they were presented in random order. Judgments of imminent risk of heart attack (MI) and probability of coronary artery disease (CAD) were elicited, among other judgments, as well as admission decisions (outpatient follow-up, admission to hospital but not intensive care, or admission to cardiac intensive care).
As might be expected, greater expertise was associated with better discrimination between lower and higher risk patients, based on guidelines, for both risk judgments and admission decisions. However, better discrimination was achieved using fewer dimensions of information. Higher and higher levels of risk (based on their own estimates of risk) of CAD were tolerated as expertise increased across 5 groups of physicians. The decision threshold for heart attack (MI) risk, however, went down across the same groups; admission decisions hinged increasingly on that one dimension, whereas high levels of risk on the other dimension (CAD) were ignored. For the most expert, only MI risk correlated significantly with admission probability and admission decisions. The most expert physicians also either discharged patients or admitted them to cardiac intensive care, whereas less expert physicians and students made more fine-grained distinctions among the same patients.
To summarize, higher knowledge groups relied on fewer dimensions of information than lower knowledge groups, and experts made sharper all-or-none distinctions among decision categories. Consistent with FTT, experts achieved better discrimination by processing less information, more crudely. The tendency to rely on categorical gist (at risk or not) increased as participants became more knowledgeable. (These results mirror framing effects; gist-based intuition is preferred as development advances.) Health care professionals in this and in other studies rely on gist, too, more so as training and experience increase.7,23,29,35 Therefore, despite advanced quantitative competence (necessary for admission to medical school), physicians preferred to make gist-based categorical (all-or-none, at-risk v. outpatient) decisions. Because physicians rely on gist, they are more susceptible to intuitive stories that place individuals in qualitatively different categories rather than basing clinical decisions on statistical facts about populations.23 The principles noted earlier, such as appealing to gist-based intuition, apply as readily to professionals as to laypersons.
Specific interventions for risk communication and medical decision making are suggested by the principles of FTT and have been evaluated in research.14,35,53 For example, bar graphs presented side by side can convey relative risk and encourage risk-avoidant behavior because individuals make a gist-based relative magnitude judgment by comparing the heights of the bars.53 Beginning at a young age, people automatically make perceptual estimations of relative magnitudes of target instances to make probability judgments.10,13,52 For example, Figure 1 highlights relative risk by showing that the number of patients with disease drops by 15 with treatment A relative to treatment B. The theoretical principle that is illustrated here is that the relative heights of bar graphs facilitate extracting a salient gist—namely, that treatment A has “lower risk” than treatment B.
Note that FTT does not claim that people only encode gist but rather that they encode both gist and verbatim representations in parallel and tend to rely on gist in making decisions. When gist and verbatim information conflict, however, as when differences between bars look large but the numbers are actually small (Figure 1B), people, with some effort and assistance, can reject the misleading visual illusion.32 Panel B illustrates this theoretical principle of interference between competing verbatim and gist representations: both .003 and .006 are objectively small, but the bar graphs emphasize relative magnitude, that A is lower than B.10,54 People rely on the salient gist as a default, but they respond to reminders to pay attention to specific numerical details, such as the fact that .003 and .006 are objectively small.2,13
As shown in Figure 2, stacked bar graphs facilitate appropriate attention to denominators as well as numerators, which can be neglected because of nested or overlapping classes (see Background).8,10,19,55 Such denominator neglect is illustrated by a number of judgment phenomena, such as overestimation of small risks or confusion of conditional probabilities (e.g., when sensitivity and positive predictive value are confused with one another), as predicted by FTT. In Figure 2, denominators are explicitly represented to avoid denominator neglect, and their absolute magnitudes are such that they dwarf the small levels of absolute risk. Indeed, the smaller bars representing absolute risk (the number with disease, panel A, or the proportion with disease, panel B) are so small that they can barely be discerned against the backgrounds of the larger bars that represent the total number treated. Thus, stacked bar graphs are better at conveying absolute risk, whereas simple bar graphs are better at conveying relative risk.54 Decimals in conjunction with simple bar graphs (Figure 1B) can simultaneously convey absolute and relative risk, but care must be taken to ensure that people do not focus solely on the salient gist of relative risk, which is encoded automatically. In these displays, the gist of absolute risk—the gist interpretation of verbatim numbers, such as .003 and .006, as “small”—should be highlighted so that it does not escape notice (as FTT predicts it will).
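The contrast between the relative and absolute readings of the same two numbers can be made concrete with a short calculation. The sketch below is illustrative only: it uses the risks of .003 and .006 quoted for Figure 1B, while the function name and the cohort of 1000 treated patients are assumptions introduced here, not values taken from the figures.

```python
from fractions import Fraction

def risk_summary(risk_a, risk_b):
    """Compare two risks in relative terms and in absolute terms."""
    return {
        # Ratio of risks: the comparison that side-by-side bars make salient.
        "relative_risk": risk_a / risk_b,
        # Difference in risks: the quantity that stacked bars keep in view.
        "absolute_difference": risk_b - risk_a,
    }

# Risks quoted in the text for treatments A and B (Figure 1B).
risk_a = Fraction(3, 1000)
risk_b = Fraction(6, 1000)

summary = risk_summary(risk_a, risk_b)

# Hypothetical cohort size, used only to express the difference as a count.
cohort = 1000
extra_cases = summary["absolute_difference"] * cohort

print("Relative risk (A vs. B):", float(summary["relative_risk"]))
print("Absolute difference:", float(summary["absolute_difference"]))
print("Extra cases per", cohort, "treated with B:", int(extra_cases))
```

Under these assumptions, the relative reading (A carries half the risk of B) and the absolute reading (a difference of 3 cases per 1000 treated) describe the same data but support different gists, "lower risk" and "small risk," respectively.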
In addition to bar graphs, there are a number of other graphical formats that have predictable effects on the extraction of gist relations among numbers.56-58 A few examples can be used to illustrate that different formats highlight different gists: a line graph is typically the best choice when illustrating the effectiveness of a drug over time or other trends over time (e.g., survival and mortality curves). The gist of a monotonic trend (e.g., that magnitude is going up but not by exactly how much) is extracted automatically from line graphs (and in many other tasks).32,59,60 Note that the same theoretical principle applies here as with the bar graphs. People tend to ignore verbatim numbers (e.g., those shown on the y-axis) in favor of the salient gist relation (e.g., that magnitude is going up or going down), but they encode both and can pay attention to both under specific circumstances (e.g., when directed to attend to the small range of values on the y-axis).
As we have discussed, people intuitively grasp the use of height in visual displays to signify the gist of relative magnitude (e.g., levels of risk), as in bar graphs and risk ladders. Thus, a bar graph is effective for comparing the relative rates of adverse events for different medical treatments; pie charts are also useful for judging relative proportions. Pictographs, in which icons represent the number in a population affected by some event, can be used to illustrate magnitude and convey randomness. Again, the theoretical principle applies that people easily and automatically estimate relative magnitudes perceptually, whether comparing relative heights of bar graphs, differently colored areas in a pie chart, or differently colored icons in a pictograph. However, pictographs that display icons in a random rather than systematic fashion make it harder to “get the gist,” that is, to judge relative magnitude.13 This interfering effect on gist extraction of scrambling inputs rather than ordering them systematically has been replicated in a variety of tasks: in probability judgments,13 linear inferences,60 and false-memory tasks that involve getting the gist of a semantically related word list or set of sentences.61-63 Scrambling the inputs disrupts the representational momentum associated with encoding meaningfully related items (the tendency to see a meaningful pattern in related inputs increases with experience15). Thus, based on FTT, scrambled v. ordered presentation of meaningful inputs has been predicted and shown to affect the readiness with which people extract the gist of those inputs.
The representational effect of presentation formats on making gist salient should be distinguished from the processing effect of disentangling nested or overlapping classes (see Interventions That Mainly Affect Processing). Stacked bar graphs and other visual displays that make class-inclusion (part-whole) relations distinct (e.g., Venn diagrams) reduce processing interference from nested or overlapping classes. FTT predicts that visual displays that emphasize only the numerator (i.e., showing only adversely affected individuals) tend to increase risk-avoidant behaviors, whereas those that highlight both the numerator and denominator tend to decrease risk-avoidant behavior. In sum, graphical displays can be valuable in helping individuals detect global patterns (e.g., linear trends), perform rudimentary magnitude comparisons, and see part-whole relations, such as those that involve conditional probabilities, ratios, and proportions. Biases and errors in risk estimation and probability judgment are predictable based on a few empirically supported theoretical principles, such as salience of relative magnitude and denominator neglect, and graphical tools and techniques exist that have been shown to mitigate these biases and errors.
Because different formats highlight different meanings, it behooves health care professionals and designers of decision aids to think carefully about the most important aspects of information that must be conveyed to patients and members of the public. One can focus on low absolute risk, knowing that that will tend to encourage complacency (Fagerlin and others47 note that when risks are lower than expected, women return at a lower rate for screening tests), or focus on relative risk, knowing that that will tend to encourage risk avoidance and prevention behaviors, which is beneficial at a population level. Similar considerations apply in medical decisions; for example, does one wish to stress that treatment A is more efficacious than treatment B or that both treatments have a low probability of success and serious side effects (or, like the cataract surgery, that there is a gains perspective and a losses one)? If health messages stress verbatim accuracy, as many have argued they should, protected sex, Pap smears, and other prevention measures are likely to be discouraged (e.g., the objective risk of HIV infection is low for most groups18). In short, the values of patient autonomy and shared decision making should not be discarded, but professionals have the burden of considering the consequences of framing information in ways that guide decision makers toward one gist representation rather than another.
Representation and retrieval are both important in reducing unprotected sex and other behaviors that put people at risk of contracting sexually transmitted diseases (STDs), including HIV infection.18,64 Retrieval cues have been shown to be effective in a wide variety of populations (e.g., high school students, health educators, nurses, and physicians) in increasing perceptions of the risk of STDs.7,29 The cues are simply reminders to think about each STD and ways in which someone might contract the disease. Reminders about facts that are well known to respondents (e.g., that syphilis and gonorrhea are STDs) are sufficient to increase risk perceptions. Practice with a variety of likely retrieval cues is important in achieving automaticity. Automatic responding to contextual cues that signal risk, not deliberative thinking, is a goal of FTT's risk reduction interventions.
FTT's principles of representation and retrieval have been implemented in a randomized trial with high school students designed to reduce premature pregnancy and sexually transmitted infections. The 3 arms of the trial are 1) a “control” group that receives an unrelated but beneficial intervention for the same number of contact hours as the treatment groups; 2) a standard, comprehensive risk reduction intervention (Reducing the Risk27) that integrates elements of a host of health behavior approaches, such as the theory of reasoned action, and includes direct instruction and role-playing to promote self-efficacy and refusal skills; and 3) an enhanced version of Reducing the Risk that incorporates all of the same content but emphasizes gist representations of knowledge, gist-based thinking, and retrieval of gist-based values and principles (Table 4). This comprehensive, scaled-up enhanced intervention involves a minimum of 14 contact hours with students, direct instruction in the health curriculum (and role-playing) in small groups in school settings and after-school youth programs, encouragement and tips to engage in communication with parents, and assessments conducted preintervention, immediately postintervention, and at 3-, 6-, and 12-month follow-ups.
Table 4 presents some illustrative examples of gist representations of knowledge, gist-based thinking, and gist-based values and principles that are taught in the enhanced intervention. The goal of the intervention is not to inculcate mindless aphorisms (“Mom says don't do it”) or to indoctrinate students with values/principles that they do not believe in. On the contrary, the main aspect of the definition of gist in FTT is that it represents meaningful understanding (type B or productive thinking in gestalt theory, which facilitates transfer beyond the concrete situations covered in the curriculum), not mindless memorization or compliance (type A or nonproductive thinking in gestalt theory). Thus, the aim of the curriculum is to help students understand, for example, why HIV/AIDS is incurable (i.e., it is a virus, and other sexually transmitted viruses, herpes simplex and human papilloma, are also incurable for the same reason, etc.). Similarly, students select gist principles that represent their own highest values rather than having values foisted upon them. Educators then help them recognize cues that would signal the relevance of those values in real-life contexts.
In sum, the aim of the curriculum is not to increase the precision of risk perceptions or to increase reflective deliberation at the point of decision making in real life but rather to help students to 1) understand the gist of knowledge that will help them avoid or reduce unhealthy risk taking (which is expected to endure longer in memory after the intervention, compared to memory for verbatim details); 2) engage in gist-based thinking, rather than detailed verbatim analysis of pros and cons of risk taking; 3) quickly and automatically recognize signs that risk taking or danger (e.g., forced sex) is imminent; 4) quickly and automatically retrieve their core values and principles that are relevant in risky contexts; and 5) apply those values and principles to their representations of the situation to make healthy decisions.
Perhaps the most thoroughly investigated intervention derived from FTT involves reducing processing interference (part-whole confusions) from overlapping or nested classes.8,14,19,22,35,65 Although we have discussed these effects in connection with visual displays, the underlying theoretical principle extends to other kinds of information formats. Processing interference is exemplified in phenomena such as conjunctive and disjunctive probability judgments, conditional probability judgments (e.g., the diagnosis example of base rate neglect given earlier), ratio/numerosity biases (preferring a ratio to win of 9 out of 100 over 1 out of 10 because 9 is bigger than 1), frequency effects (being more impressed by a ratio of 20 out of 100 than an equivalent percentage of 20%), overestimating very small risks, and many other cognitive illusions.10 Each of these judgments has in common that people are confused by overlapping classes and seize on comparing focal classes in the numerator, thereby neglecting denominators. For example, sensitivity (the probability that someone with disease gets a positive test result) is confused with positive predictive value (the probability that someone with a positive test result has the disease) because only the denominators differ. The same joint probability appears in the numerator of both sensitivity and positive predictive value, and thus they seem similar when denominators are neglected.
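The ratio bias described above can be sketched as two decision rules applied to the same pair of options. This is a minimal illustration, not a model drawn from the cited studies: the function names are hypothetical, and the only values used are the 9 out of 100 and 1 out of 10 chances mentioned in the text.

```python
from fractions import Fraction

def choose_by_numerator(options):
    """Pick the option with the largest numerator, neglecting denominators."""
    return max(options, key=lambda option: option[0])

def choose_by_ratio(options):
    """Pick the option with the largest probability of winning."""
    return max(options, key=lambda option: Fraction(option[0], option[1]))

# Each option is (winning outcomes, total outcomes), as in the text.
options = [(9, 100), (1, 10)]

biased_choice = choose_by_numerator(options)
coherent_choice = choose_by_ratio(options)

print("Numerator-only rule prefers:", biased_choice)
print("Ratio rule prefers:", coherent_choice)
```

The two rules diverge precisely because the focal class (the numerator) is compared in isolation under the first rule, which is the pattern of denominator neglect that the text attributes to overlapping classes.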
As discussed in connection with stacked bar graphs, the remedy for this problem is to disentangle overlapping classes, making the referent classes clear and more easily mentally manipulated when comparing ratios.2 In one intervention with medical students and internal medicine residents, participants were taught to use a 100-square grid to represent pretest probability of disease, as well as sensitivity and specificity (the number of people without disease who have a negative test result), to read off the positive predictive value (and other conditional probabilities).35 The key to the success of this intervention, which eliminated most errors, was the ability to separately perceive each class: the patients with disease who had a positive result, the patients without disease who had a positive result, the patients with disease who had a negative result, and patients without disease who had a negative result. Participants were then able to perceptually estimate the relative number of squares indicating disease among the squares with a positive result (indicated by a plus sign).
Consistent with FTT, errors in reasoning caused by processing interference persist late into development. Patients make errors that are similar to those made by physicians in estimating conditional probabilities such as the probability of disease given a genetic mutation and vice versa.22,23 The errors are neither conceptual nor due to limits of working memory capacity, and they are present in advanced reasoners with large funds of content knowledge.9,20 Although they make such judgments routinely, physicians are as likely as untutored high school students to commit errors in updating probabilities based on diagnostic test results.12 Because the errors are not due to fundamental misconceptions, once classes are represented discretely, respondents ranging from students to expert physicians are able to make coherent judgments. In another instantiation of the same theoretical principle, merely estimating each of the 4 constituent probabilities separately in a 2 × 2 table also significantly reduces conjunctive and disjunctive errors in probability judgments (Table 2).66,67 Filling in, for example, the probability that Linda is a bank teller and a feminist, a feminist but not a bank teller, a bank teller but not a feminist, and neither and then making conjunctive or disjunctive judgments significantly reduces errors.
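Why filling in the 4 cells constrains later judgments can be seen in a brief sketch. The cell values below are hypothetical estimates chosen only for illustration, and the variable names are introduced here; they are not data from Table 2 or from the cited studies.

```python
from fractions import Fraction

# Hypothetical estimates for the 4 cells of the 2 x 2 table (must sum to 1).
cells = {
    ("teller", "feminist"): Fraction(5, 100),
    ("not teller", "feminist"): Fraction(85, 100),
    ("teller", "not feminist"): Fraction(1, 100),
    ("not teller", "not feminist"): Fraction(9, 100),
}
assert sum(cells.values()) == 1, "the 4 cells must exhaust all possibilities"

# Marginal probabilities are sums over the cells that make up each class.
p_teller = cells[("teller", "feminist")] + cells[("teller", "not feminist")]
p_feminist = cells[("teller", "feminist")] + cells[("not teller", "feminist")]

# The conjunction is a single cell; the disjunction is everything but "neither".
p_teller_and_feminist = cells[("teller", "feminist")]
p_teller_or_feminist = 1 - cells[("not teller", "not feminist")]

print("P(bank teller):", p_teller)
print("P(bank teller and feminist):", p_teller_and_feminist)
print("P(bank teller or feminist):", p_teller_or_feminist)
```

Because the conjunction is one of the cells that are summed to give each marginal probability, it cannot exceed either marginal once the cells are estimated separately, and the disjunction cannot fall below either one. Any nonnegative set of cell estimates that sums to 1 therefore yields coherent conjunctive and disjunctive judgments.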
It is important to note, however, that salient gist interacts with processing interference, increasing errors. Therefore, the compelling portrait of Linda enhances the class inclusion illusion, or conjunctive fallacy, because it directs attention to the focal gist of Linda as surely a feminist. Similar salient gist representations have been identified in reasoning problems that produce large and robust illusions.2,8,13,34 Individual differences have also been identified in the ability to inhibit interference and in the tendency to fall prey to cognitive illusions.19,68-71
In summary, there are 3 elements to creating missteps in reasoning about overlapping or nested classes: 1) a “push” factor created by confusion from overlapping classes that applies to any ratio concept, including risks and probabilities, that pushes reasoning away from the correct solution; 2) a “pull” factor that compels reasoning in the direction of salient or compelling gist; and 3) behavioral inhibition, the ability to retrieve values or knowledge to edit processing (e.g., to retrieve knowledge about class relations to reject the possibility that being a feminist bank teller could be more likely than being a bank teller). Disentangling overlapping classes prevents a cascade of reasoning errors, reducing susceptibility to salient but misleading gist (e.g., reducing errors in the Linda and similar problems).
In this article, the basic assumptions of FTT have been presented that relate to how health information is mentally represented, retrieved, and processed in decision making and, ultimately, behavior. Often, health information is poorly understood, and advances in disease prevention and treatment are inaccessible to the most vulnerable members of society who are at greatest risk of disease. The reasons for this are many and include low numeracy, but the solution is not necessarily to provide more numbers and greater detail about those numbers. Precise information (e.g., about risk) is frequently ineffective in changing decisions and behaviors because patients and professionals rely on gist instead. The gist representation is the answer to the following question: what does the information mean to that individual?
A bridge is needed between information and action because information is filtered through the brain to become action, and so it must appeal to gist-based intuition. This bridge is provided by encouraging people to extract appropriate categorical (e.g., “my risk is high” so I have to make lifestyle changes) and ordinal (e.g., “my risk is higher than others” so I cannot eat what others eat) gist representations, to retrieve health-relevant values and principles automatically in contexts when they are needed (e.g., avoid deadly risks, such as HIV), and to implement those values coherently when processing information, especially about risks and probabilities that are inherently confusing. Predictable variations in representations, retrieval, and processing explain what seem to be variations in core values and preferences.
A number of specific theory-based interventions were discussed that work (and why they work), including ways of formatting information so that the relevant gist pops out, providing retrieval cues that remind people of knowledge that they already have, and disentangling classes and representing them discretely to improve probability judgments. This approach suggests new ways of targeting health communications and of designing decision aids, ways that should make advances in health and medicine more accessible to people who now suffer and die needlessly.
Preparation of this manuscript was supported, in part, by grants from the National Cancer Institute (Basic and Biobehavioral Research Branch, Behavioral Research Program, Division of Cancer Control and Population Sciences; R13CA126359), the National Institutes of Health (MH-061211), and the National Science Foundation (BCS 0553225).
*One might well ask how an outside viewer (e.g., a clinician) assesses whether a patient's (or a clinician's) gist of a situation is adequate for the purposes of decision making. In other words, does the theory have any way of predicting the circumstances when a gist will be an adequate basis for a decision and when it might lead to a serious error? Reyna and others provide a detailed answer to this question.2,7,9,18,20,29 The short answer is that we have applied criteria for internal coherence of decision processes and external correspondence with reality, especially outcomes, and have shown that both are required to ascertain decision quality. In clinical medicine, years of experience do not necessarily mean that a particular clinician gets the gist (i.e., that he or she understands the underlying causal mechanisms of a disease, its pathophysiology, and how those mechanisms give rise to symptoms and are best treated). Clinicians with specialized training, however, who rely more on simple gist compared with those with less training, have been shown to make superior diagnoses that agree better with evidence-based guidelines.20 As people become more advanced thinkers (judged by objective criteria of coherence and correspondence), they tend to rely more on crude (i.e., less precise) gist representations. The appropriate gist (which depends on the specific task at hand) is simple, but it does not omit crucial features. Simplification of complexity, done right, is generally a virtue not a vice. Thus, FTT is both a descriptive and prescriptive theory.72
†Simply offering an alternative view without testing prior theories is not acceptable in science because new theories should not be introduced if prior theories are sufficient to account for findings. The findings discussed here falsify any theory that assumes that the psychophysics of numerical quantities explains results (i.e., expected utility and related theories). FTT builds directly on prior findings of prospect theory, but the latter cannot account for the framing effects reviewed. Although verbatim representations of number are encoded (and may cause framing effects under specific circumstances), people rely mainly on gist representations, and this assumption is both necessary and sufficient to account for framing effects.