The Scientific Report of the 2015 Dietary Guidelines Advisory Committee was primarily informed by memory-based dietary assessment methods (M-BM; e.g., interviews, surveys). The reliance on M-BM to inform dietary policy continues despite decades of unequivocal evidence that M-BM data bear little relation to actual energy and nutrient consumption. M-BM data are defended as valid and valuable despite no empirical support, and no examination of the foundational assumptions regarding the validity of human memory and retrospective recall in dietary assessment. We assert that uncritical faith in the validity and value of M-BM has wasted significant resources and constitutes the greatest impediment to scientific progress in obesity and nutrition research. Herein, we present evidence that M-BM are fundamentally and fatally flawed due to well-established scientific facts and analytic truths. First, the assumption that human memory can provide accurate or precise reproductions of past ingestive behavior is indisputably false. Second, M-BM require participants to submit to protocols that mimic procedures known to induce false recall. Third, the subjective (i.e., not publicly accessible) mental phenomena (i.e., memories) from which M-BM data are derived cannot be independently observed, quantified, nor falsified; as such, these data are pseudoscientific and inadmissible in scientific research. Fourth, the failure to objectively measure physical activity in analyses renders inferences regarding diet-health relationships equivocal. Given the overwhelming evidence in support of our position, we conclude that M-BM data cannot be used to inform national dietary guidelines and the continued funding of M-BM constitutes an unscientific and significant misuse of research resources.
“When the facts change, I change my mind. What do you do, sir?”1 John Maynard Keynes
Over the past century, our nation's food supply and the nutritional status of Americans have improved to a level unparalleled in human history.2,3 While this reality may be contrary to the popular belief that our modern diet is inherently inadequate, the data are clear. In the early 20th century nutritional diseases such as pellagra, beriberi, rickets, and goiter were significant public health challenges. In the U.S. alone, pellagra (a disease of niacin deficiency) claimed more than 100,000 lives and severely affected more than 3 million people.4 Yet in 2013, the Centers for Disease Control and Prevention's (CDC) Second National Report on Biochemical Indicators of Diet and Nutrition reported that nearly “80% of Americans (aged ≥ 6 y) were not at risk of deficiencies in any of the 7 vitamins” examined via biomarkers (i.e., vitamins A, B-6, B-12, C, D, and E; emphasis added).2 In addition, ~90% of women of child-bearing age (i.e., 12-49 years) were not at risk of iron deficiency, and folate levels increased by ~50 % since the previous national report.2,5 As such, the vast majority of the US population is not at risk for nutritional deficiencies, nor do they suffer from nutritional deficiencies and associated diseases.
Given these significant improvements in diet-related health and recent work demonstrating that non-genetic evolution may be the predominant driver of the ‘diseases of excess’ (e.g., obesity epidemic and risk of type 2 diabetes mellitus, T2DM),6-8 it can be posited that diet is no longer a major risk factor for disease for the vast majority of Americans. If accurate, this hypothesis suggests the billions of research dollars targeted for diet and nutrition-related health research are misdirected.9,10 Nevertheless, despite the significant dietary milestones of the past century and substantial increases in federal funding over the last two decades,9,10 research into human nutrition has been increasingly criticized.11-13 The genesis of these criticisms is the appalling track record of highly publicized nutrition claims derived from epidemiologic studies (e.g., see 14,15) that consistently failed to be supported when tested using objective study designs.11,16 Young and Karr examined over 50 nutritional claims from observational studies for a wide variety of dietary patterns and nutrient supplementation and demonstrated that “100% of the observational claims failed to replicate” and five claims were statistically significant “in the opposite direction.”17 These outcomes and others 18-21 suggest that as often as not, when epidemiologic nutrition claims are tested against objective research methods, the results are either inconclusive or indicative of a contrary outcome.
Epidemiologic studies suggest that almost any nutrient can be associated with a myriad of outcomes,11,22 as observed in Schoenfeld and Ioannidis' article, “Is everything we eat associated with cancer?”22 With persistent cycles of specious nutrition claims in the media, it is unsurprising that the public is confused and incredulous.23 Insofar as the provision of clear and consistent dietary guidelines for the consuming public is a goal of nutrition epidemiology, it has failed in decisively answering the simple question, “what should we eat?”24 Nowhere is this fact more evident than the shifting sands of opinion on the relative risks of fat, salt, cholesterol, and sugar.25-30 Five decades of controversy surrounding basic dietary guidelines and nutrition recommendations is a public acknowledgement of a failed research paradigm. The striking incongruence between the improvements in the nutritional status of the U.S. population2,5 and the current state of confusion, controversy, and clinical failure of epidemiologic nutrition research could not be clearer and necessitates an examination of both the validity and value of epidemiologic nutrition research.
Memory-based dietary assessment methods (M-BM; e.g., interviews, questionnaires, and surveys31,32) are the dominant data collection protocols in national nutrition surveillance,33 government-funded epidemiologic nutrition34 and obesity research.33 Importantly, M-BM data are used to inform national nutritional policy and dietary guidelines.30 The recent Scientific Report of the 2015 Dietary Guidelines Advisory Committee (DGAC) stated explicitly that “[m]ost of the DGAC data analyses used…” the M-BM of the National Health and Nutrition Examination Survey (NHANES) dietary component, ‘What We Eat in America’ (WWEIA).30 While decades of unequivocal evidence demonstrate that the indirect, proxy estimates derived from M-BM bear little relation to actual energy or nutrient consumption,13,33,35-45 the underlying assumptions regarding the validity of human memory and recall in dietary assessment have not been questioned. To the contrary, M-BM data are vigorously defended as valid and inherently valuable46 despite no empirical support for those assertions. While the relationship between two different constructs may be expected to be weak, the trivial relationships between the proxy estimates (i.e., self-reported energy intake [EI] and nutrient intake) and their referents (i.e., actual EI and nutrient intake) are unacceptable. We assert that the explanatory and predictive failure of epidemiologic nutrition research is explained by its reliance on M-BM, and as such, the uncritical faith in the validity and value of M-BM has wasted significant resources and constitutes the single greatest impediment to actual scientific progress in the fields of obesity and nutrition research.
The purpose of this review is to survey the explanatory and predictive failure of nutrition epidemiology in general,11,17 with a focus on the WWEIA-NHANES data,33 and argue that these failures are due to the reliance on M-BM. First, we present evidence that the anecdotally-derived proxy data produced by M-BM bear little relation to actual EI or nutrient consumption.13,33,35-45 Second, we provide interdisciplinary evidence that human memory is an amalgam of constructive and reconstructive processes47-52 (e.g., imagination53) that render the archival model of human memory54 and the naïve assumption that recall provides literal, accurate or precise reproductions of past events indisputably false.50,52,55-58 Third, M-BM require respondents to undergo protocols59 and perform behaviors31 that mimic procedures known to induce false recall.50,52,53,60,61 Fourth, the subjective (i.e., private, not publicly accessible) mental phenomena (i.e., memories) from which M-BM data are derived are not subject to independent observation, quantification, falsification or verification; as such, M-BM data are pseudoscientific and inadmissible in scientific research.62-66 Fifth, the failure to accurately and objectively measure and control for physical activity (PA), cardiorespiratory fitness (CRF), and other obvious confounders annuls inferences regarding diet-health relationships.
The primary methods of data collection for nutrition epidemiologic research (e.g., WWEIA-NHANES) are M-BM (e.g., 24-hour dietary recalls [24HR], and food frequency questionnaires [FFQs]31-33). For clarity, these methods do not directly or objectively measure energy or nutrient intake, nor do they directly or objectively measure food and beverage consumption. The actual data derived from M-BM are the a priori numeric values from nutrient databases that are assigned by researchers to the participants' reports of their memories of past eating and drinking behaviors. In other words, nutrition researchers designate numeric values to whatever the respondents are willing and/or able to recall about what they think (or would like the researcher to think67) they consumed during the study period. Given the indirect, pseudo-quantitative (i.e., number generating68) nature of M-BM and the fact that the respondents' reports of their memories are subject to both intentional and unintentional distorting factors (e.g., perceptual, encoding and retrieval errors,69 social desirability,42 false memories,55 and omissions48,49,70), it is hardly surprising that the majority of conclusions drawn from these number-generating protocols have failed to be supported when subjected to rigorous objective examination.11,17
“It is the natural tendency of the ignorant to believe what is not true. In order to overcome that tendency it is not sufficient to exhibit the true; it is also necessary to expose and denounce the false.”71 H. L. Mencken
M-BM research reports a wide range of EI that are not physiologically plausible (i.e., incompatible with survival), and fail to accurately quantify the foods and nutrients consumed.11,33,35,38-40,42 Recently, we used multiple methods to ascertain the validity and plausibility of the NHANES and WWEIA-NHANES EI data from 1971-201033 and found they suffered from such severe systematic biases as to render them fatally flawed. Given that “[a]cross the 39-year history of the NHANES, [self-reported energy intake] data on the majority of respondents (67.3% of women and 58.7% of men) were not physiologically plausible” 33 (see figure 1), we concluded that these data are not valid for any inferences regarding energy intake and the etiology of the obesity epidemic. A recent editorial in the British Medical Journal concurred and stated that the NHANES dietary data are “incompatible with life.”11
In our report,33 we used two objective, physiologically-based methods to determine misreporting: 1) “Goldberg cutoffs”44,45,72 (i.e., reported EI divided by basal metabolic rate; rEI/BMR), and 2) the disparity between the Institute of Medicine's (IOM) total energy expenditure (TEE) equations73 and rEI via NHANES M-BM. The two methods were in close agreement, demonstrating significant misreporting. The cutoffs we used (i.e., rEI/BMR <1.35 and >2.40) were more generous than the rEI/BMR cut-off of 1.50 suggested by Goldberg et al.45 when using a single 24HR and BMR is “predicted from the Schofield equations” with a sample size of ≥300.45 Given the reduced sensitivity of our cutoffs, we captured far fewer under-reporters. As reported, when using the proposed cut-off of 1.50, underreporting increased to >70% for the entire NHANES sample and to ~77% and ~85% for obese men and women, respectively. We also reported the large and significant disparity between rEI and the IOM TEE: -467 and -554 kcal/d (>17% and 30%) for obese men and women, respectively. In addition to underreporting, there was significant over-reporting in all subpopulations (e.g., normal, overweight, and obese men and women). One important caveat with the use of ‘cutoffs’ is that the term “plausible reporter” is not synonymous with “accurate reporter.” Participants with high levels of physical activity (PA) may significantly underreport yet still be considered “plausible reporters.”
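The two screening methods described above can be illustrated with a minimal sketch. It assumes BMR and TEE are supplied as inputs (the prediction equations themselves are not reproduced here), and the function names, category labels, and example values are hypothetical; only the cutoffs of 1.35, 2.40, and 1.50 come from the text.

```python
def classify_reporter(rei_kcal, bmr_kcal, lower=1.35, upper=2.40):
    """Classify a self-reported energy intake (rEI) by its ratio to BMR.

    lower/upper default to the cutoffs stated in the text; pass lower=1.50
    to apply the stricter cut-off attributed to Goldberg et al.
    """
    if bmr_kcal <= 0:
        raise ValueError("BMR must be positive")
    ratio = rei_kcal / bmr_kcal
    if ratio < lower:
        return "implausible under-reporter"
    if ratio > upper:
        return "implausible over-reporter"
    # "Plausible" is not the same as "accurate": a very active person
    # can under-report substantially and still land in this band.
    return "plausible reporter"


def percent_disparity(rei_kcal, tee_kcal):
    """Signed difference between rEI and estimated TEE, as a percent of TEE."""
    if tee_kcal <= 0:
        raise ValueError("TEE must be positive")
    return 100.0 * (rei_kcal - tee_kcal) / tee_kcal


if __name__ == "__main__":
    # Hypothetical respondent: BMR 1500 kcal/d, reported intake 2100 kcal/d.
    print(classify_reporter(2100, 1500))              # ratio 1.40, cutoff 1.35
    print(classify_reporter(2100, 1500, lower=1.50))  # same ratio, cutoff 1.50
    print(percent_disparity(2000, 2500))              # rEI below TEE
```

The same hypothetical respondent moves from "plausible" to "implausible under-reporter" when the lower cutoff is raised from 1.35 to 1.50, which is why the more generous cutoff captures fewer under-reporters.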
Given these results, we ask four questions. 1) What is the value of WWEIA-NHANES M-BM data if 70-80% of obese women's self-reported EI are physiologically implausible and therefore incompatible with life? (See figure 1). 2) Given the extant objective data on the nutrition-related health status of Americans,2 why does the DGAC rely on the subjective M-BM data?30 3) What is the “unrealized potential” 46 and “utility”74 of these data when both implausible over-reporting and implausible underreporting are demonstrated in all subgroups? 4) Can statistical alchemy transform these implausible data into valid estimates of dietary consumption, or will it continue to spawn searches for machinations that generate numbers with improved correlations (i.e., post-hoc data manipulation) while ignoring the lack of validity?
The conclusions drawn by our study33 and the recent British Medical Journal editorial11 are, in fact, supported by many decades of evidence demonstrating that M-BM suffer from severe, intractable systematic biases that render the data implausible and therefore invalid.11,13,37,44,75,76 Research with “…motivated…well-educated, non-smoking Caucasians” 35 (i.e., respondents less likely to misreport) demonstrated that compared to doubly labelled water, a biomarker for TEE, self-reported dietary intake was significantly misestimated.35,38 Men underreported EI 12–14% (average of two 24HR) and 31–36% with FFQs. Women underreported by 16–20% (the average of two 24HR) and by 34–38% with the FFQs. Contrary to the oft-repeated statement that additional self-reports improve precision and accuracy, the second administration of the 24HR, “showed greater underreporting.” 38 These results are in agreement with our analyses of the NHANES in which the mean estimates for the second 24HR in every NHANES wave from 2001 to 2010 exhibited significantly greater levels of underreporting than the first. We agree with the OPEN study's authors when they wrote, “[w]e measure energy so poorly…”38 and “[t]he 24HR… may be particularly problematic in the obese.”35 These words echo statements on underreporting from 60 years ago.77
Recently, some of the strongest proponents of M-BM have provided additional data that clearly demonstrate the futility of the continued use of these methods.36 In Freedman et al.'s paper the pooled, squared average correlations between ‘true’ EI and self-reported EI were similar to our results using NHANES data, ranging from 0.04 to 0.10.36 This suggests that the measurement ‘noise’ (i.e., error) is at least nine times greater than the ‘signal’ (i.e., valid information) derived from M-BM. Nevertheless, one important finding from the OPEN study that Freedman et al.78 overlook in their analyses is that despite the fact that the second administration of the 24HR, “showed greater underreporting,”38 the correlations between ‘true’ and reported EI increased. This demonstrates an increase in precision with a concomitant reduction in the accuracy of the estimate. These results clearly support our position that M-BM data “offer an inadequate basis for scientific conclusions”13 and more importantly, that statistical machinations, however sophisticated, cannot overcome the systematic recall bias that renders all inferences suspect.41,79
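The noise-to-signal comparison above is simple arithmetic on the squared correlations. The sketch below assumes the squared correlation is read as the share of variance in ‘true’ EI captured by the self-report, with the remainder treated as noise; that reading and the function name are assumptions made for illustration, not a method taken from the cited paper.

```python
def noise_to_signal(r_squared):
    """Ratio of unexplained to explained variance for a squared correlation."""
    if not 0.0 < r_squared <= 1.0:
        raise ValueError("r_squared must be in (0, 1]")
    return (1.0 - r_squared) / r_squared


if __name__ == "__main__":
    # The two ends of the range quoted in the text.
    for r2 in (0.04, 0.10):
        print(f"R^2 = {r2:.2f}: noise is about {noise_to_signal(r2):.1f}x the signal")
```

Under this reading, the ratio is about 9 at the upper end of the reported range (0.10) and about 24 at the lower end (0.04).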
The phenomenon of misreporting is not limited to U.S. epidemiologic studies or specific populations.45 The European Prospective Investigation into Cancer and Nutrition (EPIC) study is one of the largest epidemiologic studies in the world and found strong evidence of systemic underreporting across all study sites with ~10-14% of survey respondents being “extreme underreporters” 80 and “…most centres were below the expected reference value.”80 These results are consistent with research from the early 1990s that found >65% of the mean rEI values were physiologically implausible in 37 studies across 10 countries.45 The misreporting value of >65% is strikingly similar to our NHANES results using similar methods.33 In 2015, a multi-national report demonstrated that misreporting “in five populations of the African diaspora”81 was substantial with the South African cohort exhibiting an astounding 52.1% underreporting of dietary energy intake.81 With respect to age, Forrestal (2011) found in children and adolescents that misreporting “…appeared to be more common than it is among adults.”82 The ubiquitous nature of misreporting and the consistency of research results over many decades and across multiple populations, cohorts, and countries provide strong support that M-BM measures of EI are fatally flawed and therefore, diet-health inferences from studies that use M-BM are essentially meaningless.
It is well-established that specific macronutrients, foods, beverages, and food groups (e.g., protein, fat, carbohydrate, alcohol, sugar, vegetables) are subject to differential misreporting that significantly affects subsequent estimates of energy intake. 38,79,83-89 Because EI is the foundation of dietary consumption and all nutrients must be consumed within the quantity of food and beverages needed to meet minimum energy requirements,90 it is a logical and analytic truth that dietary patterns (i.e., macro and micro-nutrient consumption; e.g., protein, carbohydrate, fat, vitamins, minerals) are differentially and unpredictably misreported when total reported EI is physiologically implausible. For example, both macro and micro-nutrient composition are significantly altered in underreporters, with reported fat and carbohydrate consumption often lower, and reported protein, fruits and vegetable intakes higher.42,83,87 In other words, participants qualitatively and quantitatively misreport due to both non-intentional (e.g., forgetting, false memories) and intentional factors (e.g., health-related perceptions). This non-uniformity of misreporting leads to macro and micro-nutrient specific errors87,88 which alter nutrient-to-energy intake ratios in an unpredictable and non-quantifiable manner. This simple fact renders energy adjustments fallacious,41,79 and demonstrates the assumption that M-BM data can be used to examine patterns of diet or dietary composition is not logically valid.
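To make the point about nutrient-to-energy ratios concrete, the following sketch compares uniform and differential misreporting for a single hypothetical respondent. All intake values and reporting fractions are invented for illustration and are not data from any study; the energy factors of 4, 4, and 9 kcal/g for protein, carbohydrate, and fat are the standard Atwater general factors.

```python
# Atwater general factors (kcal per gram).
ATWATER_KCAL_PER_G = {"protein": 4.0, "carbohydrate": 4.0, "fat": 9.0}


def energy_shares(grams):
    """Return each macronutrient's share of total energy intake."""
    kcal = {name: g * ATWATER_KCAL_PER_G[name] for name, g in grams.items()}
    total = sum(kcal.values())
    return {name: k / total for name, k in kcal.items()}


def apply_reporting(grams, fractions):
    """Scale true intake by the fraction of each nutrient that gets reported."""
    return {name: g * fractions[name] for name, g in grams.items()}


# Hypothetical true intake in grams per day (illustrative only).
TRUE_INTAKE = {"protein": 80.0, "carbohydrate": 300.0, "fat": 100.0}
# Hypothetical reporting patterns: everything cut by 20% versus fat and
# carbohydrate under-reported while protein is slightly over-reported.
UNIFORM = {"protein": 0.80, "carbohydrate": 0.80, "fat": 0.80}
DIFFERENTIAL = {"protein": 1.05, "carbohydrate": 0.80, "fat": 0.70}

if __name__ == "__main__":
    for label, pattern in (("uniform", UNIFORM), ("differential", DIFFERENTIAL)):
        shares = energy_shares(apply_reporting(TRUE_INTAKE, pattern))
        print(label, {k: round(v, 3) for k, v in shares.items()})
    print("true", {k: round(v, 3) for k, v in energy_shares(TRUE_INTAKE).items()})
```

With uniform misreporting the energy shares are unchanged, so a ratio-based adjustment would recover the true composition. With differential misreporting the shares shift, and without knowing the per-nutrient reporting fractions (which M-BM cannot supply) the shift cannot be undone.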
The use of M-BM requires faith in the belief that human perception, memory, and recall are accurate and reliable instruments for the generation of scientific data. Nevertheless, more than 80 years of research demonstrates that this belief is patently false.50,58,70,91 The discrepancy between objective reality and human memory is well-established 48,92 and the limitations of recall are widely acknowledged in disciplines outside of nutrition and obesity.47-49,69,70,93 In fact, the scientific study and analysis of memory would be impossible if it were not for the inherent fallibility of memory.49 Bartlett (1932)94 presented the first empirical evidence that the human memory is not a literal, accurate, or precise reproduction of past events. Over the ensuing 80 years, research has clearly demonstrated that the encoding of memories 69,92 and subsequent recall depend on constructive and reconstructive processes (e.g., imagination)48,69,53 prone to errors, distortions, omissions, complete fabrications, false reports, and illusions.50,58,69,70,91
Given the breadth of this research, reported memories such as those presented in 24HR and FFQs can be most accurately defined as mere attributions based on mental experiences that are strongly influenced by the respondents' idiosyncratic qualities (e.g., education), prior memories and information, knowledge and beliefs, motives, goals, habitual behavior, and the social context in which the memories are encoded and/or reported.47,49,58 Perhaps the most salient example of the fallibility of memory and recall (and misplaced confidence) is that false reports (i.e., inaccurate eyewitness testimony) were a key factor in ~75% of the first 100 cases of individuals exonerated by DNA evidence after conviction for crimes that they did not commit.57 The following sections provide a survey of the evidence to support our contention that data can only be as valid as the instrument used in their collection and that human memory and recall are not valid instruments for the generation of data to be used in the scientific formulation of nutrition guidelines.
Numerous studies, dating back more than 50 years, have demonstrated that there is little or no correlation between self-reported behavior and actual behavior.95,96 Bernard et al. (1984) reviewed the validity of self-reported data in “The Problem of Informant Accuracy.”58 Surveying multiple research domains including health care, child-care, communications, nutrition, criminal justice, economics, anthropology, and psychology, Bernard et al. concluded “[t]he results of all of these studies leads to one overwhelming conclusion: on average, about half of what informants report is probably incorrect in some way.”58 Bernard et al. also provide a prescient commentary, “In sum, despite the evidence, the basic fact of informant inaccuracy seems not to have penetrated either graduate training or professional social science research. Informant inaccuracy remains both a fugitive problem and a well-kept open secret.”58 Given the substantial funding of M-BM each year,9,10 it appears that this 30-year-old commentary also applies to nutrition and obesity research.
Furthermore, when events or behaviors are commonplace (e.g., food and beverage consumption), previous experiences (e.g., previous memories and mental schema69,97 of past meals) will determine what is encoded in memory and not the actual perception of behavior. For example, Freeman et al.98 demonstrated a 52% error rate in recalling social interactions, with reports of social interactions shaped by typical past experiences. They explain their results by suggesting that when events are repeatedly experienced, each specific event will be minimally processed and the “actual memory of such elements will be poor,” and “attempts at recall result in a constructive process that taps into the general structure rather than the specific memory.”98
Importantly, Bernard et al. lamented two common problems with social scientific data: the lack of 1) an explicit formal theory of human behavior and 2) objective evidence from which to test the plausibility of self-reported data. By contrast, nutrition epidemiologists have both a formal theory (i.e., human metabolism and the basic energy requirements of human life) and voluminous objective data44,45 by which to test the validity of M-BM.33 Despite the availability of formal theory and overwhelming evidence that self-reported EI data are not accurate, “plausible,”33 or even “compatible with life,”11 self-reported EI continues to be assumed a valid measure of actual energy and nutrient consumption that can be used to inform public nutrition and dietary policy.30
A detailed review of the social research literature is beyond the scope of this paper and we direct our readers to Bernard et al.'s review.58 Nevertheless, one more notable example is warranted. Immediately upon leaving a restaurant, Kronenfeld et al. (1972) had participants report on both the attire of the wait-staff and the restaurants' choice of music.58,99 Participants demonstrated much greater agreement on what the waiters were wearing compared to the waitresses' attire. The interesting finding was that these restaurants had all-female wait-staff (i.e., there were no waiters in the restaurants). Participants also provided much greater detail on the music from restaurants that were not playing music than from restaurants that were.58,99 These results raise the question: what is the possibility that self-reported food and beverage consumption in a restaurant setting will be literal, accurate or reliable representations of actual ingestive behavior?
The domain of cognitive neuroscience supports the hypothesis that human memory is an amalgam of dynamic constructive and reconstructive processes.47-53,55-57,69,70 For example, encoding is not a process that begins de novo with each perception. Encoding is the result of the limited amount of information available to perception at any given moment being “patched together to form memories with varying degrees of accuracy” 49 (e.g., the process of associative grouping via semantic relatedness 50,93,100) and subject to “the distorting influences of present knowledge, beliefs, and…previous experience.”49 As such, the general knowledge and availability of mental schemas from previous eating occasions intrude on the encoding of current consumption to produce both false and/or fuzzy (i.e., gist) memories.51,101 Memory and recall are subject to a myriad of unintentional “sins” 70 including but not limited to distortions, misattribution, suggestibility, simple forgetting, falsehoods, and omissions.49,91,92 Because both selective and elaborative processes operate on the perceptions that are encoded and recalled, “memory does not [and cannot] operate like a video recording.”57
Recently, the process of reconsolidation (i.e., the reconstruction and re-encoding of memories after recall) has been demonstrated in rodents, and the evidence in humans is supportive.102,103 Reconsolidation involves the same neural processes as the encoding of the original memory.92 Therefore, each time a memory is recalled, it is irretrievably changed such that the original memory no longer exists and a new memory of unquantifiable error replaces it.102,103 This fact has implications for the “current state-of-the-art 24-hour dietary recall instrument,” the USDA Automated Multiple-Pass Method.31 With each ‘pass’ of the multi-pass procedure, the process of reconsolidation alters the original memory so that by the end of the data collection period, the result will be an amalgam of multiple ‘new’ memories and reports with unquantifiable error. As such, neither the researchers nor the participants know the validity or reliability of the reported food and beverage consumption.
False reports are the recollection of an event, or details of an event, that did not actually occur.69 False memories and recalls may be produced in multiple contexts (e.g., during research,55,104 psychotherapy, and criminal investigatory interviews60). While research has demonstrated that false memories of ingestive behavior and subsequent false reporting of foods occur in lab-settings,55,61,104 there is a larger literature base outside of nutrition. The Deese-Roediger-McDermott (DRM) paradigm is commonly used in research settings to elicit false reports.105,106 In this protocol, a list of semantically-related words (e.g., breakfast, bacon, sausage, orange juice, cereal) is presented or read to subjects. After a delay (minutes to days), participants are asked to report the words they remember. The mere presentation of lists of semantically-related words induces extremely high levels (i.e., >75%) of the false reporting of related, but non-presented words (i.e., critical lures;49,100,106 e.g., the word ‘egg’ in our previous example). The DRM is so effective at inducing false reports that memory distortions occur even in the small percentage of individuals with highly superior memories.50 With the DRM, respondents are often more confident in their false reports than in their reports of the presented words.93
Researchers familiar with FFQs will recognize that by design, FFQs mimic the DRM protocol in that lists of semantically-related words (i.e., foods and beverages) are presented and respondents are expected to provide a response. Given that FFQs mimic the procedures designed to produce false recall, it is not surprising that FFQs with longer lists of semantically-related words elicit more responses.107 Given the vast literature demonstrating misreporting with FFQs35,38,42,108 and the parallel literature on the extremely high level of false reports using the DRM paradigm,93,101,105,106 it is not a question of whether or not FFQs induce false reporting, but to what extent. As stated previously, neither the researchers nor the participants know the validity or reliability of the reported food and beverage consumption, nor can they quantify the error induced via false reporting. As we discuss in a later section, the inability of current nutrition epidemiologic research designs to independently falsify or confirm M-BM data renders the error due to false reports unquantifiable and the data therefore inadmissible as scientific data.
Recent research has examined the effects of creating “false memories for food preferences and choices.”55,61 We refer our readers to a review by Bernstein and Loftus.55 Their work has established that it is relatively simple to, “implant false beliefs and memories regarding a variety of early childhood food-related experiences.”55 We assert that false memories and reports are induced via the NHANES interview protocol itself, as has been demonstrated in other interviewing contexts.60 The factors that potentially induce false memories and reporting are well-established. For example, the development of rapport between an authority figure and respondents followed by the use of guided imagery, the use of silence in responding, repetition, use of props, suggestive or repeated questioning, and the encouragement to reminisce, imagine or elaborate on past behaviors have all been shown to increase false recall.55,69,92,93,101,106 All of these factors are explicitly described in the training manual for the research personnel that conduct the NHANES 24HR.59 The use of “rapport,” silence, imagery, props, repeated questioning, and the use of “expectant look[s]” are both explicit and noteworthy in the training manual.59 For example, the following directive is an exemplar of the potentially false memory inducing protocol, “If you sit quietly — but expectantly—your respondent will usually think of something. Silence and waiting are frequently your best probes for a “don't know” reply. Always try at least once to obtain a reply to a “don't know” response, before accepting it as the final answer.”59 The use of “rapport” combined with repeated questioning, silence, and “expectant looks” is especially coercive when applied by an authority figure in a research context. 
Additionally, NHANES personnel are directed to ask respondents to “imagine,” “think about,” and “begin reminiscing,” about their food intake, and to “encourage” and ensure that the respondents are “convinced of the importance of the study.”59 Throughout the manual there are examples of guided imagery and suggestive questioning such as directing participants to begin, “thinking about where you were, who you were with, or what you were doing, like working, eating out, or watching television”,59 and directives such as “Your own state of mind -- your conviction that the interview is important -- will strongly influence the respondent's cooperation. Your belief that the information you obtain will be significant and useful will help motivate the respondent to answer fully…”. While the NHANES training manual states “[t]his methodology is designed to maximize respondents' opportunities for remembering and reporting foods they have eaten,” the scientific literature on false memories and recall strongly supports our contention that the NHANES M-BM generates significant false reporting. Given that imagination and coercive techniques (e.g., the use of “silence”59) are known to increase the probability of illusory (i.e., false) recollections,53,60 it may be that the majority of 24HR data are false reports. If this hypothesis is true, the NHANES 24HR is a mere exercise in number generation and therefore, by design, it does not provide proxy estimates of energy or nutrient consumption. This premise provides an empirically supported explanation of why most M-BM data are implausible and have trivial relationships with reality (i.e., actual EI and nutrient intake). Nevertheless, without objective corroboration, it is impossible to quantify what percentage of the recalled foods and beverages are completely false, grossly inaccurate, or somewhat congruent with actual consumption. Regardless, it is clear that people consistently “remember [and report] events that never happened.”106
Although the terms science and research are used interchangeably, they are not synonymous. Science is more than mere data collection; it is an attempt to discover order, a potentially self-correcting, explanatory and predictive process that demonstrates lawful relations (e.g., diets high in vitamin C prevent scurvy). In contrast, research is simply the process of collecting information, and many forms of research fail to meet the rigor necessary for the results to be scientific. There is a long history of efforts to formally demarcate scientific from non-scientific and pseudo-scientific data, the most famous of which may be Popper's falsifiability criterion.64-66 For example, in US jurisprudence, the ‘Daubert Standard’109,110 provides the rules of evidence for the admissibility of expert testimony. The criterion of falsifiability is central to expert scientific testimony and was used by Judge William Overton in ruling in McLean v. Arkansas Board of Education. This case determined that ‘creation science’ was not a science because it was not falsifiable, and therefore could not be taught as science in Arkansas public schools.111 As we detail in later sections, we assert that M-BM data are akin to ‘creation science’ in that they fail to meet the basic requirements of scientific research.
Although philosophers continue to debate demarcation criteria, practicing scientists must set forth principles by which to judge the admissibility of data in scientific research. We extend Popper's criterion and proffer the following widely accepted principles of scientific inquiry. For results to be scientific, the study's protocols must produce outcomes that are subject to replication. To accomplish this goal, the data must be 1) independently observable (i.e., accessible by others), 2) measurable, 3) falsifiable, 4) valid, and 5) reliable. These non-metaphysical criteria were first suggested by Roger Bacon in the 13th century, and later elaborated by the ‘father of empiricism,’ Sir Francis Bacon, in the late 16th and early 17th centuries.112 They were reiterated by Sir Isaac Newton in the 17th century,113 and have been subsequently clarified and defined.62-66,68 The skepticism and empirical rigor inherent in these criteria are of such importance to science that ‘The Royal Society of London,’ the oldest scientific society in the modern world, succinctly summarized them in its motto, “Nullius in Verba.” This phrase, derived from Horace's Epistles,114 is translated as “on the word of no one” or “take no one's word for it” and suggests that scientific knowledge should be based not on authority, rhetoric, or mere words, but on objective evidence.
The first three criteria (i.e., independently observable, measurable, falsifiable) define the phenomena that are within the domain of science (i.e., able to be examined via the scientific method), and the final two (i.e., validity and reliability) refer to the concordance between a measurement and its referent as well as the error associated with the measurement protocols used to collect the data. Together, the five basic tenets clearly distinguish scientific research from mere data collection and pseudo-science. For example, if someone is eating an apple, his or her behavior can be independently observed, measured, and verified or refuted. Yet if he or she reports eating an apple at some point in the past (e.g., as with an FFQ or 24HR), neither the past behavior nor the neural correlates of the memory of that behavior are independently observable or quantifiable, and without additional information, his or her statement cannot be falsified or confirmed. It is a rather obvious fact that the respondent is the only person who has access to the raw data of M-BM (i.e., his or her memories of consumption). As such, researchers cannot examine the validity of the memory and must instead base M-BM research results on their faith in the verbal report (i.e., the belief that the participant is telling the truth). Yet faith and belief are basic tenets of religion, not science. The unwavering credulity of nutrition epidemiologists with respect to verbal reports is in direct opposition to “Nullius in Verba” (i.e., “take no one's word for it”) and skeptical, rigorous science. The confluence of these simple facts and the well-documented failure of self-reported EI to accurately correspond to reality33,35 demonstrates that the memory and subsequent recall of ingestive behavior are not within the realm of the scientific investigation of nutrition and obesity.
As the philosopher Karl Popper stated, “all the statements of empirical science must be capable of being finally decided, with respect to their truth and falsity,”65 and it is wholly impossible to verify or refute something that cannot be directly or indirectly, independently observed and measured (e.g., memories).
The term pseudo-science describes data and/or results that are presented as scientific but lack plausibility because they cannot be reliably, accurately and independently observed, quantified, and confirmed or refuted.62-66 When M-BM are examined from the perspective of the basic tenets of science, the reason for the explanatory and predictive failure of epidemiologic nutrition research becomes obvious. First and foremost, scientific conclusions cannot result from non-empirical (i.e., unobserved) or subjective (i.e., private, not publicly accessible) data that are not subject to independent observation, quantification, and falsification. When a person provides a dietary report, the data collected are not actual food or beverage consumption but rather an error-prone and highly edited anecdote regarding memories of food and beverage consumption. As such, M-BM fail to meet basic requirements of the scientific method, and by definition are pseudoscientific when presented as actual estimates of energy or nutrient consumption. Two famous physicists of the 20th century, Wolfgang Pauli and Arthur Schuster, summed up the problem with pseudoscientific data eloquently when they stated, respectively, that a pseudo-scientific conclusion “is not only not right, it is not even wrong…”115 and that “[w]e all prefer being right to being wrong, but it is better to be wrong than to be neither right nor wrong.”116
It is difficult to determine the empirical consequences of M-BM because the primary data (i.e., memories: private information to which the respondents have privileged access) do not meet the basic tenets of scientific methodology (e.g., independent observation of data, falsifiability, accuracy). If neither the researchers nor the participants are able to quantify what percentage of the recalled foods and beverages are completely false reports, grossly inaccurate, or reports that are somewhat congruent with actual consumption, it is impossible to know the validity and the error associated with each report. As Dhurandhar et al. recently suggested, the use of M-BM-based data is a context in which “…something is not better than nothing.”75 Given the foregoing, M-BM-derived data are inadmissible and constitute a significant, ongoing threat to both nutrition and obesity research and national dietary guidelines because the greatest obstacle to scientific progress is not ignorance; it is the illusion of knowledge created by pseudo-scientific data that are neither right nor wrong.
Nevertheless, performing rigorous science is a skill that can be learned, but only if mentors understand and practice rigorous science. Given the ubiquitous use of M-BM over many decades, it appears that nutritional epidemiologists have eschewed the inherent rigor and skepticism of “Nullius in Verba” (i.e., take no one's word for it) and replaced it with “Totius in Verba” (i.e., ‘take everyone's word for it’). As a result, skeptical, rigorous science is neither practiced nor taught in nutrition and obesity epidemiologic research.24
If the two major components of US national nutritional surveillance are valid (i.e., NHANES M-BM data and the USDA Food Availability economic data), estimates from these surveillance tools should track together and independently provide population-level approximations of trends in food consumption and/or use. However, history demonstrates this not to be the case. Estimates of macronutrient consumption from population-level epidemiologic surveys (i.e., M-BM) exhibited statistically significant trends that were opposite to those of USDA economic data for fat, carbohydrates, protein, and energy (i.e., kilocalories per day) from the 1960s to the late 1980s.117 It should be apparent that US residents could not be simultaneously consuming more and less fat, protein, carbohydrates, and energy over time. The contradictory patterns and striking lack of correspondence between the two primary US nutrition surveillance tools suggest that one or, more likely, both protocols are invalid. Not surprisingly, as with the severe misreporting demonstrated across the globe,45,81 these contradictory patterns are not limited to the US; many countries exhibit considerable disparity between national surveillance via M-BM and economic/food supply data.118-121 This fact is further evidence that M-BM are fatally flawed and diet-health inferences from M-BM-derived data are meaningless.
The lack of explanatory and predictive power of epidemiologic nutrition research may also be explained by the limited acknowledgement of non-nutritional determinants of health and disease such as non-genetic evolution,6-8 PA,122,123 CRF,124 and other components of nutrient partitioning and energy balance.125-131 For example, over 50 years ago the Food and Agriculture Organization of the United Nations and World Health Organization determined that human food energy requirements should be estimated using TEE, and that PA and basal energy expenditure were the primary determinants.132,133 Yet, most nutrition research fails to measure any form of energy expenditure or objectively quantify PA. Currently, there is only one manuscript of which we are aware that uses the NHANES objectively measured PA data to directly assess nutrition-related outcomes134 and no nutrition-related publications that include the NHANES treadmill CRF data in analyses. The lack of publications may be due to the fact that only two waves in the 40+ year history of the NHANES include objective measures of PA. Moreover, despite the widespread acknowledgment of the necessity of daily PA for health and well-being, PA is routinely discounted by governmental public health funding agencies. For example, PA, CRF, and exercise are not even listed on the National Institutes of Health's (NIH) spreadsheet of categorical spending of nearly 250 classifications through 2016.9 This is unfortunate given that 80% of Americans are not at risk for most nutritional deficiencies,2 but 95% of Americans are at risk of PA deficiency (i.e., inactivity or high sedentary behavior) and do not meet the federal recommendations of 30 minutes per day of moderate to vigorous PA.135
Given that PA and CRF are major determinants of health,123,124,134,136-138 and that PA is the only major modifiable determinant of TEE and nutrient-energy partitioning (i.e., the metabolic fate of the foods we consume),6,125-131,134 it is clear that both PA and CRF must be objectively measured and controlled for in analyses if the health effects of any dietary intervention are to be examined accurately. Yet because PA questionnaires suffer from many of the same systematic biases75,139,140 and inadmissibility issues as M-BM, the failure to objectively measure PA and control for it in analyses renders health inferences from previous nutrition epidemiologic studies moot. Fortunately for the science of health and disease, there are objective tools for the measurement of PA (e.g., pedometers, accelerometry-based PA monitors),141 and despite limitations,142 these should be used in place of surveys and questionnaires to quantify PA in future examinations of health and disease.
“A wise man proportions his belief to the evidence.”143 David Hume
This critical review provided empirical and analytic evidence to support our position that 1) M-BM estimates of EI and nutrient intakes have trivial relationships with actual EI and nutrient intakes; 2) the assumption that human memory and recall provide literal, accurate or precise reproductions of past ingestive behavior is indisputably false; 3) M-BM require participants to submit to protocols that mimic procedures known to induce false recall; 4) the subjective (i.e., private, not publicly accessible) mental phenomena (i.e., memories) from which M-BM data are derived are not subject to independent observation, quantification, or falsification; therefore, these data are pseudoscientific and inadmissible in scientific research; and 5) the failure to objectively measure and control for PA and CRF in analyses renders inferences regarding most diet-health relationships moot.
Given the overwhelming evidence in support of our hypotheses, we conclude that M-BM data cannot be used to inform national dietary guidelines and that continued funding of M-BM constitutes an unscientific and significant misuse of research resources. Additionally, given that there are objective data on the nutrition-related health status of Americans,2 we find the DGAC's reliance on M-BM to be without scientific support or merit. We think that skepticism and rigor are essential requirements in scientific investigations, and fault the overly credulous nature of nutrition epidemiology for the obvious and well-demonstrated failures of the scientific community to properly inform previous federal dietary guidelines (e.g., cholesterol consumption).30,144 We think our nation's dietary guidelines should not be based on the pseudoscientific and highly edited anecdotes of M-BM, and while others may disagree, we ask that they do as we have done and provide empirical evidence rather than rhetoric to support their positions. Without valid evidence, the dogmatic defense of illusory knowledge and the status quo in nutrition and obesity research (e.g., see 30,46,74) are impediments to both scientific progress and empirically supported public nutrition and obesity policy.
We began our critical review with evidence that our nation's food supply and the nutritional status of Americans have improved to a level unparalleled in human history.2,3,5 Given this reality and recent work on the intergenerational transmission of obesity and T2DM,6-8 we posit that the American diet is no longer a significant risk factor for disease for the majority of individuals. This hypothesis is supported by multiple lines of evidence such as a 40% decline in the age-adjusted mortality rate from 1969 to 2010,145 a progressive decades-long reduction in age-adjusted cardiovascular disease incidence and mortality,146,147 and a 1.5% per annum reduction in age-adjusted mortality rates from all major cancers as well as significant reductions in lung cancer incidence in both men and women from 2001 to 2010.148 Given the foregoing and the evidence presented herein demonstrating the pseudoscientific nature of M-BM, we assert that research efforts and funding of M-BM and diet-health research are misdirected and argue that those resources would be better targeted at the most prevalent ‘disease of deficiency’ of the 21st century: inactivity (i.e., a lack of physical activity and exercise, and high levels of sedentary behavior).122,135
In this critical review we argued that the essence of science is the ability to discern fact from fiction and presented evidence from multiple fields to support our position that the data generated by nutrition epidemiologic surveys and questionnaires are not falsifiable. As such, these data are pseudoscientific and inadmissible in scientific research. Therefore, these protocols and the resultant data should not be used to inform national dietary guidelines or public health policy, and the continued funding of these methods constitutes an unscientific and significant misuse of research resources.
Conflicts of Interest and Source of Funding: Dr. Archer is supported by the National Institute of Diabetes and Digestive and Kidney Diseases of the National Institutes of Health under award number T32DK062710. The content is solely the responsibility of the authors and does not necessarily represent the official views of the National Institutes of Health. He has received honoraria from the International Life Sciences Institute (ILSI) and The Coca-Cola Company. Dr. Lavie reports receiving consulting fees and speaking fees from The Coca-Cola Company and writing a book on the obesity paradox with potential royalties. Dr. Pavela has no conflicts to declare and is supported by the National Institute of Diabetes and Digestive and Kidney Diseases of the National Institutes of Health under award number T32DK062710.