As recently discussed in this journal [1], translational medicine is a rapidly evolving field. In its most recent conceptualization, it consists of two primary domains: translational research proper and translational effectiveness. This distinction arises from a cogent articulation of the fundamental construct of translational medicine in particular, and of translational health care in general.
The Institute of Medicine’s Clinical Research Roundtable conceptualized the field as being composed of two fundamental “blocks”: one translational “block” (T1) was defined as “…the transfer of new understandings of disease mechanisms gained in the laboratory into the development of new methods for diagnosis, therapy, and prevention and their first testing in humans…”, and the second translational “block” (T2) was described as “…the translation of results from clinical studies into everyday clinical practice and health decision making…” [2]. These are clearly two distinct facets of one meta-construct, as outlined in Figure 1. As signaled by others, “…Referring to T1 and T2 by the same name—translational research—has become a source of some confusion. The 2 spheres are alike in name only. Their goals, settings, study designs, and investigators differ…” [3
Figure 1 Schematic representation of the meta-construct of translational health care in general, and translational medicine in particular, which consists of two fundamental constructs: the T1 “block” (as per Institute of Medicine's Clinical Research …
For at least the last five years, the Federal responsibilities for “blocks” T1 and T2 have been clearly delineated. The National Institutes of Health (NIH) predominantly concerns itself with translational research proper, the bench-to-bedside enterprise (T1); the Agency for Healthcare Research and Quality (AHRQ) focuses on the result-translation enterprise (T2). Specifically: “…the ultimate goal [of AHRQ] is research translation—that is, making sure that findings from AHRQ research are widely disseminated and ready to be used in everyday health care decision-making…” [6]. The terminology of translational effectiveness has emerged as a means of distinguishing the T2 block from T1.
Therefore, the bench-to-bedside enterprise pertains to translational research, and the result-translation enterprise describes translational effectiveness. The meta-construct of translational health care (viz., translational medicine) thus consists of these two fundamental constructs: translational research and translational effectiveness, which have distinct purposes, protocols and products, while both converging on the same goal of new and improved means of individualized patient-centered diagnostic and prognostic care.
It is important to note that the U.S. Patient Protection and Affordable Care Act (PPACA, 23 March 2010) has created an environment that facilitates the pursuit of translational health care because it emphasizes patient-centered outcomes research (PCOR). That is to say, it fosters the transaction between translational research (i.e., “block” T1) and translational effectiveness (i.e., “block” T2), and favors the establishment of communities of practice-research interaction. The latter, now recognized as practice-based research networks, incorporate three or more clinical practices in the community into a community of practices network coordinated by an academic center of research.
Practice-based research networks may be a third “block” (T3) in translational health care, and they could be conceptualized as a stepping-stone, a go-between linking bench-to-bedside translational research and result-translation translational effectiveness [7]. Alternatively, practice-based research networks represent the practical entities where the transaction between translational research and translational effectiveness can optimally be undertaken. It is within the context of the practice-based research network that the bench-to-bedside process can proceed most seamlessly, and it is within the framework of the practice-based research network that the best evidence of results can be most efficiently translated into practice and utilized in evidence-based clinical decision-making, viz. translational effectiveness.
As noted, translational effectiveness represents the translation of the best available evidence into clinical practice to ensure its utilization in clinical decisions. Translational effectiveness fosters evidence-based revisions of clinical practice guidelines. It also encourages effectiveness-focused, patient-centered and evidence-based clinical decision-making. Translational effectiveness rests not only on the expertise of the clinical staff and the empowerment of patients, caregivers and stakeholders, but also, and most importantly, on the best available evidence [8
The pursuit of the best available evidence is the foundation of translational effectiveness and more generally of translational medicine in evidence-based health care. The best available evidence is obtained through a systematic process driven by a research question/hypothesis that is articulated around clearly stated criteria that pertain to the patient (P), the interventions (I) under consideration (C), for the sought clinical outcome (O), within a given timeline (T) and clinical setting (S). PICOTS is tested on the appropriate bibliometric sample, with measurement tools designed to establish the level (e.g., CONSORT) and the quality of the evidence. Statistical and meta-analytical inferences, often enhanced by analyses of clinical relevance [9], converge into the formulation of the consensus of the best available evidence. Its dissemination to all stakeholders is key to increasing their health literacy in order to ensure their full participation in the utilization of the best available evidence in clinical decisions, viz., translational effectiveness.
To be clear, translational effectiveness – and, in the perspective discussed above, translational health care – is anchored on obtaining the best available evidence, which emerges from highest quality research. High quality of research is obtained when errors are minimized.
In an early conceptualization [10], errors in research were presented as those situations that threaten the internal and the external validity of a research study – that is, conditions that impede either the study’s reproducibility, or its generalization. In point of fact, threats to internal and external validity [10] represent specific aspects of systematic errors (i.e., bias) in the research design, methodology and data analysis. Thence emerged a branch of science that seeks to understand, control and reduce risk of bias in research.
Risk of bias and the best available evidence
It follows that the best available evidence comes from research with the fewest threats to internal and to external validity – that is to say, the fewest systematic errors: the lowest risk of bias. Quality of research, as defined in the field of research synthesis [11], has become synonymous with low bias and contained risk of bias [12
Several years ago, the Cochrane group embarked on a new strategy for assessing the quality of research studies by examining potential sources of bias. Certain original areas of potential bias in research were identified, which pertain to (a) the sampling and the sample allocation process, to measurement, and to other related sources of errors (reliability of testing), (b) design issues, including blinding, selection and drop-out, and design-specific caveats, and (c) analysis-related biases.
A Risk of Bias tool was created (Cochrane Risk of Bias), which covered six specific domains:
1. selection bias,
2. performance bias,
3. detection bias,
4. attrition bias,
5. reporting bias, and
6. other research protocol-related biases.
Assessments were made within each domain by one or more items specific to certain aspects of the domain. Each item was scored in two distinct steps:
1. the support for judgment was intended to provide a succinct free-text description of the domain being queried;
2. each item was scored high, low, or unclear risk of material bias (defined here as “…bias of sufficient magnitude to have a notable effect on the results or conclusions…” [16
It was advocated that assessments across items in the tool should be critically summarized for each outcome within each report. These critical summaries were to inform the investigator so that the primary meta-analysis could be performed either only on studies at low risk of bias, or on studies stratified according to risk of bias [16]. This is a form of acceptable sampling analysis designed to yield increased homogeneity of meta-analytical outcomes [17]. Alternatively, the homogeneity of the meta-analysis can be further enhanced by means of the more direct quality-effects meta-analysis inferential model [18
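The two strategies described above, restricting the primary meta-analysis to low-risk studies or stratifying by risk of bias, can be illustrated with a minimal sketch. It assumes a fixed-effect, inverse-variance pooling model, which is our choice for illustration and is not specified in the text; the study records, the numbers, and the function names `pooled_effect` and `stratify` are all hypothetical.

```python
from collections import defaultdict

# Hypothetical study records: (label, effect estimate, variance, overall risk of bias).
# All values are invented for illustration only.
STUDIES = [
    ("A", 0.30, 0.04, "low"),
    ("B", 0.50, 0.04, "low"),
    ("C", 0.90, 0.09, "high"),
    ("D", 0.10, 0.01, "unclear"),
]


def pooled_effect(studies):
    """Return the inverse-variance (fixed-effect) pooled estimate and its variance."""
    weights = [1.0 / variance for _, _, variance, _ in studies]
    total_weight = sum(weights)
    estimate = sum(w * s[1] for w, s in zip(weights, studies)) / total_weight
    return estimate, 1.0 / total_weight


def stratify(studies):
    """Group studies by their overall risk-of-bias judgment."""
    strata = defaultdict(list)
    for study in studies:
        strata[study[3]].append(study)
    return dict(strata)


# Strategy 1: primary analysis restricted to studies at low risk of bias.
low_risk_only = [s for s in STUDIES if s[3] == "low"]
print("low risk only:", pooled_effect(low_risk_only))

# Strategy 2: one pooled estimate per risk-of-bias stratum.
for level, group in stratify(STUDIES).items():
    print(level, pooled_effect(group))
```

The quality-effects model mentioned above weights studies by a quality score rather than excluding or grouping them; it is not reproduced in this sketch.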
Clearly, one of the major drawbacks of the Cochrane Risk of Bias tool is the subjective nature of its assessment protocol. In an effort to correct for this inherent weakness of the instrument, the Cochrane group produced detailed criteria for making judgments about the risk of bias from each individual item [16]. Moreover, Cochrane recommended that judgments be made independently by at least two people, with any discrepancies resolved by discussion [16]. This approach to increasing the reliability of measurement in research synthesis protocols is akin to that described by us [19] and by AHRQ [21
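As a minimal sketch of the two-rater procedure, the following example lists the domains on which two independent raters disagree, so that those items can be taken to discussion. The judgments, the domain keys, and the function names are hypothetical; the raw percent agreement is added here only as a simple descriptive summary and is not part of the Cochrane recommendation.

```python
# Hypothetical judgments by two independent raters for a single study,
# keyed by the six Cochrane Risk of Bias domains.
RATER_1 = {
    "selection": "low",
    "performance": "unclear",
    "detection": "low",
    "attrition": "high",
    "reporting": "low",
    "other": "low",
}
RATER_2 = {
    "selection": "low",
    "performance": "high",
    "detection": "low",
    "attrition": "high",
    "reporting": "low",
    "other": "low",
}


def discrepancies(first, second):
    """Return the domains on which the two raters gave different judgments."""
    return sorted(domain for domain in first if first[domain] != second[domain])


def percent_agreement(first, second):
    """Return the proportion of domains with identical judgments."""
    matches = sum(1 for domain in first if first[domain] == second[domain])
    return matches / len(first)


print("to be resolved by discussion:", discrepancies(RATER_1, RATER_2))
print("raw agreement:", percent_agreement(RATER_1, RATER_2))
```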
In an effort to aid clinicians and patients in making effective health care related decisions, AHRQ developed an alternative Risk of Bias instrument for enabling systematic evaluation of evidence reporting [22]. The AHRQ Risk of Bias instrument was created to monitor four primary domains:
1. risk of bias: design, methodology, analysis (scoring: low, medium, high)
2. consistency: extent of similarity in effect sizes across studies within a bibliome (scoring: consistent, inconsistent, unknown)
3. directness: unidirectional link between the interventions of interest and the sought outcome, as opposed to multiple links in a causal chain (scoring: direct, indirect)
4. precision: extent of certainty for estimate of effect with respect to the outcome (scoring: precise, imprecise)
In addition, four secondary domains were identified:
a. Dose response association: pattern of a larger effect with greater exposure (Present/Not Present/Not Applicable or Not Tested)
b. Confounders: consideration of confounding variables (Present/Absent)
c. Strength of association: likelihood that the observed effect is large enough that it cannot have occurred solely as a result of bias from potential confounding factors (Strong/Weak)
d. Publication bias
The AHRQ Risk of Bias instrument is also designed to yield an overall grade of the estimated risk of bias in quality reporting:
• Strength of Evidence Grades (scored as high, moderate, low, or insufficient)
This global assessment, in addition to incorporating the assessments above, also rates:
– benefits and harms jointly
– outcomes most relevant to patients, clinicians, and stakeholders
The AHRQ Risk of Bias instrument suffers from the same two major limitations as the Cochrane tool:
1. lack of formal psychometric validation, as with most other tools in the field [21
2. a subjective and non-quantifiable assessment.
To begin a systematic dialectic of the two instruments in terms of their respective construct and content validity, it is necessary to validate each for reliability and validity by means of either classic psychometric theory or generalizability (G) theory, which allows the simultaneous estimation of multiple sources of measurement error variance (i.e., facets) while generalizing the main findings across the different study facets. G theory is particularly useful in clinical care analysis of this type, because it permits the assessment of the reliability of clinical assessment protocols. The reliability and minimal detectable changes across varied combinations of these facets are then simply calculated [23]. However, it is recommended that G theory determination follow classic theory psychometric assessment.
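To make the idea of partitioning measurement error concrete, the following is a minimal sketch of a G study for the simplest case: a single facet (raters) fully crossed with the objects of measurement (studies), with one score per cell. The design, the example scores, and the function names `g_study` and `g_coefficient` are assumptions made for illustration; a realistic analysis of these instruments would likely involve more facets (e.g., items, occasions).

```python
def g_study(scores):
    """Estimate variance components for a crossed studies x raters design.

    `scores` is a list of rows, one row per study and one column per rater.
    It needs at least two studies and two raters.
    """
    n_p = len(scores)      # number of studies (objects of measurement)
    n_r = len(scores[0])   # number of raters (the single facet)
    grand = sum(sum(row) for row in scores) / (n_p * n_r)
    row_means = [sum(row) / n_r for row in scores]
    col_means = [sum(row[j] for row in scores) / n_p for j in range(n_r)]

    # Sums of squares for a two-way layout without replication.
    ss_p = n_r * sum((m - grand) ** 2 for m in row_means)
    ss_r = n_p * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in scores for x in row)
    ss_res = ss_total - ss_p - ss_r

    ms_p = ss_p / (n_p - 1)
    ms_r = ss_r / (n_r - 1)
    ms_res = ss_res / ((n_p - 1) * (n_r - 1))

    # Expected-mean-square solutions; negative estimates are set to zero.
    return {
        "study": max((ms_p - ms_res) / n_r, 0.0),
        "rater": max((ms_r - ms_res) / n_p, 0.0),
        "residual": max(ms_res, 0.0),
    }


def g_coefficient(components, n_raters):
    """Relative G coefficient for a mean score over `n_raters` raters."""
    relative_error = components["residual"] / n_raters
    return components["study"] / (components["study"] + relative_error)


# Hypothetical scores on a 1-4 scale: five studies, each scored by two raters.
SCORES = [[4, 4], [3, 4], [2, 2], [1, 2], [3, 3]]
components = g_study(SCORES)
print(components)
print(g_coefficient(components, n_raters=2))
```

Varying `n_raters` in `g_coefficient` shows how the expected reliability would change if more or fewer raters were used, which is the kind of question G theory is designed to answer.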
Therefore, we have commenced a process of revising the AHRQ Risk of Bias instrument by rendering questions in primary domains quantifiable (scaled 1–4), which established the intra-rater reliability (r … 0.05), and the criterion validity (r … 0.05) for this instrument (Figure 2).
Figure 2 Proportion of shared variance in criterion validity (A) and inter-rater reliability (B) in the AHRQ Risk of Bias instrument revised as described. Two raters were trained and standardized with the revised AHRQ Risk of Bias and with the R-Wong instrument, …
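The proportion of shared variance between two sets of scores is conventionally the square of their Pearson correlation coefficient. The sketch below shows that calculation for quantified (1–4) scores; the source does not state exactly how the values in Figure 2 were computed, so this is an assumption, and the scores and function names are hypothetical.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length score lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    s_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    s_xx = sum((a - mean_x) ** 2 for a in x)
    s_yy = sum((b - mean_y) ** 2 for b in y)
    return s_xy / sqrt(s_xx * s_yy)


def shared_variance(x, y):
    """Proportion of variance shared by the two score lists (r squared)."""
    return pearson_r(x, y) ** 2


# Hypothetical 1-4 scores given to the same six reports by two raters.
RATER_A = [4, 3, 2, 1, 3, 4]
RATER_B = [4, 3, 2, 2, 3, 3]
print(pearson_r(RATER_A, RATER_B), shared_variance(RATER_A, RATER_B))
```

For criterion validity, the same calculation would compare scores from the revised instrument against scores from the criterion instrument instead of comparing two raters.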
A similar revision of the Cochrane Risk of Bias tool may also yield promising validation data. G theory validation of both tools will follow. Together, these results will enable a critical and systematic dialectical comparison of the Cochrane and the AHRQ Risk of Bias measures.