The cornerstone of any field of scientific inquiry is the pursuit of a body of cumulative knowledge, yet the psychological sciences have often fallen short of this goal (e.g., Gans, 1992; Hunter & Schmidt, 1996; Meehl, 1978; Schmidt, 1996). This is not for want of trying. Both quantitative and methodological techniques have been developed to help build a cumulative knowledge base. Most noteworthy of these techniques is meta-analysis, which allows for the synthesis of summary statistics drawn from multiple studies when the original data are not available (e.g., Cooper, in press; Glass, 1976; Rothstein, Sutton, & Borenstein, 2005; Smith & Glass, 1977). One of the original motivations for meta-analysis was that these techniques would further support the creation of cumulative knowledge within the social sciences, particularly in psychology (e.g., Hunter & Schmidt, 1996; Schmidt, 1984). There is no doubt that meta-analysis has substantially advanced our science toward this goal.
Because the focus of meta-analysis is on the synthesis of summary statistics drawn from multiple studies, this approach is ideal when the original individual data used in prior analyses are inaccessible or no longer exist. However, as we discuss in greater detail below, there are many advantages to fitting models directly to the original raw data instead of synthesizing the relevant summary statistics when the original individual data are available for analysis (e.g., Berlin, Santanna, Schmid, Szczech, & Feldman, 2002; Lambert, Sutton, Abrams, & Jones, 2002). Recent developments within the scientific community, such as greater expectations for data sharing and better options for electronic data storage and retrieval, have increased the potential for accessing original individual data for secondary analysis (i.e., the analysis of existing data). This in turn creates new opportunities for the development of alternative methods for integrating findings across studies by using original individual data to help overcome some of the unavoidable limitations of meta-analysis. (See Cooper and Patall, this issue, for a thoughtful comparison of the advantages and disadvantages of meta-analysis relative to the pooled analysis of raw data.)
Techniques for fitting models to pooled data go by a variety of names, none of which have been broadly adopted within the social sciences. Simply to offer a starting point, we will refer to this set of methodologies as integrative data analysis (IDA). We chose the term integrative over options such as pooled, simultaneous, unified, or concomitant to highlight our goal of creating "a whole by bringing all parts together," which is a common definition of integrate (e.g., American Heritage Dictionary of the English Language, 2009). Interestingly, IDA has been used in other areas of scientific inquiry for more than a decade. For example, IDA has been used in medicine to examine the efficacy of medications versus cognitive behavior therapy for severe depression (DeRubeis, Gelfand, Tang, & Simons, 1999); to evaluate clinical trial outcomes for treatment of Alzheimer's Disease (Higgins, Whitehead, Turner, Omar, & Thompson, 2001); to examine the relation between fat intake and the risk of breast cancer (D. Hunter et al., 1996); to study the pharmacogenetics of tardive dyskinesia (Lerer et al., 2002); and to examine the relation between height, weight, and breast cancer risk (van den Brandt et al., 2000).
Despite the broader use of IDA techniques in other disciplines, such applications are relatively novel within the behavioral sciences in general and within psychology in particular (but see Lorenz, Simons, Conger, Elder, Johnson, & Chao, 1997; McArdle, Hamagami, Meredith, & Bradway, 2000; and McArdle, Prescott, Hamagami, & Horn, 1998, for notable exceptions). One reason behind the slow adoption of these techniques may be the significant challenges that psychologists face in pooling across studies that are highly heterogeneous in their methodology, even when these studies examine the same topic. Differences between studies in sampling techniques and frame, historical timing, design characteristics, and measurement create seeming barriers to study comparison and integration. However, by incorporating information about such between-study heterogeneity into our techniques for study integration, our conclusions may be more generalizable and our progress as a science more cumulative. Thus, IDA strives to capitalize upon such between-study heterogeneity not only to better understand findings across existing studies (i.e., study integration) but also to probe meaningful sources of between-study variability that may contribute to, and thus inform theories about, key psychological phenomena (i.e., study comparison).
The topics that underlie IDA are both broad and complex, and a comprehensive treatment is beyond the scope of any single manuscript. As such, our intent here is rather modest. Specifically, we offer a general discussion of the core issues that typically arise in applications of IDA for study integration in the psychological sciences. These topics and our guiding perspective on IDA are largely culled from our experience in using these techniques on a project that we call Cross Study. Cross Study involves the integrated analysis of three independent longitudinal studies of children of alcoholic parents and matched controls. These data sets are unique in their excellent retention, breadth of measurement, and sampling of non-treatment samples. Nonetheless, the three studies differ in many respects, such as geographical location, developmental coverage, measurement, and assessment modality. Because applications of IDA are necessarily idiosyncratic to the theoretical questions and sample characteristics at hand, we wholly acknowledge that our experiences on Cross Study have shaped our views of IDA, and this in turn is reflected throughout our work here. However, it is this same sensitivity of IDA to the specific theoretical and methodological context that makes this both a broad topic eluding simple description and a flexible, informative set of techniques that is critically needed in our field.
In the current paper, we build on our work with Cross Study and aim to further establish IDA as a potential tool for pursuing and fostering a cumulative knowledge base in our field. We begin with a discussion of exactly what IDA is and what advantages IDA offers when appropriate data are available for analysis. Next, we detail potential influences on between-sample heterogeneity that may serve either as nuisance factors when study integration is the goal or, in many instances, as sources of variance that offer novel insights about why a phenomenon may show study-to-study differences. We then explore general analytic strategies that address between-study heterogeneity. We conclude with future directions for research and recommendations for the use of these techniques in practice.