The utilization of administrative data in substance abuse research has become more widespread than ever. This selective review synthesizes recent extant research from 31 articles to consider what has been learned from using administrative data to conduct longitudinal substance abuse research in four overlapping areas: (1) service access and utilization, (2) underrepresented populations, (3) treatment outcomes, and (4) cost analysis. Despite several notable limitations, administrative data contribute valuable information, particularly in the investigation of service system interactions and outcomes among substance abusers as they unfold and influence each other over the long term. This critical assessment of the advantages and disadvantages of using existing administrative data within a longitudinal framework should stimulate innovative thinking regarding future applications of administrative data for longitudinal substance abuse research purposes.
Ample evidence indicates that decreases in substance abuse are associated with improvements in functioning across a broad range of areas, and substance abuse itself is widely acknowledged as a disorder that is often chronic and requires long-term care.1–5 To comprehensively assess the impact of chronic drug abuse and to track the course of recovery, researchers increasingly affirm the need to focus on the long-term interplay of multiple events associated with substance abuse (e.g., health services utilization, psychosocial mediation factors, criminal activity).
While there are notable exceptions,6–8 much of the research on substance abuse treatment and service utilization is limited in that it mainly relies on self-reported information collected from treated patients over a short period of time, usually at or soon after treatment discharge. Furthermore, information is typically gathered via follow-up interviews with participants, an expensive and time-consuming endeavor made even more challenging by inadequate resources for re-locating sufficient numbers of research participants, cognitive and technical factors that influence the accuracy of self-reported data,9 and increasingly restrictive regulatory criteria limiting re-contact methods.10 There are benefits to collecting self-reported information, particularly when studying a wide variety of stigmatized behaviors (e.g., illicit drug use, criminal activity, HIV risk behavior) that can often only be assessed by talking directly with research participants. The validity and limitations of self-reported information have received much attention in the literature.11–16 However, there are few comparable critical assessments of the pros and cons of using administrative data as an alternative or complementary data source for longitudinal research on substance abuse.
Administrative data, defined as existing data routinely collected primarily for non-research purposes, can be thought of as an “official” record of events as they occur. Examples include records of service utilization for medical or psychiatric problems, receipt of public welfare benefits, insurance claims, and criminal justice system records on arrests, convictions, and incarcerations. Administrative data can be difficult to obtain and link across systems,17 and linkage methods (e.g., deterministic vs. probabilistic) must be carefully considered.18–19 Despite the ethical, legal, and practical limitations to its use,20 one principal advantage of administrative data is that it can be employed to examine events, service system interactions, and outcomes as they unfold and influence one another over the long term. Another benefit is that administrative data can provide information on individuals who may be characterized by unique needs and experiences but who, as a group, usually present too small a sample for disaggregation in statistical analyses.21 Moreover, mining existing data presents a cost-effective opportunity to take full advantage of readily available resources. Finally, advancements in technology and statistics have diminished technical difficulties associated with sharing and manipulating large, and often messy, administrative datasets. For these and other reasons, the use of administrative data for research purposes has become more widespread than ever.
The health and social services field, and the criminal justice system, have long maintained databases for tracking phenomena by unique individual over time, and stakeholders in these disciplines have frequently identified administrative data as a rich resource for policy-relevant research opportunities.22–31 For example, administrative data has been used to examine the impact of managed care on health service access and utilization,32–33 to identify discrepancies between the prevalence of medical illness and service utilization,34 to document and describe the prevalence of criminal restraining orders,35 and to explore a wide range of health care topics, particularly those pertinent to veterans.36–39 Numerous studies have relied upon administrative data to examine issues related to mortality and substance abuse.40–48 Existing data has also been used to conduct cost-effectiveness and cost-offset studies, 49 and it is in this area that administrative data was first extensively used in the substance abuse research field.50–52 Since these early economic evaluations, the substance abuse research field has continued to capitalize on administrative data to generate empirical evidence to inform practice and policy (e.g., Hser, Teruya, Brown, Huang, Evans, & Anglin, 2007),2 particularly as part of efforts to implement and evaluate treatment outcome monitoring systems.
Recognizing the need to monitor treatment outcomes, 53 in 1998 the Center for Substance Abuse Treatment (CSAT) at the Substance Abuse and Mental Health Services Administration (SAMHSA) initiated the Treatment Outcomes and Performance Pilot Studies Enhancement (TOPPS II), a project that pilot-tested substance abuse treatment outcome monitoring systems in 19 states. Implementing data-driven systems in substance abuse treatment settings was difficult 54–58 and relatively expensive.59 Concurrent with TOPPS II, existing administrative data was identified as an alternative to primary data collection for examining service utilization and measuring treatment performance and outcomes, 21 an idea that subsequently garnered support among other substance abuse researchers.60–62 Most states that participated in TOPPS II expanded or refined existing state-level drug treatment data collection efforts and implemented collection of self-reported follow-up data; however, a few states (e.g., California, Maryland, Oklahoma, Washington) used administrative data in their research designs.63–64
More recently, the types of administrative data that are available for substance abuse treatment performance measurement have been well-documented, 65–67 and some states (e.g., Oklahoma, Washington) 68 have made integration of administrative data into evaluation and research efforts ever more routine. Researchers continue to draw on existing data to conduct innovative studies that contribute to and expand what is known about the prevalence and nature of substance abuse and its association with other life events. Curiously, however, since Alterman et al.’s 2001 article, 60 there has been no critical assessment of lessons learned from the use of administrative data for substance abuse research purposes, nor of the advantages and disadvantages of using administrative data within a longitudinal framework.
The purpose of this paper is to synthesize recent extant research that used administrative data in longitudinal substance abuse studies so as to (1) consider what has been learned from such use, (2) review the limitations of using administrative data in longitudinal substance abuse studies, and (3) discuss future directions. Findings should stimulate critical thinking regarding the scope of opportunities generated by using administrative data for longitudinal substance abuse research.
PubMed and PsycINFO were searched for substance abuse research articles published after 1999. Search terms included “substance abuse,” “administrative data,” “record or data linkage,” and “performance monitoring.” Studies were considered to be longitudinal if they covered at least 1 year of follow-up, although studies with follow-ups of 3 to 6 years were more common.
The 31 articles that were found are organized into four overlapping topic areas (see Table 1): service access and utilization (10 articles), underrepresented populations (11), treatment outcomes (4), and cost analysis (6). Within each domain, the focus is on representative articles that illustrate the depth and breadth of administrative data applications recently conducted within the substance abuse research field. Each study’s main findings are briefly summarized and a critical assessment of the contributions and lessons learned for longitudinal research is provided in each area as well. The review ends with a summary of how administrative data can be applied within a longitudinal research design to enhance knowledge, followed by a discussion of implications for future longitudinal research on substance abuse treatment/service utilization and outcomes.
Substance abusers with multiple needs often access care provided by several service systems, and, over time, individuals may have a history of a variety of healthcare experiences with varying results. But because each service system is usually a separate and distinct entity, it can be difficult to document the full constellation of services received, much less understand how utilization of services offered by different systems may have impacted outcomes over time. Several studies have combined administrative data from multiple agencies to broaden and extend our understanding of service access, utilization, and outcomes.
Researchers have used administrative data, combined with self-reported data, to strengthen study designs. Lundgren et al. (2005)69 linked self-reported interview data with statewide claims data on substance abuse treatment and health insurance to examine relationships over 6.5 years between drug treatment, health service use, HIV status, and emergency room and hospital use. The combined dataset generated complementary information on factors that predict costly events. Use of emergency room and hospital services was positively associated with mental health status, drug use severity, and having private health insurance. In another example, Rosen et al. (2007)70 also combined interview data with administrative records to determine whether assignment of a payee to receive funds affected clinical outcomes for 1,457 mentally ill individuals in 9 states over 12 months. Results showed that beneficiaries with a payee had more serious substance abuse and mental health problems, received more psychiatric services and a broader range of services, and demonstrated a greater reduction in substance use. Ray, Weisner, and Mertens (2005) 71 used administrative data to adjust for a variety of characteristics when examining the relationship between receipt of psychiatric services and 5-year drug treatment outcomes. Analysis of linked data on psychiatric services and drug treatment for 604 individuals served by a California health maintenance organization showed that over 5 years, drug abstinence was more likely among patients who received an average of 2.1 hours of psychiatric services per year.
Researchers have also used administrative data to document changes in patterns of care over time. Maynard et al. (2000)72 linked existing state records from multiple sources to examine service utilization over 4 years by 735 patients involuntarily committed to substance abuse treatment in Washington. Drug treatment completion was associated with a decreased likelihood of using acute care services and with an increased likelihood of post-discharge outpatient treatment participation. Of most interest for the purposes of this paper, utilization of some services was actually greater in the year immediately following treatment discharge than in the prior year; only by analyzing administrative data covering the second and third years following discharge were eventual decreases to below pre-admission levels revealed. Similarly, Lundgren et al. (2006)3 used administrative drug treatment data to explore service utilization patterns among 22,006 injection-drug-using treatment repeaters in Massachusetts over a 5-year period. Findings revealed variation in patterns of care, an overuse of detox only, and an underutilization of the state’s continuum-of-care model.
Finally, administrative data has been used to examine the impact of policy changes on the utilization of care. As one prime example, records from Oregon’s substance abuse treatment system and the state’s Medicaid eligibility and enrollment data were linked on more than 500,000 subjects over 4 years to analyze the impact of the shift from fee-for-service financing to managed care on access to publicly funded substance abuse treatment.73 Investigators found that access to drug treatment actually increased significantly under managed care, an unexpected consequence of this statewide health policy change. In subsequent studies, Oregon’s integrated administrative database was instrumental in examining the impact of policy decisions regarding coverage for drug treatment as a Medicaid benefit. When coverage was expanded, the number of opiate users enrolled in methadone maintenance increased.74–75 But when eligibility for benefits was later reduced, administrative data was used to uncover how the neediest patients were the most negatively impacted76 and also how changes in coverage both reduced new admissions to opiate treatment and also decreased the likelihood of a methadone placement for people who did present for treatment.77
These studies illustrate several advantages of using administrative data to conduct longitudinal research on substance abuse service access and utilization. First, administrative data provide empirical evidence not just on how substance abuse treatment impacts later drug use, but also how treatment/service utilization relates to changes in other types of behavior and related events over time. As demonstrated by these studies, utilization of services provided by one system can have a “ripple effect,” both in utilization of services provided by a sister system (e.g., the decreased use of medical and mental health treatment due to drug treatment) and also in outcomes felt by other systems (e.g., the impact of changes in Medicaid coverage on drug treatment admissions). Administrative data is also instrumental for observing system-wide impacts of large policy changes (i.e., changes in managed care or Medicaid enrollment policies) that result in unexpected or unintended consequences. Also revealed are relationships between outcomes and degree of service exposure (i.e., treatment retention and completion, not just treatment entry itself), in addition to how those relationships change over time (e.g., service utilization can increase just after drug treatment but then eventually decrease below pretreatment levels when observation periods are extended). Similarly, administrative data allows analyses to incorporate policy-level factors not often included in outcome studies, such as the impacts of private health insurance, use of prescription medication, and HIV status. Finally, administrative data can be used to strengthen study designs, for example, through adjustment for group differences, and can be combined with self-reported information to provide more complete information on participants. 
In conclusion, synthesis of administrative data from separate but overlapping service systems generates new knowledge that is comprehensive, timely, and policy-relevant, enhancing the ability of researchers to understand longitudinal trends in healthcare utilization and outcomes.
Review of these studies also raises the issue of “history” effects,78 which must be considered when using administrative data for longitudinal research. Changes in behaviors and outcomes observed over time may be a function of changes in access to treatment/services, perhaps due to policy changes that influence funding levels, changes in eligibility for services (e.g., the actual change in SSI eligibility in which addiction was no longer considered a disability), or economic conditions that influence behaviors (e.g., employment, insurance enrollment). Although some studies using administrative data employ an “interrupted time series” design, in which behaviors or outcomes are compared in periods prior to and after a defined event that directly influences the observed outcomes, in other cases, historical events may not be adequately accounted for in the interpretation of findings. It must be kept in mind that administrative data reflects only those events that come to the attention of the system providing the data. That is, administrative data may provide a wealth of information about clients who used a set of services, but an inherent limitation is that individuals who needed the service but did not access it are excluded. It follows that findings resulting from administrative data analysis often reflect those who have been in treatment rather than the general community.
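The interrupted time series design mentioned above can be made concrete with a segmented regression, which estimates both the immediate change in level and the change in trend after a defined event. The following is a minimal sketch, not the method of any study reviewed here: the monthly admission counts, the change point, and the function names `solve` and `fit_its` are all hypothetical, and a real analysis would also need to address autocorrelation, seasonality, and co-occurring historical events.

```python
def solve(A, b):
    """Solve the linear system A x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(n):
            if r != col:
                factor = M[r][col] / M[col][col]
                M[r] = [x - factor * y for x, y in zip(M[r], M[col])]
    return [M[i][n] / M[i][i] for i in range(n)]


def fit_its(y, change_point):
    """Ordinary least squares fit of
    y_t = b0 + b1*t + b2*post_t + b3*(t - change_point)*post_t,
    where post_t is 1 from the change point onward and 0 before it.
    Returns [b0, b1, b2, b3]: baseline level, baseline trend,
    level change at the event, and trend change after the event."""
    X = []
    for t in range(len(y)):
        post = 1.0 if t >= change_point else 0.0
        X.append([1.0, float(t), post, (t - change_point) * post])
    k = 4
    XtX = [[sum(row[i] * row[j] for row in X) for j in range(k)] for i in range(k)]
    Xty = [sum(row[i] * yi for row, yi in zip(X, y)) for i in range(k)]
    return solve(XtX, Xty)


# Hypothetical monthly treatment admissions: six months before a coverage
# change and six months after it (the change takes effect at month index 6).
admissions = [100, 102, 104, 106, 108, 110, 127, 128, 129, 130, 131, 132]
b0, b1, level_change, trend_change = fit_its(admissions, change_point=6)
```

In this constructed series, the fitted level change is about 15 admissions and the trend change is about -1 per month. The design attributes both to the event only if nothing else changed at the same time, which is exactly the “history” threat described above.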
Administrative data often contains information on the entire population served by the system, facilitating the exploration of issues specific to ethnic minority or other small populations that otherwise might not be examined, partly because of insufficient sample size. As one example, the California Treatment Outcome Project (CalTOP) combined self-reported information with administrative data from four sources (adult lifetime records on arrests, mental health service utilization, drug treatment histories, and driving records) on more than 20,000 individuals admitted to publicly funded drug treatment.63 The resulting dataset not only permitted analysis of emerging substance abuse policy issues,79 but was also used in several investigations exploring issues unique to a number of populations historically underrepresented in substance abuse research, including Hispanics,80 American Indians,81 Asian Americans,82 methamphetamine users,83 women treated in women-only versus mixed-gender programs,84–85 dually diagnosed patients,86 and mothers involved with the child welfare system.87 Discussing the design and findings of each CalTOP article is beyond the scope of this paper. Instead, CalTOP is cited as an example of how the combination of multiple sources of existing records generates a versatile dataset with enough sample size, depth, and breadth to permit analyses of complex issues unique to often neglected groups.
Green, Rockhill, and Furrer (2006)88 used administrative data to examine issues pertinent to a hard-to-study population, substance abusing women involved with the child welfare system, while also documenting the complex interplay of events, system interactions, and outcomes over time. The study examined 3 years of data on 1,911 women involved in Oregon’s child welfare and drug treatment systems. Data was used to track events over time and also to control for a number of diverse pretreatment characteristics such as treatment and child welfare history, and substance abuse frequency and chronicity. Results indicated that when women entered drug treatment sooner after the date of having a child placed in substitute care, spent more time in treatment, or completed treatment, their children spent fewer days in foster care and were more likely to be reunified with their parents.
In another example of using administrative data to conduct longitudinal research on an understudied population, Claus, Orwin, Kissin, et al. (2007)89 analyzed 6 years of Washington State’s treatment admission and discharge data on more than 1,500 individuals to examine differences in continuity of care between women with children who entered specialized women-only residential treatment versus standard mixed-gender residential treatment. Administrative data was used not only to examine these issues over time, but also to construct propensity scores for addressing group nonequivalence, to control for treatment completion and length of stay, and to examine alternative explanations of observed associations. Results showed that specialized treatment was associated with continuing care, and women who completed specialized treatment with longer stays were most likely to continue care.
These studies highlight how administrative data can be a rich resource for conducting rigorous longitudinal research on understudied populations. With administrative data, not only are sample sizes large enough to support statistical analyses, but, moreover, such data makes it possible to employ strong study designs and to examine complex interactions over time, two advantages that strengthen longitudinal research on underrepresented groups.
These studies also raise issues researchers face when the definitions of individual variables (e.g., the aggregation of particular ethnicities within larger racial categories), or even entire data systems, change over time, which makes longitudinal analyses especially problematic. Often, there is little information on changes in the accuracy of data as it is collected over time, and without a means to verify data quality, “dirty” data may be omitted from analysis altogether. Furthermore, events of interest may lack needed specificity or may have been collected inconsistently over time, or data may not be available until some time after events have occurred. Finally, caution must be exercised when making causal attributions simply because events captured in administrative databases are associated in time. External criteria or self-selection forces that influence where and when people enter into different service systems must be considered, especially in terms of the generalizability of findings that are rooted in administrative data.
More than 10 years ago, Washington State recognized the potential of using statewide information systems to provide meaningful data for informing policy and practice,90 and since then Washington researchers have developed an impressive portfolio of treatment outcome studies based on administrative data. Others outside of Washington have also used existing data to measure treatment outcomes,91–93 but because of the state’s long history in this area, in this section the focus is on a few of their studies that illustrate useful lessons in the application of administrative data for longitudinal outcome research on substance abuse treatment.
Luchansky et al. (2000)94 linked three sources of state-level data (substance abuse treatment, criminal histories, and employment wages) on 10,284 individuals to analyze factors related to treatment readmission in Washington. The administrative data, which covered 13 months, was instrumental in documenting the continuum of care and its impact on patterns of readmission over time. Findings revealed that only about a quarter of clients were readmitted to treatment over 1 year. Readmission was more likely for females and people arrested in the year prior to treatment, and it was less likely for males and those receiving a combination of inpatient and outpatient treatments.
Four years later, Maynard and colleagues (2004)95 examined relationships between death, mental illness, and substance abuse among 2,041 individuals discharged from Washington State mental hospitals. Relying on administrative data from 3 sources (mental health hospitalizations, Medicaid diagnostic data, and cause of death) covering 5 years, analysis revealed that patients with a co-occurring or substance use disorder had a 44% higher risk of death after discharge, compared to those with a mental illness diagnosis only, and these individuals died at a younger age, primarily due to injury, accidents, and medical conditions directly related to their addiction.
Most recently, Luchansky and colleagues96 obtained existing data from multiple state-level sources on 8,343 Supplemental Security Income recipients covering 1 year. Investigators examined the association between receipt of needed treatment and subsequent criminal justice involvement. Administrative data was used to capture not only the substance abuse treatment experiences of the sample but also several other related events, including medical care, arrests and convictions, and receipt of other health and social services. Also, identification of the need for substance abuse treatment drew upon administrative data from not just 1 but 3 data sources. Interestingly, as the authors note, more than 68% of the study population met criteria for substance abuse treatment need in more than one source, indicating that most of the study population came into contact with more than one governmental agency. Despite these multiple contacts, only about half of the study population actually entered substance abuse treatment. Furthermore, administrative data on substance abuse treatment histories allowed for analysis of outcomes related to treatment episodes of care (i.e., multiple treatment admissions that occur within 30 or fewer days after discharge from a previous admission), a preferred alternative to analyzing the outcomes of a single treatment admission and discharge, especially within a chronic care context. Finally, administrative data permitted an examination of whether the amount of treatment was associated with outcomes and also whether simply entering treatment had an impact. Findings revealed that reduced risks for re-arrest and conviction were associated with treatment completion and longer retention, as well as simply entering treatment.
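The episode-of-care definition described above (admissions that begin within 30 or fewer days after discharge from a previous admission are treated as one episode) can be sketched in a few lines. This is an illustrative sketch only: the record layout, the example dates, and the function name `group_episodes` are assumptions, and the cited study may have handled open-ended or overlapping stays differently.

```python
from datetime import date


def group_episodes(admissions, max_gap_days=30):
    """Group one person's (admit_date, discharge_date) pairs into episodes of care.

    An admission joins the current episode when it begins no more than
    max_gap_days after the episode's latest discharge; otherwise it
    starts a new episode.
    """
    episodes = []
    for admit, discharge in sorted(admissions):
        if episodes and (admit - episodes[-1]["end"]).days <= max_gap_days:
            current = episodes[-1]
            current["end"] = max(current["end"], discharge)
            current["admissions"] += 1
        else:
            episodes.append({"start": admit, "end": discharge, "admissions": 1})
    return episodes


# Hypothetical treatment history for a single individual.
stays = [
    (date(2005, 1, 3), date(2005, 1, 10)),   # first admission
    (date(2005, 1, 20), date(2005, 3, 15)),  # begins 10 days after discharge: same episode
    (date(2005, 9, 1), date(2005, 10, 1)),   # begins months later: new episode
]
episodes = group_episodes(stays)
```

Here the three admissions collapse into two episodes, so retention and completion can be measured per episode rather than per admission, which is the analytic advantage noted above for a chronic care context.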
A study that combined data from Washington with data from 2 other states provides a final example. The TOPPS II Interstate Cooperative Study Group (2003)64 used employment and drug treatment data on 20,495 drug treatment patients living in Baltimore, Washington State, and Oklahoma, to examine the effect of drug treatment completion on employment and wages in the year after treatment. Employment history prior to treatment was used to adjust for group differences, and all treatment services received within an episode of care, despite changes in modality, were captured. Posttreatment employment was associated with treatment completion and longer treatment stays. Findings were consistent across all three states, despite different populations, treatment delivery systems, and labor markets.
Review of these studies raises an ongoing issue of particular concern when administrative data is used to study treatment outcomes: researchers’ ability to link data across datasets and time. Reasons for unlinked data are frequently unknown; however, the absence of a record is often interpreted as a non-occurrence of an event. For example, when no record is found to indicate an arrest, utilization of services, or treatment readmission, it is sometimes assumed that behavioral or medical improvements have occurred. While some absent data can be explained by improvements in subject status, other explanations may include insufficient identifiers needed for data matching,72 technical complications prohibiting linkage such as duplicate records,97 inconsistencies between administrative records and clinical records,98 and systematic or inadvertent data purges. Additionally, different strategies for record matching might be employed when acquiring data repeatedly and from multiple sources, possibly resulting in differences in linkage rates. The problem of missing data has been associated with particular subject characteristics such as being a member of a racial/ethnic minority or infrequent exposure to a particular service system.99 Communication with agency staff who provide data is key to understanding reasons for missing data. Also important are the utilization of different linking strategies (probabilistic vs. deterministic) designed to minimize the amount of unlinked data, the use of thresholds for determining legitimate matches, the examination of possible biases resulting from unlinked records, and the use of administrative data in combination with self-reported data.
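The contrast between deterministic and probabilistic linkage, and the role of match thresholds, can be illustrated with a simplified sketch in the spirit of the Fellegi-Sunter approach to record linkage. Everything specific here is an assumption made for illustration: the field names, the agreement probabilities in `FIELD_PARAMS`, the threshold values, and the example records. None of it is drawn from the studies reviewed, and production linkage would typically also use blocking and approximate string comparison.

```python
import math

# Hypothetical agreement probabilities per field:
#   m = P(field agrees | records belong to the same person)
#   u = P(field agrees | records belong to different people)
FIELD_PARAMS = {
    "last_name": (0.95, 0.01),
    "birth_date": (0.98, 0.003),
    "sex": (0.99, 0.5),
}


def deterministic_match(a, b, key="state_id"):
    """Link only when both records carry the same non-missing identifier."""
    return a.get(key) is not None and a.get(key) == b.get(key)


def match_weight(a, b, params=FIELD_PARAMS):
    """Sum log2 agreement/disagreement weights over the comparison fields."""
    total = 0.0
    for field, (m, u) in params.items():
        if a.get(field) is None or b.get(field) is None:
            continue  # a missing value contributes no evidence either way
        if a[field] == b[field]:
            total += math.log2(m / u)
        else:
            total += math.log2((1 - m) / (1 - u))
    return total


def classify(weight, upper=8.0, lower=0.0):
    """Apply two thresholds: link, reject, or send to clerical review."""
    if weight >= upper:
        return "match"
    if weight <= lower:
        return "non-match"
    return "review"


# Hypothetical records: the treatment record lacks the shared identifier.
treatment = {"state_id": None, "last_name": "garcia", "birth_date": "1970-04-12", "sex": "F"}
arrest = {"state_id": "A100", "last_name": "garcia", "birth_date": "1970-04-12", "sex": "F"}
other = {"state_id": "B200", "last_name": "garcia", "birth_date": "1982-11-02", "sex": "M"}

exact_link = deterministic_match(treatment, arrest)
same_person = classify(match_weight(treatment, arrest))
different_person = classify(match_weight(treatment, other))
```

In this example the deterministic rule fails because the identifier is missing, so the pair would silently appear as "no arrest record," whereas the probabilistic score links it. Where the thresholds are set determines the trade-off between false links and missed links, which is why linkage rates can differ between data acquisitions that use different strategies.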
Despite these issues, Washington’s work demonstrates how existing data provides the means to generate findings on longitudinal treatment outcomes that converge across service systems, over time, and even across states. Furthermore, from study to study, Washington’s administrative data-based treatment outcome studies have involved large sample sizes, multiple datasets, and long observation periods, making it possible to examine how complexities resulting from the same individuals entering different systems of care impact long-term outcomes. Additional examples of Washington’s work with administrative data are available,100–102 but due to space constraints, they cannot be covered in this article. This body of work demonstrates that administrative data is a useful tool for generating knowledge by expanding, lengthening, and strengthening our observation of addiction treatment and its impact, while providing information on the complex interactions and processes that can influence outcomes over time.
A great deal of empirical evidence suggests that substance abuse is associated with increases in a wide range of costs to society,51, 103–108 including costs associated with crime and the criminal justice system;109–110 medical care; 109, 111–116 infectious diseases;117–118 perinatal care;118 mental health disorders;104 and public benefits programs.118–121 Using administrative data to examine economic issues associated with drug abuse is a widespread practice that has resulted in several notable contributions to the field. Just a few recent examples have been chosen to illustrate some strengths and challenges associated with using administrative data for longitudinal substance abuse cost/benefit studies.
Parthasarathy & Weisner (2005)122 linked service utilization and cost data to information self-reported by 1,204 commercially insured chemical dependency patients to examine 5-year patterns of health care utilization and costs. Administrative data on patients without alcohol and drug problems was used to form a matched comparison group to examine whether patterns were attributable to regional trends. The most significant predictors of long-term utilization and costs were age, gender, employment status, medical and psychiatric severity, dependence type, treatment modality, and abstinence. Administrative data allowed researchers to ascertain that total health care costs increased initially from baseline to 6 months later, but costs then decreased below intake levels by 1 and 5 years post-intake.
In a different but similarly designed study, Polen et al. (2006)123 examined differences in medical care costs over 6 years between 1,472 individuals recommended for substance abuse treatment and 738 people without substance abuse diagnoses or treatment within a health maintenance organization in Oregon. Administrative records provided data on demographic characteristics, psychiatric diagnoses, prior care, service utilization, treatment completion, and costs. Changes in medical care costs over time did not differ between the two groups, and individuals with improved treatment outcomes did not have greater reductions in medical costs.
Ettner et al. (2006)124 combined self-reported information from 2,567 individuals with state-level administrative data from four sources to examine costs/benefits associated with publicly funded drug treatment in California provided through CalTOP over 2 years. Substance abuse treatment represented a greater than 7 to 1 ratio of benefits to costs, primarily because of reduced crime costs and increased employment earnings following treatment. The authors noted that the most significant monetary benefits occurred in areas (crime, hospitalizations, earnings) that were captured by administrative data, suggesting that in the future, similarly designed cost/benefit analyses that omit self-reported information and rely entirely on administrative data would likely result in reasonable estimates.
Wickizer et al. (2006)125 relied entirely on administrative data from Washington to evaluate the economic impact of substance abuse treatment on medical expenditures for welfare recipients over 4 years. By using linked administrative data from six sources, researchers were able to define a large study population (n=32,919), control for a number of factors (including differences in baseline medical care expenditures, demographics, mental health status, and health risk), and modify the parameters of statistical models to test the robustness of study findings. Substance abuse treatment was associated with a reduction in medical expenses of about $2,500 annually. Additionally, secondary analyses revealed that, compared to an untreated group of welfare recipients, individuals in the treated group who used inpatient mental health services incurred fewer costs on average. The treated group was also more likely to use outpatient mental health services and less likely to use adult services such as in-home nursing care and assisted living.
Carey et al. (2006)126 obtained administrative data on drug treatment and numerous criminal justice system interactions to assess costs and benefits associated with California drug courts over 4 years. Administrative data provided information on drug court participants and also on a matched comparison group of offenders who did not enter drug court. The study found that every $1 invested in drug courts was associated with a return of $3.50, mainly due to reduced recidivism rates among participants. Additionally, for each year studied, the state saw a combined net savings of more than $9 million.
Longshore et al. (2007)127 utilized existing state-level records on more than 130,000 individuals covering 8 domains (e.g., criminal justice system interactions, drug treatment, healthcare) to assess the cost implications of California’s initiative to provide community-based substance abuse treatment to eligible drug offenders. Findings showed that over the 2.5 years of observation, the program yielded significant savings, with cost-saving ratios of 1:2.5 for participants generally (i.e., for every $1 invested, $2.50 was saved) and 1:4 for completers (i.e., for every $1 invested, $4 was saved). Additionally, administrative data was used to construct a comparison group, replicate findings using subsequent years of data, examine the impact of degree of treatment participation, and conduct sub-studies on populations of particular interest.
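The ratios quoted in these cost studies reduce to simple arithmetic, sketched below. The function names and the per-dollar figures are hypothetical and chosen only to mirror the 1:2.5 and 1:4 ratios reported above. The sketch also assumes that the reported savings are gross, so net savings are obtained by subtracting the investment; the source studies may define savings differently.

```python
def benefit_cost_ratio(total_benefits: float, total_costs: float) -> float:
    """Return dollars of benefit (savings) per dollar of cost."""
    if total_costs <= 0:
        raise ValueError("total_costs must be positive")
    return total_benefits / total_costs


def net_savings(total_benefits: float, total_costs: float) -> float:
    """Return gross savings minus the amount invested (assumes gross savings)."""
    return total_benefits - total_costs


# Hypothetical per-dollar figures mirroring the ratios quoted in the text;
# these are not the study's underlying totals.
invested = 1.00
saved_all_participants = 2.50
saved_completers = 4.00

ratio_all = benefit_cost_ratio(saved_all_participants, invested)
ratio_completers = benefit_cost_ratio(saved_completers, invested)
net_all = net_savings(saved_all_participants, invested)

print(f"All participants: ${ratio_all:.2f} saved per $1 invested (net ${net_all:.2f})")
print(f"Completers: ${ratio_completers:.2f} saved per $1 invested")
```

Reporting the ratio alongside net savings helps readers compare programs of different sizes, since a high ratio on a small investment can correspond to modest absolute savings.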
One issue raised by these studies is the need to examine how well self-reported information corresponds with information drawn from administrative data sources. For example, an individual’s self-reported estimate of income may include sources, such as “under-the-table” wages, not captured by administrative data sources, a discrepancy that might affect cost-benefit ratios associated with the program being studied. While some researchers have used administrative data to test the validity of measures128–132 and others have commented on how external forces such as regulatory requirements and billing systems likely influence the accuracy of administrative data,133 there have been no empirical investigations of the degree to which self-reported and administrative data sources complement each other, or of whether the accuracy or reliability of information is greater in one data source than in the other.
These cost studies illustrate how administrative data can be used to enhance flexibility in study design, facilitating the measurement of behavior and service system interactions and associated costs as they unfold over time. Added strengths associated with the utilization of administrative data include large sample sizes, longitudinal study designs, multiple data points, matched comparison groups, risk adjustments, and robustness tests. Studies are simultaneously broad and deep enough to capture complexities, while revealing new policy-relevant knowledge on the intersections between different service systems (i.e., substance abuse treatment, welfare, mental health, adult health services) and their economic impacts.
The results of this review must be considered within the constraints of its design. Only longitudinal substance abuse research articles published in peer-reviewed journals after 1999 were included in this review. However, the results of many studies using administrative data are not necessarily published in peer-reviewed journals, but instead are in the form of reports to state governments or reports on program evaluations.134–136 These reports can provide valuable guidance, especially to researchers seeking assistance from experienced colleagues in navigating the intricacies of accessing and utilizing datasets unique to specific states.
Despite its limitations, administrative data is a valuable resource for conducting longitudinal research on substance abuse treatment, service utilization, and outcomes. Considerable time and resources are required simply to access administrative data and adequately address concerns regarding confidentiality, and researchers are advised to carefully weigh the benefits to be gained from analyzing administrative data against the resources required to obtain and derive adequate information from it. But beyond enhancing the ability of researchers to analyze events and associated costs as they arise and unfold, other advantages, especially relevant to a long-term view, include opportunities to track historical trends, apply flexible follow-up intervals, and take advantage of large pools of data for matching groups on key characteristics for quasi-experimental designs. Additionally, administrative data can be used to verify some self-reported key events in an individual’s life, enabling researchers to strengthen the validity of findings informing the understanding of causes and effects surrounding particular events. Administrative data is also particularly well-suited to studying the long-term course of health service utilization and outcomes among co-morbid and disadvantaged populations, as these groups are often excluded from traditional randomized clinical trials.137–138 Finally, even as longitudinal research participants age and die, administrative data on them remains available, and utilizing such data is one way to make up for lost opportunities to extend knowledge about key events within a life-course perspective.139
This article makes a contribution to the field by articulating the numerous issues associated with using administrative data for longitudinal research purposes, thereby creating a key resource for various stakeholders. For example, this paper may aid junior investigators who are seeking guidance before utilizing administrative data for the first time, and it may also be of use to more experienced researchers who are seeking evidence to corroborate their own experiences with administrative data or who are, perhaps, seeking new directions for extending or strengthening their current work. This review might also assist state agencies in considering the pros and cons of fostering greater collaboration and data integration, especially between sister agencies interested in analyzing existing administrative data within a framework that employs a longitudinal perspective.
Examining how behavior patterns are shaped and altered by events over time is a key component of longitudinal research. Knowing when and how often an individual engages in behaviors and how the course of those behaviors is altered through interactions with different systems could help to improve the planning and delivery of strategies to change those behaviors for the better. Few substance-abuse-related datasets contain the data elements needed to track participants over time, particularly in complex areas of interest. As illustrated by the articles reviewed in this paper, combining elements on individuals from several administrative datasets, or in combination with complementary self-reported information, increases confidence in findings,140 strengthens research designs, and broadens the scope and flexibility of analyses. Such combinations provide opportunities to generate empirical evidence on the longitudinal effects of drug abuse and related complex events that would not have been revealed by analysis of single datasets independent of one another.
A comprehensive, integrated administrative dataset, i.e., a multi-dimensional measurement of diverse events over time as they occur, would allow for more complex models that reflect real-life social interactions and phenomena. Research that utilizes data collected by self-contained systems that incorporate multidimensional data under one unique identifier (e.g., data from Health Maintenance Organizations, the Department of Veterans Affairs, public insurance recipients, and some individual states) is promising. But the drug abuse research field is not yet at the point of creating an information system like Denmark’s social registries, which constitute a single coherent source of social statistics.141
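As a minimal sketch of what incorporating multidimensional data under one unique identifier can look like in practice, the example below merges records from two hypothetical administrative sources into a single chronological timeline per person. The field names (`person_id`, `date`, `event`), the source labels, and the records themselves are invented for illustration. The sketch also assumes a clean shared identifier, whereas real linkage typically requires de-identification, handling of mismatched identifiers, and data-use agreements.

```python
from collections import defaultdict
from datetime import date

# Hypothetical records from two separate administrative systems.
treatment_records = [
    {"person_id": "A1", "date": date(2005, 3, 1), "event": "treatment admission"},
    {"person_id": "A1", "date": date(2005, 6, 15), "event": "treatment discharge"},
    {"person_id": "B2", "date": date(2005, 4, 20), "event": "treatment admission"},
]
arrest_records = [
    {"person_id": "A1", "date": date(2004, 11, 5), "event": "arrest"},
    {"person_id": "B2", "date": date(2006, 1, 9), "event": "arrest"},
]


def build_timelines(sources):
    """Group events from all sources by person and sort each group by date.

    `sources` is a list of (source_name, records) pairs.
    """
    timelines = defaultdict(list)
    for source_name, records in sources:
        for record in records:
            timelines[record["person_id"]].append(
                (record["date"], source_name, record["event"])
            )
    for events in timelines.values():
        events.sort(key=lambda item: item[0])  # chronological order
    return dict(timelines)


timelines = build_timelines(
    [("treatment", treatment_records), ("criminal_justice", arrest_records)]
)

for person_id, events in sorted(timelines.items()):
    print(person_id)
    for event_date, source_name, event in events:
        print(f"  {event_date.isoformat()}  [{source_name}]  {event}")
```

Keeping the source label on each event preserves provenance, which matters when the accuracy of different administrative systems varies.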
Is a “cyberinfrastructure” age on the horizon for addiction research? Touted as a tool that can “enable the development of more realistic models of complex social phenomena” and “the production of and analysis of larger datasets…that more completely record human behavior…” (Berman & Brady, 2005, p. 9),142 a cyberinfrastructure would make it possible to “track change in human behavior at multiple time scales and from multiple perspectives” (Berman & Brady, 2005, p. 13).142 Enthusiasm for technological advancements that rely on identifiable personal data must be tempered by persistent uncertainties regarding the maintenance of individual privacy, the potential for the fraudulent use of such data, and the legal requirements guiding use of administrative data for research purposes.
Admittedly not without its risks, a cyberinfrastructure is certainly an intriguing concept and its application in the addiction field would signal a great technological and conceptual leap forward. Further consideration is needed and, given that much of the research included in this paper originated as statewide evaluations of specific or localized programs or policies, ongoing discussion regarding the utilization of administrative data for longitudinal research might best occur as part of the continuing national debate regarding different operational, conceptual, and methodological approaches to measuring drug treatment quality, performance, and outcomes.143 As a valuable resource for generating empirical evidence revealing complex events as they unfold and interact over time, administrative data is likely to continue to be a key feature of future longitudinal substance abuse research on treatment/service utilization and outcomes.
The project described was supported in part by Grant Number P30 DA016383 from the National Institute on Drug Abuse. The content is solely the responsibility of the authors and does not necessarily represent the official views of the National Institute on Drug Abuse or the National Institutes of Health. Special thanks are due to staff at the UCLA Integrated Substance Abuse Programs for manuscript preparation. We particularly wish to thank Dr. Antoinette Krupski for her generous insights and comments.