To test whether self-report data agree with medical record data in marginalized, HIV-infected populations, we collected information about HIV primary care visits over a 6-month period from both sources. Patients were drawn from a large study of engagement and retention in care conducted between 2003 and 2005. Self-report data were collected in face-to-face interviews and medical records were extracted using a rigorous, standardized protocol with multiple quality checks. We found poor overall agreement (weighted κ=0.36, 95% confidence interval=0.28, 0.43). Factors associated with disagreement included younger age (adjusted odds ratio for 20 versus 40 years=1.25, 95% confidence interval=0.98, 1.60), non-Hispanic black race/ethnicity (adjusted odds ratio for non-Hispanic blacks versus non-Hispanic whites=1.48, 95% confidence interval=1.03, 2.13), lower education (adjusted odds ratio for high school education, GED, or less versus some college or college graduate=1.43, 95% confidence interval=0.96, 2.13), and substance use (adjusted odds ratio for any illicit drug/heavy alcohol use in the past 6 months versus no use=1.39, 95% confidence interval=1.02, 1.90). These findings do not support a conclusion that unconfirmed self-report data of HIV primary care visits are a sufficient substitute for rigorously collected medical record data in studies focusing on marginalized populations. Use of other data sources (e.g., administrative data), use of other self-reported outcome measures that have better concordance with medical records/administrative data (e.g., CD4 counts), or incorporation of rigorous measures to increase reliability of self-report data may be needed. Limitations of this study include the lack of a true gold standard with which to compare self-report data.
As advances in HIV treatment that reduce mortality and improve quality of life increasingly become incorporated into clinical practice for HIV care, it is critical to monitor their distribution in the population to ensure equitable and appropriate access. HIV-infected people first must be engaged in the health care system to access these advances, so accurate documentation of HIV health services utilization, particularly HIV primary care visits, is also essential.1 This is especially important for monitoring HIV treatment in socially and economically marginalized populations, for whom improvements in access to HIV primary care are most needed.2–8 However, in these populations, medical record and services utilization data are challenging to obtain because people often have inconsistent health insurance coverage, use multiple health care providers, have irregular health care utilization patterns, and often obtain care in resource-poor settings where electronic medical records are less frequently available. It might be a reasonable task to obtain various forms of medical record data when studying small samples of marginalized groups that receive all their care from a single HIV primary care practice, but it is much more challenging and expensive when studying large samples that receive care from multiple providers. Thus, researchers frequently rely on patients' self-reports of their health care utilization,3–10 even though this type of data is prone to biases and other types of error.11–14
For these reasons, understanding the degree to which self-report data can be accurately obtained from large samples of marginalized HIV-infected populations is essential. Few data on the accuracy of patient self-report and patterns of misclassification across different sources of data are available for these groups. Most of the available data focus on the accuracy of self-reported drug use behaviors or on self-reports of various behaviors collected from drug-using populations, or are related to health care utilization behaviors other than primary care visits.15–17 Although the validity of self-report visit data has been evaluated previously,18,19 it is important to consider this issue in marginalized populations specifically. Marginalized populations include substance users, mentally ill, homeless, recently incarcerated, immigrants, ethnic/racial minorities, and those who have low socioeconomic status for other reasons. They are likely to have greater difficulty providing accurate reports of HIV service utilization than the general population for the same reasons that researchers are often forced to rely on these data: they are less likely than the general population to be consistently engaged in health care and may not return to the same source of care over time.6 Despite ongoing efforts to improve HIV/AIDS care in the United States,20,21 these disparities in health care utilization patterns continue to be documented and have been attributed to both health care service delivery system issues22,23 and social factors, such as competing priorities faced by marginalized populations.24,25 The complex patterns of health care utilization of marginalized populations are likely to be more difficult to remember and describe.
Understanding the relationship between self-report data and other data sources on HIV health care utilization specifically for marginalized populations is crucial to evaluate interventions designed to improve their care and reduce disparities in both health care services and outcomes that affect these populations. Thus, in a large study of interventions to improve engagement and retention in care among marginalized people with HIV across the United States who are at risk of suboptimal HIV health care, we collected both patient self-report and medical record information about HIV primary care visits. Medical records were extracted using a detailed protocol with several validity checks built in. We examined the relationship between self-report and medical record data on HIV outpatient service utilization, and explored demographic and clinical predictors of disagreement.
Data for this study were collected as part of a multisite initiative designed to evaluate the impact of outreach interventions on engagement and retention in health care for marginalized HIV-infected men and women, who are frequently considered medically underserved.8,26 The populations served were defined locally in each study site, and included active substance users, the medically indigent, commercial sex workers, people with mental illness, recently incarcerated individuals, and homeless people. (See Rajabiun et al.26 for a full description of the goals and methodology of this study.)
Recruitment was conducted in 10 sites across the United States from 2003 to 2005. HIV-infected people who enrolled in local outreach programs were invited to join the study. Eligibility criteria included HIV infection by self-report, age of at least 18 years, and ability to complete the interview in English or Spanish. All participants gave informed consent, and each study site obtained approval from its Institutional Review Board.
The sample for this analysis comprised participants from 9 of the 10 study sites because 1 site did not collect medical records. Because this analysis focused on a comparison of self-report and medical record data for information obtained during the 6-month follow-up period, only people whose medical record data were obtained and who were retained in the study to the 6-month follow-up period were included in this analysis. A total of 1045 participants were enrolled across these sites; each site's sample size ranged from 43 to 145 participants. Across the sites, medical records were obtained from 988 (94.5%) participants, and, among this group, 694 (70.2%) were retained in this study to the 6-month follow-up period.
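The sample derivation above can be checked with a few lines of arithmetic. The sketch below uses only the counts reported in the text; the helper name `pct` is an illustrative assumption.

```python
# Counts reported in the text for the 9 sites that collected medical records.
enrolled = 1045       # participants enrolled
with_records = 988    # participants whose medical records were obtained
retained = 694        # of those, retained to the 6-month follow-up


def pct(part, whole):
    """Return part/whole as a percentage rounded to one decimal place."""
    return round(100 * part / whole, 1)


print(pct(with_records, enrolled))   # share of enrolled with records
print(pct(retained, with_records))   # share of those retained at 6 months
print(pct(retained, enrolled))       # analytic sample as share of enrolled
```

The last figure is not stated in the text; it follows directly from the reported counts and shows that roughly two thirds of enrolled participants contributed to the analysis.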
Participants were administered comprehensive face-to-face interviews at baseline and 6 months later, and medical records were reviewed covering the period between baseline and the 6-month follow-up period using standardized forms and systematic protocols. All self-report and medical record data used in these analyses were based on data covering this same 6-month period.
Interviews were conducted using standardized instruments and measures that have been previously validated in similar populations. Participants were asked to report on sociodemographic characteristics, substance use behaviors, HIV risk behaviors, severity of their HIV disease, and HIV health care service utilization. Sociodemographic data included age, gender (female versus male), race/ethnicity (Hispanic, non-Hispanic black, non-Hispanic white, and non-Hispanic other), education (high school, GED, or less versus college or some college), sexual orientation (heterosexual versus gay, lesbian, or bisexual), housing status (stable housing versus unstable housing, including doubling up with friends or family or homeless), and health insurance status (any versus none). Substance use behaviors were measured using a modified version of the Addiction Severity Index.27 Subjects were asked to report on use of illicit drugs and alcohol over the past 6 months, including heroin and/or nonprescribed opioid medication use, crack/cocaine use, and heavy alcohol use (defined as five or more drinks per day for men and four or more for women).28 A combined variable was created to indicate any illicit drug/heavy alcohol use versus no use over the past 6 months. Participants were also asked to report on injection drug use in the past 6 months. HIV sexual risk behaviors were measured using questions modified from the Risk Assessment Battery29 including items on sex without a male or female condom in the past 6 months and sex in exchange for money, drugs, food, or a place to stay (e.g., survival sex) in the past 6 months. Health status was measured by asking participants to report the length of time they have been HIV-positive and their most recent CD4 count. 
Self-report HIV health care utilization was measured by asking participants to report on number of visits to their HIV primary care provider, defined as “care from the provider that most frequently monitors CD4 count and viral load tests, and prescribes HIV medication” and could include care from physicians, physician assistants, doctors of osteopathy, or nurse practitioners in the past 6 months.
Medical records were requested from participants' HIV primary care providers. Participants were asked to report the names and addresses of all of their HIV primary care providers. The protocol for retrieving medical records was rigorous. A letter was sent to the participants' providers requesting the necessary data. If no response was received, a second letter was sent and/or telephone follow-ups were made. If a response was still not received, a research assistant went to the provider's facility to request the medical records in person. If extensive attempts at collecting medical records failed, these data were considered missing and these participants were not included in the current analysis.
Since our study uses medical records as the standard against which we compare self-report data, we made every effort to ensure medical record reviews were conducted accurately. Therefore, HIV health care utilization data were extracted from medical records of HIV primary care providers. Primary care visits, which were defined exactly as in the participant interview, included visits to providers specializing in internal medicine, family medicine, infectious disease, or obstetrics/gynecology. Visits to specialists such as ophthalmologists, pulmonologists, and other providers that treat conditions secondary to HIV/AIDS were not included in this analysis.
A standardized medical record data extraction tool was developed for the purpose of this study and used by all sites. Data collection staff were required to have a sufficient level of medical knowledge and were trained by members of the multisite study coordinating center to extract medical record data. Each site was given detailed instructions on medical record extraction and quality assurance protocol to be followed for this study. Medical record extractors were supervised locally by an HIV clinician or senior evaluator. As a final quality control measure, the multisite study coordinating center reviewed medical record data submitted from local sites during the course of the study and contacted sites that were found to have incomplete or unusual utilization patterns. The coordinating center also conducted site visits, in which at least 10 medical records were selected at random and reviewed. Errors were resolved by mutual agreements between the coordinating center and the specific site with regard to correcting previously collected data and improving the medical record review process for the remainder of the study.26
Our main outcome variables for these analyses were HIV primary care visits as obtained from self-report and medical record data. For each data source, visits were recoded as no visits, one visit, and two or more visits during the 6-month follow-up period. The upper cutoff of two or more visits was selected based on recommendations that HIV-infected patients have at least two visits with their HIV primary care provider during every 6-month period to appropriately monitor CD4 count, medication adherence, and other health conditions.3,30 We created separate categories for no visits and one visit per 6-month period because some providers may argue that stable patients do not need more than one visit per 6 months.
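As a minimal sketch of the recoding described above, the following Python functions map a raw visit count to the three analysis categories and to the two-visit standard-of-care indicator. The function names are illustrative assumptions and are not taken from the study's analysis code.

```python
def recode_visits(n_visits):
    """Recode a 6-month visit count as 0 (no visits), 1 (one visit),
    or 2 (two or more visits)."""
    if n_visits < 0:
        raise ValueError("visit count cannot be negative")
    return min(n_visits, 2)


def met_standard_of_care(n_visits):
    """True when the count reaches the recommended two visits per 6 months."""
    return n_visits >= 2


# Hypothetical raw counts, for illustration only.
for raw in (0, 1, 2, 7):
    print(raw, recode_visits(raw), met_standard_of_care(raw))
```

The same recoding would be applied separately to the self-report count and to the medical record count for each participant.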
Agreement between self-report and medical record data on the number of HIV primary care visits was evaluated using percent agreement and the weighted κ statistic. Following Cohen, the κ statistic is a chance-corrected measure of agreement that indicates good agreement with a value of 0.75 or greater, fair agreement with a value of 0.40–0.74, or poor agreement with a value of less than 0.40.31
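To illustrate how a weighted κ is obtained from a cross-tabulation of the two sources, here is a minimal Python sketch. The text does not state which weighting scheme was used, so linear disagreement weights are an assumption here, and the example table is hypothetical rather than the study's Table 1. The `interpret` function applies the cutoffs quoted above.

```python
def weighted_kappa(table):
    """Weighted kappa for a square table of counts.

    Rows are categories from one source, columns from the other.
    Uses linear disagreement weights |i - j| / (k - 1) (an assumption).
    """
    k = len(table)
    n = sum(sum(row) for row in table)
    row_tot = [sum(row) for row in table]
    col_tot = [sum(table[i][j] for i in range(k)) for j in range(k)]
    observed = 0.0  # weighted observed disagreement
    expected = 0.0  # weighted disagreement expected by chance
    for i in range(k):
        for j in range(k):
            w = abs(i - j) / (k - 1)
            observed += w * table[i][j] / n
            expected += w * row_tot[i] * col_tot[j] / (n * n)
    return 1.0 - observed / expected


def interpret(kappa):
    """Label a kappa value using the cutoffs quoted in the text."""
    if kappa >= 0.75:
        return "good"
    if kappa >= 0.40:
        return "fair"
    return "poor"


# Hypothetical 3x3 table (0, 1, 2+ visits), for illustration only.
example = [[12, 3, 5],
           [4, 10, 16],
           [4, 7, 139]]
k_w = weighted_kappa(example)
print(round(k_w, 2), interpret(k_w))
```

With two categories the linear weights reduce to 0 and 1, so the function returns the ordinary unweighted κ.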
We additionally examined whether disagreement between self-report and medical record data occurred systematically by the sociodemographic, behavioral, or clinical characteristics described above. We categorized participants as having “agreement” versus “disagreement” between the two data sources and then assessed whether disagreement was associated with any of the characteristics described above using χ2 statistics. Participants' self-report data were in agreement with medical record data if the number of visits fell in the same category (zero, one, or two or more) from both sources, and were in disagreement if they fell in different categories.
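The agreement classification described above can be sketched as follows. The labels, function names, and example pairs are hypothetical; the direction labels ("over-report", "under-report") are an added convenience for describing which source recorded more visits and are not a variable defined in the text.

```python
def classify(self_cat, record_cat):
    """Compare recoded categories (0, 1, or 2 for two or more visits)."""
    if self_cat == record_cat:
        return "agreement"
    # Self-report in a higher category than the medical record.
    if self_cat > record_cat:
        return "over-report"
    return "under-report"


def percent_agreement(pairs):
    """Percentage of (self_cat, record_cat) pairs in the same category."""
    same = sum(1 for s, r in pairs if s == r)
    return 100 * same / len(pairs)


# Hypothetical (self-report, medical record) category pairs.
pairs = [(2, 2), (2, 1), (2, 2), (0, 0), (1, 2), (2, 0)]
print([classify(s, r) for s, r in pairs])
print(percent_agreement(pairs))
```

For the analyses of predictors, the two directions would be collapsed into a single "disagreement" outcome, as described in the text.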
Characteristics that were associated with disagreement at p<0.20 were included in multivariable regression analyses that aimed to identify independent predictors of disagreement between self-report and medical record data. To validate our selection process for inclusion of specific characteristics in our multivariable regression models and to control for possible collinearity between covariates, we also performed a stepwise regression analysis. The entry criterion for sociodemographic, behavioral, and clinical characteristics in this stepwise analysis was also set at p<0.20. The resulting model from the stepwise analysis validated our selection process of specific characteristics for our final model. The final regression analysis was conducted using generalized estimating equations to account for clustering in the data by site. We present the adjusted odds ratios and 95% confidence intervals of the characteristics that were included in our final model.
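The bivariate screening step can be illustrated for a single binary characteristic with a Pearson χ2 test on a 2×2 table. This is a sketch under stated assumptions: no continuity correction is applied, the counts are hypothetical, and the function names are illustrative. It does not reproduce the generalized estimating equations model.

```python
import math


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) and p-value for
    the 2x2 table [[a, b], [c, d]]."""
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    chi2 = n * (a * d - b * c) ** 2 / denom
    # Upper tail of the chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


def passes_screen(p_value, threshold=0.20):
    """True when the characteristic meets the entry criterion."""
    return p_value < threshold


# Hypothetical counts: rows are exposed / unexposed,
# columns are disagreement / agreement.
chi2, p = chi_square_2x2(30, 70, 45, 55)
print(round(chi2, 2), round(p, 3), passes_screen(p))
```

A liberal threshold such as p<0.20 keeps characteristics with weaker bivariate associations in play, so that their independent contribution can still be assessed in the multivariable model.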
The mean age was 40 years (standard deviation [SD]=10.6). The sample was predominantly male (n=415, 60.7%); black or Hispanic (n=537, 77.5%); had a high school education, GED, or less (n=524, 75.5%); was unstably housed (n=380, 54.8%); and was insured (n=506, 73.0%). Most reported engaging in some type of illicit drug or heavy alcohol use in the past 6 months (n=376, 54.3%). The mean time since first HIV test was 8 years (SD=6.2), and only a small proportion (n=84, 12.19%) were diagnosed within the past 6 months.
Of the 694 study participants, medical records indicated that 518 people (74.6%) met generally recognized standards of care by making at least two HIV primary care visits during the previous 6-month period. Of those whose medical records indicated that standards of care were not met, 101 people made only one visit (14.6%), and 75 (10.8%) people made no visits. Alternatively, self-report data indicated that 579 people (83.4%) met standards of care. Of those whose self-report data indicated that standards of care were not met, 79 (11.4%) reported only one visit, and 36 (5.2%) reported making no visits.
Using visit data that were categorized as zero, one, or two or more visits per 6-month period, the overall agreement between self-report and medical record data for HIV primary care visits was 76.0% (weighted κ=0.36, 95% confidence interval=0.28, 0.43; see Table 1). Using visit data that were categorized as having met standard of care or not (at least two visits in a 6-month period versus fewer than two visits), the overall agreement was 79.3% (weighted κ=0.39, 0.37–0.47; data not shown). Most disagreement resulted from patients reporting more visits than were documented in medical records.
Table 2 shows the percent disagreement between self-report and medical record data by sociodemographic, behavioral, and clinical characteristics. Disagreement (at the p<0.20 level) was associated with younger age; female gender; high school, GED, or less education versus some college/college education; heterosexual versus gay, lesbian, or bisexual orientation; illicit drug/heavy alcohol use; having been diagnosed with HIV within the previous 6 months; and having a CD4 count <350 cells/mm3. While disagreement was not associated with our four-category race/ethnicity variable at the specified significance level, we observed a large difference in the proportion with disagreement for non-Hispanic whites and other race/ethnicity categories. We therefore conducted additional tests to determine the statistical significance of differences between non-Hispanic whites and each of the other categories. As disagreement was associated with race/ethnicity in the comparison between non-Hispanic whites and non-Hispanic blacks at the p<0.20 level (data not shown), race/ethnicity was included in our multivariable regression analyses.
Table 3 shows the results of the multivariable regression analyses, which indicated that age (adjusted odds ratio [AOR] for 20 versus 40 years=1.25, 95% confidence interval [CI] 0.98, 1.60), race/ethnicity (AOR for non-Hispanic black versus non-Hispanic whites=1.48, 95% CI 1.03, 2.13), education (AOR for high school education, GED, or less versus some college/college graduate=1.43, 95% CI 0.96, 2.13), and substance use (AOR for any illicit drug/heavy alcohol use in the past 6 months versus no use=1.39, 95% CI 1.02, 1.90) were independently associated with disagreement. The trend associations for younger versus older age and less versus greater education with disagreement between self-report and medical record data did not reach statistical significance at the p<0.05 level.
In this heterogeneous sample of socially and economically marginalized HIV-infected people recruited across the United States for participation in interventions to improve engagement and retention in HIV care, agreement between self-report and medical record data for the number of HIV primary care visits was poor by generally accepted standards for evaluating agreement (weighted κ=0.36). The four factors associated with disagreement were younger versus older age, non-Hispanic black versus non-Hispanic white race/ethnicity, less versus greater educational attainment, and illicit drug or heavy alcohol use.
These data raise concern about whether HIV health services research on HIV primary care visits in marginalized groups can rely on self-report data, and raise the question of whether such data from marginalized populations are directly comparable to similar data collected from other populations. Previous research examining HIV ambulatory care visits in populations that were not selected for being marginalized reported findings different from ours. For example, two studies compared self-report data to medical or billing records in samples of well-educated white men.18,19 Both studies concluded that self-reported ambulatory care visits were reasonably concordant with medical or billing records, although one noted that the concordance for ambulatory visits was lower than for inpatient visits and identified specific patterns associated with underreports and overreports.19 However, data from previous research on HIV-related ambulatory care visits in marginalized groups were consistent with the present study.32
We found that even though our sample comprised people with multiple social and economic disadvantages, illicit drug or heavy alcohol use stood out as a factor that was associated with lower agreement between self-report and medical records in terms of health services utilization data. It will be important to explore reasons for this pattern, which might include greater difficulty among people who use illicit drugs or drink heavily in accurately remembering and reporting their health services utilization, less complete medical records for this group, and other reasons. We were also uncertain why there was lower agreement in our sample for non-Hispanic black patients compared with non-Hispanic white patients, but speculate that the poorer overall relationship with health care providers and the health care system experienced by racial/ethnic minorities might influence the way health care utilization is both remembered and reported by the patients and recorded in medical records.33–35 Additionally, there was a trend for younger and less educated people within this marginalized HIV-infected population to have more disagreement between self-report and medical record data (p<0.20). Previous research also found that illicit drug users and people with less education were less likely to have concordant self-report and medical record data associated with various health care utilization behaviors.15,19 It may be important to keep these specific characteristics in mind when relying on self-report HIV health care utilization data.
While disagreement included both people who reported more and people who reported fewer visits than were recorded in medical records, in the majority of the cases participants reported more medical visits with their HIV primary care provider than we could confirm in medical records. There could be several reasons for this particular pattern. First, the participants in our sample may have misunderstood our questions about primary HIV care visits and included other types of medical visits in their self-reports. Because marginalized individuals tend to rely on acute care facilities more than others, it is possible that they may inaccurately perceive visits to these facilities as primary care visits.2,3 Second, participants may have been influenced by social desirability bias and reported having made the two visits in the past 6 months that providers typically encourage, even if those visits actually were never made. We acknowledge that this pattern might have been the result of incomplete medical record data, but we believe this is unlikely to fully explain our findings. Our medical record extraction process was thorough and included a number of quality control measures. Additionally, while it is possible that medical records fail to include all utilization information, medical visits by HIV-infected people typically include writing a prescription, ordering laboratory tests, or receiving/reviewing the results of laboratory tests, all of which typically require written orders/confirmation by the physician. Therefore, while possible, it is unlikely that HIV-related primary care visit data are completely missing from a large proportion of patient records.
Limitations of this study include the lack of a true gold standard to compare self-report data with, making it impossible for us to estimate the true amount of error in self-report data by comparisons with medical record data or use our findings to derive correction factors. Other data sources that might be used to compare self-report data against, such as clinic and insurance billing records, may also be problematic for some measures of utilization.36,37 These data sources are likely to be particularly problematic for populations that suffer from inconsistent insurance coverage. In our sample, although many people were likely to be eligible to receive public health insurance due to poverty or AIDS-related disability, some people did not qualify. Additionally, eligibility might have fluctuated over time for various reasons, such as changes in employment. Therefore, to date, there are no data indicating that any one source of HIV primary care visit information can be considered a gold standard for marginalized populations. Furthermore, generally one cannot assume that medical record data are necessarily more accurate than self-report data. However, in this study, rigorous methods for collecting and extracting medical record data were applied that included a standardized protocol, didactic training, and multiple quality checks. Therefore, although the medical record data collected in this study are not likely to be perfect, they are less prone to error than the self-report data collected in this study.
Despite the challenges associated with collecting medical record data for HIV health services research among marginalized populations, this study does not support a conclusion that unconfirmed self-report data are sufficiently valid substitutes for carefully collected medical record data. This is based on the poor chance-corrected agreement between these two data sources in our large, heterogeneous sample. If health services researchers must rely on self-report data, they might consider using outcome measures that have been shown to have better concordance between self-reports and other data sources than HIV primary care visits, such as CD4 count.15,32 Researchers should also rigorously incorporate known strategies to increase the accuracy of self-reports, such as asking redundant questions, confirming responses, and incorporating short recall periods.17
This research was supported by the Health Resources and Services Administration (Grants H97HA00247 and H97HA00191). These grants were funded through the HIV/AIDS Bureau’s Special Projects of National Significance. Additional support was provided by the Center for AIDS Research at the Albert Einstein College of Medicine/Montefiore Medical Center funded by NIH AI-51519 and by the Robert Wood Johnson Foundation’s Harold Amos Medical Faculty Development Program. The contents of this publication are solely the responsibility of the authors and do not necessarily represent the views of the funding agencies.
No competing financial interests exist.