Objective. Prevalence of colorectal cancer (CRC) screening is ascertained by self-reported screening, yet little is known about the accuracy of this method across different racial/ethnic groups, particularly Hispanics. The purpose of this study was to compare the accuracy of CRC self-report measures across three racial/ethnic groups.
Methods. During 2004 and 2005, 271 white, African-American and Hispanic participants were recruited from a primary care clinic in Southeast Texas, and their CRC testing histories based on self-report and on the medical record (the ‘gold standard’) were compared.
Results. Over-reporting was prevalent. Overall, up-to-date CRC test use was 57.6% by self-report and 43.9% by medical record. Racial/ethnic group differences were most pronounced for Hispanics in whom sensitivity was significantly lower for any up-to-date testing, fecal occult blood testing, flexible sigmoidoscopy and double contrast barium enema. There were no statistically significant differences across groups for over-reporting, specificity or concordance.
Conclusions. Self-report prevalence data are overestimating CRC test use in all groups; current measures are less sensitive in Hispanics.
Population-based estimates of the prevalence of colorectal cancer (CRC) screening in the USA have largely relied on patient self-reports.1,2 Yet, accurate determination of CRC screening is particularly challenging because an array of tests are recommended at differing frequencies, recommendations continue to change over time,3,4 and patients are unfamiliar with the tests.5 Establishing the accuracy of self-report measures is therefore very important, particularly in different racial/ethnic groups, because differences in CRC outcomes may be attributable to differences in screening uptake. Although there have been a number of recent studies on the accuracy of self-reported CRC screening behaviours,6–10 only one compared the accuracy of self-report CRC measures in different ethnic groups;9 none has examined accuracy of self-report based on recent guidelines,4,11 in Hispanics. Given that disparities in CRC screening exist,1,2 it is particularly important to ensure that prevalence estimates are accurately measured in all these groups. It is also important to test the accuracy of these measures in different groups because self-report measures are often used to evaluate the effect of CRC screening studies and programs. The purpose of this study, therefore, is to compare the accuracy of self-reported CRC testing across three major racial/ethnic groups in a diverse clinic sample.
Participants in this study were part of a parent study identifying factors associated with CRC screening.12 Parent study participants were recruited from a university-based family medicine clinic in Southeast Texas over a 16-month period during 2004 and 2005. The clinic serves a diverse mix of racial/ethnic groups from both urban and semi-rural areas. To be eligible, patients had to be 50–80 years of age and of non-Hispanic white, African-American or Hispanic race/ethnicity. Individuals with a past history of CRC or high risk of CRC (familial adenomatous polyposis syndrome, hereditary non-polyposis CRC or ulcerative colitis) were excluded but those with other common gastrointestinal diagnoses (polyps in the bowel, irritable bowel syndrome, diverticulosis or abdominal hernia) were included. A quota sampling scheme, balanced by race/ethnicity, age (<65, ≥65) and gender, was instituted to increase the statistical power for comparisons by race/ethnicity and age. Participants were eligible for inclusion in the validation arm of the study if they had been a patient at the institution since the age of 50 or for the last 10 years (whichever period was shorter) and had medical records available. Participants for the validation study (n = 300) were randomly selected from this group (n = 560). The sample size was calculated by assuming sensitivity estimates of 89–96% and specificity estimates of 86–97%13 and a prevalence of screening of 40%, in which case group sizes of 100 would provide sufficient power (alpha = 0.05 and beta = 0.20) to detect differences as small as 20%.
Self-reported CRC screening was assessed using validated core measures with minor adaptation following pilot testing.14 These measures include a description to help patients identify each of the following tests: fecal occult blood testing (FOBT), colonoscopy (COL), flexible sigmoidoscopy (FS) and double contrast barium enema (DCBE) followed by a series of questions to establish which tests were performed and when they were done. A Spanish language version was developed using standard back translation methods.15 We also collected sociodemographic information including the patient’s age, educational level, gender, income and insurance type.16 Race/ethnicity was self-reported and elicited with a two-part question consistent with federal criteria.17 In addition, health status and previous gastrointestinal diagnoses were also collected.
Documented screening was determined by a comprehensive review of patients’ complete medical records including: (i) the hospital-wide paper chart that is used by all specialists and includes copies of all laboratory (FOBT), radiology (DCBE) and procedure (COL, FS) reports; (ii) the outpatient electronic medical record that captures all family medicine outpatient visits and (iii) the electronic lab database that was instituted in 2002 and captures all FOBT results, DCBE reports and pathology specimens from COL or FS. For each subject, a target date range for abstraction was established as running from the date they turned 50 to the time of interview or, if they were >60 years, the previous 10 years. The abstraction protocol was finalized after pilot testing and abstractors were trained by the principal investigator (PI) (N.K.S.) and research associate (C.A.C.). Five abstractors were used; the PI and research associate rechecked charts that were inconclusive to confirm final results by consensus. Although it is possible that patients could have had testing elsewhere, it is unlikely because of the unique geography of the area (an island with no availability of other secondary or tertiary care providers); patients tend to stay within the associated university system for their testing.
Up-to-date screening in the medical record and by self-report was determined by whether the data from self-report or chart review revealed testing for any reason, according to the guidelines current at the time of the study:11 home FOBT within a year of the survey report or FS within 5 years or DCBE in the previous 5 years or COL in the previous 10 years. We evaluated sensitivity, specificity, raw agreement (concordance) and the report-to-records ratio of the self-report measures. These were calculated by determining the true positive, false positive, true negative and false negative rates by comparing the patient’s self-reported screening with the medical record. If a patient had multiple up-to-date tests of the same type, we used the most recent test.
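The up-to-date rule above can be illustrated with a minimal Python sketch. The look-back windows come from the text; the function names, the dictionary layout and the choice to treat the window boundaries as inclusive are assumptions made for illustration, not the study's abstraction protocol.

```python
from datetime import date

# Look-back windows in years, taken from the guideline intervals stated in the text.
LOOKBACK_YEARS = {"FOBT": 1, "FS": 5, "DCBE": 5, "COL": 10}


def years_before(day: date, years: int) -> date:
    """Return the date `years` years before `day` (29 Feb falls back to 28 Feb)."""
    try:
        return day.replace(year=day.year - years)
    except ValueError:
        return day.replace(year=day.year - years, day=28)


def is_up_to_date(most_recent: dict, survey_date: date) -> bool:
    """True if any test type has its most recent date inside its window.

    `most_recent` maps a test name to the date of the most recent test of
    that type, or to None when no test of that type was found.
    Assumption: both ends of the window are inclusive.
    """
    for test, years in LOOKBACK_YEARS.items():
        tested_on = most_recent.get(test)
        if tested_on is None:
            continue
        if years_before(survey_date, years) <= tested_on <= survey_date:
            return True
    return False


# Hypothetical participant: a colonoscopy 8 years before the survey counts,
# even though the FOBT is older than 1 year.
survey = date(2005, 6, 1)
print(is_up_to_date({"COL": date(1997, 3, 1), "FOBT": date(2003, 1, 15)}, survey))
```

The same function can be applied once to self-reported dates and once to chart-abstracted dates, giving the two binary indicators that are then compared.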
‘Sensitivity’ was defined as the proportion of participants with documentation of a test in the medical record who reported having had the test. ‘Specificity’ was defined as the proportion of those without a test in the record who reported not having the test. ‘Concordance’ was defined as the proportion of participants whose self-report on the survey agreed with the medical record. The ‘report-to-records ratio’ was defined as the total number of self-reported tests divided by the number documented in the record, so that values above 1 indicate over-reporting. We calculated two-sided 95% confidence intervals around the estimates for sensitivity, specificity, concordance and report-to-records using standard methods.18,19 Performance measures were reported for any up-to-date test according to the recommendations and by test type for the overall sample and by racial/ethnic subgroup. We used Tisnado’s criteria20 to evaluate the sensitivity and specificity of outpatient services: an estimate of <0.7 indicates poor agreement, ≥0.7 to <0.8 indicates fair agreement, ≥0.8 to <0.9 suggests good agreement and ≥0.9 indicates excellent agreement. Statistical significance was inferred by non-overlapping confidence intervals across tests and subgroups.
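As a rough illustration of these definitions, the Python sketch below computes the four measures from a 2 × 2 table of self-report against medical record, labels an estimate using the thresholds quoted above, and checks whether two intervals overlap. The counts are hypothetical, and the Wilson score interval is an assumption chosen for illustration; the text says only that ‘standard methods’ were used.

```python
import math


def accuracy_measures(tp: int, fp: int, fn: int, tn: int) -> dict:
    """Self-report accuracy measures, with the medical record as the reference.

    tp: test in record and reported     fp: not in record but reported
    fn: test in record, not reported    tn: not in record, not reported
    """
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "concordance": (tp + tn) / (tp + fp + fn + tn),
        # Self-reported tests divided by documented tests; >1 is over-reporting.
        "report_to_records": (tp + fp) / (tp + fn),
    }


def wilson_interval(successes: int, total: int, z: float = 1.96):
    """Two-sided Wilson score interval for a proportion (assumed method)."""
    if total <= 0:
        raise ValueError("total must be positive")
    p = successes / total
    denom = 1 + z**2 / total
    centre = (p + z**2 / (2 * total)) / denom
    half = z * math.sqrt(p * (1 - p) / total + z**2 / (4 * total**2)) / denom
    return centre - half, centre + half


def agreement_label(estimate: float) -> str:
    """Label an estimate using the thresholds quoted in the text."""
    if estimate >= 0.9:
        return "excellent"
    if estimate >= 0.8:
        return "good"
    if estimate >= 0.7:
        return "fair"
    return "poor"


def intervals_overlap(a, b) -> bool:
    """True if intervals a = (low, high) and b = (low, high) share any point."""
    return a[0] <= b[1] and b[0] <= a[1]


# Hypothetical counts, not the study's data.
measures = accuracy_measures(tp=45, fp=15, fn=5, tn=35)
for name, value in measures.items():
    print(f"{name}: {value:.2f}")
print(agreement_label(measures["sensitivity"]))
print(wilson_interval(45, 50))
```

Under the rule stated in the text, two subgroup estimates would be treated as significantly different only when `intervals_overlap` returns `False` for their confidence intervals. This is a conservative criterion, since intervals can overlap even when a formal test would detect a difference.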
The response rate for the parent survey was 53%, with no significant differences between respondents and non-respondents for age, gender or race/ethnicity.12 From this sample, we identified 390 eligible cases and randomly chose 300 for inclusion in the validation study. During the abstraction, we found that 29 patients did not have records for the entire period and were therefore excluded from further analyses, leaving 271 patients in the study. Of these, 33% were African-American, 34% were Hispanic and 33% were white. Fifteen surveys were completed in Spanish. Racial/ethnic group differences were noted for education, income and health status (Table 1). In total, 57.6% (n = 156) of the patients reported themselves as being up-to-date with screening with at least one type of test (whites: 65.2%, African-Americans: 55.6% and Hispanics: 52.2%), whereas 43.9% (n = 119) were up-to-date according to the medical record (whites: 49.4%, African-Americans: 44.4% and Hispanics: 38.0%) (Table 1). Twenty per cent of subjects (n = 54/271) were up-to-date with more than one type of test.
For any up-to-date testing in the sample as a whole, sensitivity was good (0.85), specificity was poor (0.69), concordance was fair (0.79) and over-reporting was prevalent (1.25). Subgroup analyses revealed that sensitivity for any up-to-date testing was significantly lower in Hispanics compared to whites: sensitivity was only fair in Hispanics (0.77), good in African-Americans (0.83) and excellent in whites (0.93).
Examination by test type included the most recent up-to-date test of each type (so if there were multiple tests of the same type, only the most recent was included, Table 2). In the entire sample, sensitivity varied by test type: the highest being good for COL (0.89), fair for FOBT (0.70) and poor for DCBE (0.45) and FS (0.5). The sensitivity for COL was significantly higher than for all three other tests. Specificity and concordance were overall in the good range (≥0.80) and were comparable across tests. Over-reporting was prevalent for all tests, but the only statistically significant difference was that over-reporting for DCBE was higher than for COL. Racial/ethnic subgroup analyses revealed that sensitivity in Hispanics was in the poor range for FOBT, FS and DCBE and was significantly lower than for African-Americans or whites for these tests; no African-American/white differences were noted. We found no differences in sensitivity for any of the tests based on the level of education of the participants. Specificity and concordance did not vary much across groups for any test and were predominantly in the good range. All groups over-reported all tests but there were no statistically significant group differences observed.
This study is one of the first to examine self-reported CRC testing accuracy for four recommended CRC screening tests in Hispanics compared to other groups. Our study suggests that there are significant differences in self-report measure sensitivity across racial/ethnic groups, but that all groups over-report tests. This finding is in keeping with prior qualitative work in this triethnic population that revealed patients had difficulty distinguishing between different CRC tests and often mentioned upper gastrointestinal tests as CRC tests as well.5 These survey items were designed to address that difficulty but may need further development to address this problem in Hispanics. We found only one study comparing accuracy of self-reported CRC test use in Hispanics with other groups;21 that study was conducted before the widespread dissemination and promotion of the guidelines that began in 199722 and did not examine COL or DCBE. Its authors found no difference in sensitivity or specificity for either FOBT or FS measures. However, they examined only a 2-year time frame, and few FS tests were done in Hispanics. In a recent meta-analysis,23 the accuracy of self-report for other types of cancer screening tests was compared across racial/ethnic groups. Mammography and digital rectal exam self-reports were less sensitive and Pap test self-reports were less specific in Hispanics compared to other groups. In our study, we observed better sensitivity for self-report of FOBT, FS and DCBE in African-Americans compared to Hispanics, but we did not observe statistically significant differences in any self-reported CRC test measure between African-Americans and whites. One other recent study comparing CRC self-report accuracy in African-American and white predominantly male veterans also found no differences.9
Our sample estimates of sensitivity and specificity for each of the four tests are in line with other recent studies examining the accuracy of CRC self-report.7–10 Consistent with our findings, they observed the highest sensitivity for COL recall (0.77–0.92), lowest for BE (0.49–0.74) with intermediate values for FS (0.75–0.87) and FOBT (0.56–0.93). These studies also found that the specificity of the tests varied much less across tests (range 0.72–0.97), and that there was over-reporting for all tests, particularly DCBE.
Strengths of our study include that it is one of the first to examine the accuracy of self-report CRC test use in Hispanics. We maximized the ascertainment of completed tests through use of multiple sources of chart data, by restricting our inclusion criteria to patients in the system for a sufficient length of time to adhere to the guidelines for all of the CRC tests, and by taking advantage of the relative geographic isolation of the university system, which maximized the chances that patients did not go elsewhere for health care. However, there are some limitations of the study. First, our response rate was modest at 53%; however, we did determine that our respondents were no different to non-respondents and were representative of the clinic population. Second, because we included only those with access to health care who had medical record availability, our findings should be generalized with caution to other populations and settings. Our estimates, however, were similar to those reported in other studies. Third, we were not able to analyze the characteristics of the Spanish version of the measures because of a small sample size. Fourth, because of our small sample size, we did not control for racial/ethnic group differences in education, income or health status that may have influenced our findings, although a separate analysis by educational level across all groups showed no differences in sensitivity measures.
Developing survey items for multiethnic groups is fraught with difficulties24,25 because responding to a survey is a complex process. It has been conceptualized as involving four different stages, each of which can lead to inaccuracies in reporting. During the ‘comprehension’ stage, the respondent interprets the meaning of the question; in the ‘retrieval’ stage, the respondent relies on long-term memory for relevant information; in the ‘estimation/judgment’ stage, the respondent assesses the information retrieved and its relevance to the question and chooses to accept or reject the information. In the final ‘response’ stage, the respondent weighs factors such as sensitivity and social desirability of the responses and then decides what answer to give. Cultural differences among groups could affect the process at any of these stages and may have done so in our population.26 Previous studies have reported ethnic group differences in response style24 and have reported that measures of preventive behaviors are particularly problematic across cultures.25 The measures used here were cognitively tested but not in Hispanics in the target age group. Future studies should cognitively test measures in all groups and ages.27
The implication of these findings is that national prevalence estimates may be overstating screening rates in all groups, and that the observed racial/ethnic group differences in test use are real, especially for FOBT, FS and DCBE. Overall, the accuracy of these self-report measures is acceptable, with COL self-report displaying the least racial/ethnic variation in accuracy. Since secular trends in CRC test use point to greater use of COL and declining rates of FS, FOBT1,28 and DCBE29 in the USA, the prevalence of inaccurate group estimates may be attenuated as COL becomes the preponderant CRC test. However, further challenges in accurately measuring CRC screening test use will remain, such as the addition of new tests to the recommendations.3 Self-report measures of new tests will need to be developed, and our work points to the vital importance of cognitive testing of new measures in all population subgroups to enhance understanding of the cultural influences on the cognitive, emotional and judgment processes utilized by survey respondents.
Funding: John Sealy Memorial Endowment Fund for Biomedical Research; and the National Cancer Institute (grant no. K07 CA107052).
Ethical approval: The study was IRB approved by the University of Texas Medical Branch Institutional Review Board.
Conflict of interest: none.
We would like to thank Alma Salazar, Mary Jane Simmons and Gustavo Andrade for assistance with data collection.