Large-scale disasters may disrupt health surveillance systems, depriving health officials and researchers of timely and accurate information needed to assess disaster-related health effects and leading to use of less reliable self-reports of health outcomes. In particular, ascertainment of cancer in a population is ordinarily obtained through linkage of self-reported data with regional cancer registries, but exclusive reliance on these sources following a disaster may result in lengthy delays or loss of critical data. To assess the impact of such reliance, we validated self-reported cancer in a cohort of 59,340 responders and survivors of the World Trade Center disaster against data from 11 state cancer registries (SCRs).
We focused on residents of the 11 states with SCRs and on cancers diagnosed from September 11, 2001, to the date of their last survey participation. Medical records were also sought in a subset of 595 self-reported cancer patients who were not recorded in an SCR.
Overall sensitivity and specificity of self-reported cancer were 83.9% (95% confidence interval [CI] 81.9, 85.9) and 98.5% (95% CI 98.4, 98.6), respectively. Site-specific sensitivities were highest for pancreatic (90.9%) and testicular (82.4%) cancers and multiple myeloma (84.6%). Compared with enrollees with true-positive reports, enrollees with false-negative reports were more likely to be non-Hispanic black (adjusted odds ratio [aOR] = 1.8, 95% CI 1.2, 2.9) or Asian (aOR=2.2, 95% CI 1.2, 4.1). Among the 595 cases not recorded in an SCR, 13 of 62 (21%) cases confirmed through medical records were reportable to SCRs.
Self-report of cancer had relatively high sensitivity among adults exposed to the World Trade Center disaster, suggesting that self-reports of other disaster-related conditions less amenable to external validation may also be reasonably valid.
Identification and tracking of long-term physical and mental health effects of man-made and natural disasters often rely on self-reports of clinically diagnosed health conditions obtained longitudinally through questionnaires,1 but the accuracy of such self-reported diagnoses is often difficult to assess because of large sample sizes and limited funding. Accuracy of self-reported health data can depend on such factors as type of condition, study population demographics, and recall period length.2–9
Linkage of cohort data with independent cancer registries enables assessment of cancer self-report accuracy and may reflect self-report accuracy of other endpoints for which no outside data are available. The sensitivity of self-reported cancer is reported to vary by site, treatment, and number of previous tumors.6,10–12 False-negative self-report of cancer diagnoses determined by comparison with cancer registry data is associated with older age,6,11,13 nonwhite race, increased time since cancer diagnosis,6 lower education level,2,8 male sex, and urban living.2
The World Trade Center (WTC) Health Registry (WTCHR) has followed a cohort of 71,434 people since 2003 to identify and track the long-term health effects of the September 11, 2001 (hereinafter 9/11), terrorist attacks through surveys.14 Although the WTCHR periodically requests to compare its data with those of other health registries, such as state cancer registries (SCRs) and the Statewide Planning and Research Cooperative System, the time lags between diagnosis and data availability for comparison often limit timely surveillance of emerging or rare health conditions. Thus, the accuracy of self-reported health information, including cancer diagnosis, provided by WTCHR enrollees at enrollment and in subsequent waves is important. We examined the performance of self-reported health data collected by the WTCHR, using cancer as an example, by comparing WTCHR survey data with data obtained via linkage with SCRs to validate self-reported cancer diagnosis. We also examined correlates of false-negative and false-positive reports.
The WTCHR is a cohort study of 71,434 people who were directly exposed to the destruction of the WTC and surrounding buildings and its aftermath. Details of eligibility and recruitment are available elsewhere.14,15 In brief, people with potential exposure were recruited from lists of businesses and building occupants and via public outreach and media campaigns. Potential enrollees were screened for eligibility, with those enrolled belonging to one or more of the following groups: rescue/recovery workers and volunteers, lower Manhattan residents, area workers, passersby, schoolchildren, and school staff members. Baseline data (Wave 1) were gathered in 2003–2004 via computer-assisted telephone interview (95%) or in-person interview (5%). Wave 1 data included data on demographics, current health status, medical history, and exposure. An adult follow-up survey (Wave 2) conducted from November 2006 through December 2007 (response rate: 68%, n=46,602/68,959) obtained additional exposure data and updated health information.16 The final cohort (n=71,434) consisted of 30,664 (42.9%) responders (rescue/recovery workers and volunteers) and 40,770 (57.1%) survivors (residents, passersby, area workers, schoolchildren, and school staff members in Lower Manhattan on the morning of 9/11).
The study sample for this analysis was limited to adult enrollees aged ≥18 years at enrollment (Wave 1) who completed cancer questions in Wave 1 or follow-up (Wave 2) surveys. The study was also limited to residents since 9/11 of the 11 states (California, Connecticut, Florida, Massachusetts, New Jersey, New York, North Carolina, Ohio, Pennsylvania, Texas, and Washington) in which we conducted cancer record linkage. We excluded proxies and withdrawals and included only cancers—reported either by enrollees or the SCRs—that were diagnosed between September 12, 2001, and the date of their last survey participation (Wave 1 or Wave 2). A total of 59,340 enrollees met inclusion criteria (Figure 1).
In Wave 1, we asked (1) “Have you ever been told by a doctor or other health professional that you had cancer or a malignancy of any kind?”, (2) “Did a doctor or other health professional first tell you that you had cancer or a malignancy of any kind before 9/11 or after 9/11?”, and (3) “What kind of cancer was it?” with a drop-down menu of cancer sites. Wave 2 had similar questions, but the question about cancer site required an open-ended response and inquired about year of diagnosis. A self-reported post-9/11 cancer was defined as a positive answer to the first cancer question and a reported diagnosis after 9/11.
We matched enrollees with the people registered in the 11 SCRs, all of which adopted Link Plus, a probabilistic record linkage program developed by the Centers for Disease Control and Prevention.17 We provided full name, sex, race/ethnicity, birth date, complete address of residence, and social security number when available to each SCR. Matches were reported by each SCR; also reported by each SCR was information on primary cancer site(s), histology, stage, diagnosis date, and state where cancer was diagnosed. Linked cancer data from all SCRs were available through December 31, 2008. Cancer site was defined according to the International Classification of Diseases for Oncology, Third Edition, and grouped by using the Surveillance Epidemiology and End Results (SEER) site recode codes for primary site and histology.18
We investigated self-reports of cancer that were not also confirmed by the SCRs by contacting enrollees' physicians for confirmation of cancer diagnoses. This investigation required first contacting the enrollees for permission to communicate with their physicians and then reviewing relevant medical reports and records. Because of limited resources, we carried out this investigation from June 2009 to January 2010 only among the 595 enrollees then living in New York State.
We evaluated the performance of self-report using sensitivity, specificity, and positive predictive value (PPV). We defined a true-positive report of cancer as a self-report of cancer recorded in an SCR, a false-negative report of cancer as the absence of a self-report for a cancer that was recorded in an SCR, a false-positive report of cancer as a self-report not recorded in an SCR, and a true-negative report of cancer as the absence of both a self-report and an SCR record (Figure 2). We defined sensitivity as the proportion of true-positive reports among enrollees with a cancer recorded in an SCR (true-positive and false-negative reports). We defined specificity as the proportion of true-negative reports among enrollees with no cancer recorded in an SCR (true-negative and false-positive reports). We defined PPV (i.e., agreement of self-report with SCR) as the proportion of true-positive reports among all self-reports (true-positive and false-positive reports).
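These definitions can be illustrated with a minimal Python sketch that uses counts reported elsewhere in this article (1,047 true-positive reports, 201 false-negative reports, 862 self-reports not recorded in an SCR, and 59,340 eligible enrollees). Two points are assumptions rather than statements of the authors' method: the true-negative count is derived by subtraction, and the confidence interval is a simple Wald interval, which may differ from the interval used in the original analysis. The function names are illustrative only.

```python
from math import sqrt


def proportion_with_wald_ci(successes, total, z=1.96):
    """Return (estimate, lower, upper) for a proportion with a Wald interval."""
    p = successes / total
    half_width = z * sqrt(p * (1 - p) / total)
    return p, max(0.0, p - half_width), min(1.0, p + half_width)


def self_report_performance(tp, fn, fp, tn):
    """Sensitivity, specificity, and PPV of self-report against SCR records."""
    return {
        "sensitivity": proportion_with_wald_ci(tp, tp + fn),
        "specificity": proportion_with_wald_ci(tn, tn + fp),
        "ppv": proportion_with_wald_ci(tp, tp + fp),
    }


# Counts reported in the article.
TP, FN, FP = 1047, 201, 862
# Assumption: every eligible enrollee falls in exactly one of the four cells,
# so the true-negative count is obtained by subtraction.
TN = 59340 - (TP + FN + FP)

performance = self_report_performance(TP, FN, FP, TN)
for name, (estimate, lower, upper) in performance.items():
    print(f"{name}: {100 * estimate:.1f}% "
          f"(95% CI {100 * lower:.1f}, {100 * upper:.1f})")
```

Note that the false-positive count used here includes self-reports of non-melanoma or unspecified skin cancer, which are not reportable to SCRs, so the PPV from this sketch is an overall figure rather than a site-specific one.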
A matched cancer site refers to verification of the anatomical site of the self-reported cancer using SEER site recode rules. The term false-positive is used for consistency with standard validation analyses and does not imply that those reporting cancer actually do not have cancer. We recognize that a cancer may not have been reported to the SCR for various reasons. Understanding that a matched individual may not match to the cancer site, we assessed the performance of self-report separately, by individual and by site.
We performed bivariate and multivariable analyses using logistic regression modeling to assess whether enrollees' sociodemographic characteristics, medical history, or number of cancer sites were associated with either false-negative or false-positive reports. We computed unadjusted odds ratios (ORs), adjusted odds ratios (aORs), and 95% confidence intervals (CIs). The OR represents the odds of having a false-negative or false-positive report given the presence of a variable of interest, compared with the odds of having such a report given the absence of that variable. We computed aORs and 95% CIs using a multivariable model in which variables such as sociodemographics, Wave 2 participation, history of other medical conditions, and probable posttraumatic stress disorder (PTSD) were included if they were significantly associated with false-negative or false-positive reporting in bivariate analyses. We evaluated socioeconomic status as a combination of annual household income and education at three levels: low (education ≤12 years and annual household income <$25,000), high (≥college degree and annual household income ≥$50,000), and intermediate (between low and high socioeconomic status). In analyzing people with false-positive reports, we excluded enrollees who reported only non-melanoma or unspecified skin cancers because these tumors are not reportable to SCRs.
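As an illustration of the unadjusted OR described above, the following Python sketch computes an OR and a 95% CI from a 2×2 table. The counts are hypothetical and are not taken from this study. The log-based (Woolf) interval is an assumption chosen for simplicity; the study's ORs came from logistic regression, and the adjusted estimates cannot be reproduced from a single table.

```python
from math import exp, log, sqrt


def odds_ratio_with_ci(fn_with, tp_with, fn_without, tp_without, z=1.96):
    """Unadjusted odds ratio of a false-negative report (vs. true-positive)
    for enrollees with a characteristic versus those without it,
    with a Woolf (log-based) confidence interval."""
    odds_ratio = (fn_with * tp_without) / (tp_with * fn_without)
    se_log_or = sqrt(1 / fn_with + 1 / tp_with + 1 / fn_without + 1 / tp_without)
    lower = exp(log(odds_ratio) - z * se_log_or)
    upper = exp(log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Hypothetical counts for illustration only (not from the study):
# 30 false-negative and 70 true-positive reports among enrollees with the
# characteristic; 20 false-negative and 180 true-positive reports without it.
odds_ratio, lower, upper = odds_ratio_with_ci(30, 70, 20, 180)
print(f"OR = {odds_ratio:.2f} (95% CI {lower:.2f}, {upper:.2f})")
```

An interval that excludes 1.0 would indicate an association at the conventional 5% level in this simplified, unadjusted setting.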
We also examined whether or not a positive response to the Wave 1 cancer question was consistent with the Wave 2 response by computing percentages and 95% CIs of agreement. We performed all data analyses using SAS® version 9.2.19
Overall sensitivity and specificity of self-report were 83.9% (95% CI 81.9, 85.9) and 98.5% (95% CI 98.4, 98.6), respectively. Sensitivity was greater among participants in Wave 1 and Wave 2 than among participants in Wave 1 only (87.5% vs. 67.3%); however, specificity was not substantially different between these two groups (Table 1).
Of 1,909 enrollees who reported a cancer diagnosis, 1,393 (73.0%) specified a single cancer site, 80 (4.2%) reported two or more sites, and 436 (22.8%) did not specify any site. Sensitivity varied by site. The highest site-specific sensitivities for self-report of cancer were observed for pancreatic (90.9%), multiple myeloma (84.6%), and testicular (82.4%) cancers. The highest PPV of self-reported cancer was for multiple myeloma (100.0%), followed by prostate (93.5%) and testicular (93.3%) cancers, while the lowest was for melanoma of the skin (40.5%) (Table 1).
Of the 1,248 enrollees with a primary cancer site recorded in an SCR, 201 (16.1%) did not report having any cancer diagnosed in either wave (Table 1). Compared with the 1,047 enrollees with true-positive reports, enrollees with false-negative reports were more likely to be Wave 2 nonparticipants (aOR=2.9, 95% CI 2.0, 4.1), to be non-Hispanic black (aOR=1.8, 95% CI 1.2, 2.9) or Asian (aOR=2.2, 95% CI 1.2, 4.1), or to have not provided a social security number (aOR=1.6, 95% CI 1.0, 2.6) (Table 2). Enrollee characteristics such as age, sex, smoking status, and history of cardiovascular disease, emphysema, diabetes, or PTSD were not associated with false-negative reports. A non-English-language Wave 1 interview was strongly associated with false-negative report in bivariate analyses (OR=3.7, 95% CI 2.0, 6.9). However, this variable was excluded from multivariable analyses because of strong collinearity with socioeconomic status.
A total of 1,909 enrollees reported having post-9/11 cancer diagnosed during the study period, of whom 862 (45.2%) were not recorded in an SCR. Of these 862 enrollees, 31 (3.6%) reported more than one cancer and 390 (45.2%) reported non-melanoma or unspecified skin cancer that was not reportable to an SCR.
After excluding the 390 non-reportable non-melanoma or unspecified skin cancer cases, we compared 472 enrollees who had false-positive reports with 1,047 enrollees who had true-positive reports (Table 3). Several factors that were not associated with false-negative reports were associated with false-positive reports as compared with true-positive reports. In the adjusted model, enrollees with false-positive reports were more likely than enrollees with true-positive reports to be younger (aOR=3.6, 95% CI 1.0, 13.6 for adults aged 18–24 years; aOR=2.3, 95% CI 1.8, 3.0 for adults aged 25–44 years), to be current smokers rather than never smokers (aOR=1.9, 95% CI 1.3, 2.6), and to have probable PTSD (aOR=1.7, 95% CI 1.3, 2.3). Factors such as race/ethnicity and providing a social security number that were associated with false-negative reports were not associated with false-positive reports. Those who did not participate in Wave 2 were less likely than those who did to have false-positive reports (aOR=0.6, 95% CI 0.4, 0.8).
In the sub-analysis of 595 New York State residents for whom we sought medical record confirmation of self-reported cancer, 260 enrollees (43.7%) responded to our investigation; 138 (53.1%) claimed they had never had a cancer diagnosis, 92 (35.4%) gave permission for us to obtain further information from their physician, and 30 (11.5%) refused to give permission. Among those who gave permission, 62 self-reported cancer cases (67.4%) were physician confirmed, 19 (20.7%) were physician confirmed as not having cancer, eight (8.7%) were not confirmed because the treating physicians failed to provide information, and three (3.3%) were not confirmed because medical records could not be located. Among the 62 self-reported cancers confirmed by primary or treating physicians, 13 were reportable cancer cases and the remaining were non-reportable cancers because they were either non-melanoma skin cancer or benign tumors.
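Because the denominator changes at each step of this sub-analysis, a short Python sketch may help in tracing the percentages. The counts are those reported in the paragraph above; the helper name `pct` and the variable names are illustrative only.

```python
def pct(part, whole):
    """Percentage of `whole` represented by `part`, to one decimal place."""
    return round(100 * part / whole, 1)


sought = 595        # New York State residents with unconfirmed self-reports
responded = 260     # enrollees who responded to the investigation

# Outcomes among responders (denominator: 260).
never_diagnosed, gave_permission, refused = 138, 92, 30

# Outcomes among those who gave permission (denominator: 92).
confirmed, no_cancer, no_physician_info, records_missing = 62, 19, 8, 3

# Reportable cancers among physician-confirmed cases (denominator: 62).
reportable = 13

# Each level should account for every enrollee in its parent group.
assert never_diagnosed + gave_permission + refused == responded
assert confirmed + no_cancer + no_physician_info + records_missing == gave_permission

flow = [
    ("responded", responded, sought),
    ("never had a cancer diagnosis", never_diagnosed, responded),
    ("gave permission", gave_permission, responded),
    ("refused permission", refused, responded),
    ("physician confirmed", confirmed, gave_permission),
    ("confirmed as not cancer", no_cancer, gave_permission),
    ("reportable to an SCR", reportable, confirmed),
]
for label, part, whole in flow:
    print(f"{label}: {part}/{whole} = {pct(part, whole)}%")
```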
When asked at Wave 1, “Have you ever been told by a doctor or other health professional that you had cancer or a malignancy of any kind?”, 611 of 39,531 people who participated in both Wave 1 and Wave 2 answered affirmatively. When asked the same question in Wave 2, 560 (91.7%) of these 611 enrollees again answered affirmatively, and 51 (8.3%) provided inconsistent answers. The cancer linkage proportion was 60.9% (n=341/560) for enrollees who provided consistent answers and 17.6% (n=9/51) for those who provided inconsistent answers.
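The agreement and linkage proportions above follow directly from the reported counts, as the Python sketch below shows. The variable names are illustrative, and no confidence intervals are computed because the interval method used in the original analysis is not stated here.

```python
def percent(part, whole):
    """Percentage rounded to one decimal place."""
    return round(100 * part / whole, 1)


affirmative_wave1 = 611   # answered yes at Wave 1 and took part in both waves
consistent = 560          # answered yes again at Wave 2
inconsistent = affirmative_wave1 - consistent

linked_consistent = 341    # consistent responders with an SCR match
linked_inconsistent = 9    # inconsistent responders with an SCR match

agreement = percent(consistent, affirmative_wave1)
disagreement = percent(inconsistent, affirmative_wave1)
linkage_consistent = percent(linked_consistent, consistent)
linkage_inconsistent = percent(linked_inconsistent, inconsistent)

print(f"Consistent answers: {agreement}%; inconsistent: {disagreement}%")
print(f"Linked to an SCR: {linkage_consistent}% (consistent) "
      f"vs. {linkage_inconsistent}% (inconsistent)")
```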
This study examined the performance of self-reported cancer diagnoses in a 9/11-exposed cohort and identified correlates of false-negative and false-positive reporting. Sensitivity of self-reported cancer among those who responded to both surveys was relatively high (87.5%). It was also higher than in several other U.S. studies using cancer registry data as the gold standard (60.8%–74.2%)3,6,20 and similar studies conducted in other countries (40.0%–57.5%).8,13,21 The relatively high sensitivity in this study may be attributable to the use of multiple SCRs, to giving participants an opportunity to report a diagnosis twice (during Wave 1 and Wave 2), or to the relatively short recall period (2–6 years) compared with other studies (>20 years).6,11
The site-specific sensitivity varied considerably and was low for cancers in the oral cavity and the pharynx (22.2%) or brain and other nervous system (31.8%), indicating substantial underreporting of these cancer sites in the study sample. False-negative reporting among non-Hispanic black and Asian patients and those who did not provide a social security number was also prevalent. Similar findings were reported in a community-based study where the most often underreported cancer sites were central nervous system and lip, oral cavity, and pharynx, and nonwhite patients were 10 times more likely than white patients to provide a false-negative report of their cancer history.6
Knowing the patterns of false-negative reporting is important for surveillance of cancer incidence among WTCHR enrollees because high validity of self-reported cancer data collected from follow-up surveys may partially compensate for the delay in time from diagnosis to data availability in the cancer registry and provide insight into the validity of cancers that are not reportable to the SCR. Studies examining false-negative reporting have cited factors such as being less informed about the diagnosis (possibly related to cultural differences in communication) or mistrust of health-care professionals or study interviewers.6,13 Researchers might improve reporting of health conditions by adopting established methods for enhancing response accuracy, which include providing a clear definition of the respondent's task, improving the respondent's motivation, and facilitating cognitive processing.22 These methods are particularly important because African Americans had the highest overall cancer incidence and mortality rates between 2000 and 2004 among all races in the United States23 and high incidence and mortality rates overall for other health conditions.24–26
Not completing a follow-up survey (Wave 2) was associated with false-negative reporting, which may have resulted from not having a second opportunity to report a cancer diagnosis or because of confounding factors associated with both nonparticipation and a cancer diagnosis. Those who did not complete the Wave 2 survey were more likely than those who completed Waves 1 and 2 to be older (≥65 years of age), have a lower annual household income (<$25,000), or be current smokers, all of which are reported to be associated with false-negative reporting.2,6,8,11
This study had several limitations. First, a self-reported cancer could not be verified in this study if it was reported to a registry outside the 11 SCRs, which is one possible explanation for false-positive reporting. Second, some cancers that were reported on the survey may not be reportable to SCRs. In our investigation of a subset of reports of cancer that were not confirmed by the SCRs, a large proportion of cancers confirmed by treating physicians were not reportable to an SCR. Thus, we would expect this discrepancy to result in some false-positive reports. Completeness and accuracy of data from the 11 SCRs used in this study have been reported to be relatively high,27,28 so incomplete registry data are an unlikely explanation for false-positive reports. Third, self-report is subject to recall bias. Cancer was not covered by the September 11 Victim Compensation Fund until 2012,29 five years after data used in this study were collected. As such, false-positive reports were unlikely to have been influenced by a desire to obtain financial support from the fund.
Published WTC-related studies have largely relied on self-reported data. The findings in this study support the use of survey data for ongoing timely surveillance of adverse health conditions such as cancer. Self-reported cancer can be used to complement cancer linkage with cancer registries.
This study was supported by cooperative agreements #5U50/OH009739 and #1E11/OH009630 from the National Institute for Occupational Safety and Health (NIOSH), Centers for Disease Control and Prevention (CDC); #U50/ATU272750 from the Agency for Toxic Substances and Disease Registry, CDC, which included support from the National Center for Environmental Health, CDC, and the New York City Department of Health and Mental Hygiene (NYC DOHMH). The contents of this article are solely the responsibility of the authors and do not necessarily represent the official views of NIOSH or CDC. The authors assume full responsibility for analyses and interpretation of these data.
The authors acknowledge the support and collaborative effort of 11 state cancer registries for carrying out cancer record linkages, especially the following people: Allyn Ami, MPH, and Marilyn Kempster, MPH (California); Lou Gonsalves, PhD (Connecticut); Brad Wohler, MS (Florida); Pamela K. Agovino, MPH (New Jersey); Amy R. Kahn, MS (New York State); Chandrika Rao, PhD (North Carolina); Richard Knowlton, MS, MA, and BJ Mattson, MAE, MSTE (Ohio); Michelle Esterly, RHIA (Pennsylvania); Paul Betts, MS (Texas); and Mahesh Keitheri Cheteri, PhD (Washington). The authors also thank Lennon Turner, Renato Dasilva, and Rafaela Cruzado from the NYC DOHMH for their help in medical record confirmation; and Carolyn Greene, MD, Amy R. Kahn, and Sharon Perlman for their helpful comments.
This study was approved by the Institutional Review Board of NYC DOHMH. Cancer registry record linkages were approved by the respective institutional review boards of the state departments of health in California, Connecticut, Florida, Massachusetts, New York, North Carolina, Ohio, Pennsylvania, Texas, Washington, and Rutgers University (for New Jersey).