Because existing risk prediction models for lung cancer were developed in white populations, they may not be appropriate for predicting risk among African-Americans. Therefore, a need exists to construct and validate a risk prediction model for lung cancer that is specific to African-Americans. We analyzed data from 491 African-Americans with lung cancer and 497 matched African-American controls to identify specific risks and incorporate them into a multivariable risk model for lung cancer and estimate the 5-year absolute risk of lung cancer. We performed internal and external validations of the risk model using data on additional cases and controls from the same ongoing multiracial/ethnic lung cancer case-control study from which the model-building data were obtained as well as data from two different lung cancer studies in metropolitan Detroit, respectively. We also compared our African-American model with our previously developed risk prediction model for whites. The final risk model included smoking-related variables [smoking status, pack-years smoked, age at smoking cessation (former smokers), and number of years since smoking cessation (former smokers)], self-reported physician diagnoses of chronic obstructive pulmonary disease or hay fever, and exposures to asbestos or wood dusts. Our risk prediction model for African-Americans exhibited good discrimination [75% (95% confidence interval, 0.67−0.82)] for our internal data and moderate discrimination [63% (95% confidence interval, 0.57−0.69)] for the external data group, which is an improvement over the Spitz model for white subjects. Existing lung cancer prediction models may not be appropriate for predicting risk for African-Americans because (a) they were developed using white populations, (b) level of risk is different for risk factors that African-Americans share with whites, and (c) unique group-specific risk factors exist for African-Americans.
This study developed and validated a risk prediction model for lung cancer that is specific to African-Americans and thus more precise in predicting their risks. These findings highlight the importance of conducting further ethnic-specific analyses of disease risk.
Lung cancer risk prediction models are helpful in identifying subgroups of smokers who may exhibit higher risks of lung cancer and may therefore benefit disproportionately from early detection programs and other interventions. Recently developed lung cancer risk models (1–4) have focused predominantly on white populations. Although minority groups such as African-Americans share similar lung cancer risk factors with whites, the level of risk may differ depending on the type and level of exposure. Moreover, African-Americans may exhibit unique exposures (such as occupational exposures) that need to be included in the risk prediction models to improve accuracy. Hence, a need exists to develop and validate group-specific risk prediction models.
The goal of this analysis was to develop and validate a lung cancer risk prediction model that is specific to African-Americans. We present a comprehensive epidemiologic analysis of risk factors for lung cancer in African-Americans that was derived from an ongoing case-control study. Using these risk factors, we further develop and validate (using an independent set of internal data as well as external data) a risk prediction model for lung cancer applicable to African-Americans.
As part of a multiracial/ethnic lung cancer case-control study, we recruited study participants from The University of Texas M. D. Anderson Cancer Center and the Michael E. DeBakey VA Medical Center, both in Houston, from 1995 to 2005. All cases with newly diagnosed, histopathologically confirmed, and untreated lung cancer were eligible for the study. Case exclusion criteria for the study included prior chemotherapy or radiotherapy or recent blood transfusion. We recruited our control population from Houston area community centers and the Kelsey-Seybold Clinic, Houston's largest multispecialty physicians group practice. Potential controls were first surveyed with a brief questionnaire about their willingness to participate in research studies and provide preliminary data to assist us in matching demographic characteristics with those of the cases (5). Controls were matched to the cases based on age (±5 y), sex, and ethnicity. To date, the response rate among both the cases and controls has been ~75%.
In this analysis, we focused on a subset of cases and controls who self-reported as being black (African-American). These individuals represented ~14% of the overall study population.
The institutional review boards of the M. D. Anderson Cancer Center, Michael E. DeBakey VA Medical Center, and Kelsey-Seybold Clinic approved this research. Each participant signed an informed consent and completed a personal interview.
Never smokers were defined as those who had smoked <100 cigarettes in their lifetimes, former smokers had quit smoking >1 y before diagnosis (cases) or interview (controls), and current smokers included recent quitters who had quit smoking within the past 12 mo. Pack-years for former and current smokers were calculated as the years smoked times the average number of cigarettes per day divided by 20. Smokers were also asked to report their use of mentholated cigarettes, and former smokers to report the age at which they stopped smoking and the years since smoking cessation.
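The smoking definitions above can be sketched in a few lines of Python. This is an illustrative sketch only: the function and argument names are hypothetical, and the handling of a missing quit time (raising an error) is an assumption that the text does not address.

```python
from typing import Optional


def pack_years(years_smoked: float, cigarettes_per_day: float) -> float:
    """Pack-years = years smoked x average cigarettes per day / 20."""
    return years_smoked * cigarettes_per_day / 20.0


def smoking_status(lifetime_cigarettes: int,
                   currently_smokes: bool,
                   years_since_quit: Optional[float] = None) -> str:
    """Classify a participant as 'never', 'current', or 'former'.

    years_since_quit is measured back from diagnosis (cases) or
    interview (controls).
    """
    if lifetime_cigarettes < 100:
        return "never"
    if currently_smokes:
        return "current"
    if years_since_quit is None:
        # Assumption of this sketch: refuse to guess when the quit time is unknown.
        raise ValueError("years_since_quit is required for participants who have quit")
    # Recent quitters (within the past 12 months) are counted as current smokers.
    return "current" if years_since_quit <= 1.0 else "former"
```

For example, 30 years of smoking at 20 cigarettes per day corresponds to `pack_years(30, 20)`, that is, 30 pack-years.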
We classified participants as positive for asbestos exposure if they self-reported such exposure for at least 8 h per week for a year or if they reported having held a job within a documented asbestos-related industry [according to the Standard Industrial Classification Manual (1972) and the Dictionary of Occupational Titles 1991; ref. 6]. Other environmental exposures included exposures through work or hobbies for ≥8 h per week for a year to wood dusts (wood dust, sawdust, or sand), fibers (textile fibers or cotton), synthetic vitreous fibers (SVF), toluene and/or xylene (paint, solvents, paint thinners, and printer's inks), and pesticides. We also classified participants according to self-reported physician-diagnosed medical comorbidities, including asthma, chronic obstructive pulmonary disease (COPD), and hay fever.
Participants self-reported the cancer histories of their first-degree relatives (i.e., parents, siblings, and offspring). Specifically, for each relative, we asked participants to provide year of birth, age at the time of study, smoking status (never or ever), presence or absence of cancer (yes or no), type of cancer, age of diagnosis, and year of death. In this analysis, we focused on family history of lung or other smoking-related (lung, bladder, kidney, head and neck, or pancreas) cancers.
We obtained clinical information about the study cases from their medical records, including stage of disease at diagnosis (based on tumor-node-metastasis staging criteria; ref. 7) and histologic characteristics.
We used descriptive statistical analyses to characterize the study population, including Pearson's χ2 test to test for distribution differences between the cases and controls for categorical variables and the Student's t test to determine differences in continuous variables. We used logistic regression analysis to calculate odds ratios (OR) and 95% confidence intervals (95% CI) to identify potential risk factors for lung cancer adjusted by age, sex, and smoking history. All analyses, unless otherwise noted, were completed using Statistical Analysis System version 9.1 software (SAS Institute, Inc.). When necessary, LogXact, available in Cytel Studio 6.3 (Cytel Inc.), was used to do exact logistic regression analyses to estimate ORs and calculate 95% CIs.
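As a rough illustration of how an OR and its 95% CI follow from a logistic regression coefficient, the sketch below exponentiates the coefficient and its Wald limits. It is not the SAS or LogXact procedure used in the study; the function names are hypothetical, and the 2x2 helper gives only a crude (unadjusted) estimate, whereas the estimates in this paper are adjusted for age, sex, and smoking history.

```python
import math

Z_95 = 1.959964  # standard normal quantile for a two-sided 95% interval


def or_from_beta(beta, se):
    """Odds ratio and Wald 95% CI from a logistic coefficient and its standard error."""
    return (math.exp(beta),
            math.exp(beta - Z_95 * se),
            math.exp(beta + Z_95 * se))


def crude_or(exposed_cases, unexposed_cases, exposed_controls, unexposed_controls):
    """Unadjusted OR and Wald 95% CI from a 2x2 table (all four cells must be > 0)."""
    a, b = exposed_cases, unexposed_cases
    c, d = exposed_controls, unexposed_controls
    beta = math.log((a * d) / (b * c))        # log odds ratio
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)  # Woolf standard error
    return or_from_beta(beta, se)
```

With hypothetical counts, `crude_or(20, 80, 10, 90)` returns an OR of 2.25 together with its lower and upper 95% limits.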
The risk factors (significant at the 5% level in univariate analyses) were incorporated into a multivariable risk model for lung cancer. We evaluated the interactions between smoking variables and environmental exposures (asbestos and wood dusts) and used the likelihood ratio test to test for significant interactions (5% α level). To identify other possible interactions that we did not specify a priori, we built a classification tree (8) using the recursive partitioning technique available in the rpart package developed by Therneau and Atkinson (footnote 7) for S-Plus (Insightful Corp.). We grew the decision tree such that each subsequent split yielded two daughter nodes with at least 10 participants per node. An unconditional logistic regression model was fit at each recursive split to estimate the risk of lung cancer adjusted by age and sex; any branch that was not deemed to be statistically significant at the 5% level was pruned off the tree.
In the process of developing a final model, we first included statistically significant (via stepwise regression) main effects of smoking status, pack-years of smoking, and age at smoking cessation in the multivariable model. However, we also considered a composite variable that combined these three variables with never smokers as the reference. We also considered a composite variable that was identified through our tree-building process. This particular composite variable included not only smoking status, pack-years of smoking, and age at smoking cessation but also years since cessation and physician diagnosis of emphysema, with never smokers and those who had smoked for less than 13.2 pack-years and had no emphysema as the reference group. We compared the Akaike information criterion (AIC) among the three models, and the model with the lowest AIC was chosen as our final model.
To estimate the 5-y absolute risk of lung cancer, we first converted the β coefficients from the final risk model into ORs (which are equivalent to relative risks when prevalence is low), denoted by r. We then estimated baseline hazards for African-American men and women separately as h1j = vj (1 − sj), where vj is the age- and sex-specific incidence rate of lung cancer for African-American men (j = 1) or women (j = 2) from the Surveillance, Epidemiology, and End Results Program (9) for 2002 and sj is the attributable risk derived from the relative risk model as described in Fears et al. (10). With the hazards treated as constant over the 5-y interval, the absolute 5-y risk is given by the following equation (Eq. A):
$$P(a, r) = \frac{h_{1j}\, r}{h_{1j}\, r + h_{2j}} \left[ 1 - \exp\{ -5\,(h_{1j}\, r + h_{2j}) \} \right]$$
where a is age in years, h1j is as defined above, and h2j is the age- and sex-specific mortality rate from other causes (such as heart disease, stroke, diabetes, and other cancers excluding lung cancer; ref. 11) for males (j = 1) or females (j = 2) for African-Americans derived from the National Center for Health Statistics 1999 to 2003 mortality rates (12).
We validated our final model using two independent, internal and external, data sets. The internal validation was based on additional cases and controls from the same ongoing multiracial/ethnic lung cancer case-control study at M. D. Anderson Cancer Center from which the model-building data were obtained. These African-American patients and controls were enrolled between May 2005 and December 2007. The total number of participants analyzed for internal validation was 156. The data for external validation were from two different studies evaluating risk of lung cancer among African-Americans in metropolitan Detroit under the direction of Dr. Ann Schwartz (13, 14). The first study was a case-control study of the risk of non–small cell lung cancer among women (13). The second study was a case-control study of lung cancer risk among relatives of early-onset lung cancer cases and frequency-matched controls (14). Due to missing data on the final list of model variables, we only included 325 participants for the external validation analysis. In addition, these two studies did not query specifically for physician diagnosis of hay fever; hence, we used allergies as a proxy for hay fever.
For each validation set, we calculated specificity and sensitivity to construct receiver-operator characteristic curves and calculated the area under the curve (AUC) to estimate the ability of the models to discriminate between African-American patients with lung cancer and controls. Approximate 95% CIs for the AUCs were calculated by using STATA statistical software (Release 8; Stata Corp.), assuming a binegative exponential distribution. We also compared the AUCs of the African-American model to the published and internally validated Spitz model (15).
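The discrimination measures described here can be illustrated with a small sketch. It computes sensitivity and specificity at one risk threshold and the AUC as the proportion of case-control pairs in which the case has the higher predicted risk (ties count one half), which equals the area under the empirical receiver-operator characteristic curve. The function names and any scores passed to them are hypothetical, and the sketch does not reproduce the STATA confidence interval calculation.

```python
def sensitivity_specificity(case_scores, control_scores, threshold):
    """Sensitivity and specificity when predicted risk >= threshold is called positive."""
    true_pos = sum(1 for s in case_scores if s >= threshold)
    true_neg = sum(1 for s in control_scores if s < threshold)
    return true_pos / len(case_scores), true_neg / len(control_scores)


def auc(case_scores, control_scores):
    """Empirical AUC: share of case-control pairs ranked correctly, ties count 0.5."""
    wins = 0.0
    for c in case_scores:
        for k in control_scores:
            if c > k:
                wins += 1.0
            elif c == k:
                wins += 0.5
    return wins / (len(case_scores) * len(control_scores))
```

An AUC of 0.5 corresponds to chance-level discrimination and 1.0 to perfect separation of cases from controls.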
The demographic and clinical characteristics of the case and control test set used to develop the risk model are presented in Table 1. Among the 491 African-American cases and 497 controls, there were fewer male controls (40%) than cases (60%; P < 0.001) and the controls were significantly younger than cases (although this difference was still within the 5-year age-matching criterion of the study).
The proportion of ever smokers was predictably higher in the cases compared with the controls (P < 0.001), and conversely, only 7% of cases versus 29% of controls were never smokers. Among the study cases, men exhibited the highest percentage (59.9%) of current smokers, whereas 15.2% of women were never smokers (data not shown). Among controls, men had the highest percentage of former smokers (40.2%), whereas 36.8% of women were never smokers (data not shown). Cases who were current smokers smoked an average of 23 cigarettes per day, whereas the controls smoked on average 15 cigarettes per day (P < 0.001). Control smokers reported a higher preference (47% versus 41% for cases) for mentholated cigarettes, although this difference was statistically significant only among current smokers (P = 0.019; data not shown). Physician-diagnosis of COPD was more frequently reported among cases (16.5%) compared with controls (2.0%; P<0.0001).
The most common histology was adenocarcinoma (38.7%), except in men where squamous cell carcinoma was most frequent (39.7%; data not shown). We also observed that for never and former smokers, adenocarcinoma was the most prevalent histologic type, whereas in current smokers adenocarcinoma and squamous cell carcinoma were equally prevalent (data not shown). The majority of cases presented with stage III (40.7%) or stage IV (40.7%) disease.
The main effects of risk factors are summarized in Table 2. Both current (OR, 6.20) and former smokers (OR, 3.38) exhibited significantly increased risks. Among current smokers with longer smoking duration (≥30 years), the risk estimate was 1.97 (95% CI, 1.16−3.33). Current smokers who had smoked >20 cigarettes per day had an approximate 4-fold increased risk (OR, 3.94; 95% CI, 2.26−6.87) of lung cancer, and those who had smoked >40 pack-years had a 3.44-fold risk (95% CI, 2.15−5.47), with the risk more pronounced among women (OR, 15.91; 95% CI, 4.64−54.59; data not shown). The use of mentholated cigarettes seemed to be protective in current smokers, although the OR did not reach statistical significance (P > 0.05) even after stratification by pack-years (≤40 versus >40 pack-years; data not shown).
Former smokers who had quit smoking for >10 years had a significantly reduced risk of lung cancer (OR, 0.42; 95% CI, 0.27−0.67) compared with those who quit for less than 10 years. In addition, those former smokers who had quit smoking after age 30 had a significantly higher risk of lung cancer (OR, 2.60; 95% CI, 1.03−6.58) compared with those who quit before the age of 30.
We observed significantly elevated risks among those African-Americans who self-reported being exposed to asbestos (OR, 1.58; 95% CI, 1.18−2.12), wood dusts (OR, 1.50; 95% CI, 1.09−2.05), SVFs (OR, 2.27; 95% CI, 1.44−3.60), and toluene and/or xylene (OR, 1.40; 95% CI, 1.02−1.92). A prior history of physician-diagnosed COPD was associated with a substantially elevated risk (OR, 6.38; 95% CI, 3.24−12.59), whereas those with a reported history of hay fever had protection from lung cancer (OR, 0.68; 95% CI, 0.47−0.99). Previous diagnoses of asthma did not account for elevated risk in either smokers or nonsmokers (data not shown).
Approximately 16% of the cases and 12.9% of controls (Table 1) reported a family history of lung cancer in a first-degree relative (OR, 1.11; 95% CI, 0.76−1.61). More than 21% of the cases reported a family history of any smoking-related cancer compared with 19.1% of controls (P = 0.375), but no significant risks were observed among these cases.
We also evaluated interactions between smoking variables and asbestos and dusts, but none of these interactions reached statistical significance (data not shown).
Physician diagnosis of COPD (selected as the first split of the data), pack-years smoked, age at smoking cessation, and years since smoking cessation were all identified as important risk factors for lung cancer in the classification tree model (Fig. 1). Although no higher-order interactions were evident, we did observe that African-Americans with COPD who smoked the heaviest (≥26.4 pack-years) were at the highest risk (crude OR, 31.6; 95% CI, 13.6−73.4; data not shown) for lung cancer compared with never smokers and those former or current smokers who smoked the lightest amount of cigarettes (<13.2 pack-years). Even those African-Americans who did not have COPD, who quit smoking before age 56 and had not smoked within 3 years, but still had more than 26.4 pack-years of smoking still exhibited elevated risk (crude OR, 2.2; 95% CI, 1.3−3.7; data not shown) compared with the reference group of never and light smokers without COPD.
In our multivariable risk modeling, the main effects for exposure to toluene and/or xylene were nonsignificant (data not shown). Those variables retained in the multivariable risk model included smoking status, pack-years of smoking (≤40 versus >40 pack-years), age at smoking cessation (≤30 versus >30 years), exposure to asbestos or dusts, and history of COPD or hay fever, with an AIC of 1,075.7. For the second model that included the composite variable never smokers (reference), former smokers who quit by age ≤30 years, former smokers who quit by age >30 years, current smokers who smoked ≤40 pack-years, and current smokers who smoked >40 pack-years along with main effects for exposure to asbestos or dusts and history of COPD or hay fever, the AIC was 1,080.1. In the model including the composite variable of smoking and COPD identified from the tree modeling (Fig. 1) and the main effects for the exposure and other medical history variables, hay fever and dusts were borderline significant (P = 0.05). The AIC for this model was 1,045.5. Therefore, we chose this model as our final model (Table 3) and calculated attributable risks for men and women.
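The model choice described above is a direct comparison of the three reported AIC values. The sketch below records those values under hypothetical labels (the labels are not from the paper) and picks the smallest; the `aic` helper shows the usual definition, 2k − 2 ln L, for readers who want to compute it from a fitted model's log-likelihood.

```python
def aic(log_likelihood, n_params):
    """Akaike information criterion: 2k - 2 ln(L); lower values are preferred."""
    return 2 * n_params - 2 * log_likelihood


# AIC values reported in the text; the dictionary keys are hypothetical labels.
reported_aic = {
    "main_effects": 1075.7,       # separate smoking main effects
    "smoking_composite": 1080.1,  # composite smoking variable
    "tree_composite": 1045.5,     # smoking/COPD composite from the tree
}

best_model = min(reported_aic, key=reported_aic.get)
print(best_model)
```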
Using the methods described above, we estimated 5-year absolute risks of lung cancer and illustrated our approach with two different risk profile scenarios. First, for an African-American man aged 71 years who smoked for 40 pack-years but quit smoking after age 30, had no previous diagnoses of COPD or hay fever, and reported exposure to both asbestos and wood dusts, his relative risk of lung cancer compared with an African-American man of the same age with no risk factors (Table 3) is r = 10.1 × 1.46 × 1.38 = 20.35. The baseline hazard (Table 4) for this African-American man was h1,1 = 572.0534/100,000 × (1 − 0.79) = 0.001201. The mortality rate from the Centers for Disease Control and Prevention for African-American men between 70 and 74 years old from causes other than lung cancer is h2,1 = 4,818.6/100,000 = 0.048186 (Table 4). By substituting these numbers into the absolute risk equation (Eq. A), the estimated 5-year absolute risk for this man is 10.25%. Next, the relative risk of lung cancer for an African-American woman aged 56 who never smoked and has hay fever is 1.00 × 0.66 = 0.66. The baseline hazard (Table 4) for this African-American woman is h1,2 = 160.2701/100,000 × (1 − 0.59) = 0.0006571, and the mortality rate from the Centers for Disease Control and Prevention from causes other than lung cancer is h2,2 = 971.3/100,000 = 0.00971 (Table 4). Therefore, the estimated 5-year absolute risk for this woman is 0.21%.
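The two scenarios can be checked numerically with a short sketch. It assumes that the lung cancer hazard (h1 × r) and the competing mortality hazard (h2) are constant over the 5 years, which gives the closed form coded in `absolute_risk`; that constant-hazard form is an assumption of this sketch, adopted because it reproduces both quoted figures. The function names are hypothetical, and all inputs are the values quoted in the text.

```python
import math


def baseline_hazard(incidence_per_100k, attributable_risk):
    """h1 = v * (1 - s): incidence rate scaled by one minus the attributable risk."""
    return incidence_per_100k / 100_000 * (1.0 - attributable_risk)


def absolute_risk(r, h1, h2, years=5.0):
    """Absolute risk over `years`, with constant hazards and competing mortality h2."""
    total = h1 * r + h2
    return (h1 * r / total) * (1.0 - math.exp(-years * total))


# Scenario 1: man aged 71, values as quoted in the text.
r_man = 10.1 * 1.46 * 1.38
risk_man = absolute_risk(r_man, baseline_hazard(572.0534, 0.79), 4818.6 / 100_000)

# Scenario 2: woman aged 56, never smoker with hay fever.
r_woman = 1.00 * 0.66
risk_woman = absolute_risk(r_woman, baseline_hazard(160.2701, 0.59), 971.3 / 100_000)

print(f"{100 * risk_man:.2f}%  {100 * risk_woman:.2f}%")
```

Under these assumptions the sketch returns approximately 10.25% for the man and 0.21% for the woman, in agreement with the estimates given above.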
Summary statistics on the variables in the final model for the internal and external validation groups have been summarized in Table 5. The internal validation set included 89 lung cancer cases and 67 controls. Fifty-five percent of the cases were male, whereas only 39% of the controls were male. Controls were much younger than cases (47.6 versus 62.9 years for males and 50.9 versus 65.2 years for females). Forty percent of the controls were never smokers, whereas only 11.3% of the cases were never smokers. The discriminatory power (as measured by the AUC) for the internal validation group was 75% (95% CI, 0.67−0.82), indicating good discrimination. When we stratified the internal group by gender, the resulting AUCs for males and females were 0.70 (95% CI, 0.58−0.82) and 0.75 (95% CI, 0.64−0.85), respectively. The external validation set included 172 lung cancer cases and 153 controls. The majority of the cases (87%) and the controls (78%) were female. Controls were younger than cases (44.1 versus 46.1 years for males and 51.5 versus 55.2 years for females). The majority (86.1%) of the cases were former smokers, whereas most (56.9%) of the controls were current smokers. The discriminatory power of the model for the external validation data was more moderate with an overall AUC of 63% (95% CI, 0.57−0.69). Among women and early-onset lung cancer cases (the focus of each of the studies in the external data group), the model had discriminatory power of 61% (95% CI, 0.54−0.67) and 64% (95% CI, 0.64−0.74), respectively. The Spitz model (3) was based on white cases and controls who were frequency matched to cases on age, sex, and smoking status (never, former, and current). Separate models were presented for never, former, and current smokers, respectively. 
We were able to validate the Spitz model only for ever smokers because we had limited data on environmental tobacco smoke exposure (a key risk factor in the Spitz model for never smokers) among the African-American never smokers. Using our validation data, the AUCs for the Spitz model were 67% (95% CI, 0.57−0.77) and 59% (95% CI, 0.52−0.65) for African-American smokers in the internal and external validation groups, respectively. In comparison, the discriminatory power using the risk model for African-Americans was 79% (95% CI, 0.70−0.88) and 66% (95% CI, 0.61−0.71), respectively, for the two groups of African-American smokers.
This study developed and validated a specific risk prediction model of lung cancer for African-Americans. Smoking cessation (former smokers), pack-years smoked, prior COPD, no history of hay fever, and exposure to asbestos and wood dust were significant risk factors in this minority population. Internal and external validation results showed that our model discriminates well between cases and controls, except for a subgroup of early-onset lung cancer cases. The model that we propose for African-Americans outperformed a similar lung cancer risk model developed primarily for white populations.
In our comparison of risk factors in the African-American model and those developed using white-only data, we observed similarities and differences in the distribution of risk factors between the two racial groups. In our study, 41% of study cases presented with advanced (stage IV) lung cancer. In contrast, 24.3% of the white cases in the entire parent study population had stage IV lung cancer at their initial examination. These differences in stage at time of presentation may be influenced not only by disparities in socio-economic factors, health insurance or access to health care in African-American populations (16), but by biologic factors as well. Differences exist in pulmonary function among different racial groups (17): African-Americans have lower forced vital capacity, lower forced expiratory volume, and higher percentage of forced expiratory capacity expired in 1 second compared to whites. Adenocarcinoma was the most common histology diagnosed in African-American women, whereas African-American men presented with squamous cell carcinoma as frequently as adenocarcinoma. Previously (3), we showed that 51% of white cases presented with adenocarcinoma followed by squamous cell carcinoma (22%).
Most of our African-American current smokers smoked between a half pack and a full pack (10−27 cigarettes) per day, similar to levels reported by Haiman et al. (18) and less than levels reported for whites. In our parallel case-control study in whites, the average number of cigarettes per day was 28.1 in cases compared with 26.4 in controls (P ≤ 0.001; ref. 3). In this study, cases who smoked ≥20 cigarettes per day exhibited about a 4-fold increase in risk (OR, 3.94; 95% CI, 2.26−6.87). Stellman et al. (19) reported that, among African-Americans, the risk of lung cancer for those who smoked >20 cigarettes per day was five times higher in women and six times higher in men than for those who smoked ≤20 cigarettes per day.
The use of mentholated cigarettes is believed to increase the risk of lung cancer because these cigarettes have higher tar content, generate higher levels of carcinogens, and facilitate the absorption of carbon monoxide (20). It has also been suggested that the anesthetic properties and cooling effects of menthol may lead to deeper inhalation and longer retention, which may result in increased exposure to carcinogenic substances present in tobacco smoke (21, 22). But the association between mentholated cigarette consumption and lung cancer is controversial (19, 23–25). In a recent review, Werley et al. (21) reported that 70% to 80% of African-American men and women chose mentholated cigarettes. We observed lower percentages of menthol use (41−60%) in our study among African-American men and women smokers. In our analysis, we observed no significant risks of lung cancer among former or current smokers who reported smoking mentholated cigarettes (OR range, 0.69−0.99) and our data suggested a possible protective effect of mentholated cigarettes for current smokers. Sidney et al. (26) reported an increased risk in men who smoke mentholated cigarettes, whereas others (19, 24, 25) reported a nonsignificant protective effect in both whites and African-Americans. A recent study by Murray et al. (27) also found no evidence of a higher risk of death from lung cancer in smokers who smoked mentholated cigarettes.
African-American men and women who reported longer durations of smoking cessation had lower risks of lung cancer. Others have reported similar declines in risk in African-American populations (19, 28). We further observed that men who quit smoking after age 30 remained at substantially higher risks of lung cancer. A cohort study of 34,439 male British physicians reported that the maximum years of life expectancy gained were when cases stopped smoking before the age of 30 (29). Between 10% and 15% of all cases with lung cancer are never smokers (30); however, we observed a much lower proportion (2.4%) in our male cases.
We identified exposures to asbestos and wood dusts as significant risk factors for lung cancer in African-Americans. Such an association in African-Americans has been well documented previously (31, 32). Our observations were similar to those in previous reports of asbestos exposure in whites (33) and of exposure to wood dusts in a subset of this population previously studied (34, 35).
We also noted that some specific occupational exposures (SVFs and toluene and/or xylene) were associated with lung cancer among African-Americans, although these risks were not significant in our final multivariable model. Pohlabeln et al. (36) reported an association between exposure to man-made vitreous fibers (i.e., SVFs) and lung cancer after adjusting for smoking and asbestos exposure. All SVFs have essentially the same oncogenic potential. Some studies, however, may not show a statistically significant association because they lack sufficient statistical power to detect the small level of incremental risk associated with a fiber that clears rapidly from the lungs (37–39). Cohort studies have shown an increase in mortality due to lung cancer in whites after exposure to toluene (40, 41). However, in our parallel study in whites (3), we identified a nonsignificant association between exposure to toluene and/or xylene and risk of lung cancer, although the prevalence of such exposure in cases with lung cancer (30.1%) was similar to that in African-American (34.4%) cases.
The risks associated with COPD that we report for African-Americans in our current study are substantially higher than those we reported previously for whites (OR, 1.7−2.65; refs. 3, 32). For white cases, the prevalence of COPD was 22.4% and 9.4% in controls (3). However, in this analysis, the prevalence of COPD was 16.5% among the African-American cases and only 2% among controls. These differences reflect national rates. The 2002 COPD data from the American Lung Association report that the prevalence rates for chronic bronchitis in the U.S. population are 45.7/1,000 for whites and 46.1/1,000 for blacks (42). National incidence rates for emphysema are 17.1/1,000 for whites and 8.3/1,000 for blacks (42). Hnizdo et al. (43) reported that the percent prevalence of airflow obstruction for the total working population was 10.7% for Caucasians and 7.5% for African-Americans. Littman et al. (44) reported that independent of smoking history, smokers (mostly white) with a history of chronic bronchitis and/or COPD have a higher risk of lung cancer, a finding confirmed in other studies (45, 46).
The protective effect of hay fever that we observed in the minority group is similar to our previous findings for whites (33) and also previously reported in a mixed ethnic population of never smokers (47). Borderline significant protective effects of hay fever were also noted by Osann (48) in women with lung cancer. However, a later study by Osann et al. (49) found no significant association between hay fever and the risk of small cell lung cancer among women, although the Osann et al. study focused only on women with small cell lung cancer (49).
In this analysis, we did observe an elevated but not statistically significant association between family history and lung cancer. In our previous analyses of white cases and controls, we observed increased risks of lung cancer for never smokers (OR, 2.00) and former smokers (OR, 1.59) with a family history of any cancer and among current smokers (OR, 1.51) with a family history of smoking-related cancers (3). Cote et al. (13) reported that first-degree relatives of African-American individuals with early-onset lung cancer have a greater risk of lung cancer than whites, and cigarette smoking further increased these risks. We similarly stratified African-American cases by age at onset of disease (data not shown) but observed no increased risk among cases with early-onset (age ≤50 years) disease.
Cassidy et al. (50) reviewed the history of risk prediction models and particularly emphasized lung cancer risk prediction models and their importance to lung cancer research. In our lung cancer risk model for white lung cancer patients and controls (3), the controls, unlike our minority controls, were matched to their respective patients on smoking status. Because of the large number of white participants, we were able to construct and internally validate separate models for never, former, and current smokers. Some of the risk factors included in the three models previously published (1–3) overlapped with those found to be statistically significant in our minority risk models (i.e., smoking-related measures, exposures to asbestos or wood dusts, emphysema, and no previous hay fever). However, the level of risk may differ depending on the type and level of exposure. Therefore, these three risk models may not be applicable to minority populations.
Indeed, our model validation results showed that the ability of the Spitz model to discriminate between minority patients and controls was moderate (67% for African-American smokers), whereas the group-specific models showed better discriminatory power (79%). The Spitz model included smoking-related variables (age at smoking cessation for former smokers and pack-years smoked for current smokers) but could not include a risk estimate for smoking status because it was a matching variable. The inclusion of smoking status in the minority risk models could have further increased the discriminatory power that we observed. In the model based on white patients, asbestos and wood dust occupational exposures were significant. In the African-American model, asbestos and wood dusts were also included. Moreover, family histories of any cancer and of smoking-related cancers were included in the Spitz model for whites (3), but family history of cancer was not a statistically significant factor in our minority models.
The results from the validation analyses indicate that our lung cancer prediction model for African-Americans can prove helpful by providing more precise estimates compared with the existing risk prediction models and further highlight the need for ethnic-specific lung cancer risk prediction models. However, the sample size for validation was small.
The other limitations of our study include the fact that our study was hospital based and the controls were drawn only from the metropolitan area of Houston, Texas; therefore, the results may vary in other geographic locations. Because the sample size of the study was small, our OR estimates are less precise than those from our study of white cases of lung cancer. We may be able to see better results with larger data sets or data sets where participants are not selectively enrolled based on particular criteria (such as gender or early age at diagnosis). Further, we were unable to confirm self-reported comorbidities or exposures. However, as noted by Cassidy et al. (50), a good risk prediction tool for lung cancer should include other factors in addition to smoking and age, and our minority risk models included occupational exposures, comorbidities, and smoking phenotypes. Therefore, our results provide a foundation for lung cancer risk models for minority populations.
This study shows that African-Americans have risk factors for lung cancer that also occur in whites. However, the level of risk may differ depending on the type and level of exposure. African-Americans have a higher mortality due to lung cancer compared with whites and will benefit from early detection programs, including those using risk prediction tools. These findings highlight the importance of conducting further ethnic-specific analyses of disease risk.
We thank Tamara Locke for her assistance in editing this manuscript.
Grant support: Cancer prevention fellowship funded by the National Cancer Institute grant K07CA093592; National Cancer Institute grants CA55769, CA123208, CA60691, and CA87895; National Cancer Institute contract N01-PC35745; and Flight Attendant Medical Research Institute.
7. Technical Report Series No. 61, An introduction to recursive partitioning using the RPART routines. Department of Health Science Research, Mayo Clinic, Rochester, Minnesota, 1997. http://lib.stat.cmu.edu/general/rpart.
Disclosure of Potential Conflicts of Interest
No potential conflicts of interest were disclosed.