These analyses are based on the Carolina Head and Neck Cancer Epidemiology Study (CHANCE), a population-based case-control study (6).
All cases of squamous cell carcinoma of the oral cavity, pharynx, and larynx diagnosed in 46 N.C. counties from 1/1/2002 through 2/28/2006 were eligible for enrollment. Rapid case identification was conducted by the N.C. Central Cancer Registry. CHANCE cases included ICD-O-3 topography codes C00.0–C14.8 and C32.0–C32.9, excluding salivary gland (C07.9, C08.0–C08.9), nasopharynx (C11.0–C11.9), nasal cavity (C30.0), and nasal sinuses (C31.0–C31.9). The ICD-O-3 morphology codes included were 8010/3, 8051/3, 8083/3, 8071/3, 8072/3, 8073/3, 8074/3, and 8076/3. Benign tumors, carcinomas in situ, papillary carcinomas, and adenoid carcinomas were excluded. We further excluded 21 lip cancers (C00.3–C00.9, C14.2), 46 subjects of “other” race, and 96 subjects without genotyping data, leaving a final study sample of 1227 cases and 1325 controls.
Potentially eligible controls from the same counties as cases were identified through N.C. Department of Motor Vehicles records. Controls were frequency-matched to cases using random sampling with stratification on age, race, and sex.
Trained nurse-interviewers conducted an in-person interview with each subject. For this analysis only self-reported, non-proxy data were included. Questions were asked about demographics, tobacco use, drinking of alcoholic beverages, diet, oral health, medical history, and family history of cancer.
Blood samples were obtained by nurse-interviewers trained in phlebotomy. If the subject was not willing or able to consent to the blood draw, they were asked to contribute a buccal cell sample via mouthrinse.
Written informed consent was obtained from all subjects. The study was approved by the Biomedical Institutional Review Board at the University of North Carolina at Chapel Hill.
Outcome, exposure, and covariate measurement
Case tumors were classified into anatomic sub-sites according to the following five ICD-O categories used by the International Head and Neck Cancer Epidemiology Consortium (7): (1) oral cavity: C02.0–C02.3, C03.0, C03.1, C03.9, C04.0, C04.1, C04.8, C04.9, C05.0, C06.0–C06.2, C06.8, and C06.9; (2) oropharynx: C01.9, C02.4, C05.1, C05.2, C09.0, C09.1, C09.8, C09.9, C10.0–C10.4, C10.8, and C10.9; (3) oral cavity-oropharynx-hypopharynx NOS: C02.8, C02.9, C05.8, C05.9, C14.0, C14.2, and C14.8; (4) hypopharynx: C12.9, C13.0–C13.2, C13.8, and C13.9; and (5) larynx: C32.0–C32.3 and C32.8–C32.9.
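The sub-site grouping above amounts to a lookup from topography code to category. A minimal Python sketch of that lookup follows; the code lists are transcribed from the text, while the helper names (`expand`, `classify_subsite`) are illustrative and not part of the study's software.

```python
def expand(first, last):
    """Expand an inclusive range such as ("C02.0", "C02.3") into single codes."""
    stem = first[:4]  # e.g. "C02."
    return [f"{stem}{d}" for d in range(int(first[4]), int(last[4]) + 1)]

# Sub-site definitions transcribed from the five categories listed in the text.
SUBSITES = {
    "oral cavity": expand("C02.0", "C02.3")
    + ["C03.0", "C03.1", "C03.9", "C04.0", "C04.1", "C04.8", "C04.9", "C05.0"]
    + expand("C06.0", "C06.2") + ["C06.8", "C06.9"],
    "oropharynx": ["C01.9", "C02.4", "C05.1", "C05.2",
                   "C09.0", "C09.1", "C09.8", "C09.9"]
    + expand("C10.0", "C10.4") + ["C10.8", "C10.9"],
    "oral cavity-oropharynx-hypopharynx NOS": [
        "C02.8", "C02.9", "C05.8", "C05.9", "C14.0", "C14.2", "C14.8"],
    "hypopharynx": ["C12.9"] + expand("C13.0", "C13.2") + ["C13.8", "C13.9"],
    "larynx": expand("C32.0", "C32.3") + expand("C32.8", "C32.9"),
}

# Reverse index: one topography code maps to exactly one sub-site.
CODE_TO_SUBSITE = {code: name
                   for name, codes in SUBSITES.items()
                   for code in codes}

def classify_subsite(code):
    """Return the sub-site name for an ICD-O topography code, or None."""
    return CODE_TO_SUBSITE.get(code.strip().upper())
```

Codes outside the five categories (for example the excluded nasopharynx codes) return `None` instead of raising an error, so they can be flagged for review.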
Alcohol and tobacco use
Questions about alcohol use were designed to estimate lifetime history of consumption, and usual consumption of each beverage type, prior to the year before diagnosis. Questions asked about beer, wine, and hard liquor separately as follows: (1) Did you drink [beer/wine/hard liquor]? (2) At what age did you start? (3) At what age did you stop? (4) For how many years did you drink [beer/wine/hard liquor] during this period? (5) How much [beer/wine/hard liquor] did you usually drink? Per day/week/month/year? (6) What size did you usually drink?
As frequency of drinking has demonstrated stronger associations with SCCHN than duration (8), a single frequency measure that included all types of alcoholic beverages would have been optimal for estimating alcohol interaction with SNPs. Because this was unavailable in CHANCE, we instead derived a lifetime measure of alcohol intake, in milliliters, for beer, wine, and liquor combined. Using splines, we confirmed that tertiles best represented the risk associated with alcohol intake.
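The text does not give the formula for the lifetime measure. As a hedged illustration only, the sketch below assumes intake per beverage is years of drinking × usual servings per year × serving size in milliliters, summed over beer, wine, and liquor, and then grouped into tertiles. The field names, the weeks-per-year conversion, and the choice of reference group for the tertile cut points are assumptions, not details from the study.

```python
from statistics import quantiles

# Assumed conversion of the reported frequency unit to servings per year.
PERIODS_PER_YEAR = {"day": 365, "week": 52, "month": 12, "year": 1}

def lifetime_ml(beverages):
    """Sum years x servings per year x serving size (ml) over beverage types.

    Each entry is a dict with hypothetical keys:
    years, servings, per ("day"/"week"/"month"/"year"), size_ml.
    """
    total = 0.0
    for b in beverages:
        per_year = b["servings"] * PERIODS_PER_YEAR[b["per"]]
        total += b["years"] * per_year * b["size_ml"]
    return total

def tertile_cutpoints(values):
    """Two cut points that split the values into three equal-count groups."""
    return quantiles(values, n=3)

def tertile(value, cuts):
    """Return 1, 2, or 3; a value equal to a cut point falls in the lower group."""
    return 1 + sum(value > c for c in cuts)
```

For example, `lifetime_ml([{"years": 10, "servings": 2, "per": "week", "size_ml": 355}])` gives the intake of someone who drank two 355 ml beers a week for ten years.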
The primary tobacco exposure covariate was duration of cigarette smoking, modeled as a continuous variable. Dichotomous variables representing additional potential tobacco confounders were ever use of non-cigarette tobacco and ever exposure to environmental tobacco smoke (ETS) at work or at home.
SNPs and haplotypes
Seventy-five SNPs (69 tag SNPs, and 6 candidate SNPs found in prior studies to be associated with cancer incidence or survival, or alcohol dependence) were selected in 12 genes that are part of two metabolic pathways: ADH1B, and CYP2E1 in the alcohol metabolism pathway in the upper aerodigestive tract; and CAT, and GPX4 in the oxidative stress pathway. Tag SNPs, chosen to represent the genetic variation within each of the 12 candidate genes (gene and 2000 bp upstream and downstream), were selected using the Genome Variation Server (9), using SNPs that were polymorphic in either CEU or YRI HapMap Release 2 (unrelated only), with the following parameters: allele frequency cutoff of 10%, minimum R² threshold of 0.8 for variations to belong to the same cluster, 85% minimum data coverage for tag SNPs, and 70% minimum data coverage for a variation to be potentially clustered with others.
To control for potential population stratification, we selected 157 ancestry informative markers (AIMs) to maximize (1) the difference in allele frequencies (delta) between European and African populations in the HapMap data (CEU versus YRI), and (2) the Fisher’s information criterion (FIC). AIMs were prioritized based on having the highest delta and FIC values in the following order: 90% European/10% African, 10% European/90% African, and 50% European/50% African. This allowed AIMs to represent the entire expected ancestral distribution of the study population. Individual estimates of percentage African ancestry were calculated from 145 successfully genotyped AIMs using maximum likelihood estimation (MLE) methods previously described (10). AIMs were chosen to differentiate only between African and European ancestry, so individual ancestry estimates for the two groups sum to 1.0.
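To illustrate the idea of a maximum likelihood ancestry estimate, the sketch below uses a generic two-population model; it is not necessarily the method of reference 10. It assumes each AIM genotype is a binomial draw of two alleles whose frequency is a mixture of the African and European reference frequencies, and finds the mixing proportion by grid search. The function name and arguments are hypothetical.

```python
import math

def african_ancestry_mle(genotypes, freq_afr, freq_eur, steps=1000):
    """Grid-search MLE of one individual's African ancestry proportion.

    genotypes: copies (0, 1, or 2) of the counted allele at each AIM,
               or None where genotyping failed.
    freq_afr, freq_eur: frequency of that allele in the African and
               European reference samples (e.g. YRI and CEU).
    """
    best_a, best_ll = 0.0, -math.inf
    for k in range(steps + 1):
        a = k / steps
        ll = 0.0
        for g, pa, pe in zip(genotypes, freq_afr, freq_eur):
            if g is None:
                continue  # skip markers without a genotype call
            p = a * pa + (1 - a) * pe          # admixed allele frequency
            p = min(max(p, 1e-6), 1 - 1e-6)    # guard against log(0)
            ll += (math.log(math.comb(2, g))
                   + g * math.log(p) + (2 - g) * math.log(1 - p))
        if ll > best_ll:
            best_a, best_ll = a, ll
    return best_a
```

Under this two-population assumption the European proportion is simply `1 - african_ancestry_mle(...)`, which matches the statement that the two estimates sum to 1.0.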
DNA was extracted from blood or buccal samples collected at time of interview. Genotyping was done by the University of North Carolina at Chapel Hill, Mammalian Genotyping Core Facility, using the Illumina GoldenGate genotyping assay with Sentrix Array matrix and 96-well standard microtiter plates.
Haplotypes using SNP data were constructed separately for African- and European-Americans using default D’ blocks in Haploview 4.2. The algorithm (13) constructs 95% confidence limits on D’, and each comparison is defined as either “strong LD”, “inconclusive” or “strong recombination”. A block is created if 95% of informative comparisons are in “strong LD”. Markers with minor allele frequency less than 5% are ignored. Assignment of the most likely haplotype for individuals with ambiguous haplotypes was done using an EM algorithm in haplo.stats (14), with minimum counts set to 10.
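For readers unfamiliar with the D’ statistic on which the block definition rests, the following sketch computes the point estimate of |D’| for one pair of biallelic markers from haplotype and allele frequencies. It shows the standard definition only; it does not reproduce Haploview's confidence-limit calculation or its block-building rules, and the function name is illustrative.

```python
def d_prime(p_ab, p_a, p_b):
    """Absolute D' for two biallelic markers.

    p_ab: frequency of the haplotype carrying allele A and allele B.
    p_a, p_b: frequencies of allele A and allele B.
    """
    d = p_ab - p_a * p_b  # raw disequilibrium coefficient D
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    # A monomorphic marker gives d_max == 0; report 0.0 in that case.
    return abs(d) / d_max if d_max > 0 else 0.0
```

A value of 1.0 means at most three of the four possible haplotypes are observed (no evidence of historical recombination), and 0.0 means the alleles are independent.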
SES, oral health
Dichotomous variables representing additional potential confounders were: had health insurance on reference date, had a routine dental visit in the last 10 years, ever had a loose permanent tooth due to disease, ever used mouthwash, family history of SCCHN, household poverty as defined by federal guidelines, and education level.
ORs for the independent effects of SNPs and alcohol, and their interactive effects, were computed using conditional logistic regression implemented in SAS® 9.2. ORs for the main effects of haplotypes were computed using unconditional logistic regression implemented in haplo.stats 1.4.4.
A dominant genetic model (at least one minor allele versus referent of no minor alleles) was used for SNPs because for many SNPs, the number of subjects homozygous for the minor allele was too small to permit precise effect measurement.
Potential covariates were eliminated using step-wise backwards elimination, comparing each reduced model to a full model that included all covariates listed in . No collinearity was noted between variables in the full model, with one exception as described below. If a covariate did not change the ln(OR) for any SNP by a difference of at least 0.10, it was eliminated from subsequent models. Final models for genetic main effects contained a single SNP or haplotype and duration of smoking as a continuous variable. Models estimating SNP-drinking interaction also included categorized lifetime ethanol consumption. We had insufficient power to detect haplotype-drinking interaction because haplotypes were constructed and analyzed separately for African- and European-Americans. The conditional logistic regression used for SNPs by definition takes into account the matching variables of age, sex, and race. The unconditional logistic regression models used for haplotypes (for each race separately) included, as covariates, sex, age, and their 2-way interaction. Ancestry was not important for the polymorphisms studied, probably because self-reported race was already included (as a matching variable). The ancestry variable also showed evidence of collinearity with race, so for these reasons and for parsimony’s sake, ancestry was excluded from final models.
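The elimination rule can be written as a simple change-in-estimate check. The sketch below assumes the ORs for each SNP from the full model and from the model without the candidate covariate are already available; the function name, dictionary layout, and SNP identifiers are hypothetical.

```python
import math

def keep_covariate(or_full, or_reduced, threshold=0.10):
    """Return True if dropping the covariate shifts ln(OR) for any SNP
    by at least `threshold`; otherwise the covariate can be eliminated.

    or_full, or_reduced: dicts mapping SNP id -> odds ratio from the model
    with and without the covariate, respectively.
    """
    return any(
        abs(math.log(or_full[snp]) - math.log(or_reduced[snp])) >= threshold
        for snp in or_full
    )
```

For instance, ORs of 1.50 (full) and 1.52 (reduced) differ by about 0.013 on the log scale, so the covariate would be dropped, whereas 1.50 versus 1.20 differs by about 0.22 and the covariate would be retained.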
Distribution of non-genetic variables in cases and controls
A Bonferroni correction was used to adjust p-values and interaction contrast ratio (ICR) confidence intervals (CIs) to control for Type I error introduced by multiple statistical testing, for either 64 tests (for 64 SNPs) or 12 or 13 tests (for haplotypes).
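As a rough illustration of what the correction does, the sketch below multiplies a p-value by the number of tests and widens a Wald-type confidence interval to the corresponding level. The nominal alpha of 0.05 and the normal approximation are assumptions, since the text does not state them; the function names are illustrative.

```python
from statistics import NormalDist

def bonferroni_p(p_value, n_tests):
    """Bonferroni-adjusted p-value, capped at 1."""
    return min(1.0, p_value * n_tests)

def bonferroni_ci(estimate, std_err, n_tests, alpha=0.05):
    """Two-sided Wald CI at level 1 - alpha / n_tests (alpha is assumed)."""
    z = NormalDist().inv_cdf(1 - alpha / (2 * n_tests))
    return estimate - z * std_err, estimate + z * std_err
```

With 64 tests the critical value rises from about 1.96 to about 3.36, so each corrected interval is roughly 1.7 times as wide as its uncorrected counterpart.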
Departures from additive interaction were evaluated by computing interaction contrast ratios (ICRs) and Bonferroni-corrected CIs. ICRs were calculated using cancer odds ratios of subjects in three categories: (1) the highest drinking category and no minor allele (OR01); (2) never-drinkers with at least one minor allele (OR10); and (3) subjects in the highest drinking category and at least one minor allele (OR11), compared to never-drinkers homozygous for the major allele (i.e., the referent: OR00 = 1.0). The ICR is calculated as follows: ICR = OR11 − OR01 − OR10 + 1. ICRs significantly different from zero indicate departure from additive interaction.
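The ICR formula is simple enough to check by hand, and the short sketch below does the arithmetic. The odds ratios in the usage note are invented for illustration and are not results from this study; the confidence interval, which needs the covariance of the estimates, is not computed here.

```python
def icr(or11, or10, or01):
    """Interaction contrast ratio relative to the referent OR00 = 1.

    or11: highest drinking category with at least one minor allele
    or10: never-drinkers with at least one minor allele
    or01: highest drinking category with no minor allele
    """
    return or11 - or01 - or10 + 1.0
```

With hypothetical values OR11 = 3.0, OR10 = 1.5, and OR01 = 2.0, the ICR is 3.0 − 2.0 − 1.5 + 1 = 0.5, indicating a joint effect larger than additive; OR11 = 2.5 with the same single-exposure ORs gives an ICR of 0, which is exact additivity.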