CPD was selected as the single phenotype for analysis because it is a highly heritable,3 widely used phenotype in genetic studies of smoking.7,8,10
To understand the relationship between CPD and DSM-IV ND, we analyzed an epidemiologic dataset, the National Epidemiologic Survey on Alcohol and Related Conditions (NESARC).11 This was a household survey of 43,000 Americans in which data were collected on CPD and on the diagnosis of ND. Sensitivity and specificity for the diagnosis of ND were examined at various CPD thresholds. In this analysis, the sensitivity and specificity of CPD for DSM-IV ND were unacceptably low, even among individuals smoking nearly a pack daily (20 CPD). The analysis suggested that the specificity for a DSM-IV diagnosis of ND improved to 90% when persons smoking ≥25 CPD were considered. Thus, for dichotomous (case–control) analysis of genotypes, a case was defined as a person who smoked ≥25 CPD. Part of the value of this case definition is that a person who meets it probably also meets the DSM-IV criteria for ND.
Analysis of CPD and DSM-IV nicotine dependence in the NESARC data
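The threshold analysis described above can be illustrated with a minimal sketch. It assumes each respondent is represented as a (CPD, DSM-IV ND diagnosis) pair; the function name `sens_spec` and the toy records are hypothetical and are not NESARC data.

```python
from typing import Iterable, Tuple


def sens_spec(records: Iterable[Tuple[float, bool]],
              threshold: float) -> Tuple[float, float]:
    """Sensitivity and specificity of 'CPD >= threshold' as a test for ND.

    records: (cpd, has_nd) pairs, where has_nd is the DSM-IV ND diagnosis.
    """
    tp = fp = tn = fn = 0
    for cpd, has_nd in records:
        positive = cpd >= threshold
        if positive and has_nd:
            tp += 1
        elif positive:
            fp += 1
        elif has_nd:
            fn += 1
        else:
            tn += 1
    # Guard against empty classes rather than dividing by zero.
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    return sensitivity, specificity


# Toy records invented for illustration only (not NESARC values).
toy = [(5, False), (10, False), (20, False), (40, False),
       (10, True), (20, True), (25, True), (30, True)]

for t in (10, 20, 25):
    print(t, sens_spec(toy, t))
```

Scanning a range of thresholds in this way shows the trade-off that motivates the choice of a cutoff: raising the threshold generally increases specificity at the cost of sensitivity.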
GSK sponsored a cardiovascular disease study of a population-based sample of 6205 adult residents of the city of Lausanne. Briefly, participants in the study were randomly selected from a list of 56,694 individuals aged 35–75 years who were permanent residents of the city of Lausanne. Recruitment took place between April 2003 and March 2006, and the overall participation rate was 41%. Only individuals of European origin (persons whose four grandparents were all of European origin) were included in the study, in an attempt to limit heterogeneity for genetic studies. Participants completed a health questionnaire and underwent a physical exam. They provided a blood sample for genetic studies and clinical chemistries. The health questionnaire included the question: If you were ever a daily smoker, what is the maximal number of CPD regularly smoked? All participants were duly informed about the sponsorship by GSK and gave consent for the use of biological samples and data by GSK and its subsidiaries; the study was approved by the Local Ethics Committee.
Genome-wide SNP genotyping was performed on 6000 Lausanne participants, using the Affymetrix 500K SNP chip, as recommended by the manufacturer. A total of 366 samples were excluded from this analysis as they either had an efficiency <90% or showed gender inconsistencies, so that genetic data from 5634 individuals were included in the present study. Markers were excluded if they were monomorphic (4052), had a call rate <95% (157) or were out of Hardy–Weinberg equilibrium (35,417), leaving a total of 460,959 markers for analysis.
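The marker-level exclusions can be sketched as follows. This is an illustration under stated assumptions, not the pipeline used in the study: genotypes are assumed to be coded 0/1/2 (copies of one allele) with `None` for a no-call, Hardy–Weinberg equilibrium is tested with a 1-df chi-square test, and the significance cutoff `hwe_alpha` is a hypothetical value because the text does not report the threshold used. The order of the checks is also an assumption.

```python
import math
from typing import Optional, Sequence


def hwe_chi2_p(n_aa: int, n_ab: int, n_bb: int) -> float:
    """P-value of the 1-df chi-square test for Hardy-Weinberg equilibrium.

    Assumes both alleles are present (the marker is not monomorphic).
    """
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of allele A
    q = 1.0 - p
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # Survival function of a chi-square variable with 1 degree of freedom.
    return math.erfc(math.sqrt(chi2 / 2.0))


def marker_qc(genotypes: Sequence[Optional[int]],
              min_call_rate: float = 0.95,   # from the text: call rate <95% excluded
              hwe_alpha: float = 1e-4) -> str:  # ASSUMED cutoff, not from the text
    """Return 'pass' or the reason a marker would be excluded."""
    called = [g for g in genotypes if g is not None]
    if len(called) / len(genotypes) < min_call_rate:
        return "low_call_rate"
    counts = [called.count(k) for k in (0, 1, 2)]
    b_alleles = counts[1] + 2 * counts[2]
    if b_alleles == 0 or b_alleles == 2 * len(called):
        return "monomorphic"
    if hwe_chi2_p(*counts) < hwe_alpha:
        return "hwe_failure"
    return "pass"


# Sample-level bookkeeping reported in the text.
assert 6000 - 366 == 5634
print(marker_qc([0] * 25 + [1] * 50 + [2] * 25))
```

An exact test is often preferred over the chi-square approximation for Hardy–Weinberg equilibrium when genotype counts are small; the chi-square form is used here only to keep the sketch short.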
GSK also sponsored an unrelated case–control genetic association study of dyslipidemia, nested within the GEMS project, in a population of European-origin individuals recruited from medical clinics. A total of 923 cases, defined as individuals with high triglyceride levels and low HDL-cholesterol levels in plasma, and 924 highly discordant controls with low triglyceride levels, high HDL-cholesterol levels and excess body weight were recruited in this study. Genotypes from Affymetrix 500K SNP chips were subjected to quality control measures similar to those for the Lausanne study.
Quantitative analyses of CPD with genotype were performed using gender as a covariate, since more men than women are regular smokers.12 A 2003 US survey found that 24% of men and 19% of women were regular smokers.12 Individuals who denied ever smoking were excluded from this analysis, as they may never have had sufficient exposure to cigarette smoking to become dependent.7,8
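A quantitative analysis of this kind can be sketched as a linear regression of CPD on genotype with gender as a covariate. The sketch below makes several assumptions that the text does not state: genotypes are coded additively (0, 1 or 2 copies of an allele), gender is coded 1 for men and 0 for women, and the input has already been restricted to ever-smokers. The helper names `ols` and `cpd_additive_model` are hypothetical, and the toy data are invented.

```python
from typing import List, Sequence, Tuple


def ols(X: Sequence[Sequence[float]], y: Sequence[float]) -> List[float]:
    """Ordinary least squares via the normal equations (X'X) b = X'y."""
    k = len(X[0])
    # Augmented matrix [X'X | X'y].
    a = [[sum(row[i] * row[j] for row in X) for j in range(k)]
         + [sum(row[i] * yi for row, yi in zip(X, y))]
         for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        for r in range(k):
            if r != col:
                f = a[r][col] / a[col][col]
                a[r] = [v - f * p for v, p in zip(a[r], a[col])]
    return [a[i][k] / a[i][i] for i in range(k)]


def cpd_additive_model(genotype: Sequence[int], male: Sequence[int],
                       cpd: Sequence[float]) -> Tuple[float, float, float]:
    """Fit CPD = b0 + b_snp * genotype + b_sex * male (ever-smokers only)."""
    X = [[1.0, float(g), float(m)] for g, m in zip(genotype, male)]
    b0, b_snp, b_sex = ols(X, cpd)
    return b0, b_snp, b_sex


# Invented data following CPD = 10 + 2 * genotype + 5 * male exactly.
genotype = [0, 1, 2, 0, 1, 2]
male = [0, 0, 0, 1, 1, 1]
cpd = [10, 12, 14, 15, 17, 19]
print(cpd_additive_model(genotype, male, cpd))
```

Here `b_snp` is the estimated change in CPD per additional copy of the allele after adjusting for gender; a real analysis would also report a standard error and p-value for this coefficient.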
Quantile–quantile plots of the GEMS and Lausanne populations revealed little deviation from the expected distribution (see Supplementary Figure S1). These data suggest an absence of population stratification across the phenotype of CPD. Data were analyzed for association with a single phenotype, CPD, using the computer program PLINK.13
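For readers unfamiliar with quantile–quantile plots, the following sketch shows how the plotted points can be derived from a list of association p-values. It assumes the common plotting position (i - 0.5)/n for the i-th smallest expected p-value under the null; the function name `qq_points` is hypothetical, and this is not necessarily how the supplementary figure was produced.

```python
import math
from typing import List, Sequence, Tuple


def qq_points(p_values: Sequence[float]) -> List[Tuple[float, float]]:
    """Return (expected, observed) pairs on the -log10 scale.

    Under the null hypothesis p-values are uniform on (0, 1), so the
    expected value paired with the observation at zero-based rank i
    is taken as (i + 0.5) / n. All p-values must be greater than zero.
    """
    n = len(p_values)
    observed = sorted(p_values)
    return [(-math.log10((i + 0.5) / n), -math.log10(p))
            for i, p in enumerate(observed)]


# P-values that match the null expectation exactly lie on the diagonal.
for expected, observed in qq_points([0.9, 0.1, 0.5, 0.3, 0.7]):
    print(round(expected, 3), round(observed, 3))
```

Points that lie close to the diagonal across most of the range, as described in the text, indicate little systematic inflation of the test statistics.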
GSK also sponsored the High-Throughput Disease-specific target Identification Program (HITDIP), which established case–control association samples for about 18 common diseases.14
Each of these samples consisted of approximately 1000 cases and 1000 controls collected at multiple sites in North America and Europe. These DNA samples were genotyped at ~6000 SNPs in a panel of 1800 ‘druggable’ candidate genes (for additional details, see Roses et al.14). These HITDIP studies collected a common set of data concerning the medical history of each participant, including a question about smoking habits. The only ND-related phenotype available in these HITDIP studies was the answer to the question: If you ever smoked regularly, what was the number of CPD typically smoked? In some of the HITDIP studies, this may have been interpreted as the maximum number of CPD regularly smoked, as opposed to an average or typical number, perhaps due in part to the different languages of the countries in which the studies were conducted. Thus, treating CPD in the HITDIP samples as a quantitative trait could have led to errors, and a case–control analysis of the HITDIP data was conducted instead. A control was defined as anyone who reported always smoking <5 CPD, and a case as anyone who reported smoking ≥25 CPD. Individuals who denied ever smoking a single cigarette were excluded from the analysis. In the dichotomous analysis using PLINK,13 the definition of a case (a person smoking ≥25 CPD) rested on the results of our examination of the NESARC data, which established the relationship between DSM-IV diagnosis of ND and the maximal number of CPD regularly smoked.
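The case and control definitions above can be written as a small classification rule. In this sketch the function and constant names are hypothetical. Returning `None` for respondents who reported 5 to 24 CPD reflects the fact that they meet neither definition; the text does not say explicitly how they were handled.

```python
from typing import Optional

CASE_MIN_CPD = 25     # case: reported smoking >= 25 CPD
CONTROL_MAX_CPD = 5   # control: reported always smoking < 5 CPD


def classify(ever_smoked: bool, cpd: Optional[float]) -> Optional[str]:
    """Return 'case', 'control', or None if the person is not analyzed."""
    if not ever_smoked or cpd is None:
        return None       # never-smokers (or missing CPD) are excluded
    if cpd >= CASE_MIN_CPD:
        return "case"
    if cpd < CONTROL_MAX_CPD:
        return "control"
    return None           # 5-24 CPD: neither case nor control


for person in [(True, 30), (True, 3), (True, 15), (False, 0)]:
    print(person, classify(*person))
```

Leaving a gap between the two thresholds separates the groups more sharply than a single cutoff would, at the cost of discarding intermediate smokers.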
Consent forms were reviewed for each HITDIP study to determine whether the language in the consent form permitted anonymous analysis of the CPD phenotype. In instances where the consent form was narrowly worded (for example, did not permit analysis of phenotypes unrelated to the primary disease), the data set was not analyzed. In instances where the validity of the CPD variable might be questioned (for example, in the Alzheimer's disease study), the data set was likewise not analyzed.