Genome-wide association studies of European and East Asian populations have identified lung cancer susceptibility loci on chromosomes 5p15.33, 6p22.1-p21.31 and 15q25.1. We investigated whether these regions contain lung cancer susceptibility loci in African-Americans and refined previous association signals by utilizing the reduced linkage disequilibrium observed in African-Americans.
1308 African-American cases and 1241 African-American controls from three centers were genotyped for 760 single nucleotide polymorphisms spanning three regions, and additional SNP imputation was performed. Associations between polymorphisms and lung cancer risk were estimated using logistic regression, stratified by tumor histology where appropriate.
The strongest associations were observed on 15q25.1 in/near CHRNA5, including a missense substitution (rs16969968: OR = 1.57, 95% CI = 1.25–1.97, P = 1.1 × 10−4) and variants in the 5′-UTR. Associations on 6p22.1-p21.31 were histology-specific and included a missense variant in BAT2 associated with squamous-cell carcinoma (rs2736158: OR = 0.64, 95% CI = 0.48–0.85, P = 1.82 × 10−3). Associations on 5p15.33 were detected near TERT, the strongest of which was rs2735940 (OR = 0.82, 95% CI = 0.73–0.93, P = 1.1 × 10−3). This association was stronger among cases with adenocarcinoma (OR = 0.75, 95% CI = 0.65–0.86, P = 8.1 × 10−5).
Polymorphisms in 5p15.33, 6p22.1-p21.31 and 15q25.1 are associated with lung cancer in African-Americans. Variants on 5p15.33 are stronger risk factors for adenocarcinoma and variants on 6p21.33 are associated only with squamous-cell carcinoma.
Results implicate the BAT2, TERT and CHRNA5 genes in the pathogenesis of specific lung cancer histologies.
Lung cancer; adenocarcinoma; squamous-cell carcinoma; fine-mapping; African-American; genetic association
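The odds ratios, confidence intervals and P values reported above are linked through the Wald statistic on the log-odds scale. The following is a minimal sketch of that relationship, assuming a symmetric 95% confidence interval on the log scale; the function name `p_from_or_ci` is illustrative and is not taken from the study.

```python
import math

def p_from_or_ci(odds_ratio, ci_low, ci_high, z_crit=1.959964):
    """Approximate a two-sided Wald P value from an OR and its 95% CI."""
    log_or = math.log(odds_ratio)
    # A 95% CI spans 2 * z_crit standard errors on the log-odds scale.
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * z_crit)
    z = log_or / se
    # Two-sided tail probability of the standard normal distribution.
    p = math.erfc(abs(z) / math.sqrt(2))
    return z, p

# Reported estimate for rs16969968: OR = 1.57, 95% CI = 1.25-1.97
z, p = p_from_or_ci(1.57, 1.25, 1.97)
print(f"z = {z:.2f}, P = {p:.1e}")
```

For rs16969968 this gives a P value on the order of 10−4, in line with the reported value; small differences are expected because the published estimates are rounded.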
Angiogenesis and lymphangiogenesis are important in the progression of melanoma. We investigated associations of genetic variants in these pathways with sentinel lymph node (SLN) metastasis and mortality in two independent series of melanoma patients.
Participants at Moffitt Cancer Center were 552 patients, all Caucasian, with primary cutaneous melanoma referred for SLN biopsy. A total of 177 patients had SLN metastasis, among whom 60 died from melanoma. Associations between 238 SNPs in 26 genes and SLN metastasis were estimated as odds ratios and 95%CI using logistic regression. Competing risk regression was used to estimate hazard ratios and 95%CI for each SNP and melanoma-specific mortality. We attempted to replicate significant findings using data from a genome-wide association study comprising 1,115 melanoma patients, who were referred for SLN biopsy from MD Anderson Cancer Center (MDACC), among whom 189 patients had SLN metastasis and 92 patients died from melanoma.
In the Moffitt dataset, we observed significant associations in 18 SNPs with SLN metastasis and 17 SNPs with mortality. Multiple SNPs in COL18A1, EGFR, FLT1, IL10, PDGFD, PIK3CA and TLR3 were associated with risk of SLN metastasis and/or patient mortality. The MDACC data set replicated an association between mortality and rs2220377 in PDGFD. Further, in a meta-analysis, three additional SNPs were significantly associated with SLN metastasis (EGFR rs723526 and TLR3 rs3775292) and melanoma specific death (TLR3 rs7668666).
These findings suggest that genetic variation in angiogenesis and lymphangiogenesis contributes to regional nodal metastasis and progression of melanoma.
Additional research attempting to replicate these results is warranted.
SNP; lymph/angiogenesis; melanoma; sentinel lymph node
Aiming to identify novel genetic loci for pigmentation and skin cancer, we conducted a series of genome-wide association studies on hair color, eye color, number of sunburns, tanning ability and number of non-melanoma skin cancers (NMSCs) among 10 183 European Americans in the discovery stage and 4504 European Americans in the replication stage (for eye color, 3871 males in the discovery stage and 2496 males in the replication stage). We targeted novel chromosome regions besides the known ones for replication. As a result, we identified a new region downstream of the EDNRB gene on 13q22 associated with hair color and the strongest association was the single-nucleotide polymorphism (SNP) rs975739 (P = 2.4 × 10−14; P = 5.4 × 10−9 in the discovery set and P = 1.2 × 10−6 in the replication set). Using blue, intermediate (including green) and brown eye colors as co-dominant outcomes, we identified the SNP rs3002288 in VASH2 on 1q32.3 associated with brown eye (P = 7.0 × 10−8; P = 5.3 × 10−5 in the discovery set and P = 0.02 in the replication set). Additionally, we identified a significant interaction between the SNPs rs7173419 and rs12913832 in the OCA2 gene region on brown eye color (P-value for interaction = 3.8 × 10−3). As for the number of NMSCs, we identified two independent SNPs on chr6 and one SNP on chromosome 14: rs12203592 in IRF4 (P = 7.2 × 10−14; P = 1.8 × 10−8 in the discovery set and P = 6.7 × 10−7 in the replication set), rs12202284 between IRF4 and EXOC2 (P = 5.0 × 10−8; P = 6.6 × 10−7 in the discovery set and P = 3.0 × 10−3 in the replication set) and rs8015138 upstream of GNG2 (P = 6.6 × 10−8; P = 5.3 × 10−7 in the discovery set and P = 0.01 in the replication set).
The analysis of gene-environment (GxE) interactions remains one of the greatest challenges in the post-genome-wide-association-studies (GWAS) era. Recent methods constitute a compromise between the robust but underpowered case-control and powerful case-only methods. Inferences of the latter are biased when the assumption of gene-environment (G-E) independence fails. We propose a novel empirical hierarchical Bayes approach to GxE interaction (EHB-GE), which benefits from greater power while accounting for population-based G-E dependence. Building on Lewinger et al.'s (Genet Epidemiol 31:871-882) hierarchical Bayes prioritization approach, the method utilizes posterior G-E association estimates in controls based on G-E information across the genome to adjust for it in resulting test statistics. These posterior estimates are subtracted from the corresponding G-E association coefficients within cases.
We compared EHB-GE with rival methods using simulation. EHB-GE has similar or greater rank power to detect GxE interactions in the presence of large numbers of G-E associations with weak to strong effects or only a low number of such associations with large effect. When there are no or only a few weak G-E associations, Murcray et al.'s method (Am J Epidemiol 169:219-226) identifies markers with low GxE interaction effects better. We applied EHB-GE and competing methods to four lung cancer case-control GWAS from the TRICL/ILCCO consortium with smoking as the environmental factor. Genes identified by the EHB-GE approach are reasonable candidates, suggesting usefulness of the method.
population G-E association; GWAS; rank power; lung cancer
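The idea of subtracting shrunken control-based G-E estimates from the case-based G-E coefficients can be illustrated with a deliberately simplified sketch. It assumes a zero-mean normal prior whose variance is estimated by the method of moments across markers; this is an assumption made for illustration and is not the hierarchical model used by EHB-GE. The function name `shrink_and_adjust` is hypothetical.

```python
from statistics import mean

def shrink_and_adjust(beta_cases, beta_controls, se_controls):
    """Subtract shrunken control G-E estimates from case G-E estimates.

    Illustrative normal-normal empirical Bayes shrinkage only; not the
    published EHB-GE procedure.
    """
    # Method-of-moments estimate of the prior variance across all markers.
    tau2 = max(0.0, mean(b * b for b in beta_controls)
                    - mean(s * s for s in se_controls))
    adjusted = []
    for b_case, b_ctrl, se in zip(beta_cases, beta_controls, se_controls):
        # Shrinkage factor: 0 ignores the controls, 1 trusts them fully.
        shrink = tau2 / (tau2 + se * se) if tau2 > 0 else 0.0
        adjusted.append(b_case - shrink * b_ctrl)
    return adjusted

# No evidence of G-E association in controls: behaves like a case-only analysis.
print(shrink_and_adjust([0.30, 0.10], [0.01, -0.01], [0.10, 0.10]))
# Strong, precisely estimated G-E association in controls: behaves like case-control.
print(shrink_and_adjust([0.30, 0.10], [0.50, -0.50], [1e-6, 1e-6]))
```

The two calls show the compromise described in the abstract: with no detectable G-E association in controls the adjustment vanishes (case-only behaviour), and with strong, precise control estimates the full control coefficient is subtracted (case-control behaviour).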
Previous biological studies showed evidence of a genetic link between obesity and pigmentation in both animal models and humans. Our study investigated the individual and joint associations between obesity-related single nucleotide polymorphisms (SNPs) and both human pigmentation and risk of melanoma. Eight obesity-related SNPs in the FTO, MAP2K5, NEGR1, FLJ35779, ETV5, CADM2, and NUDT3 genes were nominally significantly associated with hair color among 5,876 individuals of European ancestry. The genetic score combining 35 independent obesity-risk loci was significantly associated with darker hair color (beta-coefficient per ten alleles = 0.12, P-value = 4 × 10−5). However, single SNPs or genetic scores showed non-significant association with tanning ability. We further examined the SNPs at the FTO locus for their associations with pigmentation and risk of melanoma. Among the 783 SNPs in the FTO gene with imputation R-square quality metric >0.8 using the 1000 Genomes data set, ten and three independent SNPs were significantly associated with hair color and tanning ability, respectively. Moreover, five independent FTO SNPs showed nominally significant association with risk of melanoma in 1,804 cases and 1,026 controls. However, none of them was associated with obesity or in linkage disequilibrium with obesity-related variants. The FTO locus may confer variation in human pigmentation and risk of melanoma, which may be independent of its effect on obesity.
obesity; pigmentation; melanoma; genetic association; FTO gene
Health care providers need simple tools to identify patients at genetic risk of breast and ovarian cancers. Genetic risk prediction models such as BRCAPRO could fill this gap if incorporated into Electronic Medical Records or other Health Information Technology solutions. However, BRCAPRO requires potentially extensive information on the counselee and her family history. Thus, it may be useful to provide simplified version(s) of BRCAPRO for use in settings that do not require exhaustive genetic counseling.
We explore four simplified versions of BRCAPRO, each using less complete information than the original model. BRCAPROLYTE uses information on affected relatives only up to second degree. It is in clinical use but has not been evaluated. BRCAPROLYTE-Plus extends BRCAPROLYTE by imputing the ages of unaffected relatives. BRCAPROLYTE-Simple reduces the data collection burden associated with BRCAPROLYTE and BRCAPROLYTE-Plus by not collecting the family structure. BRCAPRO-1Degree only uses first-degree affected relatives. We use data on 2713 individuals from seven sites of the Cancer Genetics Network and MD Anderson Cancer Center to compare these simplified tools with the Family History Assessment Tool (FHAT) and BRCAPRO, with the latter serving as the benchmark.
BRCAPROLYTE retains high discrimination; however, because it ignores information on unaffected relatives, it overestimates carrier probabilities. BRCAPROLYTE-Plus and BRCAPROLYTE-Simple provide better calibration than BRCAPROLYTE, so they have higher specificity for similar values of sensitivity. BRCAPROLYTE-Plus performs slightly better than BRCAPROLYTE-Simple. The Areas Under the ROC curve are 0.783 (BRCAPRO), 0.763 (BRCAPROLYTE), 0.772 (BRCAPROLYTE-Plus), 0.773 (BRCAPROLYTE-Simple), 0.728 (BRCAPRO-1Degree), and 0.745 (FHAT). The simpler versions, especially BRCAPROLYTE-Plus and BRCAPROLYTE-Simple, lead to only modest loss in overall discrimination compared to BRCAPRO in this dataset.
Simplified implementations of BRCAPRO can be used for genetic risk prediction in settings where collection of complete pedigree information is impractical.
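The distinction between discrimination and calibration drawn above can be made concrete with a small sketch. The area under the ROC curve depends only on how the predicted probabilities rank carriers against non-carriers, so a model that systematically overestimates carrier probabilities can keep the same AUC. The scores below are invented for illustration and are not data from the study.

```python
import math

def auc(scores_pos, scores_neg):
    """Probability that a random carrier outranks a random non-carrier.

    Ties count as half a win (the Mann-Whitney formulation of the AUC).
    """
    wins = 0.0
    for p in scores_pos:
        for n in scores_neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))

# Hypothetical predicted carrier probabilities.
carriers = [0.9, 0.7, 0.6, 0.4]
non_carriers = [0.5, 0.3, 0.2, 0.1]
print(auc(carriers, non_carriers))  # 15 of 16 pairs ranked correctly -> 0.9375

# Inflating every probability with a monotone transform changes calibration
# but leaves the ranking, and therefore the AUC, unchanged.
inflated_pos = [math.sqrt(s) for s in carriers]
inflated_neg = [math.sqrt(s) for s in non_carriers]
print(auc(inflated_pos, inflated_neg))
```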
In this study, we directly sequenced the Melanocortin 1 Receptor (MC1R) gene in 2,212 individuals to detect all variants and assessed their associations with cutaneous melanoma (CM) risk in a hospital-based study of 1,106 CM patients and 1,106 control subjects. Of 61 MC1R variants identified, 16 rare variants have not been previously reported by others; three MC1R variants were significantly associated with CM risk [c.451C>T (OR = 1.78, 95% CI = 1.44–2.20), c.478C>T (OR = 1.31, 95% CI = 1.05–1.63), and c.880G>C (OR = 1.69, 95% CI = 1.15–2.48)]; and two with borderline CM risk [c.942A>G (OR = 1.23, 95% CI = 1.00–1.51) and c.274G>A (OR = 1.23, 95% CI = 0.99–1.53)] under a dominant model. When we combined these five MC1R variants for cumulative effect analysis, we found that subjects with an increased number of variant genotypes from any of these five variants had significantly increased risk of CM with ORs of 1.68 (95% CI = 1.39–2.04), 1.61 (95% CI = 1.27–2.04), and 2.64 (95% CI = 1.72–4.05) for one, two, and three or more variant genotypes, respectively (trend test: P < 0.001). Further haplotype and diplotype analyses based on the above-mentioned five SNPs suggested that the c.451T allele contributed to the high risk of CM and that the five variants may have joint effects on the risk of CM. Additional analysis suggests that the three most significant SNPs may be the molecular mechanisms underlying the known risk factors of the colors of the eyes, skin and hair in this study population. In conclusion, our study provided confirmatory evidence that both common and rare variants in the MC1R coding region may be biomarkers for susceptibility to CM in US populations.
melanocortin 1 receptor gene; direct sequencing; interaction; melanoma; case-control
To mine possibly hidden causal single nucleotide polymorphisms (SNPs) in the etiology of melanoma, we investigated the association of SNPs in 76 M/G1 transition genes with melanoma risk using our published genome-wide association study (GWAS) dataset with 1,804 melanoma cases and 1,026 cancer-free controls. We found multiple SNPs with P < 0.01 and performed validation studies for 18 putative functional SNPs in PSMB9 in two other GWAS datasets. Two SNPs (rs1351383 and rs2127675) were associated with melanoma risk in the GenoMEL dataset (P = 0.013 and 0.004, respectively), but failed validation in the Australia dataset. Genotype-phenotype analysis revealed that these two SNPs were significantly correlated with mRNA expression levels of PSMB9. Further experiments revealed that the promoter SNP rs2071480, which is in high LD with rs1351383 and rs2127675, was involved in influencing transcription factor binding and gene expression. Taken together, our data suggested that functional variants in PSMB9 may contribute to melanoma susceptibility.
GWAS; Cell cycle; PSMB9; Polymorphism; melanoma
Recent evidence suggests that inflammation plays a pivotal role in the development of lung cancer. In this study, we used a two-stage approach to investigate associations between genetic variants in inflammation pathways and lung cancer risk based on genome-wide association study (GWAS) data. A total of 7,650 sequence variants from 720 genes relevant to inflammation pathways were identified using keyword and pathway searches from Gene Cards and Gene Ontology databases. In Stage 1, six GWAS datasets from the International Lung Cancer Consortium were pooled (4,441 cases and 5,094 controls of European ancestry), and a hierarchical modeling (HM) approach was used to incorporate prior information for each of the variants into the analysis. The prior matrix was constructed using (1) role of genes in the inflammation and immune pathways; (2) physical properties of the variants including the location of the variants, their conservation scores and amino acid coding; (3) LD with other functional variants and (4) measures of heterogeneity across the studies. HM affected the priority ranking of variants particularly among those having low prior weights, imprecise estimates and/or heterogeneity across studies. In Stage 2, we used an independent NCI lung cancer GWAS study (5,699 cases and 5,818 controls) for in silico replication. We identified one novel variant at the level corrected for multiple comparisons (rs2741354 in EPHX2 at 8q21.1 with p value = 7.4 × 10−6), and confirmed the associations between TERT (rs2736100) and the HLA region and lung cancer risk. HM allows for prior knowledge such as from bioinformatic sources to be incorporated into the analysis systematically, and it represents a complementary analytical approach to the conventional GWAS analysis.
Epidemiological studies of underground miners suggested that occupational exposure to radon causes lung cancer with squamous cell carcinoma (SCC) as the predominant histological type. However, the genetic determinants for susceptibility of radon-induced SCC in miners are unclear. Double-strand breaks induced by radioactive radon daughters are repaired primarily by non-homologous end joining (NHEJ) that is accompanied by the dynamic changes in surrounding chromatin, including nucleosome repositioning and histone modifications. Thus, a molecular epidemiological study was conducted to assess whether genetic variation in 16 genes involved in NHEJ and related histone modification affected susceptibility for SCC in radon-exposed former miners (267 SCC cases and 383 controls) from the Colorado plateau. A global association between genetic variation in the haplotype block where SIRT1 resides and the risk for SCC in miners (P = 0.003) was identified. Haplotype alleles tagged by the A allele of SIRT1 rs7097008 were associated with increased risk for SCC (odds ratio = 1.69, P = 8.2×10−5) and greater survival in SCC cases (hazard ratio = 0.79, P = 0.03) in miners. Functional validation of rs7097008 demonstrated that the A allele was associated with reduced gene expression in bronchial epithelial cells and compromised DNA repair capacity in peripheral lymphocytes. Together, these findings substantiate genetic variation in SIRT1 as a risk modifier for developing SCC in miners and suggest that SIRT1 may also play a tumor suppressor role in radon-induced cancer in miners.
While certain inherited syndromes (e.g. Neurofibromatosis or Li-Fraumeni) are associated with an increased risk of glioma, most familial gliomas are non-syndromic. This study describes the demographic and clinical characteristics of the largest series of non-syndromic glioma families ascertained from 14 centres in the United States (US), Europe and Israel as part of the Gliogene Consortium.
Families with 2 or more verified gliomas were recruited between January 2007 and February 2011. Distributions of demographic characteristics and clinical variables of gliomas in the families were described based on information derived from personal questionnaires.
The study population comprised 841 glioma patients identified in 376 families (9797 individuals). There were more cases of glioma among males, with a male to female ratio of 1.25. In most families (83%), 2 gliomas were reported, with 3 and 4 gliomas in 13% and 3% of the families, respectively. For families with 2 gliomas, 57% were among 1st-degree relatives, and 31.5% among 2nd-degree relatives. Overall, the mean (±standard deviation [SD]) diagnosis age was 49.4 (±18.7) years. In 48% of families with 2 gliomas, at least one was diagnosed at <40 y, and in 12% both were diagnosed under 40 y of age. Most of these families (76%) had at least one grade IV glioblastoma multiforme (GBM), and in 32% both cases were grade IV gliomas. The most common glioma subtype was GBM (55%), followed by anaplastic astrocytoma (10%) and oligodendroglioma (8%). Individuals with grades I–II were on average 17 y younger than those with grades III–IV.
Familial glioma cases are similar to sporadic cases in terms of gender distribution, age, morphology and grade. Most familial gliomas appear to comprise clusters of two cases suggesting low penetrance, and that the risk of developing additional gliomas is probably low. These results should be useful in the counselling and clinical management of individuals with a family history of glioma.
Glioma; Familial glioma; Clinical characteristics; Genetic counselling
Genome-wide association studies (GWASs) have mainly focused on top significant single nucleotide polymorphisms (SNPs), most of which did not have clear biological functions but were just surrogates for unknown causal variants. Studying SNPs with modest association and putative functions in biologically plausible pathways has become one complementary approach to GWASs. To unravel the key roles of mitogen-activated protein kinase (MAPK) pathways in cutaneous melanoma (CM) risk, we re-evaluated the associations between 47 818 SNPs in 280 MAPK genes and CM risk using our published GWAS dataset with 1804 CM cases and 1026 controls. We initially found 105 SNPs with P ≤ 0.001, more than expected by chance, 26 of which were predicted to be putatively functional SNPs. The risk associations with 16 SNPs around DUSP14 (rs1051849) and a previously reported melanoma locus MAFF/PLA2G6 (proxy SNP rs4608623) were replicated in the GenoMEL dataset (P < 0.01) but failed in the Australian dataset. Meta-analysis showed that rs1051849 in the 3ʹ untranslated region of DUSP14 was associated with a reduced risk of melanoma (odds ratio = 0.89, 95% confidence interval: 0.82–0.96, P = 0.003, false discovery rate = 0.056). Further genotype–phenotype correlation analysis using the 90 HapMap lymphoblastoid cell lines from Caucasians showed significant correlations between two SNPs (rs1051849 and rs4608623) and messenger RNA expression levels of DUSP14 and MAFF (P = 0.025 and P = 0.010, respectively). Gene-based tests also revealed that significant SNPs were over-represented in MAFF, PLA2G6, DUSP14 and 16 other genes. Our results suggest that functional SNPs in MAPK pathways may contribute to CM risk. Further studies are warranted to validate our findings.
Black/white disparities in lung cancer incidence and mortality mandate an evaluation of underlying biological differences. We have previously shown higher risks of lung cancer associated with prior emphysema in African American compared with white lung cancer patients.
We therefore evaluated a panel of 1440 inflammatory gene variants in a two-phase analysis (discovery and replication), and added top GWAS lung cancer hits from Caucasian populations and 28 SNPs from a published gene panel. The discovery set (477 self-designated African American cases, 366 controls matched on age, ethnicity, and gender) was from Houston, Texas. The external replication set (330 cases, 342 controls) was from the EXHALE study at Wayne State University.
In discovery, 154 inflammation SNPs were significant (P<0.05) on univariate analysis, as was one of the gene panel SNPs (rs308738 in REV1, P=0.0013), and three GWAS hits: rs16969968 (P=0.0014) and rs10519203 (P=0.0003) in the 15q locus and rs2736100 in the HTERT locus (P=0.0002). One inflammation SNP, rs950286, was successfully replicated with a concordant odds ratio of 1.46 (1.14-1.87) in discovery, 1.37 (1.05-1.77) in replication, and a combined OR of 1.40 (1.17-1.68). This SNP is intergenic between the IRF4 and EXOC2 genes. We also constructed and validated epidemiologic and extended risk prediction models. The AUC was 0.77 for the epidemiologic discovery model and 0.80 for the extended model. For the combined datasets, the AUC values were 0.75 and 0.76, respectively.
As has been reported for other cancer sites and populations, incorporating top genetic hits into risk prediction models provides little improvement in model performance and no clinical relevance.
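To show how discovery and replication estimates such as those for rs950286 can be combined, here is a minimal sketch of a fixed-effect, inverse-variance meta-analysis on the log-odds scale. It is an illustration only: the abstract does not state how the combined OR was obtained (the authors may have pooled individual-level data), and the helper names are hypothetical.

```python
import math

Z95 = 1.959964  # standard normal quantile for a 95% confidence interval

def log_or_and_se(odds_ratio, ci_low, ci_high):
    """Recover the log odds ratio and its standard error from a 95% CI."""
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * Z95)
    return math.log(odds_ratio), se

def pool_fixed_effect(estimates):
    """Inverse-variance weighted pooling of (log OR, SE) pairs."""
    weights = [1.0 / se ** 2 for _, se in estimates]
    pooled = sum(w * b for w, (b, _) in zip(weights, estimates)) / sum(weights)
    pooled_se = math.sqrt(1.0 / sum(weights))
    return (math.exp(pooled),
            math.exp(pooled - Z95 * pooled_se),
            math.exp(pooled + Z95 * pooled_se))

discovery = log_or_and_se(1.46, 1.14, 1.87)
replication = log_or_and_se(1.37, 1.05, 1.77)
pooled_or, ci_low, ci_high = pool_fixed_effect([discovery, replication])
print(f"pooled OR = {pooled_or:.2f} ({ci_low:.2f}-{ci_high:.2f})")
```

The pooled estimate lies between the two study-specific odds ratios and close to the reported combined value; exact agreement is not expected given rounding and possible covariate adjustment in the original analysis.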
Genome-wide association studies have identified hundreds of genetic variants associated with specific cancers. A few of these risk regions have been associated with more than one cancer site; however, a systematic evaluation of the associations between risk variants for other cancers and lung cancer risk has yet to be performed.
We included 18023 patients with lung cancer and 60543 control subjects from two consortia, Population Architecture using Genomics and Epidemiology (PAGE) and Transdisciplinary Research in Cancer of the Lung (TRICL). We examined 165 single-nucleotide polymorphisms (SNPs) that were previously associated with at least one of 16 non–lung cancer sites. Study-specific logistic regression results underwent meta-analysis, and associations were also examined by race/ethnicity, histological cell type, sex, and smoking status. A Bonferroni-corrected P value of 2.5×10–5 was used to assign statistical significance.
The breast cancer SNP LSP1 rs3817198 was associated with an increased risk of lung cancer (odds ratio [OR] = 1.10; 95% confidence interval [CI] = 1.05 to 1.14; P = 2.8×10–6). This association was strongest for women with adenocarcinoma (P = 1.2×10–4) and not statistically significant in men (P = .14) with this cell type (Phet by sex = .10). Two glioma risk variants, TERT rs2853676 and CDKN2BAS1 rs4977756, which are located in regions previously associated with lung cancer, were associated with increased risk of adenocarcinoma (OR = 1.16; 95% CI = 1.10 to 1.22; P = 1.1×10–8) and squamous cell carcinoma (OR = 1.13; CI = 1.07 to 1.19; P = 2.5×10–5), respectively.
Our findings demonstrate a novel pleiotropic association between the breast cancer LSP1 risk region marked by variant rs3817198 and lung cancer risk.
Unlike genome-wide association studies, few comprehensive studies of copy number variation's contribution to complex human disease susceptibility have been performed. Copy number variations are abundant in humans and represent one of the least well-studied classes of genetic variants; in addition, known rheumatoid arthritis susceptibility loci explain only a portion of familial clustering. Therefore, we performed a genome-wide study of association between deletion or excess homozygosity and rheumatoid arthritis using high-density 550 K SNP genotype data from a genome-wide association study. We used a genome-wide statistical method that we recently developed to test each contiguous SNP locus between 868 cases and 1194 controls to detect excess homozygosity or deletion variants that influence susceptibility. Our method is designed to detect statistically significant evidence of deletions or homozygosity at individual SNPs for SNP-by-SNP analyses and to combine the information among neighboring SNPs for cluster analyses. In addition to successfully detecting the known deletion variants in the major histocompatibility complex, we identified 4.3 and 28 kb clusters on chromosomes 10p and 13q, respectively, which were significant at a Bonferroni-type-corrected 0.05 nominal significance level. Independently, we performed analyses using PennCNV, an algorithm for identifying and cataloging copy numbers for individuals based on a hidden Markov model, and identified cases and controls that had chromosomal segments with copy number <2. Using Fisher's exact test for comparing the numbers of cases and controls with copy number <2 per SNP, we identified 26 significant SNPs (protective; more controls than cases) aggregating on chromosome 14 with P-values <10−8.
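The per-SNP comparison of cases and controls with copy number <2 can be sketched with a small, self-contained implementation of Fisher's exact test. The sample sizes (868 cases, 1194 controls) come from the abstract, but the carrier counts below are invented purely for illustration, and the function name is hypothetical.

```python
from math import comb

def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of all tables with the same
    margins that are no more likely than the observed table.
    """
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    total = sum(prob(x) for x in range(lowest, highest + 1)
                if prob(x) <= p_obs * (1 + 1e-9))
    return min(1.0, total)

# Hypothetical counts at one SNP: 2 of 868 cases and 40 of 1194 controls
# carry a segment with copy number < 2.
p = fisher_exact_two_sided(2, 868 - 2, 40, 1194 - 40)
print(f"P = {p:.2e}")
```

An excess of carriers among controls, as in this made-up table, corresponds to the protective direction described in the abstract.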
Epidemiological studies have investigated the association between vitamin D pathway genes and breast cancer risk; however, little is known about the association between vitamin D pathway genes and breast cancer prognosis. In a retrospective cohort of 1029 patients with early-stage breast cancer, we analyzed the association between 106 tagging single nucleotide polymorphisms (SNPs) in eight vitamin D pathway genes and breast cancer disease-free survival (DFS) using Cox regression analysis adjusted for known prognostic variables. Using a false discovery rate of 10%, six intronic SNPs were significantly associated with poorer DFS: retinoid-X receptor alpha (RXRA) SNPs (rs881658, rs11185659, rs10881583, rs881657 and rs7864987) and plasminogen activator and urokinase receptor (PLAUR) SNP (rs4251864). Treatment received (no systemic therapy, hormone therapy alone or chemotherapy) was an effect modifier of the RXRA SNPs association with DFS (P < 0.05); therefore, we stratified further analysis by treatment group. Among patients who did not receive systemic therapy, RXRA SNP [rs10881583 (P = 0.02)] was associated with poorer DFS, and among patients who received chemotherapy, RXRA SNPs (rs881658, rs11185659, rs10881583, rs881657 and rs7864987) were associated with poorer DFS (P < 0.001 for all SNPs). However, RXRA SNPs: rs10881583 (P < 0.001) and rs881657 (P = 0.02) were associated with improved DFS in patients treated with hormone therapy alone. Our results suggest that SNPs in the RXRA and PLAUR genes in the vitamin D pathway may contribute to breast cancer DFS. In particular, SNPs in RXRA may predict for poorer or improved DFS in patients, according to type of systemic treatment received. If validated, these markers could be used for risk stratification of breast cancer patients.
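Selecting SNPs at a false discovery rate of 10% is commonly done with the Benjamini-Hochberg step-up procedure. The abstract does not say which FDR procedure was used, so the sketch below is an assumption, and the P values in it are made up for illustration.

```python
def benjamini_hochberg(p_values, fdr=0.10):
    """Return the indices of hypotheses rejected at the given FDR.

    Step-up rule: find the largest rank k with p_(k) <= k / m * fdr and
    reject the k smallest P values.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    max_k = 0
    for rank, i in enumerate(order, start=1):
        if p_values[i] <= rank / m * fdr:
            max_k = rank
    return sorted(order[:max_k])

# Hypothetical P values for six SNPs.
p_values = [0.001, 0.008, 0.039, 0.041, 0.20, 0.60]
print(benjamini_hochberg(p_values, fdr=0.10))  # -> [0, 1, 2, 3]
```

Unlike a Bonferroni threshold, the cutoff here adapts to the observed distribution of P values, which is why it is often preferred when many correlated tagging SNPs are tested.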
The logistic kernel machine test (LKMT) is a testing procedure tailored towards high-dimensional genetic data. Its use in pathway analyses of GWA case-control studies results from its computational efficiency and flexibility of incorporating additional information via the kernel. The kernel can be any positive definite function; unfortunately its form strongly influences the power and bias. Most authors have recommended the use of the simple linear kernel. We demonstrate via a simulation that the probability of rejecting the null hypothesis of no association just by chance increases with the number of SNPs or genes in the pathway when applying this kernel.
We propose a novel kernel that includes an appropriate standardization, in order to protect against any inflation of false positive results. Moreover, our novel kernel contains information on gene membership of SNPs in the pathway.
In an application to data from the NARAC Rheumatoid Arthritis Consortium, we find that even this basic genomic structure can improve the ability of the LKMT to identify meaningful associations. We also demonstrate that the standardization effectively eliminates problems with size bias.
We recommend the use of our standardized kernel and urge caution when using non-adjusted kernels in the LKMT to conduct pathway analysis.
Logistic Kernel Machine Regression; Size Bias; Pathway Analysis; GWAS; Rheumatoid Arthritis
This study examined colonoscopy adherence and attitudes towards colorectal cancer (CRC) screening in individuals who underwent Lynch syndrome genetic counseling and testing.
We evaluated changes in colonoscopy adherence and CRC screening attitudes in 78 cancer-unaffected relatives of Lynch syndrome mutation carriers before pre-test genetic counseling (baseline) and at 6 and 12 months post-disclosure of test results (52 mutation-negative, 26 mutation-positive).
While both groups were similar at baseline, at 12 months post-disclosure, a greater number of mutation-positive individuals had had a colonoscopy compared with mutation-negative individuals. From baseline to 12 months post-disclosure, the mutation-positive group demonstrated an increase in mean scores on measures of colonoscopy commitment, self-efficacy, and perceived benefits of CRC screening, and a decrease in mean scores for perceived barriers to CRC screening. Mean scores on colonoscopy commitment decreased from baseline to 6 months in the mutation-negative group.
Adherence to risk-appropriate guidelines for CRC surveillance improved after genetic counseling and testing for Lynch syndrome. Mutation-positive individuals reported increasingly positive attitudes toward CRC screening after receiving genetic test results, potentially reinforcing longer term colonoscopy adherence.
Lynch syndrome; colorectal cancer screening; genetic counseling and testing; colonoscopy commitment; benefits of and barriers to screening
The risk of glioma has consistently been shown to be increased two-fold in relatives of patients with primary brain tumors (PBT). A recent genome-wide linkage study of glioma families provided evidence for a disease locus on 17q12-21.32, with the possibility of four additional risk loci at 6p22.3, 12p13.33-12.1, 17q22-23.2, and 18q23.
To identify the underlying genetic variants responsible for the linkage signals, we compared the genotype frequencies of 5,122 SNPs mapping to these five regions in 88 glioma cases with and 1,100 cases without a family history of PBT (discovery study). An additional series of 84 familial and 903 non-familial cases were used to replicate associations.
In the discovery study, 12 SNPs showed significant associations with family history of PBT (P < 0.001). In the replication study, two of the 12 SNPs were confirmed: 12p13.33-12.1 PRMT8 rs17780102 (P = 0.031) and 17q12-21.32 SPOP rs650461 (P = 0.025). In the combined analysis of discovery and replication studies, the strongest associations were attained at four SNPs: 12p13.33-12.1 PRMT8 rs17780102 (P = 0.0001), SOX5 rs7305773 (P = 0.0001) and STKY1 rs2418087 (P = 0.0003), and 17q12-21.32 SPOP rs6504618 (P = 0.0006). Further, a significant gene-dosage effect was found for increased risk of family history of PBT with these four SNPs in the combined data set (Ptrend < 1.0 ×10−8).
The results support the linkage finding that some loci in the 12p13.33-12.1 and 17q12-q21.32 regions may contribute to gliomagenesis and suggest potential target genes underlying the linkage signals.
Association; Polymorphisms; Glioma; Family history of primary brain tumor; Linkage analysis
Genetic imprinting is the best-known cause of parent-of-origin effects (POEs), whereby a gene is differentially expressed depending on the parental origin of the same allele. Genetic imprinting is related to several human disorders, including diabetes, breast cancer, alcoholism, and obesity. This phenomenon has been shown to be important for normal embryonic development in mammals. Traditional association approaches ignore this important genetic phenomenon. In this study, we generalize the natural and orthogonal interactions (NOIA) framework to allow for estimation of both main allelic effects and POEs. We develop a statistical (Stat-POE) model that yields orthogonal estimates of the parameters, including the POEs. We conducted simulation studies for both quantitative and qualitative traits to evaluate the performance of the statistical and functional models with different levels of POEs. Our results showed that the newly proposed Stat-POE model, which ensures orthogonality of variance components when Hardy-Weinberg equilibrium (HWE) holds or when the minor and major allele frequencies are equal, had greater power for detecting the main allelic additive effect than a Func-POE model, which codes according to allelic substitutions, for both quantitative and qualitative traits. The power for detecting the POE was the same for the Stat-POE and Func-POE models under HWE for quantitative traits.
We aimed to extend the natural and orthogonal interaction (NOIA) framework, developed for modeling gene-gene interactions in the analysis of quantitative traits, to allow for reduced genetic models, dichotomous traits, and gene-environment interactions. We evaluate the performance of the NOIA statistical models using simulated data and lung cancer data.
The NOIA statistical models are developed for the additive, dominant, recessive genetic models, and a binary environmental exposure. Using the Kronecker product rule, a NOIA statistical model is built to model gene-environment interactions. By treating the genotypic values as the logarithm of odds, the NOIA statistical models are extended to the analysis of case-control data.
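The Kronecker construction can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the authors' implementation: it uses the full one-locus NOIA statistical coding (additive plus dominance) as usually given in the NOIA literature, a mean-centered coding for a binary exposure with frequency q, independence of genotype and exposure, and an exposure-by-genotype ordering of the Kronecker product. All function names are hypothetical.

```python
def noia_genetic_matrix(p11, p12, p22):
    """One-locus NOIA statistical design matrix: columns (mean, additive, dominance)."""
    mean_copies = p12 + 2 * p22            # expected count of the variant allele
    v = p11 + p22 - (p11 - p22) ** 2       # scaling term of the dominance column
    additive = [g - mean_copies for g in (0, 1, 2)]
    dominance = [-2 * p12 * p22 / v, 4 * p11 * p22 / v, -2 * p11 * p12 / v]
    return [[1.0, a, d] for a, d in zip(additive, dominance)]

def noia_exposure_matrix(q):
    """Assumed statistical coding for a binary exposure with frequency q."""
    return [[1.0, -q], [1.0, 1.0 - q]]

def kron(a, b):
    """Kronecker product of two matrices stored as lists of rows."""
    return [[x * y for x in row_a for y in row_b]
            for row_a in a for row_b in b]

def weighted_gram(x, weights):
    """X^T diag(weights) X; off-diagonal zeros mean orthogonal columns."""
    n_cols = len(x[0])
    return [[sum(w * row[i] * row[j] for w, row in zip(weights, x))
             for j in range(n_cols)] for i in range(n_cols)]

# Illustrative frequencies (not study data); genotypes need not be in HWE.
geno_freq = (0.5, 0.3, 0.2)
q = 0.4
design = kron(noia_exposure_matrix(q), noia_genetic_matrix(*geno_freq))
# Joint cell frequencies in the same row order as the Kronecker product,
# assuming genotype and exposure are independent.
cell_freq = [pe * pg for pe in (1 - q, q) for pg in geno_freq]
gram = weighted_gram(design, cell_freq)
```

Under these assumptions the weighted Gram matrix is diagonal, which is the orthogonality property that separates the statistical models from the functional ones.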
Our simulations showed that power for testing associations while allowing for interaction using the statistical model is much higher than using functional models for most of the scenarios we simulated. When applied to the lung cancer data, much smaller P-values were obtained using the NOIA statistical model for either the main effects or the SNP-smoking interactions for some of the SNPs tested.
The NOIA statistical models are usually more powerful than the functional models in detecting main effects and interaction effects for both quantitative traits and binary traits.
Statistical power; Genetic association studies; Case-control association analysis; Gene-environment interaction; Environmental risk factor; Association mapping; Orthogonal modeling
Lung cancer in lifetime never smokers is distinct from that in smokers, but the role of separate or overlapping carcinogenic pathways has not been explored. We therefore evaluated a comprehensive panel of 11,737 SNPs in inflammatory-pathway genes in a discovery phase (451 lung cancer cases, 508 controls from Texas). SNPs that were significant were evaluated in a second external population (303 cases, 311 controls from the Mayo Clinic). An intronic SNP in the ACVR1B gene, rs12809597, was replicated with significance, and the association was restricted to those reporting adult exposure to environmental tobacco smoke. Another promising candidate was a SNP in NR4A1, although the replication OR did not achieve statistical significance. ACVR1B belongs to the TGFR-β superfamily, contributing to resolution of inflammation and initiation of airway remodeling. An inflammatory microenvironment (second-hand smoking, asthma, or hay fever) is necessary for risk from these gene variants to be expressed. These findings require further replication, followed by targeted resequencing and functional validation.
lung cancer; never smokers; inflammation genes; sidestream exposure
Lynch syndrome is an autosomal dominant syndrome of familial malignancies resulting from germ-line mutations in DNA mismatch repair (MMR) genes. Our goal was to take a pathway-based approach to investigate the influence of polymorphisms in cell-cycle related genes on age of onset for Lynch syndrome using a tree-model.
We evaluated polymorphisms in a panel of cell-cycle related genes (AURKA, CDKN2A, TP53, E2F2, CCND1, TP73, MDM2, IGF1 and CDKN2B) in 220 MMR gene mutation carriers from 129 families. We applied a novel statistical approach, tree-modeling (Classification and Regression Tree), to the analysis of data on Lynch syndrome patients to identify individuals with a higher probability of developing colorectal cancer at an early age and explore the gene-gene interactions between polymorphisms in cell-cycle genes.
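The core step of a regression tree is the choice of the binary split that most reduces within-group variation in the outcome. The Python sketch below illustrates that single step for age of onset with genotypes coded 0 (wild-type) or 1 (variant). It is a simplification under stated assumptions: it ignores censoring, pruning, and recursion, and the function names, SNP labels, and ages are hypothetical rather than taken from the study.

```python
def sum_squared_error(values):
    """Sum of squared deviations from the mean (node impurity for regression)."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)

def best_split(samples, ages):
    """Return (feature, gain) for the binary genotype split that most
    reduces the sum of squared errors in age of onset."""
    parent_sse = sum_squared_error(ages)
    best_feature, best_gain = None, 0.0
    for feature in samples[0]:
        left = [age for s, age in zip(samples, ages) if s[feature] == 0]
        right = [age for s, age in zip(samples, ages) if s[feature] == 1]
        if not left or not right:
            continue  # a split must send subjects to both child nodes
        gain = parent_sse - sum_squared_error(left) - sum_squared_error(right)
        if gain > best_gain:
            best_feature, best_gain = feature, gain
    return best_feature, best_gain

# Hypothetical toy data: genotype indicators and ages of onset.
toy_samples = [
    {"snp_a": 0, "snp_b": 0},
    {"snp_a": 0, "snp_b": 1},
    {"snp_a": 1, "snp_b": 0},
    {"snp_a": 1, "snp_b": 1},
]
toy_ages = [40, 42, 60, 62]
feature, gain = best_split(toy_samples, toy_ages)
```

A full tree applies the same search recursively within each child node, which is how subgroups defined by combinations of genotypes arise and why successive splits can reveal gene-gene interactions.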
We found that the subgroup with CDKN2A C580T wild-type genotype, IGF1 CA-repeats ≥19, E2F2 variant genotype, AURKA wild-type genotype, and CCND1 variant genotype had the youngest age of onset, with a 45-year median onset age. In contrast, the subgroup with CDKN2A C580T wild-type genotype, IGF1 CA-repeats ≥19, E2F2 wild-type genotype and AURKA variant genotype had the latest median age of onset, 70 years. Furthermore, we found evidence of a possible gene-gene interaction between the E2F2 and AURKA genes related to CRC age of onset.
Polymorphisms in these cell-cycle related genes work together to modify the age at onset of CRC in patients with Lynch syndrome. These studies provide an important part of the foundation for development of a model for stratifying age of onset risk among those with Lynch syndrome.
Tree model; cell cycle pathway; Polymorphisms; Lynch syndrome; Age of onset
Genome-wide case–control studies have been widely used to identify genetic variants that predispose to human diseases. Such studies are powerful in detecting common genetic variants with moderate effects, but quickly lose power as allele frequency and genotype relative risk decrease. Because patients with one or more affected relatives are more likely to inherit disease-predisposing alleles of a genetic disease than patients without family histories of the disease, sampling patients with affected relatives almost always increases the frequency of disease predisposing alleles in cases and improves the power of case–control association studies. This paper evaluates the power of case–control studies that select cases and/or controls according to their family histories of disease. Our results showed that this study design can dramatically increase the power of a case–control association study for a wide range of disease types. Because each additional affected relative of a patient reduces the required sample size roughly by a pair of case and control, inclusion of cases with affected relatives can dramatically decrease the required sample size and thus the cost of such studies.
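The enrichment of predisposing alleles among cases with affected relatives can be illustrated by direct enumeration. The Python sketch below computes the expected risk-allele frequency in cases who also have a given number of affected siblings. It is a minimal illustration, not the model used in the paper, and it assumes a single biallelic locus, random mating with Hardy-Weinberg proportions in parents, a multiplicative genotype relative risk, and no other shared familial risk factors. The function names and parameter values are hypothetical.

```python
def hardy_weinberg(p):
    """Genotype frequencies for 0, 1, 2 copies of a risk allele with frequency p."""
    return [(1 - p) ** 2, 2 * p * (1 - p), p ** 2]

def child_genotype_dist(father, mother):
    """Mendelian distribution of a child's risk-allele count given parental counts."""
    tf, tm = father / 2, mother / 2        # transmission probabilities
    return [(1 - tf) * (1 - tm), tf * (1 - tm) + (1 - tf) * tm, tf * tm]

def case_allele_freq(p, rel_risk, n_affected_sibs=0):
    """Risk-allele frequency in cases who have n_affected_sibs affected siblings."""
    parents = hardy_weinberg(p)
    # Relative penetrance per genotype; the baseline risk cancels on normalizing.
    penetrance = [rel_risk ** g for g in range(3)]
    weights = [0.0, 0.0, 0.0]
    for gf, pf in enumerate(parents):
        for gm, pm in enumerate(parents):
            kids = child_genotype_dist(gf, gm)
            # Relative probability that any one sibling in this family is affected.
            sib_affected = sum(k * f for k, f in zip(kids, penetrance))
            for g in range(3):
                weights[g] += (pf * pm * kids[g] * penetrance[g]
                               * sib_affected ** n_affected_sibs)
    total = sum(weights)
    return sum((g / 2) * w for g, w in enumerate(weights)) / total

# Illustrative parameters only: allele frequency 0.1, relative risk 1.5 per allele.
for n_sibs in (0, 1, 2):
    print(n_sibs, round(case_allele_freq(0.1, 1.5, n_sibs), 4))
```

Under these assumptions the frequency with no affected siblings reduces to p·r / (p·r + 1 − p), and each additional affected sibling raises it further, which is what widens the case-control difference and lowers the required sample size.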
Heterogeneity in age of onset of colorectal cancer in individuals with mutations in DNA mismatch repair genes (Lynch syndrome) suggests the influence of other lifestyle and genetic modifiers. We hypothesized that genes regulating the cell cycle influence the observed heterogeneity as cell cycle–related genes respond to DNA damage by arresting the cell cycle to provide time for repair and induce transcription of genes that facilitate repair. We examined the association of 1456 single nucleotide polymorphisms (SNPs) in 128 cell cycle–related genes and 31 DNA repair–related genes in 485 non-Hispanic white participants with Lynch syndrome to determine whether there are SNPs associated with age of onset of colorectal cancer. Genotyping was performed on an Illumina GoldenGate platform, and data were analyzed using Kaplan–Meier survival analysis, Cox regression analysis and classification and regression tree (CART) methods. Ten SNPs were independently significant in a multivariable Cox proportional hazards regression model after correcting for multiple comparisons (P < 5×10–4). Furthermore, risk modeling using CART analysis defined combinations of genotypes for these SNPs with which subjects could be classified into low-risk, moderate-risk and high-risk groups that had median ages of colorectal cancer onset of 63, 50 and 42 years, respectively. The age-associated risk of colorectal cancer in the high-risk group was more than four times the risk in the low-risk group (hazard ratio = 4.67, 95% CI = 3.16–6.92). The additional genetic markers identified may help in refining risk groups for more tailored screening and follow-up of non-Hispanic white patients with Lynch syndrome.