Rhabdomyosarcoma (RMS) is a childhood cancer originating from skeletal muscle, and patient survival is poor in the presence of metastatic disease. Few determinants that regulate metastasis development have been identified. The receptor tyrosine kinase FGFR4 is highly expressed in RMS tissue, suggesting a role in tumorigenesis, although its functional importance has not been defined. Here, we report the identification of mutations in FGFR4 in human RMS tumors that lead to its activation and present evidence that it functions as an oncogene in RMS. Higher FGFR4 expression in RMS tumors was associated with advanced-stage cancer and poor survival, while FGFR4 knockdown in a human RMS cell line reduced tumor growth and experimental lung metastases when the cells were transplanted into mice. Moreover, 6 FGFR4 tyrosine kinase domain mutations were found among 7 of 94 (7.5%) primary human RMS tumors. The mutants K535 and E550 increased autophosphorylation, Stat3 signaling, tumor proliferation, and metastatic potential when expressed in a murine RMS cell line. These mutants also transformed NIH 3T3 cells and led to an enhanced metastatic phenotype. Finally, murine RMS cell lines expressing the K535 and E550 FGFR4 mutants were substantially more susceptible to apoptosis in the presence of a pharmacologic FGFR inhibitor than the control cell lines expressing the empty vector or wild-type FGFR4. Together, our results demonstrate that mutationally activated FGFR4 acts as an oncogene, and these are what we believe to be the first known mutations in a receptor tyrosine kinase in RMS. These findings support the potential therapeutic targeting of FGFR4 in RMS.
The DNA repair pathways help to maintain genomic integrity and therefore genetic variation in the pathways could affect the propensity to develop cancer. Selected germline single nucleotide polymorphisms (SNPs) in the pathways have been associated with esophageal cancer and gastric cancer (GC) but few studies have comprehensively examined the pathway genes. We aimed to investigate associations between DNA repair pathway genes and risk of esophageal squamous cell carcinoma (ESCC) and GC, using data from a genome-wide association study in a Han Chinese population where ESCC and GC are the predominant cancers. In sum, 1942 ESCC cases, 1758 GC cases and 2111 controls from the Shanxi Upper Gastrointestinal Cancer Genetics Project (discovery set) and the Linxian Nutrition Intervention Trials (replication set) were genotyped for 1675 SNPs in 170 DNA repair-related genes. Logistic regression models were applied to evaluate SNP-level associations. Gene- and pathway-level associations were determined using the resampling-based adaptive rank-truncated product approach. The DNA repair pathways overall were significantly associated with risk of ESCC (P = 6.37 × 10−4), but not with GC (P = 0.20). The most significant gene in ESCC was CHEK2 (P = 2.00 × 10−6) and in GC was CLK2 (P = 3.02 × 10−4). We observed several other genes significantly associated with either ESCC (SMUG1, TDG, TP53, GTF2H3, FEN1, POLQ, HEL308, RAD54B, MPG, FANCE and BRCA1) or GC risk (MRE11A, RAD54L and POLE) (P < 0.05). We provide evidence for an association between specific genes in the DNA repair pathways and the risk of ESCC and GC. Further studies are warranted to validate these associations and to investigate underlying mechanisms.
Chemokines play a pivotal role in immune regulation and response, and
previous studies suggest an association between immune deficiency and
non-Hodgkin lymphoma (NHL).
We evaluated the association between NHL and polymorphisms in 18
genes (CCL1, CCL2, CCL5, CCL7, CCL8, CCL11, CCL13, CCL18, CCL20,
CCL24, CCL26, CCR1, CCR3, CCR4, CCR6, CCR7, CCR8 and CCR9)
encoding for the CC chemokines using data from a population-based
case-control study of NHL conducted in Connecticut women.
CCR8 was associated with diffuse large B-cell
lymphoma (DLBCL) (p = 0.012) and CCL13 was
associated with chronic lymphocytic leukemia or small lymphocytic lymphoma
(CLL/SLL) (p = 0.003) at gene level. After adjustment for
multiple comparisons, none of the genes or SNPs were associated with risk of
overall NHL or NHL subtypes.
Our results suggest that the genes encoding for CC chemokines are not
significantly associated with the risk of NHL, and further studies are
needed to verify these findings.
Our data indicate that CC chemokine genes were not associated with NHL risk.
Non-Hodgkin lymphoma; CC chemokine gene; Single nucleotide polymorphism
The National Cancer Institute’s NCI-60 cell line panel, the most extensively characterized set of cells in existence and a public resource, is frequently used as a screening tool for drug discovery. Since many laboratories around the world rely on data from the NCI-60 cells, confirmation of their genetic identities represents an essential step in validating results from them. Given the consequences of cell line contamination or misidentification, quality control measures should routinely include DNA fingerprinting. We have, therefore, used standard DNA microsatellite short tandem repeats to profile the NCI-60, and the resulting DNA fingerprints are provided here as a reference. Consistent with previous reports, the fingerprints suggest that several NCI-60 lines have common origins: the melanoma lines MDA-MB-435, MDA-N, and M14; the central nervous system lines U251 and SNB-19; the ovarian lines OVCAR-8 and OVCAR-8/ADR (also called NCI/ADR); and the prostate lines DU-145, DU-145 (ATCC), and RC0.1. Those lines also demonstrate that the ability to connect two fingerprints to the same origin is not affected by stable transfection or by the development of multidrug resistance. As expected, DNA fingerprints were not able to distinguish different tissues-of-origin. The fingerprints serve principally as barcodes.
DNA fingerprinting; NCI-60; cell contamination
In China, esophageal cancer is the fourth leading cause of cancer death, and essentially all cases are histologically esophageal squamous cell carcinoma (ESCC), in contrast to esophageal adenocarcinoma in the West. Globally, ESCC is 2.4 times more common among men than women, and it has recently been suggested that sex hormones may be associated with the risk of ESCC. We examined the association between genetic variants in sex hormone metabolic genes and ESCC risk in a population from north central China with high incidence rates. A total of 1026 ESCC cases and 1452 controls were genotyped for 797 unique tag single-nucleotide polymorphisms (SNPs) in 51 sex hormone metabolic genes. SNP-, gene- and pathway-based associations with ESCC risk were evaluated using unconditional logistic regression adjusted for age, sex and geographical location and the adaptive rank truncated product (ARTP) method. Statistical significance was determined through use of permutation for pathway- and gene-based associations. No associations were observed for the overall sex hormone metabolic pathway (P = 0.14) or subpathways (androgen synthesis: P = 0.30, estrogen synthesis: P = 0.15 and estrogen removal: P = 0.19) with risk of ESCC. However, six individual genes (including SULT2B1, CYP1B1, CYP3A7, CYP3A5, SHBG and CYP11A1) were significantly associated with ESCC risk (P < 0.05). Our examination of genetic variation in the sex hormone metabolic pathway is consistent with a potential association with risk of ESCC. These positive findings warrant further evaluation in relation to ESCC risk and replication in other populations.
Genotype imputation substantially increases available markers for analysis in genome-wide association studies (GWAS) by leveraging linkage disequilibrium from a reference panel. We sought to (i) investigate the performance of imputation from the August 2010 release of the 1000 Genomes Project (1000GP) in an existing GWAS of prostate cancer, (ii) look for novel associations with prostate cancer risk, (iii) fine-map known prostate cancer susceptibility regions using an approximate Bayesian framework and stepwise regression, and (iv) compare power and efficiency of imputation and de novo sequencing.
We used 2,782 aggressive prostate cancer cases and 4,458 controls from the NCI Breast and Prostate Cancer Cohort Consortium aggressive prostate cancer GWAS to infer 5.8 million well-imputed autosomal single nucleotide polymorphisms.
Imputation quality, as measured by correlation between imputed and true allele counts, was higher among common variants than rare variants. We found no novel prostate cancer associations among a subset of 1.2 million well-imputed low-frequency variants. At a genome-wide sequencing cost of $2,500, imputation from SNP arrays is a more powerful strategy than sequencing for detecting disease associations of SNPs with minor allele frequencies above 1%.
1000GP imputation provided dense coverage of previously-identified prostate cancer susceptibility regions, highlighting its potential as an inexpensive first-pass approach to fine-mapping in regions such as 5p15 and 8q24. Our study shows 1000GP imputation can accurately identify low-frequency variants and stresses the importance of large sample size when studying these variants.
rare variants; association; fine mapping
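The imputation-quality measure mentioned above (correlation between imputed and true allele counts) can be illustrated with a minimal sketch. It assumes per-individual imputed dosages in the range 0 to 2 and directly genotyped allele counts for one variant; the function name `dosage_r2` and the example values are hypothetical and are not taken from the study.

```python
import math


def dosage_r2(imputed_dosages, true_counts):
    """Squared Pearson correlation between imputed dosages and true allele counts.

    Returns NaN when either vector has no variation (e.g. a monomorphic variant),
    because the correlation is undefined in that case.
    """
    n = len(true_counts)
    mean_x = sum(imputed_dosages) / n
    mean_y = sum(true_counts) / n
    sxx = sum((x - mean_x) ** 2 for x in imputed_dosages)
    syy = sum((y - mean_y) ** 2 for y in true_counts)
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(imputed_dosages, true_counts))
    if sxx == 0 or syy == 0:
        return float("nan")
    r = sxy / math.sqrt(sxx * syy)
    return r * r


# Hypothetical dosages for four individuals at one variant.
print(dosage_r2([0.1, 0.9, 1.8, 1.2], [0, 1, 2, 1]))
```

For rare variants the true allele counts have very little variance, so a handful of miscalled carriers can lower this measure sharply, which is one way to read the lower quality reported for rare variants.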
Genes that alter disease risk only in combination with certain
environmental exposures may not be detected in genetic association analysis. By
using methods accounting for gene-environment (G × E) interaction, we
aimed to identify novel genetic loci associated with breast cancer risk. Up to
34,475 cases and 34,786 controls of European ancestry from up to 23 studies in
the Breast Cancer Association Consortium were included. Overall, 71,527 single
nucleotide polymorphisms (SNPs), enriched for association with breast cancer,
were tested for interaction with 10 environmental risk factors using three
recently proposed hybrid methods and a joint test of association and
interaction. Analyses were adjusted for age, study, population stratification,
and confounding factors as applicable. Three SNPs in two independent loci showed
statistically significant association: SNPs rs10483028 and rs2242714 in perfect
linkage disequilibrium on chromosome 21 and rs12197388 in ARID1B on chromosome
6. While rs12197388 was identified using the joint test with parity and with age
at menarche (P-values = 3 × 10−07),
the variants on chromosome 21 q22.12, which showed interaction with adult body
mass index (BMI) in 8,891 postmenopausal women, were identified by all methods
applied. SNP rs10483028 was associated with breast cancer in women with a BMI
below 25 kg/m2 (OR = 1.26, 95% CI 1.15–1.38) but not in women
with a BMI of 30 kg/m2 or higher (OR = 0.89, 95% CI 0.72–1.11,
P for interaction = 3.2 × 10−05).
Our findings confirm comparable power of the recent methods for detecting G
× E interaction and the utility of using G × E interaction
analyses to identify new susceptibility loci.
breast cancer risk; gene-environment interaction; polymorphisms; body mass index; case-control study
The 13th International Meeting on Human Genome Variation and Complex Genome Analysis (HGV2012: Shanghai, China, 6th–8th September 2012) was a stimulating workshop where researchers from academia and industry explored the latest progress, challenges, and opportunities in genome variation research. Key themes included advancements in next-generation sequencing (NGS) technology, investigation of common and rare diseases, employing NGS in the clinic, utilizing large datasets that leverage biobanks and population-specific cohorts, and exploration of genomic features.
variation; SNP; GWAS; next generation sequencing; NGS; inherited disease
In 2012, the National Cancer Institute (NCI) engaged the scientific community to provide a vision for cancer epidemiology in the 21st century. Eight overarching thematic recommendations, with proposed corresponding actions for consideration by funding agencies, professional societies, and the research community emerged from the collective intellectual discourse. The themes are (i) extending the reach of epidemiology beyond discovery and etiologic research to include multilevel analysis, intervention evaluation, implementation, and outcomes research; (ii) transforming the practice of epidemiology by moving towards more access and sharing of protocols, data, metadata, and specimens to foster collaboration, to ensure reproducibility and replication, and accelerate translation; (iii) expanding cohort studies to collect exposure, clinical and other information across the life course and examining multiple health-related endpoints; (iv) developing and validating reliable methods and technologies to quantify exposures and outcomes on a massive scale, and to assess concomitantly the role of multiple factors in complex diseases; (v) integrating “big data” science into the practice of epidemiology; (vi) expanding knowledge integration to drive research, policy and practice; (vii) transforming training of 21st century epidemiologists to address interdisciplinary and translational research; and (viii) optimizing the use of resources and infrastructure for epidemiologic studies. These recommendations can transform cancer epidemiology and the field of epidemiology in general, by enhancing transparency, interdisciplinary collaboration, and strategic applications of new technologies. They should lay a strong scientific foundation for accelerated translation of scientific discoveries into individual and population health benefits.
big data; clinical trials; cohort studies; epidemiology; genomics; medicine; public health; technologies; training; translational research
BACKGROUND & AIMS
Heritable factors contribute to the development of colorectal cancer. Identifying the genetic loci associated with colorectal tumor formation could elucidate the mechanisms of pathogenesis.
We conducted a genome-wide association study that included 14 studies, 12,696 cases of colorectal tumors (11,870 cancer, 826 adenoma), and 15,113 controls of European descent. The 10 most statistically significant, previously unreported findings were followed up in 6 studies; these included 3056 colorectal tumor cases (2098 cancer, 958 adenoma) and 6658 controls of European and Asian descent.
Based on the combined analysis, we identified a locus that reached the conventional genome-wide significance level of P less than 5.0 × 10−8: an intergenic region on chromosome 2q32.3, close to nucleic acid binding protein 1 (most significant single nucleotide polymorphism: rs11903757; odds ratio [OR], 1.15 per risk allele; P = 3.7 × 10−8). We also found evidence for 3 additional loci with P values less than 5.0 × 10−7: a locus within the laminin gamma 1 gene on chromosome 1q25.3 (rs10911251; OR, 1.10 per risk allele; P = 9.5 × 10−8), a locus within the cyclin D2 gene on chromosome 12p13.32 (rs3217810; OR, 0.84 per risk allele; P = 5.9 × 10−8), and a locus in the T-box 3 gene on chromosome 12q24.21 (rs59336; OR, 0.91 per risk allele; P = 3.7 × 10−7).
In a large genome-wide association study, we associated polymorphisms close to nucleic acid binding protein 1 (which encodes a DNA-binding protein involved in DNA repair) with colorectal tumor risk. We also provided evidence for an association between colorectal tumor risk and polymorphisms in laminin gamma 1 (the second gene in the laminin family to be associated with colorectal cancers), cyclin D2 (which encodes cyclin D2), and T-box 3 (which encodes a T-box transcription factor and is a target of Wnt signaling to β-catenin). The roles of these genes and their products in cancer pathogenesis warrant further investigation.
Colon Cancer; Genetics; Risk Factors; SNP
Solvent exposure has been inconsistently linked to the risk for non-Hodgkin lymphoma (NHL). The aim of this study was to determine whether the association is modified by genetic variation in immune genes. A population-based case–control study involving 601 incident cases of NHL and 717 controls was carried out in 1996–2000 among women from Connecticut. Thirty single nucleotide polymorphisms in 17 immune genes were examined in relation to the associations between exposure to various solvents and the risk for NHL. The study found that a polymorphism in interleukin 10 (IL10; rs1800890) modified the association between occupational exposure to organic solvents and the risk for diffuse large B-cell lymphoma (P for interaction = 0.0058). The results remained statistically significant after adjustment for false discovery rate. Compared with women who were never occupationally exposed to any organic solvents, women who were exposed to organic solvents at least once had a significantly increased risk for diffuse large B-cell lymphoma if they carried the IL10 (rs1800890) TT genotype (odds ratio = 3.31, 95% confidence interval: 1.80–6.08), but not if they carried the AT/AA genotype (odds ratio = 1.14, 95% confidence interval: 0.72–1.79). No significant interactions were observed for other immune gene single nucleotide polymorphisms and various solvents in relation to NHL overall and its major subtypes. The study provided preliminary evidence supporting a role of immune gene variations in modifying the association between occupational solvent exposure and the risk for NHL.
immune genes; non-Hodgkin lymphoma; occupational exposure; single nucleotide polymorphism; solvents
The Centre for Applied Genomics of the Hospital for Sick Children and the University of Toronto hosted the 10th Human Genome Variation (HGV) Meeting in Toronto, Canada, in October 2008, welcoming about 240 registrants from 34 countries. During the 3 days of plenary workshops, keynote address, and poster sessions, a strong cross-disciplinary trend was evident, integrating expertise from technology and computation, through biology and medicine, to ethics and law. Single nucleotide polymorphisms (SNPs) as well as the larger copy number variants (CNVs) are recognized by ever-improving array and next-generation sequencing technologies, and the data are being incorporated into studies that are increasingly genome-wide as well as global in scope. A greater challenge is to convert data to information, through databases, and to use the information for greater understanding of human variation. In the wake of publications of the first individual genome sequences, an inaugural public forum provided the opportunity to debate whether we are ready for personalized medicine through direct-to-consumer testing. The HGV meetings foster collaboration, and fruits of the interactions from 2008 are anticipated for the 11th annual meeting in September 2009.
SNP; CNV; GWAS; personalized medicine
The 11th International Meeting on Human Genome Variation and Complex Genome Analysis (HGV2009: Tallinn, Estonia, 11th–13th September 2009) provided a stimulating workshop environment where diverse academics and industry representatives explored the latest progress, challenges, and opportunities in relating genome variation to evolution, technology, health, and disease. Key themes included Genome-Wide Association Studies (GWAS), progress beyond GWAS, sequencing developments, and bioinformatics approaches to large-scale datasets.
HGV2009; SNP; variation; GWAS; CNV
The 12th International Meeting on Human Genome Variation and Complex Genome Analysis (HGV2011: Berkeley, California, USA, 8th–10th September 2011) was a stimulating workshop where researchers from academia and industry explored the latest progress, challenges, and opportunities in genome variation research. Key themes included progress beyond GWAS, variation in human populations, use of sequence data in medical settings, large-scale sequencing data analysis, and bioinformatics approaches to large datasets.
human variation; GWAS; SNP; medical genomics
Epidemiologic studies have shown consistent associations between obesity and increased thyroid cancer risk, but, to date, no studies have investigated the relationship between thyroid cancer risk and obesity-related single nucleotide polymorphisms (SNPs).
We evaluated 575 tag SNPs in 23 obesity-related gene regions in a case-control study of 341 incident papillary thyroid cancer (PTC) cases and 444 controls of European ancestry. Logistic regression models, adjusted for attained age, year of birth, and sex were used to calculate odds ratios (ORs) and 95% confidence intervals (CIs) with SNP genotypes, coded as 0, 1, and 2 and modeled continuously to calculate P-trends.
Nine out of 10 top-ranking SNPs (Ptrend<0.01) were located in the FTO (fat mass and obesity associated) gene region, while the other was located in INSR (insulin receptor). None of the associations were significant after correcting for multiple testing.
Our data do not support an important role of obesity-related genetic polymorphisms in determining the risk of PTC.
Factors other than selected genetic polymorphisms may be responsible for the observed associations between obesity and increased PTC risk.
single nucleotide polymorphisms; case-control study; obesity; body mass index; thyroid neoplasms
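The additive coding described above (genotypes coded 0, 1, and 2 and modeled continuously to obtain a P-trend) can be illustrated with a minimal sketch. It uses the Cochran-Armitage trend test, which is the unadjusted score-test counterpart of a logistic regression slope; it is not the covariate-adjusted model fitted in the study, and the function name `trend_test` and the genotype counts are hypothetical.

```python
import math


def trend_test(case_counts, control_counts, scores=(0, 1, 2)):
    """Cochran-Armitage trend test on genotype counts under additive coding.

    case_counts / control_counts: numbers of subjects carrying 0, 1 and 2
    copies of the tested allele. Returns (chi-square with 1 df, P-trend).
    No covariate adjustment is performed in this sketch.
    """
    totals = [r + s for r, s in zip(case_counts, control_counts)]
    n = sum(totals)
    n_cases = sum(case_counts)
    n_controls = n - n_cases
    sum_rx = sum(r * x for r, x in zip(case_counts, scores))
    sum_nx = sum(t * x for t, x in zip(totals, scores))
    sum_nx2 = sum(t * x * x for t, x in zip(totals, scores))
    denom = n_cases * n_controls * (n * sum_nx2 - sum_nx ** 2)
    if denom == 0:
        # No cases, no controls, or no genotype variation: nothing to test.
        return 0.0, 1.0
    chi2 = n * (n * sum_rx - n_cases * sum_nx) ** 2 / denom
    # Upper tail of a chi-square distribution with 1 degree of freedom.
    p_trend = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_trend


# Hypothetical counts of 0/1/2 risk-allele carriers in cases and controls.
print(trend_test([10, 20, 30], [30, 20, 10]))
```

With hundreds of tag SNPs tested, a nominal P-trend below 0.01 is expected for several SNPs by chance alone, which is why the associations above were also judged after correction for multiple testing.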
There has been a long-standing controversy in epidemiology with regard to an appropriate risk scale for testing interactions between genes (G) and environmental exposure (E). Although interaction tests based on the logistic model (which approximates the multiplicative risk for rare diseases) have been more widely applied because of their convenience in statistical modeling, interactions under additive risk models have been regarded as closer to true biologic interactions and more useful in intervention-related decision-making processes in public health. It is well known that exploiting a natural assumption of G-E independence for the underlying population can dramatically increase statistical power for detecting multiplicative interactions in case-control studies. However, the implication of the independence assumption for tests for additive interaction has not been previously investigated. In this article, the authors develop a likelihood ratio test for detecting additive interactions for case-control studies that incorporates the G-E independence assumption. Numerical investigation of power suggests that incorporation of the independence assumption can enhance the efficiency of the test for additive interaction by 2- to 2.5-fold. The authors illustrate their method by applying it to data from a bladder cancer study.
additive risk model; case-control studies; gene-environment independence; gene-environment interaction; multiplicative risk model
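The distinction between the two risk scales discussed above can be made concrete with a small sketch. It computes the multiplicative interaction ratio and the relative excess risk due to interaction (RERI), a standard summary of departure from additivity, from three odds ratios that share the unexposed non-carrier group as reference. This is not the authors' likelihood ratio test; the function name `interaction_contrasts` and the odds ratios are hypothetical.

```python
def interaction_contrasts(or_g, or_e, or_ge):
    """Summarize G x E interaction on the multiplicative and additive scales.

    or_g:  odds ratio for carriers without the exposure
    or_e:  odds ratio for exposed non-carriers
    or_ge: odds ratio for exposed carriers
    All three use unexposed non-carriers as the reference group.
    """
    # Equals 1 when the joint effect is the product of the separate effects.
    multiplicative_ratio = or_ge / (or_g * or_e)
    # Equals 0 when the excess risks simply add up.
    reri = or_ge - or_g - or_e + 1.0
    return multiplicative_ratio, reri


# No multiplicative interaction, yet positive additive interaction.
print(interaction_contrasts(2.0, 3.0, 6.0))
# No additive interaction, yet a multiplicative ratio below 1.
print(interaction_contrasts(2.0, 3.0, 4.0))
```

The two calls show why the choice of scale matters: the same pattern of risks can show no interaction on one scale and a clear interaction on the other.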
Pulmonary inflammation may contribute to lung cancer etiology. We conducted a broad evaluation of the association of single nucleotide polymorphisms (SNPs) in innate immunity and inflammation pathways with lung cancer risk, and conducted comparisons with a lung cancer genome wide association study (GWAS).
We included 378 lung cancer cases and 450 controls from the Prostate, Lung, Colorectal and Ovarian (PLCO) Cancer Screening Trial. An Illumina GoldenGate oligonucleotide pool assay was used to genotype 1,429 SNPs. Odds ratios (ORs) and 95% confidence intervals (CIs) were estimated for each SNP, and p-values for trend were calculated. For statistically significant SNPs (p-trend<0.05), we replicated our results with genotyped or imputed SNPs in the GWAS, and adjusted p-values for multiple testing.
In our PLCO analysis, we observed a significant association between 81 SNPs located in 44 genes and lung cancer (p-trend<0.05). Of these 81 SNPs, there was evidence for confirmation in the GWAS for 10 SNPs. However, after adjusting for multiple comparisons, the only SNP that remained significantly associated with lung cancer in the replication phase was rs4648127 (NFKB1; multiple testing adjusted p-trend=0.02). The CT/TT genotype of NFKB1 was associated with reduced odds of lung cancer in the PLCO study (OR=0.56; 95% CI 0.37–0.86) and the GWAS (OR=0.79; 95% CI 0.69–0.90).
We found a significant association between a variant in the NFKB1 gene and lung cancer risk. Our findings add to evidence implicating inflammation and immunity in lung cancer etiology.
lung cancer; genetics; inflammation; immunity; epidemiology
There is growing evidence linking genetic variations to non–Hodgkin lymphoma (NHL) etiology. To complement ongoing agnostic approaches for identifying susceptibility genes, we evaluated 488 candidate gene regions and their relation to risk for NHL and NHL subtypes.
We genotyped 6,679 tag single nucleotide polymorphisms (SNPs) in 947 cases and 826 population-based controls from a multicenter U.S. case–control study. Gene-level summary of associations were obtained by computing the minimum P value (“minP test”) on the basis of 10,000 permutations. We used logistic regression to evaluate the association between genotypes and haplotypes with NHL. For NHL subtypes, we conducted polytomous multivariate unconditional logistic regression (adjusted for sex, race, age). We calculated P-trends under the codominant model for each SNP.
Fourteen gene regions were associated with NHL (P < 0.01). The most significant SNP associated with NHL maps to the SYK gene (rs2991216, P-trend = 0.00005). The three most significant gene regions were on chromosome 6p21.3 (RING1/RXRB; AIF1; BAT4). Accordingly, SNPs in RING1/RXRB (rs2855429), AIF1 (rs2857597), and BAT4 (rs3115667) were associated with NHL (P-trends ≤ 0.0002) and both diffuse large B-cell and follicular lymphomas (P-trends < 0.05).
Our results suggest potential importance for SYK on chromosome 9 with NHL etiology. Our results further implicate 6p21.3 gene variants, supporting the need for full characterization of this chromosomal region in relation to lymphomagenesis.
Gene variants on chromosome 9 may represent a new region of interest for NHL etiology. The independence of the reported variants in 6p21.3 from implicated variants (TNF/HLA) supports the need to confirm causal variants in this region.
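The gene-level "minP test" described above can be sketched as a label-permutation procedure. The sketch below assumes individual-level genotypes coded as 0/1/2 allele counts and uses an unadjusted 1-df trend chi-square per SNP; because every SNP test has the same degrees of freedom, the largest chi-square corresponds to the smallest P value. The function names, the default of 1,000 permutations, and the toy data are hypothetical, and the study's own implementation may differ.

```python
import random


def trend_chi2(genotypes, labels):
    """1-df trend chi-square for one SNP (genotypes 0/1/2, labels 1=case, 0=control)."""
    n = len(labels)
    n_cases = sum(labels)
    sum_x = sum(genotypes)
    sum_x2 = sum(g * g for g in genotypes)
    sum_case_x = sum(g for g, lab in zip(genotypes, labels) if lab == 1)
    denom = n_cases * (n - n_cases) * (n * sum_x2 - sum_x ** 2)
    if denom == 0:
        return 0.0  # monomorphic SNP or a single outcome group
    return n * (n * sum_case_x - n_cases * sum_x) ** 2 / denom


def min_p_gene_test(snp_genotypes, labels, n_perm=1000, seed=0):
    """Permutation P value for the best SNP in a gene region.

    snp_genotypes: one genotype list per SNP, all in the same subject order.
    Case-control labels are shuffled jointly across SNPs, which preserves the
    linkage disequilibrium between SNPs under the null hypothesis.
    """
    rng = random.Random(seed)
    observed = max(trend_chi2(g, labels) for g in snp_genotypes)
    shuffled = list(labels)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if max(trend_chi2(g, shuffled) for g in snp_genotypes) >= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly above zero.
    return (hits + 1) / (n_perm + 1)


# Toy gene with two SNPs: the first separates cases from controls, the second does not.
labels = [1] * 20 + [0] * 20
snps = [[2] * 20 + [0] * 20, [0, 1] * 20]
print(min_p_gene_test(snps, labels, n_perm=200, seed=1))
```

The resolution of the permutation P value is limited by the number of permutations, which is one reason a large number such as 10,000 is used when gene regions are ranked against a threshold like P < 0.01.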
To test the hypothesis that genetic variations in DNA repair genes may modify the association between occupational exposure to solvents and the risk of non-Hodgkin lymphoma (NHL).
A population-based case-control study was conducted in Connecticut women including 518 histologically confirmed incident NHL cases and 597 controls. Unconditional logistic regression models were used to estimate odds ratios (ORs) and to assess whether 30 SNPs in 16 DNA repair genes modified the association between solvent exposure and risk of NHL overall and by subtype.
SNPs in MGMT (rs12917) and NBS1 (rs1805794) significantly modified the association between exposure to chlorinated solvents and NHL risk (P for interaction = 0.0003 and 0.0048, respectively). After stratification by major NHL histological subtypes, MGMT (rs12917) modified the association between chlorinated solvents and risk of diffuse large B-cell lymphoma (P for interaction = 0.0027) and follicular lymphoma (P for interaction = 0.0024). A significant interaction was also observed between occupational exposure to benzene and BRCA2 (rs144848) for NHL overall (P for interaction = 0.0042).
Our study results suggest that genetic variations in DNA repair genes modify the association between occupational exposure to solvents and risk of NHL.
Non-Hodgkin Lymphoma; Occupational Exposure; Solvents; Single Nucleotide Polymorphism; DNA Repair Genes
Previous studies have examined the association between ABO blood group and ovarian cancer risk, with inconclusive results.
In 8 studies participating in the Ovarian Cancer Association Consortium (OCAC), we determined ABO blood groups and diplotypes by genotyping 3 SNPs in the ABO locus. Odds ratios (ORs) and 95% confidence intervals (CIs) were calculated in each study using logistic regression; individual study results were combined using random effects meta-analysis.
Compared to blood group O, the A blood group was associated with a modestly increased ovarian cancer risk (OR: 1.09; 95% CI: 1.01–1.18; p=0.03). In diplotype analysis, the AO, but not the AA, diplotype was associated with increased risk (AO: OR: 1.11; 95% CI: 1.01–1.22; p=0.03; AA: OR: 1.03; 95% CI: 0.87–1.21; p=0.76). Neither the AB nor the B blood group was associated with risk. Results were similar across ovarian cancer histologic subtypes.
Consistent with most previous reports, the A blood type was associated modestly with increased ovarian cancer risk in this large analysis of multiple studies of ovarian cancer. Future studies investigating potential biologic mechanisms are warranted.
ovarian cancer; ABO blood group; Ovarian Cancer Association Consortium (OCAC); genetic epidemiology
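The pooling step described above (study-specific odds ratios combined by random effects meta-analysis) can be sketched with the DerSimonian-Laird estimator, a common choice for this purpose. The abstract does not state which estimator was used, so that choice is an assumption here, as are the function name `dersimonian_laird` and the three input studies, which are invented for illustration.

```python
import math

Z_975 = 1.959963984540054  # 97.5th percentile of the standard normal distribution


def dersimonian_laird(odds_ratios, ci_lowers, ci_uppers):
    """Pool study-level odds ratios with a DerSimonian-Laird random effects model.

    Standard errors of the log odds ratios are recovered from the 95% CIs.
    """
    y = [math.log(o) for o in odds_ratios]
    v = [((math.log(u) - math.log(l)) / (2 * Z_975)) ** 2
         for l, u in zip(ci_lowers, ci_uppers)]
    w = [1.0 / vi for vi in v]
    sum_w = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, y)) / sum_w
    # Cochran's Q measures heterogeneity around the fixed-effect estimate.
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, y))
    c = sum_w - sum(wi ** 2 for wi in w) / sum_w
    tau2 = max(0.0, (q - (len(y) - 1)) / c) if c > 0 else 0.0
    # Random effects weights add the between-study variance to each study.
    w_star = [1.0 / (vi + tau2) for vi in v]
    pooled = sum(wi * yi for wi, yi in zip(w_star, y)) / sum(w_star)
    se = math.sqrt(1.0 / sum(w_star))
    return {
        "pooled_or": math.exp(pooled),
        "ci_lower": math.exp(pooled - Z_975 * se),
        "ci_upper": math.exp(pooled + Z_975 * se),
        "tau2": tau2,
    }


# Three hypothetical studies (OR with 95% CI limits).
print(dersimonian_laird([1.05, 1.20, 0.98], [0.90, 1.00, 0.80], [1.22, 1.44, 1.20]))
```

When the estimated between-study variance is zero, the random effects result coincides with the fixed-effect (inverse-variance) result.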
We report a new model to project the predictive performance of polygenic models based on the number and distribution of effect sizes for the underlying susceptibility alleles and the size of the training dataset. Using estimates of effect-size distribution and heritability derived from current studies, we project that while 45% of the variance of height has been attributed to common tagging single nucleotide polymorphisms (SNPs), a model trained on one million people may only explain 33.4% of variance of the trait. Current studies can identify 3.0%, 1.1%, and 7.0% of the populations who are at two-fold or higher than average risk for Type 2 diabetes, coronary artery disease and prostate cancer, respectively. Tripling of sample sizes could elevate the percentages to 18.8%, 6.1%, and 12.2%, respectively. The utility of future polygenic models will depend on achievable sample sizes, underlying genetic architecture and information on other risk-factors, including family history.
Copy number variants (CNVs) can be called from SNP arrays; however, few studies have attempted to combine both CNV and SNP calls to test for association with complex diseases. Even when SNPs are located within CNVs, two separate association analyses are necessary, to compare the distribution of bi-allelic genotypes in cases and controls (referred to as the SNP-only strategy) and the number of copies of a region (referred to as the CNV-only strategy). However, when disease susceptibility is actually associated with allele-specific copy-number states, the two strategies may not yield comparable results, raising a series of questions about the optimal analytical approach. We performed simulations of the performance of association testing under different scenarios that varied genotype frequencies and inheritance models. We show that the SNP-only strategy lacks power under most scenarios when the SNP is located within a CNV; frequently the SNP is excluded from analysis because it does not pass quality control metrics, owing either to an increased rate of missing calls or to a departure from Hardy-Weinberg proportions. The CNV-only strategy also lacks power because the association testing depends on the allele whose copy number varies. The combined strategy performs well in most of the scenarios. Hence, we advocate the use of this combined strategy when testing for association with SNPs located within CNVs.
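The Hardy-Weinberg quality control check mentioned above can be sketched as a 1-df chi-square goodness-of-fit test on bi-allelic genotype counts. This is a generic textbook version rather than the specific filter used in the simulations, and the function name `hwe_chi_square` and the counts are hypothetical. A SNP inside a deletion tends to show an apparent deficit of heterozygotes, because hemizygous carriers are called as homozygotes.

```python
import math


def hwe_chi_square(n_aa, n_ab, n_bb):
    """Chi-square test for departure from Hardy-Weinberg proportions.

    n_aa, n_ab, n_bb: observed genotype counts for a bi-allelic SNP
    (at least one genotyped subject is assumed).
    Returns (chi-square with 1 df, P value).
    """
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of allele A
    q = 1.0 - p
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    # Upper tail of a chi-square distribution with 1 degree of freedom.
    return chi2, math.erfc(math.sqrt(chi2 / 2.0))


# Hypothetical counts with a deficit of heterozygotes.
print(hwe_chi_square(40, 20, 40))
```

A SNP flagged by this test would be dropped under the SNP-only strategy, which is one route by which that strategy loses power for SNPs inside CNVs.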
Recent studies have identified common genetic variants that are unequivocally associated with central adiposity, BMI, and/or fasting plasma glucose among individuals of European descent. Our objective was to evaluate these associations in a population of Asian-Indians. We examined 16 single-nucleotide polymorphisms (SNPs) from loci previously linked to waist circumference, BMI, or fasting glucose in 1,129 Asian-Indians from New Delhi and Trivandrum. Trained medical staff measured waist circumference, height, and weight. Fasting plasma glucose was measured from collected blood specimens. Genotype–phenotype associations were evaluated using linear regression, with adjustments for age, gender, religion, and study region. For gene–environment interaction tests, total physical activity (PA) during the past 7 days was assessed by the International Physical Activity Questionnaire (IPAQ). The T allele at the FTO rs3751812 locus was associated with increased waist circumference (per allele effect of +1.58 cm, Ptrend = 0.0015) after Bonferroni adjustment for multiple testing (Padj = 0.04). We also found a nominally statistically significant FTO–PA interaction (Pinteraction = 0.008). Among participants with <81 metabolic equivalent (MET)-h/wk of PA, the rs3751812 variant was associated with increased waist size (+2.68 cm; 95% confidence interval (CI) = 1.24, 4.12), but not among those with 212+ MET-h/wk (−1.79 cm; 95% CI = −4.17, 0.58). No other variant had statistically significant associations, although statistical power was modest. In conclusion, we confirmed that an FTO variant associated with central adiposity in European populations is associated with central adiposity among Asian-Indians and corroborated prior reports indicating that high PA attenuates FTO-related genetic susceptibility to adiposity.
We show how to use reports of cancer in family members to discover additional genetic associations or confirm previous findings in genome-wide association (GWA) studies conducted in case-control, cohort, or cross-sectional studies. Our novel family-history-based approach allows economical association studies for multiple cancers, without genotyping of relatives (as required in family studies), follow-up of participants (as required in cohort studies), or oversampling of specific cancer cases (as required in case-control studies). We empirically evaluate the performance of the proposed family-history-based approach in studying associations with prostate and ovarian cancers, using data from GWA studies previously conducted within the Prostate, Lung, Colorectal, and Ovarian Cancer Screening Trial. The family-history-based method may be particularly useful for investigating genetic susceptibility to rare diseases, for which accruing cases may be very difficult, by using disease information from non-genotyped relatives of participants in multiple case-control and cohort studies designed primarily for other purposes.