In this paper, we propose a novel mixed-effects model for longitudinal changes of systolic blood pressure (SBP) that can estimate the joint effect of multiple sequence variants on SBP after accounting for familial correlation and the time dependencies within individuals. First, we carried out a genome-wide association study (GWAS) using chromosome 3 single-nucleotide polymorphisms (SNPs) to identify regions associated with SBP levels. In a second step, we examined the sequence data to fine-map additional variants in these regions. Four SNPs from two intergenic regions (PLXNA1-TPRA1, BPESC1-PISTR1) and one gene (NLGN1) were significantly associated with SBP after adjusting for multiple testing. These SNPs were used to capture the multilocus genotype diversity in the regions. The multilocus genotypes derived from these four variants were then treated as random effects in the mixed-effects model, and the corresponding confidence intervals (CIs) were built to assess the significance of the joint effect of the sequence variants on SBP. We found that multilocus genotypes (GG,TT,AG,GG), (GG,TT,GG,GG), and (GG,TT,AA,AG) are associated with higher SBP, and (GG,CT,AA,AA), (AA,TT,AA,AA), and (AG,CT,AA,AG) are associated with lower SBP. Linear mixed-effects models provide a powerful tool for GWAS and for the joint modeling of multilocus genotypes.
The hypothalamic-pituitary-adrenal (HPA) axis regulates stress responses and HPA dysfunction has been associated with several chronic diseases. Low birthweight may be associated with HPA dysfunction in later life, yet human studies are inconclusive. The primary study aim was to identify genetic variants associated with HPA axis function. A secondary aim was to evaluate if these variants modify the association between birthweight and HPA axis function in adolescents.
Morning fasted blood samples were collected from children of the Western Australia Pregnancy Cohort (Raine) at age 17 (n = 1077). Basal HPA axis function was assessed by total cortisol, corticosteroid binding globulin (CBG), and adrenocorticotropic hormone (ACTH). The associations between 124 tag single nucleotide polymorphisms (SNPs) within 16 HPA pathway candidate genes and each hormone were evaluated using multivariate linear regression and penalized linear regression analysis using the HyperLasso method.
The penalized regression analysis revealed one candidate gene SNP, rs11621961 in the CBG encoding gene (SERPINA6), significantly associated with total cortisol and CBG. No other candidate gene SNPs were significant after applying the penalty or adjusting for multiple comparisons; however, several SNPs approached significance. For example, rs907621 (p = 0.002) and rs3846326 (p = 0.003) in the mineralocorticoid receptor gene (NR3C2) were associated with ACTH, and SERPINA6 SNPs rs941601 (p = 0.004) and rs11622665 (p = 0.008) were associated with CBG. To further investigate our findings for SERPINA6, rare and common SNPs in the gene were imputed from the 1000 Genomes data; 8 SNPs across the gene were significantly associated with CBG levels after adjustment for multiple comparisons. Birthweight was not associated with any HPA outcome, and none of the gene-birthweight interactions were significant after adjustment for multiple comparisons.
Our study suggests that genetic variation in the SERPINA6 gene may be associated with altered CBG levels during adolescence. Replication of these findings is required.
While the number of established genetic variants associated with adult body mass index (BMI) is growing, the relationships between these variants and growth during childhood are yet to be fully characterised. We examined the association between validated adult BMI associated single nucleotide polymorphisms (SNPs) and growth trajectories across childhood. We investigated the timing of onset of the genetic effect and whether it was sex specific.
Children from the ALSPAC and Raine birth cohorts were used for analysis (n = 9,328). Genotype data from 32 adult BMI associated SNPs were investigated individually and as an allelic score. Linear mixed effects models with smoothing splines were used for longitudinal modelling of the growth parameters and measures of adiposity peak and rebound were derived.
The allelic score was associated with BMI growth throughout childhood, explaining 0.58% of the total variance in BMI in females and 0.44% in males. The allelic score was associated with higher BMI at the adiposity peak (females = 0.0163 kg/m2 per allele, males = 0.0123 kg/m2 per allele) and earlier age (-0.0362 years per allele in males and females) and higher BMI (0.0332 kg/m2 per allele in females and 0.0364 kg/m2 per allele in males) at the adiposity rebound. No gene:sex interactions were detected for BMI growth.
This study suggests that known adult genetic determinants of BMI have observable effects on growth from early childhood, and is consistent with the hypothesis that genetic determinants of adult susceptibility to obesity act from early childhood and develop over the life course.
DNA methylation plays an important role in carcinogenesis and is being recognized as a promising diagnostic and prognostic biomarker for a variety of malignancies, including prostate cancer (PCa). The human kallikrein-related peptidases (KLKs) have emerged as an important family of cancer biomarkers, with KLK3, which encodes Prostate Specific Antigen, being the most recognized. However, few studies have examined the epigenetic regulation of KLKs and its implications for PCa. To assess the biological effect of DNA methylation on KLK6 and KLK10 expression, we treated PC3 and 22RV1 PCa cells with a demethylating drug, 5-aza-2′-deoxycytidine, and observed increased expression of both KLKs, establishing that DNA methylation plays a role in regulating their expression. Subsequently, we quantified KLK6 and KLK10 DNA methylation levels in two independent cohorts of PCa patients who underwent radical prostatectomy in 2007–2011 (Cohort I, n = 150) and 1998–2001 (Cohort II, n = 124). In Cohort I, DNA methylation levels of both KLKs were significantly higher in cancerous tissue than in normal tissue. Further, we evaluated the relationship between DNA methylation and clinicopathological parameters. KLK6 DNA methylation was significantly associated with pathological stage only in Cohort I, while KLK10 DNA methylation was significantly associated with pathological stage in both cohorts. In Cohort II, low KLK10 DNA methylation was associated with biochemical recurrence in univariate and multivariate analyses. A similar trend for KLK6 DNA methylation was observed. The results suggest that KLK6 and KLK10 DNA methylation distinguishes organ-confined from locally invasive PCa and may have prognostic value.
biomarkers; epigenetics; kallikrein-related peptidases; prostate cancer; quantitative DNA methylation analysis
The timing of associations between common genetic variants and changes in growth patterns over childhood may provide insight into the development of obesity in later life. To address this question, it is important to define appropriate statistical models to allow for the detection of genetic effects influencing longitudinal childhood growth.
Methods and Results
Children from The Western Australian Pregnancy Cohort (Raine; n = 1,506) Study were genotyped at 17 genetic loci shown to be associated with childhood obesity (FTO, MC4R, TMEM18, GNPDA2, KCTD15, NEGR1, BDNF, ETV5, SEC16B, LYPLAL1, TFAP2B, MTCH2, BCDIN3D, NRXN3, SH2B1, MSRA) and an obesity-risk-allele score was calculated as the total number of ‘risk alleles’ possessed by each individual. To determine the statistical method that fits these data and has the ability to detect genetic differences in BMI growth profile, four methods were investigated: a linear mixed effects model, a linear mixed effects model with skew-t random errors, a semi-parametric linear mixed model and a non-linear mixed effects model. Of the four methods, the semi-parametric linear mixed model was the most efficient for modelling childhood growth to detect modest genetic effects in this cohort. Using this method, three of the 17 loci were significantly associated with BMI intercept or trajectory in females and four in males. Additionally, the obesity-risk-allele score was associated with increased average BMI (female: β = 0.0049, P = 0.0181; male: β = 0.0071, P = 0.0001) and rate of growth (female: β = 0.0012, P = 0.0006; male: β = 0.0008, P = 0.0068) throughout childhood.
Using statistical models appropriate to detect genetic variants, variations in adult obesity genes were associated with childhood growth. There were also differences between males and females. This study provides evidence of genetic effects that may identify individuals early in life that are more likely to rapidly increase their BMI through childhood, which provides some insight into the biology of childhood growth.
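As a minimal sketch of the obesity-risk-allele score described above, assume each genotype is coded as the number of risk alleles carried at a locus (0, 1, or 2); the score is then the sum across loci. The function name, the coding, and the example genotypes are illustrative assumptions and are not taken from the study.

```python
def risk_allele_score(genotypes):
    """Sum risk-allele counts across loci for one individual.

    `genotypes` maps a locus name to the number of risk alleles carried
    (0, 1, or 2). This coding is an assumption made for illustration.
    """
    for locus, count in genotypes.items():
        if count not in (0, 1, 2):
            raise ValueError(f"{locus}: expected 0, 1, or 2 risk alleles, got {count!r}")
    return sum(genotypes.values())


# Hypothetical individual genotyped at three of the loci named in the text.
child = {"FTO": 2, "MC4R": 1, "TMEM18": 0}
score = risk_allele_score(child)
print(score)
```

An unweighted count like this treats every locus as contributing equally; a weighted score would need per-locus effect sizes, which the text does not provide.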
Schizophrenia is one of the most common and complex neuropsychiatric disorders, to which both genetic and environmental factors contribute. Recently, it has been shown that NRG1-mediated ErbB4 signalling regulates many important cellular and molecular processes such as cellular growth, differentiation and death, particularly in myelin-producing cells, glia and neurons. Recent association studies have revealed genomic regions of NRG1 and ERBB4 that are significantly associated with risk of developing schizophrenia; however, findings have not been validated consistently across distinct populations. In this study, we aim to validate the previously identified regions and to discover novel haplotypes of NRG1 and ERBB4 using logistic regression models and Haploview analyses in three independent datasets from GWAS conducted on European subjects, namely, CATIE, GAIN and nonGAIN. We identified a significant 6-kb block in ERBB4 between chromosome locations 212,156,823 and 212,162,848 in the CATIE and GAIN datasets (p = 0.0206 and 0.0095, respectively). In NRG1, a significant 25-kb block, between 32,291,552 and 32,317,192, was associated with risk of schizophrenia in the CATIE, GAIN, and nonGAIN datasets (p = 0.0005, 0.0589, and 0.0143, respectively). Fine mapping and FastSNP analysis of genetic variation located within significantly associated regions indicated the presence of binding sites for several transcription factors such as SRY, SOX5, CEPB, and ETS1. In this study, we have discovered and validated haplotypes of ERBB4 and NRG1 in three independent European populations. These findings suggest that these haplotypes play an important role in the development of schizophrenia by affecting transcription factor binding affinity.
Pathway analysis has been proposed as a complement to single SNP analyses in GWAS. This study compared pathway analysis methods using two lung cancer GWAS data sets based on four studies: one a combined data set from Central Europe and Toronto (CETO); the other a combined data set from Germany and MD Anderson (GRMD). We searched the literature for pathway analysis methods that were widely used, representative of other methods, and had available software for performing analysis. We selected the programs EASE, which uses a modified Fisher's exact calculation to test for pathway associations; GenGen (a version of Gene Set Enrichment Analysis (GSEA)), which uses a Kolmogorov-Smirnov-like running sum statistic as the test statistic; and SLAT, which uses a p-value combination approach. We also included a modified version of the SUMSTAT method (mSUMSTAT), which tests for association by averaging χ2 statistics from genotype association tests. There were nearly 18,000 genes available for analysis, following mapping of more than 300,000 SNPs from each data set. These were mapped to 421 GO level 4 gene sets for pathway analysis. Among the methods designed to be robust to biases related to gene size and pathway SNP correlation (GenGen, mSUMSTAT and SLAT), the mSUMSTAT approach identified the most significant pathways (8 in CETO and 1 in GRMD). This included a highly plausible association for the acetylcholine receptor activity pathway in both CETO (FDR ≤ 0.001) and GRMD (FDR = 0.009), although two strong association signals at a single gene cluster (CHRNA3-CHRNA5-CHRNB4) drive this result, complicating its interpretation. Few other replicated associations were found using any of these methods. Difficulty in replicating associations hindered our comparison, but results suggest mSUMSTAT has advantages over the other approaches, and may be a useful pathway analysis tool to use alongside other methods such as the commonly used GSEA (GenGen) approach.
A common feature of neoplastic cells is that mutations in SMADs can contribute to the loss of sensitivity to the anti-tumor effects of transforming growth factor-β (TGF-β). However, germline mutation analysis of SMAD3 and SMAD4, the principal substrates of the TGF-β signaling pathway, has not yet been conducted in breast cancer. Thus, it is currently unknown whether germline SMAD3 and SMAD4 mutations are involved in breast cancer predisposition.
We performed mutation analysis of the highly conserved mad-homology 2 (MH2) domains for both genes in genomic DNA from 408 non-BRCA1/BRCA2 breast cancer cases and 710 population controls recruited by the Ontario site of the breast cancer family registry (CFR) using denaturing high-performance liquid chromatography (DHPLC) and direct DNA sequencing. The results were interpreted in several ways. First, we adapted nucleotide diversity analysis to quantitatively assess whether the frequency of alterations differs between the two genes. Next, in silico tools were used to predict the variants' effects on domain function and mRNA splicing. Finally, 37 cases or controls harboring alterations were tested for aberrant splicing using reverse-transcription polymerase chain reaction (PCR) and real-time PCR, with statistical comparison of germline expression by the non-parametric Mann-Whitney test for independent samples.
We identified 27 variants including 2 novel SMAD4 coding variants c.1350G > A (p.Gln450Gln), and c.1701A > G (p.Ile525Val). There were no inactivating mutations even though c.1350G > A was predicted to affect exonic splicing enhancers. However, several additional findings were of note: 1) nucleotide diversity estimate for SMAD3 but not SMAD4 indicated that coding variants of the MH2 domain were more infrequent than expected; 2) in breast cancer cases SMAD3 was significantly over-expressed relative to controls (P < 0.05) while the case harboring SMAD4 c.1350G > A was associated with elevated germline expression (> 5-fold); 3) separate analysis using tissue expression data showed statistically significant over-expression of SMAD3 and SMAD4 in breast carcinomas.
This study shows that inactivating germline alterations in SMAD3 and SMAD4 are rare, suggesting a limited role in driving tumorigenesis. Nevertheless, aberrant germline expressions of SMAD3 and SMAD4 may be more common in breast cancer than previously suspected and offer novel insight into their roles in predisposition and/or progression of breast cancer.
Simvastatin and lovastatin are statins traditionally used for lowering serum cholesterol levels. However, there exists evidence indicating their potential chemotherapeutic characteristics in cancer. In this study, we used bioinformatic analysis of publicly available data in order to systematically identify the genes involved in resistance to cytotoxic effects of these two drugs in the NCI60 cell line panel. We used the pharmacological data available for all the NCI60 cell lines to classify simvastatin or lovastatin resistant and sensitive cell lines, respectively. Next, we performed whole-genome single marker case-control association tests for the lovastatin and simvastatin resistant and sensitive cells using their publicly available Affymetrix 125K SNP genomic data. The results were then evaluated using RNAi methodology. After correction of the p-values for multiple testing using False Discovery Rate, our results identified three genes (NRP1, COL13A1, MRPS31) and six genes (EAF2, ANK2, AKAP7, STEAP2, LPIN2, PARVB) associated with resistance to simvastatin and lovastatin, respectively. Functional validation using RNAi confirmed that silencing of EAF2 expression modulated the response of HCT-116 colon cancer cells to both statins. In summary, we have successfully utilized the publicly available data on the NCI60 cell lines to perform whole-genome association studies for simvastatin and lovastatin. Our results indicated genes involved in the cellular response to these statins and siRNA studies confirmed the role of the EAF2 in response to these drugs in HCT-116 colon cancer cells.
An age-dependent association between variation at the FTO locus and BMI in children has been suggested. We meta-analyzed associations between the FTO locus (rs9939609) and BMI in samples, aged from early infancy to 13 years, from 8 cohorts of European ancestry. We found a positive association between additional minor (A) alleles and BMI from 5.5 years onwards, but an inverse association below age 2.5 years. Modelling median BMI curves for each genotype using the LMS method, we found that carriers of minor alleles showed lower BMI in infancy, earlier adiposity rebound (AR), and higher BMI later in childhood. Differences by allele were consistent with two independent processes: earlier AR equivalent to accelerating developmental age by 2.37% (95% CI 1.87, 2.87, p = 10^−20) per A allele and a positive age by genotype interaction such that BMI increased faster with age (p = 10^−23). We also fitted a linear mixed effects model to relate genotype to the BMI curve inflection points adiposity peak (AP) in infancy and AR. Carriage of two minor alleles at rs9939609 was associated with lower BMI at AP (−0.40% (95% CI: −0.74, −0.06), p = 0.02), higher BMI at AR (0.93% (95% CI: 0.22, 1.64), p = 0.01), and earlier AR (−4.72% (−5.81, −3.63), p = 10^−17), supporting cross-sectional results. Overall, we confirm the expected association between variation at rs9939609 and BMI in childhood, but only after an inverse association between the same variant and BMI in infancy. Patterns are consistent with a shift on the developmental scale, which is reflected in association with the timing of AR rather than just a global increase in BMI. Results provide important information about longitudinal gene effects and about the role of FTO in adiposity. The associated shifts in developmental timing have clinical importance with respect to known relationships between AR and both later-life BMI and metabolic disease risk.
Variation at the FTO locus is reliably associated with BMI and adiposity-related traits, but little is still known about the effects of variation at this gene, particularly in children. We have examined a large collection of samples for which both genotypes at rs9939609 and multiple measurements of BMI are available. We observe a positive association between the minor allele (A) at rs9939609 and BMI emerging in childhood that has the characteristics of a shift in the age scale leading simultaneously to lower BMI during infancy and higher BMI in childhood. Assessed in cross section and longitudinally, we find evidence of variation at rs9939609 being associated with the timing of AR and the concert of events expected with such a change to the BMI curve. Importantly, the apparently negative association between the minor allele (A) and BMI in early life, which is then followed by an earlier AR and greater BMI in childhood, is a pattern known to be associated with both the risk of adult BMI and metabolic disorders such as type 2 diabetes (T2D). These findings are important in our understanding of the contribution of FTO to adiposity, but also in light of efforts to appreciate genetic effects in a lifecourse context.
Epidemiological studies have suggested an association between selenium intake and protection from a variety of cancers. Considering the clinical importance of selenium, we aimed to identify the genes associated with resistance to selenium treatment. We applied a methodology previously developed by our group, which is based on the genetic and pharmacological data publicly available for the NCI60 cancer cell line panel. In short, we categorized the NCI60 cell lines as selenium resistant or sensitive based on their growth inhibition (GI50) data. Then, we used the available Affymetrix 125K SNP chip data to carry out a genome-wide case-control association study for the selenium sensitive and resistant NCI60 cell lines. Our results showed statistically significant association of four SNPs in 5q33–34, 10q11.2, 10q22.3 and 14q13.1 with selenium resistance. These SNPs were located in introns of the genes encoding a kinase-scaffolding protein (AKAP6), a membrane protein (SGCD), a channel protein (KCNMA1), and a protein kinase (PRKG1). The knock-down of KCNMA1 by siRNA increased sensitivity to selenium in both LNCaP and PC3 cell lines. Furthermore, SNP-SNP interaction (epistasis) analysis, assuming an additive genetic model, indicated interactions between SNPs in AKAP6 and SGCD as well as between SNPs in AKAP6 and KCNMA1. These genes are also all involved in Ca2+ signaling, which has a direct role in the induction of apoptosis; induction of apoptosis in tumor cells is consistent with the chemopreventive action of selenium. Once our findings are further validated, this knowledge could be translated into the clinic, where individuals who can benefit from the chemopreventive characteristics of selenium supplementation could be identified using a simple DNA analysis.
Accurate risk (penetrance) estimates for associated phenotypes in carriers of a major disease gene are important for genetic counselling of at-risk individuals. Population-specific estimates of penetrance are often needed as well. Families ascertained from high-risk disease clinics provide substantial data to estimate penetrance of a disease gene, but these estimates must be adjusted for possible specific sources of bias.
A cohort of 12 independently ascertained HNPCC families harbouring a founder MSH2 mutation was identified from a cancer genetics clinic in St. John's, Newfoundland, Canada. Carrier status was known for 247 family members but phenotype information on up to 85 additional relatives with unknown carrier status was available; using modified segregation models these additional individuals could be included in the analyses. Three HNPCC-related phenotypes were evaluated as age at diagnosis of: any HNPCC cancer (first cancer), colorectal cancer (CRC), and endometrial cancer (EC) for females.
Lifetime (age 70) risk estimates for male and female carriers were similar for developing any HNPCC cancer (Males = 98.2%, 95% Confidence Interval (CI) = (93.8%, 99.9%); Females = 92.8%, 95% CI = (82.4%, 99.1%)) but female carriers experienced substantially reduced lifetime risk for developing CRC compared to male carriers (Females = 38.9%, 95% CI = (24.2%, 62.1%); Males = 84.5%, 95% CI = (67.3%, 91.3%)). Female non-carriers had very low lifetime risk for these two outcomes while male non-carriers had lifetime risks intermediate to the female carriers and non-carriers. Female carriers had a lifetime risk of developing EC of 82.4%. Relative risks for developing any HNPCC cancer (carriers relative to non-carriers) were substantially greater for females compared to their male counterparts (Females = 54.8, 95% CI = (4.4, 379.8); Males = 9.7, 95% CI = (0.3, 23.8)). Relative risks for developing CRC at age 70 were also substantially greater for females compared to their male counterparts (Females = 23.7, 95% CI = (5.6, 137.9); Males = 6.8, 95% CI = (2.3, 66.2)). However, the risk of developing CRC decreased with age among both genders.
The proposed modified segregation-based models used to estimate age-specific risks for HNPCC phenotypes can reduce bias due to ascertainment and missing genotype information as well as provide estimates of absolute and relative risks.
Several DNA mismatch repair (MMR) genes, responsible for the majority of Lynch Syndrome cancers, have been identified, predominantly MLH1 and MSH2, but the risk associated with these mutations is still not well established. The aim of this study is to provide population-based estimates of the risks of colorectal cancer (CRC) by gender and mutation type from the Ontario population.
We analyzed 32 families segregating MMR mutations selected from the Ontario Familial Colorectal Cancer Registry, including 199 first-degree and 421 second-degree relatives. The cumulative risks were estimated using a modified segregation-based approach, which corrects for the ascertainment of the Lynch Syndrome families and accounts for missing genotype information.
The risks of developing CRC by age 70 were 60% and 47% among male and female carriers of any MMR mutation, respectively. Among MLH1 mutation carriers, males had significantly higher risks than females at all ages (67% vs. 35% by age 70, p-value = 0.02), while the risks were similar in MSH2 carriers (about 54%). The relative risk associated with MLH1 was almost constant with age (hazard ratio (HR) varied between 5.5 and 5.1 over ages 30–70), while the HR for MSH2 decreased with age (from 13.1 at age 30 to 5.4 at age 70).
This study provides a unique population-based study of CRC risks among MSH2/MLH1 mutation carriers in a Canadian population and can help to better define and understand the patterns of risks among members of Lynch Syndrome families.
Promoter and 5′ end methylation regulation of tumour suppressor genes is a common feature of many cancers. Such occurrences often lead to the silencing of these key genes and thus they may contribute to the development of cancer, including prostate cancer.
In order to identify methylation changes in prostate cancer, we performed a genome-wide analysis of DNA methylation using Agilent human CpG island arrays. Using computational and gene-specific validation approaches we have identified a large number of potential epigenetic biomarkers of prostate cancer. Further validation of candidate genes on a separate cohort of low and high grade prostate cancers by quantitative MethyLight analysis has allowed us to confirm DNA hypermethylation of HOXD3 and BMP7, two genes that may play a role in the development of high grade tumours. We also show that promoter hypermethylation is responsible for downregulated expression of these genes in the DU-145 PCa cell line.
This study identifies novel epigenetic biomarkers of prostate cancer and prostate cancer progression, and provides a global assessment of DNA methylation in prostate cancer.
Estrogens are crucial tumorigenic hormones, which impact the cell growth and proliferation during breast cancer development. Estrogens are metabolized by a series of enzymes including COMT, which converts catechol estrogens into biologically non-hazardous methoxyestrogens. Several studies have also shown the relationship between estrogen and cell cycle progression through activation of CCND1 transcription.
In this study, we investigated the independent and combined effects of commonly occurring CCND1 (Pro241Pro, A870G) and COMT (Met108/158Val) polymorphisms on breast cancer risk in two independent Caucasian populations from Ontario (1228 breast cancer cases and 719 population controls) and Finland (728 breast cancer cases and 687 population controls). Both COMT and CCND1 polymorphisms have previously been shown to affect the enzymatic activity of the encoded proteins.
Here, we have shown that the high enzymatic activity genotype CCND1High (AA) was associated with increased breast cancer risk in both the Ontario [OR: 1.3, 95% CI (1.0–1.69)] and the Finland sample [OR: 1.4, 95% CI (1.01–1.84)]. The heterozygous COMTMedium (MetVal) and the high enzymatic activity COMTHigh (ValVal) genotypes were also associated with breast cancer risk in Ontario cases, [OR: 1.3, 95% CI (1.07–1.68)] and [OR: 1.4, 95% CI (1.07–1.81)], respectively. However, there was neither a statistically significant association nor an increasing trend of breast cancer risk with COMTHigh (ValVal) genotypes in the Finland cases [OR: 1.0, 95% CI (0.73–1.39)]. In the combined analysis, the higher activity alleles of COMT and CCND1 were associated with increased breast cancer risk in both the Ontario [OR: 2.22, 95% CI (1.49–3.28)] and Finland [OR: 1.73, 95% CI (1.08–2.78)] populations studied. The trend test was statistically significant in both the Ontario and Finland populations across the genotypes associated with increasing enzymatic activity.
Using two independent Caucasian populations, we have shown a stronger combined effect of the two commonly occurring CCND1 and COMT genotypes in the context of breast cancer predisposition.
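The odds ratios and 95% confidence intervals reported above can be illustrated with a simple unadjusted calculation from a 2x2 table of genotype by case-control status. The sketch below uses the standard log-odds (Woolf) approximation. The counts are hypothetical, and the study's own estimates may come from adjusted models, so this shows only how such a figure is formed.

```python
import math


def odds_ratio_ci(exposed_cases, unexposed_cases, exposed_controls, unexposed_controls, z=1.96):
    """Return (OR, lower, upper) for a 2x2 table using the Woolf log method.

    "Exposed" here means carrying the genotype of interest. All four counts
    must be positive, otherwise the log-odds standard error is undefined.
    """
    counts = (exposed_cases, unexposed_cases, exposed_controls, unexposed_controls)
    if any(c <= 0 for c in counts):
        raise ValueError("all four cell counts must be positive")
    odds_ratio = (exposed_cases * unexposed_controls) / (unexposed_cases * exposed_controls)
    # Standard error of log(OR): square root of the summed reciprocal counts.
    se_log_or = math.sqrt(sum(1.0 / c for c in counts))
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Hypothetical counts, not taken from the study.
or_, lo, hi = odds_ratio_ci(20, 10, 10, 20)
print(f"OR = {or_:.2f}, 95% CI ({lo:.2f}, {hi:.2f})")
```

An interval whose lower bound sits at or near 1.0, as in some of the estimates above, indicates an association at the edge of conventional statistical significance.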
Genome scan meta-analysis (GSMA) can prove very useful in detecting genetic effects too small to be detected in an individual linkage study and can also lead to more consistent results. In this paper, we propose a new kernel-based estimation procedure for GSMA. Instead of estimating identity by descent between markers, as performed in interval mapping approaches, we estimated directly the nonparametric linkage score between markers using a kernel procedure. The GSMA is then extended to take into account the kernel estimate of the nonparametric linkage score and its variance at a given chromosomal position. The method is applied to the rheumatoid arthritis genome scan data (Genetic Analysis Workshop 15 Problem 2).
There is growing evidence that gene-gene interactions are ubiquitous in determining the susceptibility to common human diseases. The investigation of such gene-gene interactions presents new statistical challenges for studies with relatively small sample sizes as the number of potential interactions in the genome can be large. Breast cancer provides a useful paradigm to study genetically complex diseases because commonly occurring single nucleotide polymorphisms (SNPs) may additively or synergistically disturb the system-wide communication of the cellular processes leading to cancer development.
In this study, we systematically studied SNP-SNP interactions among 19 SNPs from 18 key genes involved in major cancer pathways in a sample of 398 breast cancer cases and 372 controls from Ontario. We discuss the methodological issues associated with the detection of SNP-SNP interactions in this dataset by applying and comparing three commonly used methods: the logistic regression model, classification and regression trees (CART), and the multifactor dimensionality reduction (MDR) method.
Our analyses show evidence for several simple (two-way) and complex (multi-way) SNP-SNP interactions associated with breast cancer. For example, all three methods identified XPD-[Lys751Gln]*IL10-[G(-1082)A] as the most significant two-way interaction. CART and MDR identified the same critical SNPs participating in complex interactions. Our results suggest that the use of multiple statistical approaches (or an integrated approach) rather than a single methodology could be the best strategy to elucidate complex gene interactions that have generally very different patterns.
The strategy used here has the potential to identify complex biological relationships among breast cancer genes and processes. This will lead to the discovery of novel biological information, which will improve breast cancer risk management.
cMyc and p27 are key genes implicated in carcinogenesis. Whether polymorphisms in these genes affect breast cancer risk or prognosis is still unclear. In this study, we focus on a rare non-synonymous polymorphism in cMyc (N11S) and a common polymorphism in p27 (V109G) and determine their role in risk and prognosis using data collected from the Ontario Breast Cancer Family Registry.
Risk factor data were collected at baseline on a large group of women (cases = 1,115 and population-based controls = 710) and clinical data (including treatment and follow-up) were collected prospectively by periodic review of medical records for a subset of cases (N = 967) for nearly a decade. A centralized pathology review was conducted. Unconditional logistic regression was used to determine the association of polymorphisms with breast cancer risk and the Cox proportional hazards model was used to determine their association with survival.
Our results suggest that while cMyc-N11S can be considered a putatively functional polymorphism located in the N-terminal domain, it is not associated with risk, tumor characteristics or survival. The p27-G109 allele was associated with a modest protective effect in adjusted analyses and higher T stage. We found no evidence to suggest that p27-V109G alone or in combination with cMyc-N11S was associated with survival. Age at onset and first-degree family history of breast or ovarian cancer did not significantly modify the association of these polymorphisms with breast cancer risk.
Further work is recommended to understand the potential functional role of these specific non-synonymous amino acid changes. A larger, more comprehensive investigation of genetic variation in these genes (e.g., using a tagSNP approach), in combination with other relevant genes, is also needed, as is consideration of treatment effects when assessing their potential role in prognosis.
Breast cancer predisposition genes identified to date (e.g., BRCA1 and BRCA2) are responsible for less than 5% of all breast cancer cases. Many studies have shown that the cancer risks associated with individual commonly occurring single nucleotide polymorphisms (SNPs) are incremental. However, polygenic models suggest that multiple commonly occurring, low to modestly penetrant SNPs of cancer-related genes might have a greater effect on a disease when considered in combination.
In an attempt to identify the breast cancer risk conferred by SNP interactions, we studied 19 SNPs from genes involved in major cancer-related pathways. All SNPs were genotyped by the TaqMan 5' nuclease assay. The association between case-control status and each individual SNP, measured by the odds ratio and its corresponding 95% confidence interval, was estimated using unconditional logistic regression models. In the second stage, two-way interactions were investigated using multivariate logistic models. The robustness of the interactions, which were observed among SNPs with stronger functional evidence, was assessed using a bootstrap approach and correction for multiple testing based on the false discovery rate (FDR) principle.
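The abstract does not state which FDR procedure was applied. As one common possibility, the Benjamini-Hochberg step-up procedure can be sketched as follows; the function name, the FDR level q = 0.05, and the example p-values are assumptions for illustration and are not taken from the study.

```python
def benjamini_hochberg(p_values, q=0.05):
    """Return a list of booleans marking which hypotheses are rejected
    by the Benjamini-Hochberg step-up procedure at FDR level q."""
    m = len(p_values)
    # Indices of the p-values, from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])

    # Find the largest rank k with p_(k) <= (k / m) * q.
    k_max = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * q:
            k_max = rank

    # Reject every hypothesis ranked at or below k_max (step-up rule).
    rejected = [False] * m
    for rank, idx in enumerate(order, start=1):
        if rank <= k_max:
            rejected[idx] = True
    return rejected

# Hypothetical p-values for eight interaction tests.
example_p = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205]
example_rejected = benjamini_hochberg(example_p)
```

Because the procedure is step-up, a p-value that exceeds its own rank threshold is still rejected when a larger p-value further down the ordering meets its threshold.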
None of these SNPs contributed to breast cancer risk individually. However, we have demonstrated evidence for gene-gene (SNP-SNP) interactions among these SNPs that were associated with increased breast cancer risk. Our study suggests cross talk between the SNPs of the DNA repair and immune system (XPD-[Lys751Gln] and IL10-[G(-1082)A]), cell cycle and estrogen metabolism (CCND1-[Pro241Pro] and COMT-[Met108/158Val]), cell cycle and DNA repair (BARD1-[Pro24Ser] and XPD-[Lys751Gln]), and within carcinogen metabolism (GSTP1-[Ile105Val] and COMT-[Met108/158Val]) pathways.
The importance of these pathways and their communication in breast cancer predisposition has been emphasized previously, but their biological interactions through SNPs have not been described. The strategy used here has the potential to identify complex biological links among breast cancer genes and processes. This will provide novel biological information, which will ultimately improve breast cancer risk management.
It is generally assumed that the detection of disease susceptibility genes via fine-mapping association studies is facilitated by consideration of marker haplotypes. In this study, we compared the performance of genotype-based and haplotype-based association studies using the Collaborative Study of Genetics of Alcoholism dataset, on several chromosomal regions showing evidence for linkage with ALDX1. After correction for multiple testing, the most significant results were observed with the genotype-based analyses on two regions of chromosomes 2 and 7. Interestingly, the results from this dataset showed no advantage of the haplotype-based analyses over the genotype-based (single-locus) analyses. However, caution should be taken when generalizing these results to other chromosomal regions or to other populations.
Genetic studies of complex disorders such as hypertension often utilize families selected for this outcome, usually with information obtained at a single time point. Since age-at-onset for diagnosed hypertension can vary substantially between individuals, a phenotype based on long-term follow up in unselected families can yield valuable insights into this disorder for the general population.
Genetic analyses were conducted using 2884 individuals from the largest 330 families of the Framingham Heart Study. A longitudinal phenotype was constructed using the age at the examination when systolic blood pressure (SBP) first exceeded 139 mm Hg. An interval for age-at-onset was created, since the exact time of onset was unknown. Time-fixed (sex, study cohort) and time-varying (body mass index, daily cigarette and alcohol consumption) explanatory variables were included.
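The abstract does not specify how the age-at-onset interval was built. One common convention brackets onset between the last examination at or below the threshold and the first examination above it. The sketch below follows that convention; the function name, the lower bound of 0 for subjects already above threshold at their first examination, and the use of infinity for subjects who never exceed the threshold are all assumptions of this example.

```python
import math

SBP_THRESHOLD_MM_HG = 139

def onset_interval(exams, threshold=SBP_THRESHOLD_MM_HG):
    """Bracket the age at onset from repeated examinations.

    exams: iterable of (age, sbp) pairs, in any order.
    Returns (lower, upper):
      - onset observed between two exams: (age at last exam at or below
        the threshold, age at first exam above it)
      - already above threshold at first exam: (0.0, age at first exam)
      - never above threshold: (age at last exam, math.inf)
    """
    ordered = sorted(exams)
    if not ordered:
        raise ValueError("at least one examination is required")

    lower = 0.0  # onset cannot precede birth
    for age, sbp in ordered:
        if sbp > threshold:
            return (lower, age)
        lower = age
    return (lower, math.inf)

# Hypothetical examination histories.
interval_censored = onset_interval([(30, 120), (34, 135), (38, 142), (42, 150)])
left_censored = onset_interval([(40, 150)])
right_censored = onset_interval([(30, 120), (35, 130)])
```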
Segregation analysis for a major gene effect demonstrated that the major gene effect parameter was sensitive to the choice for age-at-onset. Linkage analyses for age-at-onset were conducted using 1537 individuals in 52 families. Evidence for putative genes identified on chromosome 17 in a previous linkage study using a quantitative SBP phenotype for these data was not confirmed.
Interval censoring for age-at-onset should not be ignored. Further research is needed to explain the inconsistent segregation results between the different age-at-onset models (regressive threshold and proportional hazards) as well as the inconsistent linkage results between the longitudinal phenotypes (age-at-onset and quantitative).
The data arising from a longitudinal familial study have a complex correlation structure that cannot be modeled using classical methods for the analysis of familial data at a single time point.
To fit the longitudinal systolic blood pressure (SBP) pedigree data arising from the Framingham Heart Study, we proposed to use multilevel modeling. That approach was used to distinguish multiple levels of information with individual repeated measurements (Level 1) being made within individuals (Level 2), and individuals clustered within pedigrees (Level 3). Residuals from the subject-specific and pedigree-specific regression models were summed both for the mean SBP and slope of SBP change over time, in order to define two new outcomes that were then used in a genome-wide linkage analysis.
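One way to write a three-level structure of this kind is given below. The notation is hypothetical and is not taken from the paper: the indices are t for examination, i for individual, and j for pedigree. Age is assumed as the time variable, and covariates are omitted. Under these assumptions, the two derived outcomes are the sums of the individual-level and pedigree-level residuals for the intercept (mean SBP) and for the slope.

```latex
% Hypothetical notation: t = examination, i = individual, j = pedigree.
\begin{align*}
\text{Level 1 (examination):} \quad
  & \mathrm{SBP}_{tij} = \pi_{0ij} + \pi_{1ij}\,\mathrm{age}_{tij} + e_{tij} \\
\text{Level 2 (individual):} \quad
  & \pi_{0ij} = \beta_{0j} + r_{0ij}, \qquad \pi_{1ij} = \beta_{1j} + r_{1ij} \\
\text{Level 3 (pedigree):} \quad
  & \beta_{0j} = \gamma_{0} + u_{0j}, \qquad \beta_{1j} = \gamma_{1} + u_{1j} \\
\text{Derived outcomes:} \quad
  & Y^{\text{mean}}_{ij} = r_{0ij} + u_{0j}, \qquad
    Y^{\text{slope}}_{ij} = r_{1ij} + u_{1j}
\end{align*}
```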
Evidence for linkage for the two outcomes (mean SBP and slope) was found in several chromosomal regions, with maximum LOD scores of 3.6 on chromosome 8 and 3.5 on chromosome 17 for the mean SBP, and 2.5 on chromosome 1 for the SBP slope. However, the linkage on chromosome 8 was detected only when the sample was restricted to subjects between ages 25 and 75 with at least four exams (Cohort 1) or three exams (Cohort 2).
Multilevel modeling is a powerful approach to detect genes involved in complex traits when longitudinal data are available. It allows a complex hierarchical data structure to be taken into account and, therefore, better partitioning of random within-individual variation from other sources of variability (genetic or nongenetic).