Our genome-wide association study (GWAS) of chronic lymphocytic leukemia (CLL) identified 4 highly correlated intronic variants within the IRF8 gene that were associated with CLL. These results were corroborated by a recent meta-analysis of our GWAS with two other GWAS of CLL, supporting the IRF8 gene as a strong candidate for CLL risk.
To refine the genetic association of CLL risk, we performed Sanger sequencing of IRF8 in 94 CLL cases and 96 controls. We then performed fine-mapping by genotyping 39 variants (of which 10 were identified from sequencing) in 745 CLL cases and 1521 controls. We also assessed these associations with risk of other non-Hodgkin lymphoma (NHL) subtypes.
The strongest association with CLL risk was observed with a common SNP located within the 3’ UTR of IRF8 (rs1044873, log additive odds ratio = 0.7, P=1.81×10−6). This SNP was not associated with the other NHL subtypes (all P>0.05).
We provide evidence that rs1044873 in the IRF8 gene accounts for the initial GWAS signal for CLL risk. This association appears to be unique to CLL, with little support for association with other common NHL subtypes. Future work is needed to assess the functional role of IRF8 in CLL etiology.
These data provide support that a functional variant within the 3’ UTR of IRF8 may be driving the GWAS signal seen on 16q24.1 for CLL risk.
CLL; NHL; SNPs; IRF8; risk locus
The XRCC2 gene is a key mediator in the homologous recombination repair of DNA double strand breaks. We hypothesised that inherited variants in the XRCC2 gene might also affect susceptibility to, and survival from, breast cancer.
We genotyped 12 XRCC2 tagging SNPs in 1,131 breast cancer cases and 1,148 controls from the Sheffield Breast Cancer Study (SBCS), and examined their associations with breast cancer risk and survival by estimating odds ratios (ORs) and hazard ratios (HRs), and their corresponding 95% confidence intervals (CIs). Positive findings were further investigated in 860 cases and 869 controls from the Utah Breast Cancer Study (UBCS) and jointly analysed together with available published data for breast cancer risk. The survival findings were further confirmed in studies (8,074 cases) from the Breast Cancer Association Consortium (BCAC).
The most significant association with breast cancer risk in the SBCS dataset was the XRCC2 rs3218408 SNP (recessive model p=2.3×10−4, MAF=0.23). This SNP yielded an ORrec (95% CI) of 1.64 (1.25–2.16) in a two-site analysis of SBCS and UBCS, and a meta-ORrec (95% CI) of 1.33 (1.12–1.57) when all published data were included. This SNP may mark a rare risk haplotype carried by 2 in 1000 of the control population. Furthermore, the XRCC2 coding R188H SNP (rs3218536, MAF=0.08) was significantly associated with poor survival, with an increased per-allele HR (95% CI) of 1.58 (1.01–2.49) in a multivariate analysis. This effect was still evident in a pooled meta-analysis of 8,781 breast cancer patients from the BCAC [HR (95% CI) of 1.19 (1.05–1.36), p=0.01].
Our findings suggest that XRCC2 SNPs may influence breast cancer risk and survival.
Single nucleotide polymorphism; XRCC2; breast cancer risk; breast cancer survival
Shared genomic segment (SGS) analysis is a method that uses dense SNP genotyping in high-risk pedigrees to identify regions of sharing between cases. Here, we illustrate the power of SGS to identify dominant rare risk variants. Using simulated pedigrees, we considered 12 disease models based on disease prevalence, minor allele frequency, and penetrance to represent disease loci that explain 0.2% to 99.8% of total disease risk. Pedigrees were required to contain ≥15 meioses between all cases and to be high-risk based on significant excess of disease (p<0.001 or p<0.00001). Across these scenarios the power for a single pedigree ranged widely. Nonetheless, fewer than 10 pedigrees were sufficient for excellent power in the majority of the models. Power increased with the risk attributable to the disease locus, penetrance, and the excess of disease in the pedigree. Sharing allowing for one sporadic case was uniformly more powerful than sharing using all cases. Further, we performed an SGS analysis using a large Attenuated Familial Adenomatous Polyposis pedigree and identified a 1.96 Mb region containing the known causal APC gene with genome-wide significance (p<5×10−7). SGS is a powerful method for detecting rare variants and offers a valuable complement to GWAS and linkage analysis.
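The core idea of looking for regions of sharing between cases can be sketched in a few lines. The following is a minimal illustration only, not the published SGS implementation: it assumes genotypes are coded as minor-allele counts (0, 1, 2), treats a SNP as shared when no two cases are opposite homozygotes (so all cases could carry at least one allele in common, identical by state), and reports runs of consecutive shared SNPs. The function names are hypothetical, and the sketch omits genotyping-error handling, the one-sporadic-case variant, and any significance assessment.

```python
from typing import List, Tuple


def shared_at_snp(genotypes: List[int]) -> bool:
    """True if all cases could share at least one allele IBS at this SNP.

    Genotypes are minor-allele counts (0, 1 or 2). Sharing fails only when
    both homozygote classes (0 and 2) are present among the cases.
    """
    return not (0 in genotypes and 2 in genotypes)


def sharing_runs(cases: List[List[int]]) -> List[Tuple[int, int]]:
    """Return (start_index, length) for each maximal run of shared SNPs."""
    n_snps = len(cases[0])
    runs = []
    start = None
    for j in range(n_snps):
        column = [genotype[j] for genotype in cases]
        if shared_at_snp(column):
            if start is None:
                start = j  # a new run begins here
        elif start is not None:
            runs.append((start, j - start))  # the run ended at SNP j - 1
            start = None
    if start is not None:
        runs.append((start, n_snps - start))  # run reaching the last SNP
    return runs


def longest_run(cases: List[List[int]]) -> Tuple[int, int]:
    """Return the longest shared run, or (0, 0) if no SNP is shared."""
    runs = sharing_runs(cases)
    return max(runs, key=lambda run: run[1]) if runs else (0, 0)


# Hypothetical toy data: three cases typed at six SNPs.
toy_cases = [
    [0, 1, 2, 1, 0, 0],
    [2, 1, 1, 1, 0, 1],
    [1, 1, 2, 0, 2, 1],
]
print(sharing_runs(toy_cases))
print(longest_run(toy_cases))
```

In practice the observed run length would be compared against its distribution under the pedigree structure, which this sketch does not attempt.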
Congenital diaphragmatic hernia (CDH) is a developmental defect of the diaphragm that causes high newborn mortality. Isolated or non-syndromic CDH is considered a multifactorial disease, with strong evidence implicating genetic factors. As low heritability has been reported in isolated CDH, family-based genetic methods have yet to identify the genetic factors associated with the defect. Using the Utah Population Database, we identified distantly related patients from several extended families with a high incidence of isolated CDH. Using high-density genotyping, seven patients were analyzed by homozygosity exclusion rare allele mapping (HERAM) and phased haplotype sharing (HapShare), two methods we developed to map shared chromosome regions. Our patient cohort shared three regions not previously associated with CDH, i.e. 2q11.2-q12.1, 4p13 and 7q11.2, and two regions previously involved in CDH, i.e. 8p23.1 and 15q26.2. The latter regions contain GATA4 and NR2F2, two genes implicated in diaphragm formation in mice. Interestingly, three patients shared the 8p23.1 locus and one of them also harbored the 15q26.2 segment. No coding variants were identified in GATA4 or NR2F2, but a rare shared variant was found in intron 1 of GATA4. This work shows the role of heritability in isolated CDH. Our family-based strategy uncovers new chromosomal regions possibly associated with disease, and suggests that non-coding variants of GATA4 and NR2F2 may contribute to the development of isolated CDH. This approach could speed up the discovery of the genes and regulatory elements causing multifactorial diseases, such as isolated CDH.
congenital diaphragmatic hernia; Utah population database; shared segment analysis; GATA4; NR2F2
A recent meta-analysis of three genome-wide association studies of chronic lymphocytic leukaemia (CLL) identified two common variants at the 6p21.31 locus that are associated with CLL risk. To verify and further explore the association of these variants with other non-Hodgkin lymphoma (NHL) subtypes, we genotyped 1196 CLL cases, 1699 NHL cases, and 2410 controls. We found significant associations between the 6p21.31 variants and CLL risk (rs210134: P=0.01; rs210142: P=6.8×10−3). These variants also showed a trend towards association with some of the other NHL subtypes. Our results validate the prior work and support specific genetic pathways for risk among NHL subtypes.
CLL; NHL; SNPs; BAK1; risk locus
Owing to their role in controlling the efflux of toxic compounds, transporters are central players in the process of detoxification and elimination of xenobiotics, which in turn is related to cancer risk. Among these transporters, ATP-binding cassette B1/multidrug resistance 1 (ABCB1/MDR1), ABCC2/multidrug resistance protein 2 (MRP2), and ABCG2/breast cancer resistance protein (BCRP) affect susceptibility to many hematopoietic malignancies. The maintenance of regulated expression of these transporters is governed through the activation of intracellular “xenosensors” like the nuclear receptor 1I2/pregnane X receptor (NR1I2/PXR). SNPs in genes encoding these regulators have also been implicated in the risk of several cancers. Using a tagging approach, we tested the hypothesis that common polymorphisms in the transporter genes ABCB1, ABCC2, ABCG2, and the regulator gene NR1I2 could be implicated in lymphoma risk. We selected 68 SNPs in the 4 genes, and we genotyped them in 1,481 lymphoma cases and 1,491 controls of the European case-control study (EpiLymph) using the Illumina™ GoldenGate assay technology. Carriage of the minor allele of SNP rs6857600 in ABCG2 was associated with a decrease in risk of B-cell lymphoma (B-NHL) overall (p<0.001). Furthermore, a decreased risk of chronic lymphocytic leukemia (CLL) was associated with the ABCG2 rs2231142 variant (p=0.0004), which could be replicated in an independent population. These results suggest a role for this gene in B-NHL susceptibility, especially for CLL.
Lymphoma; multidrug resistance 1 (MDR1); multidrug resistance protein 2 (MRP2); breast cancer resistance protein (BCRP); pregnane X receptor (PXR)
DNA damage and replication checkpoints mediated by the ATR-CHEK1 pathway are key to the maintenance of genome stability, and both ATR and CHEK1 have been proposed as potential breast cancer susceptibility genes. Many novel variants recently identified by the large resequencing projects have not yet been thoroughly tested in genome-wide association studies for breast cancer susceptibility. We therefore used a tagging SNP (tagSNP) approach based on recent SNP data available from the 1000 Genomes Project to investigate the roles of ATR and CHEK1 in breast cancer risk and survival. ATR and CHEK1 tagSNPs were genotyped in the Sheffield Breast Cancer Study (SBCS; 1011 cases and 1024 controls) using Illumina GoldenGate assays. Untyped SNPs were imputed using IMPUTE2, and associations between genotype and breast cancer risk and survival were evaluated using logistic and Cox proportional hazard regression models, respectively, on a per-allele basis. Significant associations were further examined in a meta-analysis of published data or confirmed in the Utah Breast Cancer Study (UBCS). The most significant associations for breast cancer risk in SBCS came from rs6805118 in ATR (p=7.6×10−5) and rs2155388 in CHEK1 (p=3.1×10−6), but neither remained significant after meta-analysis with other studies. However, meta-analysis of published data revealed a weak association between the ATR SNP rs1802904 (minor allele frequency 12%) and breast cancer risk, with a summary odds ratio (confidence interval) of 0.90 (0.83-0.98) [p=0.0185] for the minor allele. Further replication of this SNP in larger studies is warranted since it is located in the target region of 2 microRNAs. No evidence of any survival effects of ATR or CHEK1 SNPs was identified. We conclude that common alleles of ATR and CHEK1 are not implicated in breast cancer risk or survival, but we cannot exclude effects of rare alleles and of common alleles with very small effect sizes.
In spite of intensive efforts, understanding of the genetic aspects of familial prostate cancer remains largely incomplete. In a previous microsatellite-based linkage scan of 1233 prostate cancer (PC) families, we identified suggestive evidence for linkage (i.e. LOD≥1.86) at 5q12, 15q11, 17q21, 22q12, and two loci on 8p, with additional regions implicated in subsets of families defined by age at diagnosis, disease aggressiveness, or number of affected members.
In an attempt to replicate these findings and increase linkage resolution, we used the Illumina 6000 SNP linkage panel to perform a genome-wide linkage scan of an independent set of 762 multiplex PC families, collected by 11 ICPCG groups.
Of the regions identified previously, modest evidence of replication was observed only on the short arm of chromosome 8, where HLOD scores of 1.63 and 3.60 were observed in the complete set of families and families with young average age at diagnosis, respectively. The most significant linkage signals found in the complete set of families were observed across a broad, 37 cM interval on 4q13-25, with LOD scores ranging from 2.02 to 2.62, increasing to 4.50 in families with older average age at diagnosis. In families with multiple cases presenting with more aggressive disease, LOD scores over 3.0 were observed at 8q24 in the vicinity of previously identified common PC risk variants, as well as MYC, an important gene in PC biology.
These results will be useful in prioritizing future susceptibility gene discovery efforts in this common cancer.
Previously, an analysis of 14 extended, high-risk Utah pedigrees localized the chromosome 22q linkage region to 3.2 Mb at 22q12.3-13.1 (flanked on each side by three recombinants), which contained 31 annotated genes. In this large, multi-centered, collaborative study, we performed statistical recombinant mapping in fifty-four pedigrees selected to be informative for recombinant mapping from nine member groups of the International Consortium for Prostate Cancer Genetics (ICPCG). These 54 pedigrees included the 14 extended pedigrees from Utah and 40 pedigrees from eight other ICPCG member groups. The additional 40 pedigrees were selected from a total pool of 1,213 such that each pedigree was required to both contain at least four prostate cancer (PRCA) cases and exhibit evidence for linkage to the chromosome 22q region. The recombinant events in these 40 independent pedigrees confirmed the previously proposed region. Further, when all 54 pedigrees were considered, the three-recombinant consensus region was narrowed by more than a megabase to 2.2 Mb at chromosome 22q12.3 flanked by D22S281 and D22S683. This narrower region eliminated 20 annotated genes from that previously proposed, leaving only eleven genes. This region at 22q12.3 is the most consistently identified and smallest linkage region for PRCA. This collaborative study by the ICPCG illustrates the value of consortium efforts and the continued utility of linkage analysis using informative pedigrees to localize genes for complex diseases.
Multiple prostate cancer (PCa) risk-related loci have been discovered by genome-wide association studies (GWAS) based on case–control designs. However, GWAS findings may be confounded by population stratification if cases and controls are inadvertently drawn from different genetic backgrounds. In addition, since these loci were identified in cases with predominantly sporadic disease, little is known about their relationships with hereditary prostate cancer (HPC). The association of seventeen reported PCa susceptibility loci with HPC was evaluated with a family-based association test using 1,979 hereditary PCa families of European descent collected by members of the International Consortium for Prostate Cancer Genetics, with a total of 5,730 affected men. The risk alleles for 8 of the 17 loci were significantly over-transmitted from parents to affected offspring, including SNPs residing in 8q24 (regions 1, 2 and 3), 10q11, 11q13, 17q12 (region 1), 17q24 and Xp11. In subgroup analyses, three loci, at 8q24 (regions 1 and 2) plus 17q12, were significantly over-transmitted in hereditary PCa families with five or more affected members, while loci at 3p12, 8q24 (region 2), 11q13, 17q12 (region 1), 17q24 and Xp11 were significantly over-transmitted in HPC families with an average age at diagnosis of 65 years or less. Our results indicate that at least a subset of PCa risk-related loci identified by case–control GWAS are also associated with disease risk in HPC families.
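The notion of an allele being over-transmitted from parents to affected offspring can be made concrete with the classical transmission disequilibrium test (TDT). The sketch below is an illustration under assumptions, not the test used in the study: the abstract does not name its family-based statistic, and families with several affected relatives generally call for extensions of the simple TDT. The counts are hypothetical. The statistic compares how often heterozygous parents transmit the risk allele (b) versus the other allele (c), and is referred to a chi-square distribution with one degree of freedom.

```python
import math
from typing import Tuple


def tdt(transmitted: int, untransmitted: int) -> Tuple[float, float]:
    """Classical TDT from transmissions by heterozygous parents.

    transmitted:   times the risk allele was passed to an affected child (b)
    untransmitted: times the other allele was passed instead (c)
    Returns (chi-square statistic with 1 df, two-sided p-value).
    """
    b, c = transmitted, untransmitted
    if b + c == 0:
        raise ValueError("no informative (heterozygous) parents")
    chi2 = (b - c) ** 2 / (b + c)
    # Upper tail of a 1-df chi-square: P(X > x) = erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Hypothetical counts: 60 transmissions of the risk allele versus 40 of the other.
chi2, p_value = tdt(60, 40)
print(f"chi2 = {chi2:.2f}, p = {p_value:.4f}")
```

Because the comparison is made within families, a test of this kind is not confounded by population stratification, which is the motivation given above for the family-based design.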
We applied a new weighted pairwise shared genomic segment (pSGS) analysis for susceptibility gene localization to high-density genomewide SNP data in three extended high-risk breast cancer pedigrees.
Using this method, four genomewide suggestive regions were identified on chromosomes 2, 4, 7 and 8, and a borderline suggestive region on chromosome 14. Seven additional regions with at least nominal evidence were observed. Of particular note among these twelve regions were three that were each identified in two pedigrees: those on chromosomes 4, 7 and 14. Follow-up two-pedigree pSGS analyses further indicated excessive genomic sharing across the pedigrees in all three regions, suggesting that the underlying susceptibility alleles in those regions may be shared in common. In general, the pSGS regions identified were quite large (average 32.2 Mb); however, the range was wide (0.3 – 88.2 Mb). Several of the regions identified overlapped with loci and genes that have been previously implicated in breast cancer risk, including NBS1, BRCA1 and RAD51L1.
Our analyses have provided several loci of interest to pursue in these high-risk pedigrees and illustrate the utility of the weighted pSGS method and extended pedigrees for gene mapping in complex diseases. A focused sequencing effort across these loci in the sharing individuals is the natural next step to further map the critical underlying susceptibility variants in these regions.
Breast cancer; High-risk pedigrees; Susceptibility; Germline; Genomic sharing
Prostate cancer has a strong familial component but uncovering the molecular basis for inherited susceptibility for this disease has been challenging. Recently, a rare, recurrent mutation (G84E) in HOXB13 was reported to be associated with prostate cancer risk. Confirmation and characterization of this finding is necessary to potentially translate this information to the clinic. To examine this finding in a large international sample of prostate cancer families, we genotyped this mutation and 14 other SNPs in or flanking HOXB13 in 2,443 prostate cancer families recruited by the International Consortium for Prostate Cancer Genetics (ICPCG). At least one mutation carrier was found in 112 prostate cancer families (4.6 %), all of European descent. Within carrier families, the G84E mutation was more common in men with a diagnosis of prostate cancer (194 of 382, 51 %) than those without (42 of 137, 30 %), P = 9.9 × 10−8 [odds ratio 4.42 (95 % confidence interval 2.56–7.64)]. A family-based association test found G84E to be significantly over-transmitted from parents to affected offspring (P = 6.5 × 10−6). Analysis of markers flanking the G84E mutation indicates that it resides in the same haplotype in 95 % of carriers, consistent with a founder effect. Clinical characteristics of cancers in mutation carriers included features of high-risk disease. These findings demonstrate that the HOXB13 G84E mutation is present in ~5 % of prostate cancer families, predominantly of European descent, and confirm its association with prostate cancer risk. While future studies are needed to more fully define the clinical utility of this observation, this allele and others like it could form the basis for early, targeted screening of men at elevated risk for this common, clinically heterogeneous cancer.
Electronic supplementary material
The online version of this article (doi:10.1007/s00439-012-1229-4) contains supplementary material, which is available to authorized users.
Prostate cancer is generally believed to have a strong inherited component, but the search for susceptibility genes has been hindered by the effects of genetic heterogeneity. The recently developed sumLINK and sumLOD statistics are powerful tools for linkage analysis in the presence of heterogeneity.
We performed a secondary analysis of 1233 prostate cancer pedigrees from the International Consortium for Prostate Cancer Genetics (ICPCG) using two novel statistics, the sumLINK and sumLOD. For both statistics, dominant and recessive genetic models were considered. False discovery rate (FDR) analysis was conducted to assess the effects of multiple testing.
Our analysis identified significant linkage evidence at chromosome 22q12, confirming previous findings by the initial conventional analyses of the same ICPCG data. Twelve other regions were identified with genomewide suggestive evidence for linkage. Seven regions (1q23, 5q11, 5q35, 6p21, 8q12, 11q13, 20p11-q11) are near loci previously identified in the initial ICPCG pooled data analysis or the subset of aggressive prostate cancer (PC) pedigrees. Three other regions (1p12, 8p23, 19q13) confirm loci reported by others, and two (2p24, 6q27) are novel susceptibility loci. FDR testing indicates that over 70% of these results are likely true positive findings. Statistical recombinant mapping narrowed regions to an average of 9 cM.
Our results represent genomic regions with the greatest consistency of positive linkage evidence across a very large collection of high-risk prostate cancer pedigrees using new statistical tests that deal powerfully with heterogeneity. These regions are excellent candidates for further study to identify prostate cancer predisposition genes.
Multiple genome-wide and candidate gene association studies have been performed in search of common risk variants for breast cancer. Recent large meta analyses, consolidating evidence from these studies, have been consistent in highlighting the caspase-8 (CASP8) gene as important in this regard. In order to define a risk haplotype and map the CASP8 gene region with respect to underlying susceptibility variant/s, we screened four genes in the CASP8 region on 2q33-q34 for breast cancer risk.
Two independent data sets from the United Kingdom and the United States, including 3,888 breast cancer cases and controls, were genotyped for 45 tagging single nucleotide polymorphisms (tSNP) in the expanded CASP8 region. SNP and haplotype association tests were carried out using Monte Carlo based methods.
We identified a three-SNP haplotype across rs3834129, rs6723097 and rs3817578 that was significantly associated with breast cancer (p<5×10−6), with a dominant risk ratio and 95% confidence interval of 1.28 (1.21–1.35) and frequency of 0.29 in controls. Evidence for this risk haplotype was extremely consistent across the two study sites and also consistent with previous data.
This three-SNP risk haplotype represents the best characterization so far of the chromosome upon which the susceptibility variant resides.
Characterization of the risk haplotype provides a strong foundation for re-sequencing efforts to identify the underlying risk variant, which may prove useful for individual-level risk prediction, and provide novel insights into breast carcinogenesis.
Single nucleotide polymorphisms; risk haplotype; breast cancer; CASP8; apoptosis pathway
Genetic variants are likely to contribute to a portion of prostate cancer risk. Full elucidation of the genetic etiology of prostate cancer is difficult because of incomplete penetrance and genetic and phenotypic heterogeneity. Current evidence suggests that genetic linkage to prostate cancer has been found on several chromosomes including the X; however, identification of causative genes has been elusive.
Parametric and non-parametric linkage analyses were performed using 26 microsatellite markers in each of 11 groups of multiple-case prostate cancer families from the International Consortium for Prostate Cancer Genetics (ICPCG). Meta-analyses of the resultant family-specific linkage statistics across the entire 1,323 families and in several predefined subsets were then performed.
Meta-analyses of linkage statistics resulted in a maximum parametric heterogeneity lod score (HLOD) of 1.28, and an allele-sharing lod score (LOD) of 2.0 in favor of linkage to Xq27-q28 at 138 cM. In subset analyses, families with average age at onset less than 65 years exhibited a maximum HLOD of 1.8 (at 138 cM) versus a maximum regional HLOD of only 0.32 in families with average age at onset of 65 years or older. Surprisingly, the subset of families with only 2–3 affected men and some evidence of male-to-male transmission of prostate cancer gave the strongest evidence of linkage to the region (HLOD = 3.24, 134 cM). For this subset, the HLOD was slightly increased (HLOD = 3.47 at 134 cM) when families used in the original published report of linkage to Xq27-28 were excluded.
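For readers unfamiliar with the heterogeneity lod score, the standard admixture formulation can be sketched as follows. Given family-specific LOD scores at one map position, the HLOD is the maximum over α (the assumed proportion of linked families) of Σ log10(α·10^LOD_i + 1 − α). The code below is a minimal illustration using a grid search over α. The input values are hypothetical, the function name is an assumption, and the sketch does not reproduce the meta-analytic combination across groups described above.

```python
import math
from typing import Sequence, Tuple


def hlod(family_lods: Sequence[float], steps: int = 1000) -> Tuple[float, float]:
    """Heterogeneity LOD at one position via grid search over alpha.

    family_lods: finite family-specific LOD scores at the same position.
    Returns (HLOD, alpha at which it is attained). At alpha = 0 the
    statistic equals 0, so the HLOD is never negative.
    """
    best_value, best_alpha = 0.0, 0.0
    for k in range(steps + 1):
        alpha = k / steps
        total = sum(
            math.log10(alpha * 10 ** lod + 1 - alpha) for lod in family_lods
        )
        if total > best_value:
            best_value, best_alpha = total, alpha
    return best_value, best_alpha


# Hypothetical family LODs: two families support linkage, two argue against it.
value, alpha = hlod([1.5, 1.2, -1.0, -0.8])
print(f"HLOD = {value:.2f} at alpha = {alpha:.2f}")
```

Unlike a simple sum of LOD scores, this statistic lets strongly negative families be explained as unlinked, which is why subsets enriched for linked families can show higher HLODs than the complete set.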
Although there was not strong support for linkage to the Xq27-28 region in the complete set of families, the subset of families with earlier age at onset exhibited more evidence of linkage than families with later onset of disease. A subset of families with 2–3 affected individuals and with some evidence of male to male disease transmission showed stronger linkage signals. Our results suggest that the genetic basis for prostate cancer in our families is much more complex than a single susceptibility locus on the X chromosome, and that future explorations of the Xq27-28 region should focus on the subset of families identified here with the strongest evidence of linkage to this region.
We applied a shared genomic segment (SGS) analysis, incorporating an error model, to identify complete, or near complete, selective sweeps in the HapMap phase II data sets. This method is based on detecting heterozygous sharing across all individuals within a population, to identify regions of sharing with at least one allele in common. We identified multiple interesting regions, many of which are concordant with positive selection regions detected by previous population genetic tests. Others appear to be novel. Our findings illustrate the utility of SGS as a method for identifying regions of selection, and some of these regions have been proposed as candidate regions for harboring disease genes.
identity by state; identity by descent; positive selection
Monoclonal B cell lymphocytosis (MBL) is a hematologic condition wherein small B cell clones can be detected in the blood of asymptomatic individuals. Most MBL have an immunophenotype similar to chronic lymphocytic leukemia (CLL), and “CLL-like” MBL is a precursor to CLL. We used flow cytometry to identify MBL from unaffected members of CLL kindreds. We identified 101 MBL cases from 622 study subjects; of these, 82 individuals with MBL were further characterized. Ninety-one unique MBL clones were detected: 73 CLL-like MBL (CD5+CD20dimsIgdim), 11 atypical MBL (CD5+CD20+sIg+), and 7 CD5neg MBL (CD5negCD20+sIgneg). Extended immunophenotypic characterization of these MBL subtypes was performed, and significant differences in cell surface expression of CD23, CD49d, CD79b, and FMC-7 were observed among the groups. Markers of risk in CLL such as CD38, ZAP70, and CD49d were infrequently expressed in CLL-like MBL, but were expressed in the majority of atypical MBL. Interphase cytogenetics was performed in 35 MBL cases, and del 13q14 was most common (22/30 CLL-like MBL cases). Gene expression analysis using oligonucleotide arrays was performed on 7 CLL-like MBL, and showed activation of B cell receptor associated pathways. Our findings underscore the diversity of MBL subtypes and further clarify the relationship between MBL and other lymphoproliferative disorders.
We applied our method of pairwise shared genomic segment (pSGS) analysis to high-risk pedigrees identified from the Genetic Analysis Workshop 17 (GAW17) mini-exome sequencing data set. The original shared genomic segment method focused on identifying regions shared by all case subjects in a pedigree; thus it can be sensitive to sporadic cases. Our new method examines sharing among all pairs of case subjects in a high-risk pedigree and then uses the mean sharing as the test statistic; in addition, the significance is assessed empirically based on the pedigree structure and linkage disequilibrium pattern of the single-nucleotide polymorphisms. Using all GAW17 replicates, we identified 18 unilineal high-risk pedigrees that contained excess disease (p < 0.01) and at least 15 meioses between case subjects. Eighteen rare causal variants were polymorphic in this set of pedigrees. Based on a significance threshold of 0.001, 72.2% (13/18) of these pedigrees were successfully identified with at least one region that contains a true causal variant. The regions identified included 4 of the possible 18 polymorphic causal variants. On average, 1.1 true positives and 1.7 false positives were identified per pedigree. In conclusion, we have demonstrated the potential of our new pSGS method for localizing rare disease causal variants in common disease using high-risk pedigrees and exome sequence data.
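The contrast between all-case sharing and the pairwise statistic can be illustrated with a short sketch. It rests on assumptions not stated in the abstract: genotypes are coded as minor-allele counts (0, 1, 2), a pair shares at a SNP unless the two subjects are opposite homozygotes, and the per-pair sharing value at a SNP is taken to be the length of the run of consecutive shared SNPs containing it. The function names are hypothetical, and the empirical significance assessment based on pedigree structure and linkage disequilibrium is not attempted.

```python
from itertools import combinations
from typing import List


def pair_run_lengths(a: List[int], b: List[int]) -> List[int]:
    """For each SNP, the length of the shared run containing it (0 if unshared).

    Two subjects share at a SNP unless their minor-allele counts are 0 and 2.
    """
    n = len(a)
    shared = [abs(x - y) < 2 for x, y in zip(a, b)]
    lengths = [0] * n
    j = 0
    while j < n:
        if not shared[j]:
            j += 1
            continue
        k = j
        while k < n and shared[k]:
            k += 1  # extend the run to its last shared SNP
        for i in range(j, k):
            lengths[i] = k - j
        j = k
    return lengths


def mean_pairwise_sharing(cases: List[List[int]]) -> List[float]:
    """Average the per-pair run lengths over all pairs of cases, per SNP."""
    if len(cases) < 2:
        raise ValueError("at least two cases are required")
    pairs = list(combinations(cases, 2))
    per_pair = [pair_run_lengths(a, b) for a, b in pairs]
    return [sum(column) / len(pairs) for column in zip(*per_pair)]


# Hypothetical toy data: three cases typed at six SNPs.
toy_cases = [
    [0, 1, 2, 1, 0, 0],
    [2, 1, 1, 1, 0, 1],
    [1, 1, 2, 0, 2, 1],
]
print(mean_pairwise_sharing(toy_cases))
```

Because the statistic is an average over pairs, one case who does not carry the shared segment lowers the value rather than eliminating the signal, which is the robustness to sporadic cases that motivates the pairwise approach.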
A recent genome-wide association (GWA) study suggested seven new loci as associated with prostate cancer (PRCA) susceptibility. The strongest associated SNP in each region was identified (rs2660753, rs9364554, rs6465657, rs10993994, rs7931342, rs2735839, rs5945619). We studied these seven SNPs in a replication study consisting of 169 familial PRCA cases selected from Utah high-risk PRCA pedigrees and 805 controls. We performed subset analyses for aggressive and early onset PRCA. At a nominal significance level, two SNPs were found to be associated with PRCA: rs10993994 on chromosome 10q11 (odds ratio (OR) =1.42 [95% confidence interval (CI), 1.05–1.90], p=0.022); and rs5945619 on chromosome Xp11 (OR=1.54 [95% CI, 1.03–2.31], p=0.035). Restricting analysis to familial PRCA cases with aggressive disease yielded very similar risk estimates at both SNPs. However, subset analysis for familial, early onset disease indicated highly significant association evidence and substantially higher risk estimates for rs10993994 (OR=2.20 [95% CI, 1.48–3.27], p<0.0001). This result suggests that the higher risk estimates from the stage 1 cohort in the original study for rs10993994 may have been due to the early-onset and familial nature of the PRCA cases in that cohort. In conclusion, in a small case-control study of PRCA cases from Utah high-risk pedigrees, we have significantly replicated association of PRCA with rs10993994 (10q11) upon study-wide correction for multiple comparisons. We also nominally replicated the association of PRCA with rs5945619 (Xp11). In particular, it appears that the susceptibility locus at 10q11 may be involved in familial, early onset disease.
Prostate Cancer; Genetic Risk
Polymorphisms in double-strand DNA repair gene XRCC2 may play an important role in colorectal cancer (CRC) etiology, specifically in disease subtypes. Associations of XRCC2 variants and CRC were investigated by tumor site and tumor instability status in a four-center collaboration including three U.K. case-control studies (Sheffield, Leeds, Dundee) and a U.S. case-control study of cases from high-risk Utah pedigrees (total: 1,252 cases, 1,422 controls). The 14 variants studied were tagging-SNPs selected from HapMap/NIEHS data, supplemented with SNPs identified from sequencing of 125 cases chosen to represent multiple CRC groups (familial, metastatic disease, and tumor subsite). Monte Carlo significance testing using Genie software provided valid meta analyses of the total resource that includes family-based data. Similar to reports of CRC and other cancer sites, the rs3218536 R188H allele was not associated with increased risk. However, we observed a novel, highly significant association of a common SNP, rs3218499G>C, with increased risk of rectal tumors (OR 2.1, 95%CI 1.3-3.3; pchisq. =0.0006) versus controls, with the largest risk found for female rectal cases (OR 3.1, 95%CI 1.6-6.1; pchisq. =0.0006). This difference was significantly different to that for proximal and distal colon cancers (pchisq. =0.02). Our investigation supports a role for XRCC2 in CRC tumorigenesis, conferring susceptibility to rectal tumors.
XRCC2; colorectal cancer; DNA double-strand break repair; chromosomal instability; microsatellite instability
Genomewide association studies of colorectal cancer (CRC) have identified genetic variants that reproducibly associate with CRC. Associations of twelve SNPs at 8q24, 9p24, and 18q21 (SMAD7) and CRC were investigated in a three center collaborative study including two UK case-control cohorts (Sheffield and Leeds) and an American case-control study of CRC cases from high-risk Utah pedigrees.
Our combined resource included 1,092 CRC case subjects and 1,060 age- and sex-matched controls. Meta-statistics and Monte Carlo significance testing using Genie software provided a valid combined analysis of our mixed independent and related case-control resource. We also evaluated whether these associations differed by sex, age at diagnosis, family history, or tumor site.
At 8q24 we observed two independent significant associations at SNPs located in two different risk regions of 8q24: rs6983267 in region 3 (ptrend=0.01; per allele OR=1.17, 95%CI 1.03, 1.32) and rs10090154 in region 5 (ptrend=0.05; per allele OR=1.24, 95%CI 1.01, 1.51). At 18q21 associations were observed in distal colon tumors, but not in proximal or rectal cancers: rs4939827 (ptrend=0.007; per allele OR=0.77, 95%CI 0.64, 0.93; case-case pdiff=0.03) and rs12953717 (ptrend=0.01; per allele OR=1.27, 95%CI 1.06, 1.52). We were unable to detect any associations at 9p24 with CRC.
Our investigation confirms that variants across multiple risk regions of 8q24 are associated with CRC, and that associations at 18q21 differ by tumor site.
meta association; colorectal cancer; 8q24; 9p24; 18q21; SMAD7; case-control; family resource
Multiple single-nucleotide polymorphisms have been associated with low-density lipoprotein cholesterol (LDL-C), high-density lipoprotein cholesterol (HDL-C), and triglyceride (TG) levels. In this paper, we evaluate a weighted and an unweighted approach for estimating the combined effect of multiple markers (using genotypes and haplotypes) on lipid levels for a given individual.
Using data from the Framingham Heart Study SHARe genome-wide association study, we tested genome-wide genotypes and haplotypes for association with lipid levels and constructed genetic risk scores (GRS) based on multiple markers that were weighted according to their estimated effects on LDL-C, HDL-C, and TG. These scores (GRS-LDL, GRS-HDL, and GRS-TG) were then evaluated for associations with LDL-C, HDL-C, and TG, and compared with results of an unweighted method based on risk-allele counts. For comparability of metrics, GRS variables were divided into quartiles.
GRS-LDL quartiles were associated with LDL-C levels (p = 2.1 × 10^-24), GRS-HDL quartiles with HDL-C (p = 5.9 × 10^-22), and GRS-TG quartiles with TG (p = 5.4 × 10^-25). In comparison, these p-values were considerably lower than those for the associations of the unweighted GRS quartiles for LDL-C (p = 3.6 × 10^-7), HDL-C (p = 6.4 × 10^-16), and TG (p = 4.1 × 10^-10).
GRS variables were highly predictive of LDL-C, HDL-C, and TG measurements, especially when weighted based on each marker's individual association with those intermediate risk phenotypes. The allele-count GRS approach that does not weight the GRS by individual marker associations was considerably less predictive of lipid and lipoprotein measures when the same genetic markers were utilized, suggesting that substantially more risk-associated genetic marker information is encapsulated by the weighted GRS variables.
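The difference between the two scoring approaches can be sketched in a few lines. The example below is a minimal illustration, assuming genotypes are coded as counts (0, 1, or 2) of the trait-raising allele and that each marker has a single estimated effect; the function names, genotypes, and effect sizes are hypothetical and are not taken from the Framingham data.

```python
from bisect import bisect_left
from statistics import quantiles


def genetic_risk_scores(genotypes, effects):
    """Return (unweighted, weighted) scores, one pair of lists per call.

    genotypes: one list per individual of risk-allele counts (0, 1, 2).
    effects:   one estimated effect per marker (assumed to be coded for
               the same allele that is counted in `genotypes`).
    """
    # Unweighted score: plain risk-allele count.
    unweighted = [sum(g) for g in genotypes]
    # Weighted score: each allele count is scaled by its marker's effect.
    weighted = [sum(c * b for c, b in zip(g, effects)) for g in genotypes]
    return unweighted, weighted


def to_quartiles(scores):
    """Assign each score to a quartile (1 to 4) using sample cut points."""
    cuts = quantiles(scores, n=4)
    return [bisect_left(cuts, s) + 1 for s in scores]


# Hypothetical data: four individuals, three markers.
genotypes = [[0, 1, 2], [2, 0, 0], [1, 1, 1], [2, 2, 2]]
effects = [0.5, 0.1, 0.2]
unweighted, weighted = genetic_risk_scores(genotypes, effects)
print(unweighted, to_quartiles(unweighted))
print(weighted, to_quartiles(weighted))
```

In this toy data the second individual carries the fewest risk alleles but ranks above the first and third on the weighted score, because both alleles fall at the marker with the largest assumed effect. This is the kind of information an allele count discards.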
Methods exist to appropriately perform association analyses in pedigrees. However, for genome-wide association analysis, these methods are computationally impractical. It is therefore important to identify alternative methods that can be used efficiently genome-wide. Here, we introduce a new algorithm that considers all relationships simultaneously in arbitrarily structured pedigrees and assigns weights to pedigree members that can be used in subsequent analyses to address relatedness. We compare this new method with an existing weighting algorithm, a naïve analysis (relatedness is ignored), and an empirical method that appropriately accounts for all relationships (the gold standard).
Framingham Heart Study Genetic Analysis Workshop 16 Problem 2 data were used with a dichotomous phenotype based on high-density lipoprotein cholesterol level (1,611 cases and 4,043 controls). New and existing algorithms for calculating weights were used. Cochran-Armitage trend tests were performed for 17,333 single-nucleotide polymorphisms on chromosome 8 using both weighting systems and the naïve approach; a subset of 500 single-nucleotide polymorphisms was tested empirically. Correlations of p-values from each method were determined.
Results from the two weighting methods were strongly correlated (r = 0.96). Compared against the empirical gold standard, our new weighting method performed better than the existing weighting method (r = 0.89 vs. r = 0.83), which we attribute to its more moderate down-weighting. The naïve analysis obtained the best correlation with the empirical gold standard results (r = 0.99).
Our results suggest that weighting methods do not accurately represent tests that account for familial relationships in genetic association analyses and are inferior to the naïve method as an efficient initial genome-wide screening tool.
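For reference, the Cochran-Armitage trend test used in the naïve analysis can be sketched as follows. This is a minimal version, assuming a 2x3 table of genotype counts, additive scores (0, 1, 2), and independent subjects, which is exactly the assumption the naïve approach makes. It does not implement either weighting algorithm or the empirical method, and the function name and counts are hypothetical.

```python
import math


def cochran_armitage_trend(cases, controls, scores=(0, 1, 2)):
    """Cochran-Armitage trend test on genotype counts.

    cases, controls: counts per genotype class (e.g. 0, 1, 2 minor alleles).
    Returns (z, two_sided_p) using the normal approximation.
    """
    col_totals = [r + s for r, s in zip(cases, controls)]
    n_cases, n_controls = sum(cases), sum(controls)
    n_total = n_cases + n_controls

    # Trend statistic: T = sum_i t_i * (r_i * S - s_i * R).
    t_stat = sum(t * (r * n_controls - s * n_cases)
                 for t, r, s in zip(scores, cases, controls))

    # Var(T) under the null: (R*S/N) * (N * sum t_i^2 n_i - (sum t_i n_i)^2).
    sum_tn = sum(t * n for t, n in zip(scores, col_totals))
    sum_t2n = sum(t * t * n for t, n in zip(scores, col_totals))
    variance = (n_cases * n_controls / n_total) * (n_total * sum_t2n - sum_tn ** 2)
    if variance <= 0:
        raise ValueError("table has no variation in genotype or status")

    z = t_stat / math.sqrt(variance)
    p_value = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal p-value
    return z, p_value


# Hypothetical genotype counts for one SNP; not taken from the study.
z, p = cochran_armitage_trend(cases=[10, 20, 30], controls=[30, 20, 10])
print(f"z = {z:.3f}, p = {p:.2e}")
```

Running this per SNP and correlating the resulting p-values across methods mirrors, in outline, the comparison described above.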
Asthma is a multi-factorial disease with undetermined genetic factors. We performed a genome-wide scan to identify predisposition loci for asthma. The asthma phenotype consisted of physician-confirmed presence or absence of asthma symptoms. We analyzed 81 extended Utah pedigrees spanning three to six generations and including 742 affected individuals (two to 40 per pedigree). We performed parametric multipoint linkage analyses with dominant and recessive models. Our analysis revealed genome-wide significant evidence of linkage to region 5q13 (LOD = 3.8, recessive model), and suggestive evidence for linkage to region 6p21 (LOD = 2.1, dominant model). Both the 5q13 and 6p21 regions indicated in these analyses have been previously identified as regions of interest in other genome-wide scans for asthma-related phenotypes. The evidence of linkage at the 5q13 region represents the first significant evidence for linkage on a genome-wide basis for this locus. Linked pedigrees localize the region to between approximately 92.3 Mbp and 105.5 Mbp.
asthma; linkage analysis; gene mapping; extended pedigrees
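To make the LOD scale concrete, the sketch below computes a LOD score in the simplest textbook setting: two-point linkage with phase-known, fully informative meioses. This is a deliberately simplified stand-in, not the parametric multipoint analysis of extended pedigrees used in the study; the function name and the counts are hypothetical.

```python
import math


def two_point_lod(recombinants, meioses, theta):
    """LOD score for phase-known meioses at recombination fraction theta.

    LOD(theta) = log10( theta^k * (1 - theta)^(n - k) / 0.5^n ),
    where k is the number of recombinants among n informative meioses.
    """
    if not 0 <= theta <= 0.5:
        raise ValueError("theta must lie in [0, 0.5]")
    if not 0 <= recombinants <= meioses:
        raise ValueError("recombinants must lie between 0 and meioses")
    k, n = recombinants, meioses
    if theta == 0:
        # theta = 0 is impossible if any recombinant was observed.
        if k > 0:
            return float("-inf")
        log_likelihood = 0.0
    else:
        log_likelihood = k * math.log10(theta) + (n - k) * math.log10(1 - theta)
    # Null hypothesis of no linkage: theta = 0.5.
    return log_likelihood - n * math.log10(0.5)


# Hypothetical example: 2 recombinants in 20 meioses, evaluated at theta = 0.1.
print(round(two_point_lod(2, 20, 0.1), 2))
```

A LOD of 3 means the data are 1,000 times more likely under linkage at the given recombination fraction than under free recombination, which is why scores in the range reported above are read as strong evidence.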