
Results 1-7 (7)

1.  Collapsing-based and kernel-based single-gene analyses applied to Genetic Analysis Workshop 17 mini-exome data 
BMC Proceedings  2011;5(Suppl 9):S117.
Recently there has been great interest in identifying rare variants associated with common diseases. We apply several collapsing-based and kernel-based single-gene association tests to Genetic Analysis Workshop 17 (GAW17) rare variant association data with unrelated individuals without knowledge of the simulation model. We also implement modified versions of these methods using additional information, such as minor allele frequency (MAF) and functional annotation. For each of four given traits provided in GAW17, we use the Bayesian mixed-effects model to estimate the phenotypic variance explained by the given environmental and genotypic data and to infer an individual-specific genetic effect to use directly in single-gene association tests. After obtaining information on the GAW17 simulation model, we compare the performance of all methods and examine the top genes identified by those methods. We find that collapsing-based methods with weights based on MAFs are sensitive to the “lower MAF, larger effect size” assumption, whereas kernel-based methods are more robust when this assumption is violated. In addition, many false-positive genes identified by multiple methods often contain variants with exactly the same genotype distribution as the causal variants used in the simulation model. When the sample size is much smaller than the number of rare variants, it is more likely that causal and noncausal variants will share the same or similar genotype distribution. This likely contributes to the low power and large number of false-positive results of all methods in detecting causal variants associated with disease in the GAW17 data set.
PMCID: PMC3287841  PMID: 22373309
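As a minimal sketch of what a MAF-weighted collapsing score can look like: the abstract does not give its weighting scheme, so the inverse-standard-deviation weight `1 / sqrt(p(1 - p))` used here is an assumption, and the function names are hypothetical. Each individual's minor-allele counts within a gene are summed, with rarer variants weighted more heavily.

```python
import math


def minor_allele_frequency(counts):
    """MAF of one variant from per-individual minor-allele counts (0, 1 or 2)."""
    return sum(counts) / (2 * len(counts))


def weighted_burden_scores(genotypes):
    """Collapse the variants of one gene into a single score per individual.

    genotypes[i][j] is the minor-allele count of individual i at variant j.
    The weight 1 / sqrt(p * (1 - p)) is an assumed choice, not taken from
    the paper; it encodes the "lower MAF, larger effect size" assumption.
    """
    n_variants = len(genotypes[0])
    weights = []
    for j in range(n_variants):
        p = minor_allele_frequency([row[j] for row in genotypes])
        # Monomorphic variants carry no information; give them zero weight.
        weights.append(0.0 if p in (0.0, 1.0) else 1.0 / math.sqrt(p * (1.0 - p)))
    return [sum(w * g for w, g in zip(weights, row)) for row in genotypes]


# Four individuals, two variants; variant 0 is rarer than variant 1.
scores = weighted_burden_scores([[1, 0], [0, 1], [0, 1], [0, 0]])
```

Because the weights grow as the MAF shrinks, a score of this kind loses power when rare variants do not in fact have larger effects, which is the sensitivity the abstract reports.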
2.  Detecting functional rare variants by collapsing and incorporating functional annotation in Genetic Analysis Workshop 17 mini-exome data 
BMC Proceedings  2011;5(Suppl 9):S27.
Association studies using tag SNPs have been successful in detecting disease-associated common variants. However, common variants, with rare exceptions, explain at most 5–10% of the heritability resulting from genetic factors, which leads to the common disease/rare variants assumption. Indeed, recent studies using sequencing technologies have demonstrated that common diseases can be due to rare variants that could not be systematically studied earlier. Unfortunately, methods for common variants are not optimal if applied to rare variants. To identify rare variants that affect disease risk, several investigators have designed new approaches based on the idea of collapsing different rare variants inside the same genomic block (e.g., the same gene or pathway) to enrich the signal. Here, we consider three different collapsing methods in the multimarker regression model and compare their performance on the Genetic Analysis Workshop 17 data using the consistency of results across different simulations and the cross-validation prediction error rate. The comparison shows that the proportion collapsing method seems to outperform the other two methods and can find both truly associated rare and common variants. Moreover, we explore one way of incorporating the functional annotations for the variants in the data that collapses nonsynonymous and synonymous variants separately to allow for different penalties on them. The incorporation of functional annotations led to higher sensitivity and specificity levels when the detection results were compared with the answer sheet. The initial analysis was performed without knowledge of the simulation model.
PMCID: PMC3287862  PMID: 22373324
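The abstract names a proportion collapsing method without defining it. One common reading, sketched here as an assumption rather than as the authors' definition, contrasts it with indicator collapsing: the indicator records whether an individual carries any rare variant in the block, while the proportion records what fraction of the block's rare variants the individual carries. The function names and the `rare_idx` argument are hypothetical.

```python
def indicator_collapse(row, rare_idx):
    """1 if the individual carries a minor allele at any rare variant, else 0."""
    return 1 if any(row[j] > 0 for j in rare_idx) else 0


def proportion_collapse(row, rare_idx):
    """Fraction of the block's rare variants at which a minor allele is carried."""
    if not rare_idx:
        return 0.0
    carried = sum(1 for j in rare_idx if row[j] > 0)
    return carried / len(rare_idx)


# One individual, four rare variants in the gene, minor alleles at two of them.
row = [0, 1, 2, 0]
rare_idx = [0, 1, 2, 3]
indicator = indicator_collapse(row, rare_idx)
proportion = proportion_collapse(row, rare_idx)
```

Under this reading, either collapsed value would enter the multimarker regression as a single predictor for the block, and collapsing nonsynonymous and synonymous variants separately would amount to calling the function once per annotation class with a different `rare_idx`.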
3.  Use of Bayesian networks to dissect the complexity of genetic disease: application to the Genetic Analysis Workshop 17 simulated data 
BMC Proceedings  2011;5(Suppl 9):S37.
Complex diseases are often the downstream event of a number of risk factors, including both environmental and genetic variables. To better understand the mechanism of disease onset, it is of great interest to systematically investigate the crosstalk among various risk factors. Bayesian networks provide an intuitive graphical interface that captures not only the association but also the conditional independence and dependence structures among the variables, resulting in sparser relationships between risk factors and the disease phenotype than traditional correlation-based methods. In this paper, we apply a Bayesian network to dissect the complex regulatory relationships among disease traits and various risk factors for the Genetic Analysis Workshop 17 simulated data. We use the Bayesian network as a tool for the risk prediction of disease outcome.
PMCID: PMC3287873  PMID: 22373110
4.  Large-scale risk prediction applied to Genetic Analysis Workshop 17 mini-exome sequence data 
BMC Proceedings  2011;5(Suppl 9):S46.
We consider the application of Efron’s empirical Bayes classification method to risk prediction in a genome-wide association study using the Genetic Analysis Workshop 17 (GAW17) data. A major advantage of using this method is that the effect size distribution for the set of possible features is empirically estimated and that all subsequent parameter estimation and risk prediction is guided by this distribution. Here, we generalize Efron’s method to allow for some of the peculiarities of the GAW17 data. In particular, we introduce two ways to extend Efron’s model: a weighted empirical Bayes model and a joint covariance model that allows the model to properly incorporate the annotation information of single-nucleotide polymorphisms (SNPs). In the course of our analysis, we examine several aspects of the possible simulation model, including the identity of the most important genes, the differing effects of synonymous and nonsynonymous SNPs, and the relative roles of covariates and genes in conferring disease risk. Finally, we compare the three methods to each other and to other classifiers (random forest and neural network).
PMCID: PMC3287883  PMID: 22373389
5.  A pathway analysis applied to Genetic Analysis Workshop 16 genome-wide rheumatoid arthritis data 
BMC Proceedings  2009;3(Suppl 7):S91.
The identification of several hundred genomic regions affecting disease risk has proven the ability of genome-wide association studies to identify genetic contributors to disease. Currently, single-nucleotide polymorphism (SNP) association analysis is the most widely used method of analyzing genome-wide association data, but recent research shows that multi-marker tests of association may provide greater power, especially when more than one mutation is present within a gene and the mutations are in low linkage disequilibrium with each other. Here we apply a multi-marker association test based on regression to SNPs located within known genes to obtain a gene-level score of association. We then perform pathway analysis using this score as a measure of gene importance. We use two tests of pathway enrichment: a binomial test and a random set method. By utilizing publicly available gene and pathway information, we identify B cell, cytokine and inflammation response, and antigen presentation pathways as being associated with rheumatoid arthritis. These results confirm known biological mechanisms for autoimmune disorders, of which rheumatoid arthritis is one.
PMCID: PMC2795995  PMID: 20018088
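A binomial test of pathway enrichment can be sketched as follows. The abstract does not say how gene-level scores enter the test, so this sketch assumes that genes are first flagged as significant at some threshold, and that the genome-wide fraction of flagged genes serves as the null probability. The function name is hypothetical.

```python
from math import comb


def binomial_upper_tail(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p).

    n: number of genes in the pathway
    k: number of those genes flagged as significant
    p: assumed null probability that a gene is flagged, e.g. the
       genome-wide fraction of flagged genes
    """
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


# A pathway of 20 genes with 6 flagged, when 10% of all genes are flagged.
p_value = binomial_upper_tail(6, 20, 0.10)
```

A small upper-tail probability indicates that the pathway contains more flagged genes than chance would produce. A test of this form treats genes as independent and ignores how strong each gene's score is, which may be why a second, random set method is used alongside it.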
6.  Two-stage joint selection method to identify candidate markers from genome-wide association studies 
BMC Proceedings  2009;3(Suppl 7):S29.
The interaction among multiple genes and environmental factors can affect an individual's susceptibility to disease. Some genes may not show strong marginal associations when they affect disease risk through interactions with other genes. As a result, these genes may not be identified by single-marker methods that are widely used in genome-wide association studies. To explore this possibility in real data, we carried out a two-stage model selection procedure of joint single-nucleotide polymorphism (SNP) analysis to detect genes associated with rheumatoid arthritis (RA) using Genetic Analysis Workshop 16 genome-wide association study data. In the first stage, the genetic markers were screened through an exhaustive two-dimensional search, through which promising SNPs and SNP pairs were identified. Then, LASSO was used to choose putative SNPs from the candidates identified in the first stage. We then used the RA data collected by the Wellcome Trust Case Control Consortium to validate the putative genetic factors. Balancing computational load and statistical power, this method detects joint effects that may fail to emerge from single-marker analysis. Based on our proposed approach, we not only replicated the identification of important RA risk genes, but also found novel genes and their epistatic effects on RA. To our knowledge, this is the first analysis of a real genome-wide association study based on a two-dimensional scan.
PMCID: PMC2795926  PMID: 20018019
7.  Gene hunting of the Genetic Analysis Workshop 16 rheumatoid arthritis data using rough set theory 
BMC Proceedings  2009;3(Suppl 7):S126.
We propose to use the rough set theory to identify genes affecting rheumatoid arthritis risk from the data collected by the North American Rheumatoid Arthritis Consortium. For each gene, we employ generalized dynamic reducts in the rough set theory to select a subset of single-nucleotide polymorphisms (SNPs) to represent the genetic information from this gene. We then group the study subjects into different clusters based on their genotype similarity at the selected markers. Statistical association between disease status and cluster membership is then studied to identify genes associated with rheumatoid arthritis. Based on our proposed approach, we are able to identify a number of statistically significant genes associated with rheumatoid arthritis. Aside from genes on chromosome 6, our identified genes include known disease-associated genes such as PTPN22 and TRAF1. In addition, our list contains other biologically plausible genes, such as ADAM15 and AGPAT2. Our findings suggest that ADAM15 and AGPAT2 may contribute to a genetic predisposition through abnormal angiogenesis and adipose tissue.
PMCID: PMC2795899  PMID: 20017992
